Written and Maintained by Gregory Nacu

Featured Posts

C64OS.com has grown to be more than just a blog about one developer's progress, it is becoming a resource to surround and support a type of C64 user that wants to benefit from the Commodore community and get the most out of modern hardware expansions for their beloved platform.

After writing many posts on the C64 OS weblog, the unfortunate reality is that some of my best work gets lost in the stream of news and developments. Be sure not to miss these full–length editorial reviews:

May 16, 2017Editorial

Review: FREEZE64 Fanzine

December 5, 2016Editorial

World of Commodore '16

Programming Reference

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

August 3, 2017Programming Reference

Commodore 64 Screen Codes


Needs some ideas? Trying searching for:
PETSCII, Animation, Memory or Pointers

Recent Posts

September 18, 2017Programming Theory

Organizing a Big Module

September 11, 2017Programming Theory

Toolkit Introduction

August 15, 2017Programming Theory

Organizing Module Layout

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

August 3, 2017Programming Reference

Commodore 64 Screen Codes

August 1, 2017Programming Theory

Base Conversion in 6502 (2/2)

July 21, 2017Hardware

Commodore Logo Mark Patch

July 5, 2017Programming Theory

Object Orientation in 6502

June 30, 2017Programming Theory

Base Conversion in 6502 (1/2)

June 20, 2017Software

Huge Site Update

June 5, 2017Software

Recursive File Copier in BASIC

May 29, 2017Technical Deep Dive

Implementing Factorial in 6502

May 16, 2017Editorial

Review: FREEZE64 Fanzine

May 9, 2017Programming Theory

Pointers in Practice, Menus

May 1, 2017Programming Theory

Loading Sequential Files

April 27, 2017Programming Theory

HomeBase Applications

April 21, 2017Programming Theory

Application Loading

April 6, 2017Programming Theory

Memory Manager Development

March 27, 2017Software

Petscii Art Animation

March 27, 2017Programming Theory

Making Course Corrections, Timers

March 21, 2017Software

PETSCII Art Renderer

March 6, 2017Programming Theory

Code Module Exports Table

March 1, 2017Technical Deep Dive

Recursion and C64 OS's Menu UI (3/3)

February 22, 2017Technical Deep Dive

C-Style Structures in 6502 (2/3)

Older Posts

Full Post Archive

Subscribe to C64OS.com with your favorite RSS Reader

News, Editorials, Progress and Reference

September 18, 2017Programming Theory

Organizing a Big Module

Let's start with a description of the Toolkit drawing system.

As should be evident from my last post, Toolkit Introduction, I've been hard at work (during whatever spare time I can manage) designing and building out the Toolkit module of C64 OS. I spent the weekend (re)implementing two of the biggest components of the view drawing system which I haven't even had a chance to discuss yet here. To be honest, I've been avoiding discussing it, because it's been so in flux. But, given that this is a blog about my progress, maybe I should be talking more regularly about the thoughts I've been having and the dead ends I've hit and the things I've tried. Drawing, in general, is a complex topic. So it's really hard to cover in a single post. I will eventually dedicate an entire post just to talking about how the drawing system works.

In brief, there is a global structure, maintained by the screen module, called the draw context. It defines the screen and color memory origins (2 byte pointers each), the width and height, 1 byte each because they're in character cells, and these define a rectangular region on the screen. The draw rect is always an on–screen area measured in character cells, so width and height don't need to be 16-bit. 40 columns and 25 rows is well below needing 16-bits. Additionally, the draw context has two more values, offset top and offset left. These are both 16-bit and represent the scroll offsets of the draw context.

The scroll offsets are what make things particularly complex, but as far as I can tell they're necessary. They are what allow the hierarchy of nested views to be scrolled. And they have to be 16-bit otherwise nothing would be able to scroll more than just a few screens.

Every view has the ability to draw itself, and when it does it uses the details in the current draw context to know how much space is available, where it starts in memory, and how offset its origin is from the virtual origin of the view itself (those are the scroll offsets). Typically a view like a button, a label, a checkbox or whatever just have to draw themselves. Views do have the ability to have children, of course, but the leaf views don't have children. A button, for example, even though it inherits node properties (including a child pointer) will never actually have children. In fact if you instantiate a view and assign it as the child of a button, that child will never be drawn, because button's draw routine does not attempt to pass drawing control to any children, because it assumes it doesn't have any.

The View view, aka the root class from which all the other views descend, implements a draw routine which on its own doesn't need to do much except clear the rectangle defined by the draw context. But View is also the main container for laying out children. So, if you want to have a UI that puts two scroll views side by side (or one above the other) for a sort of two-up dual pane, those two scroll views are siblings of each other, and they share a parent. That parent is most likely View. View has a special feature of its draw routine that recursively calls draw on each of its child views. This logic is quite complex, so one wouldn't want to implement it twice, as you'll see very shortly.

There are two other container views, Scroll view and Split view. Scroll view is effectively just View but with scrollbars that can be interacted with to change the offset top and offset left properties of its own draw context. And split view maintains two children and an interactable control for changing the draw context's height of its two children (if horizontal split) or width (if vertical split). But these two container views do not reimplement View's draw logic. They just set the draw context in a custom way, set a couple of custom pointers, and then call View's own draw routines. And that is a big benefit of object oriented code.

When View is recursively walking through its children there are actually three steps it needs to go through. And two of these are what I was working on finishing up this weekend. They are:

  • Resize Node
  • Bounds Check
  • Recontext

A resize occurs when the width or height metrics of a view changes. A global resize flag is set and a view's size is recomputed. This has down stream effects, because views can be anchored in such a way that their size changes as a result of their parent changing size. A resize is usually caused by interaction with a split view but view metrics can also be changed programmatically. If they are one simply has to set the global resize flag manually. There are some efficiencies baked into how this works so that not every view has to have its size recomputed, and one of the flags of the view_rsmask is used to help determine if the view needs to be resized.

Next a bounds check is performed. The containing view, before recursing to its next child uses the metrics of the child and compares against the draw context to determine if any part of the child will be visible on screen. If the scroll offsets are set in such a way that the child is scrolled out of view, then the parent simply skips over this child and moves on to the next.

Recontext is a step that will take me a moment to unpack. Each view, when it draws itself, is drawing itself into a global 10-byte draw context, as described earlier. However, that global draw context changes as the system recursively moves through the view hierarchy. Later on, let's say you mouse down on a button, or you type a character when an input view is in focus, in such a case only that one singular view is updated and needs to redraw itself, (to highlight the button, or insert a character, etc.) But the global draw context is no longer relevant to the view that needs to redraw. We could redraw the entire screen but that would be much too slow. Instead, each view maintains a copy of the draw context, as it was when the view last drew itself.

Those contexts become out–of–date, however, if there is a scroll or resize. So another global flag indicates if a recontext needs to happen. The recontext routine takes the current draw context, for the parent or containing view, and modifies it (it always either stays the same or gets smaller, it cannot ever become bigger) according to the offset and size metrics of the child. The child then backs up that new context onto itself. And lastly it draws itself. When the recontext flag is not set, each time a view is to be drawn the global context is set from the copy of the context on the view, which is much less computationally expensive than the recontext logic. In the event that a single view has to be redrawn, the context is simply copied from that view to the global context and then its draw routine is called as normal. It has no idea it isn't being called as part of an entire screen refresh.

This is very cool stuff. I'm having lots of fun. This is easily my favorite type of code to work on.

The devil in the details, the problem of complex code

So the above was as quick an overview as I can give about how drawing works without getting into the nitty gritty technical details. But I feel it was necessary to give at least this level of detail on how it works to understand the level of complexity being worked with.

Last night, after midnight, as I was getting tired and ready to wrap up, I got the dreaded error message:


The last time I went through this and I asked about it on IRC, everyone seemed to nod in agreement that the time is now right for me to switch to cross assembly. As opposed to the coding native on a C64 (actually a C128) that I've been doing up to this point. A C64's limited memory and computational capacity puts limitations not just on what can be run, but also what can be coded. There is a limit to how many labels can be used. Assembly programming labels stand in for constants, and for memory addresses that will only resolved at assemble–time.

When I first started learning 65021 (coming up on a year ago next month) I was so wet behind the ears that I just started coding everything into one big file. It didn't take me more than a couple of months before I encountered my first Labelname Overflow. I resisted then the temptation and near universal advice to switch to cross assembly, by coming up with an ability to break the project into more manageable modules. This solution and its subsequent refinements to make it more workable are well documented across several posts:

Now I've hit this problem again. Except instead of the entire C64 OS project being too big and unwieldy, just the Toolkit module itself has become too big and unwieldy! It's a similar situation but I think it needs a different solution. And I don't yet know what that's going to be.

Here are some stats. The main source file, toolkit.a, is 56 blocks. A block is 256 bytes, minus the two byte link pointer. So a rough measurement in kilobytes is to divide block count by 4. Toolkit is therefore about 14 kilobytes. But that's the source code, which is full of comments so I don't forget how all this stuff is supposed to work. Assembled, last I assembled it, was about 7 blocks. Or, less than 2K. Less than 2 kilobytes, on a machine with 64 kilobytes of ram. Toolkit's code is under 3% of total available memory and I'm already overflowing the labels! How does anyone write a game that fills or nearly fills the C64's memory?

I think the answer to that question is that they either cross assemble, or huge regions of memory are dedicated to sprites, graphics, music and level data, leaving a much smaller area of memory for the game's engine code in the first place. Or the parts of the game are divided up manually into areas of memory where the connections between the parts can be hardcoded.

As an operating system, C64 OS obviously should try to take up as little memory as it can, leaving as much free memory for the application and its data as possible. This should be on my side for not running into label overflows. But I've also got at least two things working against me.

Toolkit is without a doubt the largest and most complex module of the bunch. But furthermore, because it is object oriented, it is also the most label heavy. I mean, the definition of the view class is effectively just a long list of all the labels that represent offsets to its properties. This problem is exacerbated by another interesting limitation.

Macros are super handy for not having to type everything out long form. They're particularly useful when you're doing lots of 16-bit math and pointer manipulation. But when you call a macro, all of its arguments have to fit on one 40 column line. The macro name automatically gets indented 8 spaces. So, after a macro name that's 8 or so characters, plus spaces, commas and the # symbol, you end up with only 22 or so characters for the names of the arguments. If any one of the names of the arguments exceeds 7 characters, there is suddenly not enough room on the line for just 3 arguments.

I've yet to write a macro that needed 4 arguments, but 3 is a common pattern. Many of the label names for properties are like this: view_draw, view_kcmd, view_kprnt etc. Even short names like this are 9 characters! Now imagine a macro call like this (the preceding white space is part of the line length!):

        #setobj16 this,view_kprnt,view_kprnt_

That is a 45 character line. Aka, it's impossible to type out. And macros (in Turbo Macro Pro+reu) cannot have their arguments spill onto a second line. One way to get around this is to define some temporary labels: vkp = view_kprnt and vkp_ = view_kprnt_. If you do this on the line above the macro call it's close enough in the code that the call remains legible... but you've just blown two more labels on nothing but making a macro call possible.

Another problem that Toolkit faces is that it seems to be the one module that leans most heavily on the resources of almost every other module. It needs math, string, memory, screen, service and input. (Math for 16bit divides and multiplies plus 16-bit macros, String for character conversion and length measurements, Memory for allocating and freeing space for new objects, Screen for the draw context system, Service for environment variables for system colors, Input for reading event objects.) Toolkit is a monster that seems to need to depend on bits and pieces of at least 6 out of only 9 or 10 modules total. Reading in header files and constants files bumps up the label count some more.

It's a tough problem.

But I'm not giving up. And I'm not caving in and switching to cross assembly. The first time I hit this problem I worked out a great solution. And I'll find a solution to this problem too. I have a few ideas in mind that will ease the pressure:

  • Move some big and label–heavy routines (boundschk, resizenode, recontext) out of Toolkit and into a smaller but related module (like Screen).
  • Shorten object property labels: view_... to vw_... or even v_...
  • Reduce the number of property labels by joining some properties conceptually. width and height could be a 4-byte dimension property, accessed as v_dim+2 etc.
  • Split some long external includes into multiple include files so one can be included without including and incorporating the labels from the others.
  • Unmacro a few things that really don't need to be macro'd.
  • Use hardcoded offsets for short-distance branches, rather than highly localized labels like next, skip or loop.

If I do all of the above, I should be back in business for some time to come, and may even be able to get most of Toolkit completed within the remaining constraints. If the above is not enough and I encounter Labelname Overflow again, I could try to split the Toolkit classes into separate files. I would like to avoid that, however, because there is definitely overhead, both memory, execution and organizational.

Thanks for reading. Until next post.

UPDATE: September 19, 2017

Last night I got to work implementing my remediation plan above. But first I decided to do a manual count of how many unique labels are in use by Toolkit. My rough counting came very close to 256, so I'm going to guess that's the key number, for obvious reasons.

On the one hand it feels like quite a few labels, but considering I'm able to use 30 or so in just one single routine, it feels like a cripplingly small number. Fortunately, to my discovery, recontext, resizenode, and boundschk are all called, conditionally, but only one other routine, drawchildren. I am going to move all 4 routines, lock, stock and barrel, to the screen module. And I only have to expose one new jumptable entry. drawchildren. Draw Children was actually something I'd already factored out of View's main draw routine, so it can be called easily by other container views.

  1. I dabbled with 6502 ASM over the course of my long history as a Commodore user. But I never made anything of any substantial complexity. Whatever I knew I forgot and had to completely relearn. When I was programming apps for WiNGs, it was 99.9% in C. []
September 11, 2017Programming Theory

Toolkit Introduction

A few housekeeping issues to cover up front. I want to say, welcome back, because it has been an unusually long while since my last blog post. I absolutely do not intend for this to become a habit. A few things have conspired against me to make this post a while in the making.

It is the late summer, so I had a bit of a holiday, which I thoroughly enjoyed. It took me and my kids and in-laws to a cottage and away from the internet for a few days. But on the whole it didn't keep me away from my c64, as we'll get into in this post. This website suffered some down time during the past week. I believe it was down for approximately 4 days. I wish to extend my sincerest apologies for the outage. We were actually without internet service for the entire labor-day weekend. This is annoying for me, no doubt, but it also happens to be annoying for you too, if you're trying to get to one of several sites that I host locally.1 And as is inevitable when the kids go back to school they brought home novel illnesses that have kept me away from work for a couple of days.

On the bright side, I've been hard at work on C64 Luggable. The documentation of which is coming along quite well. And I've taken many many photos of more recent work on the project than I have yet had time to document. And I've been working away on new additions for the Commodore 8-Bit Buyer's Guide. I have well over 35 (!) new products, parts and components that I will be adding to the guide. Including the catalog items from Poly.Play and Retro Innovations, and a number of independents from AmiBay.org, Lemon64, and ebay.

And let's not forget all the work I've been doing on C64 OS too. It's been a really fun adventure so far. I am continuing to learn about 6502 coding along the way. The Toolkit, as we'll see in this post, involves a lot of 16-bit math, and much of it on values accessed through object pointers, as I began to discuss in that post. I'm sure I'll get into more of that as I continue to discuss the Toolkit.

What is a toolkit?

An operating system is made up of many parts that take care of all sorts of tasks for the application developer. Memory management, abstract file access, string manipulation, input drivers, networking, etc. The toolkit is the part of the OS that helps an application build its user interface. The concept of a toolkit is probably the component that is the most absent from the C64's (and PET, Vic-20 and C128's) built-in operating system. GEOS, on the other hand, offers services for producing menus, both horizontal and vertical, buttons, dialog boxes, actionable icons, single line text input, and well, that's about where its toolkit ends. But that's a big step up from nothing. The toolkit-like features of GEOS are what help to give every GEOS application a standard look and feel.

On Linux, which runs on PCs with much more memory and processing power, toolkits can be dynamically loaded and different applications based on different toolkits can all be running side–by–side. This is actually detrimental to the Linux desktop experience because not all apps feel as though they belong to the same environment. On Windows or macOS, which have a dominant vendor–supplied toolkit, virtually all apps on those platforms use the standard toolkit and consequently feel much more like they belong together.

Because the C64's OS (Kernal and Basic rom) do not provide services for producing a user interface, everyone produces there own. Amongst the chaos there are a few OSes, the applications for which usually (but not always) feel as though they belong to that OS. C64 OS will fit into this latter category.

What does the C64 OS toolkit do?

Unlike on Linux where multiple different toolkits can be and often are available to applications simultaneously (GTK, QT, See: Wikipedia Article on Cross-Platform Toolkits), the C64's small memory size means it is only practical for there to be one toolkit available at a time. And in most cases that toolkit is tightly integrated with the other features of the OS anyway. There is no way, for example, to swap out the UI drawing features of GEOS.

The Toolkit in C64 OS is a module, but its location in memory is fixed, and the objects it produces are designed to interact with other services of the OS. For example, it gets its events from the main event loop in the screen module. The structure of the events it expects are built by the updatemnk routine in the input module. It makes calls to allocate memory for itself to the memory module. All the while drawing itself with a drawing context system also provided by the screen module. The essential behavior of an application is to link its functionality to the actions of toolkit objects that interpret the flow of events being generated by the input devices. This is also what we mean by program flow being event–driven.

The Toolkit is object oriented. I began to discuss how one can go about writing code following an object oriented design pattern in 6502 in a previous post, Object Orientation in 6502. In that post I began to talk about the Toolkit in order to have examples. Object oriented code is, by definition, structured as a hierarchy of interrelated classes. Sub-classes descend from other classes and inherit and extend their functionality. In rich modern UI Toolkits, such as the Cocoa framework of macOS, or Cocoa Touch of iOS (its little brother), there are literally hundreds of classes. And each class has hundreds of methods (related functions which operate on the object's own properties). UIButton, in Cocoa Touch, for example, has 32 methods. But these do not include the 21 methods it inherits from UIControl, nor the ~164 (!!!) methods it inherits from UIView.2 And so on up the inheritence chain to UIResponder and UIObject. Even declaring this number of methods would overflow 64K of memory before we got around to implementing anything. And a toolkit is only one part of what it takes to make an operating system.

Needless to say, the C64 OS Toolkit is very trimmed down compared to what we might otherwise bemoan as the endless bloat of modern toolkits. But the principle is similar. Toolkit is a collection of classes, some of which descend from others. They work together allowing a programmer to create and assemble them to construct a flexible user interface that efficiently redraws itself, responds to user input, and calls back to application code when necessary to implement the specifics of the program. The Toolkit relieves the application developer from a huge burden of effort and results in more consistency across applications and enables rich functionality for free3 in the process.

As of the time of this writing, Toolkit consists of just 6 classes. These have already been briefly discsussed in the earlier post on Object Orientation in 6502. I have several other classes planned, checkbox will likely descend from Button, radio will likely descend from checkbox, and a multi-line text view is an absolute must–have, but I haven't yet figured out where in the hierarchy it will fit.

Object hierarchy of views

This lean (by modern standards) class hierarchy may look unimaginably small, but I think you'd be surprised how much UI complexity can be constructed out of such an essential core.

Sample UI screen shot, extruded with labeled parts

A sample of UI, extruded to show the nesting of views, and labeled.

All Toolkit classes descend from View. View provides several collections of properties and methods which make it a foundation of several types of functionality. We'll just dig right into those here.

Node Hierarchy

A view-based user interface is a hierarchical tree of nodes. Therefore, each class that participates in the UI needs to be a type of node that can connect to the others to allow application code and other toolkit built-in functionality to navigate the tree. The View class therefore provides node properties. And since all Toolkit classes descend from view, all Toolkit classes have these properties and are all therefore types of nodes. The node properties are as few as I believe it is possible to have and still have it work. Each view has a parent pointer, a first-child pointer, and a next-sibling pointer.

Don't confuse the node hierarchy with the inheritance hierarchy pictured above. In a real UI views will be nested within views, and buttons may be stacked one above the other or put inside scroll views. Labels will be put before inputs and so on. The inheritance hierarchy is hardcoded and never changes, but the node hierarchy of a UI could be totally different for every application.

Every node has exactly one containing parent node. The parent node pointer points to that containing node. There is always one root view, which usually fills the screen, but doesn't have to, and has no parent node. Its parent node pointer is null, ($0000), which allows code that navigates the tree to know when it has reached the root view.

Any node can theoretically contain multiple child nodes. However, only the View class has draw logic which is designed to deal with multiple children. Typically if you want a node to handle multiple children you would rely on the View's implementation because it's long and complex. Each node has a pointer only to a single child node. But if that child node is one of many children of the same parent, then the first child uses its next sibling pointer to link to the parent's second child. The second child can link to the third and so on. Each child points back to its parent even if it is not the first child. The last child of a parent is the last node in a chain of siblings, it has its next sibling pointer set null. This is how code can determine that it has reached the final sibling.

If a node's first child pointer is null, it has no children. If a node's first child's next sibling pointer is null, then the parent has only one child. And so on. These three pointers are enough to describe the entire tree and allows recursive code to navigate the entire tree. Navigating up the tree is very efficient because every node has a pointer directly to its parent all the way back to the root node with a null parent pointer. An advantage to one child pointer and sibling pointers is that we don't need to have an intermediate data structure such as an array to hold an ordered set of child pointers.

Sample UI node hierarchy

Above is visualized the node hierarchy of a very plausible C64 OS application UI. Note that the node hierarchy defines the structural relationship between the user interface objects, but alone it doesn't define where on the screen siblings will display relative to each other. That part comes in the next section about view metrics.

Here we see that there is a single root view, in the top row. Its parent pointer is null. It has three children, a Scroll view, and two more Views. But the root view only points directly to the Scroll view. The other two Views are linked horizontally as siblings to the Scroll view. The rightmost View in the second row is the last child of the root view, so its next sibling pointer is null. The Scroll view has one child, a multi-line text view. The middle View has four children, two Labels and two Inputs. Presumably these would be laid out on screen such that each Input has one Label. The final View has three children, three Buttons.

Layout Metrics

The node properties are merely structural. In addition to structure each view needs to know where it should position itself. The positioning of a node is always relative to its parent. In modern UI Toolkits, I'm most familiar with Cocoa and Cocoa Touch from macOS and iOS respectively, views support special layout constraint objects. This allows a node to align, position and size itself relative to not just its parent but to its siblings as well. While this does enable the production of incredibly flexible and responsive layouts, C64 OS's Toolkit won't implement anything like that. Firstly, it's far too complex for the C64's memory constraints, and secondly, the size and orientation of a C64's screen has been what it is since 1982. There would be little advantage to having a complex system of constraints meant to adapt a UI flexibly to a variety of different screen sizes and orientations.

Each view must be able to describe its size and position, relative to its parent. To handle this the View class provides several metrics properties, which all other Toolkit classes inherit from View. These properties are:

  • view_top
  • view_bot
  • view_abot
  • view_hght

  • view_left
  • view_rght
  • view_argt
  • view_wdth

  • view_rsmask

First let's look at view_rsmask. This is a bitfield for flags that affect resizing behavior. At the time of this writing, the low nybble is well defined, but exactly how the upper nybble is being used is still in flux while I write the code so I won't talk about those yet. Here are the values for the lower four bits.

  • %0000 0001 — rs_ankt
  • %0000 0010 — rs_ankb
  • %0000 0100 — rs_ankl
  • %0000 1000 — rs_ankr

These flags stand for: Anchor Top, Anchor Bottom, Anchor Left, and Anchor Right. These declare which sides of the view have a fixed offset from its parent. These define which of the other 8 metrics properties are pre–defined and which are computed dynamically. So let's look at some examples of how this might work.

A view must have at least one vertical anchor, and at least one horizontal anchor. If no vertical anchor is set it defaults to top and no horizontal anchor defaults to left. When the view is anchored, say, to the top, the view_top property defines the distance (in 8x8 cells, not pixels) that the view's top edge sits down from the top edge of its parent. These values are all unsigned 16-bit, so a view cannot be offset negatively from its parent. A view can either be flush with the top of its parent or any offset down, upto 65535 text rows, from its parent's top.

If the view is anchored top, but not bottom, then its view_hght property must be set and is used to figure out how tall the view is. In such a case, vertically resizing the view's parent has no affect on its own height. The situation is similar if the view is anchored bottom but not top. The view_bot property holds the number of rows that this view's bottom edge is positioned up from the bottom of its parent. The view_hght is still relevant to determine how tall the view is and it is still unaffected by vertical resizes to its parent.

If the description is hard to follow, here's a visualization that should help.

Visualization of Top and Bottom anchoring

The anchor flags can be OR'd together, of course. So rs_mask could be: rs_ankt | rs_ankb.

Things get more complicated when a view is anchored both top and bottom. You can see how this works in the third example above. view_top defines how far its top edge is from its parent's top, and view_bot defines how far its bottom edge is from its parent's bottom. But when the view is anchored on both sides then when the parent is resized vertically the height of the view changes, tracking the height changes of its parent.

Whatever way the view is anchored, some of its properties get computed automatically. Let's start with view_hght. If the view is rs_ankt | rs_ankb, the view_hght is computed and set automatically. When it comes to actually drawing the view, what the drawing code really needs to know is the absolute top and absolute bottom offsets, from the draw origin, for where the view will render after any anchoring and positioning logic has been applied. As it happens the draw origin is at the top,left of any given on–screen rectangle. The reason for this is because of the way the VIC-II's memory addresses map to positions on the screen. The top,left corner of the screen is the smallest memory address. And the bottom,right corner of the screen is the biggest memory address. This is true for any arbitrary rectangle you draw on the screen. The top,left corner of that rectangle will always have the smallest memory address of any address within that rectangle.

The consequence of this is that view_top, is both the relative offset from the top of the parent, but it's also the absolute top of the view from the draw origin. This is not true of view_bot. view_bot is relative to the bottom of the parent. So the smaller the view_bot value, the lower that edge goes on the screen. A view_bot value of 5 says nothing about where the bottom edge of the view is going be relative to the draw origin. That's what view_abot is about. view_abot stands for absolute bottom, and it is always a computed property. If the view is rs_ankt, then view_abot is view_top + view_hght. If the view is rs_ankb, then the height of the parent has to be taken into account. view_abot is parent->view_hght - view_bot. And view_top is then computed as view_abot - view_hght.

Visualization of Absolute Bottom computed property

At the end of the day, no matter how the view is anchored, the resizenode routine (called internally, never manually), makes sure that all 4 properties are set correctly: view_top, view_bot, view_abot, and view_hght. After this point, the drawing code can completely disregard any positioning complexities due to anchoring and offsets. It simply draws the view between view_top and view_abot, and view_hght is readily available as a reference. Drawing is way beyond the scope of this post, however, and I'll have to return to it at a later date.

I intentionally limited myself in the above description only to the vertical sizing and anchoring properties. The horizontal sizing and anchoring work in exactly the same way. view_left is analogous to view_top. It is relative to the left side of the parent, but also is the absolute offset from the left coordinate of the draw origin. Therefore a view_argt (absolute right) is computed depending on how the left and right anchoring are configured. The logic here is exactly the same, just along a different axis. And of course, rs_ankl and rs_ankr are represented by different bits in the view_rsmask than rs_ankt and rs_ankb, so all 4 can be set independently and simultaneously.

One more brief note before moving on to a different topic. The above description does not even attempt to broach the issue of scrolling, and what effect scrolling has on the calculation of where a view's children will render, or how they get clipped if they are only partially visible. This is of course all taken into consideration in the design of Toolkit, but is way beyond the scope of this introduction.

As we'll no doubt see in concrete examples in future posts, I believe this anchoring and offsets system (which is based loosely on springs and struts which predate autolayout constraints in cocoa/cocoa touch), can be used to make user interfaces that are very flexible. Probably more flexible than anything else available on the C64, and yet simple enough to be eminently suitable for use in C64 OS.

Event Propagation

One of the main talking points I use when describing C64 OS is that it is event driven. Part of what this means has already been discussed in an earlier post about The Event Model. The IRQ service routine, which in C64 OS is implemented in the service module, updates the mouse and keyboard which converts user input activity into 3 queues of input events. Mouse events, Key Command events and Printable Key events. The mouse events are particularly relevant to Toolkit because the mouse cursor is passing over Toolkit views rendered on–screen, and the user is clicking on those views feeling him or herself to be interacting with them.

However, mouse events contain only: screen coordinates, key modifier flags and an event type. Somehow a simple event struct like this has to convert into meaningful interaction with the underlying on–screen display.

The Toolkit needs to be able to determine the target view, which is the first view that has an opportunity to do something with the event. And then each view needs a way to figure out how to pass notification about the existence of the event if it can't or doesn't handle it itself. This behavior is called event propagation.

Every class in the Toolkit needs to know how to, at a minimum, propagate an event to another view. And so for this reason the View class has a set of methods for event propagation. And every other view subclass automatically inherits the basic ability to propagate events.

To handle the first requirement, Toolkit has a hit test routine. This routine effectively walks the node hierarchy, recursively, searching for the uppermost view that is rendered under the screen coordinates where the mouse event occurred. This is actually easier than it sounds. All of a view's child views are constrained (clipped even, when drawn) to within the bounds of their parent. Therefore, if the mouse event does not fall within the bounds of a given view none of its subviews, or their subviews, etc. need to be checked. Siblings must be checked, however, and it is possible for two sibling views to overlap each other.

This introduces a small complication, because first children are drawn first, and their siblings are drawn later. This means if a late drawn sibling has view metrics that cause it to overlap with with an earlier sibling, it will render above that earlier sibling. What this means is that when doing hit testing, the walking routine must start with the last sibling and if the event did not occur within its bounds the hit test should then move to the previous sibling. Otherwise, sibling 1 could claim the hit, even though it is covered by a section of sibling 2 that the user believes he or she is clicking on. See the visualization below to understand how this works.

Visualization of hit testing priority on overlapping siblings

In the example above, first and second child are two children of the same parent. Their metrics are such that their bounds overlap. The red point represents where the user clicked. That coordinate is technically within the bounds of both children. However, because the first child renders before the second child, the second child draws itself overtop of the first child, where they overlap. Thus, when performing the hit test, it is necessary to test the second child prior to the first child.

Actually propagating the event is the second requirement, once a final leaf node (a node which has no children of its own) is found to be the target that node needs to get the event. In C64 OS, the event data itself is not passed, i.e. it is not copied in memory, to the target node. The node is merely notified that an event is targeting it. Toolkit does this by calling the appropriate event routine. The View class has 3 event handling routines, one for each of the 3 event types, mouse, key command and printable key. In our example of the mouse event, then, the Toolkit calls the target node's view_mouse routine.

This merely informs the view that a mouse event is available. If the view handles mouse events in some way, then it can call readmouse in the input module. This will give it the current mouse event details. I'll discuss this further in the responder section below.

By default, the View class is mostly just a generic superclass and container for multiple children. The default implementation of view_mouse is not to read or analyze the event details at all, but just to propagate the event to the next view. It does this simply by calling view_mouse on its own parent in the node hierarchy. This continues up the node hierarchy until a node either handles the event and returns without propagating it further, or the root node is encountered. The root node has no parent node to propagate the event to, so it simply returns.

Key events, both Key Commands and Printable Key events, are handled similarly to each other. They inherently do not contain screen coordinates, so they do not target arbitrary views in the node hierarchy. Instead, they target special views that claim keyboard focus. Only one view may claim focus at a time. Toolkit maintains a pointer to that view, if no view wants to be in focus, or if a view in focus wants to lose its focus (also lovely said to blur) then the root view is assigned as the focus view.

A lower level than the Toolkit, the screen module, which implements the main event loop, also divides the screen renderer into 4 compositing layers. The top most (4th) layer is fixed as the layer onto which the menubar (with CPU busy indicator, top level menus, and clock) and any pull-down menus are rendered. This causes them to always render above everything else. The lower three screen layers can be pushed and pulled by the application from a 3-layer stack. Typically the application's initialization routine pushes layer 1 onto the screen stack. Each screen layer struct has pointers to 4 routines:

  • sldraw
  • slmouse
  • slkcmd
  • slkprnt

During initialization, the application, if it makes 100% use of the Toolkit, can wire these four routines directly to the four equivalent routines of Toolkit. So that when the main event loop interacts with the low level screen compositor the Toolkit routines are called directly without needing the application to intermediate. The separation of screen compositor and Toolkit is to allow a more advanced application to completely forgo use of the Toolkit and handle the low level events manually (most useful for a game or other special purpose), or to share responsibility with the Toolkit. The application can get the low level events first, do with them as it pleases and choose when and if to forward them to the Toolkit.

That was a bit of a tangent. Regardless of how the Key Command and Printable Key events are handed over to the Toolkit, it directs them to the view in focus. And does this by calling that view's view_kcmd or view_kprnt routine respectively.

These routines, in their default implementation by the View class, do exactly what view_mouse does. They merely propagate the events, passing notification, by calling the same routine on their parent node, until something handles the event or the root node is encountered.


The last topic I'll cover in this Toolkit Introduction, is responding. As we saw in event propagation above, although the events are propagated through the node hierarchy, the default behavior, implemented by the View class, doesn't do anything. Except pass the notification to the next view which does equally little. The question is, how does anything get accomplished by means of these events?

All Toolkit classes descend from View, and so they inherit this do–nothing–but–propagate behavior for free. Some subclasses however are meant to respond to events. In some cases this response behavior is entirely internal to the class, requiring no special involvement from the application code. An example of this is an input field. An input view is meant to allow the user to input and edit the text that the field manages. If a mouse event is propagated to an input the input reads the mouse event details from the screen module. If its type is a click event, it tells the Toolkit that it should be the new focus view for all incoming key events. And it configures the system's cursor settings so the cursor begins blinking in the right place, and it returns without propagating the click event any further.

Input events also have a disabled flag, however, which can be configured by the application's code at runtime. If the input receives notice that a mouse event targets it, its first checks its own disabled flag. If it is disabled, it calls its superclass's implementation of this routine. This is very likely to be View's implementation, which as we already know, propagates the event to the input's parent view. Neither of these responses to the event necessitate calling back to the application.

Other subclasses of view, such as button, have both internal and external responses to mouse events. When the left mouse button goes down, it generates a leftdown event. This event is propagated, but when a button gets this event, it marks itself to be redrawn, and when it redraws it redraws with an inverted color. This is an entirely internal behavior, it requires no interaction with the application to appear to be highlighting in response to the user mousing down on it. Similarly when the button goes up, a leftup event is generated. (How mouse up events are routed is a bit complicated, and out of scope for this introduction). The button responds to a leftup by redrawing itself without the inverted color, thus appearing to no longer be highlighted.

Buttons are different than Inputs in at least one important way though, when the input was clicked it prepared itself to handle key events, which is an internal change. However, a button has no inherent, internal change as a result of being clicked. The application needs to be informed that the button was clicked. For this reason, the button subclass adds an additional but_clicked property. This is a pointer to a routine. When the application is being initialized, or at some later time in response to some other activity, the application sets the button's but_clicked pointer to a routine the application has implemented.

The button's own reimplementation of view_mouse receives all mouse events as part of the node hierarchy event propagation system. But button's implementation of this routine calls readmouse to get the event details. If it's leftdown or leftup it changes its highlight state as described earlier, but if it's leftclick it checks to see if its but_clicked property is set. If it is, it calls it, passing a pointer to itself in .X and .Y. And then returns without propagating the event.

The application's own routine is now running. It can, in the simple case, do some fixed behavior, something that should always be done as a result of this button having been pressed. Under more complex conditions, more than one button may be set up to call the same clicked routine. The routine needs to distinguish which button was clicked that triggered the call. It can do this by comparing the pointer to the button in .X and .Y with some reference the app maintains to the that button, or it can write the pointer to a ZP address and lookup and access properties on the button itself. This could be used to change the button's text, or change its hidden state, or change its layout metrics, or read its TK_ID to figure out by ID which button this is to figure out what to do next.

Okay, so that concludes this rather long blog post that I've been putting off writing for so long as the Toolkit module has been under such active design and development. In fact, just in writing this post it's given me some ideas about how to improve the inheritance model.

Inheritance was the topic I left off at the end of Object Orientation in 6502 with just a little preview. The truth is, I hadn't quite figured out in my head the full logic of how it would be possible, in your own application, to create custom subclasses of Toolkit classes. But I think I'm almost there.

So, expect that post to be coming sometime in the not too distant future, a part two to Object Orientation in 6502. And after that it will make sense to return to look in more detail at how some of the Toolkit views make use of their object orientation, especially when it comes to drawing themselves.

Lastly, I suspect large swaths of this post will eventually make it into the C64 OS technical documentation. Especially the images I put together to illustrate some of the ideas.

  1. Although it's neither the fastest nor the most reliable way to host a website, I host many websites on my own hardware and internet infrastructure, because it's something of an enjoyable hobby for me. I offload heavy assets to an AWS S3 bucket to distribute load and cut down on bandwidth costs. This doesn't help when we lose internet altogether as was the case this past week. []
  2. It's a bit tricky to tell the difference between a method and a direct access to a property in modern languages like Swift. The notation to read a property is exactly the same notation as to call a method. And the underlying implementation can be changed from a direct read to a method call at any time without changing the code that accesses it. 164 comes from my rough count in the documentation, and includes property accessors. []
  3. Every feature built into the toolkit classes are features that don't have to be explicitly implemented by the application. Yet they automatically avail themselves simply because the UI is built from these standard classes. Think about momentarily inverting the color of a button to provide feedback when it is clicked. That comes free in every C64 OS application. []
August 15, 2017Programming Theory

Organizing Module Layout

At the beginning of the year I wrote a post, Organizaing a Big Project. At that time, I had just split apart what had been a monolithic code file into a series of more managable modules. At the time of this writing they're documented in the C64 OS technical documentation as:

  1. memory
  2. petscii
  3. input
  4. string
  5. screen
  6. menu
  7. service
  8. file
  9. toolkit, and
  10. network

However, this list is in such flux that even though that documentation was updated just 2 weeks ago, this list of modules has already changed. I've merged petscii, (which only had 3 exports: asc2pet, pet2asc and pet2scr) into the string module which is likely to gain far more functionality in the coming months. Plus, I added a math module, 16-bit division and multiplication routines that started their life in toolkit due to their usefulness there, have been moved to their own module so they can be reused by other modules and by C64 OS applications. And lastly, the system jump table has been extricated from memory and made into its own module, thus making the memory module more of a peer to the others.

Suffice to say, the modules are in a state of flux. As I add new routines, optimize old routines and move routines from one module to another, the assembled object sizes of the modules keeps changing. However, they have to be packed together in memory, such that the end of one module is followed by the start of another module. If they overlap the module lower in memory will overwrite the first part of the module higher in memory that's already loaded in. Thus corrupting it and leading to instant and hair-pulling crashes. If there is space left between the modules, well, then that's just wasted memory that can never be allocated, found or utilized.

In addition to the problem of packing the modules together in memory, the main system jump table also has to know where each module starts, so that it knows where to find the module's table of vectors through which to jump. If that weren't enough, there is also the issue that the jump table itself is in a state of flux. With the addition of the math module, I just added mul16 and div16 to the middle to the jump table. Thus, the modules that consume the exports from other modules have to know where in the jump table to find the routine.

Each of these issues has a solution that I worked out at the beginning of the year. But how these solutions work has proven to be quite laborious for me to upkeep.

In order to pack the modules, I have to know how big each is. So I start with the first, the one highest in memory, and assemble it to somewhere arbitrary. The assembler tells me the start and end address of the object code. (And hopefully I don't get any phase errors, which seem to be fairly easy to produce if you use a label prior to defining it.) I write down that size. Then I go to the next module, load it up, assemble it and record its size. I do this for all 10 (or so) modules.

Then I pull out my calculator, a hex/oct/bin/dec converting Casio my brother helpfully gave me for christmas many moons ago. And I start with the last address plus one, so $CFFF+1, or $D000. And subtract the size of the first module to find where it should start, and I write that down. From that address I subtract the size of the next module, and write down its start address. Repeat this for all the modules. Eventually I end up with a table like this:

Module size and start address table

You can see in the image of my notebook that as I work on the code, I have to recompute the sizes and start addresses when it comes time to test my work. Sometimes, such as in the top middle column (second set of size/start columns), I've only worked on a few modules, so I can leave the others alone and just recompute the offsets for the few I worked on. Still, this process is painstaking and boring, and prone to mistakes leading to nasty bugs.

After I've calculated and written down this table of start addresses, I have to go back to each module, open its main file and manually set the initial start address. (*= $ce34) for memory, for example. Then I have to reassemble this module to an object file.

The system Jump Table is another issue. The Jump Table needs to know where each of these modules starts, so I open that module and update a set of labels with these new start addresses. Each jmp is an indirect jump through a vector found at the start of the module + the offset to the routine. If the module exports 5 routines, then it starts with a 10-byte table of five 2-byte vectors. And the Jump Table correspondingly has 5 entries.

  JMP (input+0)
  JMP (input+2)
  JMP (input+4)
  JMP (input+6)
  JMP (input+8)

That sort of thing. So every time a module grows in size, not only does it have to be reassembled, but every module lower in memory shifts down and has to be reassembled too. And then the jump table has to be reassembled also.

This is a big pain. But, let's not forget about how a module that wants to call an exported routine from another module finds the entry in the jump table. Each module has a .h header file that defines labels for each routine, and sets them to the position within the jump table for the module's block of jumps, plus the 3-byte offset. Such as:

  readmouse = inputbase+0
  mouserc   = inputbase+3
  deqmouse  = inputbase+6
  readkcmd  = inputbase+9
  deqkcmd   = inputbase+12
  readkprnt = inputbase+15
  deqkprnt  = inputbase+18


Thus we have another file that needs to be updated. If a routine is added to a module that needs an entry in the system Jump Table, it will offset the jump table base address for preceding modules. They then need to have their header files updated… and any module that includes that .h file and calls one of those routines needs to be reassembled so they have the right jump table offsets.

Needless to say, it's easy to forget any of these many places that need to be updated. And it's hard to remember exactly which modules depend on which other modules. I end up having load the source code for each module and look through their includes to see if they include an affected header file.

I need a better way to work this, because the burden is so heavy that its discouraging me from wanting to work on the project. And it gets worse the more code I write and the more modules I add.

Finally, it occurred to me that all those calculations of offsets that I'm doing could actually be done by the assembler, if only it had some basic information:

  • How big is each module, and
  • How many routines does the module export

Instead of writing these numbers down on paper, I could put them all together into a header file to be included by everything that needs one of these numbers. Here's what it ended up looking like:

The modules.h header file, top The modules.h header file, bottom

The top half of the file declares all the variable data I need to provide, in one place. One label for each module (prefixed with x for exports), for how many routines are exported by the module. One label for each module (prefixed with y, because it's close to x), for the size of the assembled module. Unfortunately I still have to manually determine the size of a module, however this only needs to be calculated for the modules I'm working on, which need to be reassembled anyway.

The bottom half of the file is where all the magic happens. First, the highest thing in memory is the Jump Table. And as of just a few nights ago, this is no longer part of the memory module but is its own stand alone file. One label is defined for each module (prefixed with j for jump table). The start of the jump table entries for memory are $d000 (hardcoded as the last address + 1), minus the number of exports for this module times 3. Three because each jmp() takes three bytes.

This gives us a label jmemory that indicates where the jump table entries for memory begin. But this also becomes the relative start address for where the next module's jump table entries begin. Thus jinput starts at jmemory minus xinput*3. And so on, all the way through all the modules. These offset labels are computing automatically simply by adjusting the number of exports each module offers at the top of the file.

The memory module is now just a standard module, structurally the same as any of the others, with its own table of vectors to its exported routines. It's start address is going to be offset from the final (lowest in memory) jump table block. There is one label per module (prefixed with s for start address). The Jump Table itself is first module, but its start address is the same as whatever was the address the final jump table block. In this case it's jtoolkit simply because toolkit is the last module. Thus there is a label, sjumptbl that is set equal to jtoolkit. smemory, the start address for the memory module therefore is sjumptbl minus the size ymemory, the size of the memory module. And the next module starts from smemory minus its own size, and so on through the modules. Thus, modules.h is a central place where all the variable data goes, the size of each module and the count of its exports, and it produces a set of labels that are j- prefixed and s- prefixed for the start addresses of each module's jump table block and code, respectively.

How can this be used, now?

Each module includes modules.h. Then sets is own assemble address to its s-prefixed label. Toolkit for example includes modules.h and positions itself by declaring *= stoolkit.

Each module's .h header file declares the labels for its jump table entries as offsets from the start of its j- prefixed label. These declarations are always right and so the header files themselves don't need to be touched.

  readmouse = jinput+0
  mouserc   = jinput+3
  deqmouse  = jinput+6
  readkcmd  = jinput+9
  deqkcmd   = jinput+12
  readkprnt = jinput+15
  deqkprnt  = jinput+18


And lastly, you have the jump table itself. The actual jmp() instructions still need to be written, because Turbo Macro Pro (the native version) doesn't support the code generating pseudo labels that TMPx provides. But where those JMPs jump to is computed dynamically by the s- prefixed labels simply by including modules.h. Thus:

  JMP (sinput+0)
  JMP (sinput+2)
  JMP (sinput+4)
  JMP (sinput+6)
  JMP (sinput+8)
Module size and export count table

I still need to keep track of some things on paper. My table now looks like what is shown in the image above. But I don't have to make any of the intermediate calculations. And I don't have to enter any of those intermediate values manually across a myriad of files. I just update the modules.h file with the new modules' sizes and export counts. Then I update the jump table itself with the additional entries. And reassemble to object files everything that would be affected by the changes.

It's still quite a bit of work if I make changes to a module up high in memory that pushes the others below it to new start addresses. But, like my recursive backup script it saves a huge amount of effort and takes a big burden off my shoulders, so I can spend more time coding and less time worrying about how to fit things into memory.

There is one last interesting benefit that has popped up. If I decide I'm going to start working on the string module a lot, because I'm building out the set of routines for string editing and manipulation, ordinarily I'd be in a bit of a tough spot. string is 3rd highest in memory, with six or more modules below it. I'd hate to have to reassemble 7 modules(!) every time I make string a bit bigger.

Well, actually I don't have to. It's easy peasy in the modules.h file simply to move the string module to the bottom of the pile. Its jump table entries can stay where they are, although it would be handy to build out the entries in the jump table, even with placeholders for routines I know will be there soon. There is nothing inherently special about having string be the 3rd module instead of the 9th. It was just bothersome having to redo all the math.

Now that modules.h does all that math, it becomes trivial to rearrange the modules. If I know I'm going to work on a specific couple for a few days, I can just move them to the bottom of the pile at the start of my session, and all of a sudden the cycle time for testing them becomes very short again.

Just a quick update on toolkit. It has been the primary focus of the last month or so of work. I've put a lot of thought and a lot of code into it. However, it's still so in flux in my head that I haven't wanted to commit to writing anything about it yet. But soon. I've just recently worked out what I think will be a big score in making the drawing of views and clipping them to the insides of their bounds rect much more efficient and easier to program. As soon as I get a slightly better handle on how well that's going to work out, I plan to write my first blog post discussing the general architecture of the toolkit. Stay tuned.

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

Every good Commodore 64 programmer needs to have the 6502/6510 instruction set at his or her fingertips. There are already many reference texts like this out there, however I find all of them to be lacking.

It is my goal for the presentation below to be the fastest, easiest–to–use and best organized 6502/6510 instruction set reference text on the internet. If you have ideas for how I can improve it, please let me know in the comments.

Full credit to 6502.org for the original source of this content, which I have formatted, rearranged, styled and fixed the HTML.

Alphabetically Ordered






































Execution Times

Op code execution times are measured in machine cycles; one machine cycle equals one clock cycle. Many instructions require one extra cycle for execution if a page boundary is crossed; these are indicated by a + following the time values shown.

Notes Links

Bitwise Instructions

AND (bitwise AND with accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     AND #$44      $29  2   2
Zero Page     AND $44       $25  2   3
Zero Page,X   AND $44,X     $35  2   4
Absolute      AND $4400     $2D  3   4
Absolute,X    AND $4400,X   $3D  3   4+
Absolute,Y    AND $4400,Y   $39  3   4+
Indirect,X    AND ($44,X)   $21  2   6
Indirect,Y    AND ($44),Y   $31  2   5+

+ add 1 cycle if page boundary crossed

EOR (bitwise Exclusive OR with accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     EOR #$44      $49  2   2
Zero Page     EOR $44       $45  2   3
Zero Page,X   EOR $44,X     $55  2   4
Absolute      EOR $4400     $4D  3   4
Absolute,X    EOR $4400,X   $5D  3   4+
Absolute,Y    EOR $4400,Y   $59  3   4+
Indirect,X    EOR ($44,X)   $41  2   6
Indirect,Y    EOR ($44),Y   $51  2   5+

+ add 1 cycle if page boundary crossed

ORA (bitwise OR with Accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     ORA #$44      $09  2   2
Zero Page     ORA $44       $05  2   3
Zero Page,X   ORA $44,X     $15  2   4
Absolute      ORA $4400     $0D  3   4
Absolute,X    ORA $4400,X   $1D  3   4+
Absolute,Y    ORA $4400,Y   $19  3   4+
Indirect,X    ORA ($44,X)   $01  2   6
Indirect,Y    ORA ($44),Y   $11  2   5+

+ add 1 cycle if page boundary crossed

ASL (Arithmetic Shift Left)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ASL A         $0A  1   2
Zero Page     ASL $44       $06  2   5
Zero Page,X   ASL $44,X     $16  2   6
Absolute      ASL $4400     $0E  3   6
Absolute,X    ASL $4400,X   $1E  3   7

ASL shifts all bits left one position. 0 is shifted into bit 0 and the original bit 7 is shifted into the Carry.

LSR (Logical Shift Right)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   LSR A         $4A  1   2
Zero Page     LSR $44       $46  2   5
Zero Page,X   LSR $44,X     $56  2   6
Absolute      LSR $4400     $4E  3   6
Absolute,X    LSR $4400,X   $5E  3   7

LSR shifts all bits right one position. 0 is shifted into bit 7 and the original bit 0 is shifted into the Carry.

ROL (ROtate Left)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ROL A         $2A  1   2
Zero Page     ROL $44       $26  2   5
Zero Page,X   ROL $44,X     $36  2   6
Absolute      ROL $4400     $2E  3   6
Absolute,X    ROL $4400,X   $3E  3   7

ROL shifts all bits left one position. The Carry is shifted into bit 0 and the original bit 7 is shifted into the Carry.

ROR (ROtate Right)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ROR A         $6A  1   2
Zero Page     ROR $44       $66  2   5
Zero Page,X   ROR $44,X     $76  2   6
Absolute      ROR $4400     $6E  3   6
Absolute,X    ROR $4400,X   $7E  3   7

ROR shifts all bits right one position. The Carry is shifted into bit 7 and the original bit 0 is shifted into the Carry.

Program Counter

When the 6502 is ready for the next instruction it increments the program counter before fetching the instruction. Once it has the op code, it increments the program counter by the length of the operand, if any. This must be accounted for when calculating branches or when pushing bytes to create a false return address (i.e. jump table addresses are made up of addresses-1 when it is intended to use an RTS rather than a JMP).

The program counter is loaded least signifigant byte first. Therefore the most signifigant byte must be pushed first when creating a false return address.

When calculating branches a forward branch of 6 skips the following 6 bytes so, effectively the program counter points to the address that is 8 bytes beyond the address of the branch opcode; and a backward branch of $FA (256-6) goes to an address 4 bytes before the branch instruction.

Branch Instructions

Affect Flags: none

All branches are relative mode and have a length of two bytes. Syntax is "Bxx Displacement" or (better) "Bxx Label". See the notes on the Program Counter for more on displacements.

Branches are dependant on the status of the flag bits when the op code is encountered. A branch not taken requires two machine cycles. Add one if the branch is taken and add one more if the branch crosses a page boundary.

MNEMONIC                       HEX
BPL (Branch on PLus)           $10
BMI (Branch on MInus)          $30

BVC (Branch on oVerflow Clear) $50
BVS (Branch on oVerflow Set)   $70

BCC (Branch on Carry Clear)    $90
BCS (Branch on Carry Set)      $B0

BNE (Branch on Not Equal)      $D0
BEQ (Branch on EQual)          $F0

There is no BRA (BRanch Always) instruction but it can be easily emulated by branching on the basis of a known condition. One of the best flags to use for this purpose is the oVerflow which is unchanged by all but addition and subtraction operations.

A page boundary crossing occurs when the branch destination is on a different page than the instruction AFTER the branch instruction. For example:


A page boundary crossing occurs (i.e. the BCS takes 4 cycles) when (the address of) LABEL and the NOP are on different pages. This means that


the BVC instruction will take 3 cycles no matter what address it is located at.

Compare Instructions

CMP (CoMPare accumulator)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CMP #$44      $C9  2   2
Zero Page     CMP $44       $C5  2   3
Zero Page,X   CMP $44,X     $D5  2   4
Absolute      CMP $4400     $CD  3   4
Absolute,X    CMP $4400,X   $DD  3   4+
Absolute,Y    CMP $4400,Y   $D9  3   4+
Indirect,X    CMP ($44,X)   $C1  2   6
Indirect,Y    CMP ($44),Y   $D1  2   5+

+ add 1 cycle if page boundary crossed

Compare sets flags as if a subtraction had been carried out. If the value in the accumulator is equal or greater than the compared value, the Carry will be set. The equal (Z) and sign (S) flags will be set based on equality or lack thereof and the sign (i.e. A>=$80) of the accumulator.

CPX (ComPare X register)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CPX #$44      $E0  2   2
Zero Page     CPX $44       $E4  2   3
Absolute      CPX $4400     $EC  3   4

Operation and flag results are identical to equivalent mode accumulator CMP ops.

CPY (ComPare Y register)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CPY #$44      $C0  2   2
Zero Page     CPY $44       $C4  2   3
Absolute      CPY $4400     $CC  3   4

Operation and flag results are identical to equivalent mode accumulator CMP ops.

BIT (test BITs)

Affects Flags: N V Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     BIT $44       $24  2   3
Absolute      BIT $4400     $2C  3   4

BIT sets the Z flag as though the value in the address tested were ANDed with the accumulator. The S and V flags are set to match bits 7 and 6 respectively in the value stored at the tested address.

BIT is often used to skip one or two following bytes as in:

CLOSE1 LDX #$10   If entered here, we
.BYTE $2C         effectively perform
CLOSE2 LDX #$20   a BIT test on $20A2,
.BYTE $2C         another one on $30A2,
CLOSE3 LDX #$30   and end up with the X
CLOSEX LDA #12    register still at $10
STA ICCOM,X       upon arrival here.

Beware: a BIT instruction used in this way as a NOP does have effects: the flags may be modified, and the read of the absolute address, if it happens to access an I/O device, may cause an unwanted action.

Processor Flags

The Interrupt flag is used to prevent (SEI) or enable (CLI) maskable interrupts (aka IRQ's). It does not signal the presence or absence of an interrupt condition. The 6502 will set this flag automatically in response to an interrupt and restore it to its prior status on completion of the interrupt service routine. If you want your interrupt service routine to permit other maskable interrupts, you must clear the I flag in your code.

The Decimal flag controls how the 6502 adds and subtracts. If set, arithmetic is carried out in packed binary coded decimal. This flag is unchanged by interrupts and is unknown on power-up. The implication is that a CLD should be included in boot or interrupt coding.

The Overflow flag is generally misunderstood and therefore under-utilised. After an ADC or SBC instruction, the overflow flag will be set if the twos complement result is less than -128 or greater than +127, and it will cleared otherwise. In twos complement, $80 through $FF represents -128 through -1, and $00 through $7F represents 0 through +127. Thus, after:

LDA #$7F ;   +127
ADC #$01 ; +   +1

the overflow flag is 1 (+127 + +1 = +128), and after:

LDA #$81 ;   -127
ADC #$FF ; +   -1

the overflow flag is 0 (-127 + -1 = -128). The overflow flag is not affected by increments, decrements, shifts and logical operations i.e. only ADC, BIT, CLV, PLP, RTI and SBC affect it. There is no op code to set the overflow but a BIT test on an RTS instruction will do the trick.

Flag Instructions (Processor Status)

Affect Flags: as noted

These instructions are implied mode, have a length of one byte and require two machine cycles.

MNEMONIC                    HEX LEN TIM
CLC (CLear Carry)           $18  1   2
CLI (CLear Interrupt)       $58  1   2
CLV (CLear oVerflow)        $B8  1   2
CLD (CLear Decimal)         $D8  1   2

SEC (SEt Carry)             $38  1   2
SEI (SEt Interrupt)         $78  1   2
SED (SEt Decimal)           $F8  1   2

Jump Instructions


Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Absolute      JMP $5597     $4C  3   3
Indirect      JMP ($5597)   $6C  3   5

JMP transfers program execution to the following address (absolute) or to the location contained in the following address (indirect). Note that there is no carry associated with the indirect jump so:


For example if address $3000 contains $40, $30FF contains $80, and $3100 contains $50, the result of JMP ($30FF) will be a transfer of control to $4080 rather than $5080 as you intended i.e. the 6502 took the low byte of the address from $30FF and the high byte from $3000.

JSR (Jump to SubRoutine)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Absolute      JSR $5597     $20  3   6

JSR pushes the address-1 of the next operation on to the stack before transferring program control to the following address. Subroutines are normally terminated by an RTS op code.

RTS (ReTurn from Subroutine)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Implied       RTS           $60  1   6

RTS pulls the top two bytes off the stack (low byte first) and transfers program control to that address+1. It is used, as expected, to exit a subroutine invoked via JSR which pushed the address-1.

RTS is frequently used to implement a jump table where addresses-1 are pushed onto the stack and accessed via RTS eg. to access the second of four routines:

LDX #1




RTI (ReTurn from Interrupt)

Affects Flags: all

MODE          SYNTAX        HEX LEN TIM
Implied       RTI           $40  1   6

RTI retrieves the Processor Status Word (flags) and the Program Counter from the stack in that order (interrupts push the PC first and then the PSW).

Note that unlike RTS, the return address on the stack is the actual address rather than the address-1.

Math Instructions

ADC (ADd with Carry)

Affects Flags: S V Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     ADC #$44      $69  2   2
Zero Page     ADC $44       $65  2   3
Zero Page,X   ADC $44,X     $75  2   4
Absolute      ADC $4400     $6D  3   4
Absolute,X    ADC $4400,X   $7D  3   4+
Absolute,Y    ADC $4400,Y   $79  3   4+
Indirect,X    ADC ($44,X)   $61  2   6
Indirect,Y    ADC ($44),Y   $71  2   5+

+ add 1 cycle if page boundary crossed

ADC results are dependant on the setting of the decimal flag. In decimal mode, addition is carried out on the assumption that the values involved are packed BCD (Binary Coded Decimal).

There is no way to add without carry.

SBC (SuBtract with Carry)

Affects Flags: S V Z C

MODE          SYNTAX        HEX  LEN TIM
Immediate     SBC #$44      $E9   2   2
Zero Page     SBC $44       $E5   2   3
Zero Page,X   SBC $44,X     $F5   2   4
Absolute      SBC $4400     $ED   3   4
Absolute,X    SBC $4400,X   $FD   3   4+
Absolute,Y    SBC $4400,Y   $F9   3   4+
Indirect,X    SBC ($44,X)   $E1   2   6
Indirect,Y    SBC ($44),Y   $F1   2   5+

+ add 1 cycle if page boundary crossed

SBC results are dependant on the setting of the decimal flag. In decimal mode, subtraction is carried out on the assumption that the values involved are packed BCD (Binary Coded Decimal).

There is no way to subtract without the carry which works as an inverse borrow. i.e, to subtract you set the carry before the operation. If the carry is cleared by the operation, it indicates a borrow occurred.


Use caution with indexed zero page operations as they are subject to wrap-around. For example, if the X register holds $FF and you execute LDA $80,X you will not access $017F as you might expect; instead you access $7F i.e. $80-1. This characteristic can be used to advantage but make sure your code is well commented.

It is possible, however, to access $017F when X = $FF by using the Absolute,X addressing mode of LDA $80,X. That is, instead of:

LDA $80,X    ; ZeroPage,X - the resulting object code is: B5 80

which accesses $007F when X=$FF, use:

LDA $0080,X  ; Absolute,X - the resulting object code is: BD 80 00

which accesses $017F when X = $FF (a at cost of one additional byte and one additional cycle). All of the ZeroPage,X and ZeroPage,Y instructions except STX ZeroPage,Y and STY ZeroPage,X have a corresponding Absolute,X and Absolute,Y instruction. Unfortunately, a lot of 6502 assemblers don't have an easy way to force Absolute addressing, i.e. most will assemble a LDA $0080,X as B5 80. One way to overcome this is to insert the bytes using the .BYTE pseudo-op (on some 6502 assemblers this pseudo-op is called DB or DFB, consult the assembler documentation) as follows:

.BYTE $BD,$80,$00  ; LDA $0080,X (absolute,X addressing mode)

The comment is optional, but highly recommended for clarity.

In cases where you are writing code that will be relocated you must consider wrap-around when assigning dummy values for addresses that will be adjusted. Both zero and the semi-standard $FFFF should be avoided for dummy labels. The use of zero or zero page values will result in assembled code with zero page opcodes when you wanted absolute codes. With $FFFF, the problem is in addresses+1 as you wrap around to page 0.

Memory Instructions

LDA (LoaD Accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDA #$44      $A9  2   2
Zero Page     LDA $44       $A5  2   3
Zero Page,X   LDA $44,X     $B5  2   4
Absolute      LDA $4400     $AD  3   4
Absolute,X    LDA $4400,X   $BD  3   4+
Absolute,Y    LDA $4400,Y   $B9  3   4+
Indirect,X    LDA ($44,X)   $A1  2   6
Indirect,Y    LDA ($44),Y   $B1  2   5+

+ add 1 cycle if page boundary crossed

LDX (LoaD X register)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDX #$44      $A2  2   2
Zero Page     LDX $44       $A6  2   3
Zero Page,Y   LDX $44,Y     $B6  2   4
Absolute      LDX $4400     $AE  3   4
Absolute,Y    LDX $4400,Y   $BE  3   4+

+ add 1 cycle if page boundary crossed

LDY (LoaD Y register)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDY #$44      $A0  2   2
Zero Page     LDY $44       $A4  2   3
Zero Page,X   LDY $44,X     $B4  2   4
Absolute      LDY $4400     $AC  3   4
Absolute,X    LDY $4400,X   $BC  3   4+

+ add 1 cycle if page boundary crossed

STA (STore Accumulator)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STA $44       $85  2   3
Zero Page,X   STA $44,X     $95  2   4
Absolute      STA $4400     $8D  3   4
Absolute,X    STA $4400,X   $9D  3   5
Absolute,Y    STA $4400,Y   $99  3   5
Indirect,X    STA ($44,X)   $81  2   6
Indirect,Y    STA ($44),Y   $91  2   6

STX (STore X register)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STX $44       $86  2   3
Zero Page,Y   STX $44,Y     $96  2   4
Absolute      STX $4400     $8E  3   4

STY (STore Y register)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STY $44       $84  2   3
Zero Page,X   STY $44,X     $94  2   4
Absolute      STY $4400     $8C  3   4


DEC (DECrement memory)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     DEC $44       $C6  2   5
Zero Page,X   DEC $44,X     $D6  2   6
Absolute      DEC $4400     $CE  3   6
Absolute,X    DEC $4400,X   $DE  3   7

INC (INCrement memory)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     INC $44       $E6  2   5
Zero Page,X   INC $44,X     $F6  2   6
Absolute      INC $4400     $EE  3   6
Absolute,X    INC $4400,X   $FE  3   7

Register Instructions

Affect Flags: S Z

These instructions are implied mode, have a length of one byte and require two machine cycles.

MNEMONIC                    HEX
TAX (Transfer A to X)       $AA
TAY (Transfer A to Y)       $A8
TXA (Transfer X to A)       $8A
TYA (Transfer Y to A)       $98

DEX (DEcrement X)           $CA
DEY (DEcrement Y)           $88
INX (INcrement X)           $E8
INY (INcrement Y)           $C8

Stack Instructions

These instructions are implied mode, have a length of one byte and require machine cycles as indicated. The "PuLl" operations are known as "POP" on most other microprocessors. With the 6502, the stack is always on page one ($100-$1FF) and works top down.

MNEMONIC                        HEX TIM
PHA (PusH Accumulator)          $48  3
PHP (PusH Processor status)     $08  3
PLA (PuLl Accumulator)          $68  4
PLP (PuLl Processor status)     $28  4

TSX (Transfer Stack ptr to X)   $BA  2
TXS (Transfer X to Stack ptr)   $9A  2

Other Instructions


Affects Flags: B

MODE          SYNTAX       HEX LEN TIM
Implied       BRK          $00  1   7

BRK causes a non-maskable interrupt and increments the program counter by one. Therefore an RTI will go to the address of the BRK +2 so that BRK may be used to replace a two-byte instruction for debugging and the subsequent RTI will be correct.

NOP (No OPeration)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Implied       NOP           $EA  1   2

NOP is used to reserve space for future modifications or effectively REM out existing code.


The original source for the above content is: http://6502.org/tutorials/6502opcodes.html

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

Here is the second most frequent table that I find myself using, after screen codes. The PETSCII table.

(dec, hex)
(up/gfx, lo/up)
(dec, hex)
(up/gfx, lo/up)
(dec, hex)
(up/gfx, lo/up)
(dec, hex)
(up/gfx, lo/up)
0 $00   64 $40 @ 128 $80   192 $C0
1 $01   65 $41 A a 129 $81 orange 193 $C1 A
2 $02   66 $42 B b 130 $82   194 $C2 B
3 $03 Stop 67 $43 C c 131 $83 Run 195 $C3 C
4 $04   68 $44 D d 132 $84   196 $C4 D
5 $05 white 69 $45 E e 133 $85 F1 197 $C5 E
6 $06   70 $46 F f 134 $86 F3 198 $C6 F
7 $07   71 $47 G g 135 $87 F5 199 $C7 G
8 $08 disable C=-Shift 72 $48 H h 136 $88 F7 200 $C8 H
9 $09 enable C=-Shift 73 $49 I i 137 $89 F2 201 $C9 I
10 $0A   74 $4A J j 138 $8A F4 202 $CA J
11 $0B   75 $4B K k 139 $8B F6 203 $CB K
12 $0C   76 $4C L l 140 $8C F8 204 $CC L
13 $0D Return 77 $4D M m 141 $8D Shift-Return 205 $CD M
14 $0E lo/up charset 78 $4E N n 142 $8E up/gfx charset 206 $CE N
15 $0F   79 $4F O o 143 $8F   207 $CF O
16 $10   80 $50 P p 144 $90 black 208 $D0 P
17 $11 cursor down 81 $51 Q q 145 $91 cursor up 209 $D1 Q
18 $12 reverse on 82 $52 R r 146 $92 reverse off 210 $D2 R
19 $13 Home 83 $53 S s 147 $93 Clear 211 $D3 S
20 $14 Delete 84 $54 T t 148 $94 Insert 212 $D4 T
21 $15   85 $55 U u 149 $95 brown 213 $D5 U
22 $16   86 $56 V v 150 $96 pink 214 $D6 V
23 $17   87 $57 W w 151 $97 dark grey 215 $D7 W
24 $18   88 $58 X x 152 $98 grey 216 $D8 X
25 $19   89 $59 Y y 153 $99 light green 217 $D9 Y
26 $1A   90 $5A Z z 154 $9A light blue 218 $DA Z
27 $1B   91 $5B [ 155 $9B light grey 219 $DB
28 $1C red 92 $5C pound 156 $9C purple 220 $DC
29 $1D cursor right 93 $5D ] 157 $9D cursor left 221 $DD
30 $1E green 94 $5E up arrow 158 $9E yellow 222 $DE
31 $1F blue 95 $5F left arrow 159 $9F cyan 223 $DF
32 $20 Space 96 $60 160 $A0 Shift-Space 224 $E0
33 $21 ! 97 $61 161 $A1 225 $E1
34 $22 " 98 $62 162 $A2 226 $E2
35 $23 # 99 $63 163 $A3 227 $E3
36 $24 $ 100 $64 164 $A4 228 $E4
37 $25 % 101 $65 165 $A5 229 $E5
38 $26 & 102 $66 166 $A6 230 $E6
39 $27 ' 103 $67 167 $A7 231 $E7
40 $28 ( 104 $68 168 $A8 232 $E8
41 $29 ) 105 $69 169 $A9 233 $E9
42 $2A * 106 $6A 170 $AA 234 $EA
43 $2B + 107 $6B 171 $AB 235 $EB
44 $2C , 108 $6C 172 $AC 236 $EC
45 $2D - 109 $6D 173 $AD 237 $ED
46 $2E . 110 $6E 174 $AE 238 $EE
47 $2F / 111 $6F 175 $AF 239 $EF
48 $30 0 112 $70 176 $B0 240 $F0
49 $31 1 113 $71 177 $B1 241 $F1
50 $32 2 114 $72 178 $B2 242 $F2
51 $33 3 115 $73 179 $B3 243 $F3
52 $34 4 116 $74 180 $B4 244 $F4
53 $35 5 117 $75 181 $B5 245 $F5
54 $36 6 118 $76 182 $B6 246 $F6
55 $37 7 119 $77 183 $B7 247 $F7
56 $38 8 120 $78 184 $B8 248 $F8
57 $39 9 121 $79 185 $B9 249 $F9
58 $3A : 122 $7A 186 $BA 250 $FA
59 $3B ; 123 $7B 187 $BB 251 $FB
60 $3C < 124 $7C 188 $BC 252 $FC
61 $3D = 125 $7D 189 $BD 253 $FD
62 $3E > 126 $7E 190 $BE 254 $FE
63 $3F ? 127 $7F 191 $BF 255 $FF


  1. Codes $00-$1F and $80-$9F are control codes. Printing them will cause a change in screen layout or behavior, not an actual character displayed.
  2. Codes $60-$7F and $E0-$FE are not used. Although you can print them, these are, actually, copies of codes $C0-$DF and $A0-$BE.
  3. Code $FF is the BASIC token of the π (pi) symbol. It is converted internally to code $DE when printed and, vice versa, code $DE is converted to $FF when fetched from the screen. However, when reading the keyboard buffer, you will find code $DE for Shift-↑ (up arrow) as no conversion takes place there yet.
  4. Cleaned up from the original source: http://sta.c64.org/cbm64pet.html
August 3, 2017Programming Reference

Commodore 64 Screen Codes

The new format of C64OS.com has opened up an exciting new possibility for me. With the new dedicated weblog section of the site, and its left sidebar, I can now pin feature posts to the top to make sure people who randomly visit the site will discover them.

But the sidebar actually now gives me a place to open up a whole new categorization system. I am always looking around the web for the various programming references and documentation that I need. Now, I can make blog posts that contain key programming reference material, and simply tag those posts "Programming Reference". And the sidebar can display a list of posts so tagged. Boom, instant access to all my most frequently used reference materials available in one place.

I hope this post series will be useful to other people too.

Screen code
(dec, hex; reverse dec, hex)
(up/gfx, lo/up)
Screen code
(dec, hex; reverse dec, hex)
(up/gfx, lo/up)
0 $00 128 $80 @ 64 $40 192 $C0
1 $01 129 $81 Aa 65 $41 193 $C1 A
2 $02 130 $82 B b 66 $42 194 $C2 B
3 $03 131 $83 C c 67 $43 195 $C3 C
4 $04 132 $84 D d 68 $44 196 $C4 D
5 $05 133 $85 E e 69 $45 197 $C5 E
6 $06 134 $86 F f 70 $46 198 $C6 F
7 $07 135 $87 G g 71 $47 199 $C7 G
8 $08 136 $88 H h 72 $48 200 $C8 H
9 $09 137 $89 I i 73 $49 201 $C9 I
10 $0A 138 $8A J j 74 $4A 202 $CA J
11 $0B 139 $8B K k 75 $4B 203 $CB K
12 $0C 140 $8C L l 76 $4C 204 $CC L
13 $0D 141 $8D M m 77 $4D 205 $CD M
14 $0E 142 $8E N n 78 $4E 206 $CE N
15 $0F 143 $8F O o 79 $4F 207 $CF O
16 $10 144 $90 P p 80 $50 208 $D0 P
17 $11 145 $91 Q q 81 $51 209 $D1 Q
18 $12 146 $92 R r 82 $52 210 $D2 R
19 $13 147 $93 S s 83 $53 211 $D3 S
20 $14 148 $94 T t 84 $54 212 $D4 T
21 $15 149 $95 U u 85 $55 213 $D5 U
22 $16 150 $96 V v 86 $56 214 $D6 V
23 $17 151 $97 W w 87 $57 215 $D7 W
24 $18 152 $98 X x 88 $58 216 $D8 X
25 $19 153 $99 Y y 89 $59 217 $D9 Y
26 $1A 154 $9A Z z 90 $5A 218 $DA Z
27 $1B 155 $9B[ 91 $5B 219 $DB
28 $1C 156 $9Cpound 92 $5C 220 $DC
29 $1D 157 $9D] 94 $5D 221 $DD
30 $1E 158 $9Eup arrow 94 $5E 222 $DE
31 $1F 159 $9Fleft arrow 95 $5F 223 $DF
32 $20 160 $A0Space 96 $60 224 $E0Shift-Space
33 $21 161 $A1! 97 $61 225 $E1
34 $22 162 $A2" 98 $62 226 $E2
35 $23 163 $A3# 99 $63 227 $E3
36 $24 164 $A4$ 100 $64 228 $E4
37 $25 165 $A5% 101 $65 229 $E5
38 $26 166 $A6& 102 $66 230 $E6
39 $27 167 $A7' 103 $67 231 $E7
40 $28 168 $A8( 104 $68 232 $E8
41 $29 169 $A9) 105 $69 233 $E9
42 $2A 170 $AA* 106 $6A 234 $EA
43 $2B 171 $AB+ 107 $6B 235 $EB
44 $2C 172 $AC, 108 $6C 236 $EC
45 $2D 173 $AD- 109 $6D 237 $ED
46 $2E 174 $AE. 110 $6E 238 $EE
47 $2F 175 $AF/ 111 $6F 239 $EF
48 $30 176 $B00 112 $70 240 $F0
49 $31 177 $B11 113 $71 241 $F1
50 $32 178 $B22 114 $72 242 $F2
51 $33 179 $B33 115 $73 243 $F3
52 $34 180 $B44 116 $74 244 $F4
53 $35 181 $B55 117 $75 245 $F5
54 $36 182 $B66 118 $76 246 $F6
55 $37 183 $B77 119 $77 247 $F7
56 $38 184 $B88 120 $78 248 $F8
57 $39 185 $B99 121 $79 249 $F9
58 $3A 186 $BA: 122 $7A 250 $FA
59 $3B 187 $BB; 123 $7B 251 $FB
60 $3C 188 $BC< 124 $7C 252 $FC
61 $3D 189 $BD= 125 $7D 253 $FD
62 $3E 190 $BE> 126 $7E 254 $FE
63 $3F 191 $BF? 127 $7F 255 $FF
  1. Codes $80-$FF are the reverse equivalents of codes $00-$7F
  2. Cleaned up from the original source: http://sta.c64.org/cbm64scr.html
August 1, 2017Programming Theory

Base Conversion in 6502 (2/2)

Very close to a month ago I wrote an article called Base Conversion in 6502. That piece was a part 1 of 2 because I felt like it was getting a little long. In part 1, I went through the process of converting an ASCII string representation of a number to the actual integer equivalent.

This piece is Base Conversion in 6502, part 2. In this article I'm going to walk through the reverse procedure. How to convert an integer into a string representation of that number. In either direction the base of the representation is arbitrary. In the original I showed converting from a string representing a number in base10, but it would be just as easy to convert if the string held a number in base16, or base2, or base8 or whatever. Similarly, when we convert an integer into a string representation we can choose what base we want it to be converted to.

I feel like there was some confusion over my terminology in part 1, because I was refering to integers as numbers in hex, so I'd say, convert a string number to hex. When the CPU deals with integers, say to add them together, the integer is whatever its bits are counted in binary. So, if you have a byte in memory, and the byte's bits are 0000 0010, then we can say that as far as the CPU is concerned that's the integer 2. And it's super easy for the computer to add that to some other integer like 3 (0000 0011), to produce a one byte integer result 5 (0000 0101). If these numbers were in PETSCII, though, they would be 50 == "2" (0011 0010), 51 == "3" (0011 0011), and 53 == "5" (0011 0101), and so on.

The reason I used the words converting between a string and hex, is because in the assembler ints are most frequently typed in hex notation. And when assembled that number typed in hex is represented in the CPU as an integer.

Meanwhile, you can represent a number in binary, in a string! Where each bit is represented by a whole byte in the string, either "0" (48, 0011 0000), or "1" (49, 0011 0001). So, converting an integer like 5 (0000 0101) to a string representation in base2, it would be converted to the string "00000101" which is actually 8 bytes long, 48 48 48 48 48 49 48 49. Okay, here we go.

Convert an INT to a string representation

This is the opposite process. We have an int (an integer), but we want to print it out on the screen in characters that are readable by a human.

What would happen if you didn't bother to do a conversion and just dumped the byte directly to screen memory? Well, imagine your byte was the integer 83. If you dump that byte to the screen the VIC II will draw it as screen code 83, it'll draw a single character that looks like a heart. How would you then figure out what that byte really is? You'd have to refer to a screen codes chart, search through the chart for a heart and then see that the heart is 83, bingo.

What are we really doing when do this? Well, it's kinda like representing the number in base256. Base10 has 10 symbols (0,1,2,3,4,5,6,7,8 and 9), and if the integer is greater than 9 we need a second character to hold another of the 10 symbols. Base2 has 2 symbols (0 and 1), and if the int is greater than 1 we need another character for another either "0" or "1". Base256 would mean we need 256 unique symbols. And if the integer is greater than 255 then we'd need a new character with another one of our 256 symbols. But, that's what screen codes are; 256 unique symbols. A one byte integer requires exactly one screen code character.

But that would be terrible, it would take forever to figure out what the number is in a way that we understand it. And what's worse, many screen codes look very similar to each other. 116 and 117 are both left vertical bars, 117 is just a single pixel wider than 116. So, representing the number in a base256 string is not really workable. We need to convert to some smaller base.

The easiest way to get something we could describe as human readable is to convert to a string that is in base16, or hexadecimal. The reason this is easiest is because the 8 bits of the underlying int break straight down the middle: 4 upper bits (the upper nybble), and 4 lower bits (the lower nybble). Each nybble is a number from 0 to 15 and can be used as an index into a lookup table of 16 PETSCII characters, "0123456789abcdefg".

I personally use this all the time for debug output when coding. I've got some ints that I'm working with and I need to see what those ints are, so, short of using a monitor which does these conversions for us, I just want to spit out some numbers to the screen.

In fact, I already gave the source for such a conversion routine, inttohex, in the post Implementing Factorial In 6502.

It's very very easy. By shifting right (LSR) four times, the upper four bits are shifted down into the lower nybble and the upper 4 places replaced with zeroes. This is now a number in the range 0 to 15, which can be used as a lookup index into a string of digits, labeled above as hexits.

And to get the lower 4 bits alone is even easier. You simply mask the whole byte using AND #%00001111. Some juggling is required to preserve the numbers as you go, but by and large it is simple, fast and super useful.

So, we're done! No, we're not.

Converting to base16 is really convenient for programmers who are debugging, but most people don't understand hexadecimal because they don't have 8 fingers per hand, like programmers do. Most people have only 5 fingers per hand (the thumb is a finger for this exercise.) So most people want to see the output in base10.

Convert an INT to a string in an arbitrary base

We return now to the article about Large Look-up Tables for Hyperfast, Accurate, 16-Bit Scaled-Integer Math by Garth Wilson over at Wilson Mines Co. The original place where I saw descriptions of the principles of base conversion. Here's his prose description of converting from an int to a string in an arbitrary base:

For converting hex numbers to other bases for output (which will normally be a string), initialize a blank string. You will build it from right to left. Divide your number by what's in variable BASE, and use the remainder to add a text digit to the build, even if it's a 0. Keep doing that until there's 0 in the number.

As before, I'll convert that to pseudo code for you.

Convert from INT to a different base:

Define BASE.
Define symbol table with at least BASE-n symbols.
Initialize a blank string.

Divide number by BASE.
Use remainder as index into symbol table.
Prepend symbol to string.
Go loop until number is zero.

Next let's step through the pseudo code with an example number, seeing how it converts one step at a time.

BASE = $A (10 in decimal)
loop 0:
$A894 / BASE == $10DB (remainder $6)
"6" + ""

loop 1:
$10DB / BASE == $1AF (remainder $5)
"5" + "6"

loop 2:
$1AF / BASE == $2B (remainder $1)
"1" + "56"

loop 3:
$2B / BASE == $4 (remainder $3)
"3" + "156"

loop 4:
$4 / BASE == $0 (remainder $4)
"4" + "3156"


So before we proceed with an explanation, let's use our calculator and see if the conversion works. 43156 is indeed equal to $A894. That's convenient.

To start we define the variable BASE as $A, that's base 10, because we'd like to see our output string in the nice friendly base 10 we're all used to. But this would work just as well if we were to use some other base instead.

Between the two horizontal rules I've labelled 5 loop iterations, 0 through 4. It just works out that it takes this many steps to get our division result all the way down to zero.

The initial value is our dividend, and BASE is our divisor. After performing the division we get a result and a remainder. That remainder, as it happens, even if it is zero, is the least significant digit of the number in the base we're converting to. In this case, because we're converting to base10, the range of remainders even represented in hexadecimal will only ever be between $0 and $9.

The remainder has to be converted to a PETSCII character. This isn't shown here, but it's very easy. We can see by looking at a PETSCII table that $30 is "0", $31 is "1", $32 is "2" and so on. So to get the PETSCII character version of our remainder integer all we have to do is add $30. The character has to be added to the start of the string. In our implementation we'll take a look at how I might get around to doing that.

As long the result is not equal to zero we need to loop and continue the process. On each new loop, the result from the previous division becomes the dividend for the next division. In the final loop the dividend ($4) is smaller than BASE ($A) so the result is 0 plus the remainder which will be the final (most significant) digit in the string.

When we converted the other way, our implementation need to have a multiplication routine because the 6502 doesn't have a multiply instruction. Converting this direction, our real implementation will need a division routine. We will also need to worry about precision. In this case, as in the example in part 1, I'm opting to settle for 16-bit precision. So our maximum input number is $FFFF and the maximum string representation will be "65535".

Codebase64 to the rescue on the 16-bit division routine.

INT to a String number Implementation

And here is my implementation of the above. Works just as you'd hope.

Now let's walk through this a line at a time and see how it works.

The division routine that I grabbed from Codebase64 requires three 16-bit numbers in zero page. These are the first constants defined under Workspace. You only need to specify the low byte of these addresses. We have one for divisor, one for dividend and one for remainder. The result in this routine actually is the same address as dividend, which means dividend will continually be overwritten by result. This turns out to be super handy for our use case, as we'll see shortly.

Base is defined as 10, petoffset is the number that needs to be added to an int between 0 and 9 to produce the PETSCII characters "0" through "9". I make this a constant, because I hate magic numbers being sprinkled through code. And chrout is defined here as $FFD2, which is the KERNAL address for outputing a PETSCII character to the screen.

The code starts at $0801, that's where BASIC programs start. The first code encountered is the BASIC preamble. In my post Implementing Factorial on 6502 I talk a bit about how the preamble works. But here'd like to say that rather than just spitting out a series of inscrutable bytes, the assembler actually offers a variety of features that make producing the preamble from memory quite a simple task.

BASIC Preamble in TMP on a real display

The preamble is structured as a BASIC program. First a 2-byte pointer to the next line, this can be done with .word and a label, "end" in this case. Next is a 2-byte line number. This is easy to do with .word 10, that makes a 16-bit line number 10. Next we have a 1-byte BASIC token, this is the hardest part to remember, .byte $9e, that's the token for SYS. Next the SYS wants a memory address, encoded in PETSCII (ironic consider the nature of this article series), plus the BASIC line ends with a null byte $00. This can be very easily produced by the assembler with the .null keyword. The following quoted string is in PETSCII, and it automatically gets null terminated. To end the BASIC program we want a 16-bit $0000, this is easily produced with .word $00, and this is also the line that is pointed to by the first line's next line pointer. So it takes the "end" label. And boom, that's easy to type up from scratch, and is even comprehensible to read.

The main program starts at line 24. We start by putting our BASE into the divisor. This will not change, so it only has to be done once. Even though the divisor fits within 8-bits, the assembler notation of #> and #< will break it into $00 and $0A for us.

Next we put the number we want to convert into the dividend the same way we populated divisor. I've hardcoded $A894, but in a real program you'd probably be pulling this from somewhere.

The loop starts at line 36. The first thing it does is checks to see if dividend is 0. If it is, we're done and we can leave the loop to output our converted string. Dividend is a 16-bit number, but checking to see if it's zero can most easily be accomplished by loading one of its bytes, and then ORing that with the other byte. If the result of both bytes ORed together is still zero, then the whole dividend is zero, and the conversion is complete.

At line 40 we JSR to divid16. This routine overwrites the dividend with the result of the division, as well as populating the remainder, even if the remainder turns out to be zero. Because we're dividing by such a small number, $0A, remainder cannot ever be bigger than a value from $00 to $09, which also means that remainder's high byte will always be $00 and can be ignored.

Lines 42,43 and 44 load the remainder's low byte and add the PETSCII offset to it. Lines 46, 47 and 48 put the PETSCII character into the string. The string, str, is a pre-defined buffer that starts as 5 spaces, null terminated, at line 64. Line 63 is an index variable that starts at 4. That's the offset to the end of the string buffer. When we add the character to the string we add it at the index, then decrement the index. This is how I've decided to do the string prepending. It means I don't have to shift anything, but it also means the final string will always by 5 characters long, and any unused places are padded with spaces.

Line 50 jumps back to the start of the loop. Which will check the dividend to see if it's zero yet. The convenient thing here is that we don't need to explicitly move the result of the previous division into the dividend, because that happens automatically in the divid16 routine. Convenient.

Lastly, we want to output the finished string to the screen. This is done starting at line 53. We simply use the .X register to walk through the bytes of the string and call chrout on each byte to put them on the screen. The loop ends when the string's null byte is reached. And we return to BASIC. That's it.

Output of the conversion program

So, we load it up in vice and low and behold, 43156 gets spat out onto the screen. Looks like it works! This concludes this two part series on INT to String and String to Int conversion. Hopefully someone in the world who's learning 6502 will one day stumble upon this and find it useful.

If you want to download the program and try it out, or inspect how it assembled, you can download it here.

July 21, 2017Hardware

Commodore Logo Mark Patch

I've got something a little special for you today. I've been hard at work building C64 Luggable. You can read all about the project, my progress and how to build one yourself on this site's C64 Luggable page.

It's really great. I'm enjoying the challenge of working with physical materials, and also the intellectual fun of figuring out how to construct something that's modular, self–contained and has everything you'd want in a full Commodore system.

A couple of weeks ago I had a sudden realization that once C64 Luggable is complete I'll want to take it with me to friends' houses, parties, the cottage, and most of all, Commodore Expos such as this year's Commodore World.1 Who builds a custom luggable Commodore all-in-one and then doesn't take it to tradeshows to show it off?

The problem is that it's going to be somewhat vulnerable to nicks and scratches, or much worse, damage to the open flat screen or front speaker cones. Especially if I started lugging it around to shows and taking it in and out of my car. What I really need is a padded cover. It just so happens that my mother is an experienced and very talented quilter. She's an esteemed member of the Napanee Heritage Quilter's Guild. So I recruited her to help me make a custom fitted cover. I'm picturing it like one of those artsy Kleenex box covers, if you've seen those. It slips on over the top. Covers all four sides but leaves the button open so the computer will remain standing on its own feet. And the top cover will have a slit opening sized for the handle to pass through. It will be very handy.

But, we don't just want a generic–looking cover. It's gotta have some style and advertise what it is. So I came up with the idea of getting a custom embroidered patch made up that can be stitched to the front of it.

I looked around the internet and compared prices. The best price I found was from a company in the U.S. called CUSTOMPATCHES. You send them artwork, they mock it up graphically, and follow up with you to make sure it's what you want. Then they create a single proof piece and show you a picture of it. After you approve the proof they make as many as you want in fixed batch sizes. The smallest order is 25. But you can also order 50, 100, 200, 300, 500 and more. The price per unit drops substantially the more you order.

I only want 2 or three for myself. One for the C64 Luggable cover, and a couple more, maybe for a bag or a hoodie. The rest, well, I figure I'll sell them off for a few dollars (+shipping) a piece, mostly to recoup the upfront cost, and to share the love. They've been ordered but I haven't received them yet. They'll probably be available by the end of August 2017.

Here's the artwork I put together in my experiments, and the patch mockup they sent me. The dimensions are, as shown on the mockup, 3.5" by 2.4". They are backed with an iron–on adhesive. This means they can be applied to a bag, or a backpack, (or a luggable computer cover), with just an iron press. They promise me the adhesion is extremely good, and sewing them on isn't absolutely necessary, but they do recommend a single outer stitch to increase the longevity.

Contact me if you're interested in buying a few of these. Supplies available until they're all sold off. I'll update with final price when I know for sure what they'll cost me.

Commodore Logo Mark Commodore Word Mark Commodore Logo Mark (Patch Art)

Update: I'm still working through the proof process with them. The final version may not have the black outer border.

Commodore Logo Mark (Patch Art, Proof 1)

Update 2: This is the final proof. With some feedback from twitter, and my personal opinion, I opted to get rid of the black border. Some border is necessary to prevent fraying, but they've switched it to a white border.

Commodore Logo Mark (Patch Art, Proof 2)
  1. I had a blast at last year's Commodore World 2016. If you haven't read my review of the event, you can find it here. We had the pleasure of hanging out with Bil Herd, legendary designer of the Commodore 128. []
July 5, 2017Programming Theory

Object Orientation in 6502

I feel as though my abilities at 6502 assembly are progressing fairly quickly. I'm a pretty experienced programmer in higher level languages, such as Objective-C, Javascript and PHP, as well as C. But when I first started programming seriously in 6502 I was having a hard time seeing how to accomplish things that are quite common in higher level languages. Things like strings, arrays, structures, functions and objects. What you have available when coding in 6502 ASM is just so incredibly low–level, you have nothing but an open expanse of memory and a handful of simple instructions.

But, I made a lot of progress which I've blogged about earlier, especially in regards to pointers and structures. The next frontier, in my opinion, is learning how to program in an object–oriented style. In higher level languages the compiler or the interpreter itself constrains what you can do. But, these constraints are arbitrary. A computer can jump through any spaghetti mess that you can throw at it. The constraints are there to force you to follow certain design and organizational patterns. The point of organizing your code either functionally or objectively is because it is a good practice that leads to cleaner code that is easier to understand and maintain and that has fewer bugs as a result.

So, when we code in 6502 asm we're not constrained by a compiler to follow any particular good practice. But that doesn't mean we can't apply, as a matter of personal discipline, the organizational principles of object–orientation to our assembly, and then reap the benefits that come from that design. Let's first go over the theory and the terminology so we know what the benefits are in principle. And then after that we'll get into the meat and potatoes of exactly how these principles can be applied in 6502 asm.

What is object–orientation? The Theory.

There are usually four principles that object–oriented design is meant to meet. These are:

  • Encapsulation

  • Abstraction
  • Polymorphism

  • Inheritance

I separate these into encapsulation, and then abstraction and polymorphism, and lastly inheritance, because I find it difficult to separate abstraction and polymorphism into distinct principles. They seem to overlap and feed into one another that makes them hard to discuss individually.


Encapsulation is to me the most important of the four. Rather than having some global data that any code anywhere in the program can see and change, you instead wrap up bundles of related data and hide them behind code that you have to call to get at that data. The distinction between an object and a structure gets really thin when you don't have a compiler constraining what code can work on what data.

A structure, as I've discussed before is a block of data with the possibility of being composed of smaller and different data types. A structure collects data of varying types together that are related to each other. The sub-types are laid out in such a way that the start of each sub-type within the structure has a known offset from wherever the structure as a whole starts in memory. I believe the example I used was a record of a person. A person may have a name, year of birth, height, social security number, gender, etc. These different data can be put together such that you only maintain a pointer to a person and then access the subtypes as offsets into person. So, if you know the struct starts at $8000, and name is at $02, then name is at $8000,$02, and gender might be at $8000,$06, etc. The 6502 is pretty good at this, because it offers the indirect indexed addressing mode. You set the .Y register to the offset of the sub type you want, and put a pointer to the start of the struct somewhere in zero page.

Structs are really great, but then you need routines that know how to operate on the layout of a particular type of struct. You might have person records, and job records, which are structured differently. You'd run into a huge problem if you called a routine that was designed to operate on a person struct but you passed it a job struct. What objects do is take structs one step further, and make some of the struct's properties pointers to routines that are capable of operating on that very type of struct. A bit head spinning at first, by very clever when you think about it.

The encapsulation separates the design of the object from the programmer who might later try to use the object in their own code. Why would anyone want to do that? Well, we'll see that eventually when I post in more detail about the object–oriented nature of the widget toolkit. But in short, you want the designer of the operating system to create some objects that are useful. Then you want a programmer who doesn't directly collaborate with the OS designer, but just reads an API document, to be able to write his own application that makes use of the object. Then, here's the real magic, you want the OS designer to be able to change, improve and expand the capabilities of the OS's objects. And you want the third-party programmer's application not only to not break, but to automatically gain new functionality, effectively for free.

If the third party programmer had complete knowledge of the internal layout of the structure and thus hardcoded direct manipulations of that structure, then his code would break if the OS designer changed the object's internal layout. In order to avoid this problem it is important that the programmer does not directly manipulate the encapsulated data. In 6502 we cannot enforce that the programmer doesn't violate the principle of encapsulation, but we can certainly honor the notion that the data ought to be encapsulated and simply restrain ourselves from direct manipulation.

Abstraction and Polymorphism

In order to understand how it is possible to work with an object whose properties are all encapsulated it is necessary to move on and discuss how objects work. The ideas of abstraction, inheritance and polymorphism will all start to blur in this description.

The crucial step that separates a struct from an object is that an object contains properties that point to routines that are designed to act upon itself and to understand the internal layout of its own structure. This allows the third party programmer to not need to write his own code that manipulates the encapsulated data.

Instead, the documentation specifies a handful of "publicly" known offsets, typically to routines, which in object parlance are called "methods." The programmer then takes a pointer to the object, and looks up a method, and calls the method. The method is given access to a pointer to its own structure, and the method, which was written by the OS designer, operates on the internals of the struct to change the data, retrieve some data for the user, or perform some other function.

So, encapsulation hides the data of the object from the programmer, and instead presents publicly known method offsets. The methods of an object are always known to operate on the object they belong to. So you never need to worry about accidentally calling a routine but passing it the wrong type of record. This design pattern, encapsulating the data and only accessing that data by calling methods which operate on the structure, essentially defines what it means to be an object. However, the relationship between objects is where abstraction, inheritance and polymorphism come into the picture.

The collection of public methods that an object has are known as its interface. They are the calls by which third party code engages with and interacts with the object. But what if you had two different objects, with different encapsulated properties and different internal behaviors, but which exposed exactly the same interface? Well, this is where you get abstraction and polymorphism. What's the difference between abstraction and polymorphism? Well, lots of people ask that question and I'm not 100% certain myself. But I can take a stab at it.

Let's say we have two objects, one that represents a person. Internally the person has a property that stores their date of birth as an offset in seconds since the unix epoch. The other object represents a company, and stores the founding date of the company as a string in YYYY-MM-DD format. But they both have a method called "Start Year". When called, the person object returns 1981, and the company object returns 1975. These two figures can be compared to each other, even though the true underlying data representation is radically different and not directly comparable. The methods thus serve to abstract the underlying data structure, and the third party program that makes use of these classes does not need to know or worry about the internal representations of the data.

Polymorphism is similar, but I think it relates to abstracting behavior, which allows one object to seamlessly substitute for another. So, imagine you have a program and it wants to log error messages. There could be three different types of objects, which each have a "log" method, that take a string for message text. But one of those logger objects emails the message text, another prints it to a printer, and another appends it to a log file. The third party program doesn't actually care which logger object is given to it, it will use and interface with all three in exactly the same way. But the result of the substituting one for another is that a completely different form of output will take place behind the scenes. This is what "polymorph"-ism means, one interface manifests in "many forms."


Abstraction and polymorphism are about how you can use different types of objects in similar ways. Inheritance, then, is about deriving a type of object from a more primitive class of object, such that they have at least some part of their interface in common, which allows for their substitution.

It's important to define here the difference between a class and an object. A class is a description or a prototype of an object. An object is thus an instance of a class. The difference is similar to the definition of a structure, and an actual instance of the structure in memory. The definition of the structure may be that it has 5 properties, 2 bytes wide each. When you want to make an instance of the structure, you effectively just allocate some memory that is the size of the structure, 10 bytes in this case. After that you may want to initialize that memory by assigning known values to each of the 5 properties. Now you have this 10 bytes of memory that are an instance of the structure. Your code could make hundreds or thousands of instances of the structure by allocating another 10 bytes for each new instance.

Objects work essentially the same way. Except that in higher level languages their initialization is performed automatically by a standard method which is known as its constructor. High level languages typically allow you to create an instance of an object with the new keyword. So new Person; would allocate memory the size of Person. Then automatically initialize that memory using a routine which is defined as the constructor for the Person class. The instantiation process automatically assigns all the method pointers to all the routines that are declared as that Class's methods. The constructor method is generally responsible for initializing all the other data properties. And what is returned to the caller is a pointer to the initialized memory of the new instance. After this, we'll look at how we can implement this behavior in 6502.

Inheritance is a design pattern whereby classes are hierarchically related to each other. For example, imagine you have a class that represents a piece of paper. It may have properties such as width, height, thickness, and color. You might then have a subclass of paper called lined paper. Lined paper actually has all of the properties of paper, but it adds extended properities such as line color, line thickness, line spacing, top margin and bottom margin. And you might have a third class that is also a subclass of paper called music paper. This has all the properties of paper, but none of the extra properties of lined paper, and instead offers its own extended properties such as base or treble lines, and how many rows of notation.

Further, beyond just the subclasses sharing the properties of the superclass, the superclass (paper in this case) also implements some methods, such as draw. When you draw a piece of paper it just clears out a rectangle on the screen proportional to its width and height properties. The two subclasses however also have the draw method, but their draw method is more sophisticated but derivative. So, when a lined paper object is instantiated, its draw method pointer is setup to call the draw routine declared for the lined paper subclass. However, when draw is called on a lined paper object, internally that routine can call the original draw rouine for its super class, paper. Paper's draw routine then does its thing, reads the width and height and clears out the rectangle with the correct paper color. After the superclass's draw method completes, the lined paper's draw method only needs to implement the extended functionality. It draws on top of the paper's colored rectangle according to the extended properties defining the lines.

That is really cool. In other words, when you want to implement lined paper, it is not necessary to reinvent the entire wheel and reduplicate the base drawing code. Instead a subclass extends a superclass. Or we say that the subclass inherits from the superclass. The fact that the subclass inherits from the superclass means that an instance of lined paper IS an instance of paper. And so any code that expects to be operating on a paper object should work just fine if it is instead given a lined paper or a music paper object.

So much for theory, what about the 6502?

Effectively, an object, an instance of a class, is a struct where some properties are pointers to methods. And a struct is just a block of memory the appropriate size, which is accessed using indirect indexing where the index offsets are declared by name. What does that mean?

Well, let's take the indexing by name part first. In our source code we will declare a block of labels like this:

The above is not an object, it's a struct definition. This is the sort of code that I use to declare the definition of a structure. Notice that p_size has been manually set to the the size of the whole collection. 2 for name, 1 for sex, 1 for height, 2 for year of birth. We need 6 bytes total for an instance of the person struct. To create one of these in memory we need to rely on our memory manager (and here). In fact, object oriented code essentially requires a memory manager, which is something that I wasn't anticipating when I originally discussed what a memory manager would be useful for. We ask the memory manager to allocate space the size of p_size. We get back a 16-bit pointer to somewhere in memory and that's where our struct is.

Next, we might want to initialize the struct's properties to known defaults. In order to do indirect indexing the 6502 requires that the pointer be in zero page, so we'll stick the pointer into say $fb-$fc, which we've defined as the ptr. Then we set the .Y register to an offset by using one of the struct property declarations and away we go. Here's how this might look in code.

So, that's what we might do in our code to go from a definition of a struct to an instantiated and initialized instance of a struct. And I've posted at length about structs before. The thing to note here is that a struct is not an object. The biggest problem with this code from an object oriented point of view is that the data is not encapsulated, the implemetation is not abstracted, there are no methods which act upon this data, and so polymorphism isn't possible, etc. But, what this does show is how we use those declared offsets to index properties of the struct, by name,1 through a pointer. This part of it is essential to how we'll use objects.

Next, let's look at some real world examples that take structs and move them into the world of objects. I've been working on an object–oriented widget toolkit that will standardize the UI and making building fancy hierarchical "graphical"–like user interfaces easy and consistent. I say graphical–like because the toolkit works the way you expect a modern GUI to work, but in C64 OS it is implemented in the text-based video mode, because it's the only way to get super fast and snappy results on the stock c64. Don't be disheartened though, most c64 video games use text-mode to produce fantastic looking and very fast results and the user doesn't even realize it is text mode, thanks to customizable character sets and smooth scrolling.

Toolkit Views, page 1 Toolkit Subviews, page 2

Okay, so let's take a look at these two spec sheets. They are preliminary, they'll probably change as I build out the functionality needed to support certain types of behavior such as drag and drop, but I have begun coding the essential drawing behavior around how they are presented above. I'll go into much more detail in a future post about how the widget toolkit is supposed to work, for the purposes of this post I only want to show real world examples of how to structure objects that feature inheritance, encapsulation, polymorphyism and abstraction, and how to do that in 6502.

We have here a set of 6 classes: view, scroll, split, label, button and input. Eventually I will be adding more, but I see these as a good start, they will let me compose some pretty interesting user interfaces. These objects are arranged hierarchically, such that they all inherit from view. For this reason, it is appropriate to refer to all of them as "views", because although only the root view is actually called "view", they are all subtypes of the essential view. Here's what the hierarchy of inheritance for these 6 classes looks like.

Object hierarchy of views

So, as we can see, view is the root object from which the other toolkit sub-views descend. Let's look at view in detail for a moment and see what properties it has and how its original spec sheet translates into code.

So, you can see that from the spec sheet, the size (or width) of each element has been translated into offsets from 0. The properites of view offset from zero, because view is the root object.

The first two elements are 1 byte wide each. The first is a node type. Every class in the toolkit will be assigned a type id that allows either the OS designer's own code, or third party code, to easily identify the object type. It is here called a node type but we'll see more about how why that is and how it is useful in a future post about the toolkit. The second element is a generic id. This allows the developer to arbitrarily tag up to 255 different objects to easily identify specific ones, we'll also see in a future post how this is useful.

The following 7 properties describe the view's render size and positional characteristics. I won't go into the details of how these work here, other than to say that the last of these properties is a resize mask. It is one byte, for 8 bit flags. The flags define which of the top/bottom and left/right sides should be honored and which ignored. Notice that top, bottom, left, right, width and height are all 2 bytes, or 16-bits each. This allows a view's height, for example, to be up to 65535 lines high.

Remember, that all the toolkit objects descend from view. That means that the metrics properties of view will be automatically inherited by all other view subclasses. These are the properties that define how every view in the entire toolkit are sized and resized.

The next three properties are link pointers, allowing views to be hierarchically arranged. Every view is linked to its sole parent, or container view, via the view_prnt pointer. Every view has the ability to be pointed to its first child, via view_chld. And the view_nsib is a pointer from a view to its next sibling. If a parent view wishes to iterate over all of its child views, it starts by getting a reference to its first child, and then traverses down the list of its siblings. Again, because all views inherit from this root view, all views can be plugged together using these link pointers.

Object Methods

Still we only have properties. Even though some properties are pointers to other nodes, we have actually already seen this with plain structs. The next four properties however are very interesting. They are pointers to four methods.

The first, view_draw, is the method that is called by the system's main event loop during a screen redraw phase if something has marked the screen as dirty. The view object's draw routine simply clears out a square (both the character map and color map) that is defined as its draw region by its parent view. We'll get into much more detail about how this works in a future post, because it's a huge topic.

But view does something slightly more than this. It also iterates over its children, refines its own region bounds to the region into which the child should draw, and then calls the child's own draw routine. It repeats this for all of its children. This allows the child to draw its specific appearance overtop of the area just cleared out by its parent. Note that this draw behavior is also recursive, because the view's children are also views which may have their own children, and so on down the view hierarchy.

Polymorphism is absolutely critical to how this works. The view's own draw routine knows that each child view must have a draw routine. But it doesn't care whether the child is another view view, or if it is a subclass of view. It doesn't care what the output of its child's drawing is going to be, only that it can ask it to draw itself. This is the very definition of polymorphism.

Let me briefly say what the other 3 method pointers do before describing how the methods are selected and called.

These three: view_mouse, view_kcmd, and view_kprnt should sound very familiar if you've read this post. If you care about this stuff, I highly recommend you go read that post first. These methods match the three basic event types that are generated by C64 OS's extended KERNAL. Mouse events, Keyboard Command events, and Keyboard Printable Text events. For mouse events, whichever view object is clicked2 becomes the first view to receive the mouse event. It does this simply by calling the view_mouse method of the clicked view. The implementation can then use a system call to read the details of the mouse event. However, by default, the view class doesn't have any particular behaviour for any type of mouse activity. So view's implementation is simply to call its parent view's view_mouse routine. Thus, each view up the hierarchy has an opportunity to handle the mouse activity.

The two keyboard event types, however, are dispatched on a principle of focus. One view is allowed to be in focus at a time. So the system maintains a pointer to the view in focus. If you click on an input, for example, the input's own mouse handling routine will tell the system to make it the view object with focus. After that point, when a keyboard event is generated, it is delivered to the focus view first, by calling the focus view's view_kcmd or view_kprnt method. Then, these methods behave just like view_mouse. If the focus view handles that type of keyboard event it can do something with it. Otherwise, it passes it up the hierarchy by calling the same method on the parent view.

Calling Object Methods

The question is, how to call the object methods. The first thing to note is that an object always has access to a pointer that points to itself, to its own data structure. That pointer is typically referred to as the this pointer. And so how pointers work on the 6502/6510 is pretty important. Pointers, for use in indirect indexed (or indexed indirect) addressing have to be in zero page. But that makes zero page a giant global variable space for everything going on in C64 OS. Much the same way there is only one stack. This is partially what makes the 6502 so ill-suited for premptive multitasking. On bigger CPUs, even the 65816, each process can have its own stack and its own zeropage (known as direct page.) However, C64 OS is not multitasking so the only real worry we need to have is that the interrupt routine not use the same zero page addresses that are reserved for use by regular programs.

Next, we want to define a system global, or at the very least toolkit global, zero page address to be used as the currently executing object's this pointer. Toolkit internally uses $fb-$fc. But as I get further along in development I might change this to something lower that is usually only used by the BASIC rom, which C64 OS patches out.

What this means is that, whenever any toolkit object's method is running, it can expect to find itself pointed to by $fb-$fc. In code that interacts with the toolkit, we can use a variable this = $fb to get the flavor of an object. In order to call an object's method all we have to do is, back up the current this pointer to the stack and change the this pointer to point to the target object. Next, read the method pointer from the target object and write that pointer either into the address of an indirect jmp, or self–modding the two bytes following a JSR, and then JSR to the method. When the routine returns we restore the this pointer from the stack. All of this, by the way, could probably be wrapped up in a macro to make it feel a lot more like you're merely calling a child's method.

Let's see this in code.

The comments in the code make it pretty much self–explanatory. $fb is declared as this, so we can use this throughout the code knowing what we're talking about. The current object's this is shoved on the stack to back it up.

Next we use the .Y register as the indirect index through this, to grab a pointer to this object's child. The low byte is being transferred into .X instead of being written to this right away, because obviously, we need this to continue to be valid so we can read the high byte. After we've got both bytes, we write them both to this. This now points to the child object. But we want to call the child's draw routine. So we use .Y to reference the draw routine's offset. And read the pointer to it through this (which is now the child's this.) And write that draw routine's address to the JSR's argument.

Writing to the JSR's argument is a self–modification of code. I like it, because it's fast and clean. It doesn't port well to run from a ROM however, and you may feel dirty using it. Alternatively you could write the address to somewhere in ram that is reserved as the address of an indirect JMP. (JMP ()). And then just JSR to that JMP (). But, I like the self–mod. The draw routine for the child now assumes that the this pointer points at itself, which it does. And it may repeat the process to call the method of some other object.

When the routine returns, we can restore the this pointer from the stack. Now, if this is the very end of the routine and we don't care about restoring this, we could skip the backing up and restoring of this. Whether you need to depends on who is going to be calling this routine. If it's only ever called by some other object, then the this doesn't need to be restored. If it's ever going to be called by another routine of this object, then the this has to be restored.

A little preview of inheritance

I'm cognizant of how long this post is becoming, so I'm going to wrap it up with this last section, and save many of its details for a future post.

But, let's suppose we are implementing the label class. What does it mean for it to be a subclass of view? How does it inherit from view? Well, here's what its data structure would look like.

The comment indicates that it descends from view. And so each of its properties have their offsets prepended by "view_size+", which means their offsets follow all the offsets defined by view. And lastly, label_size is set equal to view_size plus the additional size of the properties added by label. That's very cool. If we were to go back and add an extra couple of properties to view, and adjust view_size accordingly, then everything that's been coded for label will automatically get the new correct offsets and size when reassembled.

If another class, say button, descends from label, then its definition does not need to reference view. Button simply prepends its property offsets with "label_size+". And it specifies its own size as button_size = label_size+... however big the additional button properties are. Again, that's really cool. Because we could even add an intermediate class between view and label. Label would have to be updated to change who it inherits from, but button doesn't have to be changed at all, since it would still inherit from label. Startin' to see some cool design benefits here.

I don't have space here to go into how these classes are instantiated, nor how their constructors work. But in my next post on this topic I'll cover how memory is allocated, how the nodetype id is assigned, how the constructor method is called, how the constructor assigns method pointers, and how subclasses are able to override and extend their superclass's constructor and other methods.

Once again, if you have questions, thoughts or corrections, please leave them in the comments section.

  1. I should emphasize, "by name" is only at assemble time. Once assembled the program will have hard numeric offsets. If the OS designer changes the order of the properties or adds new properties to a superclass it would the property offsets of any subclass. The assembled program will no longer reference the correct offsets. One way to get around this is to have the program lookup the offsets at runtime before using them. However, there would be a speed penalty. []
  2. And how it is determined which view was beneath the mouse when the mouse was clicked is a huge topic for yet another post. But suffice it to say that it's the result of what is called hit testing. []
June 30, 2017Programming Theory

Base Conversion in 6502 (1/2)

I'm working my way through figuring out how to build a widget toolkit for a text-based UI that borrows a lot from the way hierarchical widget toolkits work in graphical user interfaces. But, it's a huge mountain to climb. And I don't want to let the blog go without any development love for too long.

In my post about implementing factorial, from absolute scratch, in 6502, the issue came up at the end of converting the binary result of the math into something that would be human readable. There I included a very simple routine that converts the number into hexadecimal. I chose hex because it is so simple. You just split the byte into two bytes where each has a range from 0 to 15. To grab the most significant 4-bits you shift them right 4 times. And to isolate the least significant 4-bits you can just mask out the upper 4-bits. And then, those two resultant values can be used as an index into a small lookup table of, say, PETSCII string values.

But how do you convert into something other than hexadecimal? What if you want convert a binary integer into decimal? Or vice versa? That's quite a different story.

Fortunately we have Garth Wilson of Wilson Mines Co. and his wonderful website about 6502 programming, http://wilsonminesco.com, to rely on. I found the following descriptions on his page about Large Look-up Tables for Hyperfast, Accurate, 16-Bit Scaled-Integer Math.

The whole article is interesting, although it's quite a bit beyond my level of experience. There is however a particularly useful section listed in the table of contents as, Easy conversion to and from any base. His description is written out in prose, but in my experimenting with the idea it seems to work well. So I thought I'd expound on it with a bit more practical detail here.

Convert a string number to HEX

Actually, what he refers to as hex is in my reckoning actually just binary, but we get to type those binary numbers in hexadecimal notation in our assembler. The difference between them is the following, if you have a single byte:

"1"    ==    0011 0001, or 0x31    (Petscii "1")
1      ==    0000 0001, or 0x01
"a"    ==    0100 0001, or 0x41    (Petscii "a")
$a     ==    0000 1010, or 0x0a

The question is, how do you convert a string of petscii digits into the equivalent conceptual value as the number counted in binary. Here's what Garth Wilson says about it:

For inputting numbers from a text string (e.g. typed in from a keyboard), initialize a number as 0 to build on, then take the first digit in the string and add its value to the hex number you're building. Continue until you're out of digits, each time multiplying the build by the number in variable BASE and then adding the new digit's value to the build.

Let me convert that into pseudo code for you.

Define BASE.
Initialize number to 0.

  Multiply number by BASE.
  Read first digit from string.
  Add its value to number.
  If string has more digits, go to loop.


Next let's step through the pseudo code with an example number, seeing how it converts one step at a time.

BASE = $A (10 in decimal)
loop 0:
$0 * BASE == $0
"4" << "3156"
$0 + $4 == $4

loop 1:
$4 * BASE == $28
"3" << "156"
$28 + $3 == $2B

loop 2:
$2B * BASE == $1AE
"1" << "56"
$1AE + $1 == $1AF

loop 3:
$1AF * BASE == $10D6
"5" << "6"
$10D6 + $5 == $10DB

loop 4:
$10DB * BASE == $A88E
"6" << ""
$A88E + $6 == $A894


Before describing what I did above, let's just use our calculator to confirm that $A894 is equal to 43156. And, it is.

So, we begin be defining BASE as $A. This is base 10, because that's the base that is used in the string input. If the string input were actually in octal, well then BASE should be defined as $8 and all the intermediate results would change.

Between the two horizontal rules, I've labelled 5 loop iterations (0 through 4). That's because the original input string contained 5 digits. Each loop iteration starts by multiplying our running total by BASE. In the zeroth iteration, because the running total started as 0, multiplying it by BASE doesn't change its value.

In each loop iteration we pull one digit off the left hand side of the string. That is, you are processing the most significant digit first. That "W" << "XYZ" is just to show that "W" has been pulled off the left side of the string, leaving behind only "XYZ" in the string, to be dealt with next iteration.

The single digit that is pulled off the string has to be converted to an integer. This isn't shown here, but it's super easy. If the input string is a decimal value in petscii, then using a PETSCII table we can see that "0" is $30, "1" is $31, "2" is $32, etc. So we just have to subtract $30 from the single character to get its integer value. That integer value is added to the running total. If the remaining string still has any digits left over, then we do another loop iteration.

In a real implementation we'll need to use a multiplication routine too, since the 6502 doesn't have instructions for multiplication. But we had a good 16-bit multiplication routine in the post about implementing factorial.

We'll also have to worry about precision here. If we use 16-bit multiplication and 16-bit addition inside the loop, it will limit the maximum size of integer that can be parsed out of a string representation of a number, to 65535.

String number to HEX Implementation

Here's a quick little implemention of the above. Works exactly as expected.

So, how does this work? Let's walk through it a line at a time. I'm using a very similar 16-bit multiplication routine to the one used in Implementing Factorial, with a small change. I removed the code that preserves multiplier, because I'm going to be clobbering multiplier with product immediately after the multiplication. However, for more info on this routine and where I got it you can read about it in the aformentioned post.

The initial constants are workspace variables for the multiplication routine. Then we declare that base is 10, because the string holds a number that's in base 10. And we define petoffset as $30, because I hate magic numbers. This is the offset from 0 as an integer to "0" the petscii character.

We'll start the code where a basic program starts, $0801, and follow it with the BASIC prelude which will SYS to the start of the real code. Read about this in the aforementioned post too.

Now, we need a running total, and we need to multiple that running total in each loop iteration by base. I've decided that "multiplier" shall serve as the running total. During the multiplication subroutine multiplier is corrupted, but the result is put into product. And multiplicand is left untouched. Multiplicand therefore should be base. So lines 18 to 21 copy base the constant, as a 16-bit number into multiplicand. And lines 25 to 27 initialize multiplier and the X register to zero.

The loop starts on line 29, where we read one byte from the string at index X. This will be the leftmost and most significant digit. Lines 30 and 31 check to see if this byte is 0. If it is, that's the null terminator on the string, the whole process is complete and it branches to done. If it got a legitimate digit it carries on, and increments X for the next trip around the loop.

The mult16 routine will destroy the accumulator. So first we back that up to the stack. JSR to mult16. This has corrupted multiplier, but the result is in product. We just want product to be put back into our running total, so we copy it into the multiplier which didn't have anything meaningful anyway. And then restore the accumulator with the digit from the string.

Lines 43 and 44 convert that single digit to an integer by subtracting "petoffset". And then lines 46 to 51 do a simple 16-bit add to add this digit to the running total. And line 53 restarts the loop to move on to the next digit. And that's it.

Output of the base conversion routine

How do we check that it worked? Run it. And the final result should be stored in multiplier which is at $4e,$4f, or 78 and 79. The screenshot shows that if we print out the peek of 78 and the peek of 79 we get 148 and 168 respectively. (Of course, BASIC is converting these byte integers back into strings to be displayed for us, irony.) However, if we stick them in our hex calculator, we see that 148 == $94 and 168 == $A8. Remember that multibyte ints are stored in little endian, or least significant byte first. So, reverse those and we get $A894.

That is indeed the correct answer, according to how I ran the pseudocode by hand. Pretty cool.

I don't want these posts to get too long, so I'll leave conversion the other way, from an int to a string, for part 2 of this post.

Older PostsClick titles to see full content

The home page shows the full content of the 10 most recent posts.
Below are the titles of 5 posts older than the most recent 10.
Click their titles to view the complete post and read and leave comments.

June 20, 2017Software

Huge Site Update

June 5, 2017Software

Recursive File Copier in BASIC

May 29, 2017Technical Deep Dive

Implementing Factorial in 6502

May 16, 2017Editorial

Review: FREEZE64 Fanzine

May 9, 2017Programming Theory

Pointers in Practice, Menus


Full Archive of Past Posts…