Written and Maintained by Gregory Nacu

Featured Posts

C64OS.com has grown to be more than just a blog about one developer's progress, it is becoming a resource to surround and support a type of C64 user that wants to benefit from the Commodore community and get the most out of modern hardware expansions for their beloved platform.

After writing many posts on the C64 OS weblog, the unfortunate reality is that some of my best work gets lost in the stream of news and developments. Be sure not to miss these full–length editorial reviews:

May 16, 2017Editorial

Review: FREEZE64 Fanzine

December 5, 2016Editorial

World of Commodore '16

Programming Reference

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

August 3, 2017Programming Reference

Commodore 64 Screen Codes

Search

Needs some ideas? Trying searching for:
PETSCII, Animation, Memory or Pointers

Recent Posts

August 15, 2017Programming Theory

Organizing Module Layout

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

August 3, 2017Programming Reference

Commodore 64 Screen Codes

August 1, 2017Programming Theory

Base Conversion in 6502 (2/2)

July 21, 2017Hardware

Commodore Logo Mark Patch

July 5, 2017Programming Theory

Object Orientation in 6502

June 30, 2017Programming Theory

Base Conversion in 6502 (1/2)

June 20, 2017Software

Huge Site Update

June 5, 2017Software

Recursive File Copier in BASIC

May 29, 2017Technical Deep Dive

Implementing Factorial in 6502

May 16, 2017Editorial

Review: FREEZE64 Fanzine

May 9, 2017Programming Theory

Pointers in Practice, Menus

May 1, 2017Programming Theory

Loading Sequential Files

April 27, 2017Programming Theory

HomeBase Applications

April 21, 2017Programming Theory

Application Loading

April 6, 2017Programming Theory

Memory Manager Development

March 27, 2017Software

Petscii Art Animation

March 27, 2017Programming Theory

Making Course Corrections, Timers

March 21, 2017Software

PETSCII Art Renderer

March 6, 2017Programming Theory

Code Module Exports Table

March 1, 2017Technical Deep Dive

Recursion and C64 OS's Menu UI (3/3)

February 22, 2017Technical Deep Dive

C-Style Structures in 6502 (2/3)

February 14, 2017Technical Deep Dive

What is a pointer? (1/3)

February 6, 2017Programming Theory

Memory Managed Loader

Older Posts

Full Post Archive

Subscribe to C64OS.com with your favorite RSS Reader

News, Editorials, Progress and Reference

August 15, 2017Programming Theory

Organizing Module Layout

At the beginning of the year I wrote a post, Organizaing a Big Project. At that time, I had just split apart what had been a monolithic code file into a series of more managable modules. At the time of this writing they're documented in the C64 OS technical documentation as:

  1. memory
  2. petscii
  3. input
  4. string
  5. screen
  6. menu
  7. service
  8. file
  9. toolkit, and
  10. network

However, this list is in such flux that even though that documentation was updated just 2 weeks ago, this list of modules has already changed. I've merged petscii, (which only had 3 exports: asc2pet, pet2asc and pet2scr) into the string module which is likely to gain far more functionality in the coming months. Plus, I added a math module, 16-bit division and multiplication routines that started their life in toolkit due to their usefulness there, have been moved to their own module so they can be reused by other modules and by C64 OS applications. And lastly, the system jump table has been extricated from memory and made into its own module, thus making the memory module more of a peer to the others.

Suffice to say, the modules are in a state of flux. As I add new routines, optimize old routines and move routines from one module to another, the assembled object sizes of the modules keeps changing. However, they have to be packed together in memory, such that the end of one module is followed by the start of another module. If they overlap the module lower in memory will overwrite the first part of the module higher in memory that's already loaded in. Thus corrupting it and leading to instant and hair-pulling crashes. If there is space left between the modules, well, then that's just wasted memory that can never be allocated, found or utilized.

In addition to the problem of packing the modules together in memory, the main system jump table also has to know where each module starts, so that it knows where to find the module's table of vectors through which to jump. If that weren't enough, there is also the issue that the jump table itself is in a state of flux. With the addition of the math module, I just added mul16 and div16 to the middle to the jump table. Thus, the modules that consume the exports from other modules have to know where in the jump table to find the routine.

Each of these issues has a solution that I worked out at the beginning of the year. But how these solutions work has proven to be quite laborious for me to upkeep.


In order to pack the modules, I have to know how big each is. So I start with the first, the one highest in memory, and assemble it to somewhere arbitrary. The assembler tells me the start and end address of the object code. (And hopefully I don't get any phase errors, which seem to be fairly easy to produce if you use a label prior to defining it.) I write down that size. Then I go to the next module, load it up, assemble it and record its size. I do this for all 10 (or so) modules.

Then I pull out my calculator, a hex/oct/bin/dec converting Casio my brother helpfully gave me for christmas many moons ago. And I start with the last address plus one, so $CFFF+1, or $D000. And subtract the size of the first module to find where it should start, and I write that down. From that address I subtract the size of the next module, and write down its start address. Repeat this for all the modules. Eventually I end up with a table like this:

Module size and start address table

You can see in the image of my notebook that as I work on the code, I have to recompute the sizes and start addresses when it comes time to test my work. Sometimes, such as in the top middle column (second set of size/start columns), I've only worked on a few modules, so I can leave the others alone and just recompute the offsets for the few I worked on. Still, this process is painstaking and boring, and prone to mistakes leading to nasty bugs.

After I've calculated and written down this table of start addresses, I have to go back to each module, open its main file and manually set the initial start address. (*= $ce34) for memory, for example. Then I have to reassemble this module to an object file.

The system Jump Table is another issue. The Jump Table needs to know where each of these modules starts, so I open that module and update a set of labels with these new start addresses. Each jmp is an indirect jump through a vector found at the start of the module + the offset to the routine. If the module exports 5 routines, then it starts with a 10-byte table of five 2-byte vectors. And the Jump Table correspondingly has 5 entries.

  JMP (input+0)
  JMP (input+2)
  JMP (input+4)
  JMP (input+6)
  JMP (input+8)

That sort of thing. So every time a module grows in size, not only does it have to be reassembled, but every module lower in memory shifts down and has to be reassembled too. And then the jump table has to be reassembled also.

This is a big pain. But, let's not forget about how a module that wants to call an exported routine from another module finds the entry in the jump table. Each module has a .h header file that defines labels for each routine, and sets them to the position within the jump table for the module's block of jumps, plus the 3-byte offset. Such as:

  readmouse = inputbase+0
  mouserc   = inputbase+3
  deqmouse  = inputbase+6
  readkcmd  = inputbase+9
  deqkcmd   = inputbase+12
  readkprnt = inputbase+15
  deqkprnt  = inputbase+18

  etc…

Thus we have another file that needs to be updated. If a routine is added to a module that needs an entry in the system Jump Table, it will offset the jump table base address for preceding modules. They then need to have their header files updated… and any module that includes that .h file and calls one of those routines needs to be reassembled so they have the right jump table offsets.

Needless to say, it's easy to forget any of these many places that need to be updated. And it's hard to remember exactly which modules depend on which other modules. I end up having load the source code for each module and look through their includes to see if they include an affected header file.


I need a better way to work this, because the burden is so heavy that its discouraging me from wanting to work on the project. And it gets worse the more code I write and the more modules I add.

Finally, it occurred to me that all those calculations of offsets that I'm doing could actually be done by the assembler, if only it had some basic information:

  • How big is each module, and
  • How many routines does the module export

Instead of writing these numbers down on paper, I could put them all together into a header file to be included by everything that needs one of these numbers. Here's what it ended up looking like:

The modules.h header file, top The modules.h header file, bottom

The top half of the file declares all the variable data I need to provide, in one place. One label for each module (prefixed with x for exports), for how many routines are exported by the module. One label for each module (prefixed with y, because it's close to x), for the size of the assembled module. Unfortunately I still have to manually determine the size of a module, however this only needs to be calculated for the modules I'm working on, which need to be reassembled anyway.

The bottom half of the file is where all the magic happens. First, the highest thing in memory is the Jump Table. And as of just a few nights ago, this is no longer part of the memory module but is its own stand alone file. One label is defined for each module (prefixed with j for jump table). The start of the jump table entries for memory are $d000 (hardcoded as the last address + 1), minus the number of exports for this module times 3. Three because each jmp() takes three bytes.

This gives us a label jmemory that indicates where the jump table entries for memory begin. But this also becomes the relative start address for where the next module's jump table entries begin. Thus jinput starts at jmemory minus xinput*3. And so on, all the way through all the modules. These offset labels are computing automatically simply by adjusting the number of exports each module offers at the top of the file.

The memory module is now just a standard module, structurally the same as any of the others, with its own table of vectors to its exported routines. It's start address is going to be offset from the final (lowest in memory) jump table block. There is one label per module (prefixed with s for start address). The Jump Table itself is first module, but its start address is the same as whatever was the address the final jump table block. In this case it's jtoolkit simply because toolkit is the last module. Thus there is a label, sjumptbl that is set equal to jtoolkit. smemory, the start address for the memory module therefore is sjumptbl minus the size ymemory, the size of the memory module. And the next module starts from smemory minus its own size, and so on through the modules. Thus, modules.h is a central place where all the variable data goes, the size of each module and the count of its exports, and it produces a set of labels that are j- prefixed and s- prefixed for the start addresses of each module's jump table block and code, respectively.


How can this be used, now?

Each module includes modules.h. Then sets is own assemble address to its s-prefixed label. Toolkit for example includes modules.h and positions itself by declaring *= stoolkit.

Each module's .h header file declares the labels for its jump table entries as offsets from the start of its j- prefixed label. These declarations are always right and so the header files themselves don't need to be touched.

  readmouse = jinput+0
  mouserc   = jinput+3
  deqmouse  = jinput+6
  readkcmd  = jinput+9
  deqkcmd   = jinput+12
  readkprnt = jinput+15
  deqkprnt  = jinput+18

  etc…

And lastly, you have the jump table itself. The actual jmp() instructions still need to be written, because Turbo Macro Pro (the native version) doesn't support the code generating pseudo labels that TMPx provides. But where those JMPs jump to is computed dynamically by the s- prefixed labels simply by including modules.h. Thus:

  JMP (sinput+0)
  JMP (sinput+2)
  JMP (sinput+4)
  JMP (sinput+6)
  JMP (sinput+8)
Module size and export count table

I still need to keep track of some things on paper. My table now looks like what is shown in the image above. But I don't have to make any of the intermediate calculations. And I don't have to enter any of those intermediate values manually across a myriad of files. I just update the modules.h file with the new modules' sizes and export counts. Then I update the jump table itself with the additional entries. And reassemble to object files everything that would be affected by the changes.

It's still quite a bit of work if I make changes to a module up high in memory that pushes the others below it to new start addresses. But, like my recursive backup script it saves a huge amount of effort and takes a big burden off my shoulders, so I can spend more time coding and less time worrying about how to fit things into memory.


There is one last interesting benefit that has popped up. If I decide I'm going to start working on the string module a lot, because I'm building out the set of routines for string editing and manipulation, ordinarily I'd be in a bit of a tough spot. string is 3rd highest in memory, with six or more modules below it. I'd hate to have to reassemble 7 modules(!) every time I make string a bit bigger.

Well, actually I don't have to. It's easy peasy in the modules.h file simply to move the string module to the bottom of the pile. Its jump table entries can stay where they are, although it would be handy to build out the entries in the jump table, even with placeholders for routines I know will be there soon. There is nothing inherently special about having string be the 3rd module instead of the 9th. It was just bothersome having to redo all the math.

Now that modules.h does all that math, it becomes trivial to rearrange the modules. If I know I'm going to work on a specific couple for a few days, I can just move them to the bottom of the pile at the start of my session, and all of a sudden the cycle time for testing them becomes very short again.


Just a quick update on toolkit. It has been the primary focus of the last month or so of work. I've put a lot of thought and a lot of code into it. However, it's still so in flux in my head that I haven't wanted to commit to writing anything about it yet. But soon. I've just recently worked out what I think will be a big score in making the drawing of views and clipping them to the insides of their bounds rect much more efficient and easier to program. As soon as I get a slightly better handle on how well that's going to work out, I plan to write my first blog post discussing the general architecture of the toolkit. Stay tuned.

August 4, 2017Programming Reference

6502 / 6510 Instruction Set

Every good Commodore 64 programmer needs to have the 6502/6510 instruction set at his or her fingertips. There are already many reference texts like this out there, however I find all of them to be lacking.

It is my goal for the presentation below to be the fastest, easiest–to–use and best organized 6502/6510 instruction set reference text on the internet. If you have ideas for how I can improve it, please let me know in the comments.

Full credit to 6502.org for the original source of this content, which I have formatted, rearranged, styled and fixed the HTML.

Alphabetically Ordered

A:

ADC AND ASL

B:

BCC BCS BEQ BIT BMI

  

BNE BPL BRK BVC BVS

C:

CLC CLD CLI CLV

  

CMP CPX CPY

D:

DEC DEX DEY

E:

EOR

I:

INC INX INY

J:

JMP JSR

L:

LDA LDX LDY LSR

N:

NOP

O:

ORA

P:

PHA PHP PLA PLP

R:

ROL ROR RTI RTS

S:

SBC SEC SED SEI

  

STA STX STY

T:

TAX TAY

  

TSX TXA TXS TYA

 

Execution Times

Op code execution times are measured in machine cycles; one machine cycle equals one clock cycle. Many instructions require one extra cycle for execution if a page boundary is crossed; these are indicated by a + following the time values shown.

Notes Links


Bitwise Instructions

AND (bitwise AND with accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     AND #$44      $29  2   2
Zero Page     AND $44       $25  2   3
Zero Page,X   AND $44,X     $35  2   4
Absolute      AND $4400     $2D  3   4
Absolute,X    AND $4400,X   $3D  3   4+
Absolute,Y    AND $4400,Y   $39  3   4+
Indirect,X    AND ($44,X)   $21  2   6
Indirect,Y    AND ($44),Y   $31  2   5+

+ add 1 cycle if page boundary crossed

EOR (bitwise Exclusive OR with accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     EOR #$44      $49  2   2
Zero Page     EOR $44       $45  2   3
Zero Page,X   EOR $44,X     $55  2   4
Absolute      EOR $4400     $4D  3   4
Absolute,X    EOR $4400,X   $5D  3   4+
Absolute,Y    EOR $4400,Y   $59  3   4+
Indirect,X    EOR ($44,X)   $41  2   6
Indirect,Y    EOR ($44),Y   $51  2   5+

+ add 1 cycle if page boundary crossed

ORA (bitwise OR with Accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     ORA #$44      $09  2   2
Zero Page     ORA $44       $05  2   3
Zero Page,X   ORA $44,X     $15  2   4
Absolute      ORA $4400     $0D  3   4
Absolute,X    ORA $4400,X   $1D  3   4+
Absolute,Y    ORA $4400,Y   $19  3   4+
Indirect,X    ORA ($44,X)   $01  2   6
Indirect,Y    ORA ($44),Y   $11  2   5+

+ add 1 cycle if page boundary crossed

ASL (Arithmetic Shift Left)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ASL A         $0A  1   2
Zero Page     ASL $44       $06  2   5
Zero Page,X   ASL $44,X     $16  2   6
Absolute      ASL $4400     $0E  3   6
Absolute,X    ASL $4400,X   $1E  3   7

ASL shifts all bits left one position. 0 is shifted into bit 0 and the original bit 7 is shifted into the Carry.

LSR (Logical Shift Right)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   LSR A         $4A  1   2
Zero Page     LSR $44       $46  2   5
Zero Page,X   LSR $44,X     $56  2   6
Absolute      LSR $4400     $4E  3   6
Absolute,X    LSR $4400,X   $5E  3   7

LSR shifts all bits right one position. 0 is shifted into bit 7 and the original bit 0 is shifted into the Carry.

ROL (ROtate Left)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ROL A         $2A  1   2
Zero Page     ROL $44       $26  2   5
Zero Page,X   ROL $44,X     $36  2   6
Absolute      ROL $4400     $2E  3   6
Absolute,X    ROL $4400,X   $3E  3   7

ROL shifts all bits left one position. The Carry is shifted into bit 0 and the original bit 7 is shifted into the Carry.

ROR (ROtate Right)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Accumulator   ROR A         $6A  1   2
Zero Page     ROR $44       $66  2   5
Zero Page,X   ROR $44,X     $76  2   6
Absolute      ROR $4400     $6E  3   6
Absolute,X    ROR $4400,X   $7E  3   7

ROR shifts all bits right one position. The Carry is shifted into bit 7 and the original bit 0 is shifted into the Carry.


Program Counter

When the 6502 is ready for the next instruction it increments the program counter before fetching the instruction. Once it has the op code, it increments the program counter by the length of the operand, if any. This must be accounted for when calculating branches or when pushing bytes to create a false return address (i.e. jump table addresses are made up of addresses-1 when it is intended to use an RTS rather than a JMP).

The program counter is loaded least signifigant byte first. Therefore the most signifigant byte must be pushed first when creating a false return address.

When calculating branches a forward branch of 6 skips the following 6 bytes so, effectively the program counter points to the address that is 8 bytes beyond the address of the branch opcode; and a backward branch of $FA (256-6) goes to an address 4 bytes before the branch instruction.

Branch Instructions

Affect Flags: none

All branches are relative mode and have a length of two bytes. Syntax is "Bxx Displacement" or (better) "Bxx Label". See the notes on the Program Counter for more on displacements.

Branches are dependant on the status of the flag bits when the op code is encountered. A branch not taken requires two machine cycles. Add one if the branch is taken and add one more if the branch crosses a page boundary.

MNEMONIC                       HEX
BPL (Branch on PLus)           $10
BMI (Branch on MInus)          $30

BVC (Branch on oVerflow Clear) $50
BVS (Branch on oVerflow Set)   $70

BCC (Branch on Carry Clear)    $90
BCS (Branch on Carry Set)      $B0

BNE (Branch on Not Equal)      $D0
BEQ (Branch on EQual)          $F0

There is no BRA (BRanch Always) instruction but it can be easily emulated by branching on the basis of a known condition. One of the best flags to use for this purpose is the oVerflow which is unchanged by all but addition and subtraction operations.

A page boundary crossing occurs when the branch destination is on a different page than the instruction AFTER the branch instruction. For example:

SEC
BCS LABEL
NOP

A page boundary crossing occurs (i.e. the BCS takes 4 cycles) when (the address of) LABEL and the NOP are on different pages. This means that

CLV
BVC LABEL
LABEL NOP

the BVC instruction will take 3 cycles no matter what address it is located at.


Compare Instructions

CMP (CoMPare accumulator)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CMP #$44      $C9  2   2
Zero Page     CMP $44       $C5  2   3
Zero Page,X   CMP $44,X     $D5  2   4
Absolute      CMP $4400     $CD  3   4
Absolute,X    CMP $4400,X   $DD  3   4+
Absolute,Y    CMP $4400,Y   $D9  3   4+
Indirect,X    CMP ($44,X)   $C1  2   6
Indirect,Y    CMP ($44),Y   $D1  2   5+

+ add 1 cycle if page boundary crossed

Compare sets flags as if a subtraction had been carried out. If the value in the accumulator is equal or greater than the compared value, the Carry will be set. The equal (Z) and sign (S) flags will be set based on equality or lack thereof and the sign (i.e. A>=$80) of the accumulator.

CPX (ComPare X register)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CPX #$44      $E0  2   2
Zero Page     CPX $44       $E4  2   3
Absolute      CPX $4400     $EC  3   4

Operation and flag results are identical to equivalent mode accumulator CMP ops.

CPY (ComPare Y register)

Affects Flags: S Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     CPY #$44      $C0  2   2
Zero Page     CPY $44       $C4  2   3
Absolute      CPY $4400     $CC  3   4

Operation and flag results are identical to equivalent mode accumulator CMP ops.

BIT (test BITs)

Affects Flags: N V Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     BIT $44       $24  2   3
Absolute      BIT $4400     $2C  3   4

BIT sets the Z flag as though the value in the address tested were ANDed with the accumulator. The S and V flags are set to match bits 7 and 6 respectively in the value stored at the tested address.

BIT is often used to skip one or two following bytes as in:

CLOSE1 LDX #$10   If entered here, we
.BYTE $2C         effectively perform
CLOSE2 LDX #$20   a BIT test on $20A2,
.BYTE $2C         another one on $30A2,
CLOSE3 LDX #$30   and end up with the X
CLOSEX LDA #12    register still at $10
STA ICCOM,X       upon arrival here.

Beware: a BIT instruction used in this way as a NOP does have effects: the flags may be modified, and the read of the absolute address, if it happens to access an I/O device, may cause an unwanted action.


Processor Flags

The Interrupt flag is used to prevent (SEI) or enable (CLI) maskable interrupts (aka IRQ's). It does not signal the presence or absence of an interrupt condition. The 6502 will set this flag automatically in response to an interrupt and restore it to its prior status on completion of the interrupt service routine. If you want your interrupt service routine to permit other maskable interrupts, you must clear the I flag in your code.

The Decimal flag controls how the 6502 adds and subtracts. If set, arithmetic is carried out in packed binary coded decimal. This flag is unchanged by interrupts and is unknown on power-up. The implication is that a CLD should be included in boot or interrupt coding.

The Overflow flag is generally misunderstood and therefore under-utilised. After an ADC or SBC instruction, the overflow flag will be set if the twos complement result is less than -128 or greater than +127, and it will cleared otherwise. In twos complement, $80 through $FF represents -128 through -1, and $00 through $7F represents 0 through +127. Thus, after:

CLC
LDA #$7F ;   +127
ADC #$01 ; +   +1

the overflow flag is 1 (+127 + +1 = +128), and after:

CLC
LDA #$81 ;   -127
ADC #$FF ; +   -1

the overflow flag is 0 (-127 + -1 = -128). The overflow flag is not affected by increments, decrements, shifts and logical operations i.e. only ADC, BIT, CLV, PLP, RTI and SBC affect it. There is no op code to set the overflow but a BIT test on an RTS instruction will do the trick.

Flag Instructions (Processor Status)

Affect Flags: as noted

These instructions are implied mode, have a length of one byte and require two machine cycles.

MNEMONIC                    HEX LEN TIM
CLC (CLear Carry)           $18  1   2
CLI (CLear Interrupt)       $58  1   2
CLV (CLear oVerflow)        $B8  1   2
CLD (CLear Decimal)         $D8  1   2

SEC (SEt Carry)             $38  1   2
SEI (SEt Interrupt)         $78  1   2
SED (SEt Decimal)           $F8  1   2

Jump Instructions

JMP (JuMP)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Absolute      JMP $5597     $4C  3   3
Indirect      JMP ($5597)   $6C  3   5

JMP transfers program execution to the following address (absolute) or to the location contained in the following address (indirect). Note that there is no carry associated with the indirect jump so:

AN INDIRECT JUMP MUST NEVER USE A VECTOR BEGINNING ON THE LAST BYTE OF A PAGE

For example if address $3000 contains $40, $30FF contains $80, and $3100 contains $50, the result of JMP ($30FF) will be a transfer of control to $4080 rather than $5080 as you intended i.e. the 6502 took the low byte of the address from $30FF and the high byte from $3000.

JSR (Jump to SubRoutine)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Absolute      JSR $5597     $20  3   6

JSR pushes the address-1 of the next operation on to the stack before transferring program control to the following address. Subroutines are normally terminated by an RTS op code.

RTS (ReTurn from Subroutine)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Implied       RTS           $60  1   6

RTS pulls the top two bytes off the stack (low byte first) and transfers program control to that address+1. It is used, as expected, to exit a subroutine invoked via JSR which pushed the address-1.

RTS is frequently used to implement a jump table where addresses-1 are pushed onto the stack and accessed via RTS eg. to access the second of four routines:

LDX #1
JSR EXEC
JMP SOMEWHERE

LOBYTE
.BYTE <ROUTINE0-1,<ROUTINE1-1
.BYTE <ROUTINE2-1,<ROUTINE3-1

HIBYTE
.BYTE >ROUTINE0-1,>ROUTINE1-1
.BYTE >ROUTINE2-1,>ROUTINE3-1

EXEC
LDA HIBYTE,X
PHA
LDA LOBYTE,X
PHA
RTS

RTI (ReTurn from Interrupt)

Affects Flags: all

MODE          SYNTAX        HEX LEN TIM
Implied       RTI           $40  1   6

RTI retrieves the Processor Status Word (flags) and the Program Counter from the stack in that order (interrupts push the PC first and then the PSW).

Note that unlike RTS, the return address on the stack is the actual address rather than the address-1.


Math Instructions

ADC (ADd with Carry)

Affects Flags: S V Z C

MODE          SYNTAX        HEX LEN TIM
Immediate     ADC #$44      $69  2   2
Zero Page     ADC $44       $65  2   3
Zero Page,X   ADC $44,X     $75  2   4
Absolute      ADC $4400     $6D  3   4
Absolute,X    ADC $4400,X   $7D  3   4+
Absolute,Y    ADC $4400,Y   $79  3   4+
Indirect,X    ADC ($44,X)   $61  2   6
Indirect,Y    ADC ($44),Y   $71  2   5+

+ add 1 cycle if page boundary crossed

ADC results are dependant on the setting of the decimal flag. In decimal mode, addition is carried out on the assumption that the values involved are packed BCD (Binary Coded Decimal).

There is no way to add without carry.

SBC (SuBtract with Carry)

Affects Flags: S V Z C

MODE          SYNTAX        HEX  LEN TIM
Immediate     SBC #$44      $E9   2   2
Zero Page     SBC $44       $E5   2   3
Zero Page,X   SBC $44,X     $F5   2   4
Absolute      SBC $4400     $ED   3   4
Absolute,X    SBC $4400,X   $FD   3   4+
Absolute,Y    SBC $4400,Y   $F9   3   4+
Indirect,X    SBC ($44,X)   $E1   2   6
Indirect,Y    SBC ($44),Y   $F1   2   5+

+ add 1 cycle if page boundary crossed

SBC results are dependant on the setting of the decimal flag. In decimal mode, subtraction is carried out on the assumption that the values involved are packed BCD (Binary Coded Decimal).

There is no way to subtract without the carry which works as an inverse borrow. i.e, to subtract you set the carry before the operation. If the carry is cleared by the operation, it indicates a borrow occurred.


Wrap-Around

Use caution with indexed zero page operations as they are subject to wrap-around. For example, if the X register holds $FF and you execute LDA $80,X you will not access $017F as you might expect; instead you access $7F i.e. $80-1. This characteristic can be used to advantage but make sure your code is well commented.

It is possible, however, to access $017F when X = $FF by using the Absolute,X addressing mode of LDA $80,X. That is, instead of:

LDA $80,X    ; ZeroPage,X - the resulting object code is: B5 80

which accesses $007F when X=$FF, use:

LDA $0080,X  ; Absolute,X - the resulting object code is: BD 80 00

which accesses $017F when X = $FF (a at cost of one additional byte and one additional cycle). All of the ZeroPage,X and ZeroPage,Y instructions except STX ZeroPage,Y and STY ZeroPage,X have a corresponding Absolute,X and Absolute,Y instruction. Unfortunately, a lot of 6502 assemblers don't have an easy way to force Absolute addressing, i.e. most will assemble a LDA $0080,X as B5 80. One way to overcome this is to insert the bytes using the .BYTE pseudo-op (on some 6502 assemblers this pseudo-op is called DB or DFB, consult the assembler documentation) as follows:

.BYTE $BD,$80,$00  ; LDA $0080,X (absolute,X addressing mode)

The comment is optional, but highly recommended for clarity.

In cases where you are writing code that will be relocated you must consider wrap-around when assigning dummy values for addresses that will be adjusted. Both zero and the semi-standard $FFFF should be avoided for dummy labels. The use of zero or zero page values will result in assembled code with zero page opcodes when you wanted absolute codes. With $FFFF, the problem is in addresses+1 as you wrap around to page 0.

Memory Instructions

LDA (LoaD Accumulator)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDA #$44      $A9  2   2
Zero Page     LDA $44       $A5  2   3
Zero Page,X   LDA $44,X     $B5  2   4
Absolute      LDA $4400     $AD  3   4
Absolute,X    LDA $4400,X   $BD  3   4+
Absolute,Y    LDA $4400,Y   $B9  3   4+
Indirect,X    LDA ($44,X)   $A1  2   6
Indirect,Y    LDA ($44),Y   $B1  2   5+

+ add 1 cycle if page boundary crossed

LDX (LoaD X register)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDX #$44      $A2  2   2
Zero Page     LDX $44       $A6  2   3
Zero Page,Y   LDX $44,Y     $B6  2   4
Absolute      LDX $4400     $AE  3   4
Absolute,Y    LDX $4400,Y   $BE  3   4+

+ add 1 cycle if page boundary crossed

LDY (LoaD Y register)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Immediate     LDY #$44      $A0  2   2
Zero Page     LDY $44       $A4  2   3
Zero Page,X   LDY $44,X     $B4  2   4
Absolute      LDY $4400     $AC  3   4
Absolute,X    LDY $4400,X   $BC  3   4+

+ add 1 cycle if page boundary crossed

STA (STore Accumulator)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STA $44       $85  2   3
Zero Page,X   STA $44,X     $95  2   4
Absolute      STA $4400     $8D  3   4
Absolute,X    STA $4400,X   $9D  3   5
Absolute,Y    STA $4400,Y   $99  3   5
Indirect,X    STA ($44,X)   $81  2   6
Indirect,Y    STA ($44),Y   $91  2   6

STX (STore X register)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STX $44       $86  2   3
Zero Page,Y   STX $44,Y     $96  2   4
Absolute      STX $4400     $8E  3   4

STY (STore Y register)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Zero Page     STY $44       $84  2   3
Zero Page,X   STY $44,X     $94  2   4
Absolute      STY $4400     $8C  3   4

 

DEC (DECrement memory)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     DEC $44       $C6  2   5
Zero Page,X   DEC $44,X     $D6  2   6
Absolute      DEC $4400     $CE  3   6
Absolute,X    DEC $4400,X   $DE  3   7

INC (INCrement memory)

Affects Flags: S Z

MODE          SYNTAX        HEX LEN TIM
Zero Page     INC $44       $E6  2   5
Zero Page,X   INC $44,X     $F6  2   6
Absolute      INC $4400     $EE  3   6
Absolute,X    INC $4400,X   $FE  3   7

Register Instructions

Affect Flags: S Z

These instructions are implied mode, have a length of one byte and require two machine cycles.

MNEMONIC                    HEX
TAX (Transfer A to X)       $AA
TAY (Transfer A to Y)       $A8
TXA (Transfer X to A)       $8A
TYA (Transfer Y to A)       $98

DEX (DEcrement X)           $CA
DEY (DEcrement Y)           $88
INX (INcrement X)           $E8
INY (INcrement Y)           $C8

Stack Instructions

These instructions are implied mode, have a length of one byte and require machine cycles as indicated. The "PuLl" operations are known as "POP" on most other microprocessors. With the 6502, the stack is always on page one ($100-$1FF) and works top down.

MNEMONIC                        HEX TIM
PHA (PusH Accumulator)          $48  3
PHP (PusH Processor status)     $08  3
PLA (PuLl Accumulator)          $68  4
PLP (PuLl Processor status)     $28  4

TSX (Transfer Stack ptr to X)   $BA  2
TXS (Transfer X to Stack ptr)   $9A  2

Other Instructions

BRK (BReaK)

Affects Flags: B

MODE          SYNTAX       HEX LEN TIM
Implied       BRK          $00  1   7

BRK causes a non-maskable interrupt and increments the program counter by one. Therefore an RTI will go to the address of the BRK +2 so that BRK may be used to replace a two-byte instruction for debugging and the subsequent RTI will be correct.

NOP (No OPeration)

Affects Flags: none

MODE          SYNTAX        HEX LEN TIM
Implied       NOP           $EA  1   2

NOP is used to reserve space for future modifications or effectively REM out existing code.

 

The original source for the above content is: http://6502.org/tutorials/6502opcodes.html

August 4, 2017Programming Reference

Commodore 64 PETSCII Codes

Here is the second most frequent table that I find myself using, after screen codes. The PETSCII table.

PETSCII code
(dec, hex)
Character
(up/gfx, lo/up)
PETSCII code
(dec, hex)
Character
(up/gfx, lo/up)
PETSCII code
(dec, hex)
Character
(up/gfx, lo/up)
PETSCII code
(dec, hex)
Character
(up/gfx, lo/up)
0 $00   64 $40 @ 128 $80   192 $C0
1 $01   65 $41 A a 129 $81 orange 193 $C1 A
2 $02   66 $42 B b 130 $82   194 $C2 B
3 $03 Stop 67 $43 C c 131 $83 Run 195 $C3 C
4 $04   68 $44 D d 132 $84   196 $C4 D
5 $05 white 69 $45 E e 133 $85 F1 197 $C5 E
6 $06   70 $46 F f 134 $86 F3 198 $C6 F
7 $07   71 $47 G g 135 $87 F5 199 $C7 G
8 $08 disable C=-Shift 72 $48 H h 136 $88 F7 200 $C8 H
9 $09 enable C=-Shift 73 $49 I i 137 $89 F2 201 $C9 I
10 $0A   74 $4A J j 138 $8A F4 202 $CA J
11 $0B   75 $4B K k 139 $8B F6 203 $CB K
12 $0C   76 $4C L l 140 $8C F8 204 $CC L
13 $0D Return 77 $4D M m 141 $8D Shift-Return 205 $CD M
14 $0E lo/up charset 78 $4E N n 142 $8E up/gfx charset 206 $CE N
15 $0F   79 $4F O o 143 $8F   207 $CF O
16 $10   80 $50 P p 144 $90 black 208 $D0 P
17 $11 cursor down 81 $51 Q q 145 $91 cursor up 209 $D1 Q
18 $12 reverse on 82 $52 R r 146 $92 reverse off 210 $D2 R
19 $13 Home 83 $53 S s 147 $93 Clear 211 $D3 S
20 $14 Delete 84 $54 T t 148 $94 Insert 212 $D4 T
21 $15   85 $55 U u 149 $95 brown 213 $D5 U
22 $16   86 $56 V v 150 $96 pink 214 $D6 V
23 $17   87 $57 W w 151 $97 dark grey 215 $D7 W
24 $18   88 $58 X x 152 $98 grey 216 $D8 X
25 $19   89 $59 Y y 153 $99 light green 217 $D9 Y
26 $1A   90 $5A Z z 154 $9A light blue 218 $DA Z
27 $1B   91 $5B [ 155 $9B light grey 219 $DB
28 $1C red 92 $5C pound 156 $9C purple 220 $DC
29 $1D cursor right 93 $5D ] 157 $9D cursor left 221 $DD
30 $1E green 94 $5E up arrow 158 $9E yellow 222 $DE
31 $1F blue 95 $5F left arrow 159 $9F cyan 223 $DF
32 $20 Space 96 $60 160 $A0 Shift-Space 224 $E0
33 $21 ! 97 $61 161 $A1 225 $E1
34 $22 " 98 $62 162 $A2 226 $E2
35 $23 # 99 $63 163 $A3 227 $E3
36 $24 $ 100 $64 164 $A4 228 $E4
37 $25 % 101 $65 165 $A5 229 $E5
38 $26 & 102 $66 166 $A6 230 $E6
39 $27 ' 103 $67 167 $A7 231 $E7
40 $28 ( 104 $68 168 $A8 232 $E8
41 $29 ) 105 $69 169 $A9 233 $E9
42 $2A * 106 $6A 170 $AA 234 $EA
43 $2B + 107 $6B 171 $AB 235 $EB
44 $2C , 108 $6C 172 $AC 236 $EC
45 $2D - 109 $6D 173 $AD 237 $ED
46 $2E . 110 $6E 174 $AE 238 $EE
47 $2F / 111 $6F 175 $AF 239 $EF
48 $30 0 112 $70 176 $B0 240 $F0
49 $31 1 113 $71 177 $B1 241 $F1
50 $32 2 114 $72 178 $B2 242 $F2
51 $33 3 115 $73 179 $B3 243 $F3
52 $34 4 116 $74 180 $B4 244 $F4
53 $35 5 117 $75 181 $B5 245 $F5
54 $36 6 118 $76 182 $B6 246 $F6
55 $37 7 119 $77 183 $B7 247 $F7
56 $38 8 120 $78 184 $B8 248 $F8
57 $39 9 121 $79 185 $B9 249 $F9
58 $3A : 122 $7A 186 $BA 250 $FA
59 $3B ; 123 $7B 187 $BB 251 $FB
60 $3C < 124 $7C 188 $BC 252 $FC
61 $3D = 125 $7D 189 $BD 253 $FD
62 $3E > 126 $7E 190 $BE 254 $FE
63 $3F ? 127 $7F 191 $BF 255 $FF

Notes:

  1. Codes $00-$1F and $80-$9F are control codes. Printing them will cause a change in screen layout or behavior, not an actual character displayed.
  2. Codes $60-$7F and $E0-$FE are not used. Although you can print them, these are, actually, copies of codes $C0-$DF and $A0-$BE.
  3. Code $FF is the BASIC token of the π (pi) symbol. It is converted internally to code $DE when printed and, vice versa, code $DE is converted to $FF when fetched from the screen. However, when reading the keyboard buffer, you will find code $DE for Shift-↑ (up arrow) as no conversion takes place there yet.
  4. Cleaned up from the original source: http://sta.c64.org/cbm64pet.html
August 3, 2017Programming Reference

Commodore 64 Screen Codes

The new format of C64OS.com has opened up an exciting new possibility for me. With the new dedicated weblog section of the site, and its left sidebar, I can now pin feature posts to the top to make sure people who randomly visit the site will discover them.

But the sidebar actually now gives me a place to open up a whole new categorization system. I am always looking around the web for the various programming references and documentation that I need. Now, I can make blog posts that contain key programming reference material, and simply tag those posts "Programming Reference". And the sidebar can display a list of posts so tagged. Boom, instant access to all my most frequently used reference materials available in one place.

I hope this post series will be useful to other people too.

Screen code
(dec, hex; reverse dec, hex)
Character
(up/gfx, lo/up)
Screen code
(dec, hex; reverse dec, hex)
Character
(up/gfx, lo/up)
0 $00 128 $80 @ 64 $40 192 $C0
1 $01 129 $81 Aa 65 $41 193 $C1 A
2 $02 130 $82 B b 66 $42 194 $C2 B
3 $03 131 $83 C c 67 $43 195 $C3 C
4 $04 132 $84 D d 68 $44 196 $C4 D
5 $05 133 $85 E e 69 $45 197 $C5 E
6 $06 134 $86 F f 70 $46 198 $C6 F
7 $07 135 $87 G g 71 $47 199 $C7 G
8 $08 136 $88 H h 72 $48 200 $C8 H
9 $09 137 $89 I i 73 $49 201 $C9 I
10 $0A 138 $8A J j 74 $4A 202 $CA J
11 $0B 139 $8B K k 75 $4B 203 $CB K
12 $0C 140 $8C L l 76 $4C 204 $CC L
13 $0D 141 $8D M m 77 $4D 205 $CD M
14 $0E 142 $8E N n 78 $4E 206 $CE N
15 $0F 143 $8F O o 79 $4F 207 $CF O
16 $10 144 $90 P p 80 $50 208 $D0 P
17 $11 145 $91 Q q 81 $51 209 $D1 Q
18 $12 146 $92 R r 82 $52 210 $D2 R
19 $13 147 $93 S s 83 $53 211 $D3 S
20 $14 148 $94 T t 84 $54 212 $D4 T
21 $15 149 $95 U u 85 $55 213 $D5 U
22 $16 150 $96 V v 86 $56 214 $D6 V
23 $17 151 $97 W w 87 $57 215 $D7 W
24 $18 152 $98 X x 88 $58 216 $D8 X
25 $19 153 $99 Y y 89 $59 217 $D9 Y
26 $1A 154 $9A Z z 90 $5A 218 $DA Z
27 $1B 155 $9B[ 91 $5B 219 $DB
28 $1C 156 $9Cpound 92 $5C 220 $DC
29 $1D 157 $9D] 94 $5D 221 $DD
30 $1E 158 $9Eup arrow 94 $5E 222 $DE
31 $1F 159 $9Fleft arrow 95 $5F 223 $DF
32 $20 160 $A0Space 96 $60 224 $E0Shift-Space
33 $21 161 $A1! 97 $61 225 $E1
34 $22 162 $A2" 98 $62 226 $E2
35 $23 163 $A3# 99 $63 227 $E3
36 $24 164 $A4$ 100 $64 228 $E4
37 $25 165 $A5% 101 $65 229 $E5
38 $26 166 $A6& 102 $66 230 $E6
39 $27 167 $A7' 103 $67 231 $E7
40 $28 168 $A8( 104 $68 232 $E8
41 $29 169 $A9) 105 $69 233 $E9
42 $2A 170 $AA* 106 $6A 234 $EA
43 $2B 171 $AB+ 107 $6B 235 $EB
44 $2C 172 $AC, 108 $6C 236 $EC
45 $2D 173 $AD- 109 $6D 237 $ED
46 $2E 174 $AE. 110 $6E 238 $EE
47 $2F 175 $AF/ 111 $6F 239 $EF
48 $30 176 $B00 112 $70 240 $F0
49 $31 177 $B11 113 $71 241 $F1
50 $32 178 $B22 114 $72 242 $F2
51 $33 179 $B33 115 $73 243 $F3
52 $34 180 $B44 116 $74 244 $F4
53 $35 181 $B55 117 $75 245 $F5
54 $36 182 $B66 118 $76 246 $F6
55 $37 183 $B77 119 $77 247 $F7
56 $38 184 $B88 120 $78 248 $F8
57 $39 185 $B99 121 $79 249 $F9
58 $3A 186 $BA: 122 $7A 250 $FA
59 $3B 187 $BB; 123 $7B 251 $FB
60 $3C 188 $BC< 124 $7C 252 $FC
61 $3D 189 $BD= 125 $7D 253 $FD
62 $3E 190 $BE> 126 $7E 254 $FE
63 $3F 191 $BF? 127 $7F 255 $FF
  1. Codes $80-$FF are the reverse equivalents of codes $00-$7F
  2. Cleaned up from the original source: http://sta.c64.org/cbm64scr.html
August 1, 2017Programming Theory

Base Conversion in 6502 (2/2)

Very close to a month ago I wrote an article called Base Conversion in 6502. That piece was a part 1 of 2 because I felt like it was getting a little long. In part 1, I went through the process of converting an ASCII string representation of a number to the actual integer equivalent.

This piece is Base Conversion in 6502, part 2. In this article I'm going to walk through the reverse procedure. How to convert an integer into a string representation of that number. In either direction the base of the representation is arbitrary. In the original I showed converting from a string representing a number in base10, but it would be just as easy to convert if the string held a number in base16, or base2, or base8 or whatever. Similarly, when we convert an integer into a string representation we can choose what base we want it to be converted to.

I feel like there was some confusion over my terminology in part 1, because I was refering to integers as numbers in hex, so I'd say, convert a string number to hex. When the CPU deals with integers, say to add them together, the integer is whatever its bits are counted in binary. So, if you have a byte in memory, and the byte's bits are 0000 0010, then we can say that as far as the CPU is concerned that's the integer 2. And it's super easy for the computer to add that to some other integer like 3 (0000 0011), to produce a one byte integer result 5 (0000 0101). If these numbers were in PETSCII, though, they would be 50 == "2" (0011 0010), 51 == "3" (0011 0011), and 53 == "5" (0011 0101), and so on.

The reason I used the words converting between a string and hex, is because in the assembler ints are most frequently typed in hex notation. And when assembled that number typed in hex is represented in the CPU as an integer.

Meanwhile, you can represent a number in binary, in a string! Where each bit is represented by a whole byte in the string, either "0" (48, 0011 0000), or "1" (49, 0011 0001). So, converting an integer like 5 (0000 0101) to a string representation in base2, it would be converted to the string "00000101" which is actually 8 bytes long, 48 48 48 48 48 49 48 49. Okay, here we go.

Convert an INT to a string representation

This is the opposite process. We have an int (an integer), but we want to print it out on the screen in characters that are readable by a human.

What would happen if you didn't bother to do a conversion and just dumped the byte directly to screen memory? Well, imagine your byte was the integer 83. If you dump that byte to the screen the VIC II will draw it as screen code 83, it'll draw a single character that looks like a heart. How would you then figure out what that byte really is? You'd have to refer to a screen codes chart, search through the chart for a heart and then see that the heart is 83, bingo.

What are we really doing when do this? Well, it's kinda like representing the number in base256. Base10 has 10 symbols (0,1,2,3,4,5,6,7,8 and 9), and if the integer is greater than 9 we need a second character to hold another of the 10 symbols. Base2 has 2 symbols (0 and 1), and if the int is greater than 1 we need another character for another either "0" or "1". Base256 would mean we need 256 unique symbols. And if the integer is greater than 255 then we'd need a new character with another one of our 256 symbols. But, that's what screen codes are; 256 unique symbols. A one byte integer requires exactly one screen code character.

But that would be terrible, it would take forever to figure out what the number is in a way that we understand it. And what's worse, many screen codes look very similar to each other. 116 and 117 are both left vertical bars, 117 is just a single pixel wider than 116. So, representing the number in a base256 string is not really workable. We need to convert to some smaller base.

The easiest way to get something we could describe as human readable is to convert to a string that is in base16, or hexadecimal. The reason this is easiest is because the 8 bits of the underlying int break straight down the middle: 4 upper bits (the upper nybble), and 4 lower bits (the lower nybble). Each nybble is a number from 0 to 15 and can be used as an index into a lookup table of 16 PETSCII characters, "0123456789abcdefg".

I personally use this all the time for debug output when coding. I've got some ints that I'm working with and I need to see what those ints are, so, short of using a monitor which does these conversions for us, I just want to spit out some numbers to the screen.

In fact, I already gave the source for such a conversion routine, inttohex, in the post Implementing Factorial In 6502.

It's very very easy. By shifting right (LSR) four times, the upper four bits are shifted down into the lower nybble and the upper 4 places replaced with zeroes. This is now a number in the range 0 to 15, which can be used as a lookup index into a string of digits, labeled above as hexits.

And to get the lower 4 bits alone is even easier. You simply mask the whole byte using AND #%00001111. Some juggling is required to preserve the numbers as you go, but by and large it is simple, fast and super useful.

So, we're done! No, we're not.

Converting to base16 is really convenient for programmers who are debugging, but most people don't understand hexadecimal because they don't have 8 fingers per hand, like programmers do. Most people have only 5 fingers per hand (the thumb is a finger for this exercise.) So most people want to see the output in base10.

Convert an INT to a string in an arbitrary base

We return now to the article about Large Look-up Tables for Hyperfast, Accurate, 16-Bit Scaled-Integer Math by Garth Wilson over at Wilson Mines Co. The original place where I saw descriptions of the principles of base conversion. Here's his prose description of converting from an int to a string in an arbitrary base:

For converting hex numbers to other bases for output (which will normally be a string), initialize a blank string. You will build it from right to left. Divide your number by what's in variable BASE, and use the remainder to add a text digit to the build, even if it's a 0. Keep doing that until there's 0 in the number.

As before, I'll convert that to pseudo code for you.

Convert from INT to a different base:

Define BASE.
Define symbol table with at least BASE-n symbols.
Initialize a blank string.

loop:
Divide number by BASE.
Use remainder as index into symbol table.
Prepend symbol to string.
Go loop until number is zero.

Next let's step through the pseudo code with an example number, seeing how it converts one step at a time.

BASE = $A (10 in decimal)
$A894
--------------------
loop 0:
$A894 / BASE == $10DB (remainder $6)
"6" + ""

loop 1:
$10DB / BASE == $1AF (remainder $5)
"5" + "6"

loop 2:
$1AF / BASE == $2B (remainder $1)
"1" + "56"

loop 3:
$2B / BASE == $4 (remainder $3)
"3" + "156"

loop 4:
$4 / BASE == $0 (remainder $4)
"4" + "3156"

--------------------
"43156"

So before we proceed with an explanation, let's use our calculator and see if the conversion works. 43156 is indeed equal to $A894. That's convenient.

To start we define the variable BASE as $A, that's base 10, because we'd like to see our output string in the nice friendly base 10 we're all used to. But this would work just as well if we were to use some other base instead.

Between the two horizontal rules I've labelled 5 loop iterations, 0 through 4. It just works out that it takes this many steps to get our division result all the way down to zero.

The initial value is our dividend, and BASE is our divisor. After performing the division we get a result and a remainder. That remainder, as it happens, even if it is zero, is the least significant digit of the number in the base we're converting to. In this case, because we're converting to base10, the range of remainders even represented in hexadecimal will only ever be between $0 and $9.

The remainder has to be converted to a PETSCII character. This isn't shown here, but it's very easy. We can see by looking at a PETSCII table that $30 is "0", $31 is "1", $32 is "2" and so on. So to get the PETSCII character version of our remainder integer all we have to do is add $30. The character has to be added to the start of the string. In our implementation we'll take a look at how I might get around to doing that.

As long the result is not equal to zero we need to loop and continue the process. On each new loop, the result from the previous division becomes the dividend for the next division. In the final loop the dividend ($4) is smaller than BASE ($A) so the result is 0 plus the remainder which will be the final (most significant) digit in the string.

When we converted the other way, our implementation need to have a multiplication routine because the 6502 doesn't have a multiply instruction. Converting this direction, our real implementation will need a division routine. We will also need to worry about precision. In this case, as in the example in part 1, I'm opting to settle for 16-bit precision. So our maximum input number is $FFFF and the maximum string representation will be "65535".

Codebase64 to the rescue on the 16-bit division routine.

INT to a String number Implementation

And here is my implementation of the above. Works just as you'd hope.

Now let's walk through this a line at a time and see how it works.

The division routine that I grabbed from Codebase64 requires three 16-bit numbers in zero page. These are the first constants defined under Workspace. You only need to specify the low byte of these addresses. We have one for divisor, one for dividend and one for remainder. The result in this routine actually is the same address as dividend, which means dividend will continually be overwritten by result. This turns out to be super handy for our use case, as we'll see shortly.

Base is defined as 10, petoffset is the number that needs to be added to an int between 0 and 9 to produce the PETSCII characters "0" through "9". I make this a constant, because I hate magic numbers being sprinkled through code. And chrout is defined here as $FFD2, which is the KERNAL address for outputing a PETSCII character to the screen.

The code starts at $0801, that's where BASIC programs start. The first code encountered is the BASIC preamble. In my post Implementing Factorial on 6502 I talk a bit about how the preamble works. But here'd like to say that rather than just spitting out a series of inscrutable bytes, the assembler actually offers a variety of features that make producing the preamble from memory quite a simple task.

BASIC Preamble in TMP on a real display

The preamble is structured as a BASIC program. First a 2-byte pointer to the next line, this can be done with .word and a label, "end" in this case. Next is a 2-byte line number. This is easy to do with .word 10, that makes a 16-bit line number 10. Next we have a 1-byte BASIC token, this is the hardest part to remember, .byte $9e, that's the token for SYS. Next the SYS wants a memory address, encoded in PETSCII (ironic consider the nature of this article series), plus the BASIC line ends with a null byte $00. This can be very easily produced by the assembler with the .null keyword. The following quoted string is in PETSCII, and it automatically gets null terminated. To end the BASIC program we want a 16-bit $0000, this is easily produced with .word $00, and this is also the line that is pointed to by the first line's next line pointer. So it takes the "end" label. And boom, that's easy to type up from scratch, and is even comprehensible to read.

The main program starts at line 24. We start by putting our BASE into the divisor. This will not change, so it only has to be done once. Even though the divisor fits within 8-bits, the assembler notation of #> and #< will break it into $00 and $0A for us.

Next we put the number we want to convert into the dividend the same way we populated divisor. I've hardcoded $A894, but in a real program you'd probably be pulling this from somewhere.

The loop starts at line 36. The first thing it does is checks to see if dividend is 0. If it is, we're done and we can leave the loop to output our converted string. Dividend is a 16-bit number, but checking to see if it's zero can most easily be accomplished by loading one of its bytes, and then ORing that with the other byte. If the result of both bytes ORed together is still zero, then the whole dividend is zero, and the conversion is complete.

At line 40 we JSR to divid16. This routine overwrites the dividend with the result of the division, as well as populating the remainder, even if the remainder turns out to be zero. Because we're dividing by such a small number, $0A, remainder cannot ever be bigger than a value from $00 to $09, which also means that remainder's high byte will always be $00 and can be ignored.

Lines 42,43 and 44 load the remainder's low byte and add the PETSCII offset to it. Lines 46, 47 and 48 put the PETSCII character into the string. The string, str, is a pre-defined buffer that starts as 5 spaces, null terminated, at line 64. Line 63 is an index variable that starts at 4. That's the offset to the end of the string buffer. When we add the character to the string we add it at the index, then decrement the index. This is how I've decided to do the string prepending. It means I don't have to shift anything, but it also means the final string will always by 5 characters long, and any unused places are padded with spaces.

Line 50 jumps back to the start of the loop. Which will check the dividend to see if it's zero yet. The convenient thing here is that we don't need to explicitly move the result of the previous division into the dividend, because that happens automatically in the divid16 routine. Convenient.

Lastly, we want to output the finished string to the screen. This is done starting at line 53. We simply use the .X register to walk through the bytes of the string and call chrout on each byte to put them on the screen. The loop ends when the string's null byte is reached. And we return to BASIC. That's it.

Output of the conversion program

So, we load it up in vice and low and behold, 43156 gets spat out onto the screen. Looks like it works! This concludes this two part series on INT to String and String to Int conversion. Hopefully someone in the world who's learning 6502 will one day stumble upon this and find it useful.

If you want to download the program and try it out, or inspect how it assembled, you can download it here.

July 21, 2017Hardware

Commodore Logo Mark Patch

I've got something a little special for you today. I've been hard at work building C64 Luggable. You can read all about the project, my progress and how to build one yourself on this site's C64 Luggable page.

It's really great. I'm enjoying the challenge of working with physical materials, and also the intellectual fun of figuring out how to construct something that's modular, self–contained and has everything you'd want in a full Commodore system.

A couple of weeks ago I had a sudden realization that once C64 Luggable is complete I'll want to take it with me to friends' houses, parties, the cottage, and most of all, Commodore Expos such as this year's Commodore World.1 Who builds a custom luggable Commodore all-in-one and then doesn't take it to tradeshows to show it off?

The problem is that it's going to be somewhat vulnerable to nicks and scratches, or much worse, damage to the open flat screen or front speaker cones. Especially if I started lugging it around to shows and taking it in and out of my car. What I really need is a padded cover. It just so happens that my mother is an experienced and very talented quilter. She's an esteemed member of the Napanee Heritage Quilter's Guild. So I recruited her to help me make a custom fitted cover. I'm picturing it like one of those artsy Kleenex box covers, if you've seen those. It slips on over the top. Covers all four sides but leaves the button open so the computer will remain standing on its own feet. And the top cover will have a slit opening sized for the handle to pass through. It will be very handy.

But, we don't just want a generic–looking cover. It's gotta have some style and advertise what it is. So I came up with the idea of getting a custom embroidered patch made up that can be stitched to the front of it.

I looked around the internet and compared prices. The best price I found was from a company in the U.S. called CUSTOMPATCHES. You send them artwork, they mock it up graphically, and follow up with you to make sure it's what you want. Then they create a single proof piece and show you a picture of it. After you approve the proof they make as many as you want in fixed batch sizes. The smallest order is 25. But you can also order 50, 100, 200, 300, 500 and more. The price per unit drops substantially the more you order.

I only want 2 or three for myself. One for the C64 Luggable cover, and a couple more, maybe for a bag or a hoodie. The rest, well, I figure I'll sell them off for a few dollars (+shipping) a piece, mostly to recoup the upfront cost, and to share the love. They've been ordered but I haven't received them yet. They'll probably be available by the end of August 2017.


Here's the artwork I put together in my experiments, and the patch mockup they sent me. The dimensions are, as shown on the mockup, 3.5" by 2.4". They are backed with an iron–on adhesive. This means they can be applied to a bag, or a backpack, (or a luggable computer cover), with just an iron press. They promise me the adhesion is extremely good, and sewing them on isn't absolutely necessary, but they do recommend a single outer stitch to increase the longevity.

Contact me if you're interested in buying a few of these. Supplies available until they're all sold off. I'll update with final price when I know for sure what they'll cost me.

Commodore Logo Mark Commodore Word Mark Commodore Logo Mark (Patch Art)

Update: I'm still working through the proof process with them. The final version may not have the black outer border.

Commodore Logo Mark (Patch Art, Proof 1)

Update 2: This is the final proof. With some feedback from twitter, and my personal opinion, I opted to get rid of the black border. Some border is necessary to prevent fraying, but they've switched it to a white border.

Commodore Logo Mark (Patch Art, Proof 2)
  1. I had a blast at last year's Commodore World 2016. If you haven't read my review of the event, you can find it here. We had the pleasure of hanging out with Bil Herd, legendary designer of the Commodore 128. []
July 5, 2017Programming Theory

Object Orientation in 6502

I feel as though my abilities at 6502 assembly are progressing fairly quickly. I'm a pretty experienced programmer in higher level languages, such as Objective-C, Javascript and PHP, as well as C. But when I first started programming seriously in 6502 I was having a hard time seeing how to accomplish things that are quite common in higher level languages. Things like strings, arrays, structures, functions and objects. What you have available when coding in 6502 ASM is just so incredibly low–level, you have nothing but an open expanse of memory and a handful of simple instructions.

But, I made a lot of progress which I've blogged about earlier, especially in regards to pointers and structures. The next frontier, in my opinion, is learning how to program in an object–oriented style. In higher level languages the compiler or the interpreter itself constrains what you can do. But, these constraints are arbitrary. A computer can jump through any spaghetti mess that you can throw at it. The constraints are there to force you to follow certain design and organizational patterns. The point of organizing your code either functionally or objectively is because it is a good practice that leads to cleaner code that is easier to understand and maintain and that has fewer bugs as a result.

So, when we code in 6502 asm we're not constrained by a compiler to follow any particular good practice. But that doesn't mean we can't apply, as a matter of personal discipline, the organizational principles of object–orientation to our assembly, and then reap the benefits that come from that design. Let's first go over the theory and the terminology so we know what the benefits are in principle. And then after that we'll get into the meat and potatoes of exactly how these principles can be applied in 6502 asm.

What is object–orientation? The Theory.

There are usually four principles that object–oriented design is meant to meet. These are:

  • Encapsulation

  • Abstraction
  • Polymorphism

  • Inheritance

I separate these into encapsulation, and then abstraction and polymorphism, and lastly inheritance, because I find it difficult to separate abstraction and polymorphism into distinct principles. They seem to overlap and feed into one another that makes them hard to discuss individually.

Encapsulation

Encapsulation is to me the most important of the four. Rather than having some global data that any code anywhere in the program can see and change, you instead wrap up bundles of related data and hide them behind code that you have to call to get at that data. The distinction between an object and a structure gets really thin when you don't have a compiler constraining what code can work on what data.

A structure, as I've discussed before is a block of data with the possibility of being composed of smaller and different data types. A structure collects data of varying types together that are related to each other. The sub-types are laid out in such a way that the start of each sub-type within the structure has a known offset from wherever the structure as a whole starts in memory. I believe the example I used was a record of a person. A person may have a name, year of birth, height, social security number, gender, etc. These different data can be put together such that you only maintain a pointer to a person and then access the subtypes as offsets into person. So, if you know the struct starts at $8000, and name is at $02, then name is at $8000,$02, and gender might be at $8000,$06, etc. The 6502 is pretty good at this, because it offers the indirect indexed addressing mode. You set the .Y register to the offset of the sub type you want, and put a pointer to the start of the struct somewhere in zero page.

Structs are really great, but then you need routines that know how to operate on the layout of a particular type of struct. You might have person records, and job records, which are structured differently. You'd run into a huge problem if you called a routine that was designed to operate on a person struct but you passed it a job struct. What objects do is take structs one step further, and make some of the struct's properties pointers to routines that are capable of operating on that very type of struct. A bit head spinning at first, by very clever when you think about it.

The encapsulation separates the design of the object from the programmer who might later try to use the object in their own code. Why would anyone want to do that? Well, we'll see that eventually when I post in more detail about the object–oriented nature of the widget toolkit. But in short, you want the designer of the operating system to create some objects that are useful. Then you want a programmer who doesn't directly collaborate with the OS designer, but just reads an API document, to be able to write his own application that makes use of the object. Then, here's the real magic, you want the OS designer to be able to change, improve and expand the capabilities of the OS's objects. And you want the third-party programmer's application not only to not break, but to automatically gain new functionality, effectively for free.

If the third party programmer had complete knowledge of the internal layout of the structure and thus hardcoded direct manipulations of that structure, then his code would break if the OS designer changed the object's internal layout. In order to avoid this problem it is important that the programmer does not directly manipulate the encapsulated data. In 6502 we cannot enforce that the programmer doesn't violate the principle of encapsulation, but we can certainly honor the notion that the data ought to be encapsulated and simply restrain ourselves from direct manipulation.

Abstraction and Polymorphism

In order to understand how it is possible to work with an object whose properties are all encapsulated it is necessary to move on and discuss how objects work. The ideas of abstraction, inheritance and polymorphism will all start to blur in this description.

The crucial step that separates a struct from an object is that an object contains properties that point to routines that are designed to act upon itself and to understand the internal layout of its own structure. This allows the third party programmer to not need to write his own code that manipulates the encapsulated data.

Instead, the documentation specifies a handful of "publicly" known offsets, typically to routines, which in object parlance are called "methods." The programmer then takes a pointer to the object, and looks up a method, and calls the method. The method is given access to a pointer to its own structure, and the method, which was written by the OS designer, operates on the internals of the struct to change the data, retrieve some data for the user, or perform some other function.

So, encapsulation hides the data of the object from the programmer, and instead presents publicly known method offsets. The methods of an object are always known to operate on the object they belong to. So you never need to worry about accidentally calling a routine but passing it the wrong type of record. This design pattern, encapsulating the data and only accessing that data by calling methods which operate on the structure, essentially defines what it means to be an object. However, the relationship between objects is where abstraction, inheritance and polymorphism come into the picture.

The collection of public methods that an object has are known as its interface. They are the calls by which third party code engages with and interacts with the object. But what if you had two different objects, with different encapsulated properties and different internal behaviors, but which exposed exactly the same interface? Well, this is where you get abstraction and polymorphism. What's the difference between abstraction and polymorphism? Well, lots of people ask that question and I'm not 100% certain myself. But I can take a stab at it.

Let's say we have two objects, one that represents a person. Internally the person has a property that stores their date of birth as an offset in seconds since the unix epoch. The other object represents a company, and stores the founding date of the company as a string in YYYY-MM-DD format. But they both have a method called "Start Year". When called, the person object returns 1981, and the company object returns 1975. These two figures can be compared to each other, even though the true underlying data representation is radically different and not directly comparable. The methods thus serve to abstract the underlying data structure, and the third party program that makes use of these classes does not need to know or worry about the internal representations of the data.

Polymorphism is similar, but I think it relates to abstracting behavior, which allows one object to seamlessly substitute for another. So, imagine you have a program and it wants to log error messages. There could be three different types of objects, which each have a "log" method, that take a string for message text. But one of those logger objects emails the message text, another prints it to a printer, and another appends it to a log file. The third party program doesn't actually care which logger object is given to it, it will use and interface with all three in exactly the same way. But the result of the substituting one for another is that a completely different form of output will take place behind the scenes. This is what "polymorph"-ism means, one interface manifests in "many forms."

Inheritance

Abstraction and polymorphism are about how you can use different types of objects in similar ways. Inheritance, then, is about deriving a type of object from a more primitive class of object, such that they have at least some part of their interface in common, which allows for their substitution.

It's important to define here the difference between a class and an object. A class is a description or a prototype of an object. An object is thus an instance of a class. The difference is similar to the definition of a structure, and an actual instance of the structure in memory. The definition of the structure may be that it has 5 properties, 2 bytes wide each. When you want to make an instance of the structure, you effectively just allocate some memory that is the size of the structure, 10 bytes in this case. After that you may want to initialize that memory by assigning known values to each of the 5 properties. Now you have this 10 bytes of memory that are an instance of the structure. Your code could make hundreds or thousands of instances of the structure by allocating another 10 bytes for each new instance.

Objects work essentially the same way. Except that in higher level languages their initialization is performed automatically by a standard method which is known as its constructor. High level languages typically allow you to create an instance of an object with the new keyword. So new Person; would allocate memory the size of Person. Then automatically initialize that memory using a routine which is defined as the constructor for the Person class. The instantiation process automatically assigns all the method pointers to all the routines that are declared as that Class's methods. The constructor method is generally responsible for initializing all the other data properties. And what is returned to the caller is a pointer to the initialized memory of the new instance. After this, we'll look at how we can implement this behavior in 6502.

Inheritance is a design pattern whereby classes are hierarchically related to each other. For example, imagine you have a class that represents a piece of paper. It may have properties such as width, height, thickness, and color. You might then have a subclass of paper called lined paper. Lined paper actually has all of the properties of paper, but it adds extended properities such as line color, line thickness, line spacing, top margin and bottom margin. And you might have a third class that is also a subclass of paper called music paper. This has all the properties of paper, but none of the extra properties of lined paper, and instead offers its own extended properties such as base or treble lines, and how many rows of notation.

Further, beyond just the subclasses sharing the properties of the superclass, the superclass (paper in this case) also implements some methods, such as draw. When you draw a piece of paper it just clears out a rectangle on the screen proportional to its width and height properties. The two subclasses however also have the draw method, but their draw method is more sophisticated but derivative. So, when a lined paper object is instantiated, its draw method pointer is setup to call the draw routine declared for the lined paper subclass. However, when draw is called on a lined paper object, internally that routine can call the original draw rouine for its super class, paper. Paper's draw routine then does its thing, reads the width and height and clears out the rectangle with the correct paper color. After the superclass's draw method completes, the lined paper's draw method only needs to implement the extended functionality. It draws on top of the paper's colored rectangle according to the extended properties defining the lines.

That is really cool. In other words, when you want to implement lined paper, it is not necessary to reinvent the entire wheel and reduplicate the base drawing code. Instead a subclass extends a superclass. Or we say that the subclass inherits from the superclass. The fact that the subclass inherits from the superclass means that an instance of lined paper IS an instance of paper. And so any code that expects to be operating on a paper object should work just fine if it is instead given a lined paper or a music paper object.


So much for theory, what about the 6502?

Effectively, an object, an instance of a class, is a struct where some properties are pointers to methods. And a struct is just a block of memory the appropriate size, which is accessed using indirect indexing where the index offsets are declared by name. What does that mean?

Well, let's take the indexing by name part first. In our source code we will declare a block of labels like this:

The above is not an object, it's a struct definition. This is the sort of code that I use to declare the definition of a structure. Notice that p_size has been manually set to the the size of the whole collection. 2 for name, 1 for sex, 1 for height, 2 for year of birth. We need 6 bytes total for an instance of the person struct. To create one of these in memory we need to rely on our memory manager (and here). In fact, object oriented code essentially requires a memory manager, which is something that I wasn't anticipating when I originally discussed what a memory manager would be useful for. We ask the memory manager to allocate space the size of p_size. We get back a 16-bit pointer to somewhere in memory and that's where our struct is.

Next, we might want to initialize the struct's properties to known defaults. In order to do indirect indexing the 6502 requires that the pointer be in zero page, so we'll stick the pointer into say $fb-$fc, which we've defined as the ptr. Then we set the .Y register to an offset by using one of the struct property declarations and away we go. Here's how this might look in code.

So, that's what we might do in our code to go from a definition of a struct to an instantiated and initialized instance of a struct. And I've posted at length about structs before. The thing to note here is that a struct is not an object. The biggest problem with this code from an object oriented point of view is that the data is not encapsulated, the implemetation is not abstracted, there are no methods which act upon this data, and so polymorphism isn't possible, etc. But, what this does show is how we use those declared offsets to index properties of the struct, by name,1 through a pointer. This part of it is essential to how we'll use objects.

Next, let's look at some real world examples that take structs and move them into the world of objects. I've been working on an object–oriented widget toolkit that will standardize the UI and making building fancy hierarchical "graphical"–like user interfaces easy and consistent. I say graphical–like because the toolkit works the way you expect a modern GUI to work, but in C64 OS it is implemented in the text-based video mode, because it's the only way to get super fast and snappy results on the stock c64. Don't be disheartened though, most c64 video games use text-mode to produce fantastic looking and very fast results and the user doesn't even realize it is text mode, thanks to customizable character sets and smooth scrolling.

Toolkit Views, page 1 Toolkit Subviews, page 2

Okay, so let's take a look at these two spec sheets. They are preliminary, they'll probably change as I build out the functionality needed to support certain types of behavior such as drag and drop, but I have begun coding the essential drawing behavior around how they are presented above. I'll go into much more detail in a future post about how the widget toolkit is supposed to work, for the purposes of this post I only want to show real world examples of how to structure objects that feature inheritance, encapsulation, polymorphyism and abstraction, and how to do that in 6502.

We have here a set of 6 classes: view, scroll, split, label, button and input. Eventually I will be adding more, but I see these as a good start, they will let me compose some pretty interesting user interfaces. These objects are arranged hierarchically, such that they all inherit from view. For this reason, it is appropriate to refer to all of them as "views", because although only the root view is actually called "view", they are all subtypes of the essential view. Here's what the hierarchy of inheritance for these 6 classes looks like.

Object hierarchy of views

So, as we can see, view is the root object from which the other toolkit sub-views descend. Let's look at view in detail for a moment and see what properties it has and how its original spec sheet translates into code.

So, you can see that from the spec sheet, the size (or width) of each element has been translated into offsets from 0. The properites of view offset from zero, because view is the root object.

The first two elements are 1 byte wide each. The first is a node type. Every class in the toolkit will be assigned a type id that allows either the OS designer's own code, or third party code, to easily identify the object type. It is here called a node type but we'll see more about how why that is and how it is useful in a future post about the toolkit. The second element is a generic id. This allows the developer to arbitrarily tag up to 255 different objects to easily identify specific ones, we'll also see in a future post how this is useful.

The following 7 properties describe the view's render size and positional characteristics. I won't go into the details of how these work here, other than to say that the last of these properties is a resize mask. It is one byte, for 8 bit flags. The flags define which of the top/bottom and left/right sides should be honored and which ignored. Notice that top, bottom, left, right, width and height are all 2 bytes, or 16-bits each. This allows a view's height, for example, to be up to 65535 lines high.

Remember, that all the toolkit objects descend from view. That means that the metrics properties of view will be automatically inherited by all other view subclasses. These are the properties that define how every view in the entire toolkit are sized and resized.

The next three properties are link pointers, allowing views to be hierarchically arranged. Every view is linked to its sole parent, or container view, via the view_prnt pointer. Every view has the ability to be pointed to its first child, via view_chld. And the view_nsib is a pointer from a view to its next sibling. If a parent view wishes to iterate over all of its child views, it starts by getting a reference to its first child, and then traverses down the list of its siblings. Again, because all views inherit from this root view, all views can be plugged together using these link pointers.

Object Methods

Still we only have properties. Even though some properties are pointers to other nodes, we have actually already seen this with plain structs. The next four properties however are very interesting. They are pointers to four methods.

The first, view_draw, is the method that is called by the system's main event loop during a screen redraw phase if something has marked the screen as dirty. The view object's draw routine simply clears out a square (both the character map and color map) that is defined as its draw region by its parent view. We'll get into much more detail about how this works in a future post, because it's a huge topic.

But view does something slightly more than this. It also iterates over its children, refines its own region bounds to the region into which the child should draw, and then calls the child's own draw routine. It repeats this for all of its children. This allows the child to draw its specific appearance overtop of the area just cleared out by its parent. Note that this draw behavior is also recursive, because the view's children are also views which may have their own children, and so on down the view hierarchy.

Polymorphism is absolutely critical to how this works. The view's own draw routine knows that each child view must have a draw routine. But it doesn't care whether the child is another view view, or if it is a subclass of view. It doesn't care what the output of its child's drawing is going to be, only that it can ask it to draw itself. This is the very definition of polymorphism.

Let me briefly say what the other 3 method pointers do before describing how the methods are selected and called.

These three: view_mouse, view_kcmd, and view_kprnt should sound very familiar if you've read this post. If you care about this stuff, I highly recommend you go read that post first. These methods match the three basic event types that are generated by C64 OS's extended KERNAL. Mouse events, Keyboard Command events, and Keyboard Printable Text events. For mouse events, whichever view object is clicked2 becomes the first view to receive the mouse event. It does this simply by calling the view_mouse method of the clicked view. The implementation can then use a system call to read the details of the mouse event. However, by default, the view class doesn't have any particular behaviour for any type of mouse activity. So view's implementation is simply to call its parent view's view_mouse routine. Thus, each view up the hierarchy has an opportunity to handle the mouse activity.

The two keyboard event types, however, are dispatched on a principle of focus. One view is allowed to be in focus at a time. So the system maintains a pointer to the view in focus. If you click on an input, for example, the input's own mouse handling routine will tell the system to make it the view object with focus. After that point, when a keyboard event is generated, it is delivered to the focus view first, by calling the focus view's view_kcmd or view_kprnt method. Then, these methods behave just like view_mouse. If the focus view handles that type of keyboard event it can do something with it. Otherwise, it passes it up the hierarchy by calling the same method on the parent view.

Calling Object Methods

The question is, how to call the object methods. The first thing to note is that an object always has access to a pointer that points to itself, to its own data structure. That pointer is typically referred to as the this pointer. And so how pointers work on the 6502/6510 is pretty important. Pointers, for use in indirect indexed (or indexed indirect) addressing have to be in zero page. But that makes zero page a giant global variable space for everything going on in C64 OS. Much the same way there is only one stack. This is partially what makes the 6502 so ill-suited for premptive multitasking. On bigger CPUs, even the 65816, each process can have its own stack and its own zeropage (known as direct page.) However, C64 OS is not multitasking so the only real worry we need to have is that the interrupt routine not use the same zero page addresses that are reserved for use by regular programs.

Next, we want to define a system global, or at the very least toolkit global, zero page address to be used as the currently executing object's this pointer. Toolkit internally uses $fb-$fc. But as I get further along in development I might change this to something lower that is usually only used by the BASIC rom, which C64 OS patches out.

What this means is that, whenever any toolkit object's method is running, it can expect to find itself pointed to by $fb-$fc. In code that interacts with the toolkit, we can use a variable this = $fb to get the flavor of an object. In order to call an object's method all we have to do is, back up the current this pointer to the stack and change the this pointer to point to the target object. Next, read the method pointer from the target object and write that pointer either into the address of an indirect jmp, or self–modding the two bytes following a JSR, and then JSR to the method. When the routine returns we restore the this pointer from the stack. All of this, by the way, could probably be wrapped up in a macro to make it feel a lot more like you're merely calling a child's method.

Let's see this in code.

The comments in the code make it pretty much self–explanatory. $fb is declared as this, so we can use this throughout the code knowing what we're talking about. The current object's this is shoved on the stack to back it up.

Next we use the .Y register as the indirect index through this, to grab a pointer to this object's child. The low byte is being transferred into .X instead of being written to this right away, because obviously, we need this to continue to be valid so we can read the high byte. After we've got both bytes, we write them both to this. This now points to the child object. But we want to call the child's draw routine. So we use .Y to reference the draw routine's offset. And read the pointer to it through this (which is now the child's this.) And write that draw routine's address to the JSR's argument.

Writing to the JSR's argument is a self–modification of code. I like it, because it's fast and clean. It doesn't port well to run from a ROM however, and you may feel dirty using it. Alternatively you could write the address to somewhere in ram that is reserved as the address of an indirect JMP. (JMP ()). And then just JSR to that JMP (). But, I like the self–mod. The draw routine for the child now assumes that the this pointer points at itself, which it does. And it may repeat the process to call the method of some other object.

When the routine returns, we can restore the this pointer from the stack. Now, if this is the very end of the routine and we don't care about restoring this, we could skip the backing up and restoring of this. Whether you need to depends on who is going to be calling this routine. If it's only ever called by some other object, then the this doesn't need to be restored. If it's ever going to be called by another routine of this object, then the this has to be restored.

A little preview of inheritance

I'm cognizant of how long this post is becoming, so I'm going to wrap it up with this last section, and save many of its details for a future post.

But, let's suppose we are implementing the label class. What does it mean for it to be a subclass of view? How does it inherit from view? Well, here's what its data structure would look like.

The comment indicates that it descends from view. And so each of its properties have their offsets prepended by "view_size+", which means their offsets follow all the offsets defined by view. And lastly, label_size is set equal to view_size plus the additional size of the properties added by label. That's very cool. If we were to go back and add an extra couple of properties to view, and adjust view_size accordingly, then everything that's been coded for label will automatically get the new correct offsets and size when reassembled.

If another class, say button, descends from label, then its definition does not need to reference view. Button simply prepends its property offsets with "label_size+". And it specifies its own size as button_size = label_size+... however big the additional button properties are. Again, that's really cool. Because we could even add an intermediate class between view and label. Label would have to be updated to change who it inherits from, but button doesn't have to be changed at all, since it would still inherit from label. Startin' to see some cool design benefits here.

I don't have space here to go into how these classes are instantiated, nor how their constructors work. But in my next post on this topic I'll cover how memory is allocated, how the nodetype id is assigned, how the constructor method is called, how the constructor assigns method pointers, and how subclasses are able to override and extend their superclass's constructor and other methods.

Once again, if you have questions, thoughts or corrections, please leave them in the comments section.

  1. I should emphasize, "by name" is only at assemble time. Once assembled the program will have hard numeric offsets. If the OS designer changes the order of the properties or adds new properties to a superclass it would the property offsets of any subclass. The assembled program will no longer reference the correct offsets. One way to get around this is to have the program lookup the offsets at runtime before using them. However, there would be a speed penalty. []
  2. And how it is determined which view was beneath the mouse when the mouse was clicked is a huge topic for yet another post. But suffice it to say that it's the result of what is called hit testing. []
June 30, 2017Programming Theory

Base Conversion in 6502 (1/2)

I'm working my way through figuring out how to build a widget toolkit for a text-based UI that borrows a lot from the way hierarchical widget toolkits work in graphical user interfaces. But, it's a huge mountain to climb. And I don't want to let the blog go without any development love for too long.

In my post about implementing factorial, from absolute scratch, in 6502, the issue came up at the end of converting the binary result of the math into something that would be human readable. There I included a very simple routine that converts the number into hexadecimal. I chose hex because it is so simple. You just split the byte into two bytes where each has a range from 0 to 15. To grab the most significant 4-bits you shift them right 4 times. And to isolate the least significant 4-bits you can just mask out the upper 4-bits. And then, those two resultant values can be used as an index into a small lookup table of, say, PETSCII string values.

But how do you convert into something other than hexadecimal? What if you want convert a binary integer into decimal? Or vice versa? That's quite a different story.

Fortunately we have Garth Wilson of Wilson Mines Co. and his wonderful website about 6502 programming, http://wilsonminesco.com, to rely on. I found the following descriptions on his page about Large Look-up Tables for Hyperfast, Accurate, 16-Bit Scaled-Integer Math.

The whole article is interesting, although it's quite a bit beyond my level of experience. There is however a particularly useful section listed in the table of contents as, Easy conversion to and from any base. His description is written out in prose, but in my experimenting with the idea it seems to work well. So I thought I'd expound on it with a bit more practical detail here.

Convert a string number to HEX

Actually, what he refers to as hex is in my reckoning actually just binary, but we get to type those binary numbers in hexadecimal notation in our assembler. The difference between them is the following, if you have a single byte:

"1"    ==    0011 0001, or 0x31    (Petscii "1")
1      ==    0000 0001, or 0x01
"a"    ==    0100 0001, or 0x41    (Petscii "a")
$a     ==    0000 1010, or 0x0a

The question is, how do you convert a string of petscii digits into the equivalent conceptual value as the number counted in binary. Here's what Garth Wilson says about it:

For inputting numbers from a text string (e.g. typed in from a keyboard), initialize a number as 0 to build on, then take the first digit in the string and add its value to the hex number you're building. Continue until you're out of digits, each time multiplying the build by the number in variable BASE and then adding the new digit's value to the build.

Let me convert that into pseudo code for you.

Define BASE.
Initialize number to 0.

loop:
  Multiply number by BASE.
  Read first digit from string.
  Add its value to number.
  If string has more digits, go to loop.

End.

Next let's step through the pseudo code with an example number, seeing how it converts one step at a time.

BASE = $A (10 in decimal)
"43156"
--------------------
loop 0:
$0 * BASE == $0
"4" << "3156"
$0 + $4 == $4

loop 1:
$4 * BASE == $28
"3" << "156"
$28 + $3 == $2B

loop 2:
$2B * BASE == $1AE
"1" << "56"
$1AE + $1 == $1AF

loop 3:
$1AF * BASE == $10D6
"5" << "6"
$10D6 + $5 == $10DB

loop 4:
$10DB * BASE == $A88E
"6" << ""
$A88E + $6 == $A894

--------------------
$A894

Before describing what I did above, let's just use our calculator to confirm that $A894 is equal to 43156. And, it is.

So, we begin be defining BASE as $A. This is base 10, because that's the base that is used in the string input. If the string input were actually in octal, well then BASE should be defined as $8 and all the intermediate results would change.

Between the two horizontal rules, I've labelled 5 loop iterations (0 through 4). That's because the original input string contained 5 digits. Each loop iteration starts by multiplying our running total by BASE. In the zeroth iteration, because the running total started as 0, multiplying it by BASE doesn't change its value.

In each loop iteration we pull one digit off the left hand side of the string. That is, you are processing the most significant digit first. That "W" << "XYZ" is just to show that "W" has been pulled off the left side of the string, leaving behind only "XYZ" in the string, to be dealt with next iteration.

The single digit that is pulled off the string has to be converted to an integer. This isn't shown here, but it's super easy. If the input string is a decimal value in petscii, then using a PETSCII table we can see that "0" is $30, "1" is $31, "2" is $32, etc. So we just have to subtract $30 from the single character to get its integer value. That integer value is added to the running total. If the remaining string still has any digits left over, then we do another loop iteration.

In a real implementation we'll need to use a multiplication routine too, since the 6502 doesn't have instructions for multiplication. But we had a good 16-bit multiplication routine in the post about implementing factorial.

We'll also have to worry about precision here. If we use 16-bit multiplication and 16-bit addition inside the loop, it will limit the maximum size of integer that can be parsed out of a string representation of a number, to 65535.

String number to HEX Implementation

Here's a quick little implemention of the above. Works exactly as expected.

So, how does this work? Let's walk through it a line at a time. I'm using a very similar 16-bit multiplication routine to the one used in Implementing Factorial, with a small change. I removed the code that preserves multiplier, because I'm going to be clobbering multiplier with product immediately after the multiplication. However, for more info on this routine and where I got it you can read about it in the aformentioned post.

The initial constants are workspace variables for the multiplication routine. Then we declare that base is 10, because the string holds a number that's in base 10. And we define petoffset as $30, because I hate magic numbers. This is the offset from 0 as an integer to "0" the petscii character.

We'll start the code where a basic program starts, $0801, and follow it with the BASIC prelude which will SYS to the start of the real code. Read about this in the aforementioned post too.

Now, we need a running total, and we need to multiple that running total in each loop iteration by base. I've decided that "multiplier" shall serve as the running total. During the multiplication subroutine multiplier is corrupted, but the result is put into product. And multiplicand is left untouched. Multiplicand therefore should be base. So lines 18 to 21 copy base the constant, as a 16-bit number into multiplicand. And lines 25 to 27 initialize multiplier and the X register to zero.

The loop starts on line 29, where we read one byte from the string at index X. This will be the leftmost and most significant digit. Lines 30 and 31 check to see if this byte is 0. If it is, that's the null terminator on the string, the whole process is complete and it branches to done. If it got a legitimate digit it carries on, and increments X for the next trip around the loop.

The mult16 routine will destroy the accumulator. So first we back that up to the stack. JSR to mult16. This has corrupted multiplier, but the result is in product. We just want product to be put back into our running total, so we copy it into the multiplier which didn't have anything meaningful anyway. And then restore the accumulator with the digit from the string.

Lines 43 and 44 convert that single digit to an integer by subtracting "petoffset". And then lines 46 to 51 do a simple 16-bit add to add this digit to the running total. And line 53 restarts the loop to move on to the next digit. And that's it.

Output of the base conversion routine

How do we check that it worked? Run it. And the final result should be stored in multiplier which is at $4e,$4f, or 78 and 79. The screenshot shows that if we print out the peek of 78 and the peek of 79 we get 148 and 168 respectively. (Of course, BASIC is converting these byte integers back into strings to be displayed for us, irony.) However, if we stick them in our hex calculator, we see that 148 == $94 and 168 == $A8. Remember that multibyte ints are stored in little endian, or least significant byte first. So, reverse those and we get $A894.

That is indeed the correct answer, according to how I ran the pseudocode by hand. Pretty cool.

I don't want these posts to get too long, so I'll leave conversion the other way, from an int to a string, for part 2 of this post.

June 20, 2017Software

Huge Site Update

It's been two weeks since my last post, but I haven't been sitting on my thumb. For the last month, I've been working on a major overhaul to c64os.com, which you should now see for yourself around this post.

The site began as a simple one–page blog that gave me a place to talk about my development efforts on C64 OS, and to let me explain to the world the things I was learning along the way about 6502 programming and related Commodore technologies. But, I've got a passion for writing, I've known that for most of my adult life. I started to write some lengthy editorial posts here on the blog, including a review of the fanzine FREEZE64, and a review of World of Commodore 2016. I want to continue to write more of these sorts of posts, as well as hardware reviews and recommendations. However, with the simple blog format, posts quickly get pushed down and out of view by newer (and less important) posts.

I've had a longing to build a sort of online catalog of all the C64–related hardware that is commercially available. And I also have two main projects, C64 OS, and C64 Luggable, which really need some technical documentation. And documentation is not the same as a blog post. Blogs are about chronicling a journey, and are best followed and read over time. Or, sometimes a blog post will have some useful technical nuggets that will get picked up and indexed by search engines. To describe the technical details of C64 OS and C64 Luggable I really want a special page, with a table of contents, that describes each of these projects in their gory details.

So, that's what I've done. The changes to the site fix and address all of the above. The site is now divided into 5 main sections, which can be navigated with the main top navigation menu. A welcome page that talks about the rest of the site and gives a bit of history about me, and includes the full hero top section. Then, one page devoted to being the technical documentation, complete with table of contents, for C64 OS. One page devoted to the technical details, overview and construction plans of C64 Luggable. The forth page is the blog. But the blog has been freed from carrying the burden of what the other pages handle. The blog sidebar now has room for an index of past posts, search and feature posts. And the fifth and final page is, I am proud to announce, the Commodore 8-Bit Buyer's Guide.

I've put a huge amount of work into the Buyer's Guide. Finding sources of currently available products, organizing them into categories, finding high quality photos, and editing them to have a consistent size and background (whenever possible.) And assembling them into what I hope you'll agree is an easily navigable, graphically oriented, layout. My plan is for each product, or product family1, to have a feature page built out that provides more photos, and a consistent set of information about its price, how to order it, compatibility and more. Comments are open at the bottom of each feature page, as well as links and photos to similar hardware, or hardware you might want to get to go along with the featured piece.

At the time of this writing, I have only written a feature page for the 1541 Ultimate II+. Please bear with me as it takes a lot of time to build these out. When I began the project I figured there would be 15 to 20 available hardware projects. I was blown away when my list crossed the 100 line and then on up to around 120, give or take depending on availability runs. Now that I know there are so many products, projects and kits available, it makes me even more satisfied that I decided to take on the challenge of cataloging them.

Everything on the site is always a work in progress, but I think it's getting better and more useful as a resource all the time. If you have any comments or suggestions, please feel free to leave them below the article. (Comments are only available if you click the article title from the main blog page to see the article specific page. Or click the blue comments button below the title.)

  1. I am not planning, for example, to create a separate feature page for both the uIEC/SD and the Deluxe Daughter Card. These products are so clearly meant for each other that both entries in the Buyer's Guide will link to the same feature which discusses them both together. []
June 5, 2017Software

Recursive File Copier in BASIC

When I first began work on C64 OS, I was so new to 6502 programming that I just started to write all my code into one gigantic source file. It didn't take long before this was not sustainable, and I had to break the project down into smaller modules.

I'm also pretty sensitive to the risk of losing data. Not just from a harddrive crashing but also just from the sheer stupidity of something like accidentally overwriting a code file with its header. And so I've gotten in the habit, every night at the end of my coding session I backup my files from the CMD HD where I work on them to the uIEC/SD. I have a backups folder on an SD card, and inside that I have YYYY-MM-DD folders. So, I can recover from a previous backup and even review my history, it's a pretty good setup. Also, from the SD card, I periodically copy the backups to another computer.

When everything was in one file, backup was really easy! But as I started to split my project up into many smaller files it started to get a little bit trickier. The way I was doing the copying was to use the built–in JiffyDOS two drive copier. The steps would be something like this:

  • Switch to the uIEC/SD
  • Switch into the backups folder
  • Create and switch into a directory for the current date
  • Set the JiffyDOS target drive, (@X10)
  • Switch to the CMD HD
  • Load the directory (possibly with a pattern)
  • Hit Control-W to select all files
  • List the directory and use Control-A to deselect some files
  • Start the copy with RUN
  • Repeat from the CMD directory load with a different pattern

The pattern loads were to get, say, all the header files first. Then later get the source code files, and then the includes. All the while skipping the assembled object files, or other random text files etc. I've devised a single letter file extension system to organize all my project files by type. They are as follows:

  • no extension — Executable
  • .a — Source ASM
  • .s — Includes
  • .h — Headers
  • .t — Text File
  • .o — Object File
  • .m — Menu Definition
  • .i — Info/Config file
  • etc.

That's quite a few steps, and the amount of work keeps getting greater the more files I add to the project. But then something much worse happened. In the post about Application Loading, I discussed my plans to have C64 OS load applications out of subdirectories. The main system folder will have Services and Applications subfolders. Inside these are folders named for applications. And inside each of these will be the standard layout of main executable, menu definitions file, about info and other app specific resources.

I've started to build out the first app I'll be writing for C64 OS, App Launcher. And instantaneously my old backup system has become paralyzingly laborious. It got to the point where I was starting to dread the process of just doing a simple backup. But, I have to do them every night or risk losing all my code in the case of a disk failure. My first thought was, one of the very first features I should add to the C64 OS File Manager is the ability to recursively copy folders. That's a fine idea, but I know I'm going to have untold many more nights of coding before that will be ready. I need a solution for backup now.

BASIC: The original scripting language

I've yet to see BASIC described as a scripting language, but the more I think about it the more apt a description I think it is. BASIC is often described as a high level language. But there is an important difference between high level languages and scripting languages. Usually the key difference is that scripting languages are interpreted. Well, BASIC is a high level interpreted language. It's effectively a scripting language since before people started to call them that.

I decided to try to write a quick and dirty file copy utility in BASIC. The problem is that I had no idea how to write a file copier. Commodore DOS provides commands for copying files between mechanisms that belong to one device, and are controlled by one device controller. So, for example, the CBM 3040 or 4040 Dual Disk Drive. You can issue a command telling them to copy a file by specifying one filename preceded by the internal drive mechanism number, and then the destination filename preceded by a different drive mech number.

CBM Model 3040 Dual Disk Drive

This inter-mechanism file copying was later supported by the CMD HD, and is currently still supported by the uIEC/SD, where the drive mechanism numbers are instead partition numbers of the same physical mechanism. That was a very clever adaptation that maintained backwards compatibility with existing Commodore DOS syntax. But, how do you get data transfered from one serial bus device to a different serial bus device? I want to transfer data from a CMD HD to the uIEC/SD.

I succeeded in making what is probably the slowest file copier that has ever been written for the C64. Yay, I've made history! However, I do know of a way that I could make it slower. About the only thing I know is that reading one byte from one disk and then writing it to a different disk and repeating is unimaginably slow because between each read and each write the computer has to send all the TALK/UNTALK/LISTEN/UNLISTEN overhead. But, the problem is that I just didn't know of a very good way to read many bytes at a time such that I could write many bytes to the destination. Here's what I came up with: (don't bother to try this at home folks.)

CopyFile in BASIC

Here's how this works. I begin by making a dimension (an array) of string variables with a capacity of 10,000 elements. Next I open a file (drives.h) for read on LF#2, and another file (drives.h2) for write on LF#3. (In the screenshot they're both on the same device, but this can by changed to a different device number, of course.) Next I set a variable k to 0, which is the length of data stored in the dimension d$. BASIC v2 evidently has no way of dynamically getting the length of how much of the dimension has been written to.

At line 50 I read a byte, check if we're at the end of the file. Otherwise line 60 effectively pushes the byte onto the end of the dimension, but setting d$(k)=a$ and then incrementing k. As long as k is less than 10000 keep looping grabbing more bytes from the source and pushing them onto the dimension. When the dimension is full we go into another loop that writes all the bytes in the dimension to the output file. This is done by incrementing another variable l from zero until it's equal to k. And then we go back to line 40 to do another 10,000 byte read phase. If we hit the end of the file before filling the dimension, there is another loop at line 100, 110 that will write out the end of the file. Then both files are closed.

Believe it or not, for one file, a small sequential text file, this actually works. So it tricked me into thinking that maybe this code could be part of a larger script that would copy multiple files. That turned out to be bunk. After writing the larger program and watching it start to copy I realized it was going to take like an hour. So instead of waiting staring at the screen I spun around to another computer and started up a game of CREATURES 2 that I just got in the mail from Psytronik. I played for at least 45 minutes, maybe an hour, and when I turned around the utility had only copied about 10 files. Clearly, a file copy utility that is going to take 5 hours or more to finish is nearly useless.

Damn.

I didn't know how else to read bytes from one file and write them to another file any faster using BASIC. So I went to bed feeling as though I'd failed. I had a solution, but it wasn't exactly a viable solution.

JiffyDOS, you are our friend

The next night, I got to thinking, maybe there is some way to use JiffyDOS's file copying abilities from inside a program. So I scoured the JiffyDOS manual looking for mention of how to use its copier from within a BASIC program. Unfortunately I didn't find anything. (UPDATE: July 4, 2017: Apparently I didn't look hard enough, because it's totally in the manual.)

I thought about it for a bit though, and remembered that when you load a directory on the C64, it loads it as though it were a BASIC program. What happens when you RUN that directory listing? Usually you just get a syntax error. But when you use the JiffyDOS two device file copier, you load the directory, then you use Control-A and Control-W which places asterisks into the BASIC lines just before the quoted file name. And then you RUN the directory listing to start the copy. That got me to thinking, so I wrote a little program to test it out.

JiffyDOS File Copier in BASIC

It turns out this works brilliantly. Line 10 uses @# to set the default source device, and @X to set the default destination device. A little bit of testing shows these can also be variables, so you can go: DV=9:@#DV or ZZ=11:@XZZ and those work as you'd expect. Very powerful. As you can see in lines 20, 30 and 40, the asterisk is actually a basic extension provided by JiffyDOS. You can even use that one in direct mode. Amazing, all of the years I've been a Commodore user and all the years I've had JiffyDOS, let's call it 20 years, and I never knew you could initiate a one off file transfer using the asterisk. It's super easy, you just issue these in direct mode:

	@#8
	@X9
	*"myfile"

... and boom, that will copy a file called "myfile" from device 8 to device 9. Default partitions and directories are used both for source and destination so you just change those ahead of time to determine where to copy the file to.

The next thing to point out is the file type. PRG, SEQ, USR. As can be deduced by seeing how it works when you load a directory, mark some files for copy, and run to start the copy, the source file type is on each line, following the quoted filename. I experimented a bit with this, and found that you can either use the 3-letter file type, or just the first letter. This is the same as when you're opening a file for write using the BASIC open command, like this:

	open2,8,2,"afile,w,seq":rem with the full "seq" type
	open2,8,2,"afile,w,s":rem works exactly the same with just "s"

But, what does this file type represent? It has nothing to do with the file type of the file being read. It just sets the file type on the copied file. So you could be copying a PRG file, but if you do it as: *"myprgfile" seq then the destination file will have type SEQ. I have no idea what would happen if you tried to copy a REL file and changed its file type. Also, I'll have to play around to see what happens to a PRG's two–byte load address. My guess is that it just ends up being the first two bytes of the SEQ file.

Lastly, being able to use a variable for the filename to copy makes this the perfect copier for use in a BASIC program. (Or should I say BASIC script?)

Making a File Copier

In order to actually put these abilities of JiffyDOS to use, however, we need a way to load in a directory so that we have a set of filenames to loop over and copy.

About 4 or 5 months ago I stumbled upon this article over at pagetable.com. It's a pretty technically accurate description of what's going on with a directory load from disk, but I think his coding style prioritizes smallness of code to a fault. As someone in the comments points out, it's not even compatible with all versions of the C64 that were commercially shipped. Also, I disagree with his general sentiment that the directory load as a BASIC program is a hack. The article also says the drive makes assumptions about what the computer will understand, for example, to invert the text. It's not so much an assumption, (which sounds negative,) as that the two were written for each other. The directory listing contains a PETSCII value that is defined as REVERSE text ON, and the KERNAL's screen editor knows how to correctly interpret PETSCII. </rant>

It did help me however to understand how BASIC data is structured, because the directory is returned to the computer as a BASIC program. And thus LISTing it presents it just as a BASIC program gets listed.

Each line starts with a 2-byte pointer to the memory address where the next line starts. In modern terms the lines of a BASIC program are a singly-linked list. From the address of any line, the computer can get to the next line, but cannot get back to the previous line without starting from the beginning and searching for it by moving forward through the links. The next two bytes of a line are its 16-bit line number. At first, I guessed that when you enter a line while editing a BASIC program, that the interpreter must interpret your line number, hunt for the line number that precedes your line number, and then do some pointer manipulation to insert your new line into the linked-list. I wanted to be sure though, so I tested it out and was surprised by what I found.

I assumed the linked-list nature of the line numbers helped with inserting new lines because rather than moving large chunks of memory around it could just manipulate the pointers. The tests reveal, however, that it actually does move whole chunks of memory to make room for your new line. Although, having a linked list is still useful at execution time to lookup where a line number is. The interpreter would start at the first line and follow the links from line to line looking for the line number to GOTO or GOSUB to. Here's what my tests revealed.

First I entered line 20 (0x14), then I entered line 10 (0x0A). To my surprise when I looked at the memory layout, the complete content of line 10 precedes line 20's content.

14 08 0A 00 99 22 4A 55 53 54 20 41 20 54 45 53 54 22 00 
27 08 14 00 99 22 48 45 4C 4C 4F 20 57 4F 52 4C 44 22 00 
00 00

I then decided to add in a line 5, and to my continued surprise the complete contents of line 5 were put into memory before line 10. And lines 10 and 20 where shifted up in memory, and the next line pointers were adjusted to match their new locations.

19 08 05 00 99 22 48 4F 57 20 44 4F 45 53 20 49 54 20 57 4F 52 4B 22 00 
2C 08 0A 00 99 22 4A 55 53 54 20 41 20 54 45 53 54 22 00 
3F 08 14 00 99 22 48 45 4C 4C 4F 20 57 4F 52 4C 44 22 00 
00 00

A final test continues to confirm the behaviour. I modified line 5 by making it longer. All the lines that follow it are shifted up, and the next line pointers are adjusted. This feels a bit brutal to me. If you had a really big program and you edit line 1, it would have to shift the entire program and update the pointers for every line. The upside to doing this is that it doesn't need a memory manager to find free space, and memory never gets fragmented. See my discussions on this in my posts about C64 OS's memory manager.

28 08 05 00 99 22 54 48 49 53 20 49 53 20 41 20 56 45 52 59 20 4C 4F 4E 47 20 4C 49 4E 45 20 49 4E 44 45 45 44 22 00 
3B 08 0A 00 99 22 4A 55 53 54 20 41 20 54 45 53 54 22 00 
4E 08 14 00 99 22 48 45 4C 4C 4F 20 57 4F 52 4C 44 22 00 
00 00

The thing to note about how BASIC lines are structured is that the line number and where the line falls in the linked list are only loosely related. If you are actually editing the program and you enter a new line with a new line number, the BASIC interpreter will on–the–fly shift the higher lines up and out of the way, change all the link pointers, and insert your new line into the correct place according to its line number. However, if you save a basic program to disk and then edit the bytes on disk such that the first line in the linked list has a number bigger than the next line in the linked list, when it comes time to run the program or list the program, it doesn't follow the line numbers, it follows the linked list order. Take a look at these screenshots.

BASIC lines listing out of order

I entered the program as shown in the top screen. Line 10 first, as "first line" then line 20 as "second line." Listing it shows 10 comes before 20 as expected. Then I saved it, edited it in HexFiend for Mac, changing just the line numbers but not the linked list order and reloaded it. In the second screen you see the very odd behavior in that when listed, "line 20" lists before "line 10". However, you can see that "first line" and "second line" strings are still in the original order. And when RUN, it prints "first line" "second line". So when executing it's following the link pointers, not the line numbers. That's pretty neat.

This is relevant because this is how the disk's directory is able to use BASIC line numbers to represent the sizes of the files. But the order of the files is in the order that they are found on disk. In other words the disk's directory exploits the separation of line numbers from their line link pointers. Interestingly enough, that also allows two files that have the same size to produce two BASIC lines that have the same line number!

Knowing all of this, here's the program I wrote that loads in a set of file names by parsing a directory listing that is returned from the disk.

Reading a directory in BASIC

Let's walk through it. We want to know both the filename and its associated file type, so I start by making two dimensions, LN$ and LT$, for name and type. Both contain the same 100 element capacity. LI is our dimension index. Line 10 opens a file for load from device 8. Channel 1 is specifically reserved for "load" (as opposed to a mere "read") which is required by the drive's DOS when the filename is "$" to request the directory. Otherwise, if it were a read, the DOS looks for a file that is actually named "$" and doesn't find one. I'm also using a pattern with the directory load, which is optional. On that same line I do a GET#1,A$,B$. This pulls the first two bytes and they are then discarded, these are the two byte load address that are returned first with every load.

Line 20 initializes our filename string a$ to blank, and goes to the subroutine at 1000. The subroutine just pulls 4 bytes and discards them. These are the 2 byte line pointer and the 2 byte file size (or BASIC program line number) which we don't care about in this program. Immediately after trying to grab the 4 bytes the start a line number I check the status to see if we're at the end of the file. This will be true when we reach the end of the listing and is what ends the program.

You'll see a pattern on successive lines of getting bytes, comparing them to something and either proceeding or looping back to get another byte. 30 and 35 pull a byte and if it's not a quote (PETSCII 34, 0x22) loops back to grab another byte until it finds the first quote. 50 and 55 are a loop that pulls a byte, if it's a quote then it's the closing quote and the filename is over, otherwise it's the next character in the filename and is appended to a$. At 60 therefore a$ holds a complete filename. It is pushed onto the dimension LN$ at LI and printed so we see it. Line 70 is a loop that pulls a byte and if it's a space (32, 0x20) loops to pull another byte until it's not a space. This byte is being stored in b$. At 80 b$ therefore holds the first letter of the filetype, (P of PRG, S of SEQ, or U of USR.) This filetype character is put into the dimension LT$ also at LI, and then LI is incremented. Line 90 is a final loop that searches for the end of the line, which is null terminated (ends with a 0 byte). It then returns to 20 which reinitializes the filename string a$, gets the header and checks if we've hit the end of the directory, etc.

When the end is reached the file is closed, and ln$ and lt$ have the same number of elements. ln$ is full of filenames and lt$ has the corresponding file types for those filenames. That's pretty good! It's easy to see from here how you would just loop over the filenames and use the JiffyDOS file copy commands to copy each of those files. That's really good, we're close to a solution.

Recursive File Copier in BASIC

I've broken the final program down into 4 main chunks. The main program, the file copier, the directory loader, and some directory utility routines. Let's look at them one at a time.

Recursive File Copier in BASIC, main program

DI and DO are device input and device output, 8 and 10 respectively. The subroutine at 5000 is to setup the initial backup folder, we'll come back to this. Next, the program configures what is effectively a "stack". FA$ is a dimension for filenames (FN$ somehow is reserved and using it produces an error when run), FT$ is a dimension for file types. And FC is a dimension used for making the recursion possible. FA$ and FT$ are 999 elements deep, but these could be made larger if necessary. They represent the total number of filenames that can be on the stack at any one time. This is, all the files in the root directory, plus all the files in the current subdirectory, plus all the files in that subdirectory's current subdirectory and so on recursively through everything being copied. FC is a dimension of numbers, where each element tells us how much of the stack is used for this recursive level. It's defined as 15, which means it cannot recurse more than 15 folders deep. This could also be lengthened if necessary.

FI and FJ are Stack Pointers. FI indexes into both FA$ and FT$, while FJ indexes into FC. Line 20 kicks the process off by GOSUB to the file copy routine at line 30. When this finally returns the entire backup is complete, so it prints the "Backup Complete" message and ends.

Recursive File Copier in BASIC, copy files routine

Not only does the file copying happen in this routine, but the recursive magic all happens here too. FL is initalized to zero. It's the local count of how many files (and folders) are found in the source's present directory. GOSUB 4000 takes us to a directory reading routine much as we have already seen, that loads the files from the present directory, and sets FL to how many it finds. As it finds files, it pushes the filenames (and their file types) on to the global FA$ (and FT$) dimensions, which updates the FI index as it goes.

Line 40 pushes the local file count onto the FC dimension. And if that count is zero, there are no files in this folder so it skips to the end of this routine which will pop that file count off and print a message about this folding being complete and return.

Line 100 sets the source and target devices, this probably doesn't have to be done once for every folder, but it doesn't hurt. 110 sets local variables $F and $T as the filename and file type that are the last element of those stacks. Lines 120 and 130 check to see if the file type is either P or S, and if they are does the JiffyDOS copy command with the current filename and either PRG or SEQ for destination file type. I tried to get those to be resolvable as variables, but for some reason that doesn't work. They have to be typed out. I'm not supporting USR (nor REL, DEL, etc.) because I only ever use PRG and SEQ file types. After copying a file it skips to line 200. This decrements the local file count, and pops the filename and file type off their stacks. If the local file count is still greater than zero it loops back up to line 110 to copy the next local file.

If the file type is neither P nor S, it is assumed to be D for directory. Lines 160 through 180 handle recursing into the subdirectory. GOSUB 5200 is a subroutine that creates the directory based on the current filename on the target device. Then switches into that directory on both source and target devices. The current local file count is backed up to the FC stack. And GOSUB 30 recurses by calling the file copy routine at 30 from within itself. Mind bending recursion occurs, but when it returns we GOSUB 5300 which pulls both source and target devices back up one directory. Then the local file count is restored and this routine continues copying files at this level.

The key to how this recursion works is that FC dimension. Let's say in the root folder it finds 10 files and 3 directories. The last element of FC is thus set to 13, indicating that the last 13 entries in FA$ and FT$ belong to this level. After copying one of those files, that file is popped from FA$/FT$ so now only the last 12 belong to this level. Thus the value at the end of FC is decremented to reflect this. When this level encounters a subdirectory, it creates the directory and moves source and target into it. Then it reads the file count in that folder and pushes a new file count to the end of FC. This tells that routine how many files it needs to process at this level before its over and the remaining files on FA$/FT$ belong to the previous level. It works brilliantly! Gads I love recursion.

Recursive File Copier in BASIC, read directory routine

This read directory routine is nearly identical to the standalone routine we went over earlier. With a couple of minor exceptions. When a directory is read in, a header flag, H, is set. This causes the first row, the directory header, to not be pushed on the stack as a file to be copied. And the P flag is used as a "phase." Reading a directory is done in two phases, the first gets all files that have any single character extension. This accords with how I've got all my files setup for C64 OS development. After phase 0, it loops back and does a second phase which repeats the whole routine except it changes the directory pattern to just load directory names.

Filenames and types are pushed on to FA$ and FT$ respectively, and FL is incremented for each entry found. And thats it. This routine only ever operates on the current directory of the source device. Setting up that current directory is entirely handled by the previous file copy routine.

Recursive File Copier in BASIC, directory utility routines

This last section is for the three utility routines related to handling directories. 5000 is called once at the very start. It asks the user for the backup date to use, typically I'd put this in the format YYYY-MM-DD, but I don't have to. The current directory on the output device is switched into the backups folder, the new date folder is created and switched into. The source folder is assumed to be already the default, this is a useful assumption because it lets me recursively backup an subset of the C64 OS project, or actually any other folder on any other partition (with the one constraint that it only matches files with a single character extension.)

5200 makes a new directory in the current directory of the target device based on the current filename F$. Then it swithes into that folder on both target and source devices.

5300 tells both the source and target devices to go back up one folder.


And that's it folks! That's the whole thing. A recursive file copy utility, scripted in BASIC, and backending on the JiffyDOS 2 device file copier. One caveat is that you have to have JiffyDOS. But one time when I asked Glenn Holmer on IRC to help me test a drive detection routine I wrote, I asked if he had JiffyDOS. He said he did, then very cheekily added that he also has indoor plumbing. I take his point. If you don't have JiffyDOS yet, what are you waiting for? Go buy the roms now, available from Retro Innovations.

Example output while the recursive file copier is working

This above is what it looks like as the copier is copying and moving through directories. At the very top you see it make the App Launcher folder. Then reads its files and subfolders. Then it copies 4 files in that folder. Then that folder is complete. But then, its parent folder is also complete, and it steps back up and out of the "Services" folder. Next it makes the "Applications" folder which is sibling to "Services". But Applications is empty so after reading its files and folders it reports that that folder is complete. Then it continues copying all the other files from the root level. It actually works. Cool.

Here's how the code looks, not in screenshots, if you want to copy it and make use of it for yourself.

Older PostsClick titles to see full content

The home page shows the full content of the 10 most recent posts.
Below are the titles of 5 posts older than the most recent 10.
Click their titles to view the complete post and read and leave comments.

May 29, 2017Technical Deep Dive

Implementing Factorial in 6502

May 16, 2017Editorial

Review: FREEZE64 Fanzine

May 9, 2017Programming Theory

Pointers in Practice, Menus

May 1, 2017Programming Theory

Loading Sequential Files

April 27, 2017Programming Theory

HomeBase Applications

Archive

Full Archive of Past Posts…