NEWS, EDITORIALS, REFERENCE
Happy Valentine's Day, 6502 freaks. Here's a new genre of post, I've tagged this as: Programming Practice, because it gets right into the nitty gritty of a practical problem and the steps taken to fix it. This post includes a 30 minute video of a realtime debugging session. If a picture is worth a thousand words, well then, a video has gotta be worth something.
Late last month I posted my first video update. A video update is kind of cool because as I build out C64 OS, there will be more and more interesting things to see. And pictures are handy, but a video really gives you a flavor for how fast something is loading, or how the mouse is responding and how the screen is refreshing, etc.
In my video update, which was only a couple of minutes long, I held the camera in my left hand and worked the computer and the demo with my right. That's fine for a short clip, but to really get into the weeds you need your hands free and therefore a stand for your camera. I banged together a decent makeshift camera stand out of nothing more than a stiff metal coat–hanger, with a grippy rubber coating. My workspace is under a stairwell, so the ceiling is quite low. Hanging the coat–hanger from the ceiling works perfectly and positions it to point straight at the monitor.
Let's get into this video debugging session, and take a glance into some native coding.
I began this video knowing I had an interesting bug to find. It's interesting because it has a clear visual artifact and involves several low–level code modules that I'm working through for the first time.
Before I started my hunt for this bug, it struck me, this would be a great opportunity to catch the entire process in a video. I spend the first few minutes talking about the general layout of the source code and the convention I use for filename extensions. Then I spend maybe a minute describing the basic problem I'm having, and then we just hop right into using some tools, like DraBrowse64, Turbo Macro Pro, SuperMon64+ and JiffyDOS commands. The goal is to show how one goes about using the tools available on a C64 to hypothesize about the problem, run some tests, modify the code, reassemble and test again.
The video concludes with me being very self–satisfied, having diagnosed the problem, found the bug and fixed it, and showing that the result of the fix has removed the offending visual artifact.
WAIT, hold on. That's not the end of the story...
Shortly after concluding the video I decided I really should run it against a couple of additional test files. I very quickly noticed that I was getting unexpected results. Off video, I ended up going back into the debugging process, and after about 2 hours of testing some things out, I finally realized what the real source of the problem was. The fix shown in the video is completely wrong. It solves the visual problem by fluke, and in fact the entire on–screen display is wrong, symptomatic of the real problem, but it's only subtly wrong in a way that is hard to notice.
So, watch the video, and see how the tools work and the general process of debugging native code. And then read on below for a full description: What went wrong? How did I mis-diagnose the problem? How did my bogus fix look like it solved the problem? And what was the real problem and the real fix?
Now on to the REAL bug
The first thing that went wrong was that I mis–diagnosed what effect a certain kind of problem would produce. This led me down a rabbit hole looking for the problem in the wrong place. I ignored two common sense clues, because they didn't conform with the problem as I had imagined it. And the fake solution itself was made possible by an issue in the data. So let's dig a bit deeper into these.
When I looked at the PETSCII image it looked right, columns line up row after row. But, imagine if you started the base address of screen memory at one address offset from where it was supposed to be. Instead of passing a pointer to $0400, for example, imagine we passed $03FF. The first character would not be displayed on screen ($03FF isn't in screen memory), but then all of the characters in that first row would be offset to the left by one. In all subsequent rows, the character from the leftmost column would appear as the last character in the row above. That's because from the top left going right, memory wraps around to the first column of the next row.
The following image demonstrates the effect. (It's only 20x20, the C64 is 40x25, but the concept is the same.) If things go well, you should see the yellow circle, as on the left. But if you start one memory location short, as you can see in the green on the right, every row would pull left one column. Except, all the data in the first column would wrap up a row and to the opposite side of the screen. And at the end, you'd get one missing byte, represented here by the red square.
I was getting the red square problem this image shows, where data has been read in from off the end of the file. But what I was not seeing was any of the left hand column data wrapping to the right. The image itself looked normal.
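The wrap–around effect is easy to convince yourself of with a quick simulation. This is a hypothetical Python sketch, not C64 OS code; the grid is 20x20 to match the illustration, and the addresses are just pretend values:

```python
COLS, ROWS = 20, 20
SIZE = COLS * ROWS

mem = [0] * 2048                     # pretend RAM, initialized to 0
screen = 0x0400                      # pretend screen memory base
image = list(range(SIZE))            # byte i encodes its own position

# Blit the image starting ONE address short of screen memory.
mem[screen - 1 : screen - 1 + SIZE] = image

# Slice what is actually visible on screen back out, row by row.
rows = [mem[screen + r * COLS : screen + (r + 1) * COLS] for r in range(ROWS)]

# Image byte 20 (row 1, column 0) has wrapped up to the END of row 0,
# and the bottom-right cell holds whatever was already in RAM
# (the "red square" in the illustration).
print(rows[0][-1], rows[ROWS - 1][-1])  # → 20 0
```

Every row pulls left by one, and the leftmost column of each row lands at the end of the row above it, exactly as described.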
The image looking normal led me to think I was simply not reading in 1000 characters of data. I even did the math, but my bias towards what I thought the problem was led me to conclude, still, that I was not reading in enough data.
Here's the thing though, you only see a row wrap around if there are any characters in the leftmost column to start with. As it happens with this image, the word EARTH is supposed to run along the right edge of the screen, and there are around two columns on the left that are just empty. By pulling everything left by one column, it didn't wrap the image like I thought it would. And so, it turns out, it really was pulled left.
Thinking that I was not reading in enough data, I made a change that would load in one extra byte. This was a mistake, now I was loading in 1001 bytes, and it's the 1001st byte that got drawn into the red square. But, usually this would have failed because the PETSCII file should only contain exactly 1000 bytes. However, I generated it using my web–based PETSCII Art Renderer. The file actually contains more data than the C64's text screen has. But that's just a problem with the renderer, and it doesn't matter too much that the file may have an extra row of data or so. However, it conspired to trick me into thinking it worked because there was that extra byte available to load in.
And the REAL fix
Turns out, I really was reading in 1000 bytes. My so–called fix actually would have introduced a new bug that loads in 1 byte more than you asked for.
And, it turns out, the code that specifies the start of the buffer is correct, and it really is looping copying 1000 bytes from the buffer. If that's all correct, then what the heck is wrong?
It's the textblit routine. My goal was to make this routine copy 1000 bytes from a buffer into screen memory as fast as computationally possible. To do this, I'm using a macro, configregion, that self–mods the output of another macro, drawregion, which is instanced 4 times. Let's take a look at how these work, and why they're fast.
You call textblit with a RegPtr to a buffer. It then calls the confreg macro 3 times, before setting up a loop around the output of 4 instances of the drawreg macro. Each instance of the drawreg macro starts with a label: regiona, b, c and d. The first line of the drawreg macro outputs an lda $ffff,x. But the confreg macros are designed to self–mod these load addresses.
Because an 8-bit register can only count from 0 to 255, but we need to copy 1000 bytes, the screen is divided into 4 regions that are 250 bytes big each. The 4 drawregion macros have the CPU copy 4 bytes, one from each region, per loop. This parallel copying is faster than doing one region after the next, because it reduces the total number of loops and the required branching logic to perform each iteration. Instead of 1000 iterations, there are only 250. See the post Anatomy of a Koala Viewer to see how the routines in that Koala Viewer program do something similar.
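Here's a minimal Python model of that interleaving — just the control flow, not the self–modifying macros, and the names are mine, not the real C64 OS source:

```python
REGION, REGIONS = 250, 4            # 4 regions of 250 bytes = 1000 bytes

def textblit_model(buf):
    """One loop iteration copies 4 bytes, one per region, so 1000
    bytes take only 250 iterations instead of 1000. (Iteration
    order doesn't affect the result; only the count matters.)"""
    screen = [0] * (REGION * REGIONS)
    for x in range(REGION):                  # 250 iterations
        for r in range(REGIONS):             # the 4 unrolled drawregion copies
            screen[r * REGION + x] = buf[r * REGION + x]
    return screen
```

On the 6510 the inner "loop" is unrolled into four back-to-back lda/sta pairs, which is where the savings in branching logic comes from.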
That's a pretty cool trick. But there is a problem when branching. We want X to loop from 249 to 0, therefore, we only want to stop when X rolls over from 0 to 255 (or -1 signed). There are generally two ways to do the branching for these kinds of loops:
The first way is faster, but only works if the biggest X value is less than 128. For any value from 128 to 255 the 7th bit (b7, aka %1000 0000) is set, which is also the negative flag for signed numbers. If you want to loop from 127 or less down to and including 0, you can branch using BPL (branch if positive). 0 is considered positive simply because the negative flag is not set. You can't use this method if the max value starts off greater than 127, because the negative flag will already be set.
The second way is slower. You can do a CPX #255 at the end of every loop to see if X is exactly one less than 0 and then branch on the compare result. But, that kinda sucks because now you have a whole other instruction to perform on each loop.
The third solution is what I came up with. (Although I'm not claiming credit, this is well picked over territory.) Have the loop go from 250 to 1, and branch using BNE; that will cause the loop to end when X becomes 0. The drawreg macro then modifies the destination address to be one less than the base screen memory address. In other words, the write goes to $03FF,X where X ranges from 250 down to 1. The copy goes in reverse, and ends at $0400.1 And then this is repeated across 4 regions to cover the whole screen.
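In Python terms, one region of that loop looks something like this. It's a sketch with made-up names, and note that for it to be correct the source operand needs the same −1 bias as the destination — which is exactly the off–by–one explained below:

```python
REGION = 250

def copy_region(src, src_base, dst, dst_base):
    """X runs 250, 249, ..., 1 and the loop stops when X hits 0
    (the BNE idea). Both operands sit one below their true base,
    so the final write (X = 1) lands exactly on the base."""
    x = REGION
    while x != 0:                             # BNE loop condition
        dst[dst_base - 1 + x] = src[src_base - 1 + x]
        x -= 1

dst = [0] * 250
src = list(range(300))
copy_region(src, 10, dst, 0)
print(dst[0], dst[249])   # → 10 259
```

The copy runs backwards through the region, but since the whole region gets copied, the direction doesn't matter.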
That is cool. It's the fastest I can come up with. Ideally there would be no cycles wasted, because this routine is going to be called a lot for compositing screens together.
But there is a problem, and this was the true bug. The LDA address is set exactly as it is passed in, yet the X index is 1 more than it should be. So, the problem was not that my destination addresses were all pulled one to the left, but that the source addresses were all pushed one to the right.
One solution is to do a 16-bit decrement on the RegPtr when it's passed in. This is what I did, and it actually solves the problem.
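A 16-bit decrement on the 6510 has to be done a byte at a time, since the registers are only 8 bits wide. Here's the borrow logic sketched in Python (an illustration of the idea, not the actual C64 OS code):

```python
def decptr(lo, hi):
    """Decrement a 16-bit pointer held as two bytes: when the low
    byte is 0, it wraps to $FF and borrows from the high byte."""
    if lo == 0:
        return 0xFF, (hi - 1) & 0xFF
    return lo - 1, hi

print(decptr(0x00, 0x0B))   # → (255, 10)
```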
For the Keeners
It's possible that this whole technique is trying to be more clever than it needs to be. It seems to be the case that there is no +1 cycle penalty for crossing a page boundary with STA Absolute,X but there definitely is that penalty with LDA Absolute,X.
By setting the source address back 1, we are likely knocking it off something that is nicely page aligned (it most likely will be aligned, because the page allocator gives out boundary–aligned pages of memory). What that means is that the majority of LDAs will be crossing a page boundary. All but 36, actually. That's 1000 - 36 = 964 cycles spent on page cross penalties. Plus a few extra cycles are required to do the 16-bit decrement on the initial pointer.
If you use method 2, however, you still can't avoid all page boundary crossings. In fact, there are only about 250 fewer page boundary crossings, all of which occur in the first region. After that, the boundaries get crossed regardless. But, still, that's 250 cycles saved. But, then you need to do a CPX Absolute 250 times. Each of those costs a whopping 4 cycles, or 1000 extra cycles.
Therefore, despite the initial penalty of page boundary crossings, in total the trickier technique saves around 750 cycles, or 0.75 milliseconds. Is it worth it? Hell yeah! It only cost me one little bug, and now that it's fixed, it's solid AND bitchin' fast.
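If you want to check that arithmetic, here's a little Python tally of the page–cross penalties under each scheme, assuming a page–aligned source buffer at a hypothetical $0B00:

```python
REGION, REGIONS = 250, 4

def crossings(operand, xs):
    # LDA abs,X pays +1 cycle when (low byte of operand) + X crosses a page
    return sum(1 for x in xs if (operand & 0xFF) + x > 0xFF)

base = 0x0B00   # hypothetical page-aligned buffer

# The trick: operand = base - 1 + region offset, X runs 250 down to 1
trick = sum(crossings(base - 1 + r * REGION, range(1, REGION + 1))
            for r in range(REGIONS))

# Method 2: operand = base + region offset, X runs 249 down to 0,
# plus a 4-cycle CPX on each of the 250 iterations
method2 = sum(crossings(base + r * REGION, range(REGION))
              for r in range(REGIONS)) + 4 * REGION

print(trick, method2 - trick)   # → 964 750
```

The 36 penalty-free loads all come from the low reaches of regions b, c and d, where the operand's low byte plus a small X hasn't yet crossed into the next page.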
One last thought. C64 OS is architected to follow some modern design principles from Model–View–Controller. What is in screen memory is never the only copy of some of the application's data. That violates the separation of model and view. In C64 OS, the screen is just a view. Anything that gets drawn to the screen is drawn from extant models somewhere else in memory.
Every application must have a standard draw routine. And you never call that draw routine manually. The main event loop calls multiple draw routines, in the correct order, and only when necessary. The application must be prepared to redraw the current state of its entire interface at any time.
Besides many advantages this brings that I won't go into here, one major advantage should be scrolling speed. The KERNAL's screen editor scrolls the screen perilously slowly, when compared with how quickly the 1MHz 6510 can copy data from an external buffer into screen memory. If you have an external buffer, and you want to scroll the screen, it is much faster to just change the pointer to the source address by the length of one row, and then copy the whole contents as quickly as possible. What the Screen Editor has to do is copy row 1 into row 0, then copy row 2 into row 1, and repeat for all rows. Then repeat again for all the rows of color data. Ouch.
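To make the difference concrete, here's a Python sketch of both approaches (the names and shapes are mine, purely for illustration):

```python
COLS, ROWS = 40, 25

def scroll_with_buffer(buffer, top_row):
    """The external-buffer way: scrolling is just re-blitting the
    visible window from a source pointer bumped by one row."""
    start = top_row * COLS
    return buffer[start : start + COLS * ROWS]

def scroll_in_place(screen):
    """The KERNAL screen editor way: copy row 1 over row 0, row 2
    over row 1, and so on, then blank the last row (and then do
    it all again for the color data)."""
    for row in range(ROWS - 1):
        screen[row * COLS : (row + 1) * COLS] = \
            screen[(row + 1) * COLS : (row + 2) * COLS]
    screen[(ROWS - 1) * COLS :] = [32] * COLS   # 32 = space
    return screen
```

The first approach does one sequential 1000-byte copy from a single moving source; the second shuffles 24 overlapping row copies through screen memory itself, twice.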
- According to my own version of the 6502 Instruction Set, and also the one on 6502.org, STA Absolute,X does not suffer a +1 cycle penalty for crossing a page boundary. [↩]
I want to talk about File References and the services provided to the developer by the File module in C64 OS. But to do that, we'll have to first take a brief tour through the IEC serial bus, the KERNAL and the C64's native I/O architecture.
To Rewrite or Not To Rewrite
First things first. C64 OS does not patch out the KERNAL ROM, it makes use of it for many low–level routines. Some of those routines include file handling. Many people who write an Operating System for the C64 make it one of their first and top priorities to rewrite the low–level file access routines. But I definitely don't want to do that. Why not?
- C64 OS is single tasking
- The C64 is already short on RAM
- Writing serial routines is hard
- JiffyDOS is already very fast
- C64 OS wants to be compatible and agnostic
If I can be allowed the pleasure of describing those points in prose, the bottom line is that writing serial routines is hard, and serial routines already exist in the KERNAL ROM. The RAM beneath the KERNAL's addressing space is still available to the machine,1 so anything you can make use of from the KERNAL is memory saved. Even if you only use 20% of the KERNAL, that's 20% of 8K, or 1.6K of memory you can spend implementing something else that the KERNAL doesn't do.
Compute's 1st Book of C64: Block diagram of memory with I/O and ROM overlays.
Furthermore, the reasons people write their own file routines usually fall into a couple of categories. They want serial access to be faster, they're targeting a specific common device such as the 1541, or they're doing something more sophisticated than the KERNAL's routines can handle. These special needs might arise as a result of the OS you're writing being more sophisticated, such as by supporting multi–tasking. C64 OS is single–tasking; that was a decision I made very early.
I want C64 OS to be able to use any type of device, IDE64, SD2IEC, 1541 Ultimate, 1541/71/81, CMD FD/HD/RL, etc. JiffyDOS, which if you don't have, you should get, already has routines that have been carefully thought out to get a good balance of greatly improved speed without sacrificing compatibility. I had a discussion with Jim Brain about how JiffyDOS's speed loading routines work, relative to how the custom speed loaders in some games work. Yes, some custom speed loaders can get more speed than JiffyDOS, but some of them also require you to have only one drive on the IEC bus, or will only work with a 1541 or 100% compatible clone. Those limitations might make sense for a demo or a game, but they are the sorts of tradeoffs that I would never make for a platform like C64 OS.
Check out this speed comparison between JiffyDOS and the original KERNAL rom. It's pretty stark.
So, C64 OS uses the KERNAL's routines. And if you want speed, replace the stock KERNAL ROM with a JiffyDOS KERNAL ROM.
The IEC Serial Bus
Let's talk about the C64's serial bus for a moment, so we have some idea of what any software routine is actually working with.
The word serial in serial bus means it sends one bit at a time along a single data line. But you can have many devices hooked up to the bus all at the same time. And somehow the data gets to and from all the right places. How does that work?
My understanding of how this works has evolved approximately like this: When I was a kid I was happy to believe that it was pure magic. By the time I was in my 20s I knew enough to be thoroughly confused because I could imagine the limits of a single data line (plus a couple of signal lines), but I wasn't creative enough to imagine the solution. Now, by the time I'm part way through my 30s, and have rekindled my interest in all things Commodore, I've taken the time to read up on the hardware and understand as much of the code as necessary to finally get the gist of how it actually works.
In digital electronics, one line is one bit. So a serial bus is effectively a 1-bit bus. Now, you can read or write 8 bits in a row to transfer a byte–sized package, but where is that byte going to go? Who is it addressed to? How does it get there? If you're writing a byte, how does that byte get to the destination device but not to any other device? These were some of the questions I had, up until quite recently, that I just didn't know the answers to. I'm not suddenly a serial bus guru but at least I have a pretty good understanding now of how this generally works.
There is a protocol, that is to say, an order to the rise and fall of the data and signal lines, and an interpretation of the bytes that are transmitted at certain stages of communication, which is agreed upon by the software in the computer and the software running on each of the serial devices. The C64 is the master device on the bus, so it organizes who is allowed to talk on the bus and who should be listening.
Besides the data line, there are a few signal lines: Clock, Attention and Reset. The data and signal lines are electrically continuous all the way from the computer through each daisy chained device. What does that mean in practice? It means that if the data line goes high, every device on the bus knows that it is high. And this is a hardware quality of the bus, it isn't controlled by software, each line is just one continuous electrical wire along which the various devices are connected. The question then is, how does the computer direct the data to just one of those devices?
IEC Serial Port PINOUT, from IEC disected.
Every device on the bus must be manually assigned a unique address. This is done by the user, with DIP switches or cutting or joining traces on the device's logic board.2 These addresses range from 4 to 30. The first device number for storage devices is 8, hence the famous load "*",8. The computer uses the attention line to tell every device it's going to issue a control byte. In the control byte, some of the bits represent a command: TALK, UNTALK, LISTEN, UNLISTEN, and some others. With the TALK and LISTEN commands, the other bits are used to represent the device number. Because the data and signal lines are electrically continuous along the whole bus, every device always sees all the information on the bus. But, if the computer begins with a control byte that is a combination of the command LISTEN and the device #12, then every device that knows its own address is not 12 ignores all subsequent data, until the next control byte is sent. The device that has address 12 starts paying attention to all subsequent data bytes because it knows the data on the bus is meant for it.
- TALK tells a device that it should transmit data to the computer.
- UNTALK tells the device that is TALKing that it should cease transmitting data.
- LISTEN tells a device that it should receive data from the computer.
- UNLISTEN tells the device that is LISTENing that it should stop listening.
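In code terms, a control byte combines the command in its high bits with the device number in its low bits. These are the standard IEC encodings; the little helper functions are just for illustration:

```python
# Standard IEC control-byte encodings: command in the high bits,
# device number (0-30) in the low bits.
def listen(device):  return 0x20 | device   # LISTEN  = $20 + device
def talk(device):    return 0x40 | device   # TALK    = $40 + device

UNLISTEN = 0x3F                             # $3F, no device bits needed
UNTALK   = 0x5F                             # $5F, no device bits needed

# "Device 12, start LISTENing" goes out on the bus as $2C:
print(hex(listen(12)))   # → 0x2c
```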
There is a bit more to it, but that is the general idea of the protocol in a super tiny nutshell.3
What the KERNAL does
What does the KERNAL do? First I'll talk about what the KERNAL does, then I'll talk about what the KERNAL does not do, because trust me, there is a lot of stuff that the KERNAL does not do.
The KERNAL divides the labor into (roughly) two layers. And thank goodness it divides it at all, because this division is what allows storage devices that are not physically connected to the serial bus to work more or less as if they were. (Devices like IDE64, RamLink, an REU with RamDOS, and so on, are only connected via the expansion port). The first layer is a set of 8 low–level routines for working directly with the serial bus. (There are a couple of others; some routines that programmers have taken to using do not have official jump table entries. But this is the gist.) These are as follows:
- TALK — Command a serial bus device to TALK
- TKSA — Send a secondary address to a serial bus device that has been commanded to TALK
- ACPTR — Retrieve a byte from a device that is TALKing on the serial bus
- UNTLK — Send the UNTALK command to the serial bus
- LISTEN — Command a serial bus device to LISTEN
- SECOND — Send a secondary address to a serial bus device that has been commanded to LISTEN
- CIOUT — Send a byte to a device that is LISTENing on the serial bus
- UNLSN — Send the UNLISTEN command to the serial bus
As you can see, these routines are very low–level. They mirror the essential hardware protocol of the IEC serial bus itself. But they do a lot of heavy lifting for you, by communicating with the CIA #2 chip to manipulate the DATA, CLK and ATN lines to send and receive one bit at a time to and from the serial bus.
Stripped down segment of the C64 Schematics. Shows Port A of CIA 2 driving the Serial Port.
The second layer is a channel–based layer of abstraction above the hardware level routines. What does that mean? The designers at Commodore, possibly getting their ideas from some other reference design, had in mind a certain native way that the machine should work. It's actually quite clever, I hope you'll agree, but at the same time it is very foreign to the way modern computers work. Considering the hardware limitations, though, the more I think about how it works the more I like it in concept. The only problem is that, today, no one wants to use their computer like this. So let's get into it.
The idea is that the user should be able to use generic routines to read data from any input device, and be able to stream that data to any output device. It sounds pretty great. This works by opening channels to device numbers. Then the KERNAL's routines, which can be accessed by BASIC4, are used to pipe data from any input channel through to any output channel.
The quintessential input/output device is the disk drive, or other file storage device. And that is what we're talking about in this post. But, to understand the architecture we really need to take a step back and look at the bigger picture. Every device is treated like a generic I/O device, not just hard drives and disk drives. Here is what the KERNAL supports, and their corresponding device numbers:
- 0 — Keyboard (Input only)
- 1 — Tape (Input & Output)
- 2 — RS232/Modem (Input & Output)
- 3 — Screen (Input & Output)
- 4 — Printer 1 (Output only)
- 5 — Printer 2 (Output only)
- 6 — Plotter 1 (Output only)
- 7 — Plotter 2 (Output only)
- 8 — Disk 1 (Input & Output)
- 9 to 30 — Up to 22 more serial storage devices
Some devices are input only, and some devices are output only. Obviously, you can't output data to the keyboard, just as clearly as you can't read data in from a printer. But you can input from the keyboard and you can output to a printer.
In another post I went deep on how the keyboard works. As you type, the interrupt service routine is scanning the keyboard 60 times a second, and converting key presses into PETSCII characters, which it puts into a 10 byte buffer in workspace memory starting at $0277. So while one routine is dealing with the low–level nature of the keyboard, and is populating a generic buffer in memory, another set of routines can completely ignore the technicals of how that buffer gets populated, and just read data out of it.
A printer, on the other hand, is on the serial bus. It requires a completely different set of low–level routines for sending data to it, probably involving those mentioned above. But at a higher level the KERNAL supports directing any input source to any output destination. And so with only a few commands you can make your C64 behave just like a typewriter. You press a key on the keyboard, and instead of the character going to the screen, it goes to a printer and out onto paper. Sure, it's always possible to get a computer to do this, but on a Commodore 8-Bit one can do this with remarkably little code.
Let's look at another example. The Screen itself is abstracted into an I/O device via the routines that implement the screen editor. While the VIC-II can be configured to see a certain range of memory for the 1000 characters of its text mode, routines in the KERNAL abstract this too. The screen editor code treats the range of memory that gets rendered by the VIC chip to the physical screen as one long rolling buffer. The cursor is an abstract notion. It is maintained by two variables in workspace memory that represent the current column and row. When you output a byte to the screen device, the low–level screen editor routines kick into action. First its PETSCII value is analyzed to see if it's a control code. A control code may change the state of the screen's modes or move the cursor, but otherwise, if it is a printable character it gets converted to a screen code according to the current screen modes and put into memory wherever the cursor offsets indicate it should go. Then the cursor is advanced by incrementing and rolling over the column and/or row variables. When the cursor rolls off the end of a row, the column index is reset to 0, and the row index is incremented. But when the cursor moves off the end of the last row, the KERNAL shifts every row in screen memory up, the top row scrolling off the physical screen and out of screen memory into nowhere, and the cursor gets repositioned appropriately. These low–level routines allow the screen to behave like a generic I/O device.
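Here's a toy model of that rolling–buffer idea in Python. It covers the cursor handling only; the real screen editor logic also deals with control codes, screen modes, quote mode and more:

```python
COLS, ROWS = 40, 25

class ScreenModel:
    """Sketch: the screen as an output device. A flat buffer of
    screen codes, with the cursor held as a (row, col) pair."""
    def __init__(self):
        self.mem = [32] * (COLS * ROWS)   # screen code 32 = space
        self.row, self.col = 0, 0

    def chrout(self, code):
        self.mem[self.row * COLS + self.col] = code
        self.col += 1
        if self.col == COLS:              # roll over to the next row
            self.col, self.row = 0, self.row + 1
        if self.row == ROWS:              # off the last row: scroll up
            self.mem = self.mem[COLS:] + [32] * COLS
            self.row = ROWS - 1
```

Write one byte past the bottom–right corner and the entire buffer shifts up a row, with the top row falling off into nowhere, just like the KERNAL does.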
In fact, this is how the C64 is configured when it is first turned on. The keyboard is the default input device, and the screen is the default output device. Every time you press a key, the generic piping routines read a byte from the keyboard device and write a byte to the screen device. That's pretty slick! And any combination of two devices can be wired together this way. This is why it is so easy to open a SEQ (aka text) file from disk, and output its contents straight to screen. It can just as easily be used to pipe a text file from disk straight to printer, or from keyboard straight to printer, as we saw before, or from keyboard straight to disk. It's really very clever.
However, it isn't just a direct one–to–one, device to device. There is another layer of abstraction that comes into play: logical files. The KERNAL maintains a table in workspace memory that holds the relationship between Logical File Numbers, Device Numbers, and Secondary Addresses. The table is 30 bytes long: three ranges of 10 bytes each, to support 10 open channels, each of which is more or less just a data stream to an I/O device.
- $0259 to $0262 — Ten Logical File Numbers
- $0263 to $026C — Ten Device Numbers
- $026D to $0276 — Ten Secondary Addresses
You assign a logical file number to a channel being opened on a device. When you pipe data, it doesn't go from device to device, but from logical file number to logical file number. Some examples of how this works, and how the secondary address is used with different devices can be seen here.
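A toy model of those three parallel tables might look like this (the names are hypothetical; the real tables are just raw bytes in workspace memory):

```python
MAX_OPEN = 10

class LogicalFileTable:
    """Sketch of the KERNAL's three parallel 10-byte tables."""
    def __init__(self):
        self.lfns, self.devs, self.sas = [], [], []

    def open(self, lfn, dev, sa):
        if len(self.lfns) == MAX_OPEN:    # all 10 table slots in use
            raise RuntimeError("TOO MANY FILES")
        self.lfns.append(lfn)
        self.devs.append(dev)
        self.sas.append(sa)

    def lookup(self, lfn):
        # Given a logical file number, recover device and secondary address
        i = self.lfns.index(lfn)
        return self.devs[i], self.sas[i]

t = LogicalFileTable()
t.open(2, 8, 2)       # the equivalent of BASIC's OPEN 2,8,2
print(t.lookup(2))    # → (8, 2)
```

All subsequent reads and writes name only the logical file number; the table is what lets the KERNAL route them to the right device and channel.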
Right, so, that's what the KERNAL does, again in a tiny nutshell. I really do think it's pretty cool. It lets you do so many neat things in such a streamlined and simple way. It reminds me a lot of the power of Unix piping and treating everything like a file. It's neat, right? You can stream data from a modem straight to a file on disk. That's cool.
There are several problems though, or should I say, limitations. The biggest problem is that, well, quite frankly, no one really wants to use their computer like this anymore. It is super neat that with little effort you can turn your sophisticated 1MHz 6510, 64K of RAM, computer into an electronic typewriter. But who the hell actually wants the functionality of a typewriter anymore? It's also pretty nifty that you can dump the contents of a text file to a rolling screen buffer, but, if you do that it will roll the whole screen, whether you want it to do that or not. And if you want to scroll backwards through the file you are completely screwed. Because streams can't really be interacted with, they just stream out of one thing and linearly into the other. You can slow down the stream or pause it, but that's it. (By the way, that's what the scroll–lock key on PC keyboards was traditionally for.)
So what does the KERNAL actually know about files? Believe it or not, it knows next to nothing. The KERNAL knows that devices in the range from 8 to 30 are storage I/O devices on the serial bus. The secondary address is a concept that the KERNAL knows about. It's a number that lets the computer signal some very basic intentions to the storage device. 0 for binary load, 1 for binary save, 2 through 14 for byte–by–byte sequential read or write, and 15 for sending commands. If more than one file is opened with secondary addresses from 2 through 14, that number is used for the computer and the device to know which of the concurrently open streams is being dealt with.
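That convention can be summed up in a tiny function (a sketch of the convention itself, not KERNAL code):

```python
def channel_kind(sa):
    """What a storage device's secondary address conventionally means."""
    if sa == 0:
        return "binary load"
    if sa == 1:
        return "binary save"
    if 2 <= sa <= 14:
        return "sequential read/write data channel"
    if sa == 15:
        return "command channel"
    raise ValueError("secondary address must be 0-15")

print(channel_kind(15))   # → command channel
```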
To open a file then, the KERNAL really only has two routines: SETLFS and SETNAM. The first sets the Logical File Number, Device Number, and Secondary Address. These three values will be put into the table described earlier, such that, given the logical file number of a previously opened channel, the KERNAL is able to recover the corresponding device number and secondary address. SETNAM takes a pointer to a string in memory and its length. The inner structure of that string, though, is something that the KERNAL knows nothing about. It is merely a command of some sort that will be sent blindly to the device.
When you call OPEN, it uses the low–level serial routines, such as TALK, TKSA and so on, to pass the command string to the device. Only the remote device itself has to understand what that command means. It can include a partition/mechanism number, a path (absolute or relative), a filename, a file type, read or write flags, an append flag, an overwrite command, and more. Then, depending on how the command string has prepared the device, the connection is open and the KERNAL can read (or write) data from the device, without having any clue what it is actually reading. Is it reading the contents of a sequential file? Is it reading part of a record from a relative file? Is it reading a directory listing? Or is it reading data from the Realtime Clock, or is it the DOS version of the device? It doesn't know and it doesn't care.
The KERNAL knows almost nothing about files, with only a couple of minor exceptions. Such as having the ability to load a program file straight into memory to be run.
As I've discussed in various posts in the past, it's useful that the KERNAL is so generic about files. The KERNAL is truly not a "Disk Operating System." Without a disk drive, a C64 has no disk operating system. It does not know how to manage or format disks, how to organize files, or how to do any file management. All of that work is offloaded to the DOS that is running on the storage devices themselves.
One KERNAL ROM, many eras of storage device
This allows the C64's own KERNAL to be quite small, and yet still work with newer storage devices, such as SD2IEC. Those devices abstract away from the KERNAL exactly how it is that the files are stored, organized, renamed, copied, moved, fetched and so on, from the physical media. A C64 itself, today, does not know or see any difference between a 1541 floppy drive with 166K of storage, a 4 Gigabyte SCSI–based CMD HD, and a 128 Gigabyte SD Card with a Microsoft FAT32 file system. The KERNAL accesses all of them exactly the same way. Meanwhile it's just 8K of code and the only significant change it's undergone in the last 35 years is when we updated our systems to use JiffyDOS.
What the KERNAL does not do
Above I gave a fairly detailed description of what the KERNAL does, and how in many ways it is quite advantageous for the C64. 35 years later, the KERNAL still works well, for communicating with even modern storage devices.
But, there are in fact some pretty big changes between a 166 kilobyte floppy, which does not support sub–directories, and a 128 gigabyte SD card that supports partitions, sub–directories, and also track–&–sector–emulating disk images.
The KERNAL supports 10 open files, simultaneously. After that it starts throwing errors, because its correlation table of Logical File Number to Device Number and Secondary Address is just 30 bytes total, 10 entries of 3 bytes each. But, this does not mean you can open 10 files on your 1541 at the same time. The 1541's internal RAM and DOS have their limits too. Depending on how you're opening the files, and what type of files they are, the number of files that can be open at the same time on a single device varies from around 4 or 5 down to just 1. The KERNAL's support for 10 can still be reached though, by opening a few files each across multiple devices.
If you have two files open, on two different devices, it makes sense how they are distinguished and accessed separately. The Logical File Numbers correlate to two different device numbers, and each device is aware of just its own one open file. What about when you open two files on the same device? That's what the Secondary Address is for. Both Logical File Numbers are correlated to the same device number, but to two different Secondary Addresses. When the KERNAL asks the device to TALK it must also give it the Secondary Address. The device itself knows about its own two open files, and each is connected to a unique Secondary Address.
When you open a file on, say, a CMD HD, the CMD HD uses its own internal RAM to hold all the information it needs to work with that file. As long as that file is open, it is associated with a Secondary Address, and the in–device representation knows exactly where on disk the file comes from, and maintains a cursor into that file for reading or writing.
A device like the CMD HD supports multiple partitions, and in Native partitions it supports nested sub–directories. It also, conveniently, remembers the current partition, and within each, that partition's current directory. This is very convenient. When you ask the device for a directory, load "$",8, you don't have to keep supplying the partition number and full path, it just gives you the directory of the current default location. All file related commands are relative to the default location, unless an alternative partition and/or path are provided. Each directory looks just like the directory of a 1541, as far as the C64 is concerned. To change directory, or to change partition, commands can be issued to the device via channel 15. These merely set new partition and path defaults, such that when you request a directory, the directory from the new current defaults are returned. That makes sense.
So, now let's imagine a typical scenario. You change the default partition to 2. (@cp2). Then you switch the default sub–directory, (@cd//os/services/app launcher). Then you open a file by name and assign it a secondary address 2. That file is looked up, by its name, in the current partition and current sub–directory. You read some bytes from that file but don't close it. Next, you change the device's default partition to 3, (@cp3), and go into some sub–directories there. (@cd//programming/documentation). Then you open a second file, by name and assign it a secondary address 3. This file is looked up by its name, relative to the new current partition and path. Each of these open files has a different Secondary Address (2 and 3) that the device uses to distinguish between them. And the computer can toggle between them by issuing low–level commands to the device to switch between the Secondary Addresses.
When the computer issues commands to the CMD HD to change Secondary Address, to read more data from one of the open files, the current partition and the current path have no bearing on continuing to read from that file, which itself was opened from a partition and path which are no longer the current defaults. The CMD HD has maintained enough information to know exactly where on disk the next byte of that file will come from.[5] The current partition, that partition's current directory, and even the filename, are totally irrelevant at that point. The device knows what partition the open file is in, and the directory and filename don't matter, because the in memory reference maintains a cursor for the physical track, sector, and offset into the sector of where the next data for this file comes from. While a file is open, its name can be changed, as this merely changes the directory entry that points to the first block of the chain of blocks that make up the file. With a bit more work its directory entry could even be moved to a different directory. This makes no difference to where the file's physical, on–disk, data comes from.
But. And this is a big but. The KERNAL immediately forgets almost everything about an open file. The KERNAL doesn't maintain the filename, nor does it know the partition number, nor the path on the device. It doesn't even know what kind of device it's talking to. It can only look up the device number and the secondary address and that's it.
That's fine as long as you keep that file open. But remember, the KERNAL can only keep 10 files open, and each device can keep far fewer than that open at once. Why would this matter? Well, let's take a look at my Mac. Right now, the Finder alone has 145 files open. Safari has 197 files open. And BBEdit has 98 files open (even though I'm only editing this single document). Modern computers, with their many running tasks, have hundreds if not thousands of files open at the same time. Is that insane? Yes! It's a bit insane. Will C64 OS have that many files open? No, it will not. And we don't want to lose the simplicity of a system that can be useful without working with that many files at the same time. But, the principle should be evident.
In order to accomplish great things, an application and the system as a whole should be able to work with files almost seamlessly, and without direct user interaction. Writing a short file, a handful of bytes perhaps, only takes a fraction of a second. A few examples of what might access files automatically come readily to mind: Writing changed settings back to a file in the application bundle. Appending some data to a system log file. Writing a copied selection of text to a clip file in the system folder. Reading a folder–display–settings file when navigating the file system through a file manager. Loading file metadata on demand as the user moves a selection focus. Loading in some graphical UI elements that are only needed under certain circumstances. Loading a cache of some online resource, like the directory of an FTP server. All of these things require accessing bits and pieces of files, here and there, above and beyond the user explicitly navigating to a folder where he or she has a document and opening it. Not even to mention what happens when an application (still single tasking) allows the user to open two or more documents to be worked on at the same time.
The way it works right now, what happens inside the C64 is conveniently generic and what happens inside the drive is conveniently opaque. But this arrangement has its downsides too.
Many one–file games, to take a common example, write a highscore file back to disk. But, they just assume that the disk in the drive, which is equivalent to the partition and sub–directory of a Hard Drive, is the same one whence they were loaded. And so, imagine you're in partition #1, subdirectory: //games/arcade/, of device #9.
When you load and run the game, if you're lucky, it will write the highscore file back to the correct device, in which case it will end up in the same folder it itself was run from. Although often enough it'll just blindly write the file back to device 8.
But now let's imagine another scenario. You're in partition #2, sub–directory: //programming/tools/, and you load the game like this: ^1//games/arcade/:paradroid
It will load and run alright. But when it saves its highscore file, guess what? It's going to save it back to partition #2, //programming/tools/paradroid-hi.mem or whatever. What's worse, is that upon first load in this scenario, it will search partition #2 //programming/tools/ looking for a highscore file to load in and it won't find one.
So what? Big deal! Yeah, you're right. It's not a big deal when it's a game and an unimportant highscore file. But this does demonstrate the general shortcoming of the opaque nature of the KERNAL's file accesses. The C64 simply has no means of knowing internally where anything is, where anything—including the program that's running—came from, where anything should be, or where to put or get any additional resources beyond that first load. And the result is very limiting. This is a shame considering that we're not using 166K floppies anymore, we're using 128 GIGABYTE SD Cards.
C64 OS File References
I am under no illusions that anyone will ever write a C64 game that requires C64 OS to run. It would be cool if someone found its services so useful that to rewrite them would be a waste of time. But, chances are a game won't need most of its services, and you'd be crazy to waste 4 to 8K of memory on the libraries of an OS that you don't need. The beauty of the C64 is that you can patch out all the roms and use every last byte and every last CPU cycle in precisely the way you want, in the service of some visual and auditory spectacle that will blow people away. Please! Have at it!
But I've seen how people write tools for the C64. They are ad hoc at best. They rarely benefit from the behavior of other C64 tools and they suffer from all the same problems of there simply being so few modern resources to work with. Rightfully, no one wants to rewrite complex string, file and UI libraries for every little tool. It is for this that I'm writing C64 OS. So that I can write my own tools and applications and feel like I'm writing them with ease and that each becomes more useful the more of them that are written.
So let's talk File References and File Routines. Dynamically allocated File References, also known as File Refs or FREFs for short, are exactly one page of page–aligned memory. This is what you get when you call pgalloc with a request for 1 page. A File Reference does not necessarily have to be a full page, if its memory is managed manually. The contents of a File Ref is structured as follows:
Field                Label      Size
Device Number        frefdev    1 Byte
Partition Number     frefpart   1 Byte
Logical File Number  freflfn    1 Byte
File Name            frefname   17 Bytes
Absolute Path        frefpath   0 to 236 Bytes
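Laying the structure out as assembler constants makes the offsets clear. This is a sketch derived from the field sizes above; C64 OS's own headers may define these differently.

```asm
        ;Offsets of each field within a page-aligned File Reference.
frefdev  = 0             ;1 byte: device number
frefpart = 1             ;1 byte: partition number
freflfn  = 2             ;1 byte: logical file number (0 = closed)
frefname = 3             ;17 bytes: null-terminated filename
frefpath = 20            ;up to 236 bytes: null-terminated path
                         ;3 + 17 + 236 = 256, exactly one page

        ;E.g., fetch the device number via a hypothetical
        ;zero-page pointer (fref) holding the FREF's address.
        ldy #frefdev
        lda (fref),y     ;.A = device number
```

The sizes add up to exactly 256 bytes, which is why a dynamically allocated File Reference is exactly one page.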
The C64 OS booter loads and runs a drive detection and registration routine. This builds a table of device numbers to drive type IDs. The table is constructed automatically during bootup, but a C64 OS standard tool will allow you to re–run the routine during runtime to update the table. You'd typically do this after you turn a connected drive on or off. C64 OS's file routines use the device number in a File Reference to look up the corresponding drive type, and use this to know whether the device supports partitions and/or sub–directories.
The logical file number (LFN) doubles as a flag for whether the file is open. If the LFN is 0, the file is closed. When you open a file, by File Reference, it is automatically assigned an available LFN, so you never have to keep track of these manually. And if you try to open a file whose LFN is not zero, you'll get a File Already Open standard KERNAL error.
The Filename is 17 bytes long, because it is a C-style string. This affords 16 characters of filename plus a 1 byte null terminator. You never need to manually supply the length of the filename to the C64 OS file routines. Even though C64 OS uses the SETLFS and SETNAM KERNAL routines under the hood, it measures the length of the null–terminated filename for you.
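A sketch of what that measuring step might look like under the hood, assuming the filename field sits at a label frefname:

```asm
        ;Walk the null-terminated filename to find its length,
        ;then hand it to the KERNAL's SETNAM routine.
        ldy #0
loop    lda frefname,y   ;Next character of the filename
        beq done         ;Zero byte: end of string
        iny
        bne loop         ;Max 16 chars, so this always branches
done    tya              ;.A = measured length
        ldx #<frefname   ;Pointer to the name, low byte
        ldy #>frefname   ;Pointer to the name, high byte
        jsr $ffbd        ;KERNAL SETNAM
```

The null terminator never reaches the drive; only the measured characters are passed along.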
The Path is structured as what has become the de facto standard, as used by CMD devices, and as also works on SD2IEC devices: //absolute/path/name/. The pathname can be up to a maximum of 235 characters, and it too is a C-style string that ends with a null character. Any path that is set will be completely ignored if the drive type is found to be the kind that does not support sub–directories. If you try to use a path on a device that supports sub–directories, but in a partition that does not, it will simply produce an error.
You might be thinking, oh what a pain! I have to create one of these things just to open a file?! C64 OS itself maintains at least 2 (sometimes 3, and for a brief moment while transitioning between apps, up to 4).
The first is the System File Ref. This is a manually memory–managed file reference, so it's less than a page. It knows where the system folder is and whence this instance of C64 OS is running. This is pretty cool actually. You could very easily have two different versions of C64 OS, or two different configurations, set up on different partitions or even in side–by–side folders of the same partition. The System File Ref tells the currently running instance what device, partition and system folder it is running from.
The second is the App File Ref. This informs the currently running application what it is. It seems a simple concept, for a program to be able to answer Who am I? and Where do I come from? but most C64 programs actually do not know these things. Just imagine when you write a little BASIC program, and save it as, "hello world". The program, while it's in memory, has absolutely no idea that it is called "hello world", nor where and on which device it is saved. Actually, it doesn't even know that it has an on–disk representation, it may have just been freshly typed in from scratch.
The third is the Open File Ref. This one is optional; it is configured to reference a document file that was used to open the application. In the simplest example, a text file that is opened by a text editor. You use the C64 OS file manager to navigate about the file system. When you come across a text file you may click to open it in a text editor. The system will configure the Open File Ref to point to the text file, plus a File Ref to point to the text editor, and will then call loadapp passing a pointer to the new App File Ref. The old app has a chance to quit, using the old App File Ref to find its own bundle, for saving state, such as the last device, partition and path you'd navigated to in the File Manager. Then the AppFileRef pointer is repointed at the new App File Ref for the app being loaded. It is the job of an app that supports loading document files to check, during its initialization, whether the Open File Ref is configured. If it is, the app should proceed to open that file. When the app saves the file it can use the settings of the Open File Ref to know where to write the file back to.
Lastly, I mentioned that there may momentarily be one more File Reference allocated. This is, as just mentioned above, what happens when the current app calls loadapp for the next. For a short time both apps have a configured file reference. The outgoing one still has access to its own app file ref so it knows where to save its state or settings back to. And the new one is configured ready to be put in place for the incoming app to know where to load its additional resources.
It is always possible for your app to manually configure a file reference. But for the most common cases it actually doesn't need to. For instance, if you are in your text editor and working on a new file, when you want to save you can invoke the system save panel. This is a UI that allows the user to navigate the file system and enter a filename. What is returned is simply a pointer to the file reference to use for saving the file. Your app doesn't actually have to construct that file reference, it just passes the pointer to it through to the file routines. And that now naturally takes us to the file routines.
Working with files is more than just opening and closing connections to an I/O device. You also have to be able to read data from the file into memory, and write data from memory to the file. The KERNAL really only provides generic I/O routines to retrieve or to send one byte at a time. CHRIN reads a byte from the current input channel. (The keyboard is a bit of an exception; for it there is also the routine GETIN.) And there is also the low–level ACPTR routine which retrieves a byte from the physical serial bus. But in any of these cases, it merely puts a single byte into the CPU's accumulator.
What do you do with it then? Well, you need to write code to figure out where to put that byte, and how many times to loop, and you need conditional code for status checking. And we're on an 8-bit CPU so heaven forbid you need to read more than 256 bytes of data or you're into the fun world of 16-bit looping.
C64 OS provides some of the most essential features that make life easier: fopen, fclose, fread and fwrite. Plus one additional routine, finit. That's it for now; a future version of C64 OS may add other file routines. Let's go over these briefly, starting with the oddball, finit.
FINIT takes a RegPtr (a 16-bit pointer, .X low byte, .Y high byte) to a File Reference. The referenced device will have its default partition and default path set according to the partition and path in the file reference. This is convenient if you want to do something low–level with the device that doesn't consist of explicitly opening a file. For example, you might do this before loading a directory. The finit sets the default path, then you can request a directory with the relative $.
There is another function of this routine: if you set the filename as $:filename, it will return the 16-bit block size of that file, as a RegWrd (a 16-bit word, .X low byte, .Y high byte.) It is an unfortunate limitation that this can only be done on files with a filename of 14 or fewer characters, because 2 characters of the 16-character filename string must be taken up by "$:". Nonetheless, it is very convenient to know how big a file is, especially if you want to allocate a memory buffer big enough to hold the whole file, before proceeding to read in the file's data.
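A hypothetical sketch of that size query, assuming a File Reference (here called sizefref) whose filename field has already been set to "$:notes":

```asm
        ;Ask finit for the block size of a file.
        ;The underlying filename is limited to 14 characters,
        ;because "$:" uses 2 of the 16 available.
        #ldxy sizefref   ;RegPtr to the prepared File Reference
        jsr finit        ;Sets the device's defaults, returns
                         ;the file's block size as a RegWrd
        stx size         ;Low byte of 16-bit block count
        sty size+1       ;High byte

size    .word 0          ;Use this, e.g., to size a pgalloc request
```

Knowing the block count up front means memory for the whole file can be allocated before the first fread.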
FOPEN takes a RegPtr to a File Ref, plus a set of bit flags in the Accumulator. The flags are: Read, Write, Overwrite and Append.
If these are passed in illogical combinations, you won't get an error, but will get an unexpected result. Read cannot be combined with Write, for example. Nor can Overwrite or Append be combined with Read. If you use Write, it can sensibly be combined either with Overwrite or with Append, but not with both. This makes sense if you think about it for a moment. There is only one way to open a file for Read. But if you'll open a file for Write, by default you'll get an error if the filename already exists. Alternatively, you can pass Overwrite, which will replace any existing file with the same name. Or, you can pass Append. If no such file already exists you'll get an error. But if the file already exists it will be opened, and the cursor will be moved to the end of the file. Append is rare, but it is most useful for log files and the like.
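The actual bit values of the flags aren't given here, but assuming each flag occupies its own bit, the sensible and illogical combinations can be sketched like this (flag values are hypothetical; the real definitions come from C64 OS's own headers):

```asm
        ;Hypothetical flag values for illustration only.
ff_r    = %00000001      ;Read
ff_w    = %00000010      ;Write
ff_o    = %00000100      ;Overwrite (only meaningful with ff_w)
ff_a    = %00001000      ;Append    (only meaningful with ff_w)

        ;Sensible combinations:
        lda #ff_r        ;Read an existing file
        lda #ff_w        ;Write; error if the name already exists
        lda #ff_w.ff_o   ;Write, replacing any existing file
        lda #ff_w.ff_a   ;Write, appending to an existing file
        ;Illogical: ff_r.ff_w, ff_r.ff_o, ff_w.ff_o.ff_a, etc.
```

The "." here is the assembler's bitwise OR, as used in the fwrite example later in this post.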
You do not need to specify a Device Number, a Logical File Number, or a Secondary Address. Nor do you need to specify the length of the filename. Just the pointer to the FileRef is all you need to open that file. The Logical File Number, Secondary Address, and all other commands and flags are sorted out for you. And of course, the file will be opened on the device, in the partition, and at the path specified in the FREF. That is after all, the point.
If you try to call FOPEN passing a pointer to a File Ref that is already open, you'll generate the KERNAL's standard File Already Open error.
FCLOSE is a very simple routine. You pass a RegPtr to a previously opened File Ref, and it will close it for you. It will set the Logical File Number on the FREF back to 0, indicating that it's closed, and that LFN will be freed back up to be automatically used by a subsequent FOPEN.
Calling FCLOSE on a File Ref that's not open generates a File Not Open error. There is not much else to say about FCLOSE.
FREAD requires three 16-bit arguments. As I discussed in the post, Passing Inline Arguments, this is a perfect candidate routine for using inline arguments. For consistency with the other routines, the File Ref whence to read is passed as a RegPtr. Followed by two inline 16-bit words: a pointer to a buffer and a 16-bit length to read. The File Ref has to have previously been opened for read or an error will be generated. Here's how this looks:
#rdxy opnfileref ;Read the Open File Ref pointer into RegPtr
lda #ff_r        ;Set "read" flag
jsr fopen        ;Open the Open File Ref

jsr fread
.word buffer     ;Pointer to buffer
.word 30         ;Read 30 bytes

jsr fclose       ;Close the File Ref

[...]

buffer           ;Declare 30 bytes of space here.
My source code has a few standard macros for more easily working with 16-bit values. #ldxy loads a 16-bit number into .X and .Y. If used with a label, it loads the address of the label. #rdxy loads a 16-bit value into .X and .Y that is stored at the memory locations represented by the label. And #stxy will write .X and .Y to the memory at a label.
So the first #rdxy reads the address pointed to by opnfileref (a standard pointer in workspace memory) into .X and .Y, which I refer to as a RegPtr. Next the Read flag is loaded into the accumulator, before calling fopen.
Note that, .X and .Y are left undisturbed by any of the file routines. So, after calling fopen, RegPtr is still pointing at the same file reference. Therefore, the call to fread only needs the two inline arguments that follow it. These two arguments, in the example above, are hardcoded to an inline buffer and for a specific size. If you wanted these to be dynamic you could simply add labels to these two inline arguments and set them as you would set any other variables.
Lastly, the call to fclose doesn't require any additional arguments. The RegPtr is still set to the same File Ref after calling fread. Everything is done and closed and you've got 30 bytes from the file into the buffer. If you ask me, that is really simple. It gets even better when the length to read exceeds 255 bytes. It could just as easily be 4096 or any 16-bit number and you don't have to see any of that nastiness.
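Making those two inline arguments dynamic, as suggested above, might look like this. The buffer label newbuf and the 1024-byte length are made up for the example, and the file is assumed to have already been opened for read.

```asm
        ;Label the inline arguments, then set them like any
        ;other variables before making the call.
        #ldxy newbuf     ;Address of a buffer allocated elsewhere
        #stxy rdbuf      ;Store into the inline buffer pointer
        #ldxy 1024       ;16-bit length to read
        #stxy rdlen      ;Store into the inline length argument

        #rdxy opnfileref ;RegPtr to the open File Reference
        jsr fread
rdbuf   .word 0          ;Buffer pointer, set above
rdlen   .word 0          ;Length to read, set above
```

This is self-modifying in the mildest sense: the inline arguments live in the code stream, so labeling them turns them into ordinary writable variables.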
FWRITE is nearly identical to FREAD, except in the other direction. Let's just take a look at the code straight away.
#rdxy opnfileref ;Read the Open File Ref pointer into RegPtr
lda #ff_w.ff_o   ;Set "write" and "overwrite" flags
jsr fopen        ;Open the Open File Ref

jsr fwrite
.word buffer     ;Pointer to buffer
.word 32         ;Write 32 bytes

jsr fclose       ;Close the File Ref

[...]

buffer .text "This string is exactly 32 bytes."
The process is virtually identical to FREAD. Read in a RegPtr to a File Ref. Set the flags in accumulator to Write and Overwrite, logically OR'd together, and call fopen. Next you can immediately call fwrite because RegPtr is still configured. The first inline argument is a pointer to a buffer out of which data should be written to file. The second inline argument is the number of bytes to write. There is no bounds checking. If you say you want to write 1024 bytes but the buffer is only 32 bytes long, then whatever else is out there in memory beyond the end of the buffer will get written to file.
And, lastly, the file is closed. RegPtr is still set, and so there is still nothing extra to do. If you did some work between the fwrite and the fclose that disrupts .X or .Y, all you need to do is slip a #rdxy opnfileref (or whatever is your file reference) before the final call to fclose.
One more thing
The file routines in C64 OS have at least one other trick up their sleeve. But there will likely be other tricks to come in future versions of the OS.
C64 OS uses a mouse and mouse pointer. But anyone who has programmed any file accesses on the C64 knows that when the VIC-II is showing sprites, file access gets slowed way down. The sprites disturb the CPU's timing, which interferes with the software-driven timing of the serial routines. Both FREAD and FWRITE automatically disable and reenable the mouse pointer for you. This is one less thing you have to do on a typical file access.
You can still use the KERNAL
Here's the thing. C64 OS doesn't get rid of the KERNAL. Under the hood, SETLFS, SETNAM, OPEN, and CHRIN and CHROUT, etc. are still being used. If the device is legitimately on the IEC serial bus, then TALK and TKSA and ACPTR and so on are still being used. But if the device is an IDE64, then TALK and TKSA, etc. are being shortcut around by the way in which the IDE64 wedges itself into the CHRIN and CHROUT vectors. In other words, C64 OS is exactly as compatible with the IDE64 as if you were just manually using the KERNAL yourself. But FREAD and FWRITE make it easy to get 16-bit length reads and writes to and from memory. And FOPEN and FCLOSE make it easy to work with files that actually live in real places on devices that are 30 years newer than a 1541.
The reliance on the KERNAL goes a step further than just compatibility. You can actually still use the KERNAL, at your leisure, for binary loads and single byte reads and writes. And you can combine what you want from C64 OS's routines with KERNAL calls. So you can call FOPEN to open a file without having to care about where it actually lives. If you want, you can use FREAD to load in some part of the file, or not. But then you can simply read the assigned Logical File Number out of the File Reference structure, do a CHKIN and start reading single bytes from the stream with CHRIN just as you normally would.
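A sketch of that mix, assuming opnfileref is a zero-page pointer and using the freflfn field from the File Reference structure described earlier:

```asm
        ;Open with C64 OS, then stream bytes with the KERNAL.
        #rdxy opnfileref ;RegPtr to the File Reference
        lda #ff_r        ;Read flag
        jsr fopen        ;C64 OS assigns an LFN automatically

        ldy #freflfn     ;Offset of the Logical File Number
        lda (opnfileref),y ;Fetch the LFN assigned by fopen
        tax
        jsr $ffc6        ;KERNAL CHKIN: set the input channel

next    jsr $ffcf        ;KERNAL CHRIN: one byte into .A
        ;...do something with the byte...
        jsr $ffb7        ;KERNAL READST: check serial status
        beq next         ;Status 0: more bytes to come

        jsr $ffcc        ;KERNAL CLRCHN
        #rdxy opnfileref ;Restore RegPtr (in case .X/.Y changed)
        jsr fclose       ;Close via C64 OS
```

The loop ends when READST reports a non-zero status, which includes the end-of-file condition.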
That's all for now. In future posts I'll give some examples of where I'm using File References and the File Routines to greatly simplify an application's interaction with the file system, and derive unexpected benefits from being able to read and write data from and to the system folder and the application's own bundle.
Rock the box.
1. My plan is to use the space beneath the KERNAL for rendering bitmaps. It's visible to the VIC-II even when the KERNAL ROM is patched in. And it's a region that's physically discontiguous with main memory $0800 - $AFFF. It's also hard to run regular code from there, because the interrupt handler depends on the KERNAL being patched in.
2. The addressing on the C64's IEC bus is simple compared to serial buses on other computers. The Apple Desktop Bus (ADB, 1986), for example, allowed all the devices on the bus to auto-negotiate their addresses with each other. So you never needed to set an address manually, and you never got a conflict. Later serial buses got even more sophisticated allowing hot swapping which ADB did not support.
3. You can read all the gory details in IEC disected, a paper put together by Jan Derogee. There are plenty of other places that document it too, if you search around the internet.
4. I've said this before, but I'll say it again. BASIC is essentially a built–in, easy–to–learn, scripting language. Many of its routines backend on KERNAL routines, just the way many PHP routines backend on the C and C++ libraries.
5. It's slightly more complicated than knowing where on disk the next byte will come from. Because the drive reads one or more blocks from the disk into a buffer in memory. Individual bytes are then fed from the buffer. Eventually it uses the buffered data to find the next block to read into the buffer. But, we can ignore that detail for now.
I've been plugging away in the new year, trying to get the booter booting properly, and the ability to launch an app from an application bundle. So here's a quick video update of my progress so far.
What this video shows
The booter has run, it loads in all of the OS modules. It runs a drive detection routine and constructs a drive type to device number table in workspace memory.
The booter sets itself up as a standard app with a "quit app" jump table entry. Then it calls service routines to quit itself. This actually puts a vector to the "loadhome" service routine into the main event loop's EventLoopBreakVector. Typically this will cause the main event loop to end, and will jump to the routine to load the designated "home" application. But because we're just booting, the main event loop is not yet actually running. The quit app routine returns the pointer to loadhome in the X and Y registers. I refer to these in C64 OS now as a RegPtr. That's a standard 16 bit pointer, low byte in X, high byte in Y. The booter takes that returned pointer and jumps to it manually. This kicks off the load home routine.
Load home starts by allocating a page to be the application file reference (AppFileRef) for the about–to–be–launched app. It allocates new space because it needs to still have access to the AppFileRef for the about–to–be–quit app. Load Home then configures this new file reference with the device, partition and path to the currently designated home app's bundle. It then calls the loadapp service routine, with a pointer to the newly configured AppFileRef. In this way, the home app gets loaded just like any other app. And any app can load any other app directly, without needing to go back to the home app first. However, if you want to just quit the current app, there is a service routine that configures loading the home app for you. So it's very easy to leave your app and have the system return to the default home app.
The very first thing the loadapp routine does is quit the currently running app. In order to do this it jumps to the application's standard quit routine. The booter configured itself to have a standard quit jump table entry specifically so that things would go smoothly when loadapp tries to quit the booter. The booter's own quit routine doesn't need to do anything, so it just RTSs.
Typically the quit routine gives the application one last chance to clean up, before it is forcibly removed from memory. It might be that it closes network connections, or saves open files to disk, or saves its own state back to its application bundle. In fact, that's the reason for maintaining the current app's AppFileRef, so it can use it during the quit phase to write data back to its own bundle.
After the previous app has completed its quit routine, the current AppFileRef is deallocated and AppFileRef is pointed to the configured file reference for the app to be loaded. Then all application–level allocated memory pages are deallocated. Then the new app's menu file is loaded in, and the new menus are initialized. The menu initialization process deallocates the previous application's menu allocations. Memory allocations for the menu system are done at the system–level. So your application never needs to worry about setting up or taking down its own menus.
Then the application's main.o file is loaded in. And the pages that it occupies get marked as allocated (application–level.) Next, just as every application has a standard quit jump table entry, every application also has an initialization jump table entry. Load app then jumps to the freshly loaded main.o's init routine. At this early stage, I've set up app launcher as the default home app, but file manager will be next. App launcher's init routine is called. Typically it would configure its Toolkit widget UI, but I'm not there yet. It allocates space for a PETSCII image, which it then populates from a file background.pet which it finds as a resource in its own app bundle.
Next the init method sets up a screen layer struct which it pushes onto the screen layer stack. This defines its mouse and keyboard event handling routines, and its draw routine. The draw routine doesn't do much, except call textblit to copy its background PETSCII image to the screen. The mouse event routine is not yet forwarding to the Toolkit, but is instead just doing some primitive checks for where the clicks are happening. Clicking in the first column changes the foreground/background colors. Clicking in the middle of the screen changes the colors back.
Clicking in the second column puts the app into an infinite loop. This lets us test how the interrupt routine handles the situation where the app fails to return to the main event loop for a prolonged period of time. It kicks the CPU Busy animation, in the top left corner, into action. We can see that working. It is animated by the interrupt routine. But as long as the main event loop keeps running in a timely fashion, and events keep getting distributed, it continually updates a variable that prevents that IRQ handler from animating the CPU busy indicator. Pretty cool.
Once the app goes into an infinite loop, the main event loop is no longer running, and so events are no longer being distributed by their normal means. The interrupt handler is still updating the mouse and keyboard, though, and it's still buffering input events. The infinite loop code is scanning the key command buffer. As soon as there is a key command in the buffer it breaks the loop.
Most of these things are just a way for me to test various low–level features of the system. And things seem pretty stable. I'm making progress.
This table appears in The Complete Commodore Inner Space Anthology, pages 37 and 38. The data are rearranged here to best suit the format of a webpage, and a couple of pretty big errors in the Inner Space Anthology listing have been corrected. Corrections are noted below each block.
It is called a SuperChart because it lists the PETSCII, ScreenCode (both character sets), BASIC character/token, and 6502 operation for every value in the 8-bit range. The numeric values are conveniently given in both decimal and hexadecimal. I'm digitizing it so I can have access to it without having the reference text open on the desk in front of me, and tagging it here as Programming Reference for others to find. Hopefully someone out there will find this chart as useful as I do.
PETSCII, ScreenCodes and BASIC codes are roughly divided into 8 blocks of 32 characters each. For this reason, the chart is presented below as eight 32-row tables. See the end discussion for an explanation of PETSCII's block structure.
Bit 7 Low: %0xxx xxxx
- Block 1: 000 - 031: $00 - $1F
- Block 2: 032 - 063: $20 - $3F
- Block 3: 064 - 095: $40 - $5F
- Block 4: 096 - 127: $60 - $7F
Bit 7 High: %1xxx xxxx
- Block 5: 128 - 159: $80 - $9F
- Block 6: 160 - 191: $A0 - $BF
- Block 7: 192 - 223: $C0 - $DF
- Block 8: 224 - 255: $E0 - $FF
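The block number falls straight out of the top three bits of the value: bits 5–7 select the block, bits 0–4 the position within it. A tiny helper (illustrative only) makes this concrete:

```python
# Bits 5-7 of a PETSCII value select its block; bits 0-4 select the
# position within that block. Returns the 1-based block number used
# in the list above.
def petscii_block(value):
    return (value >> 5) + 1

# e.g. "a" is PETSCII $41, which lands in block 3 (the lowercase letters):
assert petscii_block(0x41) == 3
```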
Corrections: Inner Space Anthology lists the block–1 alphabetic SCREEN CODES as only uppercase. They should be uppercase in the upper/graphics character set, and lowercase in the lower/upper character set, as shown above.
Corrections: Inner Space Anthology lists the SCREEN CODE $1C as a backslash. It should be the British pound symbol (£) as shown above.
Corrections: Inner Space Anthology lists the block–3 PETSCII alphabetic characters as uppercase. They should be lowercase as shown above.
Corrections: Inner Space Anthology lists the block–3 SCREEN CODES lower/upper alphabetic characters as lowercase. They should be uppercase as shown above.
Corrections: Inner Space Anthology lists the block–5 reversed alphabetic SCREEN CODES as only uppercase. They should be uppercase in the upper/graphics character set, and lowercase in the lower/upper character set, as shown above.
Note: Inner Space Anthology lists two PETSCII characters for each of the values $A9 and $BA in the block–6 range. See end discussion on the changes I've made above.
Corrections: Inner Space Anthology lists the block–7 PETSCII alphabetic characters as lowercase. They should be uppercase as shown above.
Note: Inner Space Anthology lists two PETSCII characters for each of the values $C1 to $DA, $DE and $DF in the block–7 range. See end discussion on the changes I've made above.
Corrections: Inner Space Anthology lists the block–8 reversed SCREEN CODES at $F7, $F8, and $F9 as their non–reversed counterparts at $77, $78 and $79. They should be reversed as shown above.
Understanding the difference between PETSCII values and Screen Codes can be confusing. The lists you will find if you search the internet for PETSCII Table do not all agree with one another, nor do they all agree with older listings found in books such as The Complete Commodore Inner Space Anthology. My SuperChart listing above is derived from Inner Space Anthology. I have made some corrections, as noted above, but I've also made some changes. I want to explain my reasons for the changes, even if others would not agree with me.
Screen Codes, considered apart from their conversion from PETSCII values, are easy to understand. Any value from 0 to 255 that occurs in a screen memory location will be rendered by the VIC-II chip as a unique character.
The appearance of that character depends on the character set that is used. A character set is a 2-kilobyte bitmap, and the screen code value is an index into that character set bitmap. The C64 and VIC-20 come with a character ROM, which is 4 kilobytes and contains 2 complete character sets. Additionally, custom character sets can be defined in RAM. Therefore, what actually gets displayed on screen is determined by the contents of the character set.
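To make the indexing concrete: each character is an 8×8 pixel glyph stored as 8 bytes, one byte per pixel row, so the glyph for screen code n starts at n × 8 bytes into the set. The sizes below are real; the function name is just illustrative.

```python
# Character set indexing: 8 bytes per glyph, 256 glyphs per set.

CHARSET_SIZE = 256 * 8              # 2048 bytes: one complete character set
CHAR_ROM_SIZE = 2 * CHARSET_SIZE    # 4 KB ROM holds both built-in sets

def glyph_offset(screen_code, lower_upper=False):
    """Offset of a glyph's 8 bytes within the 4 KB character ROM.
    The second (Lowercase/Uppercase) set starts at $0800."""
    base = CHARSET_SIZE if lower_upper else 0
    return base + screen_code * 8
```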
In the two built–in character sets, the bottom half, 0 to 127, are all unique. But the upper half, 128 to 255, are reverse–video versions of the bottom half. Conveniently, this means that toggling bit 7 of any screen code exchanges that character with the inverse version of itself. Custom character sets do not need to adhere to this standard, but the fact that the two built–in sets work this way was a very clever design decision. In order to make the cursor blink, the KERNAL need only toggle bit 7 on and off. A nice side effect is that it allows text to be written in reverse characters for titles that stand out, and doubles the available graphics characters for drawing PETSCII–based images.
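That bit-7 trick is a one-liner; a tiny illustrative example:

```python
# Toggling bit 7 of a screen code flips between a character and its
# reverse-video twin; an XOR flips the bit in both directions, which
# is exactly what makes it so cheap for the KERNAL's cursor blink.
def toggle_reverse(screen_code):
    return screen_code ^ 0x80

assert toggle_reverse(0x01) == 0x81   # "A" -> reversed "A"
assert toggle_reverse(0x81) == 0x01   # and back again
```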
What the PETSCII values actually are is more contentious. Obviously, it cannot be the case that there is a one–to–one relationship between PETSCII values and Screen Codes. If there were, then every PETSCII value would have a visual representation, and there would be no room in the 8-bit space for control codes.
The tables above are divided into 8 blocks of 32 characters, because PETSCII itself is divided into 8 blocks. And it is easiest to understand PETSCII by comparing those blocks with each other. The 8 blocks are further subdivided into the low 4 blocks, and the high 4 blocks, as indicated in the table of contents links at the top of the SuperChart. The high 4 blocks are like a shifted version of the low 4 blocks.
- Block 1 → shifted → Block 5
- Block 2 → shifted → Block 6
- Block 3 → shifted → Block 7
- Block 4 → shifted → Block 8
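In code terms, "shifted" is just bit 7 being set (adding 128), which moves a value up exactly four blocks. A one-line illustration:

```python
# Shifting a PETSCII value sets bit 7, moving it up four blocks:
# block 1 -> 5, 2 -> 6, 3 -> 7, 4 -> 8.
def shifted(petscii):
    return petscii | 0x80

assert shifted(0x41) == 0xC1   # lowercase "a" (block 3) -> uppercase "A" (block 7)
assert shifted(0x11) == 0x91   # Cursor Down (block 1) -> Cursor Up (block 5)
```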
PETSCII uses blocks 1, 2 and 3, and their shifted equivalents, 5, 6 and 7. PETSCII leaves block 4, and its shifted equivalent block 8, undefined. This provides a space of 192 values; however, in practice even 22 of those values are not defined.
The general scheme, with minor exceptions, is as follows:
- Block 1: Control Codes
- Block 2: Symbolic / Numeric
- Block 3: Alphabetic (Lowercase)
- Block 4: Unused
The shifted corresponding blocks are as follows:
- Block 5: Control Codes (Often with inverse role)
- Block 6: Symbolic (Graphical)
- Block 7: Alphabetic (Uppercase)
- Block 8: Unused
If you look at the two blocks of control codes, the unshifted block 1 is almost always inverted by the same code in the shifted block 5 (if that control code is dual in nature at all). For example:
- Stop → Run
- Text → Graphics
- Cursor Down → Cursor Up
- Reverse On → Reverse Off
- Home → Clear
- Delete → Insert
- Cursor Right → Cursor Left
The alphabetic block 3 values are therefore the lowercase letters, which become their uppercase equivalents in the shifted block 7.
And the symbolic / numeric block 2 becomes a set of 32 graphical symbols in the shifted block 6. The symbols have no conceptual shifted alternatives; that is why the shifted block 6 is the block used for graphic symbols.
Strings of PETSCII that are output through the KERNAL's Screen Editor are interpreted and mapped into Screen Codes, which are put in memory to be displayed. For example, there is no PETSCII code that represents "reversed lowercase a". However, the following PETSCII sequence: $0E, $12, $41, (that is "the control code for Lowercase/Uppercase character set", "the control code for Reverse On", and the PETSCII value for "a") will cause a reversed lowercase "a" to appear on the screen wherever the cursor currently is, and then the cursor will be advanced. That sequence of PETSCII is interpreted by the KERNAL, but it results in the Screen Code $81 being put into screen memory, and the Lowercase/Uppercase character set being selected.
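For printable values, the mapping the Screen Editor performs follows the block structure directly. Here is a hedged Python sketch of that conversion; it ignores a few quirky values (such as $FF, pi) and all the stateful control codes, so treat it as an illustration of the block arithmetic rather than an exact model of the KERNAL.

```python
# Illustrative PETSCII -> Screen Code mapping for printable values only.
# Control codes ($00-$1F, $80-$9F) are interpreted, not displayed, and a
# few oddball values (e.g. $FF) are deliberately not modeled here.

def petscii_to_screencode(p, reverse=False):
    if p < 0x20 or 0x80 <= p < 0xA0:
        raise ValueError("control code: interpreted, not displayed")
    if p < 0x40:        # block 2: symbols/digits map straight across
        sc = p
    elif p < 0x60:      # block 3: $40-$5F -> $00-$1F
        sc = p - 0x40
    elif p < 0x80:      # block 4: $60-$7F -> $40-$5F (mostly undefined)
        sc = p - 0x20
    elif p < 0xC0:      # block 6: $A0-$BF -> $60-$7F
        sc = p - 0x40
    else:               # blocks 7/8: $C0-$FF -> $40-$7F
        sc = p - 0x80
    return sc | 0x80 if reverse else sc   # Reverse On just sets bit 7

# The sequence from the text: Reverse On + PETSCII "a" ($41) -> Screen Code $81.
assert petscii_to_screencode(0x41, reverse=True) == 0x81
```

Note that this also reproduces the contentious mappings discussed later: $A9 lands on $69, $BA on $7A, and $DE on $5E.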
Confusion from Character Sets and Conversions
The above description of PETSCII is hard to uncover, because of how the alternative character set is laid out, and how PETSCII values are interpreted by the KERNAL's Screen Editor and converted to Screen Codes for display.
When a Commodore 64 is first turned on it is using the Uppercase/Graphics character set, rather than the Lowercase/Uppercase character set. It may simply be that Commodore thought having more graphics characters to work with in exchange for having only one case of alphabetic characters seemed like a good trade off.
Now, if you're going to provide only one case of letters in a character set it will be uppercase, not lowercase. Why is that? Well, many things are written in all–caps, but very few things are ever written in all lowercase. Hence the existence of the term all–caps, and the existence of all–caps fonts. Many words, such as the pronoun I and most proper names, cannot be correctly rendered in small letters alone. In fact, in ancient writing, in Latin, Greek, and Hebrew, there were no small letters; no minuscules, only majuscules. So it makes sense that the alternative character set is Uppercase/Graphics and not Lowercase/Graphics.
However, it also makes sense that in the Uppercase/Graphics character set, it is the lowercase PETSCII values that are replaced by uppercase glyphs. Otherwise, when in Uppercase/Graphics mode you would have to hold down shift constantly in order to get letters instead of graphic symbols. On the other hand, when you do switch character sets, whatever you've typed without the shift key has to appear as lowercase characters. Otherwise, in the Lowercase/Uppercase character set, you'd be getting uppercase characters without the shift key, and you'd have to press the shift key to get a lowercase character. That just doesn't make any sense.
And so, all this taken into consideration, one should not be confused about the PETSCII values on account of the existence, availability and even default–status of the Uppercase/Graphics character set. It is merely a character set, an alternative on–screen representation, that renders lowercase PETSCII values using a set of uppercase glyphs, and renders uppercase PETSCII values with an extended set of graphical glyphs. In a conversion from ASCII to PETSCII, though, ASCII's uppercase characters map to PETSCII's block 7, and ASCII's lowercase characters map to PETSCII's block 3.
In my opinion, a "standard code of information interchange," which is what the SCII in PETSCII stands for, cannot have multiple meanings for the same value. Given that I feel there are good arguments that the lowercase alphabetic characters are in block 3, and the uppercase equivalents are in block 7, then the Lowercase/Uppercase character set should be used for deriving the canonical PETSCII values in block 6 and 7 where there are alternatives.
Specifically, $A9 and $BA. These are mapped by the KERNAL to the Screen Codes $69 and $7A respectively. In this entire block (block 6), both character sets are identical, except for these two values. My listing indicates that the canonical PETSCII value is the character that appears in the Lowercase/Uppercase set, and that the other glyph is merely a result of swapping in an alternative character set.
The only other contentious spot is $DE and $DF in block 7. These get mapped by the KERNAL to the Screen Codes $5E and $5F, where there is an alternative appearance in the Uppercase/Graphics set. This ambiguity is not there when you type the unshifted ↑ (PETSCII $5E, mapped to Screen Code $1E), because $1E is identical in both character sets.
The Character Sets
Here are the character set images, to help clarify how the Screen Codes work and differ between them.
Lowercase/Uppercase Character Set, and Uppercase/Graphics Character Set.
Another year, another adventure. Long live the C64. Your second World of Commodore since returning to the scene cannot ever be as epic as the first one. But as the fates would have it, I would not be able to repeat last year even if I'd wanted to. This is my review of World of Commodore 2017. In a nutshell, friends and networking. Networking as in what people do when they get together? No, no, no. Digital, 8-bit, computer networking. Read on.
Last year the journey felt truly epic. I got up early on a Saturday, I brought my son along with me. I had no expectations, except bad ones, and when I arrived it was just one pleasant surprise after another. What could compare with that?
I've got several hobbies that tug at my heartstrings and draw me out into the world, from Esperanto conferences to Star Trek CCG tournaments. Commodore Expos are a new(old) one to add to my list of places to go throughout the year. But my wife is always making me agree to compromises so that I actually, you know, spend time with her and our family. The dice were rolled, and this year the Archibald Family Christmas party at a cottage north of Peterborough came up on the same Saturday as World of Commodore's main day. Drat! I tried to argue that I've been working my butt off all year and that the one and only time and place that anyone on earth is going to care about what I've been doing is that very Saturday, in a hotel basement in Mississauga. I didn't get far along those lines. Besides, the show is a two day event, right?
So I went to the cottage on Saturday, December 9th. A 3 hour drive out, in the winter, and a 3 hour drive back at the end of the day. I was already exhausted and I hadn't even made it to the show. I packed up my C64 Luggable project, got some Commodore Logo Patches ready to bring with me and went to bed early. The next morning, I got up before the sun. Bid farewell to my wife and sleepy kids and hit the road.
I wasn't quite as excited as last year. I knew where I was going, I knew how long it would take me to get there. And I knew that the location was the same as last year so there weren't likely to be any big surprises. And there weren't. The weather was great, the drive was fast and easy, I stopped once for food and a bathroom, and I didn't even bother to get gas. By the time I pulled into the parking lot at the Admiral Inn my tank was on empty, and it was 10am sharp.
I left my equipment in the car, and didn't even bother to put on my coat. I just dashed across the frigid parking lot in my sweater, and into the hotel lobby. I ran past a guy loading a luggage cart stacked with 8-Bit Commodore goodness into the back of a van. Besides that and a tiny square sign with the chicken lips logo and an arrow pointing right, there wasn't much sign that anything was going on. I was a bit worried. Maybe last year was an exception? Maybe Bil Herd had drawn an uncharacteristically large crowd last year that would never be matched again?
I headed downstairs and found the sign in table completely unmanned, and no one was waiting to sign in. Damn it, did I miss the whole show?
Before I took five more paces, Jim Mazurek, a true 8-Bit soldier who, along with Eric Kudzin, seems to go to supernatural lengths to make an appearance at every show, popped out from around the corner. He greeted me kindly, and broke some unexpected but hugely uplifting news. Dave Ross is here! Holy crap! Dave (Watson) Ross, one of my oldest online Commodore friends, together with Vanessa Dannenberg and David Wood (too bad they couldn't make it), was here in the flesh.
I took a couple of steps into the showroom and peeked around the corner to see what I could see. "No shit! Is that really Dave Ross I see sitting there?!" I called out. The same guy, only a decade older than the last time I saw him spun around and stood up with a smile on his face. After extending my arm to shake his hand, I thought, ah screw that, and gave him a hug. He could have been sitting completely alone in that giant room, and I still would have been happy to have made the journey.
A couple of quick thoughts. First, there is no doubt that this is a Sunday morning, the second day of a successful show. The tempo was way down from last year, but that was not a sign of less interest or smaller participation. Instead, the room was only partially populated by sleepy lookin' dudes who had just gotten out of bed from the partying the night before and strolled down to see what was going on. Many of the regulars, Joey and Leif, for example, weren't even there. John (CRT) Hammarberg, who I wrote about last year, was there, but he was packing up his gear and getting ready for the 6 or so hour drive back home. I barely had a chance to say hi, shake his hand and wish him a safe trip home. I also discovered that Jim Brain had not made it up this year. That's a bummer.
The second thing to point out though, is that the room was bigger. Like, two times as big. At first I was worried, were there no presentations this year? Last year the room had been divided in half by means of a folding partition that slides along a track in the ceiling. This year, that partition was neatly tucked away inside the hidden wall compartments, and the showroom floor was consequently much more spacious than last year. But, fear not, I soon found out, when the first presentation of the day was due to begin (which I'll go into later), that there is a whole separate room reserved and set up just for presentations.
This year the showroom was configured more like a giant 8 on an old digital alarm clock. The room was rectangular, like two squares side by side. Tables flanked the entire outer rim of the whole room. And in the middle, two inset squares, one for vendors the other for more demo machines.
Let's walk our virtual way around the room to the best of my memory. Last year I left a couple of people out who weren't super happy that I'd completely forgotten about them, but you know how it is with memory.
The first machine I came across was a breadbox C64 with a conspicuously bright red 1541 Ultimate II+ hanging out the back. I got a look at it, and learned a few tidbits, such as that the onboard speaker plays sounds that make it sound like a spinning 1541 disk drive. To be honest, I'm not sure how much I like that, and I hope that feature can be turned off. The other thing I learned is that its access to mass storage devices works a lot like the uIEC/SD. I'm still going to have to get my hands on one to test it for compatibility with C64 OS. I'd love it if this became the storage device in my C64 Luggable, because it packs a lot into a small space: Ethernet, USB storage, REU emulation, a freezer cartridge, and more.
Beside it on the same table was an Amiga 600 with the Vampire 2 accelerator board. I've never been into the Amiga, but there were quite a few of them scattered around the room. I looked up the Vampire 2 board online, and found this link for you, if you're interested in getting one for yourself. http://www.apollo-accelerators.com.
In the video review of the Vampire 2, in the first 5 seconds of the video Dan Wood declares that the A600 is "by far the worst Amiga ever created" and the black sheep of the Amiga family. I would not have guessed that that's how people feel about it. It has always been my favorite model, and this is coming from a guy who has never used an Amiga in his life. I think it's the diminutive form factor, but to me it is the sleekest looking and most visually appealing of all the Amigas. It's so cute, I love it. Anyway, now it's got some amazing speed boost via the equivalent of the SuperCPU of the Amiga World, Vampire 2. Or something. Check it out.
Right next to that was a table with two of Steve Gray's PETs with their tops cracked open. I got a couple of quick looks inside. The first was an 8032, showing off the Multi-Edit ROM adapter. The second was a SuperPET with some other board mounted inside. Unfortunately I didn't get a clean look at that one.
Check out the Multi-EditROM up close and personal. Looks wonderful! I love seeing all those jumper wires, reminds me of when I installed the daughter board in the c128D to make it compatible with the SuperCPU128.
If you're interested in this project, you can read all about it on Steve Gray's website, (a sub-page of 6502.org, how cool is that?) http://www.6502.org/users/sjgray/projects/editrom/.
The Editor ROM is responsible for all video initialization, screen output, keyboard input, full-screen editor, and IRQ handling. By putting all this in one ROM Commodore was able to customize the machines for various markets, with different options. Commodore provided for most combinations, but not all models got all options. Later CBM machines had additional editing features and even a simple power-on menu for "Executives". Steve Gray—The PET/CBM Editor ROM Project
He also had on display the absolute behemoth, CBM Model 4040 Dual Disk Drive. The thing is like the size of a small washing machine. No, I jest, but it is big. Especially compared to the very modestly sized Super Disk Drive SD-2 by MSD which he also had on display, with the top case removed, right next to it.
Zooming around the corner, Leif Bloomquist, TPUG member and co-organizer of World of Commodore had several of his own machines set up.
The first was in his transparent C64c case which I marveled at last year. This year he was using it to show off a Raspberry Pi C64 interface board of his design. Unfortunately this is not a commercially available project, and he doesn't currently plan on manufacturing the board. Although the schematics and plans are (or will be) available on his website.
Essentially, the Raspberry Pi is mounted on a board that plugs into the User Port on a C64. It is fully powered by the C64 and can be communicated with over the serial connection provided by the User Port, as though communicating with a remote machine over a modem. Except, the other machine isn't remote, it's neatly hanging straight off the back of the C64.
Besides just being a kind of neat project, to see that you can do that, I'm not sure if Leif has in mind any bigger plans for what one might be able to do by having access to a Raspberry–Pi–in–a–UserPort–Cartridge. Frankly, I'm a fan of when these sorts of projects are turned into commercial products, even if there is only a run of 50 boards or whatever, even if it's a kit, but I respect the fact that there is some financial risk in making a bunch of product that no one ends up buying.
By the way, I consider this project to fit within the theme of "networking" for this year's show. Even though it's a point–to–point network over a very short distance, it is an interesting entry in a wide range of demonstrations of people using their C64s to talk to other computers.
Leif also had on display a C64c with a beautiful shade of blue for the top case. This one he painted himself. It looks very professional. The machine also had a number of buttons, switches and knobs popping out the left hand side. I should have asked him what all those do.
Another contributor to the networking theme of this year, but not quite a Commodore 8-Bit, Leif also had set up a WYSE WY-60 Terminal between his C64c's. WYSE was a company founded in 1981 that specialized in terminals and then thin clients, before being acquired by DELL in 2012. You can read all about their sordid history here, https://en.wikipedia.org/wiki/Dell_Wyse.
Essentially this machine was just a terminal. Sort of like how we use terminal emulator software, that software just emulates what these machines and their ilk had built in. Leif hooked up a DB-25 Parallel WiFi Modem that emulates a Hayes Modem. It's pretty cool, because with this simple "dumb" machine, plus the intelligent guts crammed onto that tiny WiFi modem, this is all you need to be able to log into a free Unix Shell account somewhere, such as Toronto Free-Net. From there, you can read email with PINE or browse the web with Lynx, and so on. Pretty cool.
Also sitting on the desk there beside the WYSE Terminal was the very last of Leif's own deluxe WiFi modems. He made a run of about 100, and has since sold them all, down to this very last one. There will not be another run, so be the lucky one to get your hands on it. I asked him if he felt it was a success, and the answer was a resounding yes! He expected to make 20 or so, and questioned whether he'd be able to offload them all. So it was a nice surprise for him to be able to move so many units, especially considering the plethora of cheaper but less capable WiFi modems that have become popular in the community in the last few years.
My only regret is that I'll have to remove his modem from the Commodore 8-Bit Buyer's Guide after I get word that he's sold the last one.
The VR64, a set of Virtual Reality Goggles for the C64, was demoed earlier in the year, but this is the first time I had a chance to try it out for myself. Jim had sent Leif a sample of the goggles to put on display at World of Commodore.
The project was put together by Jim_64 (I can't seem to figure out his full name; apologies). He wrote a nice blog post about how he put the project together, with links and resources so you can build your own. http://64jim64.blogspot.ca/2017/09/vr64-virtual-reality-goggles-for.html.
His project is so clever, and of such interest to people on the cutting edge of technology with VR being all the rage these days, that VR64 was featured in an article on Gizmodo, and on PopularMechanics.com. I'd say that's a pretty big win for the popularity of the C64. The articles feature some nice videos you can watch.
I got to try on the goggles, and play the one game that Jim_64 also produced to go along with the project, Street Defender. It's cute, a bit simplistic, but there is some real promise there. I'd love to see what sort of mind bending 3D visuals could be produced if a couple of experienced demo coders felt like putting their talents behind some VR64 demos.
UPDATE: January 2, 2018
Didn't I say I'd forget someone? It totally skipped my mind when I first wrote this review, but Francis Bernier from Montreal was set up after VR64. I didn't take any pictures of his table for some reason. He had on display a C64c, top case removed, with several modifications made by soldering things directly onto the main board. To be honest, it gave me the willies to think of dropping a soldering iron directly onto a C64's main board, for fear of damaging something.
He had removed the RF Modulator, and used the space for a modification that broke out the video signals in some way as to get a cleaner signal with less noise and interference. He paired this with a LumaFix64. The LumaFix64 is listed in the Buyer's Guide, and I would have bought one from him right there on the spot, but he only had one left and it was already spoken for. The LumaFix64 sits directly into the VIC-II socket, and the VIC-II chip is then plopped into the matching socket on the other side of it. There are then three tiny adjustable knobs. These can be tweaked in realtime while the machine is on. You watch the display, and if there are vertical bars or other interference, you can gently adjust those knobs until the picture becomes clearer.
The combination of his video line mod, plus the LumaFix64, produced an output of truly stunning clarity. I thought the 1084S produced sharp output. But I've never seen results this good. He had an upscaler connected to a VGA monitor. You don't see raster lines when using a VGA monitor, and I commented that the expensive upscaler probably had something to do with the quality. But he assured me it did not. He had another 64 setup without those mods. With the flip of a switch he could toggle the display between the two machines. The machine without any modification had the typical blurriness we're all familiar with, horizontal bleed and weird vertical lines. But when it flipped back to the machine he was showing off, it was like looking at an emulator it was so clear.
Getting a LumaFix64 is on my shortlist of purchases for 2018.
Apologies to Francis for forgetting to mention you in my first draft.
Sliding down the tables some, Jim Mazurek had a flat c128 with a 512K REU clone, a small drive stack, 1351 mouse and an Amiga monitor. Across the table was scattered a variety of new and old hardware expansions, a machine language reference guide, some cables and a networking router. Precariously balanced atop the Amiga monitor was a XYPLEX MAXserver 1600, Terminal Server. A say what now? A 16-port terminal server, which you can pick up on ebay for anywhere between 100 and 200 dollars.
Jim was using the XYPLEX as a TCP/IP-to-serial bridge, which he connected to the c128 via a UserPort serial adapter that looks suspiciously like an EZ-232. Ironically, this entire Rube Goldbergian contraption, plus the benefit of wireless networking, is more or less encapsulated in any of the many popular WiFi modems. However, Mazurek assures me that there are numerous useful benefits to using the XYPLEX, like 16 serial ports(!), that make it worth the trouble and the physical space it occupies.
In another wonderful contribution to the networking theme, Jim was using his setup to display the modern revival of Quantum Link, or Q-Link, the originally Commodore 64/128 only online service that changed its name to, you guessed it, America Online, in 1991. After the name change, the service expanded to service the PC and Macintosh and still exists today.
Q-Link offered a number of departments for such things as news, online shopping, chatrooms with other users, downloading software, and more. As a kid I remember having the Q-Link disk that came packaged in the box with GEOS on my first C64c. By the time I acquired my first modem in the mid 90s the service had long already moved on and was no longer accessible by the C64 version of the software.
The efforts of Glenn Holmer and Jim Brain led to QuantumLink RELOADED. The backend code has been updated to support a bridge to the IRC network, which is what Jim Mazurek was demonstrating. The process of getting online was, in my opinion, complicated. But I do applaud the effort. And it's nice, as Jim said, for people to experience the nostalgia of using Q-Link again if they had been devoted users of it back in the late 80s.
Later in the day, I decided to shoot a video of Mazurek walking me through the process involved in getting his c128 connected and logged into the #c64friends IRC channel via Q-Link. Give it a whirl, it's got a few funny bits, and some chit chat that gives the feel of what it's like to hang out with your C64 friends at an expo.
Rounding the bend again, a long line of tables ran the back of the room. First up was a table set up with diagnostic cartridges and lots of bare motherboards. Along with some monitors, this table was meant for people to test hardware. Joe Palumbo spent a healthy portion of the day hunched over some C64 mainboards cursing and scratching his head. It also served as a nice spot to congregate and discuss some of the technical intricacies of C64s, such as how to identify when a particular RAM chip has gone bad. Also discussed was how to deal with the fact that many C64s have their ROM chips soldered directly to the mainboard, and how best to upgrade these machines with JiffyDOS.
I learned a few fun facts. One is that the reason some mainboards have sockets and others have the chips soldered on is very likely that Commodore was cranking out mainboards as quickly as they could. But production of the mainboards was not always in sync with the ready supply of ICs. If some ICs were not available at the time a board was being produced, they simply substituted in a socket instead. When the chips became available, they could easily be plopped into place.
Next up was the table with drinks and a super old school TPUG sign ringed with a string of neon blue lights. Bottles of pop, stacks of red plastic glasses and a couple of bottles of rye whiskey tantalizingly called out my name. 11am seemed a bit early to crack into the booze though, and by the end of the day I was going to have to drive home. This is one of those unfortunate downsides to showing up on a Sunday. If I were there on the Saturday, and had a room somewhere a few short storeys above the show, I'm sure that smooth liquor would have added to the authentic World of Commodore experience. This is bringing back fond memories of hot pressing T-Shirts with the words "I Partied with Jim Butterfield" strewn across them.
Speaking of Jim Butterfield, his name still comes up at these shows. And there is still an air of sadness in the room from the loss of such a vibrant, eccentric and memorable Commodore icon. It was at an expo much like this, many years ago, that I first learned about how to make the BANANA prompt.1 Let's all raise a red plastic glass of Coke and Whiskey, to Jim Butterfield.
On the next table, past the emergency exit, sat a few Amigas doing their Amiga-y thing. One was a brand new Amiga motherboard inside a partially transparent PC tower case. I think a lot of Amiga fans would have been excited to see it, but I admit that, by the time it gets up to that level, it just starts looking like a standard boring PC to my eyes.
Somewhat more exciting for me were the Amiga 1000 and 3000 (I think) that followed it. I'm not 100% sure what these were showing off, but it's always fun to poke around on an Amiga when you get the opportunity.
And it's pretty traditional to have a few game stations set up. In the corner following the Amigas were an SX-64 and a Breadbox with what looked like KONIX SpeedKing, or perhaps EPYX 500XJ, joysticks; it was hard to tell from the photos. These machines dutifully pumped some SID tunes into the air.
Next up, around the corner from the game stations, was Jay Hamill's rig. He had an old laptop sandwiched between a C128D and an SX-64 of his own. The C128D had four custom toggle switches jutting out the bottom of the front panel. To one side of his rig was a bin full of cartridges, eprom burners, joysticks, a cartridge port expander and a 1764 REU. To the other side was a stack of drives, including a 1541-II, a 1581 and a "Micro IEC", which predates the uIEC/SD. It's the same concept, also made by Jim Brain, but it has a Compact Flash card and a socket for an IDE ribbon cable. This Micro IEC storage device is no longer available, having been superseded by the uIEC/SD. But what amazes me is that this device was created, was commercially sold, and then stopped being sold in favor of a newer, smaller model supporting newer, smaller storage cards, all within the length of time that I'd been out of the scene.
The table surrounding his 128D was covered with networking hardware. WiFi modems and serial adapters, some of which after all my searches online for populating the 8-Bit Buyer's Guide, I'd still never even heard of.
Better yet, he demoed for a small crew of spectators how to configure and sign into a WiFi SSID, and how to connect to IRC and FTP using these devices. But, I'll return to cover those in the next section about the demos. Chalk another one up for the networking theme.
After that came the freebies table. In the middle of the room, there was another PET, with the built–in cassette deck and the chiclet keyboard. And beside it, a VIC-20 with the old style power cable.
It was on these inner tables that I brought in and set up my C64 Luggable, such as it is. I didn't have time to finish it before World of Commodore, but I also didn't want to rush it. I put the bits together, but the back door is super rough, and the front panel, while all dremelled out with the speakers and their grills in place, is not yet permanently screwed down, so it's held on with masking tape. None of it is painted, though I put the paint chit on the top to show what color I intend it to be.
The handle and display were put in, the mainboard was mounted, and I brought along the keyboard and mouse to set out in front of it. I stuck the sweet little chrome Commodore logo to the keyboard, and masking-taped the "Commodore 64 Personal Computer" logo mark to the front of the machine. However, I did not have the rubber feet on the bottom yet, because I want to wait until after the bottom has been painted to apply those.
I did bring with it one little surprise that I haven't posted anything about online yet. My mother, who is a very skilled quilter, has made me a custom fit, quilted cover with a cutout for the handle. And to the cover she stitched on two of the embroidered Commodore Logo Patches that are available in the Buyer's Guide.
I got a few oohs and aahs, and a handful of questions about the machine. But, unfortunately it was not in functional condition, so I was not able to turn it on. Next year, I fully expect that not only will I be turning it on, but I'd love to use it as a demo station and a 4 player game station. The speakers in my C64 Luggable were easily the nicest to be found anywhere in the room, so using it as a High Voltage SID Collection boom box would also probably go over pretty well.
With any luck, C64 OS will be at least bootable, with something interesting to show for next year too. In which case, this will be my demo machine for whatever talk I can schedule myself in to give. Ironically, I didn't take any photos of my machine on display at the show, but I took some photos of it in the workshop the night before I left, and here they are.
The Presentation Room
I'll be pretty straight with you: I totally missed the best and most interesting presentations, the ones that took place on Saturday. But there were some presentations on Sunday too. My greatest lament is that I missed the in–depth talk on the PLA, delivered by the guy who created the U17 PLAnkton. Forgive me, I've forgotten his real name, but he's known as Eslapion on the Melon64 forums. It would have been a really great talk, and I'm rather disappointed I didn't get a chance to hear it.
The presentation room this year was actually in another physical room, accessible from the main lounge area outside the showroom. It had seating for about 40 people, which to me seems a bit cramped. But on the Sunday it was okay because there was a smaller crowd that day. There was a projector and a screen at the front of the room, as well as a microphone and a presenter's table for his or her machine and notes. Tom Luff was operating the video recorder on a tripod from the front row. One blessing is that I'll at least be able to watch the discussion of the PLA on YouTube.
UPDATE: January 2, 2018
According to MindFlare Retro (@MindFlareRetro) on Twitter, the title of the presentation by Eslapion about the U17 PLAnkton was "The PLAin Truth About the Commodore 64 PLA". I haven't found the video yet on YouTube, but when I do, I'll embed it here.
UPDATE: February 7, 2018
MindFlare Retro has uploaded the video to YouTube. Here it is:
The first talk I went to was by David Zvekic about DizzyTorrent for Amiga 68K. DizzyTorrent was actually released the weekend of World of Commodore, so this is pretty fresh news. He gave a delightful talk about the challenges involved with getting an Amiga to play nicely with the world of BitTorrent. One main issue that sticks out in my mind is that when you are torrenting a large file, you receive little bits of it at a time. And whichever little bits you have become available for other people to get from you. The problem is that this requires more or less random access into what could potentially be a multi–gigabyte file. The Amiga file system known as Fast File System (FFS), as it turns out, is not that fast. His description of it reminded me of the problem with seeking through a large file on a typical C64 file system, like that of a CMD native partition. The directory entry points to the first block, and the first block points to the second block, and so on until the end of the file. There is no way to find the last block of the file without loading in every block along the way and traversing all the pointers. His solution was clever, and the rest of the talk was similarly interesting.
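The linked-block problem described above is easy to simulate. Here is a minimal Python sketch (purely illustrative, not Amiga FFS or Commodore DOS code) showing why seeking to the end of a file on such a file system costs one block read per block in the file:

```python
# Hypothetical model of a linked-block file system, in the spirit of
# Amiga FFS data blocks or Commodore track/sector chains: each block
# stores a pointer to the next block, and -1 marks the final block.

def build_chain(num_blocks, block_size=254):
    """Lay a file out as a chain of linked blocks."""
    blocks = {}
    for i in range(num_blocks):
        nxt = i + 1 if i < num_blocks - 1 else -1
        blocks[i] = (nxt, b"\x00" * block_size)
    return blocks

def seek_last_block(blocks, first):
    """Find the file's last block. Every pointer along the way must
    be read from disk, so the cost grows linearly with file size."""
    reads = 0
    cur = first
    while True:
        reads += 1
        nxt, _data = blocks[cur]
        if nxt == -1:
            return cur, reads
        cur = nxt

chain = build_chain(1000)
last, reads = seek_last_block(chain, 0)
print(last, reads)   # the 1000-block file costs 1000 block reads to seek
```

At 254 data bytes per block (the Commodore figure), a single gigabyte-scale torrent payload would mean millions of block reads just to reach its tail, which is exactly why random access into such a file is painful.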
If you're an Amiga 68K user, soak up the love here: http://amitopia.com/bittorrent-client-for-amiga-68k-released-on-aminet/
Although this next one was not an official talk (at least, not unless it happened on Saturday), it should have been. It was super easy to talk Jay Hamill into giving us a pretty extensive demonstration of the typical WiFi UserPort Modem.
These modems are built around a very inexpensive component: a WiFi module with a full TCP/IP stack and programmable firmware, bridged to RS-232 serial. It was originally developed to make it easy for companies with older but very expensive serial-port hardware to get that hardware onto a WiFi network for cheap–and–ready remote management. This little device has become an absolute boon to the retro computer market. There are many more solutions that make use of it than I was initially aware of, and I'm still playing catch-up trying to get them all listed in the Buyer's Guide.
Hamill was eager to show off the myriad of interesting hardware devices he had scattered around his C128D. Here are just a few examples: the MicroIEC (CF/IDE), C64Net Wifi, Link232, EZ-232, Commodore 64 Telnet Wifi Adapter and Strikelink WiFi. Check them out.
Hamill walked us through what you can do with a WiFi Modem. He picked one that had been created by Bo Zimmerman, along with a number of programs that Zimmerman has put together to make use of them. We only got to see a few of them in action, but others lurked in the directory listing that I'm pretty excited to download, reverse engineer and learn from and about in the coming weeks.
He started with a config utility that lists the WiFi hotspot SSIDs and lets you join one by supplying a username and password. It looks pretty straightforward, although I'd love to rewrite this application for C64 OS. The great thing about the hardware is that its network connection is independent of the C64. Once you use the configuration utility to get it online, it will remain online and even survive the C64 itself being reset without dropping the connection. This is very convenient because it allows you to hop from one program to the next without needing to reconnect to the hotspot.
The other two programs he walked us through are IRC Chat 1.5, and an FTP client. We used the IRC Chat program to get on #C64friends right then and there. While I appreciate the nostalgic angle of using QuantumLink RELOADED, using a dedicated IRC Chat program is much more straightforward. He then walked us through connecting to Bo Zimmerman's own FTP server using the FTP program. We downloaded a program and it worked great. The most astonishing fact about these programs, get ready for it, is that they were written in BASIC! The BASIC and KERNAL roms of the C64 include built–in support for RS-232 communications (up to 2400 baud) over the UserPort.
A few caveats: yes, these programs, and these modems, are kind of slow. The common UserPort WiFi modems can actually support 9600 baud, which is a lot better than 2400. But if you've been used to using a 33.6K or 56K modem over a Turbo232, these things are gonna feel pokey. There is no real reason, however, why this WiFi modem dingus could not be connected via RS-232 to an adapter on the expansion port. Also, these particular programs, which feel a bit more like proofs–of–concept than truly end–user friendly tools, were written in BASIC, and BASIC and the built–in KERNAL routines only support up to 2400 baud. It will be most interesting to see what sort of speed can be accomplished with a native TCP/IP stack and a 64NIC+ or RR–Net ethernet adapter.
Near the end of all that Jay Hamill was helpfully showing us, we got called away for another talk in the presentation room that was about to begin.
This talk was called Annals of Digital Archaeology. It was given by a gentleman named Zbigniew Stachniak, who described the process they underwent to uncover the lost secrets of an early home computer company that was founded in Toronto in the 1970s.
This company was so on the bleeding edge of what it even meant for computing to be done by home users that the researchers wanted to explore its early software to see what ideas it contained. Unfortunately, so much of the format and the documentation has been lost to time that uncovering what they could was really quite akin to archaeology. They started by digitally recording the analog contents of the audio cassettes and analyzing the waveforms to figure out what encoding scheme was being used, so that the original information could be extracted.
It was a bit of an unusual talk, but as I sat and listened, I marveled at how, here in a hotel basement, people have peacefully gathered together and are sitting quietly watching a man give a presentation to which he had clearly devoted a great deal of time. It is so genuinely refreshing to take part in something that is so real. Our world has become annoyingly fixated on what is next, what is better, what is faster, and often regards just last year's technology with dismissal and disdain. Yet here we are, gathered together, watching with interest and curiosity to learn about the clever ingenuity, pragmatic decisions, and inevitable pitfalls and missteps of engineers who were pioneering personal computing a decade before I was even born. It's fascinating.
I grabbed a panorama of the presentation room for this talk.
Unfortunately, the Sunday show ends at 3pm. It gives everyone who has traveled some distance a chance to get back on the road. I sat around for a half hour or so catching up with Dave Ross before he had to get a lift up to the airport to fly back to Boston.
Our stories are not too different. We both sort of fell out of the community for a few years. We both got married, and we've both got toddlers running around at home. And here we are, years later, a few more grey hairs in our sideburns, hanging out at a Commodore Expo again. We both agreed that something over this past decade changed.
In the 80s our Commodore computers were on top of the hill, at the peak of their commercial popularity. In the 90s, we struggled to keep our machines competitive with a rapidly developing PC landscape, losing community members like Big Blue was swatting flies in the window of a summer farmhouse. By the start of the 2000s we had to come to grips with the fact that we had lost the war. And there was a widening gap between the old guard and the youngsters. The old guard was a progressively aging group of people who didn't want to move on, and still wanted their beloved Commodores to be productive. The younger crowd, more and more, just wanted to make retro games, go to demo parties, and stay up all night drinking, hacking and coding. We both left the scene somewhere around this time. But in the intervening years, something happened with PCs, and the rise of the iPhone and other mobile computing appliances: they became beautiful to behold, working flawlessly as if by some magic dust that's been sprinkled inside, yet hermetically sealing away even the faintest trace of anything electronic.
PCs won the war. But they won the war so hard and so completely, that they came out the other side as something entirely different. They are not really even computers any more. They're just iPads and smartphones. Apple has this ad they recently released showing a young kid living and breathing his iPad Pro and Apple Pencil. But when his mom tells him to put away his computer he innocently asks, What's a computer? Exactly. What is a computer? Commodore 8-Bits, and other platforms of its era, have found a new life, as computers. Things you can dig into, and get your hands dirty with, and learn about in all their intricate technical details. It's about falling in love all over again. That sense is palpable at World of Commodore. We are living in the retro computing renaissance. And it is wonderful.
Before we left the hotel, I kept pickin' the brains of Leif, and Ian, and Jim and Eric, about how NTSC and PAL video signals are encoded, and how computers are timed around their delicate requirements. Then we shot the video of the QuantumLink RELOADED review, as embedded above.
I tested a 1541-II from JPPBM as working, and bought it. Then I splurged and bought my first PAL C64c, an Australian model. I can't wait to finally be able to view those PAL demos I've been deprived of all my life. I also bought a couple more Abacus books, one on graphics programming for the C64. I'm sure that'll come in handy throughout this year as I continue to work on C64 OS.
We packed our cars and made one last rendezvous at Master Steaks, a grungy diner just off the 401 highway, where we ate some deliciously tasty chicken souvlaki served on cafeteria trays and carried on the technical discussions with Joe Palumbo, Leif Bloomquist, Eric Kudzin and Jim Mazurek. When we were finally plump and satiated, and well caffeinated for the long drive home, we shook hands, wished each other well, and headed off in separate directions into the dark and frigid Canadian winter night.
See you all next year my friends. It's been a pleasure.
- The BANANA prompt was a fun trick he showed us: first copy the contents of the BASIC ROM into the underlying RAM, then switch out the ROM for the RAM, and poke the letters "BANANA" overtop of the string constant for "READY." at $A378 [↩]
Welcome back C64 and 6502 fans! This is my 50th blog post at c64os.com. Today will not be a highly technical post, but I have a lot of small things to talk about.
World of Commodore 2017
It's that time of year again. World of Commodore 2017 is just days away. Read about it here. I had a wonderful time refamiliarizing myself with everything that's new in the Commodore world at this show last year. It is sure to be a blast.
Unfortunately I can't make it on the Saturday, because of family obligations. But I will be there on Sunday, as this is a two day event. I'm not sure what I'll be bringing. I've been hard at work through the whole year, but nothing I've been working on is really ready. C64 OS has seen lots of work and now consists of lots of code, but there isn't much yet to demo. And C64 Luggable, which I'll likely bring with me, is not fully finished. But, I might bring it anyway, just to show people my progress. I know at least a couple of people are interested in where it's at.
Hopefully someone will notice that I've put a lot of work over the course of the year into building this website. In the last couple of weeks I have done a fair bit of work touching up the styles. I've normalized the fonts and the hierarchical usage of H1/H2/H3/H4.
I have also put quite a bit of work into getting the whole site to be responsive. This involved redoing the top navigation completely. This is now done entirely in CSS, no more graphics for those little bubble buttons. This allows the top navigation to scale smoothly down to mobile phone sizes. And in all the pages with a left side bar, WELCOME, C64 OS, C64 LUG, and WEBLOG, the side bar now gets out of the way on mobile devices, letting the content take the full width. You can call out and dismiss the side bar with new buttons that appear above the section title.
I want to give a shout out1 to my longtime friend and colleague, Brad Marshall, for his help, suggestions and patience with me, on some design and CSS questions to get this site more responsive and looking a bit snazzier. He's a talented web designer, and you can read more about him and his work on his own blog, bmarshall.ca.
The popularity of the site is growing, as I promote it on Twitter and IRC and move up the ranks in the search engines. I'm now getting hundreds of unique visitors a month. So I feel pretty good about that. I've only got ~230 followers on Twitter, so it helps a lot when people with a couple of thousand followers retweet my tweets that promote a new blog post, or a new feature in the Buyer's Guide. Thank you to everyone for that!
As for the C64 OS Technical Docs on this site, I have not put nearly enough time or effort into them. That is something I intend to put a lot more energy into in the coming months. One reason for my reluctance has been the ongoing and active development of the project itself. Unlike a weblog, the C64 OS Technical Docs are a living document, meant to reflect the current state of the project, not a chronological log of how things are changing.
As it stands, whatever is in the Tech Docs is pretty out of date already. The good news is that development of C64 OS continues apace. I feel like I've got the modularity of it well worked out, assembling multiple modules that plug together and are able to call each other's routines has never been easier. This lets me focus on how to organize the code. And there is still a fair bit of flux involved here. Big projects are in large part, in my opinion, a matter of good organization.
It's quite a trip to be coding native. Many tools that would ordinarily be available to a programmer on a modern platform are simply unavailable. I continue to learn about what the tools I'm working with can actually do. I'm almost ashamed to admit this, but I didn't realize that Turbo Macro Pro has block operations! How could I have lived without them? Well, I'll tell you, whenever I needed to rearrange large blocks of code, I would export my code as text and open it in novatext. That's the surprisingly–pleasant–to–use 40 column text editor that comes bundled with Novaterm 9.6. It has block move/copy operations that make it convenient to search/find and rearrange large sections of text. Then I'd save it back to the text file, and reimport into Turbo Macro Pro.
As it turns out, TMP+ has block operations built in. This has greatly sped up development. Let's just call that special Commodore back arrow "escape", since it's in that spot on the keyboard. ESC-m activates the mark submenu. Press 's' to set the start mark, and 'e' to set the end mark, thus defining a block. Move the cursor to somewhere new and press ESC-b to activate the block submenu. Press 'c' to copy the block to the cursor's location, or 'm' to move it. You can also delete a whole block at a time.
There are some other things you can do that take time to become second nature. For example, a block can also be written to disk as a text file. My new practice is to write a block to disk with a standard filename of "@:clip.t". The @: tells the DOS to overwrite any existing file with the name clip.t. Thus, clip.t becomes a sort of generic clipboard: the latest clip always overwrites the previous clip, just like on a modern computer. Getting used to this convention makes it much faster and easier to copy and paste between source code files, as there is less mental overhead in coming up with a name, and then remembering what I named it. Good times.
Lots more to say about development on new areas of C64 OS, but I'll leave those for another post.
While I'm discussing my projects, I may as well give an update on where I'm at with C64 Luggable. The progress documentation on C64 Luggable is not fully up to date with where I'm at. But there is one advantage to documenting a hardware project: what has been completed tends to stay the way it is, in a way that is not true of a software project.
The front panel is almost fully complete. I had to order a second set of DB9 breakout boards, because the first set was problematic in a number of ways. As for the audio amplifier, it turns out I got two of those as well. The power jack is fully mounted, the speaker holes are finished, and the front metal plate has been dremelled out.
The back hinged door has been cut, and the hinges have been put on. So the door now officially opens and closes. I've purchased the paint and have started to apply the white base/sealer to some pieces of the chassis. One concern I have is that, due to the size and shape of the audio amplifier, once I fully screw on the front part of the chassis (with the speaker holes, and the cut out for the front I/O board), I will no longer be able to pull the front I/O board out. Even though it has been mounted to the bottom chassis in the same way as the rear I/O board.
This is truly unfortunate, because it would have been very convenient to be able to pull out and swap out the entire front I/O board after the main chassis is fully assembled. This has made me reluctant to put the front part of the chassis on, because I have to be certain that everything is as I want it to be first. And I can't start painting the outside chassis in earnest until the whole chassis is actually together.
I would have loved for it to be all ready to go for World of Commodore 2017. However, after working on the project for a year, I don't want to rush the end of it and do a poor job just to meet this deadline. I'll probably show it as it is, and next year, for sure, it will be fully together and working.
I do have a couple of small surprises about the project, which I'll show off at World of Commodore. Nothing big, but a couple of nice touches.
Commodore 8 Bit Buyer's Guide
Lastly, I just want to give an update on the Buyer's Guide.
I am super pleased with the Buyer's Guide. I just wrote about the recent updates to the guide in the introduction to the post 3D Printed Components. So I don't want to repeat myself so soon after. I'll merely point out a few new things.
The recent responsive site design changes extend to the Buyer's Guide as well. I continue to put effort in to improve the responsiveness and more changes are coming soon. The main table of contents is now responsive, changing to 2 columns of 6 categories at one breakpoint, and changing to one column at the smallest size.
The fonts in the guide also resize to make them fit better, and the overall grid layout of the items in the guide has been able to smoothly scale since the beginning.
Making the features pages responsive is under development. But speaking of features: as I tweet out random products in the Buyer's Guide, I'm adding features. In order to add a little bit of dynamic feedback to the guide, the From the Editor letter near the top of the Buyer's Guide now tells you how many feature pages are available. This is read in realtime from the file system that hosts the features files, so it's always up to date. At the time of this writing it's at 11 features. Additionally, I've added a view count (yes, the good old page counter) to each feature page so I and everyone else can get a simple metric on how popular some of the products are. I am hoping that seeing these numbers and watching them grow will be an inspiration to me to keep writing.
In writing features, I realize that there are some products which I know absolutely nothing about and have not been able to find out anything about by searching the web. This has encouraged me to reach out to the vendor or the product's creator to get more information. For example, Tim Harris from Shareware Plus has actually mailed me a paper copy of the user manual for one of his products, which I will be digitizing in order to have some documentation to link to.
I would call this a net win for the Commodore community.
That's all for today. Hopefully I'll see you at World of Commodore in Toronto, this Saturday December 9 and Sunday December 10, 2017.
- I hate this expression, but it is a thing in our language. https://www.urbandictionary.com/define.php?term=shout-out [↩]
Following up on my technical deep dive, Anatomy of a Koala Viewer, I've been thinking about how one could go about making improvements to the viewer and its UI, or lack thereof.
My thinking about this is completely aside from work on C64 OS, except that everything one learns helps in developing other projects. And I am a C64 user after all, I'm not going to just live the rest of my life inside the bubble (an OS environment) I'm creating for myself.
I was thinking about how command line interfaces normally work. Usually the shell program offers a single line editor. And whatever you type into that line, when you press return, is parsed and interpreted by the shell. Typically the first word is the name of a program and subsequent words and symbols are meant to be arguments given to the program. The shell has a path environment variable that defines a set of file system paths to be searched through when looking for the program.
Presuming that the program is found, it is loaded and run automatically, and it has programmatic access to all the arguments. Shells do get more sophisticated, but we don't need to worry about that for now.1
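That two-step process, parse the line, then search the path for the program, can be sketched in a few lines of Python. This is just an illustration of the idea, not any real shell's implementation; the path list here is a made-up stand-in for a PATH environment variable, and real shells also handle quoting, globbing and much more:

```python
import os

def parse_command_line(line):
    """The first word names the program; the remaining words
    become the arguments handed to that program."""
    words = line.split()
    return (words[0], words[1:]) if words else (None, [])

def find_program(name, search_path):
    """Check each directory in the path, in order, and return the
    first matching executable file, or None if there isn't one."""
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None

prog, args = parse_command_line("view pic.koa 8")
print(prog, args)   # "view", with ["pic.koa", "8"] as its arguments
```

Searching stops at the first hit, which is why the order of directories in a path variable matters on real systems.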
Think about how the "command line interface" works on a C64. It is actually the BASIC interpreter. You are free to start entering a program, simply by putting a line number at the start of a line you type out. But all immediate mode lines, those lines you enter without a number at the beginning, are constrained to a limited set of BASIC commands. BASIC 2.0 has 71 commands, but only a subset of those can be used in immediate mode.
How do you load a program? You use the LOAD command. And it takes a fixed set of arguments: a "filename" (this argument can include a partition number and directory path), then a device number and a relocate-suppression flag. All of these arguments can be omitted, depending on whether the defaults match what you're after. (LOAD by itself will start the computer loading the first file it finds on the tape in the datasette.)
There are extensions to this command set that can be made, both with software and hardware. For example, when you're using an IDE64, you can directly issue "CD" and "MKDIR" and so on. That's because the hardware adds those commands to the BASIC language. JiffyDOS does the same thing to add its commands: * for transferring files, @ to read the current device's command channel, / to load relocated, etc. But none of these extensions allow you to pass arguments directly to a program that was just loaded.
BASIC Wedge Programs
One reason you'd want to add an argument to a program call is the simple use–case of being able to pass a filename to that Koala Viewer. All the viewer does is show a Koala image, but you have to manually load the data yourself. And you have to load it using the correct flags, or the correct JiffyDOS load command, so it goes to the right place in memory. Then you run the program, and it assumes you've put the correct format of data into the correct place in memory. Otherwise, you just see crap on the screen.
It works, but it's as brutal as you can get.
There are lots of multi-color image file formats, as outlined in that post. They are all very similar, but different enough that the viewer program needs to know which ranges of memory hold the color data, and the bitmap data, etc. The Koala Viewer program, however, has no way of knowing what data was loaded in, (or even if any data was loaded in) because it has no concept of files. The file loading is handled entirely by BASIC and JiffyDOS, which just puts raw data into memory. That Koala Viewer in fact takes no input whatsoever.
I have a different idea.
How about this: you load and run a program, let's call it "image viewer". You do this the standard way. If you have JiffyDOS: £ IMAGE VIEWER. Or if it's off in some other place, then maybe like this:
That sort of thing is pretty common to load a one file program, from a CMD device for example.
This command loads the program into memory according to its header address, so maybe it's assembled to $C000 or somewhere else convenient. And it's jumped to immediately to run it. The program has put assembly language routines into memory, and it has been run, which wedges the BASIC vectors to add new BASIC commands, but it doesn't need to implement a UI. It doesn't need to show you a full screen menu, or anything like that. It just needs to add one or more commands to BASIC.
BASIC already provides a means of inputting lines, and your code has the ability to intercept the interpretation of a BASIC line. I am going to reserve the technical details of implementation for a later post, when I've actually got the tool written. But the short explanation is that the ability to wedge BASIC is done by redirecting a series of vectors to your program's own routines.
By simply returning to BASIC, you continue to use BASIC (and JiffyDOS if you have it) to navigate around the file systems of the devices you have hooked up, and to view directories the way you always would from BASIC, such as by pressing F1 or with @$. When you want to view an image file, you would use a new BASIC command that the loaded image viewer has wedged in. In this case, maybe just the keyword "view" would be suitable:
Besides being able to just use BASIC to navigate your drives and their file systems, the advantage here is that the program gets to see the filename. It can load the data in, according to how the data should be loaded, even if that still means loading it directly to where the file itself wants to go (as odd and archaic as that seems in the context of a modern OS.) But, because it sees that file extension, one image viewer program can handle multiple formats, like Advanced Art Studio, Amica Paint, CDU-Paint, Draz Paint, Koala etc. Or it can even handle other formats besides multi-color.
This presumes the files have standardized, identifiable elements in their file names. As long as the type is identifiable from some feature of the name, the program can look for that feature to determine the file type.
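Since the program only needs some identifiable feature of the name to pick a decoder, the lookup can be sketched in a few lines. This is an illustrative sketch in Python, not C64 code; the extension-to-format pairs are my own assumptions, not a definitive list of what a real viewer would support.

```python
# Hypothetical mapping from filename extensions to C64 image formats.
FORMATS = {
    ".koa": "Koala Painter",
    ".ocp": "Advanced Art Studio",
    ".ami": "Amica Paint",
    ".cdu": "CDU-Paint",
    ".drz": "Draz Paint",
}

def identify_format(filename):
    """Guess the image format from a feature of the name (here, the extension)."""
    name = filename.lower()
    for ext, fmt in FORMATS.items():
        if name.endswith(ext):
            return fmt
    return None   # unknown: the viewer would report it can't handle this file
```

So `identify_format("IMAGE.KOA")` yields "Koala Painter", and the viewer knows which memory ranges the loaded data should occupy.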
One complaint about command line programs is their inherent lack of visible UI. Sometimes it can be hard to remember how to use them. You load and run an image viewer as I'm describing, and instead of getting a menu or an input prompt, you get nothing. You're just back to the regular READY prompt.
Two solutions to this come to mind. The first is to immediately print out a simplified help text. Even if it's just a one–liner like:
USAGE: VIEW "IMAGEFILE"
That would get you pretty far down the line to knowing how to use it. The second would be to have the program wedge in a minimum of two commands: the main command to do whatever the program is designed to do (i.e. "view") and also a standard "help". If every program that used this model implemented the "help" command, you could type help at any time to reveal A) whether such a program is loaded in; a syntax error tells you that the program hasn't been loaded yet. And B) it reminds you of the commands and any extra arguments or syntax. The explicit invocation of help could also show extended help, beyond what the initial one–liner showed.
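The two-command convention (a main command plus a standard "help") can be modeled abstractly. The sketch below is Python modeling the dispatch logic, not actual wedge code; the Wedge class and its names are hypothetical.

```python
# Model of the convention: every wedge program registers its main command
# plus a standard HELP, and unknown lines fall through to a syntax error
# (in the real thing, they'd fall through to the normal BASIC interpreter).

class Wedge:
    def __init__(self, name, handler, usage):
        self.commands = {
            name: handler,
            "HELP": lambda arg=None: usage,   # HELP is always available
        }

    def interpret(self, line):
        """Intercept a line before the normal interpreter sees it."""
        parts = line.strip().split(None, 1)
        cmd = parts[0].upper() if parts else ""
        if cmd in self.commands:
            arg = parts[1] if len(parts) > 1 else None
            return self.commands[cmd](arg)
        return "?SYNTAX ERROR"   # no wedge loaded that knows this command

viewer = Wedge("VIEW", lambda arg: f"showing {arg}", 'USAGE: VIEW "IMAGEFILE"')
```

Typing HELP when no wedge is installed would give `?SYNTAX ERROR`, which is exactly the signal described above: the program hasn't been loaded yet.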
How about exiting? What do you do to "exit" this type of BASIC wedge program? Well, to me there are two kinds of exit to consider. The first kind is how to get back to the READY prompt after something like the image viewer takes you to a full screen bitmap. Even though the Koala Viewer accepted slapping the space bar, somehow I managed not to notice that for quite a while. I tried pressing many different keys and then just gave up, assuming there was no way to exit.
In hindsight, the space bar is the obvious key, but in my opinion pressing the run/stop key should always work. It is, after all, the "STOP" key, Commodore's equivalent of Escape. "RUN" is supposed to be the function of the key when pressed in conjunction with SHIFT. But on further thought, why shouldn't any key return you to the READY prompt?
The second kind of exit is, how to unwedge the program from BASIC? Or, how to move on from the image viewer to a different kind of program? For the most part, you don't really have to explicitly unload such a program. Pressing run/stop-restore will probably reset the BASIC vectors and thus unwedge the commands. And otherwise, the program can remain resident.
If you load a different kind of BASIC wedge program, it may well overwrite all or part of the first one, but then it'll update the BASIC vectors to point to its own routines. So there's no explicit need to get rid of the first one.
UPDATE: November 27, 2017
Pressing run/stop–restore does not reset the BASIC vectors. I tested that last night. However, doing a jump to $E453 (SYS 58451) will run the KERNAL's routine that restores the BASIC vectors to their hardware defaults. This is better than manually setting them, because different KERNAL ROMs have different vector defaults. For example, JiffyDOS points two of those vectors to its own routines. One could do this SYS manually to unwedge an installed program.
But here's what I'd suggest. When a BASIC wedge program is first run, the first thing it should do is JSR $E453 to reset the vectors, this will unwedge any previously installed program. Then, it should copy the vectors to its own variables and change the vectors to point to its own routines. But, in its routines, when forwarding control to BASIC, forward through the backed up vectors and not to hardcoded addresses. This will maintain support for the special features of alternative ROMs like JiffyDOS and its command wedge.
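Modeled in Python (the real thing is of course 6502 assembly manipulating the vector table starting at $0300), the install pattern suggested above looks something like this. The routine names are illustrative; IGONE is the real name of the BASIC execute-statement vector at $0308, the one a command wedge typically redirects.

```python
# Simulated vector table, to illustrate: reset vectors to ROM defaults
# first, back them up, then install handlers that forward through the
# backed-up copies (not hardcoded addresses), so a JiffyDOS-style ROM
# keeps its own hooks.

ROM_DEFAULT = {"IGONE": "basic_execute"}     # what JSR $E453 would restore
vectors = {"IGONE": "old_wedge"}             # some previous wedge installed

def restore_default_vectors():               # stands in for JSR $E453
    vectors.update(ROM_DEFAULT)

def install_wedge(my_handler):
    restore_default_vectors()                # unwedge any previous program
    backup = dict(vectors)                   # copy vectors to own variables
    vectors["IGONE"] = my_handler            # redirect to our routine
    return backup                            # forward through this later

def my_igone(line, backup):
    if line.startswith("VIEW"):
        return "handled by wedge"
    return backup["IGONE"]                   # pass unknown lines onward
```

The key design choice is in the last line: control is forwarded through the backed-up vector, never to a hardcoded ROM address, so whatever the ROM had installed (JiffyDOS or otherwise) keeps working.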
In my mind, this program model of wedging in some BASIC commands works really well for the Koala Viewer. It's a simple one–purpose program, that takes at least one argument. I can immediately think of several other programs that fit the same description that could benefit from the same paradigm.
How about an unzip utility? As PC/Mac command line unzip/gunzip shows you, you really don't need a full UI. Just load in the tool, it wedges in a command "unzip". (And help.) You navigate to a directory where a .zip/.gz file is. And enter:
It uses the .gz or .zip extension to know what kind of archive it is. It just immediately starts unzipping the contents and writing them to the same output device and path as the original file. As it finds files in the archive, it can print out the name of the file. When it's done, you're just back at the READY prompt, and can immediately call the command again with a different filename.
How about a background SID file player? Load in the program, and it wedges three commands: PLAY, PAUSE and HELP. The PLAY command can take two arguments: filename and maximum number of seconds to play back. 0 could mean play to the natural end of the song (presuming the natural end can be found.)
That loads in the file, maybe it tells you where in memory it was loaded to, wedges in to the IRQ the necessary routine to play the file, but ultimately returns you to the READY prompt with the file still playing. Want to stop the music?
This command could simply unwedge the IRQ routine. Would it be compatible with every SID file? No, it would probably conflict with some of them. But on the whole it would be pretty cool.
There are side benefits to wedging these sorts of programs into BASIC. Want to make a playlist? Want to make a slide show? Just write a BASIC program that loops reading file names out of data statements, and calls the custom command. You have to load the wedged program in before loading in the BASIC program, but it'd be like this:
10 FOR I=0TO2
20 READ A$:PLAY A$,180,1:NEXT
30 END
40 DATA "SONG1.SID","SONG2.SID","SONG3.SID"
Oops, there's a gotcha here. When inside a program like this that loops, you'd want the PLAY command not to return until the song is finished. That's easy to handle. The PLAY command can take an optional third parameter. That's what the extra ,1 does after the number of seconds of playback.
You could do the same thing with an image viewer. Let BASIC work as a script language to move from image to image.
10 FOR I=0TO2
20 READ A$:VIEW A$:NEXT
30 END
40 DATA "IMAGE.KOA","IMAGE.HED","IMAGE.OCP"
Run it, you're looking at the first image, slap the long one, and boom, the next one loads in, etc.
I think it's a great idea. And to that end, I've been reading Advanced Machine Language for the Commodore 64 by Abacus Software. It devotes the whole 3rd section of the book to hacking/wedging BASIC with your custom machine language programs.
I've learned a lot so far, and a few neat things about BASIC that I never knew before. So when I get a bit further into it, I'll write another post describing my findings, and how it works out.
- Shells get more complicated because the output from that program can be streamed into either a file or as the input of another program. That's what makes a unix command line so powerful. [↩]
It's been over a year since I started this blog, and this is my 48th post. And I'm still feeling like I've got a lot to talk about. So let's hope I can keep it up.
The Commodore 8 Bit Buyer's Guide has recently undergone a pretty major update. And I'm trying to produce a new Feature Page, one for each product, every business day. Sometimes that's not possible, because on some days I've already got a super busy schedule and simply can't fit it in. But that is the aggressive schedule I'm trying to hold myself to. In order to keep myself motivated I'm tweeting (#8bitbuyer) every day about one product, pulled at random, from the guide. Whatever product I tweet about, that's the product for which I write the feature page. You can follow this stream on Twitter here: https://twitter.com/hashtag/8bitbuyer
The guide is now up to 175 products, projects and kits, up from just over a hundred. And these come from 25 vendors, up from just 15. It isn't that this many products just spontaneously sprang into existence, but that I keep digging deeper and expanding the set of available products to buy.
For example, I dug into the catalog at Retro Innovations, and pulled out cables, adapters and rom chips, and have listed them. I discovered Poly.Play and all the stuff they carry and listed that. And I discovered some older projects like the 64JPX adapters that are still commercially available and I listed them. Among numerous others. If it's available, if you can pull out your credit card, place an order and have a thing for your Commodore 8-Bit, then I want you to be able to find it in the Buyer's Guide.
There are some caveats to that. I don't want to list used products of limited stock that people are selling off from their personal collections. You can use eBay for that sort of thing. If someone came into the possession of 25 or more Commodore REUs, but they were technically all used, I'd probably still list them. What I'm saying is that, as the editor of the guide, it's a judgement call what I think will benefit people to be able to find.
All the above said, I had a tough time with deciding what I should do about the relatively new phenomenon of 3D printing.
By their very nature there is no stock of 3D printed products waiting to be sold off. Someone somewhere has designed a 3D model and uploaded the file to a publicly accessible catalog. And there are ways you can turn that into a product you can hold in your hand. I'll get into the details of how to do that below.
One problem that I wrestled with is that there is a virtually unlimited number of such 3D printable products, and their lack of stock means once they're in the catalog, there seems little reason to pull them from the catalog. And, doesn't that present a philosophical problem? Won't simple 3D printable components eventually overtake everything else? This was my concern.
Someone on IRC suggested some 3D printable products and gave me links to items on thingiverse.com. I decided to peruse Thingiverse for everything I could find that is C64 related and just see what I could come up with. What I found was about 12 or 13 things. But I immediately didn't see the utility of listing a couple of them. They simply fell below my radar of what I thought would provide enough value to merit a place in the guide. And I reduced the list to just 10 things.
The philosophical problem is not yet a real problem. There are in reality only 10 things, not the theoretical infinite number of things. And the truth is, I am a person with judgement, and if I feel that the 3D printed options are getting a bit out of hand and not providing the right value, I'll cull them as the need arises.
I had one other problem though. Thingiverse is not a supplier. They're just an online catalog with pictures and related–links, and they host the 3D model files. My concern for the Buyer's Guide is that I want you the Commodore enthusiast to be able to get your hands on the products in as practical, straightforward and realistic a way as possible. So, let's get real, how do you actually get the real, physical, 3D printable component?
How to Acquire a 3D Printable Component
As it turns out, it's a hell of a lot easier than I thought it was going to be. I was pleasantly surprised. And I'm glad to be able to pass on this news and the details of my experience to you.
The Commodore 8 Bit Buyer's Guide provides a link to the component's page at thingiverse (or other online 3D printing catalog). Like this: https://www.thingiverse.com/thing:1561772 Which takes you to a page that looks something like this:
You don't have to sign in, you don't even need an account, you don't need to give them your email address. You can just click that blue "Download All Files" button at the top of the rightside column. It downloads a zip file that contains a standard structure: a couple of text files for readme, licence and attribution information; a folder called images with a couple of rendered pictures of what the item is; and a "files" folder which contains the actual model file or files. In the case of the C64c Keyboard Brackets, there are actually two components, the left and right brackets. So this zip package has 3 files: one for the left bracket, one for the right, plus a third that contains the models for both brackets together in a single file.
Next, you need to find a source for printing the model files. Thingiverse recommends two websites that help you find a source and connect you with that source. I picked one of them at random, treatstock.com. I can't speak to how the other works, but my experience with treatstock.com was seamless and easy.
You do have to make an account with them, because you're going to be placing an order, so your account will maintain your order history and also facilitates communications between you and the source that actually prints your item. Fortunately, creating an account was dead simple. Basically just a name and email address. You don't have to provide any credit card information and creating an account is free. They get their money by taking a cut of whatever you pay when you decide to actually place an order and get an item printed up.
TreatStock makes the process of ordering incredibly easy. There are buttons on the main page that say, "Order 3D Print." That's all you have to do to get started. Their main page also has a simple overview of the entire process end–to–end. Here's what they say:
- Upload Model
  - Upload your files
  - Find 3D models in our catalog
  - Hire a designer to help bring your ideas to life
- Select a Service
  - Choose from over 100 materials and colors
  - Compare prices, reviews and locations
  - Place an order with the best service for your project
- Order Delivery
  - Track the production process of your order
  - Our friendly customer support team is always here to help
  - Fast delivery within 5 days with free shipping insurance
It is in fact even easier than they describe. Step 1, upload your model. You don't need to find a model in their catalog, and you definitely don't need a designer to help you bring your own ideas to life. The 8 Bit Buyer's Guide already gave you the page whence you downloaded the model files for the item you want. Uploading is a snap; in my case, I uploaded just the one file which contains both left and right brackets.
Selecting a service is also very easy. Once you've uploaded the file, they know what its dimensions are. Some services are only capable of producing items within a limited size range. The services that can't produce your item are automatically removed from the list.
There is a choice of materials, which is initially a bit daunting, but they do have a simple description beside each choice to tell you how they differ. And they are generally ordered by price, least expensive first. I went with ABS Plastic: second cheapest, very durable, and doesn't biodegrade. You can also choose your color. The C64 parts are mostly monochromatic, so whichever you choose, the whole thing will be produced in that one color.
There is a small gotcha to consider. Varying the material and color changes the set of available sources. The reason for this should be obvious: some sources simply lack the ability to print in certain materials or to produce certain colors. After toggling through a variety of color options, I discovered that in ABS Plastic, with some colors, there was a source available right here in my home town of Kingston, Ontario. The next nearest sources were in other Canadian cities: Ottawa, Montreal, Toronto, Vancouver, etc. Especially because these are internal components, I would gladly forgo some color options for the ability to have them printed locally, where I could drive to pick them up.
I placed an order with the local Kingston source. It cost $10.85. A tad pricey, but not outrageous. We are after all now talking about production runs of one at a time. Before 3D printing, making one thing would be prohibitively expensive for all but companies that were prototyping in advance of mass production.
Payment was handled via my pre-existing PayPal account. So that was a snap. The site then tells you when the source has completed producing the part. I expected it would take a few days, but the source got back to me before the end of the business day. I used the site to send him a message, saying I'd come pick it up after work. TreatStock gives you the address and renders it on a map to show you where the source is located. Zooming in on the map, I discovered that the Kingston source I used is just a house, in the middle of a new residential subdivision! It was just a 10 minute drive up the road.
When I arrived, I went up and knocked on his front door. The parts were ready and waiting. And I didn't have to come with cash because they'd already been paid for. The guy just handed me a set of keyboard brackets for my C64c. Amazing. It's the next best thing to having a replicator. Given the absolute ease with which I went through the process on my very first try, I am more confident than ever recommending printable 3D components in the Buyer's Guide.
3D prints are pretty cool. You get a thing, and it's like it just comes out of nowhere. For designers, it must be totally amazing.
However, 3D printing is not perfect. If you're expecting to get what you would get from an injection mold, you'll be disappointed. The printer prints in layers. And the layers are stuck together using some chemical magic that I'm not even trying to understand. The parts are sturdy, there is no doubt about that. For internal components, such as these brackets, there is virtually no downside. They're the right size, they add a splash of color when you open the chassis, and they are more than robust enough to hold up your keyboard.
I'm a bit more reluctant to recommend 3D prints for enclosures. Simply because the quality is below what we would normally expect. However, if that doesn't bother you, and if you'd rather have a 3D Printed case than no case at all, then by all means, that's why people have taken the time to design those models.
Feel free to ask me any questions, either in the comments on this article, or in the comments on any of the feature pages that are being built out for these components.
After 10 years away from the Commodore scene, the most shocking part of coming back was not the speed of a 1 MHz clock, or even a 320x200 16-color display. The most shocking part is the essential interaction model. And this was my inspiration to start working on C64 OS. GEOS, as I've mentioned in many of my posts, comes much closer to what you would expect from a modern computer. But even it has all kinds of oddities that show its age. And unfortunately, its fully bitmapped UI is just too damn slow to get used to. Using a C64's READY prompt to do ordinary computing tasks is truly a blast from my adolescent past. But returning after years away and learning all over again how it works, it feels like a totally foreign world.
Nothing better exemplifies this than my most recent experience working with koala image files and a koala viewer program. So I'm going to deep dive on exactly what I mean by totally foreign and discuss how C64 OS will work to make the whole model more modern.
What is Koala?
Koala was a suite of drawing technologies for a number of computer platforms in the 80s. You can read all about it on this Wikipedia article. It was developed by a company called Audio Light. The suite consists of the KoalaPad which is a drawing tablet and KoalaPainter which is an accompanying art/graphics program. The program, on a C64, works with Koala format image files. The KoalaPainter program can load existing files, which you can then edit using a wide range of tools like fills, boxes, lines, ellipses, and brushes with different patterns. Then you can save your work to disk, again in the Koala image file format.
That's great! It actually sounds pretty modern when described like that. It sounds like how photoshop works. The quality of the images and assortment of tools, of course scaled for the age of the machine, but it sounds like a pretty standard modern way that a computer works. You have an application, which you load and run, it has a graphical user interface, you pick a file from a directory listing from a harddrive, it loads the file in, you work on it, save it back to disk. You can later copy the image files around individually, back them up, share with your friends, upload to the internet (or to a BBS back in the day), and so on. Very modern.
As with any image file format, especially if the files are going to be distributed, the people who receive the image are most likely not artists with intentions of manipulating the image. They are normals like us who just want to look at the beautiful artwork that has been produced by others far more talented than we are. And so to view the image files it doesn't make sense to have to load the entire Koala graphics editing program. Not to mention the fact that the original full graphics editing software likely cost money, as well it should.
What you want then is to have a free viewer that is small and quick to load, which can display the image files created by the full editor to people who just want to look at them. Again, though, this is a very modern concept. You don't have to own photoshop, nor launch photoshop, to look at an image file that was produced by it.1
I like to use a Mac to convert images (JPEGs, PNGs, etc.) to Koala format, (exactly what that format is I'll mention below.) And I also plan to have a network service which will fetch images from URLs and convert them to Koala format on–the–fly for C64s to be able to browse the web, in a more meaningful way, via proxy. A viewer is therefore far more important to me than the original KoalaPainter program. And so I found a simple koala viewer online. It's just 4 blocks (~1 KB) on disk. But… how do you use it? How does it work? Where does the modern end and the antiquity begin?
How is a Koala image formatted?
First, let's talk about the graphics formats we're all accustomed to. When you examine a JPEG, a PNG, or a GIF, you actually find that the internal structure and layout of the data on the disk—even when they represent the same picture on screen—is radically, irreconcilably different from format to format. Why is that? Well, there are proximate reasons and ultimate reasons. I like thinking in terms of the latter. The ultimate reason is that the graphics capabilities of modern video hardware long ago outpaced the increases in storage and load speed from hard disk or network. I'll explain.
A MacBook Pro, today, has a pixel resolution of 2560x1600, and each pixel can show at least 24 bits of color. 8 bits of red, 8 bits of green and 8 bits of blue for every pixel. That's 3 bytes of data per pixel. 2560 times 1600 is 4,096,000 (4 MILLION) pixels, times 3 bytes each is 12,288,000 bytes. That's 12 megabytes of raw data for just a single image that fills the screen. And we all know that many images are in fact larger than the screen and software allows the user to zoom in or out or pan around to see the whole thing. It is not at all practical or economical to actually store all those megabytes for one image.
Therefore, each of the common graphics formats, JPEG, PNG, GIF, etc. use different compression techniques (sometimes lossy) optimized for different general use cases. Each format sacrifices something, sometimes something that is difficult to perceive, in order to dramatically decrease the necessary storage requirements on disk. The task then of the viewer or the decoder, as they're more properly called now, is to uncompress the data on disk (or from a network) and reconstitute the full bitmap data in memory, whence the video hardware actually outputs the data to the screen.
And herein lies the first big difference. On a C64, 16 colors is 4 bits and 320x200 is 64,000 pixels. That would mean at least 32 kilobytes of storage for a full screen of image, but storing and manipulating color data alternately in upper and lower nybbles is a pain. So practically speaking even though 16 colors can be represented with 4 bits, if each color value is given its own byte that doubles the memory storage requirement. So, with that in mind, 64,000 pixels times one byte per pixel takes up almost 64 kilobytes in memory. But the C64 only has 64K of memory.
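The storage arithmetic from the last few paragraphs can be checked in a few lines. Worked through in Python:

```python
# Modern display: 2560x1600 at 3 bytes (24 bits) per pixel.
modern_bytes = 2560 * 1600 * 3
assert modern_bytes == 12_288_000            # ~12 MB for one full screen

# C64: 320x200 is 64,000 pixels.
pixels = 320 * 200
assert pixels == 64_000

# 16 colors fit in 4 bits, so a packed one-color-per-pixel screen needs:
packed = pixels * 4 // 8
assert packed == 32_000                      # ~32 KB, half the machine's RAM

# One byte per pixel (far easier to manipulate) doubles that,
# to nearly the machine's entire 64K.
unpacked = pixels * 1
assert unpacked == 64_000
```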
For this reason, the design of the VIC-II chip relieves it from needing to assign one color to each and every individual pixel. The VIC-II can show a fullscreen image, and if the artist is clever, it can be hard to notice any limitations on colors per pixel, even though there very much are. But if you think about what that means, the VIC-II chip embeds a special type of compression directly into its native display modes. The VIC-II in fact, as we all know, has several display modes, which in a sense can be thought of as multiple native compression formats. Koala makes use of the format called "Multi-Color". This is not a post about the details of how multi-color is structured, but you can read all about it and see some examples here.
The point is, only around 10 kilobytes of memory is required for the VIC-II to show a fullscreen colored image. And what is saved to disk is also very close to 10K; a Koala image file on disk holds just a few bytes more than the raw data in memory. It's not a bad compression format at all. If you convert a 10K Koala image to PNG, it becomes 20K. If you convert to GIF, it becomes 14K. 10K is pretty good. It becomes apparent, especially if you send the file over to a PC or Mac that doesn't have the same colors–per–pixel limitations, that the on–disk format is essentially just a full bitmap with a rather unique compression scheme and corresponding set of limitations. In fact, the Mac has no difficulty at all viewing Koala (and other) C64 image formats. You just need the right decoder. You can download and install a quicklook plugin with support for various C64 formats here. This plugin decompresses a Koala image essentially the same way another plugin decompresses a GIF, and converts it to raw bitmap data native to the Mac's video hardware.
And so the format of a Koala image is effectively just a dump of memory. The bitmap and color memory regions are not contiguous in a C64, however, so in a Koala image those three dumped regions are packed together, plus a few extra bytes. And that's it. On disk, a Koala image file is already in what amounts to a compressed format. So it's quick to load and small to store. But, it's better than that, because the viewer program doesn't actually need to do any work decompressing and converting the data to the native format of the video hardware. Because the video hardware interprets the on–disk compression scheme natively.2
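A sketch, in Python, of unpacking those packed regions. The layout used here (2-byte load address, then 8000 bytes of bitmap, 1000 bytes of screen RAM, 1000 bytes of color RAM, and one background color byte, 10,003 bytes in all) is the commonly documented Koala ordering.

```python
# Sizes of the three dumped memory regions in a Koala file.
BITMAP, SCREEN, COLOR = 8000, 1000, 1000

def unpack_koala(data):
    """Split a Koala file into the regions the VIC-II consumes natively."""
    assert len(data) == 2 + BITMAP + SCREEN + COLOR + 1   # 10,003 bytes
    load_addr = data[0] | (data[1] << 8)     # little-endian header, e.g. $6000
    i = 2
    bitmap = data[i:i + BITMAP]; i += BITMAP
    screen = data[i:i + SCREEN]; i += SCREEN
    color  = data[i:i + COLOR];  i += COLOR
    background = data[i]                      # single background color byte
    return load_addr, bitmap, screen, color, background
```

Note there is no decompression step at all: the "decoder" is pure bookkeeping, because the regions are already in the VIC-II's native multi-color layout. The other multi-color formats would need only a different ordering of the same slices.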
This is very different than modern graphics formats. There are other multi-color mode image formats for the C64, you can read all about them and their technicals here. What is so telling is that it only takes a couple of lines to describe the difference between each of these formats. Because they are merely different arbitrary orders of appending together the various dumped regions of memory. Some put color memory first before bitmap memory, some after, some put the background color at the beginning, some at the end, some between the bitmap and color memory, etc. But large swaths of the files from two of these different formats will be exactly the same: the format of the VIC-II's multi-color mode, either bitmap or color data.
Where else do things differ?
Let's start with how you actually use the Koala Viewer that I found. The viewer is so simple it almost boggles the modern mind. It has a one-button user interface, and I mean, a one keyboard button UI. You load it, then you run it, and by default a bunch of crap displays on the screen. Nothing is responsive; there is no UI at all, just crap on the screen. At first, I would just reset the computer to get out of the viewer. There goes my uptime! (A concept that does not exist in the Commodore 8-bit world.) Power cycling or resetting the machine is a common user interaction model for exiting a program.
It wasn't until I disassembled the program to figure out how it worked that I discovered you can press the space bar to exit the program (or press fire on a joystick in port 1.) It's really that simple. Run the program, press space to exit the program. The end. In hindsight, I should have thought to slap the long one, since that is such a common interaction in games and demos that it's more or less a C64 standard. Disassembling the program also got me to figure out how it is you actually use this viewer program to, you know, view something. But I'll return to analyze the code a bit later. First, let's just pretend we knew all along how to use it.
Here's what you do. You first have to use commands from the READY prompt to load the image data into memory. Then you load and run the viewer, and it knows where to look in memory to find that image data. This explains why you just see a bunch of crap if you load the viewer but you haven't first loaded in any image data. The viewer just happily shows you whatever left over crap was in that place in memory from the last thing you ran. The VIC-II happily interprets something, anything, even executable code, as though it were image data. This is all so very very different than anything you'd find on a modern computer. It's just so low level. But it is fun, it's fun to feel yourself so close to the bare metal.3
You'll notice that even though a Koala image file is technically data, as opposed to executable code, it is stored on disk as a PRG type file. This means it can be loaded. But the C64 has two kinds of loads. You can load a PRG relocated to $0801, which is the default, or you can use a relocate flag that prevents the relocation and will instead load the data to the address specified in the first two bytes of the file on disk. That two byte header is not loaded into memory, it is loaded from disk, used to figure out where the following data should go and is discarded. Fortunately, JiffyDOS includes 4 different load commands: (/,^,% and £). Frontslash, up-arrow, percent sign, and British pound sign. These commands alone feel incredibly ancient. They are completely arbitrary one-character BASIC commands. They're so unusual, modern North American keyboards don't even have 2 of these 4 symbols on their keys. They do the following, respectively:
- Load relocated to $0801 but do not run,
- Load relocated to $0801 and run automatically,4
- Load to header–specified address but do not run, and
- Load to header–specified address and run (by jumping to wherever it was loaded.)
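The mechanics of those two kinds of load can be sketched in a few lines of Python. This is a toy model of my own, not JiffyDOS itself; the 64K `ram` bytearray and the `load_prg` name are just for illustration:

```python
def load_prg(ram, prg_bytes, relocated=True):
    """Toy model of a C64 PRG load (not actual JiffyDOS code).

    The first two bytes of the file are a little-endian load address.
    A relocated load (JiffyDOS / and ^) ignores it and loads to $0801;
    an un-relocated load (% and British pound) honors it. Either way,
    the two header bytes never end up in memory themselves."""
    header_addr = prg_bytes[0] | (prg_bytes[1] << 8)
    dest = 0x0801 if relocated else header_addr
    data = prg_bytes[2:]
    ram[dest:dest + len(data)] = data
    return dest

# ds9.koa starts with $00 $60: little-endian for load address $6000.
ram = bytearray(65536)
koala = bytes([0x00, 0x60]) + bytes(10001)   # header + 10001 data bytes
assert load_prg(ram, koala, relocated=False) == 0x6000
```

The same file produces a completely different result depending on which load command you use, which is exactly why listing a Koala file loaded with "/" shows garbage at $0801.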
For Koala images, we have to use the percent sign (%) load. This puts the data in memory, but does not attempt to jump to it (thank god, because it's not executable.) Next, we load the viewer. In the screenshot above I loaded it with frontslash (/), so that I could list it to see what we can see before it runs. What we see is the standard BASIC preamble: SYS 2061, a BASIC command that jumps to the immediately following assembly code. It's a pretty standard way to make assembly code loadable/runnable with the "/" and "^" JiffyDOS commands, or ye ol' standard LOAD"PROGRAM",8. Finally, let's run the viewer!
Oh. My. Gawd. It's Deep Space Nine!! I'm so excited. But, seriously, it's pretty cool right? It worked. After we enjoy the image for a while we slap the long one to exit the viewer. Strangely though, we're taken back to the READY prompt but something is not quite fully restored. The majority of text on the screen is white rather than light blue. Only the final READY prompt is light blue, and that's because it was drawn to screen by the KERNAL after the program exited.
It should start to feel more and more as though this entire experience is straight out of a computer era that ended decades ago. How is it even possible that we could use the JiffyDOS un-relocated load command to get the image data into memory? For that, we need to examine the file itself. It also takes time to find equivalents of the tools you might be used to having on a Mac or PC. This is not a knock against the platform, of course; once you've been living in the Commodore world for some time, you do find these tools, and you start to know intuitively which ones you need to accomplish common simple tasks.
DraCopy is a great little utility released by Draco in 2009. You can download it here. Besides being a useful 2-panel file system navigator and file copier, it has a built–in HEX viewer. When you don't have a HEX viewer, and you need a HEX viewer, finally finding one is like a breath of fresh air.
And there it is. We examine ds9.koa with the HEX viewer and find that the first two bytes are $00 and $60. That's little endian, least significant byte first, for the memory address $6000. When we do the JiffyDOS un-relocated load, it is reading that address from the image file, and using it to know that it should put the image data into the fixed memory address $6000... and up.
Please, take a moment to stop and think about what this means. The image file itself has hardcoded into it a fixed memory address whither the data should be loaded. Unthinkably ancient. Can you even imagine if a JPEG file embedded a memory address telling the computer where in memory the JPEG data ought to be loaded?! How presumptuous. How parochial and shortsighted. Why should it ever be the decision of the data file where it itself should go in memory? That data file has no idea what else the computer might be running or using that memory for.
Why it works this way actually makes sense, from the original KoalaPainter program's point of view. The KoalaPainter program put itself into memory, and intentionally left a gap in memory, $6000 to $8710, where the graphics data should be loaded. (More on this particular location when we examine the code of the viewer.) Next, everyone who has coded anything with files on a C64 knows that the KERNAL can do a LOAD much faster than if you loop over repeated calls to CHRIN, reading one byte at a time. It was an eminently reasonable decision for KoalaPainter to save the image data as a PRG with a header address of exactly where the program wants to load the data, for itself.
But from a viewer perspective it makes no sense. The viewer program is not the original KoalaPainter program. And who knows how big it may be or what areas of memory it may occupy, and inside the context of an OS with a memory manager it is even more obscene. But it is what it is, and this viewer program is clearly hardcoded to look for the image data starting at $6000. Here's the thing though, if JiffyDOS handles loading the data into memory, and the data is already formatted as expected by a native VIC-II display mode, then what is it that a Koala Viewer actually has to do?
Digging into the code of a Koala Viewer
I manually transcribed this code to a Gist, from photographs of an ML monitor's disassembly. All the numbers are in HEX. This isn't the style I would typically use when writing code; labels are not used, rather hardcoded memory addresses, because that's how the disassembler produced the output.
Obviously I added the comments. I'm not super familiar with VIC-II programming, so why it has to mess with the VIC's raster interrupt, I truly do not know. Here's the code, let's dig in.
The code consists of three main parts. The first part, lines 1 to 44 are the "main" program. The second part is a subroutine called by the main program to configure the VIC and CIA2, and move data from where it was loaded in memory to places the VIC can use it. And the last part is a short routine wedged into the KERNAL's IRQ service routine, which scans the keyboard and allows the main program to progress past viewing the image, to the clean up part and exit.
So, main program, in detail. First, mask CPU interrupts so we don't get disturbed while setting up memory and the VIC. Then it also masks CIA1's interrupt generator; I'm not 100% sure why this is necessary since the CPU is already ignoring interrupts.
Lines 8 to 11 are what wedge this program's IRQ routine into the KERNAL's IRQ service routine. The KERNAL hops through a vector at $0314/$0315.
Next, it turns off the VIC's display. This causes the screen to blank out. I believe that's because it will look cleaner to move all the graphics and color data into place while the screen is off. And the screen can be turned back on when everything is ready to go. At this time the border color is also set to black. The border color must not be specified by the Koala image format. That's a shame, if you ask me, it's an important part of the image. Especially if you wanted the edges of the image to blend seamlessly into the border.
At line 17, the main program calls the only proper subroutine. So let's go check out what that subroutine does.
The first thing it does, from lines 50 to 72, is copy 2 kilobytes of data. The way it does this is very cool. It actually loops just 256 times, one complete cycle of the X register from $00 back around to $00, counting backwards. On each loop, it copies 8 regions in parallel. The first 4 regions copy 1K (256 * 4 = 1024 bytes) of Koala data into Color Memory. The second 4 regions copy another 1K of Koala data into "Screen" Matrix Memory. In Multi-Color mode, Screen Matrix Memory is used to hold the extended color data. Both upper and lower nybbles are used, to hold 2 additional colors per 8x8 pixel cell.
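In Python, the shape of that unrolled copy loop looks something like this. This is a sketch; the source addresses follow the standard Koala layout when loaded to $6000, but the destination screen matrix address is only a placeholder, since the viewer's real choice depends on how it configures the VIC's $D018 register:

```python
def copy_2k_in_pages(ram, src_color, src_screen, dst_color, dst_screen):
    """Sketch of the viewer's copy loop: one 256-step loop (a full
    cycle of .X) servicing 8 absolute,X stores per iteration, moving
    4 pages of color data and 4 pages of screen matrix data in
    parallel. Note 4 pages = 1024 bytes, slightly more than the 1000
    bytes actually needed; the original code overshoots the same way."""
    for x in range(256):
        for page in range(4):
            ram[dst_color + page * 256 + x] = ram[src_color + page * 256 + x]
            ram[dst_screen + page * 256 + x] = ram[src_screen + page * 256 + x]

# Koala data loaded at $6000 puts screen data at $7F40 and color data
# at $8328. Color Memory is at $D800; $5C00 for the screen matrix is
# a placeholder only.
ram = bytearray(65536)
ram[0x8328] = 0x05    # marker byte in the color data
copy_2k_in_pages(ram, src_color=0x8328, src_screen=0x7F40,
                 dst_color=0xD800, dst_screen=0x5C00)
assert ram[0xD800] == 0x05
```

The 6502 version gets its speed from indexing: the loop counter lives in .X, and each of the 8 copies is a hardcoded absolute,X load/store pair.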
Interestingly, that's all the memory that is ever copied by this program. What about the 8 kilobytes of bitmap data? Now it becomes clear to us why the Koala image format, as insane as it seems, specifies where it should be loaded in memory. It's loaded to $6000 you'll recall. Let's dig into the VIC for a second.
The VIC-II chip has 14 address lines, not 16. 14 bits can address from $0000 to $3FFF, or 0 to 16383. So, the VIC can see 16K of memory at a time. The most significant 2 bits of addressing are supplied by CIA2's Port A, bit0 and bit1. This means that configuring CIA2 allows you to choose which of 4 blocks of 16K the VIC-II "sees." In the C64's main addressing space those ranges are:
- $0000 - $3FFF (CIA2's %11)
- $4000 - $7FFF (CIA2's %10)
- $8000 - $BFFF (CIA2's %01)
- $C000 - $FFFF (CIA2's %00)5
Within a 16K block, there are two 8K chunks. From the VIC's perspective, an upper 8K and a lower 8K. Bitmap data is just shy of 8K (8 * 1000 = 8000, but 8K is 8 * 1024 = 8192). The VIC can be configured to read bitmap data out of either of these 8K regions, upper or lower, but always aligned to these two regions. It cannot, for example, arbitrarily read an 8K bitmap that's shifted by just 2K or something like that. If you divide the second bank in half, the lower 8K goes from $4000 to $5FFF, and the upper 8K goes from $6000 to $7FFF. Bingo. $6000 is the start of an 8K region whence the VIC-II can directly read a bitmap.
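Putting the bank bits and the 8K halves together, the address math can be sketched in Python. This is my own helper, assuming the inverted bank-select lines described in the footnote at the end of this post:

```python
def vic_bitmap_base(dd00, d018):
    """Compute where the VIC-II fetches bitmap data.

    The low two bits of $DD00 drive the VIC's top two address lines
    *inverted*, so %11 selects bank 0 ($0000) and %10 selects bank 1
    ($4000). Bit 3 of $D018 (CB13) selects the upper or lower 8K half
    of that bank."""
    bank = (~dd00 & 0b11) << 14       # 16K bank base address
    half = (d018 & 0b1000) << 10      # CB13 set -> +$2000, the upper 8K
    return bank + half

# The viewer writes #$02 to $DD00 (bank $4000) and selects the upper
# 8K half, which lands the bitmap pointer exactly on $6000.
assert vic_bitmap_base(0x02, 0b1000) == 0x6000
```

That one assertion is the whole trick of the Koala format: $6000 is precisely an address the VIC can be pointed at.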
So, what's neat about loading a Koala is that the JiffyDOS % command (load to the header address), without any other viewer or decoder program's involvement, literally loads data directly off the disk and straight into video memory, exactly where it needs to be for the VIC to display it. Crazy low level. But, hey, it's hard to imagine how it could be more efficient.
Moving on now.
Lines 74 to 94. The VIC's background color is set, which is just one byte copied from the Koala data into the VIC's background color register. Next, in short order, the VIC's raster interrupt is masked (or unmasked?), I'm not sure what this is for. The byte inside the infinite loop code in the main program is set so the loop will loop; not sure why this had to be done here and not in the main part of the program itself. CIA1's interrupts are, started? Or touched; not sure why this has to be done. Then the VIC's raster counter is set. Next the VIC's memory pointers are configured. This is what tells the VIC to get the bitmap from the upper 8K, and where in the lower 8K the screen memory (1K, used in this mode for extended color info) should be found. And the write to $DD00 goes to CIA2, which configures which of the four 16K banks the VIC should see. Lastly, multi-color mode is turned on and the subroutine ends, returning to the main program.
Back in the main program, now that all the color data has been moved into place and the VIC's registers have been configured, the VIC's display is reenabled. And lo and behold the image appears.
CPU interrupts are unmasked, and at lines 24 and 25 the main program just goes into a hard infinite loop. This to me is one of the things that makes this program—and the C64 and its lack of modern OS—feel the absolute oldest. Literally, the program goes into an infinite loop that is just 4 bytes long, 2 opcodes and 2 data bytes. It loads an immediate value of #01 into the accumulator. Then, if the value loaded is not zero, which, duh, it's not, it branches right back into loading the accumulator again. Argh! While you look at the image, that beautiful 1 MHz CPU spins like an insane idiot doing absolutely nothing. Almost nothing. Modern computers, and C64 OS, will never do absolutely nothing like this.
Almost but not quite nothing. Because there is still the interrupt service routine. So let's look at that. While the main program is in a tight infinite loop, the routine at lines 100 to 110 is being called 60 times a second. The first thing it does is clear the VIC's raster interrupt. I'm not super familiar with raster interrupts, but it looks like CIA1 might not be the one generating the interrupts this time. It's the VIC's raster that generates the interrupts. Not sure why they did it this way. But in any case, reading from $DC01 checks the column that the space key, and control port 1's fire button, is in. You can read my recent post about How the C64 Keyboard Works for more detail on why this works.
If the space key is not held down, it skips over the next two lines and returns into the KERNAL's usual IRQ routine. However, if the space key is held down, it writes a #00 directly into $0832. That's the address that holds the immediate argument inside the main program's infinite loop! Thus, it breaks the loop and allows the main program to proceed into its second half.
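The whole self-modifying arrangement can be modeled in a few lines of Python. This is a toy simulation, not real 6502; the `ram` array stands in for C64 memory and $0832 for the loop's immediate operand:

```python
# Toy simulation of the viewer's self-modifying exit (not real 6502).
# The immediate operand of the main loop's LDA #$01 lives at $0832,
# and the wedged IRQ routine pokes a $00 over it to break the loop.
ram = bytearray(65536)
LOOP_OPERAND = 0x0832
ram[LOOP_OPERAND] = 0x01

def irq_wedge(space_pressed):
    """Runs ~60 times a second; on space, patch the loop's operand."""
    if space_pressed:
        ram[LOOP_OPERAND] = 0x00

def main_loop(key_events):
    """LDA #imm / BNE back: spins until the operand byte reads zero."""
    for pressed in key_events:
        irq_wedge(pressed)
        if ram[LOOP_OPERAND] == 0:   # BNE falls through once patched
            return "exited"
    return "still looping"

assert main_loop([False, False, True]) == "exited"
```

The key point is that the IRQ routine doesn't communicate with the main program through a variable; it rewrites the main program's own instruction stream.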
The main program, following the infinite loop, is from lines 27 to 44. Everything is done between a pair of SEI/CLI, so that interrupts are masked at the start and unmasked at the end. Then it does three quick subroutine calls into the KERNAL. These initialize the SID, CIA, VIC and IRQ, and reinitialize the I/O vectors (which unwedges the IRQ subroutine previously discussed.)
The only problem with returning to BASIC immediately is that Multi-Color Mode uses color memory for part of the image's color map. But when the VIC is put back into character mode, that color memory is used for the colors of the characters. So color memory needs to be reset. But, what to reset it to? In this case, a short loop (again in parallel, 4 regions are being set per loop) fills color memory with #01... white. And that explains why all the characters on the screen go white after the program returns to BASIC: color memory is shared between character mode and bitmap mode, and when the Koala data is copied into color memory, it clobbers whatever was in there. The viewer then arbitrarily picks white to put back in. It could put light blue back in and most people wouldn't notice, but what would be even better is if it made a backup of color memory, and then restored from that backup.
So that's the whole process, in as deep detail as I care to go.
Some Thoughts About Modernization
Inside the context of an OS, memory cannot just be clobbered arbitrarily. It just can't be. What if the ethernet driver has allocated some space to buffer some incoming data, and you don't even know where those allocations have come from? Memory in the middle of the computer, especially around $6000 cannot just be blindly filled up with data that is loaded straight from disk. Even if that space were available, you'd need to at least mark it now as occupied, so that future allocations don't come out of it.
The VIC-II shares memory with main system memory. And a bitmap can only be in one of 8 possible locations. In my mind, the ideal place for bitmap data is under the KERNAL ROM. The 4th 16K block that the VIC can see goes from $C000 to $FFFF. That range, in C64 OS, is never allocated by the memory manager. It's too messy to use for arbitrary purposes. $C000 to $CFFF is used for C64 OS code, so that's never available. $D000 to $DFFF is under I/O, and is used in precision ways by C64 OS for storing system related data, which it accesses by patching out I/O when it needs to. And in C64 OS the KERNAL ROM is used and usually patched in, which covers the remaining 8K, $E000 to $FFFF.
However, the VIC doesn't always see what the CPU sees. Even when the KERNAL ROM is patched in for the CPU, when the CPU writes to $E000-$FFFF those writes go into the underlying RAM. And when the VIC reads from that range, it always reads from RAM. That's so great. That means with the KERNAL ROM patched in, code executing within that range can write data into that same range and can be affecting what the VIC is showing in realtime. As long as your code needs the KERNAL, and also wants to show bitmap data, storing the bitmap data under the KERNAL sounds like a no brainer.
If bitmap data is the 8K block under the KERNAL, then screen matrix memory can be configured to be found somewhere inside the 8K block from $C000 to $DFFF. In C64 OS, the layout of $D000 to $DFFF is managed manually, much the way the system workspace is managed manually from $0000 to $07FF. A thousand locations can simply be reserved under I/O space, just for extended color info for multi color bitmaps. No need to dynamically allocate it, that's just where it goes.
What does this mean for Koala image files? Or all the other multi-color format variants? Like, Advanced Art Studio, Amica Paint, CDU-Paint, Draz Paint, etc. When reading these from disk, they should be read more along the lines of how C Structures are read into memory. The headers on a file are prescribed sizes. You declare a structure composed of a certain number of bytes, and you allocate space for that structure. Then you populate the structure by loading a fixed number of bytes from the file into the memory allocated for the structure. And then, you can access all the properties of the struct to know more about the file contents.
A loader for Koala would start by loading 8000 bytes into $E000. Then another load operation would load 1000 bytes into where screen matrix memory will be assigned in $Dxxx. Then, if you want to have color memory preserved, perhaps another 1000 byte block under $Dxxx should be managed by C64 OS as the color memory buffer. But the point is that you have to load the data in chunks.
It is much faster to use the KERNAL's LOAD routine rather than looping over CHRIN and putting the bytes where you want them to go. It's also easier to use, it requires less code. However, LOAD is brutally uncontrolled. It must load the entire file, and it must all go to one contiguous range of memory. But that's just not going to cut it.
C64 OS has a File module, which not only works with C64 OS File References, but also offers the standard FOPEN, FREAD, FWRITE and FCLOSE. This makes it very easy to load in chunks. You call FOPEN with a pointer to a File Reference (which will get created for you using the C64 OS file manager,) then you call FREAD with a pointer to the open File Reference, a pointer to where the data should go, and a 16-bit length to read. And that's all you have to do. FREAD will put the data in the buffer. It handles errors, it handles looping, it handles hiding the mouse cursor and other implementational details.
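A sketch of this chunked loading in Python might look like the following. To be clear, this is a toy model of the idea, not the C64 OS File module's actual API; the `read` callback and the destination addresses are my own illustrations:

```python
import io

def load_koala_chunked(read, ram):
    """Sketch of loading a Koala file in chunks instead of one LOAD.

    `read(n)` returns the next n bytes of the open file, standing in
    for repeated FREAD-style calls. The destination addresses are
    illustrative stand-ins for the layout discussed above (bitmap
    under the KERNAL ROM, screen and color data in reserved $Dxxx
    space); they are not C64 OS's actual memory map."""
    read(2)                                  # consume, discard the PRG header
    for dest, length in [(0xE000, 8000),     # bitmap, under the KERNAL
                         (0xD000, 1000),     # screen matrix (placeholder)
                         (0xD400, 1000)]:    # color buffer (placeholder)
        ram[dest:dest + length] = read(length)
    return read(1)[0]                        # the background color byte

ram = bytearray(65536)
f = io.BytesIO(bytes([0x00, 0x60]) + bytes(10000) + bytes([0x06]))
bg = load_koala_chunked(f.read, ram)
assert bg == 0x06
```

The point of the shape: one open, a handful of fixed-size reads into places the loader chooses, and the file's embedded load address becomes irrelevant.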
Will FREAD be slower than the KERNAL's LOAD? Yes, by definition it has to be, as it uses the KERNAL under the hood. But, on the other hand, we're not loading from a 1541 anymore, (hopefully.) We're loading from a CMD-HD or an SD2IEC, or better yet from an IDE64 or a 1541 Ultimate.
More Sophisticated Data Structures
All of these image formats (20 of them hires and multi-color, not counting the interlaced and more advanced formats) just assume the image is exactly 320x200. And that nothing whatsoever will appear on the screen at the same time as the image.
Storing data in a format native to the VIC-II is fine. But, what about images that are bigger than 320x200? What about images that are smaller? Smaller images make a lot of sense to me, especially if you've got a background scene, say, already loaded, and you want to load in a small rectangular area to overlay on top of the scene. I can imagine tonnes of uses for this in games, or animation sequences or presentation software. Or even just to overlay some UI controls on top of a bitmapped image. Instead of having no UI at all, beyond slapping the space bar, in C64 OS the mouse will be active. What about being able to load in a "close" X, or back and forward arrows, and display them on the bitmap screen? They could be loaded directly from disk when you want to show them. The possibilities are endless.
A more modern C64 image format doesn't need to abandon the idea that the data is in a VIC-II friendly format. But what it should have is a standard header that can be loaded first into a struct somewhere. That could specify the dimensions of the image data in 8x8 cells, the data format (hires or multi-color), and the offsets into the file for the color and bitmap sections. It could even specify where on the screen it ought to display, if that made sense. But it wouldn't do that with hardcoded memory addresses; instead it would use offsets from a virtual top left corner.
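To make the idea concrete, here's one purely hypothetical header layout sketched with Python's struct module. Every field name and offset here is my invention, not an existing format:

```python
import struct

# A purely hypothetical header for a modernized C64 image format:
# dimensions in 8x8 cells, a mode flag, and *file offsets* to each
# data section (never memory addresses).
HEADER_FMT = "<BBBHHH"   # little-endian, 9 bytes, no padding
FIELDS = ("width_cells", "height_cells", "mode",
          "bitmap_off", "screen_off", "color_off")

def parse_header(data):
    """Read the fixed-size header into a struct-like dict."""
    return dict(zip(FIELDS, struct.unpack_from(HEADER_FMT, data, 0)))

# A full-screen multi-color image: 40x25 cells, sections packed
# immediately after the 9-byte header.
hdr = struct.pack(HEADER_FMT, 40, 25, 1, 9, 8009, 9009)
assert parse_header(hdr)["bitmap_off"] == 9
```

A loader reads these 9 bytes first, then issues one chunked read per section to wherever it has decided the data should live.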
Going into a tight infinite loop as your program's main behavior can't be how it works. I understand why this viewer does that. I mean, what else is it supposed to do? Where else ought the CPU to execute? It's got to execute something, and with the image statically displaying on screen, and the IRQ routine checking for the space bar, there isn't really anything else to do.
But, even though C64 OS isn't "multi-tasking", it's still got system level code that is capable of processing events. After setting up the image to display, in C64 OS, your code would just return. And it's the OS's own main event loop that will do the infinite looping. Meanwhile maintaining the ability to respond to timers, pass mouse and key command events to your code, process incoming network data, and whatever else gets added in future versions of the OS.
Okay, that was long. But it's good to go into this stuff. Leave your thoughts in the comments section.
- It's not a perfect analogy because modern image editors store extra data besides the final image. Such as layers, groups, tags and selections. So they export a "flattened" version of the image. Whereas Koala edits the same flat file that is ultimately viewed. But, it doesn't affect my argument much. [↩]
- That said, there are some formats that are compressed on disk further. Such as RLE compressed Koala format. These require the viewer to use a decompression routine. However, such schemes are very primitive. [↩]
- I wouldn't even know where to begin to actually put data, directly, on to the screen on a Mac. Most nerds would ask, "Why would you ever want to do that?" Just out of curiosity, my guess is that you have to be some sort of super privileged low level OS process to be able to put things directly into video memory. It's so arcane it seems that no one really knows how to do it. [↩]
- Note, of course, that these "relocations" don't modify the code. If the code is not intentionally designed to be runnable from $0801 it will crash. The relocate is a simple shift of position in memory. Real code relocation is much more complicated. Although it is possible it requires a special binary file format and a special assembler. See here, for example. [↩]
- It tripped me up for a minute, because the program writes #$02 to DD00, to me, that looks like it should be the 3rd 16K bank, 00, 01, 10, 11. 0,1,2,3, right? But I checked the C64 schematics, and indeed, the V14, V15 address lines coming off the CIA2 have a bar above them. That means they're inverted, for whatever reason. So the correct order is 11,10,01,00 for banks 0,1,2,3. Thus #$02 really does give you the second bank not the 3rd. [↩]
Happy Hallowe'en C64 enthusiasts. This post is about passing inline arguments to subroutines in 6502. Hopefully it won't be too spoooky.
The KERNAL and Register Arguments
As limiting as just 3 bytes of arguments might seem, there are a lot of routines one could write in this world that can get by with so few. As mentioned above, the KERNAL uses processor registers exclusively for its arguments. There are times when three bytes aren't enough. But to handle this the KERNAL simply requires you to call one or more preparatory routines first.
The chief example is opening a file for read. In a C program this can be done with a single function, but in the KERNAL you have to supply several arguments: a logical file number (1 byte), a device number (1 byte), a secondary address or channel number (1 byte), a pointer to a filename string (2 bytes), plus the length of the string (1 byte). Count 'em up, that's 6 bytes. The KERNAL therefore requires you to call SETLFS with the first three arguments, and then SETNAM with the string pointer and length, before you can call OPEN. (And incidentally, OPEN doesn't take any input arguments.) A bit more complex than C, but not outrageous.
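The preparatory-call convention is easy to model in Python. This is a toy analogy, not the real KERNAL: each "routine" stashes its register arguments in shared workspace, and OPEN consumes the stash rather than taking arguments of its own:

```python
# Toy model of the KERNAL's preparatory-call convention. SETLFS and
# SETNAM stash their (register-sized) arguments in KERNAL workspace,
# and OPEN takes no arguments at all, reading the stashed state.
workspace = {}

def setlfs(lfn, device, secondary):      # .A, .X, .Y in the real KERNAL
    workspace.update(lfn=lfn, device=device, secondary=secondary)

def setnam(name):                        # .A = length, .X/.Y = pointer
    workspace.update(name=name)

def open_file():                         # no input arguments
    return dict(workspace)               # "opens" using the stashed state

setlfs(2, 8, 2)
setnam("DS9.KOA")
f = open_file()
assert f["device"] == 8 and f["name"] == "DS9.KOA"
```

Splitting the six bytes across three calls keeps every individual call within the three register bytes the CPU can carry.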
The advantage to processor arguments is that they're fast. Load the .A register with a value, that could be as short as 2 cycles. JSR to a routine that takes .A as its only argument, that takes a standard 6 cycles, but .A is already loaded and ready to use by the code in that routine. I mean, there is virtually no overhead. Especially if .A was already set by a previous routine's return, there is sometimes literally zero overhead in passing that data through to the next subroutine. So, if you can get away with using just 3 bytes of arguments, it's fast, it's really damn fast.
But there are downsides. Registers are effectively little more than in–processor global variables. Global variables which certain instructions operate directly on, and which those instructions require to work at all. An indirect indexed load, for example, can only use the .Y register, and must use the .Y register. So if you use the .Y register to pass in an argument, but the code in the routine needs to do an indirect indexed instruction, the .Y register has to be overwritten and whatever argument was passed in on it has to be saved somewhere so as not to be lost. This makes matters somewhat more complicated.
The other downside is that if your routine is using .A, .X and .Y for holding on to temporary state, such as the index of a loop, but inside the loop you JSR to another routine, it is necessary to know which registers that other routine (and more complicated, any subroutines that it may call, and so on ad infinitum) will use. The KERNAL routines are all very self–contained. They generally don't go off and call other routines, so their register usage is predictable. In fact, the documentation tells you exactly which registers are used (and thus disrupted) by each KERNAL routine. For example, VECTOR, takes .X and .Y as inputs, and returns a value in .X, but it uses .A in the process. So before calling VECTOR if you care about what's currently in .A you'd have to back it up first.
For the accumulator, backing up and restoring is easily done by pushing to and pulling from the stack, but pushing and pulling .X or .Y is much more complicated. On the NMOS 6502, and the 6510 (the NMOS 6502 variant used in the C64 and C128),2 there is no way to push .X and .Y directly to the stack, nor to pull them directly off the stack. You have to first transfer them to .A, then push that. And you similarly have to pull to .A and transfer to .X or .Y. This adds complication, because it involves disrupting .A. In the end, the most efficient way to write 6502 code that uses and worries about the registers for argument passing and local variables is to write code carefully by hand. Like an art form.
C Language and the Stack
C is a super popular language, and without even considering how much code is still written in C today, most other modern languages are modeled syntactically on C. Don't believe me? The Wikipedia article, List of C-family programming languages, lists over 60 languages in the C family. One of the most recent is Apple's Swift, first released in 2014. The original C language began development in the late 1960s and first appeared in 1972.
Assembly is a step above Machine Code, by abstracting absolute and relative addresses with labels, and instructions with mnemonic codes, but the assembly code maintains a strict one–to–one relationship between code typed and instructions output by the assembler. The assembly programmer still has complete control over the precise gyrations the CPU will go through. C is a compiled language, which means by definition the programmer is not the one writing the Machine Code, the compiler is. The compiler is not an artist. It needs simple, reliable rules, that operate on a very local scale to produce predictable output that will work. There is no way that a compiler would go off looking through the code of routines that the current routine will call, to see which registers it will use and thus which are safe for its own use. It's just, too artsy. It's not rigorous enough.
Instead, the way C works is that all arguments are passed on the stack. Not only are all arguments on the stack, but all local variables of a function are also on the stack, and the return value is sent back via the stack too. The big advantage is that every instance of a function being called has its own unique place in memory, and that place is dynamically found and allocated. This makes recursive code really easy. A single function can call itself over and over, deeper and deeper, and each call instance has its own set of variables that don't interfere with the previous instances, because they are each stored further and further into the stack.
Passing arguments is easy too. The declaration of the function (or the function's signature) defines how many bytes of arguments it takes. When the function is called the caller puts that much data onto the stack and the function uses the stack pointer to reference those variables.
So, why doesn't the C64 programmer use C or a C–like solution?
I said it so poetically in my review of World of Commodore 2016, after listening to Bil Herd talk about the development of the C128 and where it was in the market at the time. People back in the 60s and 70s already knew what could be accomplished by a computer with an incredible amount of computing power. Computers in those decades already had 16 and 32 bit buses, megabytes of RAM and complex, timeslicing, multi–user Unix operating systems. The only problem is that these computers cost millions of dollars and were as big as modern cars. But, nonetheless, they had many of the modern computer amenities. What made the 6502 so special was not that it was a powerful and fully featured CPU, but that it was a complete central processing unit, all crammed into a single IC package that could be sold for just a few dollars apiece. As I said in my article, early home computers really were just toys, in the eyes of the big business computers of the preceding 15 to 20 years.
Take the PDP-11, for example. The PDP-11 was commercially available starting in 1970, contemporaneous with the development of C. It was a very popular 16-bit minicomputer by DEC, that looks like an enormous stand–up freezer with tape reels on the front. Its main processor was in fact the design inspiration for both the Intel x86 family and the Motorola 68K family. And the C language explicitly took advantage of many of its low–level hardware features.
Just as a point of comparison, while the 6502 has 3 general purpose 8-bit registers, the PDP-11 has 8 general purpose 16-bit registers. But the most important advantage is its addressing modes. The PDP-11 has, among others, 6(!) stack relative addressing modes and efficient support for multiple software stacks in addition to the hardware stack. The 6502 has a single, fixed address hardware stack, a single 8-bit stack pointer, no dedicated features for software stacks, and NO stack addressing modes at all. In fact the 6502 has just two instructions that affect the stack pointer: TSX and TXS. Transfer the Stack Pointer to the .X register, and vice versa, respectively. What this means is that performing operations directly on data in the stack is severely hampered.
It is possible to write in C for the 6502, there is a 6502 C Compiler. But my understanding, from conversations I've had on IRC, is that it works by emulating in software the features of the CPU that are not supported natively. This is a recipe for slow execution. The bottom line is, no matter how convenient C's stack–oriented variables and arguments are, C was not designed for the 6502. And the 6502 is simply not well suited to run C code.
Alternative Argument Passing
Okay, now we know how the KERNAL works and what the limitations are with its techniques. And we also know why it is that the C64 doesn't just standardize on C (or a derivative) like every other modern computer. But we are still left with the problem of how to pass more than 3 bytes worth of arguments to a subroutine.
I can think of at least two ways to work around the limitation and then I'll go into some detail on a third and fairly clever solution. I found these tricks and techniques by reading the Official GEOS Programmer's Reference Guide. It discusses various ways that the GEOS Kernal supports passing arguments to its subroutines.
Zero Page on the 6502 has been described as a suite of pseudo registers. Almost every instruction has a Zero Page addressing mode that can work with data there more quickly than in higher memory pages. But, there is a limited amount of ZP space, just 256 bytes. And $0000 and $0001 are actually taken by the 6510's processor port.
But one trick that GEOS does is reserve 32 bytes of Zero Page as a set of sixteen 16-bit registers, which it numbers R0 through R15. After that, it treats them very much the same way we would treat the real registers, with all their limitations described above. I said that one problem with the CPU registers is that they are effectively global variables. And so are the GEOS ZP registers. The advantage is that you've got 16 of them, and they're 16-bits wide. So various routines that need several 16-bit arguments are simply documented as: you put a 16-bit width into R5, you put an 8-bit height into R6, you set a fill pattern code into R3, and a buffer mode in R1, and then you jump to this subroutine and it draws a square and blah blah blah. I'm making up the specifics, but you get the point.
When the subroutine is called, it expects that you have populated the correct global ZP "registers" with suitable values, which it reads and acts upon. GEOS also comes with a suite of macros that help you set those registers. It's a bit slower than using real registers the way the KERNAL does, and has many of the same limitations, but it greatly expands the space, from just 3 bytes up to 32. You also have to dedicate a lot of Zero Page to this solution. That's something GEOS could do because it replaces the C64 KERNAL entirely, and can do whatever it wants with the entire computer's memory space. This is more or less a solution I was able to dream up on my own, but in a less structured way. You can always just shove a few bytes directly into known system workspace addresses, and then call a routine that will use those workspace addresses. But there is something that feels a bit dirty about this. What happens if you change the KERNAL and reassign some common workspace addresses? Likely, old apps will stop working and will probably crash the system.
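As a sketch of how this looks in practice. The register addresses, the argument assignments and the drawbox routine here are all made up for illustration:

```asm
; hypothetical GEOS-style ZP "registers": 32 bytes of
; zero page carved into sixteen 16-bit pseudo registers.
; addresses and the drawbox routine are illustrative only.

r5      = $0c           ; R5: a 16-bit width
r6      = $0e           ; R6: an 8-bit height

        lda #<320       ; low byte of a 16-bit width
        sta r5
        lda #>320       ; high byte
        sta r5+1
        lda #24         ; an 8-bit height
        sta r6
        jsr drawbox     ; reads R5 and R6 straight from ZP
```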
An alternative solution that I have come up with on my own (I don't recall seeing this in the GEOS Kernal) is to pass a pointer to a structure. This is essentially how the Toolkit and its view objects behave in C64 OS. And it is also the way File References work. When calling a subroutine that operates on a file, for example, rather than manually passing 5 different properties of varying 8- or 16-bit widths the way the KERNAL would, the properties are set on a local (or allocated) block of memory. Then a pointer to that block of memory is put in .X and .Y (lo byte, hi byte) and the routine is called. The routine generally needs to write this pointer to zero page, and then use the .Y register as an index in indirect indexed mode to work with those properties.
There is no way around needing to use some global zero page space, since indirect pointers can only be dereferenced through addresses in zero page. However, moving the responsibility for which free ZP space to use from the calling routine to the called routine feels much less dirty.
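Here's a sketch of the idea. The struct layout, the fclose routine's internals, and the choice of $FC/$FD for the vector are illustrative assumptions:

```asm
; calling side: point .X/.Y at a structure in memory
        ldx #<fileref   ; lo byte of the struct's address
        ldy #>fileref   ; hi byte
        jsr fclose
        rts

fileref .byte 8         ; device number
        .byte 1         ; partition
        .byte 2         ; logical file number (offset 2)

; called side: the routine picks its own ZP space
fclose  stx $fc         ; write the pointer to a ZP vector
        sty $fd
        ldy #2          ; offset of the LFN property
        lda ($fc),y     ; read it with indirect indexed mode
        rts             ; close the file, release the LFN
```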
Using structures and passing pointers to structures is a pretty good solution, and it solves a lot of problems. But it isn't the best solution for all problems. Which is why we'll now look in more detail at one more solution I found in the GEOS Programmer's Reference Guide.
Inline Argument Passing
Let's just use the concrete example of what I've run into in C64 OS to understand the issue. C64 OS has a file module, which knows how to work with File Reference structs, and is a light wrapper for working with streams of data. The essential functionality I want is: fopen, fread, fwrite and fclose. Fopen uses a file reference struct to identify the device, partition, path and filename, and it also dynamically stores the logical file number, which doubles as the status for whether the file is currently open. Using a file reference requires a minimum of a 2–byte pointer. The file can be opened either for read or for write. After which, the pointer to the file reference can be used with fread or fwrite.
If the file was opened for read, for example, then we will want to call fread to actually read some data out of it. The arguments for fread will be, a pointer to the file reference, a pointer to a buffer of memory into which the data should be read, and a 16–bit length of data to read. That's 6 bytes of arguments.
If on the other hand the file was opened for write, then we want to call fwrite to write some data in memory to that file. The arguments will be, a pointer to the file reference, a pointer to a buffer in memory where some data is to be written, and a 16–bit length of data to write. That is also 6 bytes of arguments.
Closing a file reference is pretty straightforward. It only needs the 2–byte pointer to an open file reference. It just reads the logical file number out of the reference struct, closes the file and releases the LFN.
But now let's consider opening the file. If opening for read, not too hard. We need the pointer to the file reference, plus probably 1 byte to indicate the read/write direction. That's only 3 bytes, we could get away with putting the read/write flag in .A and the file ref pointer in .X and .Y.
When it comes time to opening a file to write, though, it becomes a bit more needy. You need the file ref pointer (2 bytes), the read/write flag (1 byte), plus a file type byte (1 byte, USR/SEQ/PRG), and you also need to indicate whether an overwrite should be permitted if the file already exists (1 byte), or if an append to an existing file should happen (1 byte). Naively, that's 6 bytes. Although, you could assign the read/write flag to 1 bit, the file type to 2 bits, and the overwrite and append flags to 1 bit each, and pack all that into a single byte argument. Dealing with bits makes your code fatter though. And nothing gets around the 6–byte requirements of the fread and fwrite routines.
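For what it's worth, the bit-packed version might look like this. The bit assignments and routine names are made up for illustration:

```asm
; hypothetical flag byte for fopen, packed into 1 byte:
; bit 0    read (0) or write (1)
; bits 1-2 file type (USR/SEQ/PRG)
; bit 3    permit overwrite
; bit 4    append to existing file

        lda #$0b        ; write, file type 1, overwrite ok
        jsr fopen       ; (file ref pointer in .X/.Y, say)

; inside fopen, the unpacking that fattens the code:
fopen   sta flags       ; stash the packed argument byte
        and #$01        ; isolate the read/write bit
        bne openwrite   ; take the write path
openread
        rts             ; read path would go here
openwrite
        rts             ; write path would go here

flags   .byte 0         ; working storage for the flags
```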
Some GEOS routines can take inline arguments. Here's how it works:
You do a JSR to a subroutine, and in the calling code, you follow the JSR line with a series of argument data bytes. Effectively, as many as the routine needs. They're called inline arguments because they follow inline with your code.
The routine that accepts inline arguments begins by transferring the Stack Pointer to the .X register. It then uses the stack pointer to find the return address that the JSR pushed onto the stack. This address is actually the address where the inline arguments begin. It puts this address into a zero page pointer, where it can use indirect indexed addressing to read the arguments (and optionally overwrite them later too, if that made sense). But it must also change the values on the stack to move the return address beyond the end of the inline arguments. When the routine does an RTS, the updated return address will be pulled off the stack and execution will continue just past the inline arguments. It's very clever. It's not always what you need, but it's a really great tool to have in the box if you need it.
Reading about it in the GEOS programmer's reference guide, it sounds a bit complicated to use. So I've written some sample code to show and explain how easy it really is. And how you can abstract the code to reuse it for different routines that take differing numbers of arguments.
Inline Arguments in Practice
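Here's a sketch of the calling code being described. The labels and argument values are my own illustrative choices:

```asm
        *= $0801                ; where a BASIC program starts
        .include "basic-preamble.s" ; SYS2061 into our code
        .include "kernal.s"     ; names of the KERNAL routines

        jsr getargs             ; takes 3 inline argument bytes
        .byte $05               ; a1
        .byte $0a               ; a2
        .byte $0f               ; a3

        jsr getargs             ; the same 3 bytes can be
        .text "abc"             ; supplied as a .text string

        rts                     ; end of the program

        .include "inttohex.s"   ; tohex and prnthex routines
```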
Okay, so let's walk through this code and see how it works. We assemble to $0801, which is where a BASIC program should start. And the first include is the BASIC preamble. This will do a SYS2061 to jump to the first line of our assembly program, which starts at line 6. The kernal.s include just defines the names of the KERNAL routines.
Getting right to the meat and potatoes of what we're showing, line 6 has a jsr getargs, this is an example routine that takes 3 bytes of inline arguments, and will print the three bytes in hexadecimal notation. On lines 7, 8 and 9 you can see that we've supplied three inline bytes of data. These are the arguments, immediately following the JSR instruction. At line 11 we call jsr getargs again, to show that in this case the 3 bytes of inline arguments are in the form of a .text string. It doesn't matter how we inline the arguments, all that matters is that the routine expects there to be 3 bytes and execution is going to continue just after those 3 bytes.
The rts on line 14 is the end of the program. Following this we include our handy inttohex.s. It has a routine tohex that takes a value in the accumulator and returns it as two hexadecimal (0 to F) PETSCII characters in the .X and .Y registers. prnthex is a simple little routine that will print the output of the tohex routine. First it CHROUTs a "$" followed by the two hexadecimal characters and lastly a new line.
And now for the getargs routine and how it works with inline arguments:
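A sketch of the routine, using $FC/$FD as the zero page vector:

```asm
getargs                 ; <- a1  .byte  first argument
                        ; <- a2  .byte  second argument
                        ; <- a3  .byte  third argument

        clc             ; begin the full 16-bit add
        tsx             ; stack pointer -> .X

        lda $0101,x     ; return address, low byte
        sta $fc         ; -> low byte of the ZP vector
        adc #3          ; skip the 3 inline argument bytes
        sta $0101,x     ; straight back onto the stack

        lda $0102,x     ; return address, high byte
        sta $fd         ; -> high byte of the ZP vector
        adc #0          ; add only the carry, if it's set
        sta $0102,x     ; straight back onto the stack

        ldy #1          ; args are at offsets 1 to 3, since
        lda ($fc),y     ; the vector points one byte before
        jsr prnthex     ; them; print a1 in hex

        ldy #2
        lda ($fc),y     ; grab a2
        jsr prnthex

        ldy #3
        lda ($fc),y     ; grab a3
        jsr prnthex

        rts             ; returns just past the inline args
```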
I like to comment my routines showing the ins and outs with arrows. In this case I've labeled my inline argument inputs as a1, a2 and a3. Following the right pointing arrow is usually a comment on what the argument holds. I've put in here .byte to indicate the length of the argument. In your calling routine, you could state the inline argument, for example, as a .word, then in the comments mention it's a .word, and you'll have to read two bytes to grab the whole thing.
Lines 5 to 16 do all the magic. The JSR instruction automatically pushes the return address onto the stack. That address is effectively just a 16-bit number. We have to add a number to that address that is equal to the total size of the inline arguments. In this case our args are 3 bytes. So we have to add 3 to that 16-bit return address. But adding 3 to the low byte could overflow that byte, so we have to do a full 16-bit add, using the carry. To begin the 16-bit add we start by clearing the carry, on line 5.
Next, we need to grab the current stack pointer. To do this we use the only instruction the 6502 has for getting the stack pointer, TSX, which transfers it to the .X register. To understand what happens next you have to know how the stack works. The stack consists of the entire first page of memory $0100 to $01FF. But things are pushed onto the stack starting from the top working down. The Stack Pointer is an offset into $01xx and always points to where the next pushed byte will go. Thus, when the stack is empty, the stack pointer is $FF. When the stack is full the stack pointer is $00. With each push, the stack pointer gets decremented, with each pull the stack pointer gets incremented.
After a JSR, whatever the stack pointer is, the return address is at the stack pointer +1, and the stack pointer +2. Here's the thing though, you'd think that once .X holds the stack pointer, you would need to manipulate .X to find the bytes you're after. Instead though, to access bytes relative to the stack pointer you can merely change the absolute start address from $0100 to something bigger or smaller.
Let's say the stack pointer is $F5. If you pushed something onto the stack it would go to $01F5, which means the last two things pushed onto the stack are actually stored at $01F6 and $01F7. Do a TSX; now .X is $F5. You don't need to INX and then read from $0100,X; you can simply read $0101,X (which is $0101+$F5, or $01F6). Similarly, to get $01F7 you don't need to INX, just start the absolute address at $0102,X (which is $0102+$F5).
Okay, now we know how we're referencing the address on the stack. Here's the next thing to know. When a JSR happens, the return address is pushed onto the stack high byte first, low byte second. So $0101,X references the low byte, $0102,X references the high byte. Now we're ready to see what happens.
Line 8 grabs the low byte from the stack. Line 9 stores it to a Zero Page address of our choice. It has to be stored somewhere in zero page to be able to read through it as a vector. Once we've stored that low byte, we can add 3 to it and write it straight back into the stack whence it came. Lines 10 and 11.
Adding 3 may have overflowed the accumulator, but if it did, the carry is now set. Line 13 grabs the high byte and stores it at $FD, the high byte of the zero page vector. Then we add 0, which includes the carry and completes the full 16-bit add. And we write the new high byte straight back to the stack whence it came. And we're done.
The stack now contains a return address that is 3 bigger than it used to be, just past the end of the inline arguments. And the zeropage addresses $FC and $FD contain a vector that points at the block of inline arguments.
The only thing left to do is read those arguments. Here's one more trick of the JSR return address: it is actually the address of the next instruction... minus 1. But the RTS instruction expects that and returns you to the correct place. So adding three to the return address was certainly the right thing to do. However, the vector at $FC/$FD actually points to one byte before the inline arguments. Again, no problem; we don't have to waste cycles and memory adding 1 to the vector, we just access the three arguments with offsets 1, 2 and 3, instead of 0, 1 and 2.
Lines 18 to 20, 22 to 24 and 26 to 28 show that we set .Y to the argument offset, then do an LDA indirect indexed through the zero page vector to grab that argument. The jsr prnthex simply prints that argument in hexadecimal which is what this example routine is supposed to do. Note that you don't need to get the arguments in any particular order. You can read any of them whenever it makes sense to in the routine. You can ignore some of them, or even modify and write a new value back to that inline argument memory location. The world's your oyster. You can have up to 255 bytes of inline arguments without needing to modify any of the initial stack manipulation logic.
So you might be asking, well that's neat, but why bother with all that? You could just put a few bytes at the end of your routine to hold the arguments, then load up .X and .Y with a pointer to those bytes and call the routine. The routine then only needs to write .X and .Y to the zero page addresses of its choice and boom you're done. Like this:
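Sketched out, end arguments look something like this (again, $FC/$FD for the vector is an arbitrary choice):

```asm
; calling side: a labeled argument block below the code
        ldx #<myargs    ; lo byte of the argument block
        ldy #>myargs    ; hi byte
        jsr getargs
        rts

myargs  .byte $05       ; a1
        .byte $0a       ; a2
        .byte $0f       ; a3

; called side: just write the pointer to zero page
getargs stx $fc
        sty $fd
        ldy #0          ; args start at offset 0 this time
        lda ($fc),y     ; grab a1
        jsr prnthex
        rts             ; and so on for a2 and a3
```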
And yeah, that works too. But there are downsides. When you put the args below the routine:
- You have to use a label to find them.
- They are separated from the routine call they apply to.
- You have to add LDX/LDY code above the routine call.
- If there are many blocks of args for different subroutines, it gets messy fast.
And all the converse are advantages for inline arguments. They don't need a label, because their address is known implicitly from the address of the JSR's return address. They sit together in your code with the JSR they apply to. If you have multiple JSRs each with their own arg blocks it doesn't get progressively messier. And you don't need any extra code on the calling side to set it up.
There is just one downside, in my opinion. The called routine has to have a fair chunk of code to manipulate the stack and set up the vector. Instead of 4 bytes of setup for the end argument block, you need 22 bytes for the inline arguments. And if you had many routines that all use inline args, those 22 bytes start adding up. Read on for one last solution to that problem!
A Slightly More Advanced Trick
For end arguments, you need 4 bytes in the called routine to setup the vector. And you need 4 bytes in the calling code to setup the .X and .Y pointer to the arguments (plus a label). So you actually need 8 bytes to "pass" the arguments. That's 8 bytes on top of the byte count of the arguments themselves.
With inline arguments, you need 0 bytes in the calling code, but 22 bytes in the called routine. But, if you're going to use this trick for numerous routines with inline arguments, you can move those 22 bytes into a shared argument–pointer–configuration routine, like this:
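A sketch of what that shared routine could look like:

```asm
; shared setup for any routine taking inline arguments.
; .A holds the number of inline argument bytes to skip.
; the vector is always written to $FC/$FD.

setargptr
        sta argc+1      ; self-modify the adc operand below
        clc
        tsx             ; stack pointer -> .X

        lda $0103,x     ; the caller's return address, low
        sta $fc         ; (it's 2 bytes deeper now, below
argc    adc #0          ; our own return address)
        sta $0103,x

        lda $0104,x     ; the caller's return address, high
        sta $fd
        adc #0          ; apply the carry
        sta $0104,x
        rts

; a routine that takes inline args is now just:
getargs lda #3          ; expect 3 inline argument bytes
        jsr setargptr
        ldy #1
        lda ($fc),y     ; grab a1, and so on
        jsr prnthex
        rts
```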
Now we've got a new routine called setargptr. It will handle the work of manipulating the stack and setting up the zero page vector. There are some gotchas. First, if we're going to use this routine for multiple routines that accept inline arguments, we need a way to customize how many inline arguments to expect. It can't just be hardcoded at 3.
We pass the number of inline arguments to this routine in the accumulator. The first thing the routine now does is write the inline argument count (self–modifying code here) to a new label, argc(+1). This overwrites the value to add to the low byte of the return address. The rest of the 16-bit add works as it did before.
The second gotcha is that the real routine, getargs in this case, has to call setargptr. But this call pushes its own return address onto the stack. So the place on the stack where our arguments are is 2 bytes further back. That's easy to deal with though, the new stack absolute offsets are simply $0103 and $0104, instead of $0101 and $0102. That's it.
The only other gotcha is that the setargptr routine now bears the responsibility for which Zero Page addresses to use for the vector. And thus, every routine that uses inline arguments and uses setargptr to manipulate the stack, must all use the same zero page vector. But, depending on your situation, that might be just fine.
In the actual getargs routine now, instead of all that messy code to manipulate the stack, we just load the accumulator with the expected number of inline args and JSR to setargptr. Boom, done. Now it's 5 bytes per routine that uses inline arguments, instead of 8 for end arguments. We actually save 3 bytes per routine. But the setargptr routine is now 26 bytes. End arguments require 4 bytes per routine, plus 4 bytes per call. So if we write just a few routines with inline arguments, and call those routines a few times, we quickly end up using less total memory than if we used end arguments. But, to be honest, saving some memory is icing on the cake. Inline arguments are just cool.
Thanks for the tip GEOS! GEOS has got a few other tricks I'll probably end up exploring in future posts. Stay tuned.
- At some point in the future, I'd like to post my own version of the C64 KERNAL documentation as a Programming Reference. But, it's a lot of work. For now, you can read about the KERNAL in the C64 Programmer's Reference Guide. Chapter 5: Basic to Machine Language (PDF) [↩]
- The CMOS version of the 6502, the 65c02, includes extra instructions for pushing and pulling the .X and .Y registers directly to and from the stack, and other handy things. Unfortunately, there is no such 65c10 processor. [↩]