November 30, 2023 | #122 | Software

Fast App Switching

Hi everyone. I know it's been a while since I've written a blog post, but I have been super busy working on new C64 OS code and documentation.

A quick update on the documentation. I've been working on the Programmer's Guide, where I have completed Chapter 4: Using the KERNAL. It is an overview and discussion about how to use the C64 OS KERNAL, plus a reference-style breakdown of all 114 KERNAL routines divided into 10 modules: Memory, Input, String, Math, Screen, Menu, Service, File, Toolkit and Timers.

I've also been working on Chapter 8: Writing an Application (née Chapter 7). It's not quite ready to publish yet, but it is currently sitting around 35,000 words, and is a deep technical walkthrough of everything, from stem to stern, that goes into writing the TestGround Application that ships with C64 OS. TestGround's source code is opened in the process of writing this chapter, and it covers everything from creating an Application Bundle, to making its menu data file, creating an icon and about metadata file for your App, implementing a user interface in Toolkit, creating a custom class with custom draw routines, responding to messages, handling menu commands, loading data files and loading custom classes.

I have also been updating the online C64 OS User's Guide with new material from C64 OS v1.03, v1.04 and the brand new v1.05. I've added new documentation for the simplified VICE set up instructions for CMD HD and a new sub-chapter for set up with IDE64. And a new and updated document for advanced VICE configuration with CMD HD. Advanced VICE configuration for IDE64 and set up guide for C64 OS on TheC64 mini/maxi are on their way.

I decided early on when I was working on the C64 OS User's Guide that I would draw a dividing line between "core" OS features and the Applications, Utilities and other features that are provided as add-ons. One reason for this division is that, from my experience with working on other OSes for the Commodore 64, I have found that in the minds of many people all things that depend on the OS get rolled into one giant mega-project. And then cool new things get dismissed as, "Oh, it's just a new feature of that C64 OS project. Neat." I want to fight against this impulse. C64 OS is an application platform, and therefore Applications written for C64 OS should stand alone as their own productions, which, incidentally, require C64 OS to be run.

To that end, the documentation for the Installer Utility and the C64 Archiver Application, as well as a description of the C64 Archive (.CAR) file format, were left out of the main C64 OS User's Guide. Instead, you can find a whole other guide, C64 Archiver and Installer, that covers all of that.

Screenshot triptych from the C64 Archiver and Installer Guide.

In similar fashion, I am also working on a new guide just for Image Viewer and all of its features. It's not ready to publish yet. But here's a sneak peek at its main promo image.

Screenshot triptych from the Image Viewer Guide.

And now, let us get on to the main post, an update on all the new stuff available in C64 OS v1.05.

C64 OS v1.05: Fast App Switching

I'm super excited by the latest release of C64 OS, version 1.05. Each release going back to the earliest betas has been given a name that encapsulates the tentpole new feature, and this one is called "Fast App Switching."

I remember back in high school reading about how Creative Micro Designs (CMD) had a GEOS enhancement called gateWay. Here's a copy of the CMD gateWay Users Manual you can download if you're interested. I recall reading that it had a feature called "Switcher" which you could use in combination with an REU to add a kind of simple multi-tasking to GEOS.

Screenshot of the CMD gateWay replacement for deskTop.

I never actually got around to using gateWay, as I became a Wheels user instead, and then later I became enamoured with WiNGs (née JOS.) However, the idea of using the REU to bank a frozen copy of the state of the C64, with an Application and its data in progress, and then restore another Application from the REU, was ravishing for me. It also made a lot of sense. The C64 doesn't have a very fast CPU, and the CPU itself is not designed for the kind of pre-emptive multi-tasking that was pioneered with early UNIX systems. But then, along came the REU, an early and official Commodore (and thus bona fide retro) RAM expansion cartridge. Suddenly your C64 could have 256K, or 512K and soon 1MB, 2MB and today 16MB(!!) of RAM. Using that RAM was never entirely straightforward though, and the program had to explicitly make use of it.

But what if the OS, such as GEOS with gateWay and its "switcher" technology, could allow multiple Applications, each one designed to use the whole computer, to be kept open side by side and banked in and out of the REU very quickly? Suddenly, the Application proper doesn't need to make explicit use of the extra RAM. The OS could automatically bestow on the App the benefit of the presence of an REU. And, the system as a whole could become much more productive, as you could quickly whizz from one Application to another. As I said, I found the idea ravishing. It piqued my young and enthusiastic mind.

Task Switching

One problem with the gateWay switcher for GEOS is that the GEOS Applications, such as geoWrite or geoPaint, don't actually know anything about context switching. So, it was always a kind of hack. And with hacks there often come subtle incompatibilities, unexpected results, things that don't work quite as you'd wish they would.

From the earliest days of C64 OS's design, I knew that I would not design it as a pre-emptively (or otherwise) multi-tasking system. C64 OS uses the KERNAL ROM in the C64, and it uses the standard DOSes that are built in to the storage devices, such as the legacy disk drives: 1541, 1571, 1581, and the newer mass-storage devices, CMD FD, CMD HD, RamLink, IDE64 and SD2IEC. None of these are designed for pre-emptive multi-tasking, and true multi-tasking adds a world of complication that taxes the C64's resources in a way that I thought was a bad tradeoff.

GateWay-style task switching, though, that seemed to me like the best way to go. But rather than hack it into place later, the way gateWay does for GEOS, I designed C64 OS to be aware of task switching from the very beginning. Although task switching was not available in the v1.0 commercial release, many foundational building blocks had already been put in place to allow for the clean and stable addition of task switching in a future version.

And that future version is now available, in version 1.05.

Loader v. Switcher

C64 OS is made up of many small and interchangeable components. During boot up, there is a set of boot components that get loaded in various ways and in a certain order that is determined by the codes and order of the entries in the file //os/settings/:components.t (and as of version 1.04, you can choose between different boot modes, which selects a different components file, from components1.t to components4.t.)

One of the boot components, rec.lib.r, detects the presence and capacity of a 17xx-compatible REU and puts that information into workspace memory. The booter then loads a set of KERNAL modules determined by the codes and the order of the entries in the file //os/settings/:modules.t (and the components file can then specify which modules file to load, from modules1.t to modules9.t.) The modules file allows different or alternative KERNAL modules to be loaded, and the codes allow different modules to be selected depending on whether an REU is available or not.

Version 1.05 rearranges the components so the first one is the rec.lib.r to detect the REU. And the second component is a new one called reuboot.r. On a fresh boot, when the REU doesn't hold anything special, reuboot sees that the REU doesn't yet hold the C64 OS magic, and so it does a couple of things instead. It installs a small routine in workspace memory called an "REC Shunt" and it loads the "switcher" (//os/library/:switcher) and calls a routine in switcher called reuinit.

The modules.t file then gets processed during boot and has a new branch for choosing between two alternatives for the services KERNAL module. The original, //os/kernal/:services.o if there is no REU, or the new alternative, //os/kernal/:services.reu.o if there is an REU available.

The services KERNAL module is responsible for a mix of a few different low-level things. It implements the main interrupt service routine, it handles global keyboard shortcuts for things like taking screenshots or changing video modes, and it also has the code used to load a new Application or to load a Utility. The original non-REU version of services handles loading a new Application by first loading the "loader", which is found here: //os/library/:loader. A long time ago all this loader code was part of the KERNAL. But I realized that it's only needed when changing from one App to the next, and once the new App is loaded in, having all that stuff in memory is just a waste of space. The loader does things like showing the Application loading splash screen, reading in and parsing the App's menu.m, and other housekeeping tasks in prep to load in the new App.

The REU version of the services KERNAL module does all the same things, but when it comes to loading an Application, rather than loading the loader from disk, it swaps the switcher in from the REU. The switcher has everything that the loader does, plus it has all the logic for managing Fast App Switching. Even if an Application is not yet loaded in and has to be loaded from disk, it benefits from the switcher, because the switcher itself is banked in the REU. So the time it would take to load the loader from disk is gone. Well, it's not quite gone, but instead of loading that code from disk again every time you change Application, the switcher is loaded during boot, by the reuboot.r component, and part of its reuinit routine involves banking itself to the REU.

The Work Bank

Interestingly, CMD gateWay's switcher requires a minimum of 512KB REU. So, no support for the 1700 (128KB) or the 1764 (256KB), only the 1750 (512KB) or later. I don't know the technical reason why, but I could take a pretty good guess.

REUs naturally divide into banks of 64KB. The addressing registers for REU memory are 24-bit, or 3 bytes: low, medium and high. Or, conceptually, a low byte and a high byte, which precisely correspond with the low and high bytes of the 16-bit address in the C64's main memory, plus a 3rd byte used to select the 64KB bank. On a 1700, then, there are only 2 banks, and so the only valid bank byte values are 0 and 1. A 1764 has 4 banks, a 1750 has 8 banks. And continuing the pattern, a 1750 expanded to 1MB has 16 banks, a CMD 1750XL (2MB) has 32 banks. And the reason why the 17xx-compatible REU family maxes out at 16MB is because that represents 256 banks, which is the maximum number of values the 3rd addressing byte can hold.
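The bank arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch, not C64 OS code; the helper names `bank_count` and `reu_address` are made up for the example.

```python
BANK_SIZE = 64 * 1024  # each REU bank is 64KB


def bank_count(reu_size_kb: int) -> int:
    """Number of 64KB banks in an REU of the given size (in KB)."""
    return (reu_size_kb * 1024) // BANK_SIZE


def reu_address(bank: int, offset: int) -> tuple[int, int, int]:
    """Split a bank number and a 16-bit offset into the three
    addressing bytes: (low, high, bank)."""
    if not 0 <= bank <= 0xFF:
        raise ValueError("bank must fit in one byte")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    return (offset & 0xFF, offset >> 8, bank)


# Sizes mentioned in the text, in KB.
for name, size_kb in [("1700", 128), ("1764", 256), ("1750", 512),
                      ("1MB", 1024), ("2MB", 2048), ("16MB", 16384)]:
    print(name, bank_count(size_kb), "banks")
```

Because the bank byte is a single byte, 256 banks (16MB) is the ceiling, which is why `reu_address` rejects anything above 255.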

In C64 OS, if an REU is available, the zeroth bank (i.e., the bank that is selected when the bank addressing byte is set to 0) is automatically claimed by the OS for workspace. It is called, in common C64 OS nomenclature, the work bank.

The work bank is used for several things:

The C64 OS "magic"
Buffer for fast screen compositing
Storage for shared workspace memory
Storage for the switcher component
Table of Fast App Switching slots
Table of Fast App Switching access order
Table of REU Bank allocation
Storage for RTC driver

Let's walk through what these mean.

The C64 OS "magic"

It is apparently common practice for a program that uses the REU to write some identifying bytes into a fixed place in the REU. These bytes are called "magic" because what they are doesn't really matter; they are merely used to identify whether or not the program's (or operating system's) own data is already in the REU.

The typical place for the magic bytes is the first few bytes of the zeroth bank. For example, when TurboMacroPro+REU runs, it stores code in the REU, but it writes the characters "TURB" into the first 4 bytes of the zeroth bank. You can quit to BASIC, run other software, like a machine language monitor or Novatext or whatever, and when you load and run TMP again it still has your code in memory and ready to go. That's because when it starts up, it looks for its magic "TURB", and if it's there, it assumes the REU still holds its own data.

C64 OS writes its own magic to that very same place, but its magic are the characters "C64OS".

What this means is that, if C64 OS is stored in the REU and you drop to BASIC, but then you run TurboMacroPro+REU, TMP will overwrite the C64 OS magic and suddenly C64 OS will no longer attempt to fast reboot. But, that makes sense, because TMP has also trashed whatever content C64 OS was storing in the REU and so a fast reboot would be impossible. The reverse, of course, happens if TMP has its magic (and your assembly code) in the REU and you boot C64 OS. When you return to BASIC and run TMP, it knows that its own content is no longer in the REU.
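To make the interplay concrete, here is a toy Python model of two programs sharing the same magic location. It is a sketch under assumptions: the REU is modeled as a plain `bytearray` holding only bank 0, ASCII bytes stand in for whatever character encoding the real programs use, and the function names are invented for the example.

```python
MAGIC_OFFSET = 0  # the magic lives in the first bytes of bank 0


def write_magic(reu: bytearray, magic: bytes) -> None:
    """Stamp a program's magic at the start of the zeroth bank."""
    reu[MAGIC_OFFSET:MAGIC_OFFSET + len(magic)] = magic


def has_magic(reu: bytearray, magic: bytes) -> bool:
    """True if the given magic is currently present."""
    return bytes(reu[MAGIC_OFFSET:MAGIC_OFFSET + len(magic)]) == magic


reu = bytearray(64 * 1024)           # model only the zeroth bank

write_magic(reu, b"C64OS")           # C64 OS does a fresh boot
print(has_magic(reu, b"C64OS"))      # fast reboot would be attempted

write_magic(reu, b"TURB")            # user drops to BASIC and runs TMP
print(has_magic(reu, b"C64OS"))      # C64 OS no longer sees its magic
print(has_magic(reu, b"TURB"))       # but TMP sees its own
```

Since the four bytes of "TURB" land on top of the first four bytes of "C64OS", each program's check fails as soon as the other one has claimed the REU.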

It is within the realm of possibility, though, that some random C64 software (i.e. not written for C64 OS) will overwrite the REU's contents (thus corrupting C64 OS content stored there) but still leave the C64 OS magic in place. In that case, a fast reboot will be attempted but it will fail. There are 3 ways to overcome the problem and restore order to the galaxy.

Power cycle the computer
Load and run //os/c64tools/:reuinit
Hold COMMODORE+CONTROL+SHIFT while booting C64 OS.

The power cycle is the harshest on the 40-year-old chips, but the most straightforward for the user. When you cut power to the computer, the REU forgets all of its contents. When you power up again, C64 OS will naturally do a fresh boot from disk.

Loading and running the reuinit tool uses a few BASIC pokes to clear the C64 OS magic from bank 0, which fixes the problem. And holding COMMODORE+CONTROL+SHIFT while booting up tells the reuboot.o component to ignore the magic if it's found and just perform a standard fresh boot up from disk. One advantage to using reuinit is that it frees up the keyboard to select an alternative boot mode on your next boot. Using the keyboard combo will force a fresh boot, but it will also use the standard boot mode.

Buffer for fast screen compositing

Going back to v1.0, C64 OS has had two versions of the screen KERNAL module, one for the REU and one for just the CPU. The screen module has routines for compositing the C64 OS screen layers together into a buffer, which is then transferred to the main VIC-II video buffer. The VIC-II's video buffers (video matrix, color matrix and bitmap data) are coordinated around the state of split screen and the position of the split.

When the REU is available, the screen KERNAL module for REU is selected, and the REU is used to speed up data transfer between buffers. Screen data is temporarily copied into an area of the REU, and then from there copied back to the C64's main memory or color memory. A part of the work bank is used as a space that is always reserved for these memory copies.

Storage for shared workspace memory

C64 OS has a number of settings including the current color theme, settings for the clock, the mouse settings such as pointer style, color, tracking speed, etc. Many of these can be seen by going through the Configure Tool.

When C64 OS is running an Application, the user could pop open a Utility, like say, Themes, or Mouse or Time. With those Utilities they can change system settings. The Utility writes the new setting to workspace memory for the current App, plus writes the changes out to the settings files so they're retained for the future. The problem is that if you Fast App Switch to another Application already open, that Application needs to get the updated copy of these settings.

When the switcher switches Apps, it first copies the current App's settings data from main memory to the work bank in the REU. Then it switches Apps, and then it copies the settings data from the work bank in the REU to main memory containing the new App. In this way, the work bank becomes a common storage area for settings data that is shared by multiple Apps.
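The three-step hand-off can be illustrated with a small Python model. This is a sketch only: main memory, the slots and the work bank are modeled as dictionaries, and `switch_app` is a hypothetical name, not a C64 OS routine.

```python
import copy


def switch_app(main, slots, work_bank, current, target):
    """Model of a switch: publish settings, swap Apps, re-apply settings."""
    # 1. Copy the current App's settings from main memory to the work bank.
    work_bank["settings"] = dict(main["settings"])
    # 2. Freeze the current App into its slot and thaw the target App.
    slots[current] = copy.deepcopy(main)
    thawed = copy.deepcopy(slots[target])
    main.clear()
    main.update(thawed)
    # 3. Copy the shared settings from the work bank over the thawed
    #    App's (possibly stale) copy.
    main["settings"] = dict(work_bank["settings"])
    return target


main = {"app": "App Launcher", "settings": {"theme": "blue"}}
slots = {1: None, 2: {"app": "Gallery", "settings": {"theme": "blue"}}}
work_bank = {}

main["settings"]["theme"] = "green"   # the Themes Utility changes a setting
current = switch_app(main, slots, work_bank, current=1, target=2)
print(main)   # Gallery is running and already sees the green theme
```

Without step 3, Gallery would wake up with the settings it had when it was frozen.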

Storage for the switcher component

On a fresh boot, the reuboot component loads the switcher component into memory and calls its reuinit routine. This prepares the REU by formatting the fast app switch slots and other tables. The last thing the switcher component does is archive itself into the work bank of the REU.

Whenever the user tries to change Apps, the first thing the REU-version of the services KERNAL module does is swap the switcher component in from the REU work bank. This is much faster than loading the loader from disk. The switcher component consults the tables to see if the App is already loaded and stored in a Fast App Switching slot. If it is, it proceeds to switch to that App. If it's not, it searches the tables for a free slot, possibly expunging data from the least recently accessed slot, and then performs a standard Application load, assigning it one of the slots.

The complete set of code that is in the loader component is also part of the switcher component.

Fast App Switching Tables

There are 3 tables that the switcher maintains. These are stored in the REU work bank.

Table of Fast App Switching slots

You choose in the Configure Tool how many 64KB REU banks you want to dedicate as Fast App Switching slots, from 0 to 31. Each slot allows one more Application to be open at the same time. If you choose more slots than are available in your REU, C64 OS automatically drops this down to the maximum available. The table of slots is 1KB of space in the REU work bank, or 4 pages. The pages are divided into 32-byte entries, one per slot: the first 16 bytes hold the name of the Application, and the second 16 bytes are used for status flags and some bytes reserved for future use.

The index of the slot in the slots table is always identical to the bank the App is stored in. That is, slot index 1 holds the name of the Application that is stored in REU bank 1. Slot index 31 holds the name of the Application stored in REU bank 31.
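As a rough illustration of that layout, the following Python sketch indexes into a 1KB table. It assumes each entry is 32 bytes (1KB divided by 32 entries) with a 16-byte name field at the start; the zero padding, the ASCII encoding and the function names are assumptions made for the example, not details taken from C64 OS.

```python
TABLE_SIZE = 1024   # 4 pages of 256 bytes
ENTRY_SIZE = 32     # one entry per slot index, 0 to 31
NAME_SIZE = 16      # first half of each entry holds the App name


def slot_offset(index: int) -> int:
    """Byte offset of a slot's entry; the index equals the REU bank."""
    if not 0 <= index < TABLE_SIZE // ENTRY_SIZE:
        raise IndexError("slot index out of range")
    return index * ENTRY_SIZE


def write_slot_name(table: bytearray, index: int, name: str) -> None:
    """Store a name, truncated or zero-padded to NAME_SIZE (assumed)."""
    data = name.encode("ascii")[:NAME_SIZE].ljust(NAME_SIZE, b"\x00")
    start = slot_offset(index)
    table[start:start + NAME_SIZE] = data


def read_slot_name(table: bytearray, index: int) -> str:
    start = slot_offset(index)
    raw = bytes(table[start:start + NAME_SIZE])
    return raw.rstrip(b"\x00").decode("ascii")


table = bytearray(TABLE_SIZE)
write_slot_name(table, 1, "App Launcher")
write_slot_name(table, 2, "Gallery")
print(read_slot_name(table, 2), "is stored in REU bank 2")
```

Keeping the slot index equal to the bank number means no separate bank field is needed in the entry.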

Table of Fast App Switching access order

This is a small table just 32 bytes long. It holds the order that the Fast App Switching slots were accessed in. The table will look something like:

04, 05, 02, 01, 03, 06, 07, 08, 09, 10, 
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31

In the example above, it is likely that there are no Apps in banks 6 to 31. And the order of the first 5 bytes is showing the order that the user has been accessing the Apps in those banks. The App in bank 4 is currently running, the App in bank 5 is frozen but was running just prior to the current App, and so on.
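One simple way to maintain such a table is move-to-front: whenever a slot is accessed, its number is pulled out of the list and reinserted at the start. The Python sketch below assumes that rule; the text describes the resulting order, not the exact update procedure, and `touch` is a hypothetical name.

```python
def touch(order: list[int], slot: int) -> None:
    """Mark a slot as most recently accessed (move-to-front)."""
    order.remove(slot)
    order.insert(0, slot)


# The example table from above: 4 is running, 5 was running before it.
order = [4, 5, 2, 1, 3] + list(range(6, 32))

touch(order, 2)      # the user switches to the App in bank 2
print(order[:5])     # most recent five slots, newest first
```

With this rule the least recently accessed slot always drifts toward the end of the list, which is where a candidate for expunging would be looked for.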

Table of REU Bank allocation

The REU bank allocation table is always exactly 1 page (256 bytes), because 256 is the maximum number of REU banks, which represents 16MB. This table holds a byte code to represent the allocation status of each 64KB bank.

Byte	Constant	Notes
$FF	bnk_nota	Not available
$FE	bnk_free	Free for allocation for data
$FD	bnk_empt	Empty Fast App Switching slot
$FC	bnk_full	Fast App Switching slot holding an App
$01 → $20	—	Allocated by the App in the corresponding slot

This is one of the tables initialized by the switcher component during a fresh boot. The work bank is flagged as not available, but usually not available is the result of the REU being less than 16MB. A 2MB CMD 1750 XL REU provides 32 banks. (32 * 64KB = 2048KB = 2MB.) Therefore, all the banks from 32 to 255 are physically not available.

Continuing with the example, if the user specified 8 Fast App Switching slots, then bytes 1 to 8 would begin as bnk_empt, empty slots. And bytes 9 to 31 would begin as bnk_free, available to be allocated by an Application for data usage. When the first App is loaded it gets assigned the first available slot, so, slot 1, and the bank allocation map is changed to bnk_full to indicate that this bank is a Fast App Switching slot with an App in it.

Let's say that the App in slot 4 calls the bank allocation routine requesting 5 contiguous data banks. The memory.lib (which is what implements the bkalloc routine) looks in this table for 5 contiguous bnk_free bytes. To allocate them, it sets them to the number 4. This tells the allocator that, not only are they allocated, but they belong to the App in slot 4. If the user (or the system) quits this Application, not only is slot 4 reverted back to bnk_empt, but all the banks with a 4 in the map get changed to bnk_free. Simple but effective.
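The allocation scheme can be modeled in a few lines of Python. This is a sketch, not the memory.lib implementation: the constant values come from the table above, but the first-fit search and the function names `init_map`, `alloc_banks` and `quit_app` are assumptions for illustration.

```python
BNK_NOTA, BNK_FREE, BNK_EMPT, BNK_FULL = 0xFF, 0xFE, 0xFD, 0xFC


def init_map(total_banks: int, fas_slots: int) -> list[int]:
    """Build the 256-byte map. Bank 0 (the work bank) and any banks the
    REU does not physically have stay flagged as not available."""
    bank_map = [BNK_NOTA] * 256
    for bank in range(1, total_banks):
        bank_map[bank] = BNK_EMPT if bank <= fas_slots else BNK_FREE
    return bank_map


def alloc_banks(bank_map: list[int], owner_slot: int, count: int):
    """First-fit search for `count` contiguous free banks. Allocated
    banks are tagged with the owning slot number."""
    run = 0
    for bank, status in enumerate(bank_map):
        run = run + 1 if status == BNK_FREE else 0
        if run == count:
            first = bank - count + 1
            for b in range(first, bank + 1):
                bank_map[b] = owner_slot
            return first
    return None  # no run of free banks is long enough


def quit_app(bank_map: list[int], slot: int) -> None:
    """Empty the App's slot and release every data bank it owns."""
    bank_map[slot] = BNK_EMPT
    for bank, status in enumerate(bank_map):
        if status == slot:
            bank_map[bank] = BNK_FREE


bank_map = init_map(total_banks=32, fas_slots=8)   # 2MB REU, 8 slots
bank_map[4] = BNK_FULL                             # an App lives in slot 4
first = alloc_banks(bank_map, owner_slot=4, count=5)
print("first allocated bank:", first)
quit_app(bank_map, 4)
```

Slot numbers and status codes never collide in this model, because slots run from 1 to 31 while the status codes sit at the top of the byte range.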

Storage for RTC driver

There is still quite a bit of room left over in the work bank for future use for caching important system files. One such file that is cached during a fresh boot is the RTC driver.

The reason this is important is because after dropping to BASIC and running some other C64 software, it is entirely possible that the Time of Day clock in the CIA chips will be modified. After a Fast Reboot, C64 OS needs to reconfigure the CIA TOD clock, and it needs to fetch and format the current date and time for C64's internal tracking. Rather than having to go back to the disk and load in the RTC driver, it swaps it in from the REU, performs a standard readrtc operation just like what happens during a fresh boot, and now the clock and date are back in working order.

The Switcher Utility

In order to use Fast App Switching (FAS) it is not necessary to use the Switcher Utility. Fast App Switching happens automatically, and banks automatically manage themselves.

Let's say you're in App Launcher, and you use it to open Gallery. App Launcher itself is automatically frozen in the bank that was assigned to it when it was first loaded in. And Gallery is assigned a bank when it is first opened. From Gallery, when you choose Go Home, Fast App Switching kicks in. Gallery is frozen to its own bank, and App Launcher is very quickly restored from its bank and thawed back into action.

The process of freezing gives the Application an opportunity to prepare to be frozen, for example by pausing music, closing any open files, writing out modified state data, stopping or pausing any timers. When an App is thawed, it gets an opportunity to do the reverse; start any paused timers, reopen files or (in the future) network connections, and so on.

You can bounce between open apps by fast switching back home and then using Homebase to fast switch into the next App. When you run out of Fast App Switching slots, the switcher component automatically expunges the app that was least recently accessed, according to the access order table.

However, it would also be useful to be able to switch directly from any current Application to any other open Application. For this there is the Switcher Utility.

Switcher is just a regular Utility, and by default in v1.05 it has been added to the bottom of the Utilities menu so it can be accessed quickly and easily from within any Application. The keyboard shortcut CONTROL+S is assigned to open it.

Switcher reads in the FAS tables, and presents a list of all the currently open Apps in the order they were most recently accessed. The currently running App, for this reason, is always at the top of the list. When opened, Switcher automatically selects the second App from the top, which is the most recently accessed App prior to the currently running one. This is convenient, because you can just press RETURN to trigger the Switch button and you'll be jumped back to the immediately previous App.

Applications can be clicked on to select them, you can press the cursor up and down keys to move the selection up and down the list, or you can continue to hold CONTROL and press S to move the selection down the list. When the selection is at the bottom of the list CONTROL+S cycles the selection back to the top.

There are, in v1.05, 3 possible states that an App can be in: Running, Frozen or Unsaved. There is only ever one running Application, and it's always the one at the top of the list, which incidentally you can see and interact with behind the Switcher panel itself. A frozen App is not only frozen but has declared to the system, at the time that it was frozen, that it has no unsaved data. If a frozen App finds itself at the bottom of the list, and the user wants to open a new App, but there are no empty FAS slots, the switcher component will auto-purge that frozen App to make room for the new one.

However, an App can also declare that it has unsaved changes. You can see an example in the screenshot of Switcher above, right, where Chess is selected. Chess is an Application that can open a saved game file. After opening the game file, if you make some moves, the game enters an unsaved state. Switching to another App while the Chess game is unsaved results in Switcher showing its state as unsaved. But if you switch back to Chess, choose Save, or press COMMODORE+S to save, Switcher then shows it in just the plain Frozen state.

Quit vs. Kill

There is a Quit button at the bottom left of the Switcher Utility. This can be used to quit any frozen Application. Because the App has declared that it has no unsaved state, the App is not informed when it is quit. Switcher simply clears its FAS slot, marks its slot in the bank allocation map as empty, and marks any of its allocated data banks as free again. That's it, the App is no longer open, and its FAS slot is immediately available for the next App.

The Quit button becomes a Kill button if the selected App is in an unsaved state. If you click Kill it does precisely the same thing as Quit, except that you are made aware that this App will lose some unsaved changes. The switcher component will never auto-kill an unsaved App. It will only ever auto-quit an App that has declared itself safe to be auto-quit.
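The auto-quit rule can be sketched as a small selection function in Python. One detail is an assumption: the text says the switcher never auto-kills an unsaved App, but not what it does when the least recently accessed App is unsaved. This sketch simply walks up the list to the next Frozen App; `pick_auto_quit` and the state strings are hypothetical names.

```python
def pick_auto_quit(order: list[int], states: dict[int, str]):
    """Choose a slot to expunge when no FAS slot is empty.

    `order` lists occupied slots, most recently accessed first.
    `states` maps slot -> "running", "frozen" or "unsaved".
    Only Apps that declared themselves safe ("frozen") are eligible.
    """
    for slot in reversed(order):          # least recently accessed first
        if states.get(slot) == "frozen":
            return slot
    return None                           # nothing may be auto-quit


order = [4, 5, 2, 1, 3]
states = {4: "running", 5: "frozen", 2: "frozen", 1: "unsaved", 3: "unsaved"}
print(pick_auto_quit(order, states))
```

In this example slots 3 and 1 are older but unsaved, so they are skipped and slot 2 is chosen. If every other App were unsaved, the function would return `None` and the decision would be left to the user via the Kill button.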

Drop to READY.

The top of the Switcher Utility shows a label "Drop to" and button "READY." You can now open the Switcher from within any Application and click that button to drop to the READY prompt. By default C64 OS returns the system device to the root directory of the system partition, but if you hold the CONTROL key while dropping to READY, C64 OS sets the drive's default partition and directory path to the place where the Application's currently open file is located.

This can be handy for any number of things. It's nice sometimes to just be able to do some file management tasks from JiffyDOS at the READY prompt. Copy a file, create a new directory, run a tool or other C64 program that's not available directly inside C64 OS, or just to write a quick BASIC program to do some simple but repetitive task.

With Fast Reboot, being able to drop to the READY prompt suddenly makes a lot of sense. When you're ready to return to C64 OS you just run the booter like you normally would:

With JiffyDOS: (Specifying the partition and root directory in the command:)

^1//:c64os

With JiffyDOS: (If already in the right place it's even easier.)

^c64os

Without JiffyDOS: (Specifying whatever device number it's installed on.)

load"1//:c64os",8
run

The booter scarcely begins to load anything. It brings in the reuboot.o component, which sees the C64 OS magic in the REU work bank, and instead of loading the switcher and calling its reuinit routine, it copies the switcher component in from the REU and calls its reurestore routine.

In an instant you are right back in C64 OS, exactly in the Application you were just in, with the Switcher Utility still open, etc. Dropping to READY is now as easy as Fast App Switching to some other App, and rebooting is as fast as Fast App Switching back to the App you just left from. It's pretty great.

REC Shunt

Before showing you the new Usage Utility, I wanted to get a bit technical and talk about what I call an REC Shunt.

In developing v1.05 and Fast App Switching, I learned a lot about how to use the REU. The memory in the 17xx-family of REUs (RAM Expansion Units) is not directly addressable by the 6502/6510 CPU. The 6510 CPU has a 16-bit address bus which can only specify an address from 0 to 65535, i.e., 64KB. The REU has, effectively, a 24-bit address scheme; 3 addressing bytes instead of the CPU's 2 bytes. In order for the CPU to access the memory, though, data has to be moved from the external RAM to the internal RAM or vice versa.

In order to move the data from internal to external memory or vice versa, the REU includes a special controller chip called the REC, or RAM Expansion Controller. The REC chip is mapped into the C64's I/O area, typically at $DFxx, otherwise known as I/O2. Like other chips, such as the SID, the CIAs or the VIC-II, the REC has a handful of registers that can be written to or read from to activate different types of transfers. When a transfer is activated, the REC chip asserts /DMA, the direct memory access line that's available on the expansion port. This line allows a device on the expansion port to temporarily halt the CPU. With the CPU on pause, the VIC-II and the REC begin to share bus access to the main RAM, just the way the VIC-II and the CPU typically share access to main RAM. During the VIC-II's badlines it also has the ability to pause the REC just as it usually pauses the 6510. While the REC has access to the bus, it is able to provide the address on the bus and read or write main RAM directly. This is why DMA means direct memory access, because the REC chip temporarily becomes a kind of special purpose processor that is very efficient at copying memory, far more efficient than the 6510 which is a general purpose processor.

I find most documentation, and often even most sample code, hard to follow. I don't know if that's just me or if it's because the people writing the documentation haven't fully grokked the way the engineers originally designed the registers. An example of this was my discovery of how the 6510 Processor Port works to control the C64's memory mapping. The documentation was just befuddling and seemed impossible to memorize. Once I grokked what was going on, it was like a lightbulb turned on and it became clear and easy, and I only rarely have to refer to the documentation to use it correctly.

In an effort to make the REU easier to understand, the registers of the REC are grouped as follows:

Group	Registers	Notes
Status	$DF00	Read-only, status.
Command	$DF01	Used to start a transfer.
C64 Address	$DF02-$DF03	The C64's 16-bit start address, little endian.
REU Address	$DF04-$DF06	The REU's 24-bit start address, little endian.
Transfer Length	$DF07-$DF08	A 16-bit transfer length, little endian.
Options	$DF09-$DF0A	Special options bit flags.

Although this looks like a 10,000 foot view, and it's missing all the details that are usually given in any documentation (such as this page on codebase64,) it sometimes takes a big picture view to see what's important. And what's important is the block of registers in the middle, and the fact the command register is at the edge of that block:

Command
C64 Address
REU Address
Length

The status register is a separate thing, you can use it if you want, and it's read only. The options flags at the end are useful for doing very specific tricky things. But they also just kind of get in the way and add complication that can mostly be ignored. Just like with the 6510 Processor Port, the best way to tame the beast is to establish a stable operating mode. Anything that needs to deviate from that stable mode is free and welcome to do so, but has to restore the default operating mode when it's finished. Anything that doesn't do that is "broken." Then the situation is much like pushing data to the stack but failing to pop that data back off before the RTS. It predictably leads to a catastrophic crash, and should be considered a bug.

Let us therefore ignore the special options bits and assume that the OS has already configured them for the standard, baseline operational mode.

The fact that the command byte comes at the edge of the block of writable registers is surely not a coincidence. The bits of the 6510 Processor Port used for controlling the memory map also come at an edge, bit 0 and bit 1, which allow for simple INC and DEC operations to climb the memory map ladder without side effects. In the same way, some engineer knew that having the command byte on an edge would be handy, but why it's handy is often overlooked by documentation.

An REC Shunt is what C64 OS calls a simple transfer operation (either to or from the REU, or a swap, they all count as a shunt) that can be universally performed by a standard system call (starting in v1.05) passing it a pointer to a shunt table. A shunt table is a structured block of bytes, otherwise called a struct or a structure. The shunt table contains the same values in the same order as those middle REC registers. The REC Shunt routine is, more or less, just a loop that copies the table in reverse order to the REC, such that the last byte written is the command byte.
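To make the "copy the table in reverse" idea concrete, here is a minimal Python model. It is only a sketch, not the C64 OS routine: the names `rec_shunt` and `write_io` are made up for illustration, the table is assumed to be 8 bytes long (command, 2-byte C64 address, 3-byte REU address, 2-byte length, matching $DF01-$DF08), and the $FF00 handling described further down is left out.

```python
# Model of the REC Shunt copy loop (a sketch, not the real C64 OS code).
REC_BASE = 0xDF01   # command register, the first of the 8 middle registers
TABLE_LEN = 8       # cmd(1) + C64 address(2) + REU address(3) + length(2)

def rec_shunt(table, write_io):
    """Copy an 8-byte shunt table to $DF01-$DF08, last byte first."""
    if len(table) != TABLE_LEN:
        raise ValueError("shunt table must be 8 bytes")
    for offset in range(TABLE_LEN - 1, -1, -1):
        write_io(REC_BASE + offset, table[offset])

# Record the writes instead of touching hardware.
writes = []
table = bytes([0x90, 0x00, 0x20, 0x00, 0x00, 0x00, 0x05, 0x00])
rec_shunt(table, lambda addr, value: writes.append((addr, value)))
```

Because the loop counts down, the length, REU address and C64 address are all in place before the command byte at $DF01 is written, and that final write is what starts the transfer.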

Triggered Transfers

There is one further complication to simple REU transfers. To write to the REC registers, the C64 must have I/O mapped in. But the instant you write the command byte the REC halts the CPU, takes over the bus and begins the transfer. This would ordinarily preclude the ability for the REC to access the RAM beneath I/O, because even when the REC takes over the bus, the PLA is still performing its duty of activating and deactivating different chips based on the address that appears on the address bus.

In order to overcome this problem the REC has a special bit in the command register to make the transfer "deferred" or "triggered" by a later event. That event is a write operation to address $FF00. The idea is that you write all the setup bytes: the length, the REU address, the C64 address, then you write the command byte with the triggered option selected. The transfer does not begin, but everything is prepared and ready to go. You then change the memory map by writing to the 6510's processor port bits 0 and 1, and then you write something to $FF00. The REC sees the write to $FF00 and begins the transfer. You can then change the memory map back after the transfer is complete.
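The ordering of a triggered transfer can be sketched as a toy state machine in Python. This is a model under assumptions, not a hardware description: the bit names `EXECUTE` and `IMMEDIATE` are labels invented here, inferred from the command constants listed later in this post (bit 7 is set in all of them, and bit 4 is set only in the non-triggered ones). No data is actually moved.

```python
EXECUTE = 0b1000_0000     # bit 7: set in every command constant
IMMEDIATE = 0b0001_0000   # bit 4: set = start now, clear = wait for $FF00

class RecModel:
    """Toy model of *when* a transfer starts."""
    def __init__(self, events):
        self.events = events
        self.armed = False

    def write(self, addr, value):
        if addr == 0xDF01 and value & EXECUTE:
            if value & IMMEDIATE:
                self.events.append("transfer")   # starts right away
            else:
                self.armed = True                # prepared, but waiting
        elif addr == 0xFF00 and self.armed:
            self.armed = False
            self.events.append("transfer")       # the deferred start

events = []
rec = RecModel(events)
rec.write(0xDF01, 0b1000_0000)    # triggered C64 to REU: nothing happens yet
events.append("I/O banked out")   # stand-in for changing the processor port
rec.write(0xFF00, 0x00)           # the write that fires the transfer
```

The point of the model is the order of the events list: the memory map change lands between writing the command byte and the transfer itself.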

Although the REU can be used to do fancy things, such as repeatedly writing bytes out to the same address in I/O (for example to stream digital audio sample data to a DAC on the user port,) this is not what an REC Shunt is all about. The REC Shunt is about making it conceptually easier to move blocks of data between REU RAM and the C64's main RAM. The REC Shunt routine is less about careful timing or 100% efficiency (like what is required for a tight demo) and more about ease of use and conceptual simplicity.

The REC Shunt routine always performs a write to $FF00, even if the type of transfer does not require the trigger. However, because $FF00 is an address of RAM where C64 OS stores various things, it is important not to change that memory. If the KERNAL ROM is patched in, a read from $FF00 reads the KERNAL ROM but a write to $FF00 goes into the underlying RAM. Therefore, the REC Shunt routine always patches the KERNAL ROM out, performs a read from RAM from $FF00 then a write to RAM to $FF00 with the value it just read, thus changing nothing, then restores the memory map.
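Why the KERNAL ROM has to be patched out first is easier to see in a small Python model. This is a sketch with made-up names (`Memory`, `trigger`) and placeholder ROM contents; it only models the rule that reads see ROM when it is mapped in, while writes always land in RAM.

```python
class Memory:
    """64KB of RAM with an optional KERNAL ROM overlay at $E000-$FFFF."""
    def __init__(self):
        self.ram = bytearray(0x10000)
        self.rom = bytes([0xEA]) * 0x2000   # placeholder ROM contents
        self.kernal_in = True

    def read(self, addr):
        if self.kernal_in and addr >= 0xE000:
            return self.rom[addr - 0xE000]
        return self.ram[addr]

    def write(self, addr, value):
        self.ram[addr] = value              # writes always go to RAM

def trigger(mem):
    """Write to $FF00 without changing it: ROM out, read, write back."""
    saved = mem.kernal_in
    mem.kernal_in = False
    mem.write(0xFF00, mem.read(0xFF00))
    mem.kernal_in = saved

mem = Memory()
mem.ram[0xFF00] = 0x42
trigger(mem)
safe = mem.ram[0xFF00]                # RAM value survives

mem.write(0xFF00, mem.read(0xFF00))   # naive version, ROM still mapped in
clobbered = mem.ram[0xFF00]           # RAM now holds the ROM byte instead
```

In the naive version the value read comes from ROM but the write goes to RAM, so whatever the OS stored at $FF00 is lost.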

Using an REC Shunt

The following is how it looks in code to use an REC Shunt table and the REC Shunt system routine.

setmagic .block
         #ldxy tab
         jmp recshunt

tab      .byte rc_ctr   ;C64 to REU (not triggered)
         .word magic    ;Source in C64
         .word $0000    ;Destination in REU
         .byte workbank ;REU Bank
         .word 5        ;Length to transfer
         .bend

magic    .text "C64OS"

In the above example, there is a routine called "setmagic" which will write the C64 OS magic string into the first 5 bytes of the REU's bank 0. The entire routine is wrapped in a block/bend so the label "tab" is local to that block and can be used again in other similar blocks.

The table consists of just 5 lines. The command byte, which has been given constants in C64 OS so you don't need to worry about the bits. Followed by a word, which is a label to somewhere in this source code, in this case it's a label to a string of text below the routine. Next is the address within an REU bank as a word. This could be set by a constant, but in this case I've just plugged in $0000. Next is a byte for the REU's bank. In the above example we're using the constant "workbank" which is defined as $00. And lastly, a word for a 16-bit length. I've just plugged in the number 5. Because the 6510 is little endian, the assembler "word" puts the bytes in the order $05 $00, which is correct for the REC registers which are little endian too.
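The byte layout those five lines produce can be checked with a few lines of Python. This is only an illustration: `MAGIC_ADDR` is a hypothetical address standing in for wherever the assembler places the "magic" label, and the command value is the one given for rc_ctr in the command table in this post.

```python
import struct

RC_CTR = 0b1001_0000   # rc_ctr, from the command table in this post
WORKBANK = 0x00        # "workbank" is defined as $00
MAGIC_ADDR = 0x2000    # hypothetical address of the "magic" label

# "<" selects little endian; B is one byte, H is a 16-bit word.
table = struct.pack("<BHHBH", RC_CTR, MAGIC_ADDR, 0x0000, WORKBANK, 5)
```

The result is 8 bytes, and the final word 5 comes out as $05 $00, low byte first, just as the assembler's .word directive emits it.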

The routine uses the macro #ldxy to load a RegPtr to the REC Shunt table, and JMPs to recshunt. This performs the transfer and returns from the setmagic routine in one step.

The following table lists the available command bytes.

Constant	Value	Notes
rc_ctr	%1001 0000	C64 to REU
rc_rtc	%1001 0001	REU to C64
rc_swap	%1001 0010	Swap
rc_tctr	%1000 0000	Triggered C64 to REU
rc_trtc	%1000 0001	Triggered REU to C64
rc_tswap	%1000 0010	Triggered Swap

I don't know about you, but for me these constants are simple and descriptive enough to be memorized. And the order of the values in the shunt table is simple enough to be memorized too, which means I can now write code that moves data into and out of the REU without even needing to consult documentation. It's also much easier to understand code I've written in the past.

It is often the case that some portion of the shunt table will be dynamic. In this case we can use the shunt table constants to address individual bytes in the struct. Like this:

         lda #mapapp
         ldx #2         ;1 page
         jsr pgalloc

         sty reumap_+rsh_cadr+1

         jsr getreumap

         ;... do something...

         jsr setreumap
         rts

getreumap ldx #rc_rtc
         .byte $2c      ;Skip Next 2 Bytes
setreumap ldx #rc_ctr

         stx reumap_+rsh_cmd

         #ldxy reumap_
         jmp recshunt

reumap_  .byte $00
         .word $ff00
         .word reumap
         .byte workbank
         .word $0100

In this code sample, we need to allocate a page of main memory into which we can transfer the 1-page REU bank allocation map. Lines 1, 2 and 3 allocate one page, the page is returned in the Y register. Line 5 writes that allocated page number into the REC Shunt table at offset rsh_cadr+1, that is: REC Shunt C64 Address, and the plus one is because the address is two bytes little endian and we're setting the high byte.

There are then two routines, getreumap and setreumap. Each is an intro to a common routine, they load the X register with the REC Shunt command, rc_rtc or rc_ctr, and pass it through to the common routine. The command byte is written into the shunt table at offset rsh_cmd. Then a RegPtr to the shunt table is loaded and the call to recshunt performs the transfer. Later, code can call getreumap, at any time, to fetch a copy of the REU bank allocation map from the REU into the main memory buffer. Code can then analyze the main memory copy or draw it to screen, or modify it in some way and write it back to the REU with setreumap.
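The `.byte $2c` between the two entry points is the classic 6502 "BIT skip" trick: $2C is the opcode for BIT absolute, which takes the next two bytes as its operand, so the second LDX is swallowed when execution arrives from getreumap. A tiny Python interpreter for just the three opcodes involved shows the effect. It is a sketch only: `CMD_ADDR` is a hypothetical address standing in for reumap_+rsh_cmd, and BIT's effect on the processor flags is ignored.

```python
RC_CTR, RC_RTC = 0x90, 0x91   # values from the command table
CMD_ADDR = 0xC000             # hypothetical address of reumap_+rsh_cmd

# getreumap is at offset 0, setreumap at offset 3.
code = bytes([
    0xA2, RC_RTC,                          # getreumap: LDX #rc_rtc
    0x2C,                                  # BIT abs: eats the next 2 bytes
    0xA2, RC_CTR,                          # setreumap: LDX #rc_ctr
    0x8E, CMD_ADDR & 0xFF, CMD_ADDR >> 8,  # STX reumap_+rsh_cmd
])

def run(entry):
    """Interpret LDX #imm, BIT abs and STX abs; return the byte stored."""
    pc, x = entry, 0
    while True:
        op = code[pc]
        if op == 0xA2:        # LDX immediate, 2 bytes
            x = code[pc + 1]
            pc += 2
        elif op == 0x2C:      # BIT absolute, 3 bytes; X is untouched
            pc += 3
        elif op == 0x8E:      # STX absolute: this is the command stored
            return x
        else:
            raise ValueError(f"unexpected opcode ${op:02X}")

from_get = run(0)   # enters at getreumap
from_set = run(3)   # enters at setreumap
```

Entering at getreumap stores rc_rtc, entering at setreumap stores rc_ctr, and the two entry points cost one extra byte instead of a three-byte JMP.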

For me anyway, the "REC Shunt" technique makes working with the REU feel much more intuitive. The switcher component, the Switcher Utility and the new Usage Utility all make liberal use of REC Shunts.

The New Usage Utility

Very early on in the development of C64 OS I wrote the Memory Utility. It along with Peek were useful for debugging. I used the Memory Utility to see which pages had been allocated, and used Peek to look at the contents of those pages while C64 OS was still running.

Memory Utility was written before I even began to develop Toolkit. It used a combination of a static template, with manual manipulation of the Utility's layer buffer, plus all the mouse events were computed manually. The resultant code was rather nasty. However, it worked and it got the job done, so it stayed pretty much as is through the 1.0 release and on up until I finally rewrote it for v1.05.

Actually, the original Memory Utility is still available, and you can even choose (using the Configure Tool) to open it by double-clicking the available memory indicator in the status bar. But the Configure Tool now also offers the Usage Utility as the default.

Usage was written from scratch and built with a Toolkit UI. The main UI consists now of a TKTabs view with 3 tabs: RAM, REU and CPU, plus a couple of controls for auto and manual reload options in the footer. Each of the three tabs is filled with an instance of three new Toolkit classes: TKRAMU, TKREUU and TKCPUU, for RAM Usage, REU Usage and CPU Usage. The whole Utility is now twice as wide as Memory was. In Memory I crammed the display of 256 pages of memory into just 16 columns by 8 rows, or 128 character cells. Only every odd page actually shows its allocation status in the map, although, if you double-click the left half of a cell it opens the odd-numbered page in Peek, but if you double-click the right half of the cell you get the even-numbered page. The new Usage Utility eschews this weird and early design decision in favor of showing every single page with its own character cell, even though it takes up more screen real estate.

Usage Utility - RAM

The TKRAMU class is an example of the composite or composition design pattern. ¹ It directly subclasses TKView, but when it is initialized it instantiates a TKButton, sets its title text, configures its size and anchoring properties, then the TKRAMU appends the button to itself. Finally, it assigns the button's target to call one of TKRAMU's own methods. This is a very object-oriented technique, and putting it to use makes me very happy.
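For readers who don't know the pattern, the shape of it can be sketched in Python. None of this is Toolkit's API: `View`, `Button` and `RamUsageView` are stand-ins invented for this illustration, and the real classes are written in 6502 assembly.

```python
class View:
    """Stand-in for a generic view that can hold child views."""
    def __init__(self):
        self.children = []

    def append(self, child):
        self.children.append(child)

class Button(View):
    """Stand-in button with a title and a target to call when clicked."""
    def __init__(self, title):
        super().__init__()
        self.title = title
        self.target = None

    def click(self):
        if self.target is not None:
            self.target()

class RamUsageView(View):
    """Creates its own button on init and points it at its own method."""
    def __init__(self):
        super().__init__()
        self.selected_page = 0
        self.opened = []              # pages "opened in Peek"
        button = Button("Peek")
        button.target = self.open_peek
        self.append(button)

    def open_peek(self):
        self.opened.append(self.selected_page)

view = RamUsageView()
view.selected_page = 0x40
view.children[0].click()   # the button calls back into its parent view
```

Code that uses the composite view never has to know the button exists; the view builds, owns and responds to it internally.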

You can click any individual page of memory and the selected page is indicated with a checkmark. When you click the Peek button it opens the Peek Utility to the selected memory page.

Usage Utility - REU

The main impetus for replacing the older Memory Utility with the new Usage Utility is that I wanted to have room to show the REU's bank allocation map. Prior to v1.05 there was no REU bank allocation. Starting in v1.05 the memory.lib (which extends the routines in the memory KERNAL module) now has bank allocation routines; bkalloc and bkfree are near perfect parallels to the pgalloc and pgfree KERNAL routines.

In the main memory page allocation map, each page is assigned a type of allocation indicating who owns it. However, there really are only a few possible owners. Workspace memory is statically allocated. The system (such as the KERNAL, drivers, and some libraries) makes allocations, the Application makes allocations for itself, and a Utility can make allocations for itself. But there can only be one Application running, and only one Utility at a time.

In the REU bank allocation map, things are a little different. The workbank is statically allocated, just like main memory's workspace, at the bottom of memory (bank 0 of the REU). But after that, there can be many different Applications, up to 30, all open at the same time and frozen in different Fast App Switching slots (one slot equals one bank). There can be values in the map for banks that are unavailable. This is impossible in the paged memory map, because every C64 comes with 64KB of memory, whereas the number of banks in an REU can vary. Some banks are set aside for Fast App Switching slots, and these can be either full or empty. And the rest of the banks begin their life marked as free, but any Application in any of the Fast App Switching slots can allocate a data bank.

It doesn't make sense in the map's key to show a different color for each possible allocation type, since there could be up to 30 different App allocations. Instead, when you click on a Fast App Switch slot the checkmark appears on that bank, and pluses appear above each of the data banks that has been allocated by the selected App bank. And these don't have to be contiguous either. Just as pages in the paged allocation map can be interleaved with different owners, data banks can be interleaved with different owners. Conversely, if you click on a data bank, it behaves as though you had just clicked the App bank that owns that data bank; the checkmark appears on the App bank and all the allocated data banks (including the one you just clicked) get pluses.

In the Switcher Utility, when you select any open App it shows how much memory the App is using. This memory usage indicator is precise to increments of 64KB. Every App uses a minimum of 64KB, which is the 64KB of main memory or the Fast App Switching slot itself. Each additional data bank that it allocates adds another 64KB to the usage indicator.
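The selection behavior in the REU tab and the usage figure in the Switcher can both be modeled in a few lines of Python. This is a sketch with made-up data: the bank numbers, the `data_owner` dictionary and the function names are hypothetical, and the real allocation map is a one-page table in the REU, not a dictionary.

```python
BANK_KB = 64

# Hypothetical map: data bank number -> App slot bank that owns it.
data_owner = {12: 3, 13: 5, 14: 3, 20: 5}
app_slots = {3, 5}

def select(bank):
    """Return (checkmarked App bank, data banks that get pluses)."""
    app = bank if bank in app_slots else data_owner.get(bank)
    if app is None:
        return None, []
    owned = sorted(b for b, owner in data_owner.items() if owner == app)
    return app, owned

def usage_kb(app):
    """64KB for the slot itself plus 64KB per allocated data bank."""
    return BANK_KB * (1 + len(select(app)[1]))
```

Clicking App bank 3 or either of its data banks (12 or 14) gives the same selection, the owned banks need not be contiguous, and an App with two data banks reports 192KB.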

Usage Utility - CPU

This last tab is mostly just for fun. I think it looks cool. It's not hugely useful, but it is fun!

A lot of people have asked me how this works. I gave a description of it to some interested folks in the C64 OS Community Support Discord server, so I'm going to just repeat that here with some light editing.

The ISR (interrupt service routine) runs every 60th of a second (50th of a second on PAL). The ISR increments a counter every time it runs. So, after one second the counter would reach 60. After 4.25 seconds the counter would reach 255. The ISR does not wrap the counter: if it reaches 255, it doesn't roll over to 0, it just stays at 255.

At the same time, the ISR has a second process that it does every 1 second. Every one second, it shifts a set of 10 bytes, and shifts that counter byte on to the end of that set.

The trick is that the main event loop ZEROs the counter. When nothing else is going on, the main event loop runs like 2000 times (I haven't done the math) between every time the ISR runs once. So the ISR never gets a chance to increment that counter beyond 1. It sets it to one, and the main event loop instantly sets it back to 0.

Except for when the main event loop gets an event. It delivers the event to the App, and the App processes it through Toolkit, and a view gets marked dirty, and then the main event loop does a draw cycle at the end... all of that takes some amount of time. During that time the ISR is incrementing the counter, and the Main Event Loop isn't zeroing, and so all of a sudden, if processing the event and redrawing the screen take 0.25 seconds, then the ISR has incremented the counter to 15. (i.e., 60 * 0.25)

It's that count of 15 that gets pushed onto the set of 10 bytes. So the counter is counting up and getting zeroed and counting up and getting zeroed, and counting up and getting zeroed, continuously, and the other process is just sampling that number at regular 1 second intervals, and keeping a history of the last 10 samples. Is this an accurate way to measure CPU usage? I have no idea. I'm not a computer scientist. But the result does look good.
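Here is the same mechanism as a small Python simulation. It is a sketch, not the actual ISR: the class and method names are invented, and it assumes the once-per-second sample is taken right after the increment.

```python
class UsageSampler:
    TICKS_PER_SECOND = 60   # 50 on PAL

    def __init__(self):
        self.counter = 0
        self.ticks = 0
        self.history = [0] * 10

    def isr(self):
        if self.counter < 255:          # saturate at 255, never wrap
            self.counter += 1
        self.ticks += 1
        if self.ticks == self.TICKS_PER_SECOND:
            self.ticks = 0              # once a second: shift the history
            self.history = self.history[1:] + [self.counter]

    def main_loop_pass(self):
        self.counter = 0                # an idle main loop keeps zeroing it

s = UsageSampler()
for tick in range(60):                  # an idle second
    s.isr()
    s.main_loop_pass()
for tick in range(60):                  # busy for the last 0.25 seconds
    s.isr()
    if tick < 45:
        s.main_loop_pass()

stuck = UsageSampler()
for _ in range(300):                    # main loop never runs at all
    stuck.isr()
```

The idle second is sampled as 1, the second that ends with 15 ticks of uninterrupted work is sampled as 15, and a main loop that never returns pins the counter at 255.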

The scale of the visualizer is ... some mathy term I don't know. I took a base number 1, and multiplied it by some multiple like 1.58, and then again and again and again such that you get a range from 1 to 255 across 13 segments.

Segment	Running Total	~ Rounded
1	1	1
2	1.58	2
3	2.496	3
4	3.944	4
5	6.232	6
6	9.847	10
7	15.558	16
8	24.581	25
9	38.838	39
10	61.364	61
11	96.955	97
12	153.189	153
13	242.039	242

Starting at 1, I multiply that by 1.58, then that by 1.58, and so on such that 13 segments gets up to 242. Then I put those numbers in a table in the code. So, if the count is from 10 to 15, the bar gets drawn 6 high. If the count is 39 to 60, the bar gets drawn 9 high. And so on. (I don't think I used 1.58 exactly... but something close to that.) This gives an accuracy bias in the visualization to the smaller numbers. For example, the extra single character height of a bar going from 12 to 13 high could be the result of a count increase of anywhere from 1 to 89 (i.e., 153 to 242). Whereas increments of count from 1 to 89 have 10 different bar heights at the low end of the scale.

The reason I did it like this, rather than, say, just dividing 255 by 13 (255/13 = ~19.6) and then saying, okay let's just add one more character to the bar height for every time 20 more is added to the count, is purely a matter of real-world results. I tried the linear approach first, because it seemed the most logical. But in practice, you just never see the bars rise up, so it's very boring. If you try really really hard and you shake windows and drag scrollbars and resize text wrapping views, you occasionally get one bar that pops up to a height of 2 characters. If you think about it, that would mean that the 2-high bar could have a count value between 20 and 40. Anything less than 20 would remain always at 1-high. But under this new non-linear scale, a range of counts from 1 to 40 gets bar heights from 1 to 9. That's much more exciting!
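Both scales are easy to compare in Python. This is a sketch: the thresholds are the rounded values from the table above (the table in the real code may differ slightly, since 1.58 is only approximate), and the function names are made up.

```python
from bisect import bisect_right

# Rounded thresholds from the table above, one per bar segment.
THRESHOLDS = [1, 2, 3, 4, 6, 10, 16, 25, 39, 61, 97, 153, 242]

def bar_height(count):
    """Non-linear scale: how many thresholds the count has reached."""
    return bisect_right(THRESHOLDS, count)

def linear_height(count):
    """The rejected linear scale: one more cell for every 20 counts."""
    return min(13, count // 20 + 1)

# The running totals are just successive powers of 1.58.
running = [1.58 ** n for n in range(13)]

for count in (1, 15, 40):
    print(count, bar_height(count), linear_height(count))
```

A count of 15 (a quarter of a second of work) already draws a bar 6 high on the non-linear scale, while on the linear scale it is still stuck at 1.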

Fast App Switching in practice

In practice, Fast App Switching is... really damn fast. In some cases, stunningly fast. In some cases, an Application saves out its current state to disk (just in case it gets expunged from memory while it's frozen in the background.) This write to disk makes the switch less than instantaneous, but it's still massively faster. And for any App that doesn't have to hit the disk at all upon freezing, the switch happens in the blink of an eye.

For example, observe how quickly Image Viewer gets banked out and File Manager gets banked in, you see that around time index 0:48. It's very very fast. See again at time index 0:58 when File Manager switches back to Image Viewer. It's for all intents and purposes instantaneous. Now, in that latter case, the App switch is instant, but the App then gets a message from File Manager telling it to load a new file. The graphic data from the new file is obviously not yet loaded, so the App switch is immediate, but the data load from disk takes a couple of seconds.

Nonetheless, it's pretty damn fast.

More detail about how this works, and some technical stuff you have to be aware of to make sure your Applications and Utilities play nice with Fast App Switching will be added to the C64 OS Programmer's Guide.

Remember, if you are a licensed C64 OS user, you can download the system updates for free from the Official Software Updates page on c64os.com. Just make sure to read the instructions for how to properly update your system. I.e., one point version at a time.

And, we're off to demo C64 OS v1.05 plus all the other new features introduced since the v1.0 release last year, at World of Commodore 2023 which is this weekend. Hopefully I'll see you there in person! But if not, you can check YouTube for videos of the presentations.

¹ I learned a lot about different object-oriented design patterns by reading these old Cocoa fundamentals documents from Apple's now-archived developer library: Cocoa Design Patterns. Cocoa is the descendant of NextStep, and the C64 OS Toolkit is based (very loosely) on OpenStep, the open specification of the NextStep object hierarchy.

Greg Naçu — C64OS.com