NEWS, EDITORIALS, REFERENCE

Subscribe to C64OS.com with your favorite RSS Reader
February 5, 2018#54 Technical Deep Dive

KERNAL, File Refs and Services

Post Archive Icon

I want to talk about File References and the services provided to the developer by the File module in C64 OS. But to do that, we'll have to first take a brief tour through the IEC serial bus, the KERNAL and the C64's native I/O architecture.

To Rewrite or Not To Rewrite

First things first. C64 OS does not patch out the KERNAL ROM, it makes use of it for many low-level routines. Some of those routines include file handling. Many people who write an Operating System for the C64 make it one of their first and top priorities to rewrite the low-level file access routines. But I definitely don't want to do that. Why not?

Several reasons:

  1. C64 OS is single tasking
  2. The C64 is already short on RAM
  3. Writing serial routines is hard
  4. JiffyDOS is already very fast
  5. C64 OS wants to be compatible and agnostic

If I can be allowed the pleasure of describing those points in prose, the bottom line is that writing serial routines is hard, and serial routines already exist in the KERNAL ROM. The RAM beneath the KERNAL's addressing space is still available to the machine,1 so anything you can make use of from the KERNAL, even if you only use 20% of the KERNAL, that's 20% of 8K. That's 1.6K of memory you can spend implementing something else that the KERNAL doesn't do.

Block diagram of memory, with I/O, Character, BASIC and KERNAL ROM overlays.
Compute's 1st Book of C64: Block diagram of memory with I/O and ROM overlays.

Furthermore, the reasons people write their own file routines usually fall into a couple of categories. They want serial access to be faster, they're targeting a specific common device such as the 1541, or they're doing something more sophisticated than the KERNAL's routines can handle. These special needs might arise as a result of the the OS you're writing being more sophisticated, such as by supporting multi-tasking. C64 OS is single-tasking, that was a decision I made very early.

I want C64 OS to be able to use any type of device, IDE64, SD2IEC, 1541 Ultimate, 1541/71/81, CMD FD/HD/RL, etc. JiffyDOS, which if you don't have, you should get, already has routines that have been carefully thought out to get a good balance of greatly improved speed without sacrificing compatibility. I had a discussion with Jim Brain about how JiffyDOS's speed loading routines work, relative to how the custom speed loaders in some games work. Yes, some custom speed loaders can get more speed than JiffyDOS, but some of them also require you to have only one drive on the IEC bus, or will only work with a 1541 or 100% compatible clone. Those limitations might make sense for a demo or a game, but they are the sorts of tradeoffs that I would never make for a platform like C64 OS.

Check out this speed comparison between JiffyDOS and the original KERNAL rom. It's pretty stark.

 

So, C64 OS uses the KERNAL's routines. And if you want speed, replace the stock KERNAL ROM with a JiffyDOS KERNAL ROM.

The IEC Serial Bus

Let's talk about the C64's serial bus for a moment, so we have some idea of what any software routine is actually working with.

The word serial in serial bus means it sends one bit at a time along a single data line. But you can have many devices hooked up to the bus all at the same time. And somehow the data gets to and from all the right places. How does that work?

My understanding of how this works has evolved approximately like this: When I was I kid I was happy to believe that it was pure magic. By the time I was in my 20s I knew enough to be thoroughly confused because I could imagine the limits of a single data line (plus a couple of signal lines), but I wasn't creative enough to imagine the solution. Now, by the time I'm part way through my 30s, and have rekindled my interest in all things Commodore, I've taken the time to read up on the hardware and understand as much of the code as necessary to finally get the gist of how it actually works.

In digital electronics, one line is one bit. So a serial bus is effectively a 1-bit bus. Now, you can read or write 8 bits in a row to transfer a byte-sized package, but where is that byte going to go? Who is it addressed to? How does it get there? If you're writing a byte, how does that byte get to the destination device but not to any other device? These were some of the questions I had, up until quite recently, that I just didn't know the answers to. I'm not suddenly a serial bus guru but at least I have a pretty good understanding now of how this generally works.

There is a protocol, that is to say, an order to the rise and fall of the data and signal lines, and an interpretation of the bytes that are transmitted at certain stages of communication, which is agreed upon by the software in the computer and the software running on each of the serial devices. The C64 is the master device on the bus, so it organizes who is allowed to talk on the bus and who should be listening.

Besides the data line, there are a few signal lines: Clock, Attention and Reset. The data and signal lines are electrically continuous all the way from the computer through each daisy chained device. What does that mean in practice? It means that if the data line goes high, every device on the bus knows that it is high. And this is a hardware quality of the bus, it isn't controlled by software, each line is just one continuous electrical wire along which the various devices are connected. The question then is, how does the computer direct the data to just one of those devices?

IEC Serial Port PINOUT
IEC Serial Port PINOUT, from IEC disected.

Every device on the bus must be manually assigned a unique address. This is done by the user, with DIP switches or cutting or joining traces on the device's logic board.2 These range from 4 to 30. The first for storage devices is 8, hence the famous load "*",8. The computer uses the attention line to tell every device it's going to issue a control byte. In the control byte, some of the bits represent a command: TALK, UNTALK, LISTEN, UNLISTEN, and some others. With the TALK and LISTEN commands, the other bits are used to represent the device number. Because the data and signal lines are electrically continuous along the whole bus, every device always sees all the information on the bus. But, if the computer begins with a control byte that is a combination of the command LISTEN and the device #12, then every device that knows its own address is not 12 ignores all subsequent data, until the next control byte is sent. The device that has address 12 starts paying attention to all subsequent data bytes because it knows the data on the bus is meant for it.

  • TALK tells a device that it should transmit data to the computer.
  • UNTALK tells the device that is TALKing that it should cease transmitting data.
  • LISTEN tells a device that it should receive data from the computer.
  • UNLISTEN tells the device that is LISTENing that it should stop listening.

There is a bit more to it, but that is the general idea of the protocal in a super tiny nutshell.3

What the KERNAL does

What does the KERNAL do? First I'll talk about what the KERNAL does, then I'll talk about what the KERNAL does not do, because trust me, there is a lot of stuff that the KERNAL does not do.

The KERNAL divides the labor into (roughly) two layers. And thank goodness it divides it at all, because this division is what allows storage devices that are not physically connected to the serial bus to work more or less as if they were. (Devices like IDE64, RamLink, an REU with RamDOS, and so on, are only connected via the expansion port). The first layer is a set of 8 low-level routines for working directly with the serial bus. These are as follows:

There are a couple of others. Some routines, which programmers have taken to using, do not have official jump table entries. But, this is the gist.
  • TALK — Command a serial bus device to TALK
  • TKSA — Send a secondary address to a serial bus device that has been commanded to TALK
  • ACPTR — Retrieve a byte from a device that is TALKing on the serial bus
  • UNTLK — Send the UNTALK command to the serial bus

  • LISTEN — Command a serial bus device to LISTEN
  • SECOND — Send a secondary address to a serial bus device that has been commanded to LISTEN
  • CIOUT — Send a byte to a device that is LISTENing on the serial bus
  • UNLSN — Send the UNLISTEN command to the serial bus

As you can see, these routines are very low-level. They mirror the essential hardware protocol of the IEC serial bus itself. But they do a lot of heavy lifting for you, by communicating with the CIA #2 chip to manipulate the DATA, CLK and ATN lines to send and receive one bit at a time to and from the serial bus.

Schematic fragment, showing CIA 2 connecting to the IEC serial port.
Stripped down segment of the C64 Schematics. Shows Port A of CIA 2 driving the Serial Port.

The second layer is a channel-based layer of abstraction above the hardware level routines. What does that mean? The designers at Commodore, possibly getting their ideas from some other reference design, had in mind a certain native way that the machine should work. It's actually quite clever, I hope you'll agree, but at the same time it is very foreign to the way modern computers work. Considering the hardware limitations, though, the more I think about how it works the more I like it in concept. The only problem is that, today, no one wants to use their computer like this. So let's get into it.

The idea is that the user should be able to use generic routines to read data from any input device, and be able to stream that data to any output device. It sounds pretty great. This works by opening channels to device numbers. Then the KERNAL's routines, which can be accessed by BASIC4, are used to pipe data from any input channel through to any output channel.

The quintessential input/output device is the disk drive, or other file storage device. And that is what we're talking about in this post. But, to understand the architecture we really need to take a step back and look at the bigger picture. Every device is treated like a generic I/O device, not just hard drives and disk drives. Here is what the KERNAL supports, and their corresponding device numbers:

  1. Keyboard (Input only)
  2. Tape (Input & Output)
  3. RS232/Modem (Input & Output)
  4. Screen (Input & Output)
  5. Printer 1 (Output only)
  6. Printer 2 (Output only)
  7. Plotter 1 (Output only)
  8. Plotter 2 (Output only)
  9. Disk 1 (Input & Output)
  10. ... 30. Up to 22 more serial storage devices

Some devices are input only, and some devices are output only. Obviously, you can't output data to the keyboard, just as clearly as you can't read data in from a printer. But you can input from the keyboard and you can output to a printer.

In another post I went deep on how the keyboard works. As you type, the interrupt service routine is scanning the keyboard 60 times a second, and converting key presses into PETSCII characters, which it puts into a 10 byte buffer in workspace memory starting at $0277. So while one routine is dealing with the low-level nature of the keyboard, and is populating a generic buffer in memory, another set of routines can completely ignore the technicals of how that buffer gets populated, and just read data out of it.

A printer, on the other hand, is on the serial bus. It requires a completely different set of low-level routines for sending data to it, probably involving those mentioned above. But at a higher level the KERNAL supports directing any input source to any output destination. And so with only a few commands you can make your C64 behave just like a typewriter. You press a key on the keyboard, and instead of the character going to the screen, it goes to a printer and out onto paper. Sure, it's always possible to get a computer to do this, but on a Commodore 8-Bit one can do this with remarkably little code.

Let's look at another example. The Screen itself is abstracted into an I/O device via the routines that implement the screen editor. While the VIC-II can be configured to see a certain range of memory for the 1000 characters of its text mode, routines in the KERNAL abstract this too. The screen editor code treats the range of memory, that gets rendered by the VIC chip to the physical screen, as one long rolling buffer. The cursor is an abstract notion. It is maintained by two variables in workspace memory that represent the current column and row. When you output a byte to the screen device, the low-level screen editor routines kick into action. First its PETSCII value is analyzed to see if it's a control code. A control code may change the state of the screen's modes or move the cursor, but otherwise, if it is a printable character it gets converted to a screen code according to the current screen modes and put into memory where ever the cursor offsets indicate it should go. Then the cursor is advanced by incrementing and rolling over the column and/or row variables. When the cursor rolls off the end of a row, the column index is reset to 0, and the row index is incremented. But when the cursor moves off the end of the last row, the KERNAL shifts every row in screen memory up and off the top of the physical screen, and out of screen memory into nowhere, and the cursor gets repositioned appropriately. These low-level routines allow the screen to behave like a generic I/O device.

In fact, this is how the C64 is configured when it is first turned on. The keyboard is the default input device, and the screen is the default output device. Every time you press a key, the generic piping routines read a byte from the keyboard device and write a byte to the screen device. That's pretty slick! And any combination of two devices can be wired together this way. This is why it is so easy to open a SEQ (aka text) file from disk, and output its contents straight to screen. It can just as easily be used to pipe a text file from disk straight to printer, or from keyboard straight to printer, as we saw before, or from keyboard straight to disk. It's really very clever.

However, it isn't just a direct one-to-one, device to device. There is another layer of abstraction that comes into play; logical files. The KERNAL maintains a table in workspace memory that holds the relationship between Logical File Numbers, Device Numbers, and Secondary Addresses. The table is 30 bytes long. Three ranges of 10 bytes each, to support 10 open channels. Each of which is more or less just a data stream to an I/O device.

  • $259 — $262 Ten Logical File Numbers
  • $263 — $26C Ten Device Numbers
  • $26D — $276 Ten Secondary Addresses

You assign a logical file number to a channel being opened on a device. When you pipe data, it doesn't go from device to device, but from logical file number to logical file number. Some examples of how this works, and how the secondary address is used with different devices can be seen here.


Right, so, that's what the KERNAL does, again in a tiny nutshell. I really do think it's pretty cool. It let's you do so many neat things in such a streamlined and simple way. It reminds me a lot of the power of Unix piping and treating everything like a file. It's neat right? You can stream data from a modem straight to a file on disk. That's cool.

There are several problems though, or should I say, limitations. The biggest problem is that, well, quite frankly, no one really wants to use their computer like this anymore. It is super neat that with little effort you can turn your sophisticated 1MHz 6510, 64K of RAM, computer into an electronic typewriter. But who the hell actually wants the functionality of a typewriter anymore? It's also pretty nifty that you can dump the contents of a text file to a rolling screen buffer, but, if you do that it will roll the whole screen, whether you want it to do that or not. And if you want to scroll backwards through the file you are completely screwed. Because streams can't really be interacted with, they just stream out of one thing and linearly into the other. You can slow down the stream or pause it, but that's it. (By the way, that's what the scroll-lock key on PC keyboards was traditionally for.)

So what does the KERNAL actually know about files? Believe it or not, it knows next to nothing. The KERNAL knows that devices in the range from 8 to 30 are storage I/O devices on the serial bus. The secondary address is a concept that the KERNAL knows about. It's a number that lets the computer signal some very basic intentions to the storage device. 0 for binary load, 1 for binary save, 2 through 14 for byte-by-byte sequential read or write, and 15 for sending commands. If more than one file is opened with secondary addresses from 2 through 14, that number is used for the computer and the device to know which of the concurrently open streams is being dealt with.

To open a file then, the KERNAL really only has two routines: SETLFS and SETNAM. The first sets the Logical File Number, Device Number, and Secondary Address. These three values will be put into the table described earlier, such that, given the logical file number of a previously opened channel, the KERNAL is able to recover the corresponding device number and secondary address. SETNAM takes a pointer to a string in memory and its length. The inner structure of that string, though, is something that the KERNAL knows nothing about. It is merely a command of some sort that will be sent blindly to the device.

When you call OPEN, it uses the low-level serial routines, such as TALK, TKSA and so on, to pass the command string to the device. Only the remote device itself has to understand what that command means. It can include a partition/mechanism number, a path (absolute or relative), a filename, a file type, read or write flags, an append flag, an overwrite command, and more. Then, depending on how the command string has prepared the device, the connection is open and the KERNAL can read (or write) data from the device, without having any clue what it is actually reading. Is it reading the contents of a sequential file? Is it reading part of a record from a relative file? Is it reading a directory listing? Or is it reading data from the Realtime Clock, or is it the DOS version of the device? It doesn't know and it doesn't care.

The KERNAL knows almost nothing about files, with only a couple of minor exceptions. Such as having the ability to load a program file straight into memory to be run.

As I've discussed scattered across posts from the past, it's useful that the KERNAL is so generic about files. The KERNAL is truly not a "Disk Operating System." Without a disk drive, a C64 has no disk operating system. It does not know how to manage or format disks, nor how to organize files nor do any file management. All of that work is offloaded into the DOS that is running on the storage devices themselves.

The C64 KERNAL ROM, above a 1541 from the 80's, a CMD HD from the 90's and an SD2IEC from the 2000's
One KERNAL ROM, many eras of storage device

This allows the C64's own KERNAL to be quite small, and yet still work with newer storage devices, such as SD2IEC. Those devices abstract away from the KERNAL exactly how it is that the files are stored, organized, renamed, copied, moved, fetched and so on, from the physical media. A C64 itself, today, does not know or see any difference between a 1541 floppy drive with 166K of storage, a 4 Gigabyte SCSI-based CMD HD, and a 128 Gigabyte SD Card with a Microsoft FAT32 file system. The KERNAL accesses all of them exactly the same way. Meanwhile it's just 8K of code and the only significant change it's undergone in the last 35 years is when we updated our systems to use JiffyDOS.

What the KERNAL does not do

Above I gave a fairly detailed description of what the KERNAL does, and how in many ways it is quite advantageous for the C64. 35 years later, the KERNAL still works well, for communicating with even modern storage devices.

But, there are in fact some pretty big changes between a 166 kilobyte floppy, which does not support subdirectories, and a 128 gigabyte SD card that supports partitions, subdirectories, and also track-&-sector-emulating disk images.

The KERNAL supports 10 open files, simultaneously. After that it starts throwing errors, because its correlation table of Logical File Number to Device Number and Secondary Address is just 30 bytes total, 10 entries of 3 bytes each. But, this does not mean you can open 10 files on your 1541 at the same time. The 1541's internal RAM and DOS have their limits too. Depending on how you're opening the files, and what type of files they are, the number of files that can be open at the same time on a single device varies from around 4 or 5 down to just 1. The KERNAL's support for 10 can still be reached though, by opening a few files each across multiple devices.

If you have two files open, on two different devices, it makes sense how they are distinguished and accessed separately. The Logical File Numbers correlate to two different device numbers, and each device is aware of just its own one open file. What about when you open two files on the same device? That's what the Secondary Address is for. Both Logical File Numbers are correlated to the same device number, but to two different Secondary Addresses. When the KERNAL asks the device to TALK it must also give it the Secondary Address. The device itself knows about its own two open files, and each is connected to a unique Secondary Address.

When you open a file on, say, a CMD HD, the CMD HD uses its own internal RAM to hold all the information it needs to work with that file. As long as that file is open, it is associated with a Secondary Address, and the in-device representaiton knows exactly where on disk the file comes from, and maintains a cursor into that file for reading or writing.

A device like the CMD HD supports multiple partitions, and in Native partitions it supports nested subdirectories. It also, conveniently, remembers the current partition, and within each, that partition's current directory. This is very convenient. When you ask the device for a directory, load "$",8, you don't have to keep supplying the partition number and full path, it just gives you the directory of the current default location. All file related commands are relative to the default location, unless an alternative partition and/or path are provided. Each directory looks just like the directory of a 1541, as far as the C64 is concerned. To change directory, or to change partition, commands can be issued to the device via channel 15. These merely set new partition and path defaults, such that when you request a directory, the directory from the new current defaults are returned. That makes sense.

So, now let's imagine a typical scenario. You change the default partition to 2. (@cp2). Then you switch the default subdirectory, (@cd//os/services/app launcher). Then you open a file by name and assign it a secondary address 2. That file is looked up, by its name, in the current partition and current subdirectory. You read some bytes from that file but don't close it. Next, you change the device's default partition to 3, (@cp3), and go into some subdirectories there. (@cd//programming/documentation). Then you open a second file, by name and assign it a secondary address 3. This file is looked up by its name, relative to the new current partition and path. Each of these open files has a different Secondary Address (2 and 3) that the device uses to distinguish between them. And the computer can toggle between them by issuing low-level commands to the device to switch between the Secondary Addresses.

When the computer issues commands to the CMD HD to change Secondary Address, to read more data from one of the open files, the current partition and the current path have no bearing on continuing to read from that file, which itself was opened from a partition and path which are no longer the current defaults. The CMD HD has maintained enough information to know exactly where on disk the next byte of that file will come from.5 The current partition, that partition's current directory, and even the filename, are totally irrelevant at that point. The device knows what partition the open file is in, and the directory and filename don't matter, because the in memory reference maintains a cursor for the physical track, sector, and offset into the sector of where the next data for this file comes from. While a file is open, its name can be changed, as this merely changes the directory entry that points to the first block of the chain of blocks that make up the file. With a bit more work its directory entry could even be moved to a different directory. This makes no difference to where the file's physical, on-disk, data comes from.

But. And this is a big but. The KERNAL immediately forgets almost everything about an open file. The KERNAL doesn't maintain the filename, nor does it know the partition number, nor the path on the device. It doesn't even know what kind of device it's talking to. It can only look up the device number and the secondary address and that's it.

That's fine as long as you keep that file open. But remember, the KERNAL can only keep 10 files open, and each device can only have open far fewer than that. Why would this matter? Well, let's take a look at my Mac. Right now, the Finder alone has 145 files open. Safari has 197 files open. And BBEdit has 98 files open (even though I'm only editing this single document). Modern computers, with their many running tasks, have hundreds if not thousands of files open, at the same time. Is that insane? Yes! It's a bit insane. Will C64 OS have that many files open? No, it will not. And we don't want to lose the simplicity of a system that is useful without needing to work with that many files at the same time. But, the principle should be evident.

In order to accomplish great things, an application and the system as a whole should be able to work with files almost seamlessly, and without direct user interaction. Writing a short file, a handful of bytes perhaps, only takes a fraction of a second. A few examples of what might access files automatically come readily to mind: Writing changed settings back to a file in the application bundle. Appending some data to a system log file. Writing a copied selection of text to a clip file in the system folder. Reading a folder-display-settings file when navigating the file system through a file manager. Loading file metadata on demand as the user moves a selection focus. Loading in some graphical UI elements that are only needed under certain circumstances. Loading a cache of some online resource, like the directory of an FTP server. All of these things require accessing bits and pieces of files, here and there, above and beyond the user explicitly navigating to a folder where he or she has a document and opening it. Not even to mention what happens when an application (still single tasking) allows the user to open two or more documents to be worked on at the same time.

The way it works right now, what happens inside the C64 is conveniently generic and what happens inside the drive is conveniently opaque. But this arrangement has its downsides too.

Many one-file games, to take a common example, write a highscore file back to disk. But, they just assume that the disk in the drive, which is equivalent to the partition and subdirectory of a Hard Drive, is the same one whence they were loaded. And so, imagine you're in partition #1, subdirectory: //games/arcade/, of device #9.

@#9
@cp1
@cd//games/arcade/

When you load and run the game, if you're lucky, it will write the highscore file back to the correct device, in which case it will end up in the same folder it itself was run from. Although often enough it'll just blindly write the file back to device 8.

But now let's imagine another scenario. You're in partition #2, subdirectory: //programming/tools/, and you load the game like this: ^1//games/arcade/:paradroid

@#8
@cp2
@cd//programming/tools/
^1//games/arcade/:paradroid

It will load and run alright. But when it saves its highscore file, guess what? It's going to save it back to partition #2, //programming/tools/paradroid-hi.mem or whatever. What's worse, is that upon first load in this scenario, it will search partition #2 //programming/tools/ looking for a highscore file to load in and it won't find one.

So what? Big deal! Yeah, you're right. It's not a big deal when it's a game and an unimportant highscore file. But this does demonstrate the general shortcoming of the opaque nature of the KERNAL's file accesses. The C64 simply has no means of knowing internally where anything is, where anything—including the program that's running—came from, where anything should be, or where to put or get any additional resources beyond that first load. And the result is very limiting. This is a shame considering that we're not using 166K floppies anymore, we're using 128 GIGABYTE SD Cards.

C64 OS File References

I am under no illusions that anyone will ever write a C64 game that requires C64 OS to run. It would be cool if someone found its services so useful that to rewrite them would be a waste of time. But, chances are a game won't need most of its services, and you'd be crazy to waste 4 to 8K of memory on the libraries of an OS that you don't need. The beauty of the C64 is that you can patch out all the roms and use every last byte and every last CPU cycle in precisely the way you want, in the service of some visual and auditory spectacle that will blow people away. Please! Have at it!

But I've seen how people write tools for the C64. They are ad hoc at best. They rarely benefit from the behavior of other C64 tools and they suffer from all the same problems of there simply being so few modern resources to work with. Rightfully, no one wants to rewrite complex string, file and UI libraries for every little tool. It is for this that I'm writing C64 OS. So that I can write my own tools and applications and feel like I'm writing them with ease and that each becomes more useful the more of them that are written.

So let's talk File References and File Routines. Dynamically allocated File References, also known as File Refs or FREFs for short, are exactly one page of page-aligned memory. This is what you get when you call pgalloc with a request for 1 page. A File Reference does not necessarily have to be a full page, if its memory is managed manually. The contents of a File Ref is structured as follows:

Name Index Size
Device Number frefdev 1 Byte
Partition Number frefpart 1 Byte
Logical File Number freflfn 1 Byte
File Name frefname 17 Bytes
Absolute Path frefpath 0 to 236 Bytes

The C64 OS booter loads and runs a drive detection and registration routine. This builds a table of device numbers to drive type IDs. The table is constructed automatically during bootup, but a C64 OS standard tool will allow you to re-run the routine during runtime to update the table. You'd typically do this after you turn a connected drive on or off. C64 OS's file routines use the device number in a File Reference to look up the corresponding drive type, and use this to know whether the device supports partitions and/or subdirectories.

The logical file number (LFN) doubles as a flag for whether the file is open. If the LFN is 0, the file is closed. When you open a file, by File Reference, it is automatically assigned an available LFN, so you never have to keep track of these manually. And if you try to open a file whose LFN is not zero, you'll get a File Already Open standard KERNAL error.

The Filename is 17 bytes long, because it is a C-style string. This affords 16 characters of filename plus a 1 byte null terminator. You never need to manually supply the length of the filename to the C64 OS file routines. Even though C64 OS uses the SETLFS and SETNAM KERNAL routines under the hood, it measures the length of the null-terminated filename for you.

The Path is structured as what has become the defacto standard, as used by CMD devices, and as also works on SD2IEC devices. //absolute/path/name/. The pathname can be up to a maximum of 235 characters, and it too is a C-style string that ends with a null character. Any path that is set will be completely ignored if the drive type is found to be the kind that does not support subdirectories. If you try to use a path on a device that supports subdirectories, but in a partition that does not, it will simply produce an error.

You might be thinking, oh what a pain! I have to create one of these things just to open a file?! C64 OS itself maintains at least 2 (sometimes 3, and for a brief moment while transitioning between apps, up to 4).

The first is the System File Ref. This is a manually memory-managed file reference, so it's less than a page. It knows where the system folder is and whence this instance of C64 OS is running. This is pretty cool actually. You could very easily have two different versions of C64 OS, or two different configurations, set up on different partitions or even in side-by-side folders of the same partition. The System File Ref is what identifies to the currently running instance, what device, partition and system folder it is running from.

The second is the App File Ref. This informs the currently running application what it is. It seems a simple concept, for a program to be able to answer Who am I? and Where do I come from? but most C64 programs actually do not know these things. Just imagine when you write a little BASIC program, and save it as, "hello world". The program, while it's in memory, has absolutely no idea that it is called "hello world", nor where and on which device it is saved. Actually, it doesn't even know that it has an on-disk representation, it may have just been freshly typed in from scratch.

The third is the Open File Ref. This one is optional, it is configured to reference a document file that was used to open the application. In the simplest example, a text file that is opened by a text editor. You use the C64 OS file manager to navigate about the file system. When you come across a text file you may click to open it in a text editor. The system will configure the Open File Ref to point to the text file, plus a File Ref to point to the text editor and will then call loadapp passing a pointer to the new App File Ref. The old app has a chance to quit using the old App File Ref to find its own bundle, for saving state, such as the last device, partition and path you'd navigated to in the File Manager. Then the new AppFileRef pointer is pointed to the new App File Ref for the new app to be loaded. It is the job of the app that supports loading document files to check, during its initialization, if Open File Ref is configured. If it is, the app should proceed to open that file. When the app saves the file it can use the settings of the Open File Ref to know where to write the file back to.

Lastly, I mentioned that there may momentarily be one more File Reference allocated. This is, as just mentioned above, what happens when the current app calls loadapp for the next. For a short time both apps have a configured file reference. The outgoing one still has access to its own app file ref so it knows where to save its state or settings back to. And the new one is configured ready to be put in place for the incoming app to know where to load its additional resources.

It is always possible for your app to manually configure a file reference. But for the most common cases it actually doesn't need to. For instance, if you are in your text editor and working on a new file, when you want to save you can invoke the system save panel. This is a UI that allows the user to navigate the file system and enter a filename. What is returned is simply a pointer to the file reference to use for saving the file. Your app doesn't actually have to construct that file reference, it just passes the pointer to it through to the file routines. And that now naturally takes us to the file routines.

File Routines

Working with files is more than just opening and closing connections to an I/O device. You also have to be able to read data from the file into memory, and write data from memory to the file. The KERNAL really only provides generic I/O routines to retrieve or to send one byte at a time. CHRIN reads a byte from the current input channel. The keyboard is a bit of an exception, there is also the routine GETIN. And there is also the low-level ACPTR routine which retrieves a byte from the physical serial bus. But in any of these cases, it merely puts a single byte into the CPU's accumulator.

What do you do with it then? Well, you need to write code to figure out where to put that byte, and how many times to loop, and you need conditional code for status checking. And we're on an 8-bit CPU so heaven forbid you need to read more than 256 bytes of data or you're into the fun world of 16-bit looping.

C64 OS provides some of the most essential features that make life easier: fopen, fclose, fread and fwrite. Plus one additional routine, finit. For now, a future version of C64 OS may add other file routines. Let's go over these briefly, starting with the oddball, finit.

FINIT

FINIT takes a RegPtr (a 16-bit pointer, .X low byte, .Y high byte) to a File Reference. The referenced device will have its default partition and default path set according to the partition and path in the file reference. This is convenient if you want to do something low-level with the device that doesn't consist of explicitly opening a file. For example, you might do this before loading a directory. The finit sets the default path, then you can request a directory with the relative $.

There is another function of this routine, if you set the filename as $:filename, it will return the 16-bit block size of that file, as a RegWrd (a 16-bit word, .X low byte, .Y high byte.) It is an unfortunate limitation that this can only be done on files with a filename of 14 or fewer characters, because 2 characters in the 16 byte filename string must be taken up by "$:". Nonetheless, it is very convenient to know how big a file is, especially if you want to allocate a memory buffer big enough to hold the whole file, before proceeding to read in the file's data.

FOPEN

FOPEN takes a RegPtr to a File Ref, plus a set of bit flags in the Accumulator. The flags are as follows:

Constant Meaning
FF_R Read
FF_W Write
FF_O Overwrite
FF_A Append

If these are passed in illogical combinations, you won't get an error, but will get an unexpected result. Read cannot be combined with Write, for example. Nor can Overwrite or Append be combined with Read. If you use Write, it can sensibly be combined either with Overwrite or with Append, but not with both. This makes sense if you think about it for a moment. There is only one way to open a file for Read. But if you'll open a file for Write, by default you'll get an error if the filename already exists. Alternatively, you can pass Overwrite, which will replace any existing file with the same name. Or, you can pass Append. If no such file already exists you'll get an error. But if the file already exists it will be opened, and the cursor will be moved to the end of the file. Append is rare, but it is most useful for log files and the like.

You do not need to specify a Device Number, a Logical File Number, or a Secondary Address. Nor do you need to specify the length of the filename. Just the pointer to the FileRef is all you need to open that file. The Logical File Number, Secondary Address, and all other commands and flags are sorted out for you. And of course, the file will be opened on the device, in the partition, and at the path specified in the FREF. That is after all, the point.

If you try to call FOPEN passing a pointer to a File Ref that is already open, you'll generate the KERNAL's standard File Already Open error.

FCLOSE

FCLOSE is a very simple routine. You pass a RegPtr to a previously opened File Ref, and it will close it for you. It will set the Logical File Number on the FREF back to 0, indicating that its closed, and that LFN will be freed back up to be automatically used by a subsequent FOPEN.

Calling FCLOSE on a File Ref that's not open generates a File Not Open error. There is not much else to say about FCLOSE.

FREAD

FREAD requires three 16-bit arguments. As I discussed in the post, Passing Inline Arguments, this is a perfect candidate routine for using inline arguments. For consistency with the other routines, the File Ref whence to read is passed as a RegPtr. Followed by two inline 16-bit words: a pointer to a buffer and a 16-bit length to read. The File Ref has to have previously been opened for read or an error will be generated. Here's how this looks:

    #rdxy opnfileref  ;Read the Open File Ref pointer into RegPtr
    lda #ff_r         ;Set "read" flag
    jsr fopen         ;Open the Open File Ref
		
    jsr fread
    .word buffer      ;Pointer to buffer
    .word 30          ;Read 30 bytes
		
    jsr fclose         ;Close the File Ref
	
    [...]
	
buffer                ;Declare 30 bytes of space here.

My source code has a few standard macros for more easily working with 16-bit values. #ldxy loads a 16-bit number into .X and .Y. If used with a label, it loads the address of the label. #rdxy loads a 16-bit value into .X and .Y that is stored at the memory locations represented by the label. And #stxy will write .X and .Y to the memory at a label.

So the first #rdxy reads the address pointed to by opefileref (a standard pointer in workspace memory) into .X and .Y, which I refer to as a RegPtr. Next the Read flag is loaded into the accumulator, before calling fopen.

Note that, .X and .Y are left undisturbed by any of the file routines. So, after calling fopen, RegPtr is still pointing at the same file reference. Therefore, the call to fread only needs the two inline arguments that follow it. These two arguments, in the example above, are hardcoded to an inline buffer and for a specific size. If you wanted these to be dynamic you could simply add labels to these two inline arguments and set them as you would set any other variables.

Lastly, the call to fclose doesn't require any additional arguments. The RegPtr is still set to the same File Ref after calling fread. Everything is done and closed and you've got 30 bytes from the file into the buffer. If you ask me, that is really simple. It gets even better if that 30 byte read exceeds 255. It could just as easily be 4096 or any 16-bit number and you don't have to see any of that nastiness.

FWRITE

FWRITE is nearly identical to FREAD, except in the other direction. Let's just take a look at the code straight away.

    #rdxy opnfileref  ;Read the Open File Ref pointer into RegPtr
    lda #ff_w.ff_o    ;Set "write" and "overwrite" flags
    jsr fopen         ;Open the Open File Ref
		
    jsr fwrite
    .word buffer      ;Pointer to buffer
    .word 32          ;Write 32 bytes
		
    jsr fclose         ;Close the File Ref
	
    [...]
	
buffer .text "This string contains at 32 bytes"

The process is virtually identical to FREAD. Read in a RegPtr to a File Ref. Set the flags in accumulator to Write and Overwrite, logically OR'd together, and call fopen. Next you can immediately call fwrite because RegPtr is still configured. The first inline argument is a pointer to a buffer out of which data should be written to file. The second inline argument is the number of bytes to write. There is no bounds checking. If you say you want to write 1024 bytes but the buffer is only 32 bytes long, then whatever else is out there in memory beyond the end of the buffer will get written to file.

And, lastly the file is closed. RegPtr is still set, and so there is still nothing to extra to do. If you did some work between the frwrite and the fclose that disrupts .X or .Y, all you need to do is slip a #rdxy opnfileref (or whatever is your file reference) before the final call to fclose.

One more thing

The file routines in C64 OS have at least one other trick up their sleeve. But there will likely be other tricks to come in future versions of the OS.

C64 OS uses a mouse and mouse pointer. But anyone who has programmed any file accesses on the C64 knows that when the VIC-II is showing sprites file access gets slowed way down. It has something to do with the timing getting interrupted. Both FREAD and FWRITE automatically disable and reenable the mouse pointer for you. This is one less thing you have to do on a typical file access.

You can still use the KERNAL

Here's the thing. C64 OS doesn't get rid of the KERNAL. Under the hood, SETLFS, SETNAM, OPEN, and CHRIN and CHROUT, etc. are still being used. If the device is legitimately on the IEC serial bus, then TALK and TKSA and ACPTR and so on are still be used. But if the device is an IDE64, then TALK and TKSA, etc. are being shortcut around by the way in which the IDE64 wedges itself into the CHRIN and CHROUT vectors. In other words, C64 OS is exactly as compatible with the IDE64 as if you were just manually using the KERNAL yourself. But FREAD and FWRITE make it easy to get 16-bit length reads and writes to and from memory. And FOPEN and FCLOSE make it easy to work with files that actually live in real places on devices that are 30 years newer than a 1541.

The reliance on the KERNAL goes a step further than just compatibility. You can actually still use the KERNAL, at your leisure, for binary loads and single byte reads and writes. And you can combine what you want from C64 OS's routines with KERNAL calls. So you can call FOPEN to open a file from somewhere that you don't care to know about where it is. If you want you can use FREAD to load in some part of the file, or not. But then you can simply read the assigned Logical File Number out of the File Reference structure, do a CHKIN and start reading single bytes from the stream with CHRIN just as you normally would.

That's all for now. In future posts I'll give some examples of where I'm using File References and the File Routines to greatly simply an application's interaction with the file system, and derive unexpected benefits from being able to read and write data from and to the system folder and the application's own bundle.

Rock the box.

  1. My plan is to use the space beneath the KERNAL for rendering bitmaps. It's visible to the VIC-II even when the KERNAL rom is patched in. And it's a region that's physically discontiguous with main memory $0800 - $AFFF. It's also hard to run regular code from there, because the interrupt handler depends on the KERNAL being patched in. []
  2. The addressing on the C64's IEC bus is simple compared to serial buses on other computers. The Apple Desktop Bus (ADB, 1986), for example, allowed all the devices on the bus to auto-negotiate their addresses with each other. So you never needed to set an address manually, and you never got a conflict. Later serial buses got even more sophisticated allowing hot swapping which ADB did not support. []
  3. You can read all the gory details in IEC disected, a paper put together by Jan Derogee. There are plenty of other places that document it too, if you search around the internet. []
  4. I've said this before, but I'll say it again. BASIC is essentially a built-in, easy-to-learn, scripting language. Many of its routines backend on KERNAL routines, just the way many PHP routines backend on the C and C++ libraries. []
  5. It's slightly more complicated than knowing where on disk the next byte will come from. Because the drive reads one or more blocks from the disk into a buffer in memory. Individual bytes are then fed from the buffer. Eventually it uses the buffered data to find the next block to read into the buffer. But, we can ignore that detail for now. []

Do you like what you see?

You've just read one of my high-quality, long-form, weblog posts, for free! First, thank you for your interest, it makes producing this content feel worthwhile. I love to hear your input and feedback in the forums below. And I do my best to answer every question.

I'm creating C64 OS and documenting my progress along the way, to give something to you and contribute to the Commodore community. Please consider purchasing one of the items I am currently offering or making a small donation, to help me continue to bring you updates, in-depth technical discussions and programming reference. Your generous support is greatly appreciated.

Greg Naçu — C64OS.com

Want to support my hard work? Here's how!