NEWS, EDITORIALS, REFERENCE
Loading Sequential Files
In the post I've now referenced several times, Memory Managed Loader, I talked about how I created two loaders for C64 OS that are integrated with the memory manager. One is the fixed address loader, primarily for executable code. The other is a relocatable loader meant to be a quick way to load in a sequential file. I've got a couple of updates as I've learned things during my implementation and testing of this code.
How Sequential Data is Loaded
I've been learning a lot about how the IEC serial bus works in the process of writing code for dealing with files. And it has been an interesting area of exploration for me to have at hand the complete disassembled and fully commented KERNAL and BASIC disassembly. While I write some 6502 ASM, and use the KERNAL, I also do some tests in BASIC and try to visualize the steps that BASIC must be taking in ASM to accomplish what it does.
Let me give you an example. I was curious about open files and channels and how those worked. In BASIC I wrote a quick little sample that opens two sequential files for read. Both come from the same drive, (I use a one-gigabyte CMD HD for most of my work and testing,) and so they use different channel numbers, also referred to as secondary addresses, and of course different Logical File Numbers as mandated by the KERNAL. Next what I did was write a little for loop and inside the loop first I would GET# one byte from the first file and print it to screen, then GET# one byte from the second file and print it to screen. After getting and printing maybe 15 bytes from each file the loop ends and both files are closed.
10 open5,8,5,"file1.txt,s,r" 20 open6,8,6,"file2.txt,s,r" 30 for i=1 to 15 40 get#5,a$:print a$; 50 get#6,b$:print b$; 60 next i 70 close5:close6
This totally works as expected. The result is a print out where the text from the two SEQ files are interleaved on screen every other character. Something like this maybe: "HTehlilso aW voerrlyd". Which you can interpret as "Hello World" interleaved with "This is a very". Okay, that's neat, that works. BASIC can do that!
When trying to do something similar in ASM, however, I was surprised to learn that there are only so many KERNAL calls and some critical ones seemed missing. There are 8 low-level calls that work very explicitly with the IEC Serial bus, at a hardware level via the CIA that controls the bus. These are:
- Read/Input
- TALK
- TKSA
- UNTLK
- ACPTR
- Write/Output
- LISTEN
- SECOND
- UNLSN
- CIOUT
These send the low level commands across the serial bus to tell the device either to talk (send data) or to listen (receive data) and then ACPTR and CIOUT actually read or write a byte at a time to a properly prepared device. By the way, these routines cannot be used with the IDE64, because it's not a true serial bus device. To inject its functionality the IDE64 depends on customizable vectors, that are checked during higher level routines which I'll talk about next. Of course, you don't have to use these low-level calls, because they are called internally by higher level routines. These routines can deal with the IEC bus, but they are more abstract and also can deal with "devices" that are not on the serial bus, including the screen, rs232, and special devices like IDE64. These routines are:
- Open
- SETLFS
- SETNAM
- OPEN
- Close
- CLOSE
- CLRCHN
- CLALL
- CHKIN
- CHRIN 1
- CHKOUT
- CHROUT
The steps to use the above have mostly sensible analogs to how it works in BASIC. To open a file you use OPEN, which itself takes no arguments. First you call SETLFS which sets the Logical File Number, Device Number and Channel, just like the BASIC open command's first three arguments. Then you call SETNAM which specifies the filename plus other optional flags just like the 4th argument of BASIC's open command. Now you have an "open file". The KERNAL tracks that open file via the Logical File Number you chose, which is tied to a channel on a device number. The filename is no longer relevant. The device now has one of 16 channels occupied and related to a file inside its physical storage mechanism.
The KERNAL also manages one thing (for lack of a better word) as the current "input" and one thing as the current "output." Usually input is the keyboard and output is the screen. This is why when you press a letter on the keyboard a corresponding byte is read in and then output onto the screen. To get data from an open file you have to change that current input to an open file, which you do by passing a Logical File Number in a call to CHKIN. Now you have the ability to call CHRIN repeatedly, and each character is read from the current input which is the open file. Each time you read a byte, the cursor inside the file, managed by the DOS inside the drive, is moved forward one byte, so the next read will get the next byte. That's the nature of a "sequential" file.
Things get a bit mucky when you start trying to compare this to BASIC. BASIC does not have a concept of the current input or current output device. Initially I was quite confused by how files and channels are closed, because there are just three KERNAL calls. CLOSE is used with a Logical File Number to tell the device to close the file. CLALL sends a CLOSE to each of the open files and then calls CLRCHN for you. CLRCHN clears the "channels" and sets the current input and output back to the keyboard and screen. This confused me initially, because the call seems to refer to "clearing all channels." Channel is the word which is sometimes used to refer to the 16 channels of a serial bus device, which is also called the secondary address. Confusing! Furthermore, when you read the documentation for CHKIN it uses language like, "Any logical file that has already been opened by the OPEN routine, can be defined as an input channel by this routine." It says "as an input channel." That's not quite what happens. There aren't multiple input channels that can all be currently open. What it does is makes this particular open file the one single current input source. 2
What I really wanted to do was replicate what the BASIC program listed above does, but in ASM. In BASIC you "open" a file. And then you can "open" a second file and a third and a fourth, and then you simply read bytes from whichever file you want. Why does this seem so hard in ASM? The answer is that BASIC abstracts the underlying complexity and the physical limitations of how the hardware works. But in the process it also masks some horrendous inefficiencies. The serial bus doesn't just magically read a byte from any device out there just because you happen to have a Logical File Number on hand for a file you previously opened. Everytime the computer reads a byte from a device and a file on that device both have to be ready for that read. And so, reading one byte at a time interleaved from two different files is terribly inefficient because the next device and file have to be re-readied before each read.
In assembly, what you have to do, after you have opened two files and have two Logical File Numbers, is call CHKIN with the first file #. This, under the hood, figures out that it's a serial bus device and sends any required UNTLK's or UNLSN's to any device currently talking or listening. Next it has to send a TALK and a TKSA to the device that corresponds with your file #. Then it gets the byte using ACPTR. To get the next byte but from the other file, all of this has to be repeated!! Call CHKIN with the second file #. This determines that it needs to send UNTLK to the first device and then call TALK and TKSA to the new device and then call ACPTR to get a byte from the other file. I'm not 100% sure, maybe if it's the same device between files it could skip the UNTLK and TALK, and just change the secondary address with TKSA, but if the two open files are on different physical devices such a shortcut as it is would not even be possible.
Obviously getting bytes interleaved from two different files is not a smart idea. I did in BASIC just to prove to myself that it can be done. So that I could then turn to ASM and ask myself, how does BASIC do that? What you would do in ASM if reading data from a single file is do the CHKIN once. And then call CHRIN as many times as necessary in as tight a loop as possible. Even CHKIN is not super efficient, because it is abstractly suitable for reading bytes from many different source devices. If you know the device is truly on the serial bus calling ACPTR repeatedly would be much faster. But then the problem is that you break compatibility with storage devices that are not physically on the serial bus, like the IDE64 and maybe RamLink or RamDrive or others like that.
But, if the above is how you read efficiently with ASM, how do you read efficiently with BASIC? I suppose you would just call GET# repeatedly, right? Essentially, yes. However, I read through the implementation of BASIC's GET# command. Everytime you call it, after meandering through some general code to figure out if you're in direct mode or not, and if the file number exists and so on... it calls CHKIN. Whoa. I pray to god that CHKIN is smart enough to know that the current input is already set to this open file, because if it's not it would have to send UNTLK, TALK and TKSA all over again at every GET# call. And, given how absurdly slow BASIC actually is at reading in bytes via a loop of GET# I fear it might actually be doing all that stuff between each read. This does though explain how it is able to liberally get bytes interleaved from two different open files. Every time you do a GET# it goes through the process of re-preping the device to send another byte.
Now, as it happens, the BASIC GET# command can be followed by numerous variables. Which means that a single GET# command is able to read multiple bytes in a row from the same file without having to go through CHKIN between each byte read. But, if you didn't know this, which I certainly didn't, you might think it was totally reasonable to call GET# inside a for loop that loops 10 times, instead of doing something like: GET#2,a$,b$,c$,d$,e$,f$,g$,h$,i$,j$
BASIC is inefficient. Not just because it's interpreted, but because it's hard (or impossible) to do a tight loop on just the nuts and bolts of what you want to do. Each hi-level command called in a loop means lots and lots of excess code is being executed on each loop. However, even in ASM, if you rely on the KERNAL, some of those calls are quite high level. If you want to maintain compatibility with the IDE64, for example, you can't start calling TALK and TKSA yourself and then do a tight loop around ACPTR, instead the most you can tight loop around is CHRIN and just live with whatever excess waste is in that routine. All of this is what was motivating me to write the Sequential file relocatable loader, which I'll talk about next.
Loading a Sequential File Fast
Files on a C64 have a few general types. PRG, SEQ, USR, REL and DEL. DEL is a weird one, I don't know if any DOS command can produce it. It supposedly represents a file that's been deleted. But if you scratch a file, its directory entry is removed. It can be produced by using a sector editor and changing the type flag of a file in the directory entry. REL is non-sequentially stored data, useful for on-disk database type storage. PRG, SEQ and USR are all "sequentially" stored data. SEQ is pretty common for all kinds of data that will be loaded in manually or piecemeal, it's not generally a problem for a SEQ file to be super long, more than 64K for example. PRG is meant mostly for address-oriented data, the most obvious being program code that has to load to a specific address where it's been assembled to. Although, it can also be used for other types of data that should load to a specific address like Koala images or SID files, etc. USR is "user defined". It's essentially SEQ but is a way of marking that something is special. GEOS uses USR for all of its file types, because they use a special format.
Let's talk PRG files for a minute. They are sequentially stored bytes just like a SEQ file. However, on disk, if you check in a sector editor for example, the first two bytes are not data but the address to where the rest of the file's data should load to in memory. But this only matters when using the KERNAL's (and hence BASIC's too, which backends on the KERNAL) LOAD routine. When you use load, the serial bus communications are the same, it opens a file and starts reading bytes in. But the first two bytes read from disk are not placed in memory. There are two ways they can be handled:
By default, and I'm not really sure why, all loaded programs are automatically relocated to $0801. So if you just give the command, LOAD "program",8 a file named program will be opened, the first two bytes will be read across the bus but then just discarded, and the rest of the data will be read in and stored starting at $0801. If the program was not assembled to $0801 it will likely crash. The BASIC load command has an additional flag, and the KERNAL calls have an equivalent byte that can be set. So in BASIC, LOAD "program",8,1 that extra ,1 tells the KERNAL to not relocate to $0801. In which case those first two bytes are not ignored but are used to determine whence to load the rest of the file.
The reason why a LOAD is much faster than writing a loop that calls CHRIN over and over is because of the fact that CHRIN has significantly more overhead per byte read. Including the ability to CHKIN to a different open file between reads! Whereas with LOAD once the initial conditions are setup the loop around reading bytes and sequentially storing them in memory is much tighter. That makes a lot of sense. However, because loads can be relocated to wherever you want, it made sense to me to try to LOAD a SEQ file that has textual data. I wanted this for loading in simple things, like config files and menu definitions files (which I'll be returning to in my next post, stay tuned.) It would be faster than my own loop around CHRIN, and I could determine the file size prior to the LOAD to make sure I can allocate enough memory for it ahead of time.
I discussed this with the guys in IRC and they seemed to think it was a pretty good idea. I wrote the code, in anticipation of using it for loading menu.m files. But I only have gotten around to really using it this past weekend. The one catch was that the text file's first two bytes would have to be bogus, because LOAD would ignore them. Big surprise when I tried to load one of these files for the first time.
The error light on the drive flashed and the KERNAL replied with a file read error or some sort. When I read the drive's error channel I get a 64: FILE TYPE MISMATCH. The file type does not match the file type in the directory entry for the requested file. Oh Crap! Why didn't anyone in IRC mention the fact that you physically cannot LOAD a file of type SEQ? At first, I didn't think this was even possible, because it didn't seem to me that the KERNAL ever knows what the directory entry's file type is. And it's the KERNAL that actually does something with those initial two bytes, how does it know? As it turns out, and I knew this so I should have figured, in order to do a LOAD you have to use channel 0 with the device. channel 1 is reserved for doing a SAVE, and channel 15, of course, is the command channel. All the other channels, 2 to 14 are for arbitrary I/O to any file type. It must be the case that when you LOAD and that uses channel 0, not only is the KERNAL optimizing the reads but the Drive is optimizing the sending of bytes too. In any case, the drive knows the KERNAL is trying to LOAD not just read all the bytes from beginning to end as quickly as possible (which is what you'd think a LOAD is, from the drive's perspective.) And the Drive's DOS itself freaks out when the file being LOADed is not PRG type. Damn.
Here's the thing though, when you open a PRG file for read or write via one of the channels, 2 through 14, it behaves exactly the same as a SEQ file. If you open a new PRG file for write, then the first two bytes you write are the first two bytes put down on the disk. Even though these two bytes will be treated as the load address if that file is subsequently LOADed. When writing a text file from a text editor, there is no reason why the text editor couldn't read and write the data to and from a PRG file. But most of them don't. Novatext, which I use all the time, just writes SEQ files as you would expect. The problem is that a file written from Novatext cannot be immediately thereafter LOADed without getting the Error #64.
It is possible to use a special tool that will modify the file type byte in the directory entry. It is also possible to use a simple BASIC program to convert, say, menu.t (text in SEQ) to, say, menu.m (text, but in PRG format with 2 bogus first bytes). In fact, this should do it:
10 open 2,8,2,"menu.t" 20 open 3,8,3,"@0:menu.m,p,w" 30 print#2,"xx" 40 get#2,a$:print#3,a$ 50 ifst=0thengoto40 60 close2:close3
That should open a preexisting file, menu.t, for read via channel 2. Then it should open a file, menu.m, as a prg for write. That's the ,p,w at the end of the file name. The @0: at the beginning is to overwrite any pre-existing file named menu.m. Line 30 then writes two meaningless bytes as the first two bytes, these will be skipped by the relocating LOAD. Line 40 reads a byte from channel 2 and writes it to channel 3. 50 checks if we're at the end of the data. If not it returns to line 40 to copy the next byte. Line 60 closes both files.
At this point, I would be able to do what I wanted, because menu.m would be a PRG file that contains just a bunch of text data. However, I thought about it, and it no longer feels worth it. The requirement to convert the menu.t file before it can be used is a real damper. Text editors generally aren't going to produce PRG files directly. And furthermore, I can't think of any other good examples of where I would want to load big text files super fast, as one contiguous chunk of memory. The menu definitions files themselves are so small, 1 page, maximum 2 pages, the load speed is effectively negligible. Other sorts of config files, I'll probably load them in a few bytes at a time, parsing in realtime, and storing the bytes into custom locations, so I won't be able to just load the whole thing at once anyway. And lastly, what about big text files like emails or notes, or other documents like markdown, XML or HTML? The truth is, once you get into large files, the efficient way to edit them in memory is strewn through chained pages. So that if you want to edit somewhere you only need to shift the data within one page. And if the page is going to overflow you can split the data into two pages (which is the computational equivalent of one shift, or even half a shift) And then you can continue to write into the remainder of the page without needing to do any more shifting. Also, large copies and pastes can be done by splitting up the first page, splitting up the last page, and unlinking the second half of the first page split and the first half of the last page split and splicing the pages back in somewhere else. The majority of the data doesn't actually have to be moved.
Anyway, that last stuff will be for the work of an efficient text editing "service app." But the point is, now that I'm rethinking using LOAD to load SEQ data and the downside of not being able to directly edit it without having to convert it first, I realize there isn't all that much benefit to LOADing it anyway. So, I've gotten rid of the loadseq routine. While I'm here I'll mention that I also go rid of loadprg, which was going to be a PRG loading primitive. Instead I've renamed it to loadapp, which handles loading menu.m (as a SEQ) internally, using a loop around CHRIN. And it works just fine. Add this to the list of "Lessons Learned." Believe me, there are a lot of lessons to be learned when you're basically teaching yourself how everything works from the ground up. But, what counts is that I'm making good progress and I'm still having fun.
Next up, bug fixing the recursive menu parsing code, and taking a look at how that arranges the structures of pointers in memory.
- There's also GETIN which confused me for a while. I looked it up in the PRG and it is for getting one character at a time from the Keyboard buffer or screen. CHRIN gets an entire 80 character logcal line from the screen, and/or all data from the keyboard buffer. It says if you're getting data from the serial bus, use CHRIN instead. [↩]
- As it turns out, it's slightly more complicated than I thought. I think it's possible to command more than one device on the serial bus to "listen" at the same time. And then when you're writing data to the bus more than one device receives the data. I'm not yet sure how to make use of this. But it might explain the use of the language implying more than one open channel. [↩]