Written and Maintained by Gregory Nacu

NEWS, EDITORIALS, REFERENCE

February 14, 2017Programming Theory

What is a pointer? (1/3)

Happy Valentines Day. I hope you all have dates and aren't just at home playing with your computer. But if you are, well, here's a post that may be for you.

This is the first of a three part post discussing Pointers, Structures and Recursion in 6502 assembly. The first part is focused on pointers. And the next two posts will be about structures and recursion, respectively. They are closely related topics and relevant to what I'm working on in C64 OS, but they are each big topics that need their own lengthy post. Bare in mind that I've only been seriously programming in ASM for about 5 months, since last October, so I hope that everything I present in these posts will be factually correct. Feel free to correct me in the comments. But I do have a solid background in these concepts as I've been working as a professional software developer for over 15 years. The work that I did on WiNGs made extensive use of pointers, structures and recursion, however my contribution was written in C and cross compiled on a Mac to the SuperCPU's 65816. This time around things are a bit different for me. C64 OS targets the breadbox C64. 65101 and 64K of ram. And I'm not programming in C but in assembly. I'm also doing the coding and assembling natively on the C64/C128 with an REU and Turbo Macro Pro+REU. So, here goes.

What are Pointers?

Pointers are sometimes referred to as vectors. The terminology gets confusing when newer CPUs and programming languages and constructs are introduced. For the purposes of this discussion and for the 6502, pointers and vectors are essentially the same.

Here is the basic concept. The 6502 has a 16-bit address bus. This means it can address memory from \$0000 to \$FFFF. Two bytes are required to represent any address in that space. A pointer is when two contiguous bytes hold the numeric values that represent an address. We can say that those two bytes point to some memory address. The basic concept is that simple. Things get a bit more complicated when it comes time for manipulating pointers, following pointers, reading data through pointers and through pointers with offsets and so on. So that's what we'll discuss now.

The first thing to mention is that the 6502 is Little Endian. You can read all about Endianness on Wikipedia. But, in a nutshell whenever a CPU has to store multi-byte integers the endianness determines the order of those bytes. Little endian means the least significant byte is stored first, followed by more significant bytes.

Big endian reverses this order. The way we write numbers, in Base-10, on paper is analogous to big endian. When you write the number one-thousand twenty-four, you write 1024. The left most digit is the most significant. If we wrote numbers in little endian we would write 4201 to represent the same quantity. Many of the 6502's instructions use some form of 16-bit memory addressing. Those memory addresses must be stored as the Least Significant Byte (LSB) first, and the Most Significant Byte (MSB) second. Let's look at an example:

```            LDA avariable
STA \$0400
RTS

avariable   .byte \$01
```

Here's a very simple routine. Loads a byte from memory into the Accumulator (.A) and stores it at \$0400, which is the home position of the screen.2 In this case, a letter "A" would appear in the upper left corner of the screen. That "avariable" is a label that represents a location in memory that will only be resolved at the time of assembly. Therefore, the actual address that LDA is reading from doesn't need to be known by the programmer, which is extremely convenient. However, when it is assembled, of course, everything gets turned into numbers. Those 4 lines above would end up essentially like this:

```0800:       \$AD \$07 \$08
0803:       \$8D \$00 \$04
0806:       \$60
0807:       \$01
```

I assembled that by hand, so forgive me if I made a mistake. Actually, better yet, leave a comment to let me know and I'll fix any issues.

Note: I also love languages and grammar. I want to reignite the use of whence, hence and thence, which mean: from where, from here and from there, respectively. And also whither, hither and thither, which mean: to where, to here and to there, respectively. So beautiful.

\$AD is the op-code that represents an LDA from an absolute address. It takes two arguments, the LSB and MSB of the address whence it should load a byte into .A. The second line, \$8D is the op-code for STA to an absolute address. It takes two arguments, the LSB and MSB whither it should write the value in .A. You'll notice the arguments are \$00 and \$04. That's little endian for \$0400, first spot in screen memory. The third line is \$60, the op-code for RTS. And the fourth line is just \$01, the data to be loaded by the instruction in the first line. The first column in the sample above shows the memory address where the following bytes are. So the LDA op-code is at \$0800, and the data to be loaded is at \$0807. Now you can see that the arguments for LDA are also in little endian, \$07 and \$08 represent \$0807.

This point may be controversial, but it occurs to me that the arguments of the LDA instruction are essentially a pointer. It's just that this pointer is hardcoded into the middle of some code so it's not hugely practical to manipulate it. In fact, if one were to change it that would be considered self-modifying code. Some people frown on this because it can lead to very strange bugs, but it can also be useful and super efficient if used carefully.3 However, the essential nature of what a pointer is can be seen here. The bytes stored at \$0801 and \$0802 hold values that represent an address elsewhere in memory, \$0807 in this case. Thus we can say that \$0801,\$0802 is a pointer. Put another way, LDA in absolute mode takes an implicit pointer as its argument.

Now, you might wonder, in the code above, why not just use LDA in immediate mode and have it load .A with a static byte? It could have been LDA #\$01. No need to get the value from somewhere else. The problem with that is that if you ever want to change the value that will be loaded, you would be forced to resort to self-modifying code. By using LDA with an absolute pointer to a memory location where the value is stored, the stored value can be considered no longer "code" but "data." This may seem to be a very basic step up in sophistication but it represents a hugely important leap of abstraction.

Pointers to pointers

In immediate mode, the data to load is hard coded right inline with the code. But in absolute mode the data is abstracted out to some other place, but the pointer to that data's location becomes hard coded. The next conceptual leap, which unleashes an awesome and unlimited degree of flexibility is to abstract the location of the pointer. This is referred to in 6502 parlance as indirect mode. What gets hard coded in indirect mode is a pointer to a pointer. Or, put another way, the pointer itself is not part of "code", but is part of "data." And since the whole point of data is to be changed and manipulated it suddenly makes sense and is never frowned upon to manipulate a pointer. When most people talk about pointers, they mean them in this sense; bytes stored in memory as data, ready to be manipulated, that represent not just text or numbers to be displayed but dynamic addresses to other places in memory where other data or even another pointer can be found.

It can be a head turning experience to follow exactly how this works. But once it is understood it is very powerful and unleashes a whole new level of dynamism in computer programming. Remember that a pointer is just two consecutive bytes, little endian ordered, such that the first byte is the least significant value of a memory address and the second byte is the most significant value of a memory address. Rather than having these two bytes be the direct arguments of an instruction, these two bytes hang out in memory and can be found along side other data. In fact, two bytes that represent a pointer can sit in memory right side by side with two more bytes that represent another pointer and then followed by two more bytes for another pointer and so on. 6 bytes in a row that are all just pointers, that's the definition of "data," because those 6 bytes can't just be executed, as there are no instructions represented there.

Pointers, aka data that holds the address of some memory location, can be anywhere in memory. They can even be written to disk if that made sense for the particular situation. However, beyond just storing pointers in memory, there are some gotchas about the way the 6502 is able to make use of them in indirect mode. The 6502 has only 9 instructions that have an indirect mode of addressing. These are: (grouped according to their general purpose)

Arithmetic
ADC — Add with Carry
SBC — Subtract with Carry
Comparison
CMP — Compare Accumulator
Bitwise Operations
AND — AND Accumulator
EOR — Exclusive OR Accumulator
ORA — OR Accumulator
Load/Store
LDA — Load Accumulator
STA — Store Accumulator
Jump
JMP — Jump

Only the above instructions have an indirect addressing mode. And of these, JMP is most unlike the others. So I'll start by talking about it. The JMP instruction is represented by two op-codes. An op-code is a specific variant of an instruction. Each variant can have a different number and kind of arguments. So JMP is represented by \$4C and \$6C. \$4C is the absolute mode JMP. It takes two argument bytes, little endian again, the first is the LSB memory address to jump to, and the second is the MSB memory address to jump to. Literally, these two bytes are read in and assigned as the new program counter so the 6502 continues executing somewhere else. In other words \$4C's two argument bytes are a pointer to where execution should continue.

The \$6C version of JMP however has two argument bytes that are a pointer to a pointer to where execution should continue. Here's how the absolute version looks in ASM and in Machine Language.

```            JMP aplace
...
aplace      LDA #0

Might assemble into this:

0800:       \$4C \$00 \$0A
...
0A00:       \$A9 \$00
```

The arguments are a two-byte pointer stored at \$0801,\$0802 and they point to the address \$0A00. And that's where execution carries on after the JMP. Now, here's how the indirect version looks in ASM and in Machine Language.

```            JMP (apointer)
apointer    .word aplace
...
aplace      LDA #0

Might assemble into this:

0800:       \$6C \$03 \$08
0803:       \$00 \$0A
...
0A00:       \$A9 \$00
```

The parentheses around apointer are the notation used in Assembly language to make the difference between the immediate and indirect versions of the JMP instruction. With the parentheses the assembler uses \$6C as the op-code. When the CPU reads in the two argument bytes, \$03 and \$08, instead of jumping to \$0803, it uses \$0803 as the address of a pointer. \$0803,\$0804 is a two byte pointer (little endian of course) \$00 and \$0A. And so execution carries on at \$0A00 after the indirect JMP. Because the pointer to the final address is "data" not "code" the values at \$0803,\$0804 are meant to be manipulated. Although we'll get into more about how that happens in Part 2 and Part 3 of this post series.

Pointers and Zero Page

You'll notice that the indirect JMP was able to reference its pointer with a two byte pointer. That is, its inline arguments are a full two byte pointer that point to a data pointer that can be stored anywhere in memory. This is unusual. Indirect JMP is unique on the 6502 for allowing its data pointer to be anywhere in memory. The other 8 instructions which offer indirect addressing must reference their pointer out of zero page, as we'll see in more detail below. This is a huge limitation and frankly makes the 6502 a very un-ideal platform for multi-tasking. Which is one of the reasons why C64 OS is not multi-tasking.

Perhaps the reason for the limitation on using zero page is because the indirection used by all other instructions also involves one of two types of indexing. The two indexing modes are called: Indirect-Indexed and Indexed-Indirect. I'll explain how these work in detail in a moment. I want to point out that by corollary this means there is no unindexed indirection for anything other than JMP. This is also a huge limitation of the 6502 as indexing requires the use of either the, aptly named, X index register or the Y index register. 4

Indirect-Indexed is much more common than Indexed-Indirect, so let's just explore that first one a bit. Let's say you want an indirection, you want to read something through a pointer, but you don't care about indexing. The limitation is that there is no indirect mode that does not involve indexing, so even if you don't care about indexing you still must first set the index register to 0 (i.e., LDY #0). When you do this though, whatever was in .Y gets disrupted. If you care about what is stored in .Y you have to first save .Y somewhere, then set .Y to zero, do an LDA (apointer),Y, then restore .Y to its original value. The number of registers in a 6502 is very limited so juggling them can become a real pain.

Another interesting observation about indirect addressing is that it is only supported by instructions that affect the accumulator. For example, LDA and STA support indirect mode, but LDY and STY do not, nor do LDX and STX. CMP supports indirect mode, but it compares against .A. Meanwhile, CPY and CPX do not support indirect mode. AND, EOR and ORA only ever work with the accumulator. Same with ADC and SBC. This focus on the accumulator with indirection is just another limitation of the 6502. But, hey, we're learning about this stuff because it's fun right?

Let's look at how pointers are used in the indirect-indexed addressing mode of the LDA and STA instructions. When you see how they work with these it will be obvious how the same pattern applies for the other indirect-indexed instructions.

Might assemble into this:

Again, the above was hand-assembled, please forgive any mistakes. I left the branch offset uncalculated because I hate trying to figure those out.

What you can see is that the first two blocks of four lines simply set up two pointers. But the pointers are setup in zero page. The first, \$FB,\$FC is a little-endian-ordered pointer to some data. The second, \$FD,\$FE is a little-endian-ordered pointer to the start of screen memory. They are pointers, just like any other, two consecutive bytes that hold an address to somewhere in memory, however the pointers themselves are held in zero page.

Next I have a pretty standard way to do a loop. I've pre-counted the data bytes "Hello World" as 11 characters. The range of offsets therefore is 0 to 10. The .Y register is initialized to 10. Next we do an indirect-indexed LDA from the first pointer and an indirect-indexed STA to the second pointer. Decrement Y. If Y is >= 0 we loop back up to do the LDA/STA pair again, repeating until .Y finally equals less than 0. Then it stops looping and does the RTS.

Here's what you should notice. The op-code for the indirect-indexed mode of LDA is \$B1. \$B1 takes only one argument byte. Its one argument byte cannot afford a full pointer, as a full pointer requires two bytes. The one argument byte therefore is only the lo byte of the pointer, and the hi byte is implicitly \$00. Aka page \$00, otherwise known as zero page. The same is true for the indirect-indexed version of STA, the op-code is \$91 and it takes only a single argument byte, the lo byte of a pointer to zero page.

Internally, when the 6502 encounters the indirect-indexed LDA, it takes the argument, \$FB in this case, and makes an implicit address of \$00FB. This is the address of a pointer \$00FB,\$00FC which we previously pointed at \$081A. It then adds .Y to \$081A to find the final address whence to load .A. \$081A + \$0A = \$0824. That's the address of the screen code "d". Then when it encounters the indirect-indexed STA, it does almost the same thing. Takes the single argument, \$FD, and makes an implicit address of \$00FD. This is the address of a pointer \$00FD,\$00FE which we set up to point to \$0400. It then adds .Y to \$0400 to get a final address whither to write .A. \$0400 + \$0A = \$040A. So the "d" gets written to the first screen row several columns in from the left. Decrementing .Y moves the index backwards from 10 to 0, inclusive. And copies the data "Hello World", one character at a time, in reverse order, into the first row of the screen.

Our example above is Indirect-Indexed, which uses .Y as the index. Indexed-Indirect is a bit different, but what those differences are is a discussion about indexing, not a discussion about pointers. Therefore I will leave that discussion to another post. Instead, I want to just focus on how the pointers work. LDA is able to load .A with a value that comes from somewhere in memory, but where in memory is pointed to by a pointer. The immediate argument to LDA indicates not where to find the data, but where to find the pointer to the data. The same is true for STA. Where to store .A is pointed to by a pointer. And the immediate argument to STA is where to find that pointer.

Of course, one doesn't need to read from a pointer and write to a pointer, one could read an immediate value and write through a pointer. Or read from a pointer and write to an absolute address. Or any other combination. Each of the other instructions that support an indirect mode work the same way to resolve the actual address whence their data comes. And they all, except JMP, have this limitation that the pointer to their data must be in zero page. Why this is even a limitation and how pointers can truely be put to use in a powerful and dynamic way will be the topic of the next post, in which I will talk about structures.

Please leave a comment if you have any questions, find any issues, or want to encourage me.

1. More or less a 6502 as far as assembly programming is concerned. Only the 6510 has the special processor port controlled by \$0000 and \$0001. []
2. Screen memory can be moved, of course, but \$0400 is the default when a C64 is turned on. []
3. I am actually using self-modifying code in C64 OS's text screen blitting routines. It's potentially dangerous if more than one thing calls the routine, but the result can be blisteringly fast, and for screen compositing, I want the most speed I can get from a 1MHz clock. []
4. Indirect-Indexed must and can only use .Y as the index, and Indexed-Indirect must and can only use .X as the index. []