I'm currently writing a compiler for a custom asm-like programming language and I'm really confused about how to do proper PC-relative addressing for data labels.
main LDA RA hello
IPT #32
HLT
hello .STR "Hello, world!"
The pseudo-code above, after compilation, results in the following hex:
31 80 F0 20 F0 0C 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64 21 00
3180, F020 and F00C are the LDA, IPT and HLT instructions.
As seen in the code, the LDA instruction uses the label hello as an argument, which, when compiled, becomes the value 02, meaning "Incremented PC + 0x02" (if you look at the code, that's the location of the "Hello, world!" line, relative to the LDA call).
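To make the arithmetic concrete, here is a rough sketch of how that 02 falls out of the byte layout in the hex dump. It assumes every instruction is one 2-byte word and that the offset is counted in words rather than bytes; both are inferred from the dump above, and the function name is just illustrative.

```python
WORD_SIZE = 2  # bytes per instruction word (assumed from the hex dump)

def pc_relative_offset(instr_addr, label_addr, instr_size=WORD_SIZE):
    """Offset, in words, from the incremented PC to the label's address."""
    incremented_pc = instr_addr + instr_size
    return (label_addr - incremented_pc) // WORD_SIZE

# LDA sits at byte 0; the string starts at byte 6, after three 2-byte instructions.
offset = pc_relative_offset(instr_addr=0, label_addr=6)
print(hex(offset))  # 0x2
```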
The thing is: .STR is not an instruction, as it only tells the compiler it needs to add a (0-terminated) string at the end of the executable, so, were there other instructions after the hello label declaration, that offset would be wrong.
But I can't find a way to calculate the right offset, other than having the compiler be able to travel through time. Do I have to "compile" it two times? First for the data labels, then for the actual instructions?
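A minimal sketch of what such a two-pass approach might look like, not a definitive implementation. The dictionary-based line format, the function names and the sizing rules (one 2-byte word per instruction, string length plus terminator for .STR, offsets counted in words) are all assumptions made for illustration. For simplicity it also places the string where it is declared rather than moving it to the end of the executable.

```python
WORD_SIZE = 2  # assumed: every instruction is one 16-bit word

def size_of(line):
    """Bytes a source line occupies (hypothetical sizing rules)."""
    if line["op"] == ".STR":
        return len(line["arg"]) + 1  # string bytes plus the 0 terminator
    return WORD_SIZE

def first_pass(lines):
    """Walk the program once, recording the byte address of every label."""
    symbols, addr = {}, 0
    for line in lines:
        if line.get("label"):
            symbols[line["label"]] = addr
        addr += size_of(line)
    return symbols

def second_pass(lines, symbols):
    """Walk again, resolving label operands to PC-relative word offsets."""
    resolved, addr = [], 0
    for line in lines:
        target = line.get("ref")
        if target is not None:
            incremented_pc = addr + WORD_SIZE
            offset = (symbols[target] - incremented_pc) // WORD_SIZE
            resolved.append((line["op"], offset))
        addr += size_of(line)
    return resolved

program = [
    {"label": "main", "op": "LDA", "ref": "hello"},
    {"op": "IPT"},
    {"op": "HLT"},
    {"label": "hello", "op": ".STR", "arg": "Hello, world!"},
]
symbols = first_pass(program)
print(symbols)                        # {'main': 0, 'hello': 6}
print(second_pass(program, symbols))  # [('LDA', 2)]
```

Because the first pass only measures sizes and records addresses, every label is known before any operand is encoded, so forward references like hello need no "time travel".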
hello would not be changed by more instructions after the .STR variable. – kdopen