2
votes

I am writing an assembler in C for MIPS assembly (so it converts MIPS assembly to machine code).

Now MIPS has three instruction formats: R-type, I-type and J-type. However, in the .data section, we might have something like message: .asciiz "hello world". In this case, how would we convert an ASCII string into machine code for MIPS?

Thanks


3 Answers

6
votes

ASCII text is not converted to machine code. Each character is stored as its one-byte ASCII code, as listed in the chart on Wikipedia:

ASCII Code Chart

MIPS uses this format to store ASCII strings. As for .asciiz in particular, it is the string plus the NUL character. So, according to the chart, A is 41 in hexadecimal, which is just 0100 0001 in binary. But don't forget the NUL character, which is a full byte of zeros, so .asciiz "A" is stored as: 0100 0001 0000 0000.
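Since the assembler in question is written in C, here is a minimal sketch of that encoding step. The function name encode_asciiz and its buffer-based interface are my own invention for illustration, and I'm assuming escape sequences such as \n have already been decoded by the parser.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the characters of an .asciiz operand into the
 * data-segment buffer `out`, followed by the terminating NUL byte.
 * Returns the number of bytes written, or 0 if `out` is too small.
 * Assumes escape sequences in `text` have already been decoded. */
size_t encode_asciiz(const char *text, uint8_t *out, size_t cap)
{
    size_t len = strlen(text);

    if (len + 1 > cap)
        return 0;

    memcpy(out, text, len);  /* each character is stored as its ASCII code */
    out[len] = 0x00;         /* .asciiz appends one full NUL byte */
    return len + 1;
}
```

With this sketch, encoding "A" writes the two bytes 0x41 0x00, and "hello world" takes 12 bytes (11 characters plus the NUL).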

When storing the string, I'd borrow the approach of the MARS MIPS simulator: start the data segment at a known address in memory, and resolve every reference to the label message to the string's location in that segment.
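One way to sketch that bookkeeping in C is a small symbol table that hands out addresses as data is placed. Everything here (struct symtab, symtab_define, symtab_lookup, the fixed table size) is hypothetical, and the base address 0x10010000 is an assumption taken from MARS's default memory configuration; your assembler may well choose a different one.

```c
#include <stdint.h>
#include <string.h>

#define DATA_BASE   0x10010000u  /* assumed base; MARS's default .data address */
#define MAX_SYMBOLS 64           /* arbitrary limit for this sketch */

struct symbol {
    char     name[32];
    uint32_t addr;
};

struct symtab {
    struct symbol entries[MAX_SYMBOLS];
    int           count;
    uint32_t      next_offset;   /* bytes already placed in the data segment */
};

/* Record `name` at the current data address, then reserve `size` bytes.
 * Returns the label's address, or 0 if the table is full. */
uint32_t symtab_define(struct symtab *t, const char *name, uint32_t size)
{
    if (t->count >= MAX_SYMBOLS)
        return 0;

    struct symbol *s = &t->entries[t->count++];
    strncpy(s->name, name, sizeof s->name - 1);
    s->name[sizeof s->name - 1] = '\0';
    s->addr = DATA_BASE + t->next_offset;
    t->next_offset += size;
    return s->addr;
}

/* Look up a label. Returns 1 and stores its address in *addr on success,
 * or 0 if the label was never defined. */
int symtab_lookup(const struct symtab *t, const char *name, uint32_t *addr)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            *addr = t->entries[i].addr;
            return 1;
        }
    }
    return 0;
}
```

Start from a zero-initialised table (struct symtab t = {0};). Defining message with a size of 12 bytes then places it at DATA_BASE, and the next label lands 12 bytes later.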

Please note that everything in the data section is neither R-type, I-type, nor J-type. It is just raw data.

3
votes

Data is not executable and should not be converted to machine code. It should be encoded in the proper binary representation of the data type for your target.
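To illustrate what "proper binary representation for your target" can mean for something other than a string, here is a rough sketch for a 32-bit .word value. MIPS can run in either byte order, so the byte order is a parameter; the function name encode_word and its signature are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit .word value into out[0..3] using the
 * byte order of the target (big_endian != 0 means most significant byte
 * first). */
void encode_word(uint32_t value, int big_endian, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 24 - 8 * i : 8 * i;
        out[i] = (uint8_t)(value >> shift);
    }
}
```

Strings need no such treatment, since each character is a single byte and the bytes are stored in the order they appear.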

3
votes

As other answers have noted, the ASCII contained in a .ascii "string" directive is encoded in its raw binary format in the data segment of the object file. What happens from there depends on the binary format the assembler is encoding into. Ordinarily data is not encoded into machine code; however, GNU as will happily assemble this:

.text
start:
  .ascii "Hello, world"
  addi $t1, $zero, 0x1
end:

If you disassemble the output with objdump (I'm using the mips-img-elf toolchain here), you'll see this:

Disassembly of section .text:

00000000 <message>:
   0:   48656c6c    0x48656c6c
   4:   6f2c2077    0x6f2c2077
   8:   6f726c64    0x6f726c64
   c:   20090001    addi    t1,zero,1

The hexadecimal sequence 48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 spells out "Hello, world". I came here while looking for an answer as to why GAS behaves like this. MARS won't assemble the above program, giving an error that data directives can't be used in the text segment. Does anyone have any insight here?
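For anyone who wants to check the listing by hand, the four words can be reproduced with a few lines of C. This is only a sketch: be32 and encode_itype are names I made up, and I'm assuming the output is big-endian, which is what the byte order in the dump suggests.

```c
#include <stdint.h>

/* Hypothetical helper: read four bytes as one big-endian 32-bit word,
 * the way the dump above prints each group of four characters. */
uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: pack an I-type instruction as
 * opcode (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits). */
uint32_t encode_itype(uint32_t opcode, uint32_t rs, uint32_t rt, uint16_t imm)
{
    return ((opcode & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | imm;
}
```

Reading "Hell", "o, w" and "orld" through be32 gives 0x48656c6c, 0x6f2c2077 and 0x6f726c64. With the addi opcode 8, rs = 0 ($zero), rt = 9 ($t1) and an immediate of 1, encode_itype gives 0x20090001, matching the last line of the listing.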