I'm working on a class project that requires me to read a file line by line. Our end goal is to write a MIPS assembler in MIPS assembly, but the problem pertinent to this post is that I cannot read the file one line at a time. This is what I have written so far to read from the file:

.data
file_loc: .asciiz "test.asm" #note: when launching from commandline, test.asm should be within the same folder as Mars.jar
buffer: .space 1024 #buffer of 1024
new_line: .asciiz "\n"  #where would I actually use this?

#error strings
readErrorMsg: .asciiz "\nError in reading file\n"
openErrorMsg: .asciiz "\nError in opening file\n"

.text
main:
jal openFile
j endProgram

openFile:
#Open file for reading purposes
li $v0, 13          #syscall 13 - open file
la $a0, file_loc        #passing in file name
li $a1, 0               #set to read mode
li $a2, 0               #mode is ignored
syscall
bltz $v0, openError     #if $v0 is less than 0, there is an error found
move $s0, $v0           #else save the file descriptor

#Read input from file
li $v0, 14          #syscall 14 - read file
move $a0, $s0           #sets $a0 to file descriptor
la $a1, buffer          #stores read info into buffer
li $a2, 1024            #hardcoded size of buffer
syscall             
bltz $v0, readError     #if error it will go to read error

li $v0, 4
la $a0, buffer
syscall

#Close the file 
li   $v0, 16       # system call for close file
move $a0, $s0      # file descriptor to close
syscall            # close file
jr $ra

openError:
la $a0, openErrorMsg
li $v0, 4
syscall
j endProgram

readError:
la $a0, readErrorMsg
li $v0, 4
syscall
j endProgram


endProgram:
li $v0, 10
syscall

The problem is that the read syscall reads as many bytes as the buffer can hold (1024), rather than a single line.

For example, reading a file named test.asm with the following data:

test abc abc abc

test2 1231 123 123

will yield an output of:

test abc abc abc

test2 1231 123 123

Whereas I am hoping to read in one line at a time:

test abc abc abc

I know that reducing the buffer size would limit how much is read, but a long line of input would then cause problems. Does anyone know how to read a certain amount into the buffer and then split it at the newline character (I'm pretty sure it's "\n")?

Any help/tips would be appreciated! Thanks!

So read 1024 (or whatever number) bytes at a time and write a function that returns the next line from the buffer. - Michael
Any luck with that? - Antonio Correia

1 Answer


It's certainly possible to read large chunks like you're doing. This is efficient in terms of minimizing syscalls but the code is a bit fussy to write. You'd need a loop to walk over each chunk to locate any newlines and a buffer to store incomplete lines that require multiple reads to build.
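For reference, here is a rough sketch of what that chunked approach might look like. It assumes MARS/SPIM syscall conventions, the file name test.asm from the question, and lines of at most 1023 bytes; the labels chunk, line and print_line are names I made up for illustration. A line that straddles two reads simply stays in line until its newline shows up in a later chunk. Files with Windows line endings would leave a trailing "\r" on each line, which this sketch does not strip.

```mips
        .data
fin:    .asciiz "test.asm"        # assumed input file name (from the question)
chunk:  .space 1024               # raw bytes returned by one read syscall
line:   .space 1024               # current, possibly incomplete, line
        .text
        .globl main
main:
        li    $s3, 0              # $s3 = current line length

        li    $v0, 13             # syscall 13: open file
        la    $a0, fin
        li    $a1, 0              # flags: read-only
        li    $a2, 0              # mode: ignored
        syscall
        bltz  $v0, exit           # open failed
        move  $s0, $v0            # $s0 = file descriptor

read_chunk:
        li    $v0, 14             # syscall 14: read file
        move  $a0, $s0
        la    $a1, chunk
        li    $a2, 1024
        syscall
        blez  $v0, flush          # 0 = end of file, negative = error
        la    $s1, chunk          # $s1 = cursor into the chunk
        addu  $s6, $s1, $v0       # $s6 = one past the last valid byte

scan:
        beq   $s1, $s6, read_chunk  # chunk exhausted; partial line stays in `line`
        lb    $t1, 0($s1)
        addiu $s1, $s1, 1
        li    $t0, 10             # ASCII newline
        beq   $t1, $t0, emit      # newline completes the line
        li    $t0, 1023
        beq   $s3, $t0, flush     # line buffer full: naively stop here
        la    $t2, line
        addu  $t2, $t2, $s3
        sb    $t1, 0($t2)         # append byte to the line
        addiu $s3, $s3, 1
        b     scan

emit:
        jal   print_line
        b     scan

flush:
        beqz  $s3, close          # nothing pending
        jal   print_line          # last line had no trailing newline

close:
        li    $v0, 16             # syscall 16: close file
        move  $a0, $s0
        syscall
exit:
        li    $v0, 10             # syscall 10: exit
        syscall

# print_line: null-terminate `line`, print it plus '\n', reset the length
print_line:
        la    $t2, line
        addu  $t2, $t2, $s3
        sb    $zero, 0($t2)
        la    $a0, line
        li    $v0, 4              # syscall 4: print string
        syscall
        li    $a0, 10
        li    $v0, 11             # syscall 11: print character
        syscall
        li    $s3, 0
        jr    $ra
```

In a real assembler you would replace print_line with whatever consumes the line.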

A lazy approach that's easier to code is to read the file byte by byte, which makes newline detection effortless:

# if current byte is a newline, consume line
lb $s4 ($s1)  # load the current byte from the buffer
li $t0 10     # ASCII newline
beq $s4 $t0 consume_line

You could use sbrk (system call 9) to allocate more memory for the buffer to handle arbitrarily long lines, but for this example I'll assume lines are never longer than 1024 bytes. I leave better error handling, line consumption code and modularity as an exercise in the interest of keeping the example minimal.

You can save this as read_file_lines.s, run with spim -f read_file_lines.s and it'll print its own source code.

.data  
fin: .asciiz "read_file_lines.s"
buffer: .space 1
line: .space 1024
.globl main
.text
main:
    la $s1 buffer
    la $s2 line
    li $s3 0      # current line length

    # open file
    li $v0 13     # syscall for open file
    la $a0 fin    # input file name
    li $a1 0      # read flag
    li $a2 0      # ignore mode 
    syscall       # open file 
    move $s0 $v0  # save the file descriptor 

read_loop:

    # read byte from file
    li $v0 14     # syscall for read file
    move $a0 $s0  # file descriptor 
    move $a1 $s1  # address of dest buffer
    li $a2 1      # buffer length
    syscall       # read byte from file

    # keep reading until bytes read <= 0
    blez $v0 read_done

    # naively handle exceeding line size by exiting
    slti $t0 $s3 1024
    beqz $t0 read_done

    # if current byte is a newline, consume line
    lb $s4 ($s1)
    li $t0 10
    beq $s4 $t0 consume_line

    # otherwise, append byte to line
    add $s5 $s3 $s2
    sb $s4 ($s5)

    # increment line length
    addi $s3 $s3 1

    b read_loop

consume_line:

    # null terminate line
    add $s5 $s3 $s2
    sb $zero ($s5)

    # reset line length for the next line
    li $s3 0

    # print line (or consume it some other way)
    move $a0 $s2
    li $v0 4
    syscall

    # print newline
    li $a0 10
    li $v0 11
    syscall

    b read_loop

read_done:

    # close file
    li $v0 16     # syscall for close file
    move $a0 $s0  # file descriptor to close
    syscall       # close file

    # exit the program
    li $v0 10
    syscall
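
As for the sbrk idea mentioned earlier, a growable line buffer could be sketched along these lines. This assumes the MARS/SPIM convention for system call 9 (byte count in $a0, address of the new block returned in $v0). The grow_line routine, the use of $s7 for the capacity and the tiny 4-byte starting size are my own choices for illustration, not part of the example above. Neither simulator offers a way to free memory, so the old block is simply abandoned.

```mips
        .text
        .globl main
main:
        li    $a0, 4              # initial capacity in bytes (arbitrary, for demonstration)
        li    $v0, 9              # syscall 9: sbrk
        syscall
        move  $s2, $v0            # $s2 = line buffer address
        li    $s7, 4              # $s7 = capacity (hypothetical register choice)
        li    $t0, 104            # 'h'
        sb    $t0, 0($s2)
        li    $t0, 105            # 'i'
        sb    $t0, 1($s2)
        sb    $zero, 2($s2)       # null terminator
        li    $s3, 3              # $s3 = bytes used, including the terminator

        jal   grow_line           # capacity is now 8 bytes

        move  $a0, $s2
        li    $v0, 4              # print the string from the new buffer
        syscall
        li    $v0, 10             # exit
        syscall

# grow_line: double the capacity of a heap-allocated line buffer.
# In:  $s2 = buffer address, $s3 = bytes used, $s7 = capacity
# Out: $s2 = new buffer address, $s7 = doubled capacity
grow_line:
        sll   $a0, $s7, 1         # request twice the current capacity
        li    $v0, 9              # syscall 9: sbrk
        syscall                   # $v0 = address of the new block
        move  $t0, $s2            # source cursor
        move  $t1, $v0            # destination cursor
        addu  $t2, $s2, $s3       # end of the used bytes in the old buffer
copy_loop:
        beq   $t0, $t2, copy_done
        lb    $t3, 0($t0)
        sb    $t3, 0($t1)
        addiu $t0, $t0, 1
        addiu $t1, $t1, 1
        b     copy_loop
copy_done:
        move  $s2, $v0            # switch to the new buffer
        sll   $s7, $s7, 1
        jr    $ra
```

In the read loop you would call grow_line when the line length reaches the capacity, instead of bailing out at 1024.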