Better algorithm for comparing adjacent lines in a file

Question

I have completed the assignment (yes, it is for a programming class), but I am afraid I didn't go about it in the most efficient way possible. It is basically the uniq program: it compares adjacent lines in a file and prints only one copy of any repeated lines. A few notes: printUniq() is my own function that takes the various flags into account, and readline() is another function of mine that reads a line of arbitrary length into a char * buffer using malloc and realloc. Here is the part I am worried about:

if(prevline != NULL)
{
  while(thisline != NULL)
  {
     while(thisline != NULL && strcmp(prevline, thisline) == 0)
     {
        count++;
        free(prevline);
        prevline = thisline;
        thisline = readline(stream);
     }
     printUniq(prevline, cflag, dflag, uflag, count);
     count = 1;
     free(prevline);
     if (thisline != NULL)
     {
        prevline = thisline;
        if((thisline = readline(stream)) == NULL)
        {
           printUniq(prevline, cflag, dflag, uflag, count);
        }
     }  
  }
}

Is there a better way to structure this program? I hate having to check thisline for NULL four times in one loop. The first NULL check, in the outer while loop, is necessary, and the check in the nested while is needed in case the last lines are duplicates. The check after the call to free determines whether the "duplicate loop" exited because thisline was NULL; if it was not, the program fetches another line. The final check is there only for the very last line in the file: without it, when readline returns NULL (there are no more lines in the file), the loop exits and prevline is never printed.

Anyway, any help is appreciated.

Mihai Mihai · Accepted Answer · 2012-04-07T21:42:14

I suggest reading the file in only one place, since that makes the code more manageable. Something like this might work:

prevline = NULL;
count = 1;
while ((thisline = readline(stream)) != NULL) // stays in the loop for as long as it reads from the file
{
    if (prevline == NULL)
    { // this is the first read from the file, nothing to compare against yet
         prevline = thisline;
         continue;
    }

    if (strcmp(thisline, prevline) == 0)
    {
         count++; // same as the previous line, so extend the current run
    }
    else
    { // found a different line, so the previous run is complete
         printUniq(prevline, cflag, dflag, uflag, count); // printUniq decides what to show based on the flags
         count = 1;
    }
    free(prevline);
    prevline = thisline;
}
if (prevline != NULL) // print the final run; skipped only when the file was empty
{
     printUniq(prevline, cflag, dflag, uflag, count);
}
free(prevline); // free(NULL) is a no-op, so this is safe for an empty file
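
To make the single-read-point idea concrete, here is a minimal self-contained sketch. Everything in it is an assumption for illustration: read_line() is a hypothetical stand-in for the asker's readline() (whose real code was not posted), print_group() is a hypothetical stand-in for printUniq() that ignores the c/d/u flags and always prints the count, and uniq_stream() is a made-up wrapper name. It also keeps the first copy of each run and frees the duplicates, which is a slight variation on the loop above rather than the same code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for readline(): returns a malloc'd line without the
   trailing newline, or NULL at end of input (or on allocation failure). */
char *read_line(FILE *stream)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(stream)) != EOF && c != '\n') {
        if (len + 1 == cap) {              /* keep one byte for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {            /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Hypothetical stand-in for printUniq(): always prints "count line". */
void print_group(FILE *out, const char *line, int count)
{
    fprintf(out, "%d %s\n", count, line);
}

/* Collapses runs of identical adjacent lines read from `in`.
   Returns the number of groups printed to `out`. */
int uniq_stream(FILE *in, FILE *out)
{
    char *prevline = NULL, *thisline;
    int count = 0, groups = 0;

    while ((thisline = read_line(in)) != NULL) {   /* the only read */
        if (prevline != NULL && strcmp(prevline, thisline) == 0) {
            count++;                /* duplicate: keep the first copy */
            free(thisline);
            continue;
        }
        if (prevline != NULL) {     /* a run just ended: report it */
            print_group(out, prevline, count);
            groups++;
            free(prevline);
        }
        prevline = thisline;        /* start a new run */
        count = 1;
    }
    if (prevline != NULL) {         /* report the final run, if any */
        print_group(out, prevline, count);
        groups++;
        free(prevline);
    }
    return groups;
}
```

Calling uniq_stream(stdin, stdout) from main() would give behavior roughly like uniq -c. The point of the structure is that the end-of-input test appears once, in the loop condition, and the last run is handled by a single check after the loop.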
