I've just started learning C and, as the title says, I have to write a program that reads a text file and counts the number of "characters", "words" and "sentences" until EOF is reached. My current problem is that I'm not able to produce the right output.

For example a text file containing the following contents...

the world 
is a great place.
lovely
and wonderful

It should output 39 characters, 9 words and 4 sentences, but somehow I get 50 (characters), 1 (words) and 1 (sentences).

This is my code:

#include <stdio.h>

int main()
{
int x;
char pos;
unsigned int long charcount, wordcount, linecount;

charcount = 0;
wordcount = 0;
linecount = 0;

while(pos=getc(stdin) != EOF)
{
    if (pos != '\n' && pos != ' ')
    {
    charcount+=1;
    }

    if (pos == ' ' || pos == '\n')
    {
    wordcount +=1;  
    }

    if (pos == '\n')
    {
    linecount +=1;
    }

}

    if (charcount>0)
    {
    wordcount+=1;
    linecount+=1;
    }

printf( "%lu %lu %lu\n", charcount, wordcount, linecount );
return 0;
}

Thanks for any sort of help or suggestion

With char pos; ... while(pos=getc(stdin), better to use int pos; to distinguish the 257 different values returned by fgetc(), though I doubt this is your current problem. -- chux - Reinstate Monica
Where do you open the file? -- Tony Tannous
You may want to edit your question to indicate that you're expecting the user to type the sample text into stdin, or remove stdin from the code. -- jrh
pos == '\n' is tested in two of the if statements, so you are counting a word even though it might just be a new row. -- Tony Tannous
Please note that a text file might not contain a final newline; you should take that into account. -- Weather Vane

1 Answer

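Due to operator precedence (!= binds tighter than =), the two lines below are equivalent: pos receives the result of the comparison, 1 or 0, rather than the character read. With pos always 1 inside the loop, every character is counted and no word or line is, which after the post-loop increments gives the 50 1 1 you see.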

// Not what OP needs
pos=getc(stdin) != EOF
pos=(getc(stdin) != EOF)

Instead, parenthesize the assignment:

while((pos=getc(stdin)) != EOF) 
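
To see the precedence difference in isolation, here is a minimal sketch. The two function names are hypothetical, chosen only for this illustration, and the variable is declared as int rather than the question's char.

```c
#include <stdio.h>

/* Same shape as the question's loop condition: != is evaluated first,
   so pos receives 1 (a character was read) or 0 (EOF was hit),
   never the character itself. Hypothetical helper for illustration. */
int read_without_parens(FILE *in) {
    int pos;
    pos = getc(in) != EOF;
    return pos;
}

/* Parenthesized form: the character is stored first, then compared.
   Hypothetical helper for illustration. */
int read_with_parens(FILE *in) {
    int pos;
    if ((pos = getc(in)) != EOF) {
        return pos;   /* the character, as an unsigned char value */
    }
    return EOF;
}
```

Whatever character is read, the first function returns 1 until the end of the stream, which is why a loop built on it can never tell a space or newline from a letter.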

Use int pos, not char pos, to hold the return value of getc(). Like fgetc(), it returns either a value in the unsigned char range or EOF. That is typically 257 different values, too many for a char.

int main() {
  unsigned long character_count = 0;
  unsigned long word_count = 0;
  unsigned long line_count = 0;
  unsigned long letter_count = 0;
  int pos;

  while((pos = getc(stdin)) != EOF) {
    ...
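
As a rough illustration of why a char is too small, the sketch below shows the byte 0xFF becoming indistinguishable from EOF once it is narrowed. It assumes a platform where EOF is -1 and where converting 255 to signed char yields -1 (the conversion is implementation-defined, though this is the common result). Both function names are hypothetical.

```c
#include <stdio.h>

/* Nonzero if `byte`, once narrowed to a signed char, compares equal to EOF.
   Assumption: the narrowing of 255 gives -1 and EOF is -1, as on most
   common platforms; neither is guaranteed by the C standard. */
int matches_eof_as_signed_char(int byte) {
    signed char c = (signed char)byte;
    return c == EOF;
}

/* Nonzero only for the real EOF value when the full int is kept. */
int matches_eof_as_int(int byte) {
    return byte == EOF;
}
```

On such a platform a valid 0xFF byte in the input would end a char-based loop early, while the int version keeps reading. Where plain char is unsigned, the opposite failure is typical: the stored value never equals EOF and the loop does not terminate.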

You may want to review your word-counting strategy too (see @Tony Tannous's comment).


For me, I would count a "word" each time a letter occurs that does not follow another letter, that is, at the first letter of each run of letters. This avoids the problem @Tony Tannous noted, among other issues. Likewise, I would count a line for each character that follows a '\n', plus the very first character, and avoid any post-loop adjustment. This handles the issue Weather Vane commented on.

It also appears that 39 is a letter count and not a character count (@BLUEPIXY).
I suggest using the <ctype.h> functions to test for letter-ness (isalpha()).

int previous = '\n';
while((pos = getc(stdin)) != EOF) {
  character_count++;
  if (isalpha(pos)) {
    letter_count++;
    if (!isalpha(previous)) word_count++;
  }
  if (previous == '\n') line_count++;
  previous = pos;
}

printf("characters %lu\n", character_count);
printf("letters %lu\n", letter_count);
printf("words %lu\n", word_count);
printf("lines %lu\n", line_count);
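
Putting the pieces above together, one possible self-contained arrangement is sketched below. The struct counts type and the count_stream() function are hypothetical names introduced here so the loop can be run on any FILE *, not only stdin; the counting logic is the same as in the loop above.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical container for the four tallies. */
struct counts {
    unsigned long characters;
    unsigned long letters;
    unsigned long words;
    unsigned long lines;
};

/* Count everything readable from `in` until EOF. */
struct counts count_stream(FILE *in) {
    struct counts c = {0, 0, 0, 0};
    int previous = '\n';  /* pretend a newline precedes the first character */
    int ch;               /* int, not char, so EOF stays distinguishable */

    while ((ch = getc(in)) != EOF) {
        c.characters++;
        if (isalpha(ch)) {
            c.letters++;
            if (!isalpha(previous)) c.words++;  /* first letter of a run */
        }
        if (previous == '\n') c.lines++;        /* first character of a line */
        previous = ch;
    }
    return c;
}
```

Calling count_stream(stdin) from main and printing the four fields with %lu reproduces the output format above. For the sample text in the question, assuming a trailing space after "world" and a final newline (which is what the reported 50 implies), this should give 50 characters, 39 letters, 9 words and 4 lines.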