3
votes

This is my implementation of strcmp:

    #include <stdio.h>
    #include <string.h>

    int ft_strcmp(const char *s1, const char *s2)
    {
        while (*s1 == *s2)
        {
            if (*s1 == '\0')
                return (0);
            s1++;
            s2++;
        }
        return (*s1 - *s2);
    }

    int main()
    {
        char    s1[100] = "bon";
        char    s2[100] = "BONN";
        char    str1[100] = "bon";
        char    str2[100] = "n";
        printf("%d\n", ft_strcmp(s1, s2));
        printf("%d\n", ft_strcmp(str1, str2));
        return (0);
    }

It is from the Kernighan and Ritchie book, but I use a while loop instead of the for. I've tested it many times and my strcmp gives the same results as the original strcmp, but I do not understand the results. I read the man page: "The strcmp() and strncmp() functions lexicographically compare the null-terminated strings s1 and s2." What does "lexicographically" mean? "return an integer greater than, equal to, or less than 0, according as the string s1 is greater than, equal to, or less than the string s2." I understand this part, but my question is how it can come up with results like these:

32
-12

s1 looks < s2 to me, so how and why do I get 32, and how is the calculation made? str1 looks > str2 to me, so how and why do I get -12, and how is that calculation made? I've compiled it with the real strcmp and I get the same results.

Last question: why do I need to compare *s1 to '\0'? Won't it work fine without it?

Thank you for your answers, I'm confused.

4
This isn't quite equivalent to the standard strcmp function. It can fail if either string contains characters with negative values. This can happen only if plain char is signed, which it commonly is. Quoting the standard: "The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared." - Keith Thompson
A couple of answers mention ASCII. That's a character set with one encoding. A character set maps a character to a number. An encoding maps the number to byte(s). You're probably not using ASCII (nor ever will). Windows-1252 (and similar) and Unicode/UTF-8 are much more common. It's important to know which character set and encoding you are using. The character number would determine the lexicographic ordering. The algorithm must deal with the encoding. Lexicographic ordering is the simplest. In general, ordering is specified by a collation, which can be associated with a locale or culture. - Tom Blodget

4 Answers

3
votes

1) K&R are comparing the ASCII values of those chars; that's why you get 32 and -12. Check out an ASCII table and you'll understand.

2) If you don't check for '\0', how can you know where the string ends? That's the C string terminator. Without the check, two equal strings would keep the loop running past their terminators.

1
votes

Capital letters in terms of ASCII codes actually precede lowercase letters, as you can see here.

So in terms of lexicographic ordering, s1 is treated as being bigger than s2, because at the first position where the strings differ, s1 has the character with the larger ASCII value.

0
votes

So we compare *s1 to '\0' to see where the string ends, and the result is computed from the numeric values of the first pair of characters that differ between the two strings.

0
votes
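int ft_strcmp(const char *s1, const char *s2)
{
    int x;

    x = 0;
    /* advance while both strings continue and the characters match */
    while (s1[x] != '\0' && s2[x] != '\0' && s1[x] == s2[x])
        x++;
    /* difference of the first pair that differs (0 if the strings are equal) */
    return (s1[x] - s2[x]);
}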

by mokgohloa ally