3
votes

I am trying to work out the format string I should pass to sscanf. I have the string abcde(1GB) and I want to extract 1 and GB. I am using:

    char list[]= "abcde(1GB)";
    int memory_size =0;
    char unit[3]={0} ;
    sscanf(list, "%*s%d%s" , &memory_size, unit);

No tokens are extracted: when I print the results, memory_size is still 0 and unit is still empty.

Thanks


4 Answers

6
votes

Your sscanf() format string should be:

sscanf(list, "%*[^(](%d%[^)]" , &memory_size, unit);
  • %[^)] means: read characters and stop at the character ) or at the end of the string
  • %*[^(] means:
    • [^(] means: read characters and stop at the character ( (as opposed to the more conventional %s, which reads characters and stops at a whitespace character)
    • * means "read but do not store"
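
To make the breakdown above concrete, here is a minimal sketch that wraps the format string in a function. The name parse_size is hypothetical, and the sketch adds a width of 2 to the scanset (%2[^)]) so that a 3-byte unit buffer cannot overflow; that width is an assumption, not part of the format shown above.

```c
#include <stdio.h>

/* Hypothetical helper: parses strings shaped like "abcde(1GB)".
 * Returns 1 if both the number and the unit were read, 0 otherwise.
 * unit must have room for at least 3 bytes. */
int parse_size(const char *text, int *memory_size, char unit[3])
{
    /* %*[^(]  read and discard everything before '('
     * (       match the literal '('
     * %d      read the integer
     * %2[^)]  read up to 2 characters that are not ')' */
    return sscanf(text, "%*[^(](%d%2[^)]", memory_size, unit) == 2;
}
```

Calling parse_size("abcde(1GB)", &memory_size, unit) should leave 1 in memory_size and "GB" in unit. Note that %*[^(] must match at least one character, so an input that starts directly with ( is rejected by this format.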
3
votes

The %s conversion specifier in the format string of scanf means that scanf will read a sequence of non-whitespace characters. The * in %*s is called the assignment suppression character. It means that scanf will read and discard the non-whitespace characters.

Since %s matches any sequence of non-whitespace characters, %*s in "%*s%d%s" means that sscanf will read and discard all the characters in the string list. Therefore, there is nothing left in list to be read and assigned to the arguments &memory_size and unit. This explains why memory_size and unit are unchanged. What you need is the format string

sscanf(list, "%*[^(](%d%2[^)]%*s", &memory_size, unit);

Here, in the format string "%*[^(](%d%2[^)]%*s" -

  1. %*[^(] means that sscanf will first read and discard a sequence of characters not containing the character (.
  2. ( means that sscanf will read and discard (.
  3. %d means that sscanf will read a decimal integer.
  4. %2[^)] means that sscanf will read a sequence of at most 2 characters not containing ) and store them in the corresponding argument. The limit ensures that sscanf does not overrun the buffer unit if the matching text is too long. It is one less than the size of unit to leave room for the terminating null byte, which sscanf adds automatically at the end of the stored string.
  5. %*s means sscanf will read and discard any sequence of leftover non-whitespace characters.
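
The difference between the two formats can be checked with a small sketch like the one below. The function names scan_original and scan_fixed are hypothetical, and the original format is given a width (%2s) here purely to keep the 3-byte buffer safe; that width is an assumption added for the example.

```c
#include <stdio.h>

/* The question's format (with a width added to the final %s for safety).
 * %*s consumes the whole string, so %d finds nothing to read. */
int scan_original(const char *text, int *memory_size, char unit[3])
{
    return sscanf(text, "%*s%d%2s", memory_size, unit);
}

/* The format from this answer: skip to '(', then read number and unit. */
int scan_fixed(const char *text, int *memory_size, char unit[3])
{
    return sscanf(text, "%*[^(](%d%2[^)]%*s", memory_size, unit);
}
```

For "abcde(1GB)", scan_fixed should return 2 and fill in both values. scan_original returns a value other than 2 (EOF or 0, depending on the implementation) and leaves memory_size and unit untouched, which matches what the question describes.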
2
votes

Please read the following description of the sscanf() function:

sscanf function description

Now, as MOHAMED said, %[^(] reads characters up to "(".

So %*[^(] skips the string up to (, after which %d reads the integer, and then the remaining text is captured up to ).

1
votes

Supplemental, posted after an answer was accepted

@MOHAMED answer is good, but below are candidate improvements.

1) Always check the sscanf() result to ensure the data was scanned as intended.

if (sscanf(list, "%*[^(](%d%[^)]", &memory_size, unit) != 2) ScanFailure();

2) When using the "%s" or "%[]" specifiers, include a limiting length. @ajay

char unit[3]={0} ;
//                          v--- 1 less than buffer size.
if (sscanf(list, "%*[^(](%d%2[^)]" , &memory_size, unit) != 2) ScanFailure();

3) Appending a "%n" (save scan position) is a sure-fire way to detect that the trailing ) was present and that no extra junk follows it at the end of the input string.
"%n" does not add to the sscanf() result.

int n = 0;
//                               )%n <--- Look for ) then save scan position
if (sscanf(list, "%*[^(](%d%2[^)])%n", &memory_size, unit, &n) != 2 ||
    list[n] != '\0') ScanFailure();
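
Points 1 to 3 can be combined into one helper, sketched below. The name parse_size_strict and the choice to write the outputs only on success are assumptions made for illustration, not something taken from the answers here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: accepts only strings shaped exactly like
 * "name(<int><unit>)", where the unit is 1 or 2 characters.
 * Returns 1 on success and 0 on failure; the outputs are written
 * only on success. unit must have room for at least 3 bytes. */
int parse_size_strict(const char *text, int *memory_size, char unit[3])
{
    int value = 0;
    char buf[3] = {0};
    int n = 0; /* stays 0 unless sscanf gets as far as %n */

    if (sscanf(text, "%*[^(](%d%2[^)])%n", &value, buf, &n) != 2)
        return 0; /* number or unit missing */
    if (n == 0 || text[n] != '\0')
        return 0; /* closing ')' missing, or trailing junk after it */

    *memory_size = value;
    strcpy(unit, buf); /* buf holds at most 2 characters plus '\0' */
    return 1;
}
```

With this check, "abcde(1GB)" is accepted, while "abcde(1GB", "abcde(1GB)junk" and "abcde(1GBX)" are all rejected, because in each of those cases either %n is never reached or the scan stops before the end of the string.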

4) Allowing optional white-space may or may not be useful. Most specifiers already skip optional leading white-space, with 3 exceptions: %c, %[] and %n.

   "%*[^(](%d%2[^)])%n"
//  v         v       v
   " %*[^(](%d %2[^)]) %n"