1
votes

Following are some basic questions that I have with respect to strings in C.

  • If string literals are stored in a read-only data segment and cannot be changed after initialisation, what is the difference between the following two initialisations?

char *string = "Hello world";

const char *string = "Hello world";

  • When we dynamically allocate memory for strings, the following allocation seems able to hold a string of arbitrary length. Although this allocation works, I understand that it is good practice to allocate the actual size of the string rather than the size of the data type. Please advise on the proper use of dynamic allocation for strings.

char *str = (char *)malloc(sizeof(char));

scanf("%s",str);

printf("%s\n",str);

7
please only one question at a time - Jens Gustedt

7 Answers

2
votes

1. What is the difference between the following two initialisations? As others have already said, the difference is whether the error is caught at compile time or at run time.

char *string = "Hello world"; ---> The literal is typically stored in a read-only data segment and must not be changed. If you try to change it through this pointer, the compiler won't report an error; the failure only shows up at run time.

const char *string = "Hello world"; ---> The literal is stored in the same place, but because the pointer is declared const, an attempt to change the string through it is reported at compile time, which is far better than a failure at run time.
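To make the distinction concrete, here is a minimal sketch. The helper names count_char and replace_char are hypothetical, made up for illustration. The first only reads, so it takes const char * and can safely be given a literal; the second writes, so it must be given writable storage such as an array.

```c
#include <stddef.h>

/* Hypothetical helper: reads s but never writes through it,
   so it is safe to pass a string literal. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    /* *s = 'x';   would not compile here: s points to const char */
    return n;
}

/* Hypothetical helper: writes through s, so the caller must pass
   writable storage (an array or malloc'd memory), never a literal. */
void replace_char(char *s, char from, char to)
{
    for (; *s != '\0'; ++s)
        if (*s == from)
            *s = to;
}
```

Calling replace_char on an array declared as char buf[] = "Hello world"; is fine, because the array is a writable copy of the literal. Calling it on a char * that points at the literal itself compiles but is undefined behaviour.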

2. Please guide on proper usage of dynamic allocation for strings.

 char *str = (char *)malloc(sizeof(char));
 scanf("%s",str);
 printf("%s\n",str);

This code may work some of the time, but not always. It writes past the single byte that was allocated, which is undefined behaviour: at run time you may get a segmentation fault, because you are accessing memory that is not owned by your program. Be very careful with dynamic memory allocation, as mistakes like this lead to dangerous run-time errors.

You should always allocate exactly the amount of memory you need. Most errors come from string handling. Keep in mind that there is a '\0' character at the end of the string, and that it is your responsibility to allocate memory for it.
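As a rough sketch of that advice, the code from the question could be bounded like this. The helper name read_word and the limit of 31 characters are assumptions chosen for illustration, not part of the question. It parses from a string with sscanf so it can be tried without typing input; the same field width works with scanf.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_WORD 31   /* assumed limit on word length, excluding '\0' */

/* Hypothetical helper: reads one whitespace-delimited word from input
   into a freshly allocated buffer. Returns NULL if allocation fails
   or no word is found. The caller must free() the result. */
char *read_word(const char *input)
{
    char *str = malloc(MAX_WORD + 1);   /* +1 for the terminating '\0' */
    if (str == NULL)
        return NULL;

    /* The width 31 (keep it equal to MAX_WORD) stops sscanf from
       storing more than 31 characters plus the terminator. */
    if (sscanf(input, "%31s", str) != 1) {
        free(str);
        return NULL;
    }
    return str;
}
```

Longer words are cut off at the limit instead of overflowing the buffer.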

Hope this helps.

1
votes

what is the difference between the following two initialisations.

String literals have type array of char (which decays to char*) rather than array of const char, for legacy reasons. It's good practice to point to them only via const char*, because modifying them is undefined behaviour.

I see the following allocation is capable enough to hold a string of arbitary length.

Wrong. This allocation only allocates memory for one character. If you tried to write more than one byte into string, you'd have a buffer overflow.

The proper way to dynamically allocate memory is simply:

char *string = malloc(your_desired_max_length);

Explicit cast is redundant here and sizeof(char) is 1 by definition.

Also: Remember that the string terminator (0) has to fit into the string too.
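For illustration, here is one way to keep the terminator inside the allocation when copying at most a given number of characters. This is only a sketch, and bounded_copy is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of at most
   max_len characters of src, always null-terminated.
   Returns NULL if allocation fails. The caller must free() the result. */
char *bounded_copy(const char *src, size_t max_len)
{
    size_t n = strlen(src);
    if (n > max_len)
        n = max_len;             /* truncate to the requested maximum */

    char *dst = malloc(n + 1);   /* n characters plus the terminator */
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, n);
    dst[n] = '\0';               /* the terminator always fits */
    return dst;
}
```

For example, bounded_copy("Hello world", 5) yields a 6-byte buffer holding "Hello".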

1
votes

In the first case, the const char * declaration converts the literal's char * to a pointer to const, meaning you're disallowing changes, at the compiler level, to the characters behind it. In C, it's actually undefined behaviour (at runtime) to try to modify those characters regardless of their const-ness, but a string literal decays to a char *, not a const char *.

In the second case, I see two problems.

The first is that you should never cast the return value from malloc since it can mask certain errors (especially on systems where pointers and integers are different sizes). Specifically, unless there is an active malloc prototype in place, a compiler may assume that it returns an int rather than the correct void *.

So, if you forget to include stdlib.h, you may experience some funny behaviour that the compiler couldn't warn you about, because you told it with an explicit cast that you knew what you were doing.

C is perfectly capable of implicitly converting between the void * returned from malloc and any other object pointer type.

The second problem is that it only allocates space for one character, which will be the terminating null for a string.

It would be better written as:

char *string = malloc (max_str_size + 1);

(and don't ever multiply by sizeof(char), that's a waste of time - it's always 1).
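Putting those points together, a minimal sketch might look like the following. The function name alloc_string is hypothetical, and the NULL check and the initial empty string are additions of mine rather than something the line above requires.

```c
#include <stdlib.h>   /* declares malloc, so its return type is known to be void * */

/* Hypothetical helper: allocates room for a string of up to
   max_str_size characters and makes it an empty string.
   Returns NULL if allocation fails. The caller must free() the result. */
char *alloc_string(size_t max_str_size)
{
    /* No cast and no sizeof(char): void * converts implicitly,
       and sizeof(char) is 1 by definition. */
    char *string = malloc(max_str_size + 1);
    if (string != NULL)
        string[0] = '\0';   /* start out as a valid empty string */
    return string;
}
```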

1
votes

The difference between the two declarations is that the compiler will produce an error (which is much preferable to a runtime failure) if an attempt to modify the string literal is made via the const char* declared pointer. The following code:

const char* s = "hello"; /* 's' is a pointer to 'const char'. */
*s = 'a';

results in VC2010 emitting the following error:

error C2166: l-value specifies const object

An attempt to modify the string literal via the char* declared pointer won't be detected at compile time (VC2010 emits no error), and the behaviour at runtime is undefined.

When malloc()ing memory for storing of strings you must remember to allocate one extra char for storing the null terminator as all (or nearly all) C string handling functions require the null terminator. For example, to allocate a buffer for storing "hello":

char* p = malloc(6); /* 5 for "hello" and 1 for null terminator. */

sizeof(char) is guaranteed to be 1, so multiplying by it is unnecessary, and there is no need to cast the return value of malloc(). When p is no longer required, remember to free() the allocated memory:

free(p);
1
votes

Difference between the following two initialisations.

First, char *string = "Hello world";
- "Hello world" has static storage duration (it is typically placed in a read-only data segment, not on the stack) and its address is assigned to the pointer variable 'string'.
"Hello world" must be treated as constant. You can't do string[5]='g'; doing so is undefined behaviour and typically causes a segmentation fault.
The 'string' variable itself, however, is not constant, and you can change its binding:
string= "Some other string"; //this is correct, no segmentation fault

Second, const char *string = "Hello world";
Again "Hello world" is stored the same way and its address is assigned to the 'string' variable. Here string[5]='g' is rejected by the compiler, because the pointed-to characters are declared const.
That compile-time check is what the const keyword buys you here.

Now,
char *string = (char *)malloc(sizeof(char));

Like the first declaration, this defines a non-const char pointer, but this time it points to a single byte of writable memory allocated dynamically from the heap rather than to a string literal.

0
votes

The code:

char *string = (char *)malloc(sizeof(char));

Will not hold a string of arbitrary length. It allocates a single character and returns a pointer to that character. Note that a pointer to a character and a pointer to what you call a string have the same type.

To allocate space for a string you must do something like this:

const char *data="Hello, world";
char *copy=(char*)malloc(strlen(data)+1);
strcpy(copy,data);

You need to tell malloc exactly how many bytes to allocate. The +1 is for the null terminator that needs to go onto the end.

As for string literals being stored in a read-only segment, this is an implementation detail, although it is pretty much always the case. Most C compilers are pretty relaxed about enforcing const access to these strings, but attempting to modify them is asking for trouble, so you should always declare pointers to them as const char * to avoid any issues.

0
votes

That particular allocation may appear to work, as there's probably plenty of space in the program heap, but it doesn't. You can verify this by allocating two "arbitrary" strings with the proposed method and memcpy:ing some long enough string to the corresponding addresses. In the best case you'll see garbage; in the worst case you'll get a segmentation fault or an assert from malloc or free.