
Pointers, arrays, and string literals

  Artful Code        2011-09-22 13:29:23

A recently posted question on Stack Overflow highlighted a common misconception about the role of pointers and arrays held by many programmers learning C.

The confusion concerns how pointers and strings relate in C. A pointer holds an address in memory. It often points to an element of an array, as in the function strtoupper in the following code:

#include <ctype.h>  // toupper
#include <stdio.h>  // printf

void strtoupper(char *str)
{
    if (str) {  // null ptr check, courtesy of Michael
        while (*str != '\0') {
            // destructively modify the contents at the current pointer location
            // using the dereference operator to access the value at the current
            // pointer address. toupper expects a value representable as
            // unsigned char, hence the cast.
            *str = (char)toupper((unsigned char)*str);
            ++str;
        }
    }
}
 
int main(void)
{
    char my_str[] = "hello world";
    strtoupper(my_str);
    printf("%s\n", my_str);  // prints "HELLO WORLD"
    return 0;
}

my_str is an array of chars; in most expressions it is converted to a pointer to its first element. This allows us to use pointer arithmetic to reach elements of the array and modify them using the dereference operator. In fact, an array subscript such as my_str[3] is identical to the expression *(my_str + 3).

char my_str[] = "hello world";
*my_str = toupper(*my_str);
*(my_str + 6) = toupper(*(my_str + 6));
printf("%s", my_str); // prints, "Hello World"

However, if my_str is declared as a char pointer to the string literal “hello world” rather than a char array, these operations have undefined behavior and typically crash the program:

char *my_str = "hello world";
*my_str = toupper(*my_str); // fails: writes to a string literal
*(my_str + 6) = toupper(*(my_str + 6)); // fails: writes to a string literal
printf("%s", my_str);

Let’s explore the difference between the two declarations.

char *a = "hello world";
char b[] = "hello world";

In the compiled program, “hello world” is likely stored in the executable's read-only data. It is effectively an immutable, constant value: pointing char *a at it gives the scope read access only, and attempting to write through a is undefined behavior. On many systems the write crashes the program; on others it may silently change a literal that other code shares, causing that code to behave erratically (read this response to the above post on Stack Overflow for an excellent explanation of this behavior).

The declaration of char b[] instead allocates a local array of chars and initializes it with a copy of “hello world”, including the terminating '\0'. b is the array itself; in most expressions it converts to a pointer to its first element. The complete statement, combining declaration and initialization, is shorthand. Dispensing with the array size (e.g., char[] instead of char[12]) is permitted because the compiler can determine the size from the string literal used as the initializer.
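One way to see that the two declarations produce different kinds of objects is to compare what sizeof reports for each. The sketch below is illustrative only: the function names pointer_size and array_size are hypothetical, the pointer is declared const as a precaution, and static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* The literal itself has array type char[12]: 11 characters plus '\0'. */
static_assert(sizeof "hello world" == 12, "literal includes the terminator");

/* Hypothetical helper: reports the size of a pointer to the literal. */
size_t pointer_size(void)
{
    const char *a = "hello world";
    return sizeof a;   /* size of the pointer itself, typically 4 or 8 bytes */
}

/* Hypothetical helper: reports the size of an array initialized from it. */
size_t array_size(void)
{
    char b[] = "hello world";
    return sizeof b;   /* size of the whole array: 12 */
}
```

array_size() returns 12 on any conforming compiler, while pointer_size() depends on the platform's pointer width and has nothing to do with the length of the text.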

In both cases, subscript notation can be used to read the characters:

int i;
for (i = 0; a[i] != '\0'; ++i)
    printf("%c", toupper(a[i]));

However, only with b is the program able to modify the values in memory, since the declaration copies the literal into a mutable array (on the stack, when b is a local variable):

int i;
for (i = 0; b[i] != '\0'; ++i)
    b[i] = toupper(b[i]);
printf("%s", b);
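When the text to be changed starts out as a string literal, one defensive pattern is to declare the pointer as const char * so the compiler rejects accidental writes, and to copy the characters into a caller-owned buffer before modifying them. The following is a minimal sketch of that idea; upper_copy is a hypothetical helper introduced for illustration, not something from the original post.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * upper_copy (hypothetical): reads src through a const pointer and writes
 * the uppercased text into the caller's buffer, always leaving dst
 * null-terminated. Output is truncated if dst is too small.
 * Returns the number of characters written, excluding the terminator.
 */
size_t upper_copy(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Stop one short of dst_size to leave room for the '\0'. */
    while (i + 1 < dst_size && src[i] != '\0') {
        /* toupper expects a value representable as unsigned char. */
        dst[i] = (char)toupper((unsigned char)src[i]);
        ++i;
    }
    dst[i] = '\0';
    return i;
}
```

Given const char *a = "hello world"; and char buf[12];, the call upper_copy(buf, sizeof buf, a) leaves the uppercased text in buf and never touches the literal. With the const qualifier in place, a statement such as *a = 'H'; is rejected at compile time instead of failing at run time.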
