Strings are an essential data type in almost any real-world program. However, C has no basic data type called ‘string’. A string is instead treated as a sequence of characters, that is, in programming terms, an array of characters.
In C, there are two ways to declare and define a string. The first is as an array of characters, such as ‘char mystr[100];’, where the index starts from zero just like any array. Such an array can hold a string of at most 99 characters: the array length is 100, but the string must end with the null character, which leaves 99 characters for the actual text. The memory is already allocated at the size specified (100 here), on the function's stack when the array is a local variable, so one only needs to assign a string value.
The second way is through a pointer, such as ‘char *mystr;’. This pointer is meant to point to a block of memory of type char that holds the string. However, right after the declaration the pointer holds a junk address and no memory has been allocated for the string. Hence, the programmer must allocate memory before assigning a string value.
When the string is defined as an array of chars, there are some common ways to initialize it. The first is direct initialization with a constant string. Note that in this case the string is automatically terminated by a null character, which is stored right after the last visible character in the memory of ‘mystr’.
The second is indirect initialization through another string, copying it character by character in a loop (the library string functions are covered in a later section). Here the programmer has to take care that the destination is large enough and that the terminating null character is copied as well. Note that an array cannot be assigned a string with ‘=’ after its declaration; it has to be assigned char by char.
When the string is instead declared through a pointer, memory has to be allocated for it first, as already mentioned. After that, assigning a string value works the same way as for an array. Here as well, the programmer needs to take care of the length and the terminating null character.
There is one more way of initializing strings which, interestingly, applies only to read-only constant strings: pointing a ‘char *’ directly at a string constant. The string is then typically written to a read-only section of the executable, and modifying it is undefined behavior. Hence, one will usually get a runtime error, such as a segmentation fault, when trying to change the value.
C provides a standard string library for general string operations, declared in the header file ‘string.h’. All of the string functions assume and expect that the strings are null terminated.
The commonly used string functions are discussed below, and their other variants are documented in the man pages. The sample program at the end uses them together and shows its output; explore further and modify it to experiment.
Two properties of a C string are worth keeping in mind:
- A string constant is written directly in double quotes.
- It is terminated by a null character, i.e. ‘\0’.
Defining strings in C
In C, there are two ways to declare and define a string. The first is as an array of characters, where the index starts from zero just like any array:
Code:
char mystr[100];
The second way is through a pointer:
Code:
char *mystr;
Initializing strings
In case the string is defined as an array of chars, then here are some common ways to initialize the string:
1. Direct initialization with a constant string:
Code:
char mystr[100] = "my string";
The memory snapshot of ‘mystr’ looks like this:
Code:
index: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
value:  m   y  ' '  s   t   r   i   n   g  \0
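To make the difference between the array size and the string length concrete, here is a minimal sketch. The helper name literal_init_is_terminated is hypothetical and exists only for this illustration; it relies on sizeof and on strlen() from string.h, which is covered later in this article.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: checks the properties of an array that was
   initialized directly from a string constant.
   Returns 1 if all the expected properties hold, 0 otherwise. */
int literal_init_is_terminated(void)
{
    char mystr[100] = "my string";

    size_t capacity = sizeof(mystr);  /* bytes reserved for the whole array */
    size_t length = strlen(mystr);    /* characters before the '\0' */

    /* The terminator sits right after the last visible character. */
    return capacity == 100 && length == 9 && mystr[length] == '\0';
}
```

The array still occupies all 100 bytes even though the string only uses ten of them, the tenth being the null character.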
2. Indirect initialization through another string:
Note: a loop is used for illustration here, as the library string functions are covered in a later section.
Code:
// A typical example of copying a string value to a char array
for (index = 0; index < length; index++)
{
    mystr[index] = some_str[index];
}
Points to take care of here:
- The storage of ‘mystr’ must be at least ‘length’ characters, otherwise the loop writes past the end of the array.
- ‘length’ should be computed so that it includes the terminating null character at the end of ‘some_str’.
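The two points above can be folded into the copy loop itself. The following is a sketch under stated assumptions, not a standard function: copy_chars and its parameter names are hypothetical, and truncating when the source does not fit is one possible policy among several.

```c
#include <stddef.h>

/* Hypothetical helper: copies src into dest (capacity dest_size bytes)
   character by character, always leaving dest null terminated.
   Returns the number of characters copied, not counting the '\0'. */
size_t copy_chars(char *dest, size_t dest_size, const char *src)
{
    size_t index = 0;

    if (dest_size == 0) {
        return 0;  /* no room even for the terminator */
    }
    /* Stop at the end of src, or when only the terminator slot is left. */
    while (index < dest_size - 1 && src[index] != '\0') {
        dest[index] = src[index];
        index++;
    }
    dest[index] = '\0';
    return index;
}
```

A call such as copy_chars(mystr, sizeof(mystr), some_str) then cannot write beyond the array, whatever the length of some_str.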
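Note that an array cannot be assigned a string with ‘=’ after it has been declared. For example, the following is wrong:
Code:
mystr = "a string"; // incorrect: an array cannot be assigned to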
In the other case, where the string is declared through a pointer, memory has to be allocated for it first, as mentioned in the previous section. Apart from that, assigning a string value works the same way.
For example:
Code:
char *pStr;
pStr = (char*) malloc(100 * sizeof(char)); // allocating memory; check for NULL before use
for (index = 0; index < length; index++)
{
    pStr[index] = some_str[index]; // copy into the allocated memory
}
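The allocate-then-copy steps can be wrapped into one function. This is only a sketch: copy_to_heap is a hypothetical name, and sizing the allocation to the source (instead of a fixed 100) is an assumption made for the example. Loops are used on purpose, in keeping with this section.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated copy of some_str,
   or NULL if the allocation fails. The caller must free() the result. */
char *copy_to_heap(const char *some_str)
{
    size_t length = 0;
    size_t index;
    char *pStr;

    /* Count the characters before the terminating '\0'. */
    while (some_str[length] != '\0') {
        length++;
    }

    pStr = malloc(length + 1);  /* +1 for the null character */
    if (pStr == NULL) {
        return NULL;
    }

    /* index <= length so that the '\0' is copied as well. */
    for (index = 0; index <= length; index++) {
        pStr[index] = some_str[index];
    }
    return pStr;
}
```

Unlike an array, the memory obtained this way stays allocated until it is released with free().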
Moving further, there is another way of initializing strings which, interestingly, applies only to read-only constant strings:
Code:
char *pConstStr = "My constant string";
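When the contents do need to change, one common approach is to copy the constant string into writable storage first. The sketch below assumes that approach; make_modified_copy is a hypothetical helper, and the pointer is declared as const char * here (a change from the declaration above) so that the compiler can reject accidental writes. It uses strlen() and memcpy() from string.h.

```c
#include <stddef.h>
#include <string.h>

/* const lets the compiler reject writes such as pConstStr[0] = 'm'. */
static const char *pConstStr = "My constant string";

/* Hypothetical helper: copies the constant string into caller-owned,
   writable storage and modifies the copy.
   Returns 0 on success, -1 if out is too small. */
int make_modified_copy(char *out, size_t out_size)
{
    size_t needed = strlen(pConstStr) + 1;  /* include the '\0' */

    if (out_size < needed) {
        return -1;
    }
    memcpy(out, pConstStr, needed);
    out[0] = 'm';  /* safe: out is writable memory */
    return 0;
}
```

The original constant string is left untouched; only the copy is changed.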
String functions from the standard string library
C provides a standard string library for general string operations. To use the string functions, include the following header file:
Code:
#include <string.h>
Let's discuss the commonly used string functions here. Their other variants are documented in the man pages.
- strlen()
Determines the length of a string. It takes a string as input and returns its length, not counting the terminating null character.
Syntax:
Code:
size_t strlen(const char *string);
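To show what the returned length means, here is a sketch of how such a function could be written by hand. my_strlen is a hypothetical name used only for illustration; real code should simply call strlen(), and actual library implementations are usually more optimized than this.

```c
#include <stddef.h>

/* Hypothetical re-implementation for illustration only:
   counts characters until the terminating '\0' is reached. */
size_t my_strlen(const char *string)
{
    size_t count = 0;

    while (string[count] != '\0') {
        count++;
    }
    return count;  /* the '\0' itself is not counted */
}
```

This also shows why the string must be null terminated: without the null character the loop has no place to stop.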
- strcpy(), strncpy()
Copy one string to another. The two versions named above are the ones in common use.
Syntax:
Code:
char *strcpy(char *destination, const char *source);
char *strncpy(char *destination, const char *source, size_t length);
It is recommended to use the second version, strncpy(), because it takes the maximum number of characters to copy. This avoids certain vulnerable situations that strcpy() can open up in an application. How? strcpy() keeps copying until it finds the terminating null character of the source. Suppose an attacker manages to overwrite that null character. Since the correct terminator is gone, more is copied than the actual length of the source, and the write can run past the end of the destination. Such an overflow can be exploited in many ways.
Hence, one can avoid this vulnerability by passing strncpy() a length that is no larger than the size of the destination. Note that strncpy() does not add a null character when the source is ‘length’ characters or longer, so the destination may need to be terminated explicitly.
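A bounded copy with strncpy() might look like the sketch below. copy_terminated is a hypothetical wrapper, not part of the library; it assumes the caller knows the size of the destination and accepts truncation when the source is too long.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around strncpy() that always terminates dest.
   Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return 0;  /* nothing can be stored, not even the '\0' */
    }
    /* Copy at most dest_size - 1 characters, keeping the last slot free. */
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';  /* strncpy() may not have added one */

    return strlen(src) < dest_size;
}
```

The length passed to strncpy() is derived from the destination, since that is the buffer that must not be overrun.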
- strcmp()
Used to compare two strings. It returns zero when the two strings are the same, a negative number when the first string sorts before the second, and a positive number otherwise.
Syntax:
Code:
int strcmp(const char *str1, const char *str2);
It also has another version, strncmp(), which compares at most ‘length’ characters.
Syntax:
Code:
int strncmp(const char *str1, const char *str2, size_t length);
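Since only the sign of the result is guaranteed, it can help to normalize it, and strncmp() is handy for prefix checks. Both helpers below, compare_sign and starts_with, are hypothetical names introduced for this sketch.

```c
#include <string.h>

/* Hypothetical helper: reduces the result of strcmp() to -1, 0 or 1. */
int compare_sign(const char *str1, const char *str2)
{
    int result = strcmp(str1, str2);

    return (result > 0) - (result < 0);
}

/* Hypothetical helper: returns 1 if str begins with prefix, 0 otherwise.
   Only the first strlen(prefix) characters are compared. */
int starts_with(const char *str, const char *prefix)
{
    return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

Code should test the result against zero rather than against specific values such as 1 or -1, because the exact non-zero number returned by the library is not specified.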
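- strcat(), strncat()
Concatenates one string to another. It appends the second parameter string to the end of the first one and returns the first. The destination must have enough room for the combined string; strncat() appends at most ‘length’ characters and then adds a terminating null character.
Syntax:
Code:
char *strcat(char *destination, const char *source);
char *strncat(char *destination, const char *source, size_t length);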
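For strncat(), the useful length is the room left in the destination, not the length of the source. The sketch below assumes the destination already holds a null-terminated string; append_bounded is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends source to destination, whose total
   capacity is dest_size bytes, without writing past the end.
   Assumes destination already holds a null-terminated string.
   Returns 1 if all of source fit, 0 if it was truncated. */
int append_bounded(char *destination, size_t dest_size, const char *source)
{
    size_t used = strlen(destination);
    size_t room;

    if (used + 1 > dest_size) {
        return 0;  /* no space left, not even for a terminator */
    }
    room = dest_size - used - 1;  /* keep one byte for the '\0' */

    /* strncat() appends at most room characters plus a '\0'. */
    strncat(destination, source, room);
    return strlen(source) <= room;
}
```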
- strstr()
Finds a substring within a string. The terminating null characters are not compared. It returns a pointer to the first occurrence if the substring is found, and NULL if it is not.
Syntax:
Code:
char *strstr(const char *mainstring, const char *substring);
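Because strstr() returns a pointer into the main string, the position of the match can be obtained by pointer subtraction. A minimal sketch follows; find_offset is a hypothetical helper, and returning -1 for "not found" is a convention chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first occurrence of
   substring in mainstring, or -1 if it does not occur. */
long find_offset(const char *mainstring, const char *substring)
{
    const char *hit = strstr(mainstring, substring);

    if (hit == NULL) {
        return -1;
    }
    return (long)(hit - mainstring);  /* distance from the start */
}
```

Always check the result for NULL before using it; dereferencing a NULL pointer is undefined behavior.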
Usage
Here is one program to illustrate the usage of the above string functions:
Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define MAX 30

int main(void)
{
    char str1[MAX];
    char *str2;
    char *str3 = "Go4Expert";
    int length;

    printf("Enter a string\n");
    //read at most MAX - 1 characters so that the input fits in str1
    if (scanf("%29s", str1) != 1)
    {
        return 1;
    }
    str2 = (char*) malloc(MAX * sizeof(char));
    if (str2 == NULL)
    {
        return 1; //allocation failed
    }
    //get length of str3
    length = (int) strlen(str3);
    printf("Length of str3 = %s is %d\n", str3, length);
    //copy str3 to str2
    strncpy(str2, str3, length + 1);// +1 to include null character in the end
    printf("Copied str3 to str2 = %s\n", str2);
    //copy something to str1 (this overwrites the string read above)
    strncpy(str1, " is great!", strlen(" is great!") + 1);// +1 to include null character in the end
    //compare str2 and str3
    if (strncmp(str2, str3, length))
    {
        printf("str2 and str3 are not same\n");
    }
    else
    {
        printf("str2 and str3 are same\n");
    }
    //concatenate, limited to the room left in str2
    strncat(str2, str1, MAX - strlen(str2) - 1);
    printf("str2 is now %s\n", str2);
    //find sub string
    if (strstr(str2, str3))
    {
        printf("Substring found\n");
    }
    else
    {
        printf("Substring not found\n");
    }
    //free memory allocated - str2
    free(str2);
    return 0;
}
The output:
Code:
rupali@home-OptiPlex-745:~/programs/strings$ ./strings
Enter a string
Go4Expert
Length of str3 = Go4Expert is 9
Copied str3 to str2 = Go4Expert
str2 and str3 are same
str2 is now Go4Expert is great!
Substring found
Some Interesting facts
- For any string, declared either as char *str or char str[10], writing ‘str’ gives the address of the first character of the string, so ‘*str’ is the first character.
- C does not report any error if a char array, i.e. a string, is written beyond its designated size. However, doing so is undefined behavior and can lead to unexpected results or crashes.
- Even in the case of an array such as char str[10], ‘str’ can be treated much like a pointer to the string in most expressions.
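The first and last facts can be checked with a few comparisons. This is only an illustrative sketch; array_name_points_to_first_char is a hypothetical name, and sizeof is included to show one place where an array and a pointer still differ.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the array name behaves as described. */
int array_name_points_to_first_char(void)
{
    char str[10] = "hello";
    char *p = str;  /* the array name converts to &str[0] */

    return *str == 'h'            /* dereferencing gives the first character */
        && p == &str[0]           /* same address as the first element */
        && sizeof(str) == 10      /* sizeof still sees the whole array */
        && sizeof(p) == sizeof(char *);  /* while p is just a pointer */
}
```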
