C Style Strings

There are two methods for creating strings in C++. One method is to use the object oriented features to create objects. The other method is the method that was used in C, which did not allow the creation of objects. Since we haven't studied objects yet, we'll look at the method that was used in C. We refer to these types of strings as "C style strings" to distinguish them from string objects.

C style strings use arrays of characters to hold strings. Each array must contain a special value to mark the end of the string. This value is stored in the array element right after the last character of the string. This value is a byte of all zeros and is represented by the symbol \0. We call this value "the null byte". The null byte allows us to put a string into an array even if the string doesn't fill up the entire array; the characters in the string are placed in the array and the null byte is placed after them to mark the end of the string.

String Constants

A string constant, or string literal, is any sequence of characters enclosed in double quotes. We have been using them all semester in our output. A string literal is stored in an array and a null byte is added after the last character. For example:

     cout << "The value is " << val << endl;

In memory, this string is stored as

T h e   v a l u e   i s   \0

String Variables

To create a C style string variable, just declare an array of char. One byte of the array must hold the null byte, so the maximum length of a string that can be stored in the array is one less than the size of the array. Smaller strings can be stored in the array; the array elements after the null byte do not contain the string and will be ignored.

     char name[51];  //  holds strings up to 50 chars in length
     char word[10];  //  holds strings up to 9 chars in length

Initializing Strings

A string variable can be initialized by setting it equal to a string literal in the declaration. If the size of the array is omitted, the array will be created so that it is just large enough to hold the literal (including the null byte).

     char word[12] = "maritime";
     char phrase[] = "Hi Mom!";

These declarations create the following arrays. Because the literal "maritime" does not fill word, the remaining elements are also set to the null byte:

m a r i t i m e \0 \0 \0 \0

H i   M o m ! \0

Strings as Arrays

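Strings are arrays, not scalar variables, even though we think of the string as a single variable. This means that there are some restrictions on how we work with strings. For example, one string cannot be assigned to another with =, and comparing two strings with == does not compare their characters. The library functions strcpy and strcmp, described later, do that work instead.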

String Output

Because we frequently read and print strings, the I/O objects in C++ have been designed to let us read and print an entire string in one statement. This is one case where a loop is not needed to process a string.

We already know how to output string literals. We output string variables by putting the string name in the output statement. No loop is needed:

     cout << "Today's word is " << word << endl;
     cout << phrase << endl;

When we output a string variable, all characters up to (but not including) the null byte will be printed. Therefore, the above cout statements will print:

Today's word is maritime
Hi Mom!

String Input

There are two methods for reading into a string. One method reads all characters up to the next whitespace, and puts those characters into the string. A null byte is added after the last character. This corresponds to the idea of reading one word. The other method reads everything remaining on the current input line, and puts those characters into the string. A null byte is added after the last character. This corresponds to the idea of reading one line.

Reading One "Word" at a Time

To read up to the next whitespace, use the extraction operator >>. The extraction operator will skip any leading whitespace, read all characters into the string up to the next whitespace, and put the null byte in the string after the last character read. C++ does NOT check to make sure that the array is large enough; it will store data past the end of the array if there is not enough room. For example, suppose we are given:

     char first[10], second[12], third[8];
     cin >> first >> second;
     cin >> third;

Here are some input data and the values that will be stored if this data is used:

Input: the first input

first:
t h e \0 ?? ?? ?? ?? ?? ??

second:
f i r s t \0 ?? ?? ?? ?? ?? ??

third:
i n p u t \0 ?? ??

Input:    another word
     bunch

first:
a n o t h e r \0 ?? ??

second:
w o r d \0 ?? ?? ?? ?? ?? ?? ??

third:
b u n c h \0 ?? ??

Input: thereisabig space problem

first:
t h e r e i s a b i
g \0

second:
s p a c e \0 ?? ?? ?? ?? ?? ??

third:
p r o b l e m \0

Notice the problem: the first "word" is too large to fit into the first string, so the extra characters will overflow the string. They will be placed into the memory locations following the end of the array, which will result in unpredictable behavior: the program may give erroneous results or crash.

Reading One Line at a Time

To read a whole line of input into a string, we use the getline function. This function can be used with cin or with an ifstream. The getline function takes two parameters: the first is the string, and the second is the size of the string (the size of the array). The getline function will read the next input line into the string and place a null byte at the end of the data. However, if the line is too long to fit into the string, getline will read only enough characters to fill the array, leaving room for the null byte that it places at the end; the rest of the line remains in the input. For example:

     char line[20];
     cin.getline(line, 20);

Input: the first line

line:
t h e   f i r s t   l i n e \0 ?? ?? ?? ?? ??

Input: the extra   long line

line:
t h e   e x t r a       l o n g   l i \0

Notice that getline will only read 19 characters into this string. If the input line contains more data than that, only 19 characters are read and the 20th element of the array is set to the null byte.

String Library Functions

There are a number of built-in functions to help us work with strings. To use these functions, you must include <cstring> (or the older C header <string.h>) in your program. Some of the most useful functions are:

strlen: returns the length of a string

strlen has one parameter: a string. It returns the length of the string (not including the null byte):

     length = strlen(word);

strcat: concatenates two strings

strcat accepts two strings as parameters and concatenates (or appends) the second string to the end of the first string. It is the programmer's responsibility to make sure the first string is large enough to hold the contents of both strings:

     strcat(string1, string2);

strcpy: string copy

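strcpy accepts two strings as parameters and copies the second string into the first string, including the null byte. It is the programmer's responsibility to make sure the first string is large enough to hold the copy:

     strcpy(name, "john jones");
     strcpy(name, other_name);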

strcmp: string compare

strcmp accepts two strings as parameters and compares them character by character. It returns an integer less than 0 if the first string is less than the second string, 0 if the strings are equal, and an integer greater than 0 if the first string is greater than the second string:

     result = strcmp(string1, string2);



© Copyright Emmi Schatz 2004