Lecture Notes for Chapter 8

C-style strings, char *'s and c_str

C-style strings are character arrays that are null-terminated. This means that the character following the last character of the string is '\0', which is called the "null character." (It is a character, and is not the same thing as the pointer constant NULL.) You can create a C-style string in many ways. The program pexD.cpp uses four of them -- using a string constant (in double-quotes), using array initialization, writing code to fill in the characters, and finally using the c_str() method of a C++ string. Note that in every case the null character must be at the end:


  #include <iostream>
  #include <cstring>
  #include <string>
  using namespace std;

  int main()
  {
    const char *s1 = "Jim";    // string constants are read-only, so point to const
    char s2[6] = { 'P', 'l', 'a', 'n', 'k', '\0' };
    char s3[4];
    const char *s4;
    string string4 = "Mayo";

    s3[0] = 'M';
    s3[1] = 's';
    s3[2] = '.';
    s3[3] = '\0';

    s4 = string4.c_str();

    cout << "The four strings: " << s1 << ", ";
    cout << s2 << ", " << s3 << ", " << s4 << endl;

    cout << endl;
    cout << "Strlen(s1) = " << strlen(s1);
    cout << ".  Strlen(s2) = " << strlen(s2);
    cout << ".  Strlen(s3) = " << strlen(s3);
    cout << ".  Strlen(s4) = " << strlen(s4) << "." << endl;

    cout << endl;
    cout << "Strcmp(s1, \"Jim\") = " << strcmp(s1, "Jim") << endl;
    cout << "Strcmp(s1, \"Fred\") = " << strcmp(s1, "Fred") << endl;
    cout << "Strcmp(s1, \"Plank\") = " << strcmp(s1, "Plank") << endl;  
    cout << "Strcmp(s1, \" Jim\") = " << strcmp(s1, " Jim") << endl;
    cout << "Strcmp(s1, \"jim\") = " << strcmp(s1, "jim") << endl;
    cout << "Strcmp(s1, \"Jix\") = " << strcmp(s1, "Jix") << endl;
 
    return 0;
  }

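This creates four strings. The middle two are arrays, so they must declare their own memory; the first and last are simply pointers to characters that are stored elsewhere. A pointer that holds the result of c_str() must be declared const char *, which says that you cannot modify the characters through it. The same declaration is the right one for a pointer to a string constant, since string constants are read-only. Also note that the pointer returned by c_str() is only good as long as the C++ string it came from is not modified or destroyed.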

The program also shows the use of strlen(), which returns the length of the string, not counting the null character. Then it shows strcmp(), which compares two strings lexicographically, and returns 0 if they are equal, a negative number if the first is less than the second, and a positive number if the first is greater than the second. Lexicographic comparison is done using the ASCII character codes for the characters. Since space is less than capital-J, " Jim" is less than "Jim", which is why the fourth strcmp() statement returns a positive number.

  UNIX> g++ -o pexD pexD.cpp
  UNIX> pexD
  The four strings: Jim, Plank, Ms., Mayo

  Strlen(s1) = 3.  Strlen(s2) = 5.  Strlen(s3) = 3.  Strlen(s4) = 4.

  Strcmp(s1, "Jim") = 0
  Strcmp(s1, "Fred") = 4
  Strcmp(s1, "Plank") = -6
  Strcmp(s1, " Jim") = 42
  Strcmp(s1, "jim") = -32
  Strcmp(s1, "Jix") = -11
  UNIX>

Note that although it appears that strcmp() is returning the difference between the ASCII character codes of the first differing characters, you should not rely on that, because the definition of strcmp() specifies only the sign of the result. This means that on a different machine, it may return different positive/negative values.


strchr(), strrchr() and strstr()

These are very useful functions on strings. Here are the prototypes:

  const char *strchr(const char *s, int c);
  const char *strrchr(const char *s, int c);
  const char *strstr(const char *s, const char *tofind);

In C++, each of these also has a version that takes a char * and returns a char *. They all find something in the string s -- strchr() finds the first occurrence of c, and strrchr() finds the last occurrence of c. strstr() finds the first occurrence of the substring tofind.

If they find what they're looking for, they return a pointer to it inside s. If they don't, they return NULL (the null pointer).

Below is a nice example of them working (pexE.cpp):


  #include <iostream>
  #include <cstring>
  #include <string>
  using namespace std;

  int main()
  {
    const char *s;
    string str;
    const char *found;    // s is a const char *, so the functions return const char *

    cout << "Enter a string: ";
    cin >> str;

    s = str.c_str();

    found = strchr(s, 'a');
    cout << "strchr(\"" << s << "\", 'a') returned ";
    if (found == NULL) {       // -OR- if (!found) ...
      cout << "NULL\n";
    } else {
      cout << '"' << found << '"' << endl;
    }

    found = strrchr(s, 'a');
    cout << "strrchr(\"" << s << "\", 'a') returned ";
    if (!found) {
      cout << "NULL\n";
    } else {
      cout << '"' << found << '"' << endl;
    }

    found = strstr(s, "ba");
    cout << "strstr(\"" << s << "\", \"ba\") returned ";  
    if (found == NULL) {
      cout << "NULL\n";
    } else {
      cout << '"' << found << '"' << endl;
    }
    return 0;
  }

This shows calling the various functions on a user-entered string:

  UNIX> pexE
  Enter a string: Jim
  strchr("Jim", 'a') returned NULL
  strrchr("Jim", 'a') returned NULL
  strstr("Jim", "ba") returned NULL
  UNIX> pexE
  Enter a string: Abacab
  strchr("Abacab", 'a') returned "acab"
  strrchr("Abacab", 'a') returned "ab"
  strstr("Abacab", "ba") returned "bacab"
  UNIX> 


strcpy()

The prototype for strcpy() is:

  char *strcpy(char *s1, const char *s2);

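It copies s2, including its null character, into s1 without checking string lengths. If s2 does not fit in the memory set aside for s1, strcpy() writes past the end of s1 and overwrites whatever memory follows - not good. If s2 is shorter than the string that is already in s1, s1 will hold two strings. Why?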

Example:

