CGI Tutorial by Luojian Chen


Overview

The Common Gateway Interface (CGI) is a standard for interfacing external applications with information servers, such as HTTP or Web servers. A plain HTML document is static. A CGI program, however, is executed in real-time and can output dynamic information.

Because a CGI program is executed on your system, there are some security precautions that need to be implemented. Please read the CS department's CGI security page very carefully before you start.

A CGI program can be written in any language that can be executed on the system, such as C, Perl, or a Unix shell.


CGI Specification

The HTTP server communicates with the CGI scripts in these four major ways:

Command line

The command line is used only in the case of an ISINDEX query. The server searches the query information (the QUERY_STRING environment variable) for a non-encoded = character to determine whether or not to use the command line. If it finds one, the command line is not used.
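The rule above can be sketched in Perl. The helper name uses_command_line is hypothetical and chosen only for illustration; it mirrors the check described here and is not taken from any server's source code.

```perl
# Hypothetical helper: decide whether a server following the rule above
# would pass the query to the script on the command line (ISINDEX style).
sub uses_command_line {
    my ($query) = @_;
    # Assumption: with no query at all there is nothing to pass.
    return 0 if !defined($query) || $query eq "";
    # An encoded "=" appears as %3D, so any literal "=" is non-encoded.
    return ($query =~ /=/) ? 0 : 1;
}

print uses_command_line("cgi+tutorial") ? "command line\n" : "no command line\n";
print uses_command_line("name=value")   ? "command line\n" : "no command line\n";
```

When the command line is used, the script finds the search words in @ARGV instead of having to parse QUERY_STRING itself.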

Environment Variables

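The following environment variables are set for all requests: SERVER_SOFTWARE, SERVER_NAME, and GATEWAY_INTERFACE.

The following environment variables are specific to the request being fulfilled by the gateway program: SERVER_PROTOCOL, SERVER_PORT, REQUEST_METHOD, PATH_INFO, PATH_TRANSLATED, SCRIPT_NAME, QUERY_STRING, REMOTE_HOST, REMOTE_ADDR, AUTH_TYPE, REMOTE_USER, REMOTE_IDENT, CONTENT_TYPE, and CONTENT_LENGTH.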

The three most important environment variables are:

In addition to these, the header lines received from the client, if any, are placed into the environment with the prefix HTTP_ followed by the header name. Any - characters in the header name are changed to _ characters. The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length. If necessary, the server may choose to exclude any or all of these headers if including them would exceed any system environment limits.

An example of this is the HTTP_ACCEPT variable which was defined in CGI/1.0. Another example is the header User-Agent.
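The naming rule for request headers can be sketched as follows. The helper name header_to_env is hypothetical; the upper-casing reflects the form of the names HTTP_ACCEPT and HTTP_USER_AGENT.

```perl
# Hypothetical helper: map a request header name to the environment
# variable name described above (HTTP_ prefix, "-" changed to "_").
sub header_to_env {
    my ($header) = @_;
    my $name = uc($header);      # the variable names are upper case
    $name =~ tr/-/_/;
    return "HTTP_" . $name;
}

# Inside a CGI script the value is then read from %ENV.
my $var   = header_to_env("User-Agent");
my $agent = defined($ENV{$var}) ? $ENV{$var} : "(not set)";
print "$var = $agent\n";
```

Run outside a web server, the variable is normally not set, which is also what a script sees when the server chose to exclude the header.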

Please read the NCSA's CGI web pages for more information.

Standard Input

For requests which have information attached after the header, such as HTTP POST or PUT, the information will be sent to the script on stdin. The server will send CONTENT_LENGTH bytes on this file descriptor. Remember that it will give the CONTENT_TYPE of the data as well. The server is in no way obligated to send end-of-file after the script reads CONTENT_LENGTH bytes.
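Because the server need not send end-of-file, a script should read exactly CONTENT_LENGTH bytes and then stop. Here is a minimal sketch; the helper name read_body is hypothetical, and an in-memory filehandle stands in for the server's stdin so the example can be run by itself.

```perl
# Hypothetical helper: read exactly $length bytes of request body from the
# given filehandle, without waiting for end-of-file.
sub read_body {
    my ($fh, $length) = @_;
    my $body = "";
    my $remaining = $length;
    while ($remaining > 0) {
        # Append to $body at its current end; read() may return fewer
        # bytes than requested, so keep going until all have arrived.
        my $got = read($fh, $body, $remaining, length($body));
        last if !$got;           # error or premature end of input
        $remaining -= $got;
    }
    return $body;
}

# In a real script: $data = read_body(\*STDIN, $ENV{'CONTENT_LENGTH'});
my $input = "name=value&extra=ignored";
open(my $fh, "<", \$input) || die "Cannot open in-memory handle: $!\n";
print read_body($fh, 10), "\n";
close($fh);
```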

Standard Output

The CGI script sends its output to stdout. This output can either be a document generated by the script, or instructions to the server for retrieving the desired output.

The output of scripts begins with a small header. This header consists of text lines, in the same format as an HTTP header, terminated by a blank line (a line with only a linefeed or CR/LF).

Any headers which are not server directives are sent directly back to the client. Currently, this specification defines three server directives: Content-type, Location, and Status.
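Here is a sketch of script output that uses the Location and Status directives of CGI/1.1 instead of sending only a document. The helper names redirect_response and error_response are hypothetical.

```perl
# Hypothetical helper: output of a script that tells the server to
# retrieve another document instead of generating one itself.
sub redirect_response {
    my ($url) = @_;
    # "Location" is a server directive; the blank line ends the header.
    return "Location: $url\n\n";
}

# Hypothetical helper: a complete error response using the "Status"
# directive, followed by an ordinary HTML document.
sub error_response {
    my ($code, $reason) = @_;
    return "Status: $code $reason\n"
         . "Content-type: text/html\n\n"
         . "<H1>$code $reason</H1>\n";
}

print error_response(404, "Not Found");
```

In both cases the header must still end with a blank line, even when no document follows it.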


HTML Form

CGI scripts are commonly used to handle the output of HTML forms.

The FORM Tag

The FORM tag specifies a fill-out form in an HTML document. A single document may contain more than one fill-out form, but forms cannot be nested.

The INPUT Tag

The INPUT tag specifies an input element inside a FORM. Its type can be "text", "password", "checkbox", "radio", "submit", or "reset".

The SELECT Tag

The SELECT tag specifies a number of options the user can select. It is displayed as an option menu or as a scrolled list (Motif widgets under X).

The TEXTAREA Tag

The TEXTAREA tag specifies a multi-line text entry field with optional default contents.

Form Submission

GET Method

When a submit button is pressed, the content of the form is assembled into a query URL that looks like:

	action?name=value&name=value&name=value

("action" is the URL specified by the ACTION attribute of the FORM tag, or the current document URL if no ACTION attribute was specified.)

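Special characters in "name" or "value" instances are escaped: a space is sent as + and other special characters are sent as % followed by two hexadecimal digits (for example, & becomes %26).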
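To make the query URL format concrete, here is a sketch that builds one the way a browser would. The helper names url_encode and build_query are hypothetical, and the set of characters left unescaped is an assumption; real browsers differ slightly in which characters they escape.

```perl
# Hypothetical helper: escape one name or value.
sub url_encode {
    my ($text) = @_;
    # Escape everything except letters, digits and a few safe characters.
    $text =~ s/([^A-Za-z0-9_.~ -])/sprintf("%%%02X", ord($1))/ge;
    $text =~ tr/ /+/;            # spaces are sent as "+"
    return $text;
}

# Hypothetical helper: assemble action?name=value&name=value...
sub build_query {
    my ($action, @pairs) = @_;   # @pairs is name, value, name, value, ...
    my @parts;
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        push(@parts, url_encode($name) . "=" . url_encode($value));
    }
    return $action . "?" . join("&", @parts);
}

print build_query("webmail.cgi", "Name", "L. Chen", "Subject", "Q&A"), "\n";
```

Escaping & and = inside values is what lets the receiving script split the string on those two characters safely.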

POST Method

The contents of the form are encoded the same way as with the GET method. However, instead of appending the query string to the URL specified by the ACTION attribute, the query string is sent to the server in a data block. The CGI script can then read the query string from standard input.
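Since both methods use the same encoding, one parser can serve both. Here is a sketch, assuming the script picks the source of the data from REQUEST_METHOD; the helper names url_decode, raw_form_data and parse_form are hypothetical.

```perl
# Hypothetical helper: decode one name or value.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical helper: fetch the raw form data for either method.
sub raw_form_data {
    my $method = $ENV{'REQUEST_METHOD'} || "GET";
    if ($method eq "POST") {
        my $data = "";
        read(STDIN, $data, $ENV{'CONTENT_LENGTH'} || 0);
        return $data;
    }
    return defined($ENV{'QUERY_STRING'}) ? $ENV{'QUERY_STRING'} : "";
}

# Split "name=value&name=value" into a hash of decoded fields.
sub parse_form {
    my ($raw) = @_;
    my %fields;
    foreach my $pair (split(/&/, $raw)) {
        my ($name, $value) = split(/=/, $pair, 2);
        $value = "" if !defined($value);
        $fields{url_decode($name)} = url_decode($value);
    }
    return %fields;
}

# In a real script: %form = parse_form(raw_form_data());
my %form = parse_form("Name=L.+Chen&Subject=Q%26A");
print "$form{'Name'} / $form{'Subject'}\n";
```

Note that the string is split on & and = before anything is decoded; decoding first would make an escaped & inside a value look like a field separator.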

Please read NCSA's fill-out form web pages for more details.


Perl

Practical Extraction and Report Language (Perl) is a programming language that encapsulates the best features of the shell, sed, grep, awk, tr, C and Cobol.

Please read Dr. Plank's Perl lecture notes if you are interested in learning it.

Suppose you want to know who is viewing your web page. You can read the web server log file in vi, but that is tedious. I wrote several Perl scripts to parse the log file and generate statistics automatically. A line in the log file may look like this:

escher.dmsa.unipd.it - - [27/Aug/1997:19:57:51 -0400] "GET /%7Elchen/stats/today.html HTTP/1.0" 200 5947 user_agent="Wget/1.4.3" referer="http://www.cs.utk.edu:80/%7Elchen/stats/" cookie="-"

It tells me that somebody visited my statistics page from host escher.dmsa.unipd.it in Italy. Suppose I now want to count accesses to my web pages: how many times each host accesses them and which specific pages it looks at. Here is a segment of the script written in Perl:
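  while ($line = <fin>) {
    # Fields are separated by single spaces: the host is field 0, the
    # timestamp is field 3 and the requested page is field 6.
    @fields = split(/ /, $line);
    $host = $fields[0];
    $page = $fields[6];

    # Only count my own pages; some clients send "~" as "%7E".
    $found = ($page =~ /~lchen/);
    if ($found == 0) {
      $found = ($page =~ /%7Elchen/);
      if ($found == 1) {
        $page =~ s/%7E/~/g;
      }
    }

    # Skip hosts listed in @ignoreIPs and entries that are not from today.
    $ignore = grep(/$host/, @ignoreIPs);
    if (($found == 1) && ($ignore == 0)) {
      $isToday = $fields[3] =~ /$date/;
      if ($isToday == 1) {
        $hostStat{$host} ++;

        # The top-level domain is whatever follows the last dot.
        $domain = $host;
        $domain =~ s/.*\.//;
        $domain =~ tr/A-Z/a-z/;
        if ($domain + 0 != 0) {
          # A numeric last component means the host is a bare IP address.
          $domainStat{"Unknown"} ++;
          $countryStat{"Unknown"} ++;
        } elsif (length($domain) == 3) {
          # Three-letter domains (com, edu, ...) are counted as US.
          $domainStat{$countryCodes{$domain}} ++;
          $countryStat{$countryCodes{"us"}} ++;
        } else {
          # Anything else is treated as a country code.
          $domainStat{"Other"} ++;
          $isUS = $domain =~ /^us$/;
          if ($isUS == 1) {
            $countryStat{$countryCodes{"us"}} ++;
          } else {
            $country = $countryCodes{$domain};
            if (length($country) == 0) {
              $countryStat{"Unknown"} ++;
            } else {
              $countryStat{$country} ++;
            }
          }
        }

        # Strip the /~lchen prefix and a doubled slash before counting.
        $page =~ s/^\/~lchen/\//;
        $page =~ s/\/\//\//;
        $pageStat{$page} ++;
      }
    }
  }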

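Basically, what I have done above is: split each log line into fields, keep only the requests for my own pages from hosts that are not on the ignore list, and, for today's entries, count the hits per host, per top-level domain, per country, and per page.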

You can see that you can do very powerful things quickly. If you wrote the above program in C, it would probably be several times larger. Click here for a real statistics page generated by my Perl scripts.
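As a variation on the field splitting above, the same three pieces of information can be pulled out of a log line with a single regular expression. This is only a sketch against the sample line shown earlier; the helper name parse_log_line is hypothetical and other log formats would need a different pattern.

```perl
# Hypothetical helper: pull host, date and requested page out of one
# log line in the format shown above.
sub parse_log_line {
    my ($line) = @_;
    if ($line =~ m{^(\S+) \S+ \S+ \[(\d+/\w+/\d+):[^\]]*\] "\S+ (\S+)}) {
        my ($host, $date, $page) = ($1, $2, $3);
        $page =~ s/%7E/~/gi;     # some clients send "~" as %7E
        return ($host, $date, $page);
    }
    return ();                   # line did not match the expected format
}

my $sample = 'escher.dmsa.unipd.it - - [27/Aug/1997:19:57:51 -0400] '
           . '"GET /%7Elchen/stats/today.html HTTP/1.0" 200 5947';
my ($host, $date, $page) = parse_log_line($sample);
print "$host requested $page on $date\n";
```

Unlike split(/ /, ...), the pattern does not depend on counting fields, so a malformed line is simply skipped instead of being miscounted.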


Examples

Basically, what your CGI program should do is:
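A typical flow is: gather the input, do the work, then print a header, a blank line, and the document. Here is a minimal skeleton of that flow; the name build_page is hypothetical and the "work" is just echoing the query back.

```perl
# Hypothetical helper: turn the input into a complete CGI response.
sub build_page {
    my ($query) = @_;
    # Escape HTML special characters before echoing user input.
    $query =~ s/&/&amp;/g;
    $query =~ s/</&lt;/g;
    $query =~ s/>/&gt;/g;
    return "Content-type: text/html\n\n"      # header, then a blank line
         . "<HTML>\n<BODY>\n"
         . "<P>Query received: $query\n"
         . "</BODY>\n</HTML>\n";
}

# Step 1: gather the input (here only the GET query string).
my $query = defined($ENV{'QUERY_STRING'}) ? $ENV{'QUERY_STRING'} : "";
# Steps 2 and 3: process it and print the response to stdout.
print build_page($query);
```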

Example 1: Your First CGI Program

Here is a very simple non-interactive CGI program which prints the string "Hello, World Wide Web" to your web browser:

#!/usr/local/bin/perl

printf("Content-type: text/html\n\n");
printf("<H1>Hello, World Wide Web</H1>");

The first printf statement tells the web server that the CGI script will send an HTML document to the server. The second printf statement prints the contents of the HTML document. Click here to test it.

Example 2: Web Mail

Here is an e-mail sender based on an HTML form.

The HTML document looks like:

<P>
You can use the following form to send e-mail to anybody. Fill in all the
text fields. Use spaces to separate multiple email addresses in the
<B>To</B> and <B>Cc</B> fields. After you finish, press the <B>"Send"</B>
button to send the message. Press the <B>"Clear"</B> button to clear all fields.

<P>
If you want the receiver to reply to a different email address, fill in the
<B>Reply-To</B> field. Otherwise, replies will be sent to the address
specified in the <B>Your Email</B> field.

<P>
If you leave both <B>To</B> and <B>Cc</B> fields empty, the message will be
sent to <I>lchen@cs.utk.edu</I>.

<P>
<FORM ACTION="http://www.cs.utk.edu/~cs460.d&im/cgi-bin/webmail.cgi" METHOD="GET">
Your Name: <INPUT TYPE="text" NAME="Name" SIZE=60 MAXLENGTH=180><BR>
Your Email: <INPUT TYPE="text" NAME="Email" SIZE=60 MAXLENGTH=180><BR>
To: <INPUT TYPE="text" NAME="To" SIZE=60 MAXLENGTH=180><BR>
Cc: <INPUT TYPE="text" NAME="Cc" SIZE=60 MAXLENGTH=180><BR>
Subject: <INPUT TYPE="text" NAME="Subject" SIZE=60 MAXLENGTH=180><BR>
Reply-To: <INPUT TYPE="text" NAME="Reply-To" SIZE=60 MAXLENGTH=180><BR>
<P>
Message:<BR>
<TEXTAREA NAME="Message" ROWS=10 COLS=80>
</TEXTAREA><BR>
<INPUT TYPE="submit" NAME="send" VALUE="Send">
<INPUT TYPE="reset" NAME="clear" VALUE="Clear">
</FORM>

Here is the CGI script:

#!/usr/local/bin/perl5

$queryString = $ENV{'QUERY_STRING'};

@queries = split(/&/, $queryString);

for ($i = 0; $i < @queries; $i ++) {
  @pairs = split(/=/, $queries[$i]);
  $value = $pairs[1];
  &decode(*value);
  $fields{$pairs[0]} = $value;
}

if (length($fields{"Name"}) == 0) {
  $fields{"Name"} = "Anonymous";
}

if (length($fields{"Email"}) == 0) {
  $fields{"Email"} = "nobody\@cs.utk.edu";
}

if (length($fields{"To"}) == 0) {
  if (length($fields{"Cc"}) == 0) {
    $fields{"To"} = "lchen\@cs.utk.edu";
  }
}

$date = &getDate();

printf("Content-type: text/html\n\n");
printf("<HTML>\n");
printf("<HEAD>\n<TITLE>Mail Sent</TITLE>\n</HEAD>\n");
printf("<BODY>\n<H1>Mail Sent</H1>\n");
printf("<P>\n%s,\n<P>\nYour message has been sent to %s %s.\n",
       $fields{"Name"}, $fields{"To"}, $fields{"Cc"});
printf("</BODY>\n</HTML>\n");

$message = "From: " . $fields{"Name"} . " " . "<" . $fields{"Email"} . ">\n";
$message .= "To: " . $fields{"To"} . "\n";
$message .= "Date: $date\n";
$message .= "Cc: " . $fields{"Cc"} . "\n";
$message .= "Subject: " . $fields{"Subject"} . "\n";

if (length($fields{"Reply-To"}) > 0) {
  $message .= "Reply-To: " . $fields{"Reply-To"} . "\n";
}

$message .= "X-Mailer: WebMail [Version 1.0]\n\n";
$message .= $fields{"Message"};

# Pass the recipients to the mailer as separate arguments and write the
# message to its standard input, so that no form data is ever interpreted
# by a shell. Anything that looks like a command-line option is dropped.
@recipients = grep(!/^-/, split(' ', $fields{"To"} . " " . $fields{"Cc"}));

open(MAIL, "|-", "/bin/rmail", @recipients) || die "Cannot run rmail: $!\n";
print MAIL $message, "\n";
close(MAIL);

sub decode
{
  local(*string) = $_[0];

  # "+" stands for a space, %0D%0A for a line break, and any other %XX
  # for the character whose hexadecimal code is XX.
  $string =~ s/\+/ /g;
  $string =~ s/%0D%0A/\n/gi;
  $string =~ s/%([0-9A-Fa-f][0-9A-Fa-f])/pack("C", hex($1))/ge;
}

sub getDate
{
  local($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) =
    localtime(time());
  local($date);

  $date = &StringWday($wday);
  # localtime() returns the year as an offset from 1900.
  $date .= ", " . $mday . " " . &StringMonth($mon) . " " . ($year + 1900) . " ";

  if ($hour < 10) {
    $date .= "0" . $hour . ":";
  } else {
    $date .= $hour . ":";
  }

  if ($min < 10) {
    $date .= "0" . $min . ":";
  } else {
    $date .= $min. ":";
  }

  if ($sec < 10) {
    $date .= "0" . $sec;
  } else {
    $date .= $sec;
  }

  return($date);
}


sub StringWday
{
  $wday = $_[0];

  local($string);

  if ($wday == 0) {
    $string = "Sun";
  } elsif ($wday == 1) {
    $string = "Mon";
  } elsif ($wday == 2) {
    $string = "Tue";
  } elsif ($wday == 3) {
    $string = "Wed";
  } elsif ($wday == 4) {
    $string = "Thu";
  } elsif ($wday == 5) {
    $string = "Fri";
  } elsif ($wday == 6) {
    $string = "Sat";
  }

  return($string);
} 

sub StringMonth
{
  $mon = $_[0];

  local($string);

  if ($mon == 0) {
    $string .= "Jan";
  } elsif ($mon == 1) {
    $string .= "Feb";
  } elsif ($mon == 2) {
    $string .= "Mar";
  } elsif ($mon == 3) {
    $string .= "Apr";
  } elsif ($mon == 4) {
    $string .= "May";
  } elsif ($mon == 5) {
    $string .= "Jun";
  } elsif ($mon == 6) {
    $string .= "Jul";
  } elsif ($mon == 7) {
    $string .= "Aug";
  } elsif ($mon == 8) {
    $string .= "Sep";
  } elsif ($mon == 9) {
    $string .= "Oct";
  } elsif ($mon == 10) {
    $string .= "Nov";
  } elsif ($mon == 11) {
    $string .= "Dec";
  }

  return($string);
}

The form looks like this:

You can use the following form to send e-mail to anybody. Fill in all the text fields. Use spaces to separate multiple email addresses in the To and Cc fields. After you finish, press the "Send" button to send the message. Press the "Clear" button to clear all fields.

If you want the receiver to reply to a different email address, fill in the Reply-To field. Otherwise, replies will be sent to the address specified in the Your Email field.

If you leave both To and Cc fields empty, the message will be sent to lchen@cs.utk.edu.

Your Name:
Your Email:
To:
Cc:
Subject:
Reply-To:

Message:

Example 3: Pizza Order Form (Modified version from NCSA)

The HTML document looks like:

<FORM METHOD="POST" ACTION="http://hoohoo.ncsa.uiuc.edu/cgi-bin/post-query"> 
 
<H3 ALIGN=LEFT><FONT FACE="arial,helvetica">
Pizza Internet Delivery Service,
</FONT></H3> 
 
<P>
Your street address: <INPUT NAME="address">
 
<P>
Your phone number: <INPUT NAME="phone">
 
<P>
Which toppings would you like?
 
<P>
<OL> 
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="pepperoni"> 
     Pepperoni. 
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="sausage"> Sausage. 
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="anchovies"> 
     Anchovies. 
</OL> 
 
<P>
How would you like to pay?  Choose any one of the following:

<P>
<OL> 
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="cash" CHECKED> Cash. 
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="check"> Check. 
<LI> <I>Credit card:</I> 
<UL> 
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="mastercard"> Mastercard. 
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="visa"> Visa. 
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="americanexpress"> 
     American Express. 
</UL> 
</OL> 
 
<P>
Would you like the driver to call before leaving the store?
 
<P>
<DL> 
<DD> <INPUT TYPE="radio" NAME="callfirst" VALUE="yes" CHECKED> <I>Yes.</I> 
<DD> <INPUT TYPE="radio" NAME="callfirst" VALUE="no"> <I>No.</I> 
</DL> 
 
<P>
To order your pizza, press this button: <INPUT TYPE="submit" 
VALUE="Order Pizza">.
 
</FORM> 

The form looks like this:

Internet Pizza Delivery Service

Your street address:

Your phone number:

Which toppings would you like?

  1. Pepperoni.
  2. Sausage.
  3. Anchovies.

How would you like to pay? Choose any one of the following:

  1. Cash.
  2. Check.
  3. Credit card:
    • Mastercard.
    • Visa.
    • American Express.

Would you like the driver to call before leaving the store?

Yes.
No.

To order your pizza, press this button: [Order Pizza].
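This form can send the same name more than once, because several "topping" boxes may be checked. A script that stores each field in a plain hash, as in Example 2, would keep only the last topping. Here is a sketch of one way to keep them all; the helper name parse_multi and the sample data are hypothetical.

```perl
# Hypothetical helper: collect form fields so that repeated names, such
# as several checked "topping" boxes, keep all of their values.
sub parse_multi {
    my ($raw) = @_;
    my %fields;
    foreach my $pair (split(/&/, $raw)) {
        my ($name, $value) = split(/=/, $pair, 2);
        $value = "" if !defined($value);
        # Decode name and value in place ($_ aliases each in turn).
        foreach ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }
        push(@{ $fields{$name} }, $value);   # one list per field name
    }
    return %fields;
}

my %order = parse_multi("address=1+Main+St&topping=pepperoni"
                      . "&topping=sausage&paymethod=cash");
print "Toppings: ", join(", ", @{ $order{"topping"} }), "\n";
print "Payment:  ", $order{"paymethod"}[0], "\n";
```

Radio buttons such as "paymethod" send at most one value, so their list has a single element.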


Copyright © 1997 Luojian Chen / lchen@cs.utk.edu
Last updated: Fri Aug 29 15:49:09 1997