The Common Gateway Interface (CGI) is a standard for interfacing external applications with information servers, such as HTTP or Web servers. A plain HTML document is static. A CGI program, however, is executed in real time and can output dynamic information.
Because a CGI program is executed on your system, there are some security precautions that need to be implemented. Please read the CS department's CGI security page very carefully before you start.
A CGI program can be written in any language that allows it to be executed on the system, such as C/C++, Perl, Tcl, or any Unix shell.
The HTTP server communicates with CGI scripts in four major ways: the command line, environment variables, standard input, and standard output.
The command line is used only in the case of an ISINDEX query. The server
searches the query information (the QUERY_STRING environment
variable) for a non-encoded = character to determine whether or not to
use the command line. If it finds one, the command line is not used.
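The rule above can be sketched in a few lines of Perl. This is only an illustration of the decision, not how any particular server implements it; the name uses_command_line is hypothetical.

```perl
#!/usr/local/bin/perl
# Sketch of the ISINDEX rule: the command line is used only when the
# query string contains no unencoded "=" character.
# uses_command_line is a hypothetical name chosen for this example.
sub uses_command_line
{
    my ($queryString) = @_;
    # No query information at all means there is nothing to decode.
    if (!defined($queryString) || length($queryString) == 0) {
        return 0;
    }
    # An encoded "=" appears as %3D, so it does not match here.
    if ($queryString =~ /=/) {
        return 0;
    }
    return 1;
}

foreach my $query ("perl+cgi", "name=value") {
    my $answer = uses_command_line($query) ? "yes" : "no";
    print "$query: command line used? $answer\n";
}
```

The first query looks like an ISINDEX search, so the sketch answers yes; the second looks like form data, so it answers no.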
The following environment variables are set for all requests:
SERVER_SOFTWARE
SERVER_NAME
GATEWAY_INTERFACE
The following environment variables are specific to the request being fulfilled by the gateway program:
SERVER_PROTOCOL
SERVER_PORT
REQUEST_METHOD
PATH_INFO
PATH_TRANSLATED
SCRIPT_NAME
QUERY_STRING
REMOTE_HOST
REMOTE_ADDR
AUTH_TYPE
REMOTE_USER
REMOTE_IDENT
CONTENT_TYPE
CONTENT_LENGTH
The three most important environment variables are:
QUERY_STRING
The information which follows the ? in the URL which referenced this script. This is the query information. It should not be decoded in any fashion. This variable should always be set when there is query information, regardless of command line decoding.
CONTENT_TYPE
For queries which have attached information, such as HTTP POST and PUT, this is the content type of the data.
CONTENT_LENGTH
The length of that content, in bytes, as given by the client.
In addition to these, the header lines received from the client, if any, are placed into the environment with the prefix HTTP_ followed by the header name. Any - characters in the header name are changed to _ characters. The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length. If necessary, the server may choose to exclude any or all of these headers if including them would exceed any system environment limits.
An example of this is the HTTP_ACCEPT variable which was defined in CGI/1.0. Another example is the header User-Agent.
HTTP_ACCEPT
HTTP_USER_AGENT
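As a small illustration of the naming rule, the following Perl sketch maps a header name to the environment variable a script would look for. The function name header_to_env is hypothetical, and the upper-casing is inferred from the two examples above.

```perl
#!/usr/local/bin/perl
# Map a client header name to its CGI environment variable name:
# upper-case it, change "-" to "_", and add the HTTP_ prefix.
# header_to_env is a hypothetical helper name.
sub header_to_env
{
    my ($header) = @_;
    my $name = uc($header);
    $name =~ tr/-/_/;
    return "HTTP_" . $name;
}

foreach my $header ("Accept", "User-Agent") {
    my $variable = header_to_env($header);
    # Inside a real CGI script the value would be $ENV{$variable}, if the
    # server chose to pass that header along.
    print "$header is passed as $variable\n";
}
```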
Please read the NCSA's CGI web pages for more information.
For requests which have information attached after the header, such as HTTP POST or PUT, the information will be sent to the script on stdin. The server will send CONTENT_LENGTH bytes on this file descriptor. Remember that it will give the CONTENT_TYPE of the data as well. The server is in no way obligated to send end-of-file after the script reads CONTENT_LENGTH bytes.
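Because the server may never send end-of-file, a script should read exactly CONTENT_LENGTH bytes and then stop. A minimal Perl sketch follows; read_body is a hypothetical helper name, and an in-memory filehandle stands in for stdin so the example can run outside a web server.

```perl
#!/usr/local/bin/perl
# Read exactly $length bytes from a filehandle, without waiting for
# end-of-file. read_body is a hypothetical helper name.
sub read_body
{
    my ($fh, $length) = @_;
    my $body = "";
    while (length($body) < $length) {
        # Append to $body, starting at its current end.
        my $got = read($fh, $body, $length - length($body), length($body));
        last unless $got;    # 0 at end-of-file, undef on error
    }
    return $body;
}

# Demonstration: only the first 10 bytes are consumed.
my $data = "name=value&extra";
open(my $in, "<", \$data) || die("cannot open in-memory file: $!");
print read_body($in, 10), "\n";
```

In a real CGI script the call would be read_body(\*STDIN, $ENV{'CONTENT_LENGTH'}).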
The CGI script sends its output to stdout. This output can either be a document generated by the script, or instructions to the server for retrieving the desired output.
The output of scripts begins with a small header. This header consists of text lines, in the same format as an HTTP header, terminated by a blank line (a line with only a linefeed or CR/LF).
Any headers which are not server directives are sent directly back to the client. Currently, this specification defines three server directives:
Content-type
This is the MIME type of the document you are returning.
Location
This is used to specify to the server that you are returning a reference to a document rather than an actual document.
If the argument to this is a URL, the server will issue a redirect to the client.
If the argument to this is a virtual path, the server will retrieve the document specified as if the client had requested that document originally. ? directives will work in here, but # directives must be redirected back to the client.
Status
This is used to give the server an HTTP/1.0 status line to send to
the client. The format is nnn xxxxx, where nnn
is the 3-digit status code, and xxxxx is the reason string,
such as "Forbidden".
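The three directives can be illustrated with a short Perl sketch that builds the header a script would print before its document. The helper name cgi_header and the example URL are hypothetical.

```perl
#!/usr/local/bin/perl
# Build a CGI response header: one "Name: value" line per directive,
# followed by the blank line that ends the header.
# cgi_header is a hypothetical helper name.
sub cgi_header
{
    my (@pairs) = @_;
    my $header = "";
    while (@pairs) {
        my $name  = shift(@pairs);
        my $value = shift(@pairs);
        $header .= "$name: $value\n";
    }
    return $header . "\n";
}

# A document returned directly by the script.
print cgi_header("Content-type", "text/html");
# A refusal: the Status directive carries the code and the reason string.
print cgi_header("Status", "403 Forbidden", "Content-type", "text/plain");
# A redirect to another URL (a placeholder address).
print cgi_header("Location", "http://www.example.com/moved.html");
```

A real script would print only one of these headers, followed by the document body where one applies.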
CGI scripts are commonly used to handle the output of HTML forms.
The FORM tag specifies a fill-out form in an HTML document.
More than one fill-out form may be in a single document. But forms cannot
be nested.
The INPUT tag specifies an input element inside a
FORM. Its type can be "text", "password", "checkbox", "radio",
"submit", or "reset".
The SELECT tag specifies a number of options the user can
select. It is instantiated as Motif option menus and scrolled lists.
The TEXTAREA tag specifies a multi-line text entry field with
optional default contents.
With the GET method, when a submit button is pressed, the contents of the form are assembled into a query URL that looks like:
action?name=value&name=value&name=value
("action" is the URL specified by the ACTION
attribute of the FORM tag, or the current document URL if
no ACTION attribute was specified.)
Special characters in "name" or "value" instances are escaped.
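To make the escaping concrete, here is a rough Perl sketch of what a browser does when it assembles the query URL. The names encode and build_query are hypothetical, and the set of characters left unescaped is a simplifying assumption; real browsers also send line breaks as %0D%0A.

```perl
#!/usr/local/bin/perl
# Escape a form name or value: spaces become "+", and every character
# other than letters, digits, "_", "." and "-" becomes %XX.
# encode and build_query are hypothetical helper names.
sub encode
{
    my ($string) = @_;
    $string =~ s/([^A-Za-z0-9_.\- ])/sprintf("%%%02X", ord($1))/ge;
    $string =~ s/ /+/g;
    return $string;
}

# Join the escaped pairs as action?name=value&name=value...
sub build_query
{
    my ($action, @pairs) = @_;
    my @parts;
    while (@pairs) {
        my $name  = shift(@pairs);
        my $value = shift(@pairs);
        push(@parts, encode($name) . "=" . encode($value));
    }
    return $action . "?" . join("&", @parts);
}

print build_query("/cgi-bin/webmail.cgi", "Name", "L. Chen", "Subject", "Q&A"), "\n";
```

The & inside the Subject value is escaped as %26, so it cannot be mistaken for the & that separates two pairs.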
With the POST method, the contents of the form are encoded the same way as
with the GET method. However, instead of appending the query string to the URL specified
by the ACTION attribute, the query string is sent to the server
in a data block. The CGI script can read the query string from standard
input.
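Whichever method delivers it, the query string is decoded the same way. The Perl sketch below splits it into pairs and keeps every value of a repeated name, which matters for groups of checkboxes. The names url_decode and parse_form, and the field names in the sample data, are only examples.

```perl
#!/usr/local/bin/perl
# Undo the form encoding: "+" is a space, %XX is the byte with hex code XX.
# url_decode and parse_form are hypothetical helper names.
sub url_decode
{
    my ($string) = @_;
    $string =~ s/\+/ /g;
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $string;
}

# Return a hash that maps each field name to a list of its values.
sub parse_form
{
    my ($data) = @_;
    my %form;
    foreach my $pair (split(/&/, $data)) {
        my ($name, $value) = split(/=/, $pair, 2);
        next unless defined($name) && length($name);
        $value = "" unless defined($value);
        push(@{ $form{url_decode($name)} }, url_decode($value));
    }
    return %form;
}

my %form = parse_form("topping=pepperoni&topping=sausage&paymethod=cash");
print "toppings: ", join(", ", @{ $form{"topping"} }), "\n";
print "payment: ", $form{"paymethod"}[0], "\n";
```

For a POST request the argument would be the data block read from standard input; for a GET request it would be $ENV{'QUERY_STRING'}.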
Please read NCSA's fill-out form web pages for more details.
Practical Extraction and Report Language (Perl) is a programming language that encapsulates the best features of the shell, sed, grep, awk, tr, C and Cobol.
Please read Dr. Plank's Perl lecture notes if you are interested in learning it. Suppose you want to know who is viewing your web page. You can vi the web server log file but that's tedious. I wrote several Perl scripts to parse the log file and generate statistics automatically. A line in the log file may look like this:
escher.dmsa.unipd.it - - [27/Aug/1997:19:57:51 -0400] "GET /%7Elchen/stats/today.html HTTP/1.0" 200 5947 user_agent="Wget/1.4.3" referer="http://www.cs.utk.edu:80/%7Elchen/stats/" cookie="-"
It tells me that somebody visited my statistics page from host
escher.dmsa.unipd.it in Italy.
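As a rough sketch, one regular expression can pull the interesting pieces out of such a line. The name parse_log_line and the hash keys are hypothetical, and the pattern assumes the log format shown in the sample line.

```perl
#!/usr/local/bin/perl
# Extract the host, date, request method, page, status code, and byte
# count from one log line. parse_log_line is a hypothetical helper name.
sub parse_log_line
{
    my ($line) = @_;
    if ($line =~ /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d+) (\S+)/) {
        return (host => $1, date => $2, method => $3,
                page => $4, status => $5, bytes => $6);
    }
    return ();    # the line did not match the expected format
}

my %entry = parse_log_line('escher.dmsa.unipd.it - - [27/Aug/1997:19:57:51 -0400] "GET /%7Elchen/stats/today.html HTTP/1.0" 200 5947 user_agent="Wget/1.4.3"');
print "$entry{host} requested $entry{page}\n";
```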
Suppose I now want to count accesses to my web pages: how many
times a person accesses the web pages, and what specific pages
they look at. Here is a segment of the script written in Perl:
while ($line = <fin>) {
    # Fields are separated by single spaces: [0] is the client host,
    # [3] is the date, and [6] is the requested page.
    @fields = split(/ /, $line);
    $host = $fields[0];
    $page = $fields[6];

    # Accept both spellings of my home directory: ~lchen and %7Elchen.
    $found = ($page =~ /~lchen/);
    if ($found == 0) {
        $found = ($page =~ /%7Elchen/);
        if ($found == 1) {
            $page =~ s/%7E/~/g;
        }
    }

    # \Q...\E keeps the dots in the host name from acting as wildcards.
    $ignore = grep(/\Q$host\E/, @ignoreIPs);
    if (($found == 1) && ($ignore == 0)) {
        $isToday = $fields[3] =~ /\Q$date\E/;
        if ($isToday == 1) {
            $hostStat{$host} ++;

            # The top-level domain is whatever follows the last dot.
            $domain = $host;
            $domain =~ s/.*\.//;
            $domain = lc($domain);
            if ($domain =~ /^\d+$/) {
                # A numeric "domain" means the host is a bare IP address.
                $domainStat{"Unknown"} ++;
                $countryStat{"Unknown"} ++;
            } elsif (length($domain) == 3) {
                # Three-letter domains (com, edu, ...) are counted as US.
                $domainStat{$countryCodes{$domain}} ++;
                $countryStat{$countryCodes{"us"}} ++;
            } else {
                $domainStat{"Other"} ++;
                $isUS = $domain =~ /^us$/;
                if ($isUS == 1) {
                    $countryStat{$countryCodes{"us"}} ++;
                } else {
                    $country = $countryCodes{$domain};
                    if (!defined($country) || length($country) == 0) {
                        $countryStat{"Unknown"} ++;
                    } else {
                        $countryStat{$country} ++;
                    }
                }
            }

            # Strip the /~lchen prefix and collapse a doubled slash.
            $page =~ s/^\/~lchen/\//;
            $page =~ s/\/\//\//;
            $pageStat{$page} ++;
        }
    }
}
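Basically, what I have done above is: split each line of the log into fields; keep only the requests for my pages that do not come from hosts I want to ignore; and, for requests made today, count the accesses per host, per domain, per country, and per page.
You can see that you can do very powerful things quickly. If you write the above program in C, it may be several times larger. Click here for a real statistics page generated by my Perl scripts.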
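Basically, what your CGI program should do is: read its input from the environment variables and, for POST or PUT requests, from standard input; process that input; then print to standard output a header ending with a blank line, followed by the document itself.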
Here is a very simple non-interactive CGI program that prints the string "Hello, World Wide Web" to your web browser:
#!/usr/local/bin/perl
printf("Content-type: text/html\n\n");
printf("<H1>Hello, World Wide Web</H1>");
The first printf statement tells the web server that the CGI script will send an HTML document to the server. The second printf statement prints the contents of the HTML document. Click here to test it.
Here is an e-mail sender based on an HTML form.
The HTML document looks like:
<P>
You can use the following form to send e-mail to anybody. Fill in all the
text fields. Use space to separate different email addresses in the
<B>To</B> and <B>Cc</B> field. After you finish, press button
<B>"Send"</B> to send the message. Press button <B>"Clear"</B> to clear
all fields.
<P>
If you want the receiver to reply to a different email address, fill in the
<B>Reply-To</B> field. Otherwise, the reply will be sent to the address
specified in the <B>Your Email</B> field.
<P>
If you leave both <B>To</B> and <B>Cc</B> fields empty, the message will be
sent to <I>lchen@cs.utk.edu</I>.
<P>
<FORM ACTION="http://www.cs.utk.edu/~cs460.d&im/cgi-bin/webmail.cgi" METHOD="GET">
Your Name: <INPUT TYPE="text" NAME="Name" SIZE=60 MAXLENGTH=180><BR>
Your Email: <INPUT TYPE="text" NAME="Email" SIZE=60 MAXLENGTH=180><BR>
To: <INPUT TYPE="text" NAME="To" SIZE=60 MAXLENGTH=180><BR>
Cc: <INPUT TYPE="text" NAME="Cc" SIZE=60 MAXLENGTH=180><BR>
Subject: <INPUT TYPE="text" NAME="Subject" SIZE=60 MAXLENGTH=180><BR>
Reply-To: <INPUT TYPE="text" NAME="Reply-To" SIZE=60 MAXLENGTH=180><BR>
<P>
Message:<BR>
<TEXTAREA NAME="Message" ROWS=10 COLS=80> </TEXTAREA><BR>
<INPUT TYPE="submit" NAME="send" VALUE="Send">
<INPUT TYPE="reset" NAME="clear" VALUE="Clear">
</FORM>
Here is the CGI script:
#!/usr/local/bin/perl5

# Split the query string into name=value pairs and decode both halves.
my $queryString = defined($ENV{'QUERY_STRING'}) ? $ENV{'QUERY_STRING'} : "";
foreach my $query (split(/&/, $queryString)) {
    next if (length($query) == 0);
    my ($name, $value) = split(/=/, $query, 2);
    $value = "" unless defined($value);
    $fields{decode($name)} = decode($value);
}

# Header fields must stay on one line; otherwise a sender could smuggle
# extra header lines into the message.
foreach my $name ("Name", "Email", "To", "Cc", "Subject", "Reply-To") {
    $fields{$name} = "" unless defined($fields{$name});
    $fields{$name} =~ s/[\r\n]+/ /g;
}
$fields{"Message"} = "" unless defined($fields{"Message"});

if (length($fields{"Name"}) == 0) {
    $fields{"Name"} = "Anonymous";
}
if (length($fields{"Email"}) == 0) {
    $fields{"Email"} = "nobody\@cs.utk.edu";
}

# Keep only plausible addresses, because they are passed to rmail as
# command-line arguments. The pattern is deliberately strict.
my @to = grep(/^\w[\w.+-]*\@[\w.-]+$/, split(' ', $fields{"To"}));
my @cc = grep(/^\w[\w.+-]*\@[\w.-]+$/, split(' ', $fields{"Cc"}));
if (@to == 0 && @cc == 0) {
    @to = ("lchen\@cs.utk.edu");
}

my $sentDate = getDate();

printf("Content-type: text/html\n\n");
printf("<HTML>\n");
printf("<HEAD>\n<TITLE>Mail Sent</TITLE>\n</HEAD>\n");
printf("<BODY>\n<H1>Mail Sent</H1>\n");
printf("<P>\n%s,\n<P>\nYour message has been sent to %s %s.\n",
       escapeHTML($fields{"Name"}), join(" ", @to), join(" ", @cc));
printf("</BODY>\n</HTML>\n");

my $message = "From: " . $fields{"Name"} . " <" . $fields{"Email"} . ">\n";
$message .= "To: " . join(", ", @to) . "\n" if (@to);
$message .= "Date: $sentDate\n";
$message .= "Cc: " . join(", ", @cc) . "\n" if (@cc);
$message .= "Subject: " . $fields{"Subject"} . "\n";
if (length($fields{"Reply-To"}) > 0) {
    $message .= "Reply-To: " . $fields{"Reply-To"} . "\n";
}
$message .= "X-Mailer: WebMail [Version 1.0]\n\n";
$message .= $fields{"Message"};

# Pipe the message to rmail directly. The list form of open() bypasses the
# shell, so nothing the user typed is ever interpreted as a shell command.
open(MAIL, "|-", "/bin/rmail", @to, @cc) || die("Cannot run /bin/rmail: $!");
print MAIL $message;
close(MAIL);

# Decode a URL-encoded string: "+" is a space, and %XX is the character
# with hexadecimal code XX. CR/LF pairs become plain newlines.
sub decode
{
    my ($string) = @_;
    $string =~ s/\+/ /g;
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    $string =~ s/\r\n/\n/g;
    return($string);
}

# Escape the characters that are special in HTML before echoing user input.
sub escapeHTML
{
    my ($string) = @_;
    $string =~ s/&/&amp;/g;
    $string =~ s/</&lt;/g;
    $string =~ s/>/&gt;/g;
    return($string);
}
# Format the current local time as, for example, "Wed, 27 Aug 1997 19:57:51".
sub getDate
{
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = localtime(time());
    # localtime() returns the year minus 1900, so add 1900 instead of
    # prepending "19", which breaks from the year 2000 on.
    return(sprintf("%s, %d %s %d %02d:%02d:%02d",
                   StringWday($wday), $mday, StringMonth($mon),
                   $year + 1900, $hour, $min, $sec));
}
# Map localtime()'s weekday number (0 = Sunday) to its abbreviated name.
sub StringWday
{
    my ($wday) = @_;
    my @names = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat");
    return($names[$wday]);
}

# Map localtime()'s month number (0 = January) to its abbreviated name.
sub StringMonth
{
    my ($mon) = @_;
    my @names = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                 "Jul", "Aug", "Sep", "Oct", "Nov", "Dec");
    return($names[$mon]);
}
The form looks like this:
You can use the following form to send e-mail to anybody. Fill in all the text fields. Use space to separate different email addresses in the To and Cc field. After you finish, press button "Send" to send the message. Press button "Clear" to clear all fields.
If you want the receiver to reply to a different email address, fill in the Reply-To field. Otherwise, the reply will be sent to the address specified in the Your Email field.
If you leave both To and Cc fields empty, the message will be sent to lchen@cs.utk.edu.
The next example is a pizza order form that uses the POST method. The HTML document looks like:
<FORM METHOD="POST" ACTION="http://hoohoo.ncsa.uiuc.edu/cgi-bin/post-query">
<H3 ALIGN=LEFT><FONT FACE="arial,helvetica">
Pizza Internet Delivery Service,
</FONT></H3>
<P>
Your street address: <INPUT NAME="address">
<P>
Your phone number: <INPUT NAME="phone">
<P>
Which toppings would you like?
<P>
<OL>
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="pepperoni">
Pepperoni.
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="sausage"> Sausage.
<LI> <INPUT TYPE="checkbox" NAME="topping" VALUE="anchovies">
Anchovies.
</OL>
<P>
How would you like to pay? Choose any one of the following:
<P>
<OL>
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="cash" CHECKED> Cash.
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="check"> Check.
<LI> <I>Credit card:</I>
<UL>
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="mastercard"> Mastercard.
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="visa"> Visa.
<LI> <INPUT TYPE="radio" NAME="paymethod" VALUE="americanexpress">
American Express.
</UL>
</OL>
<P>
Would you like the driver to call before leaving the store?
<P>
<DL>
<DD> <INPUT TYPE="radio" NAME="callfirst" VALUE="yes" CHECKED> <I>Yes.</I>
<DD> <INPUT TYPE="radio" NAME="callfirst" VALUE="no"> <I>No.</I>
</DL>
<P>
To order your pizza, press this button: <INPUT TYPE="submit"
VALUE="Order Pizza">.
</FORM>
The form looks like this: