Perl Tutorial

Written by:
Ron Jurincie

(Edited by: M. Berry)

Note:  This tutorial is only intended to provide a very simplified
       introduction to writing scripts in perl.  Use the links provided 
       within this document, as well as the ones on the course homepage
       to further your understanding.

Brief History of Perl
Perl (Practical Extraction and Report Language) is a computer scripting
language developed by Larry Wall in 1987.  Originally, Wall created perl
because he wanted a language which combined many of the best features of
sed, awk, sh and other scripting languages.

Wall immediately made his new scripting language freely available, allowing
other programmers to use his language as they liked.  Soon, many
programmers began sharing individually developed features with other perl
users. Some of these features found their way into later versions of perl,
and the current version (Perl 5.0) is a complete rewrite of earlier
versions.
Perl Help
Perl's original 15 man pages bear little resemblance to today's extensive collection. In addition to the local man pages, you can also learn about perl from online tutorials, newsgroups, mailing lists, and FAQs. Our man pages can be accessed as follows:

    Unix> man perl

Perl's man pages are also available on-line.
Perl is an Interpreted Language
Perl is an interpreted language. Unlike compiled languages such as C, perl compiles its source code into a parse tree and executes it immediately, with no separate compile and link step. This means that the development cycle of perl scripts is much quicker than with compiled languages. However, the execution speed of perl scripts generally cannot compete with that of compiled object code. Because of its rapid development time, perl is often used as a prototyping language for large software projects. Perl allows programmers to develop simplified versions of projects, which can later be converted into faster-running compiled languages.
Perl's Strengths
- Perl uses sophisticated pattern-matching techniques to swiftly scan large amounts of textual (or binary) data.
- Perl can read and write to TCP/IP sockets.
- Perl is free and readily available.
- Perl help is easily available.
- Perl is an excellent CGI scripting language.
- Perl can be used to automate FTP file retrieval.
- Perl has specialized extensions for handling Oracle and other popular databases.
Perl is Simple
Perl is relatively easy to learn, especially for programmers familiar with C and C++. Much of perl's syntax is similar to that of C. Perl can make simple programs much shorter and easier to write. Compare the C code with the perl code for the ubiquitous Hello World program.

C code:

    #include <stdio.h>

    int main(void)
    {
        printf("Hello World\n");
        return 0;
    }

Perl code:

    print "Hello World\n";

See how easy simple perl scripts can be? Now let's look at some perl syntax.

Comments are preceded by the # character and continue until the end of the line. They can appear at the beginning of a line or after perl code.

    $J = 333;    # assigns the value 333 to the scalar variable J
    # this is a comment too

There are many ways to perform loops and if / else operations. Refer to perl's man pages for a complete list. Perl supports C's for loop functionality, as well as the while statement, along with a foreach structure. Here is an example of the foreach structure:
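A minimal foreach sketch, assuming a hypothetical array @colors (the array name and its contents are illustrative only and are not part of the original tutorial), might look like this. The my keyword and the use strict / use warnings lines are optional good practice rather than something the tutorial requires.

```perl
use strict;
use warnings;

# @colors is a hypothetical array used only for illustration.
my @colors = ("red", "green", "blue");
my $joined = "";

# foreach assigns each element of the list to $color in turn.
foreach my $color (@colors) {
    print "color: $color\n";
    $joined .= $color;      # build up a string from the elements
}

print "joined: $joined\n";
```

The loop body runs once per element, in array order, so no index variable is needed.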
Executing Perl Scripts
Perl programs can be executed by preceding the filename with perl. To execute our Hello World script, one simply enters:

    Unix> perl hello_world.perl

On our Unix system perl 5.0 is located at /usr/local/bin/perl5.
Perl's Command Line
Perl can be made to issue warnings about possible execution problems by using the -w switch before the filename on the command line. Example:

    Unix> perl -w hello_world.perl

Similarly, the -d switch activates the perl debugger, which allows step-by-step running of the program with breakpoints and value checking.
Perl's Variable Types
Perl's 3 basic variable types are scalars, arrays and hashes. Unlike C, perl variables do not need to be declared prior to their use.
Scalars
Scalar variables are perl's most primitive variable type. Scalars can hold numbers, strings of characters, or even strings of digits, and perl converts between these forms automatically as needed. Scalar variables are preceded by the $ character. Numbers are stored as either integers or double-precision floating-point numbers, as needed. Here are some number assignment examples:

    $a = 4;
    $count = 0;
    $val = 1.22345;

Strings are usually delimited by single or double quotes. Double-quoted strings are subject to backslash and variable interpolation, while single-quoted strings are not. The most common backslashed characters are \n for a newline and \t for a tab (look familiar?). Refer to perl's man pages for more info. Here are some string assignment examples:

    $a = "hello";
    $letter = 'a';
    $name = "Jones";
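The difference between the two quoting styles, and the way a scalar moves between number and string, can be sketched as follows. The variable names other than $name are assumptions made for this illustration.

```perl
use strict;
use warnings;

my $name  = "Jones";
my $count = "10";                 # a string of digits

my $double = "Hello, $name\n";    # $name and \n are both interpolated
my $single = 'Hello, $name\n';    # taken literally: no interpolation

my $sum = $count + 5;             # the string "10" is used as the number 10
my $str = $sum . " items";        # the number 15 is used as the string "15"

print $double;
print $single, "\n";
print "$str\n";
```

The . operator concatenates strings and + adds numbers; the operator, not the variable, decides how the scalar is treated.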
Comparisons
Perl has different operators for comparing strings and numerics.

    Function                         Strings   Numerics
    equal                            eq        ==
    not equal                        ne        !=
    less than                        lt        <
    greater than                     gt        >
    less than or equal to            le        <=
    greater than or equal to         ge        >=
    comparison with signed result    cmp       <=>
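Choosing the wrong family of operators is a common mistake, because the same two values can compare differently as strings and as numbers. A small sketch, with variable names chosen only for this illustration:

```perl
use strict;
use warnings;

# Each test is reduced to 1 (true) or 0 (false) so it is easy to print.
my $str_equal = ("abc" eq "abc") ? 1 : 0;   # string equality
my $num_equal = (10 == 10.0)     ? 1 : 0;   # numeric equality
my $str_less  = ("10" lt "9")    ? 1 : 0;   # as strings, "1" sorts before "9"
my $num_less  = (10 < 9)         ? 1 : 0;   # as numbers, 10 is not less than 9

# cmp and <=> return -1, 0, or 1 rather than true/false.
my $str_order = "apple" cmp "banana";       # "apple" sorts first
my $num_order = 10 <=> 9;                   # 10 is larger

print "$str_equal $num_equal $str_less $num_less $str_order $num_order\n";
```

The signed result of cmp and <=> is what makes them convenient inside a sort comparison.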
Arrays
Arrays are ordered groups of scalar values. Array names are always preceded by the @ character. Array values are written as comma-separated lists surrounded by parentheses. Here are some array assignment examples:

    @teachers = ("Vose", "Berry", "Vander Zanden");
    @letters = ("a","b","c","d","e","f","g","h","i","j","k","l","m",
                "n","o","p","q","r","s","t","u","v","w","x","y","z");
    @percentages = (33.5555, 55.0, 11, 23.44);

Individual array elements are accessed with the $ character and an integer index surrounded by brackets [ ]. As in the C programming language, array indices always begin at 0. To access the letter c in the array @letters you would refer to $letters[2]. To determine the number of elements in an array, assign the array to a scalar variable as seen below:

    $count = @letters;

The scalar $count receives the value 26, which is the number of elements in the array @letters.

Perl provides built-in push and pop functions which operate on arrays, treating them as stacks. Using our array examples above, observe these uses of push and pop:

    push(@teachers, "Dongarra");   # adds the string to the end of @teachers
                                   # Now @teachers = ("Vose", "Berry",
                                   #     "Vander Zanden", "Dongarra")
    $last = pop(@teachers);        # $last = "Dongarra"
                                   # Now @teachers = ("Vose", "Berry", "Vander Zanden")
Hashes
Hashes are unordered sets of key/value pairs. Hash names are always preceded by the % character. A key is paired with its value by either a comma , or an arrow => . Pairs are separated by commas only. The entire hash assignment is surrounded by parentheses. Here are some hash assignment examples:

    %grades = ('Mackey', 'A', 'Frost', 'B+', 'Jhonston', 'C', 'Toms', 'A');
    %jersey_numbers = (Ripken => 8, Hayes => 22, Ruth => 3);

One way to access a value from its key is shown below:

    $number = $jersey_numbers{"Ripken"};   # $number = 8
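Because hashes are unordered, code that needs a predictable order usually sorts the keys first. The sketch below uses the standard built-ins keys, sort, and exists, which this tutorial does not otherwise cover; the added "Gehrig" entry is a hypothetical value included only to show assignment to a new key.

```perl
use strict;
use warnings;

my %jersey_numbers = (Ripken => 8, Hayes => 22, Ruth => 3);

# Assigning to a new key adds a pair (hypothetical entry for illustration).
$jersey_numbers{"Gehrig"} = 4;

# keys returns the keys in no particular order, so sort them.
my @players = sort keys %jersey_numbers;

foreach my $player (@players) {
    print "$player wears number $jersey_numbers{$player}\n";
}

# exists tests for a key without creating it.
my $has_ruth  = exists $jersey_numbers{"Ruth"}  ? 1 : 0;
my $has_aaron = exists $jersey_numbers{"Aaron"} ? 1 : 0;
```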
I/O with Perl
Perl uses filehandles to control input and output. Perl has 3 built-in filehandles which are opened automatically for each program. These are STDIN, STDOUT, and STDERR. Additional filehandles are created by the open command.

    open(DATA, "myfile.text");     # opens myfile.text for reading
                                   # with the filehandle DATA
    open(OUT, ">myfile.text");     # opens myfile.text for writing
                                   # with the filehandle OUT
    open(OUT2, ">>myfile.text");   # opens myfile.text for appending
                                   # with the filehandle OUT2

open returns true if the file is successfully opened, and false on failure. To read a line from a file, surround the filehandle with angle brackets (the diamond operator):

    <DATA>

See perltest1. Now notice how perltest2 accomplishes the same results. In perltest2 note how $_ is used as a default variable. Perl provides the close command to deallocate filehandles as seen below:
close(OUT2); # closes filehandle OUT2
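The perltest1 and perltest2 files are not shown here. As a rough sketch of the same idea, the following script reads a file twice, first with an explicit variable and then with the default variable $_. The file name perl_io_demo.txt and the filehandle IN are assumptions made for this illustration, and the script creates and removes the file itself.

```perl
use strict;
use warnings;

my $file = "perl_io_demo.txt";    # hypothetical file name

# open returns false on failure, so check it and report the reason ($!).
open(OUT, ">$file") or die "cannot write $file: $!";
print OUT "first line\n";
print OUT "second line\n";
close(OUT);

# Version 1: read each line into an explicit variable.
my @explicit;
open(IN, $file) or die "cannot read $file: $!";
while (my $line = <IN>) {
    chomp($line);                 # strip the trailing newline
    push(@explicit, $line);
}
close(IN);

# Version 2: a bare <IN> in a while condition stores each line in $_.
my @implicit;
open(IN, $file) or die "cannot read $file: $!";
while (<IN>) {
    chomp;                        # with no argument, chomp works on $_
    push(@implicit, $_);
}
close(IN);

unlink($file);                    # remove the demo file
print "read ", scalar(@explicit), " lines both ways\n";
```

Both loops end when the read returns an undefined value at end of file.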
Text Processing with Perl
Perl provides a great many built-in text-processing functions. We will cover only some of the most popular ones. Once again, refer to perl's man pages for further info.
Pattern Matching
Perl uses forward slashes to delimit regular expressions for pattern matching and substitution. A string is tested against a pattern with the =~ operator, which evaluates to true or false.

    $a = "Mary had a little lamb";
    $a =~ /little/     # evaluates to true
    $a =~ /brittle/    # evaluates to false

Perl provides a set of modifying characters for string matching. Some of these are shown below:

    Modifier   Meaning
    i          matches characters regardless of case
    g          matches characters globally
    s          treats string as a single line

Perl uses a set of metacharacters to extend the functionality of pattern matching. Below is a table of commonly used metacharacters.

    Metacharacter   Meaning
    .               matches any single character except for \n
    ^               matches at the front of a line
    $               matches at the end of a line
    *               matches preceding character 0 or more times
    +               matches preceding character 1 or more times
    ?               matches preceding character 0 or 1 times
    [...]           matches any one of the class of characters

Perl also has a set of special characters preceded by a backslash, some of which are listed below.

    Special Character   Meaning
    \s                  any whitespace
    \S                  any non-whitespace
    \d                  any digit, i.e. [0-9]
    \w                  any word character, i.e. [a-zA-Z0-9_]
    \n                  newline
    \t                  tab
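A short sketch that exercises a few of these modifiers and metacharacters against one string; the variable names are chosen only for this illustration.

```perl
use strict;
use warnings;

my $text = "Mary had a little lamb";

# Each match is reduced to 1 (true) or 0 (false).
my $has_little  = ($text =~ /little/)  ? 1 : 0;   # plain substring match
my $has_brittle = ($text =~ /brittle/) ? 1 : 0;   # not present
my $any_case    = ($text =~ /MARY/i)   ? 1 : 0;   # i ignores case
my $at_front    = ($text =~ /^Mary/)   ? 1 : 0;   # ^ anchors at the front
my $at_end      = ($text =~ /lamb$/)   ? 1 : 0;   # $ anchors at the end
my $one_or_more = ($text =~ /lit+le/)  ? 1 : 0;   # t+ matches "tt"
my $any_char    = ($text =~ /l.mb/)    ? 1 : 0;   # . matches the "a"

# In list context, g returns every match; \w+ is a run of word characters.
my @words = ($text =~ /\w+/g);
my $word_count = @words;

print "words: $word_count\n";
```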
Substitution
Perl provides a simple way of searching for patterns and substituting new text in their place. This is accomplished by placing an s before a slash-delimited regular expression, followed by the replacement text and a closing slash. This is an extremely powerful text processing tool.

    s/string1/string2/i    # replaces the first instance of string1 with
                           # string2
                           # the /i forces a case-insensitive search

See the following files for examples:

    example1    reads from <STDIN> and writes each line in reverse order to <STDOUT>
    example1A   places lines from <STDIN> into a buffer and prints those lines in
                reverse order after reading an EOF <CONTROL d>
    example2    reads from <STDIN> and writes each line in reverse order to the
                file revline.txt
    example3    sorts an array and prints its sorted elements to STDOUT
    example4    reads a series of strings from <STDIN> and replaces each occurrence
                of the string "[tT][iI][mM][eE]" with the string "money"
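The example files are not shown here. As a sketch in the spirit of example4 (not a copy of it), the following compares a substitution of only the first match with a global one. The sample sentence and variable names are assumptions made for this illustration.

```perl
use strict;
use warnings;

my $first = "Time is money, and time flies";
my $all   = $first;               # work on a copy for the second test

# Without g, only the first match is replaced; i ignores case.
$first =~ s/time/money/i;

# With g, every match is replaced; s/// returns the number of replacements.
my $replaced = ($all =~ s/time/money/gi);

print "$first\n";
print "$all ($replaced replacements)\n";
```

Using the i modifier gives the same effect as spelling out the character classes [tT][iI][mM][eE].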