UNIX> sed [ -n ] [ -e script ] [ -f sfilename ] [ filenames ]
Usually you call sed in the following manner (omitting the -e):
UNIX> sed script filename
or you omit the filename and sed will work on standard input.
Scripts have the following syntax:
[ address [, address ] ] function [ arguments ]
The addresses define the lines on which sed will
act. The function defines what will get done to those
lines. On the other lines (those not included in the address
specification), sed either copies them to standard output
unmodified, or if sed was called with the -n option,
it does not copy them to standard output.
If no addresses are defined, then sed acts on every line.
Instead of using a line number, you may use a regular expression. See the grep lecture for a primer on regular expressions. If you specify just one address with a regular expression, then sed will act on all lines that match that regular expression. If you specify a range of two regular expressions, then sed will act on all lines from the first line matching the first regular expression to the next line matching the second regular expression. Then it skips until it finds another line matching the first regular expression, acts on all lines from that line until the next line matching the second regular expression, and so on. If it hits the end of the file before finding a match for the second regular expression, then it simply goes to the end of the file.
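The restart-and-run-to-end behavior of a regular expression range can be checked on a small input. This is only a sketch: the BEGIN and END markers and the single-letter lines are made up for illustration and are not one of the lecture's data files.

```shell
# Made-up input: two complete BEGIN..END blocks, then a BEGIN that
# is never closed before the end of the input.
input='a
BEGIN
b
END
c
BEGIN
d
END
e
BEGIN
f'

# Print only the lines that fall inside a /BEGIN/,/END/ range.
ranges=$(printf '%s\n' "$input" | sed -n '/BEGIN/,/END/p')
printf '%s\n' "$ranges"
```

The lines a, c and e fall between ranges and are skipped. The last BEGIN has no matching END, so that range runs to the end of the input and f is printed.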
Some of the regular expressions are summarized below:
^                    Beginning of the line
$                    End of the line
.                    Matches any single character
(character)*         Matches zero or more occurrences of (character)
(character)?         Matches 0 or 1 instance of (character). This is extended syntax; in sed's basic regular expressions a plain ? is an ordinary character
[^a..z]              Matches any character NOT enclosed in []
(character)\{m,n\}   Matches m to n repetitions of (character)
(character)\{m,\}    Matches m or more repetitions of (character)
(character)\{,n\}    Matches n or fewer (possibly 0) repetitions of (character)
(character)\{n\}     Matches exactly n repetitions of (character)
\(expression\)       Group operator
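Here is a small sketch of the interval and group operators in action. The input strings are made up, and the patterns are anchored with ^ and $ so that the repetition counts are easy to see.

```shell
# Made-up input, one string per line.
words='ab
aab
aaab
aaaab'

# \{2,3\}: a "b" preceded by two or three a's, and nothing else on the line.
two_or_three=$(printf '%s\n' "$words" | sed -n '/^a\{2,3\}b$/p')

# \{3,\}: three or more a's.
three_plus=$(printf '%s\n' "$words" | sed -n '/^a\{3,\}b$/p')

# \(...\) groups a sub-expression so that a repetition applies to the
# whole group rather than to a single character.
pairs=$(printf '%s\n' abab ab abb | sed -n '/^\(ab\)\{2\}$/p')

printf '%s\n' "$two_or_three" "$three_plus" "$pairs"
```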
UNIX> cat usopen
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
6 Olin Browne -- +2 -4 6
7 Jim Furyk -- +2 -4 6
8 Tommy Tolles -- +2 -4 6
9 Jay Haas -- +2 -4 6
10 Scott McCarron -- +3 -4 7
11 Scott Hoch -- +3 -4 7
12 David Ogrin -- +3 -4 7
13 Loren Roberts -- +4 -4 8
14 Stewart Cink -- +4 -4 8
15 Billy Andrade -- +4 -4 8
16 Bradley Hughes -- +5 -4 9
17 Jose Maria Olazabal -- +5 -4 9
18 Davis Love III -- +5 -4 9
19 Nick Price -- +6 -4 10
20 Lee Westwood -- +6 -4 10
UNIX> sed d usopen # delete all lines
UNIX> sed 6,19d usopen # delete lines 6 - 19
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
20 Lee Westwood -- +6 -4 10
UNIX> sed '2,$d' usopen # delete all but the first line
1 Ernie Els -- -4 -4 0
UNIX> sed '/+/d' usopen # delete all lines with +'s
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
UNIX> sed '/^.....3/,/^.....9/d' usopen # delete lines 3-9 and 13-19
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
10 Scott McCarron -- +3 -4 7
11 Scott Hoch -- +3 -4 7
12 David Ogrin -- +3 -4 7
20 Lee Westwood -- +6 -4 10
UNIX>
The p command explicitly prints out a line. This is most often
used in combination with the -n option. In this way, you can
have sed strip out all but the specified lines. Note that
UNIX> sed -n /RE/p
is equivalent to:
UNIX> grep RE
When used without the -n option, the p command duplicates a line, which is sometimes convenient. Some examples:
UNIX> sed -n 5p usopen # print just line 5
5 Bob Tway -- +2 -4 6
UNIX> sed -n 1,5p usopen # print lines 1-5
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
UNIX> sed -n '/J/p' usopen # print all lines with J's
4 Jeff Maggert -- +1 -4 5
7 Jim Furyk -- +2 -4 6
9 Jay Haas -- +2 -4 6
17 Jose Maria Olazabal -- +5 -4 9
UNIX> sed -n '/J/,/J/p' usopen # Can you figure it out?
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
6 Olin Browne -- +2 -4 6
7 Jim Furyk -- +2 -4 6
9 Jay Haas -- +2 -4 6
10 Scott McCarron -- +3 -4 7
11 Scott Hoch -- +3 -4 7
12 David Ogrin -- +3 -4 7
13 Loren Roberts -- +4 -4 8
14 Stewart Cink -- +4 -4 8
15 Billy Andrade -- +4 -4 8
16 Bradley Hughes -- +5 -4 9
17 Jose Maria Olazabal -- +5 -4 9
UNIX> head -5 usopen | sed p # duplicate lines 1-5
1 Ernie Els -- -4 -4 0
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
5 Bob Tway -- +2 -4 6
UNIX>
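The equivalence between sed -n /RE/p and grep RE can be checked directly. This is a quick sketch on three lines borrowed from usopen, held in a shell variable so that it does not depend on the file being present.

```shell
players='4 Jeff Maggert
5 Bob Tway
7 Jim Furyk'

# With -n, only the lines that p prints explicitly reach the output.
with_sed=$(printf '%s\n' "$players" | sed -n '/J/p')
with_grep=$(printf '%s\n' "$players" | grep 'J')

# Without -n, every line is printed once by default, and the matching
# lines are printed a second time by p.
doubled=$(printf '%s\n' "$players" | sed '/J/p')

printf '%s\n' "$with_sed"
printf '%s\n' "$doubled"
```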
s/pattern1/pattern2/command
In its simplest form, you give a regular expression for pattern1, a substitution string for pattern2, and no command. That means to substitute the substitution string when you encounter the regular expression in a line.
Examples:
UNIX> cat ussmall
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
UNIX> sed 's/--/XXXX/' ussmall
1 Ernie Els XXXX -4 -4 0
2 Colin Montgomerie XXXX -3 -4 1
3 Tom Lehman XXXX -2 -4 2
4 Jeff Maggert XXXX +1 -4 5
5 Bob Tway XXXX +2 -4 6
UNIX> sed 's/-/X/' ussmall
1 Ernie Els X- -4 -4 0
2 Colin Montgomerie X- -3 -4 1
3 Tom Lehman X- -2 -4 2
4 Jeff Maggert X- +1 -4 5
5 Bob Tway X- +2 -4 6
UNIX>
You'll note that it only performs the substitution on the first
match of pattern1 in each line. In other words, in the last
command, only the first dash is replaced with an X, and not
all of them. To replace all occurrences of pattern1,
use a command of g.
You can specify addresses on which to perform substitution. If pattern1 is not found on a line, then it performs no substitution.
Examples:
UNIX> sed 's/-/X/g' ussmall
1 Ernie Els XX X4 X4 0
2 Colin Montgomerie XX X3 X4 1
3 Tom Lehman XX X2 X4 2
4 Jeff Maggert XX +1 X4 5
5 Bob Tway XX +2 X4 6
UNIX> sed '/Lehman/s/--/XX/' ussmall
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman XX -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
UNIX> sed '1,/Maggert/s/Maggert/Sluman/' ussmall
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Sluman -- +1 -4 5
5 Bob Tway -- +2 -4 6
UNIX> sed 's/Ernie/Burt/' ussmall
1 Burt Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
UNIX> cat file
http://www.foo.com/mypage.html
UNIX> sed -e 's@http://www.foo.com@http://www.bar.net@' file
http://www.bar.net/mypage.html
Note that we used a different delimiter, @, for the substitution command.
Sed permits several delimiters for the s command, including @ % , ; and :.
These alternative delimiters are good for substitutions that include strings
such as filenames, as they make your sed code much more readable.
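To see the readability gain on a filename, here is a sketch that performs the same substitution twice, once with the slash delimiter and once with a comma. The path /usr/local/bin/tool is a made-up example.

```shell
path='/usr/local/bin/tool'

# With / as the delimiter, every slash in the pattern and in the
# replacement has to be escaped with a backslash.
escaped=$(printf '%s\n' "$path" | sed 's/\/usr\/local/\/opt/')

# With , as the delimiter, the same substitution needs no escaping.
readable=$(printf '%s\n' "$path" | sed 's,/usr/local,/opt,')

printf '%s\n%s\n' "$escaped" "$readable"
```

Both commands produce the same result; only the second one is easy to read.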
UNIX> sed 's/^ *\([0-9]\)/\1 \1/' ussmall
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 3 Tom Lehman -- -2 -4 2
4 4 Jeff Maggert -- +1 -4 5
5 5 Bob Tway -- +2 -4 6
UNIX> sed 's/^\(.*\) -- \(..\).*/\2 \1/' ussmall
-4 1 Ernie Els
-3 2 Colin Montgomerie
-2 3 Tom Lehman
+1 4 Jeff Maggert
+2 5 Bob Tway
UNIX> sed 's/\([A-Z][a-z]*\) /"\1 Baby" /' ussmall
1 "Ernie Baby" Els -- -4 -4 0
2 "Colin Baby" Montgomerie -- -3 -4 1
3 "Tom Baby" Lehman -- -2 -4 2
4 "Jeff Baby" Maggert -- +1 -4 5
5 "Bob Baby" Tway -- +2 -4 6
UNIX>
When performing the matching, sed finds the largest leftmost string that
matches the pattern. The precedence is leftmost first, then largest. Thus,
\(.*\) will match the entire string. \(.\) will match the
leftmost character. And so on. Sometimes this gives you surprises
when you use stars, because they can match nothing. However, you usually
figure out how to get things working pretty quickly.
UNIX> sed 's/\(.*\)/XXX \1 XXX/' ussmall
XXX 1 Ernie Els -- -4 -4 0 XXX
XXX 2 Colin Montgomerie -- -3 -4 1 XXX
XXX 3 Tom Lehman -- -2 -4 2 XXX
XXX 4 Jeff Maggert -- +1 -4 5 XXX
XXX 5 Bob Tway -- +2 -4 6 XXX
UNIX> sed 's/\(.\)/"\1"/' ussmall
" " 1 Ernie Els -- -4 -4 0
" " 2 Colin Montgomerie -- -3 -4 1
" " 3 Tom Lehman -- -2 -4 2
" " 4 Jeff Maggert -- +1 -4 5
" " 5 Bob Tway -- +2 -4 6
UNIX> sed 's/\([A-Za-z]*\)/"\1 Baby"/' ussmall
" Baby" 1 Ernie Els -- -4 -4 0
" Baby" 2 Colin Montgomerie -- -3 -4 1
" Baby" 3 Tom Lehman -- -2 -4 2
" Baby" 4 Jeff Maggert -- +1 -4 5
" Baby" 5 Bob Tway -- +2 -4 6
UNIX> sed 's/[^A-Z]*\([^ ]*\) \([^ ]*\) -- \(..\).*/\3: \2, \1/' ussmall
-4: Els, Ernie
-3: Montgomerie, Colin
-2: Lehman, Tom
+1: Maggert, Jeff
+2: Tway, Bob
UNIX> sed 's/[^A-Z]*\([A-Za-z ]*\).*/\1\1\1/' ussmall
Ernie Els Ernie Els Ernie Els
Colin Montgomerie Colin Montgomerie Colin Montgomerie
Tom Lehman Tom Lehman Tom Lehman
Jeff Maggert Jeff Maggert Jeff Maggert
Bob Tway Bob Tway Bob Tway
UNIX>
You can call sed with -n, and use the p string as a command to the substitution, and then sed will only print out lines which performed the substitution:
UNIX> sed -n 's/Maggert/Maggot/p' ussmall
4 Jeff Maggot -- +1 -4 5
UNIX> sed -n 's/-- -\(.\).*/: \1 under/p' ussmall
1 Ernie Els : 4 under
2 Colin Montgomerie : 3 under
3 Tom Lehman : 2 under
UNIX>
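As a further sketch of the p command on a substitution, the following contrasts running it with and without -n. The two input lines are copied from ussmall into a shell variable.

```shell
scores='1 Ernie Els -- -4 -4 0
4 Jeff Maggert -- +1 -4 5'

# With -n, only the line on which the substitution succeeded is printed.
over=$(printf '%s\n' "$scores" | sed -n 's/-- +\([0-9]\).*/: \1 over/p')

# Without -n, every line is printed by default, and the changed line
# is printed a second time because of the p.
both=$(printf '%s\n' "$scores" | sed 's/-- +\([0-9]\).*/: \1 over/p')

printf '%s\n' "$over"
printf '%s\n' "$both"
```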
The g command works with replacement substrings too. Again, beware
of the *:
UNIX> sed 's/\([A-Z][a-z]*\)/X \1/g' ussmall
1 X Ernie X Els -- -4 -4 0
2 X Colin X Montgomerie -- -3 -4 1
3 X Tom X Lehman -- -2 -4 2
4 X Jeff X Maggert -- +1 -4 5
5 X Bob X Tway -- +2 -4 6
UNIX> sed 's/\([A-Za-z]*\)/X \1/g' ussmall # can you figure out why???
Output line too long.
Output line too long.
Output line too long.
Output line too long.
X X X X X X X X X X X X X X X X X X X....
UNIX>
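One likely reading of the puzzle above is that [A-Za-z]* also matches the empty string, so with g the substitution fires at positions where there is no letter at all. The sketch below shows this on a single shortened line. The exact placement of the extra X's can differ between sed implementations, so no output is claimed for the starred version.

```shell
line='4 Jeff Maggert'

# The star allows an empty match, for instance right in front of the
# leading digit, so X's appear where there is no word.
starred=$(printf '%s\n' "$line" | sed 's/[A-Za-z]*/X/g')

# Requiring at least one letter restricts the matches to real words.
one_or_more=$(printf '%s\n' "$line" | sed 's/[A-Za-z][A-Za-z]*/X/g')

printf '%s\n%s\n' "$starred" "$one_or_more"
```

The second form prints 4 X X. Writing [A-Za-z][A-Za-z]* is the same idea as [A-Za-z]\{1,\}.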
Finally, you can use other characters besides the slash to delimit a
substitution. In fact, you can use any character other than a backslash
or a newline. This is useful if you want to mess with slashes:
UNIX> sed 's#\([A-Za-z]\{1,\}\)#/\1/#g' ussmall
1 /Ernie/ /Els/ -- -4 -4 0
2 /Colin/ /Montgomerie/ -- -3 -4 1
3 /Tom/ /Lehman/ -- -2 -4 2
4 /Jeff/ /Maggert/ -- +1 -4 5
5 /Bob/ /Tway/ -- +2 -4 6
UNIX>
UNIX> cat script
s/[^A-Z]*//
s/\([A-Za-z]*\) \([A-Za-z ]*\) /\2, \1/
s/--/:/
s/ .*//
s/-\([0-9]*\)/\1 under/
s/+\([0-9]*\)/\1 over/
s/Maria Olazabal, Jose/Olazabal, Jose Maria/
UNIX> sed -f script usopen
Els, Ernie: 4 under
Montgomerie, Colin: 3 under
Lehman, Tom: 2 under
Maggert, Jeff: 1 over
Tway, Bob: 2 over
Browne, Olin: 2 over
Furyk, Jim: 2 over
Tolles, Tommy: 2 over
Haas, Jay: 2 over
McCarron, Scott: 3 over
Hoch, Scott: 3 over
Ogrin, David: 3 over
Roberts, Loren: 4 over
Cink, Stewart: 4 over
Andrade, Billy: 4 over
Hughes, Bradley: 5 over
Olazabal, Jose Maria: 5 over
Love III, Davis: 5 over
Price, Nick: 6 over
Westwood, Lee: 6 over
UNIX>
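For short scripts you can skip the file and give each command with its own -e option. The sketch below runs the same two commands both ways and should give identical results. The two-command script is a simplified stand-in, not the full script above, and the temporary file comes from mktemp.

```shell
line='1 Ernie Els -- -4 -4 0'

# Each -e adds one command; the commands run in order on every line.
with_e=$(printf '%s\n' "$line" | sed -e 's/--/:/' -e 's/-\([0-9]\)/\1 under/')

# The same two commands, one per line, in a script file given with -f.
script=$(mktemp)
printf '%s\n' 's/--/:/' 's/-\([0-9]\)/\1 under/' > "$script"
with_f=$(printf '%s\n' "$line" | sed -f "$script")
rm -f "$script"

printf '%s\n%s\n' "$with_e" "$with_f"
```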
Ed's syntax is like sed's:
[ address [, address ] ] function [ arguments ]
Another address you should know about is dot ('.'), which is the
``current'' line.
Some basic ed functions you should know are:
UNIX> cp ussmall work
UNIX> ed -p"ed: " work
191 # This tells you how big the file is in bytes
ed: 1,$p
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 3 Tom Lehman -- -2 -4 2
4 4 Jeff Maggert -- +1 -4 5
5 5 Bob Tway -- +2 -4 6
ed: 3d
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 4 Jeff Maggert -- +1 -4 5
4 5 Bob Tway -- +2 -4 6
ed: /Els/,3n
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 4 Jeff Maggert -- +1 -4 5
ed: /Mont/,$d
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
ed: w
36 # This tells you how big the file is in bytes
ed: q
UNIX> cat work
1 Ernie Els -- -4 -4 0
UNIX>
You can do substitution commands in ed, just like in sed.
Note that when you specify just one address in ed, the command
acts only on the first matching line:
UNIX> cp ussmall work
UNIX> ed work
191
ed: 1,$s/-- \([-+][0-9]*\).*/: \1/
ed: 1,$p
1 Ernie Els : -4
2 Colin Montgomerie : -3
3 Tom Lehman : -2
4 Jeff Maggert : +1
5 Bob Tway : +2
ed: /M/s/[A-Z]/X/g
ed: 1,$p
1 Ernie Els : -4
2 Xolin Xontgomerie : -3
3 Tom Lehman : -2
4 Jeff Maggert : +1
5 Bob Tway : +2
ed: q
?
ed: q # don't save the changes
UNIX>
You can use the g function to apply the command that follows to all
lines that match the regular expression after the g:
UNIX> ed work
191
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 3 Tom Lehman -- -2 -4 2
4 4 Jeff Maggert -- +1 -4 5
5 5 Bob Tway -- +2 -4 6
ed: 1,$g/M/d
ed: 1,$p
1 Ernie Els -- -4 -4 0
3 Tom Lehman -- -2 -4 2
5 Bob Tway -- +2 -4 6
ed: q
?
ed: q
UNIX> ed work
191
ed: 1,$g/M/s/[A-Z]/X/g
ed: 1,$p
1 Ernie Els -- -4 -4 0
2 Xolin Xontgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Xeff Xaggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
ed: w
191
ed: q
UNIX>
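For comparison, sed needs no g function for this, because a regular expression address already selects every matching line. The sketch below mirrors the two ed sessions above on shortened lines held in a shell variable. Unlike ed, sed writes the result to standard output and leaves its input alone. LC_ALL=C is added here only so that [A-Z] means the ASCII capitals regardless of locale.

```shell
small='1 Ernie Els
2 Colin Montgomerie
3 Tom Lehman
4 Jeff Maggert
5 Bob Tway'

# Like ed's 1,$g/M/d: drop every line that contains an M.
no_m=$(printf '%s\n' "$small" | sed '/M/d')

# Like ed's 1,$g/M/s/[A-Z]/X/g: rewrite the capitals, but only on
# lines that contain an M.
x_caps=$(printf '%s\n' "$small" | LC_ALL=C sed '/M/s/[A-Z]/X/g')

printf '%s\n' "$no_m"
printf '%s\n' "$x_caps"
```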
You can add or insert lines of text in ed. The a function says
to add the lines after the given line. The i function says to
insert the lines before the given line. The c command says to
replace (change) the specified line(s) with the ones that you type in. You
complete the addition of text with a control-D, or by a line that has
nothing but a period:
UNIX> cp ussmall work
UNIX> ed work
191
ed: 1a
Ernie was the only player who didn't choke.
The rest were total chokaramas.
.
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 Ernie was the only player who didn't choke.
3 The rest were total chokaramas.
4 2 Colin Montgomerie -- -3 -4 1
5 3 Tom Lehman -- -2 -4 2
6 4 Jeff Maggert -- +1 -4 5
7 5 Bob Tway -- +2 -4 6
ed: $a
No one else
.
ed: 1,$p
1 Ernie Els -- -4 -4 0
Ernie was the only player who didn't choke.
The rest were total chokaramas.
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
No one else
ed: $c
Definitely no one else
.
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 Ernie was the only player who didn't choke.
3 The rest were total chokaramas.
4 2 Colin Montgomerie -- -3 -4 1
5 3 Tom Lehman -- -2 -4 2
6 4 Jeff Maggert -- +1 -4 5
7 5 Bob Tway -- +2 -4 6
8 Definitely no one else
ed:
UNIX> cp ussmall work
UNIX> ed work
191
ed: /Maggert/ka
ed: 'ap
4 Jeff Maggert -- +1 -4 5
ed: 1,'an
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 3 Tom Lehman -- -2 -4 2
4 4 Jeff Maggert -- +1 -4 5
ed: 2
2 Colin Montgomerie -- -3 -4 1
ed: kb
ed: 'b,'ap
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
ed:
Note in the kb command, the mark is set to the current line, which
was set in the previous command.
The m function moves a set of lines to the line following the specified line. The t (transfer) function copies a set of lines to the line following the specified line:
ed: 1,$p
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
ed: 1,2m4
ed: 1,$p
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
5 Bob Tway -- +2 -4 6
ed: 1
3 Tom Lehman -- -2 -4 2
ed: 1t1
ed: 1,$p
3 Tom Lehman -- -2 -4 2
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
5 Bob Tway -- +2 -4 6
ed:
UNIX> cp ussmall work
UNIX> ed work
191
ed: 1,$n
1 1 Ernie Els -- -4 -4 0
2 2 Colin Montgomerie -- -3 -4 1
3 3 Tom Lehman -- -2 -4 2
4 4 Jeff Maggert -- +1 -4 5
5 5 Bob Tway -- +2 -4 6
ed: 1,2j
ed: 1,$p
1 Ernie Els -- -4 -4 0 2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
ed: 3,4j
ed: 1,$p
1 Ernie Els -- -4 -4 0 2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5 5 Bob Tway -- +2 -4 6
ed: 0r ussmall
191
ed: 1,$p
1 Ernie Els -- -4 -4 0
2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5
5 Bob Tway -- +2 -4 6
1 Ernie Els -- -4 -4 0 2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5 5 Bob Tway -- +2 -4 6
ed: u
ed: 1,$p
1 Ernie Els -- -4 -4 0 2 Colin Montgomerie -- -3 -4 1
3 Tom Lehman -- -2 -4 2
4 Jeff Maggert -- +1 -4 5 5 Bob Tway -- +2 -4 6
ed: q
?
ed: q
UNIX>
Of course, learning vi does try one's nerves if one is not used to ``edit'' and ``insert'' modes, but the interplay of sed, ed and vi is one of vi's big strengths.
A simple example. Note the use of ',' to specify all lines, and the expansion of $1.
UNIX> cat shscript
#!/bin/sh
if [ $# -ne 1 ]; then
  echo "usage: shscript name" >&2
  exit 1
fi
cp ussmall work
ed - work << EOF
,s/[^A-Z]*//
,s/ --.*//
,s/\([^ ]*\)/\1 "$1"/
w
q
EOF
UNIX>
UNIX> shscript Thumper
UNIX> cat work
Ernie "Thumper" Els
Colin "Thumper" Montgomerie
Tom "Thumper" Lehman
Jeff "Thumper" Maggert
Bob "Thumper" Tway
UNIX>
Summary of functions in Ed
Function   Operation
__________________________________________________________________________
$          Refers to the last line of the file being edited.
-          Moves back one line.
+          Moves forward one line.
/.         Reads the first line.
/text/     Searches for the text typed between the forward slashes.
i          Inserts text above the current line.
j          Joins lines of text.
t          Copies lines.
c          Changes (replaces) text.
r          Reads a file into the buffer after the given line, as in 0r ussmall above. (To remove lines, use d.)
X          Allows you to skip to a line.
Additionally, many of the above can be combined.
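The shscript example earlier works because the here document is introduced with an unquoted EOF: the shell expands $1 inside it, while backslash sequences such as \( and \1 pass through untouched. The sketch below shows the difference using cat instead of ed, so it runs even where ed is not installed. The helper names make_commands and make_literal are made up for illustration.

```shell
# Hypothetical helper: unquoted EOF, so $1 is expanded by the shell.
make_commands() {
  cat << EOF
,s/\([^ ]*\)/\1 "$1"/
w
q
EOF
}

# Hypothetical helper: quoted 'EOF', so the text is passed through as is.
make_literal() {
  cat << 'EOF'
,s/\([^ ]*\)/\1 "$1"/
EOF
}

expanded=$(make_commands Thumper)
literal=$(make_literal Thumper)

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

If you want ed itself to see a literal $ in a command (for example the address $), quote the EOF or write \$ inside the here document.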
Tr
Finally, sometimes you want to put a newline in certain places, and sed and ed irritatingly don't let you do it easily. For that you use tr. Read the man page for a full description. The most useful incantation of tr is:
tr char '\012'
where char is a character (or a set of characters; tr does not take regular expressions) and \012 is the octal ASCII code for a newline. This will replace all instances of that char with a newline. So, for example, to replace all capital letters with newlines, you do:
UNIX> tr '[A-Z]' '\012' < ussmall
1
rnie
ls -- -4 -4 0
2
olin
ontgomerie -- -3 -4 1
3
om
ehman -- -2 -4 2
4
eff
aggert -- +1 -4 5
5
ob
way -- +2 -4 6
UNIX>
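A common use of the same incantation is to put each space-separated field on its own line. The following is a small sketch on made-up input. The -s option, which squeezes repeated output characters into one, is standard tr but is not covered in the text above.

```shell
line='Ernie Els -- -4'

# Turn every space into a newline: one field per line.
fields=$(printf '%s\n' "$line" | tr ' ' '\012')

# With -s, a run of spaces becomes a single newline, so repeated
# spaces do not produce empty lines.
squeezed=$(printf 'Tom   Lehman\n' | tr -s ' ' '\012')

printf '%s\n' "$fields"
printf '%s\n' "$squeezed"
```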