sed & awk, 2nd Edition

by Dale Dougherty, Arnold Robbins

March 1997

Intermediate to advanced

432 pages

English

O'Reilly Media, Inc.

Testing and Saving Output

In our previous discussion of the pattern space, you saw that sed:

1. Makes a copy of the input line.
2. Modifies that copy in the pattern space.
3. Outputs the copy to standard output.

What this means is that sed has a built-in safeguard so that you don’t make changes to the original file. Thus, the following command line:

$ sed -f sedscr testfile

does not make the change in testfile. It sends all lines to standard output (typically the screen), both the lines that were modified and the lines that are unchanged. You have to capture this output in a new file if you want to save it.

$ sed -f sedscr testfile > newfile

The redirection symbol “>” directs the output from sed to the file newfile. Don’t redirect the output from the command back to the input file or you will overwrite the input file. The shell empties the file named after “>” before sed even gets a chance to read it, effectively destroying your data.
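The following is a minimal sketch of that failure, run in a scratch directory so that nothing real is lost. The sample contents of testfile and sedscr are assumptions made up for illustration; only the file names come from the text.

```shell
# Work in a scratch directory so no real data is at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input and a one-command script.
printf 'one\ntwo\n' > testfile
printf 's/one/ONE/\n' > sedscr

# WRONG: the shell truncates testfile before sed opens it,
# so sed reads an empty file and writes nothing back.
sed -f sedscr testfile > testfile

# Count the bytes left in testfile.
size=$(wc -c < testfile | tr -d ' ')
echo "testfile now holds $size bytes"
```

The final line reports 0 bytes: the original two lines are gone, and no edited version took their place.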

One important reason to redirect the output to a file is to verify your results. You can examine the contents of newfile and compare it to testfile. If you want to be very methodical about checking your results (and you should be), use the diff program to point out the differences between the two files.

$ diff testfile newfile

This command will display lines that are unique to testfile preceded by a “<” and lines unique to newfile preceded by a “>”. When you have verified your results, make a backup copy of the original input file and then use the mv command to overwrite the ...
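As a sketch of the verification step, the commands below build a small testfile, run it through sed into newfile, and compare the two with diff. The three-line input and the single substitution in sedscr are assumptions chosen only to keep the output short.

```shell
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input and a one-command script.
printf 'one\ntwo\nthree\n' > testfile
printf 's/two/TWO/\n' > sedscr

# Capture the edited lines in a new file; testfile is left untouched.
sed -f sedscr testfile > newfile

# diff exits with status 1 when the files differ, which is expected here,
# so "|| true" keeps a strict shell from treating that as a failure.
changes=$(diff testfile newfile || true)
echo "$changes"
```

With this input, diff reports one changed line: "< two" from testfile and "> TWO" from newfile, under the header 2c2.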