Sunday, August 7, 2011

unix tool awk

Awk is a text-processing programming language and a useful tool for manipulating text.

  • Awk recognizes the concepts of "file", "record", and "field".
  • A file consists of records, which by default are the lines of the file. One line becomes one record.
  • Awk operates on one record at a time.
  • A record consists of fields, which by default are separated by any number of spaces or tabs.
  • Field number 1 is accessed with $1, field 2 with $2, and so forth. $0 refers to the whole record.
  • Examples

    Print every line after erasing the 2nd field

    awk '{$2 = ""; print}' file

    Print hi 48 times

    yes | head -48 | awk '{ print "hi" }'

    Print only lines of less than 65 characters

    awk 'length < 65'

    In awk, the $0 variable represents the entire current line, so print and print $0
    do exactly the same thing.
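As a rough illustration of the record and field ideas above, the following sketch feeds two made-up lines to awk. The sample data and the shell variable names (sample, second_fields, whole_records, same_records) are assumptions for this example only.

```shell
# Two made-up records; each line is one record with three fields.
sample='alice 30 london
bob 25 paris'

# $2 selects the second whitespace-separated field of each record.
second_fields=$(printf '%s\n' "$sample" | awk '{ print $2 }')
echo "$second_fields"

# A bare print and print $0 both emit the whole record unchanged.
whole_records=$(printf '%s\n' "$sample" | awk '{ print }')
same_records=$(printf '%s\n' "$sample" | awk '{ print $0 }')
echo "$whole_records"
```

The first command should print only the ages, while the last two should reproduce the input as it is.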

    AWK Variables

    awk variables are initialized to either zero or the empty string the first time they are used.
    Variables
  • Variable declaration is not required
  • May contain any type of data, and their data type may change over the life of the program
  • Must begin with a letter or underscore, followed by letters, digits, and underscores
  • Are case sensitive
  • Some of the commonly used built-in variables are:

    • NR -- The current line's sequential number
    • NF -- The number of fields in the current line
    • FS -- The input field separator; defaults to whitespace and can be set with the -F command line option
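A small sketch of the built-in variables and of automatic initialization, assuming made-up colon-separated input. The data and the names sample, numbered, sum, and total are hypothetical and chosen only for this example.

```shell
# Made-up colon-separated records, loosely shaped like passwd entries.
sample='root:x:0
daemon:x:1'

# -F sets FS to ":"; NR is the record number and NF the field count.
numbered=$(printf '%s\n' "$sample" | awk -F: '{ print NR, NF, $1 }')
echo "$numbered"

# total is never declared; it starts out as zero on first use.
sum=$(printf '%s\n' "$sample" | awk -F: '{ total += $3 } END { print total }')
echo "$sum"
```

Without -F: each line would count as a single field, because the sample contains no spaces or tabs.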
    Integer variables can be used to refer to fields. If one field contains information about which other field is important, this script will print only the important field:
    $ awk '{imp=$1; print $imp }' calc
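To see the indirect field reference at work without the calc file, here is a sketch on made-up input in which the first field names the field to print. The sample data and the shell variable names sample and picked are assumptions for illustration.

```shell
# Made-up records: the first field holds the number of the field to print.
sample='2 apple banana
3 cherry plum'

# imp takes the value of $1, then $imp selects the field with that number.
picked=$(printf '%s\n' "$sample" | awk '{ imp = $1; print $imp }')
echo "$picked"
```

For the first record imp is 2, so $imp is the second field; for the second record imp is 3, so $imp is the third.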

    The special variable NF tells you how many fields are in this record,
    so $NF refers to the last field. A script can therefore print the first
    and last field from each record, regardless of how many fields there are.
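The script itself is not shown in the post; a minimal sketch of one way to write it, using $NF for the last field, might look like this. The sample input and the shell variable names sample and first_last are made up for the example.

```shell
# Made-up records with three, two, and one field respectively.
sample='one two three
alpha beta
solo'

# $1 is the first field; $NF is the field whose number equals NF, i.e. the last.
first_last=$(printf '%s\n' "$sample" | awk '{ print $1, $NF }')
echo "$first_last"
```

On a record with a single field, $1 and $NF are the same field, so that value appears twice.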
