awk scripting explained with practical examples

Awk is a command-line tool for manipulating the rows and columns of text in a file. It has built-in string functions and associative arrays, and it supports most of the operators and conditional blocks available in the C language.


One good thing about awk is that it can be combined with other commands to produce the required output. An awk script can also be converted to Perl.

Basic syntax of awk: awk 'BEGIN {start_action} {action} END {stop_action}' file_name

Here is when each action runs:

  • The BEGIN block runs before the file is processed
  • The END block runs after the file has been processed
  • The remaining actions run for each line while the file is processed
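The order of the three blocks can be sketched with a tiny example. The three-line input and the function name block_order are made up for illustration and are not part of the test file used in this article.

```shell
# Hypothetical three-line input, fed on standard input instead of a file.
block_order() {
    printf 'a\nb\nc\n' | awk '
        BEGIN { print "start" }        # runs once, before any input is read
              { print "line:", $0 }    # runs once for every input line
        END   { print "done" }         # runs once, after the last line
    '
}

block_order
```

This should print start first, then one "line:" row per input line, and done last.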

Examples: Create a file named test with the data below in it

[root@TechTutorial awk]# cat test

From the above data, you can see that the file has rows separated by newlines and columns separated by spaces. We are going to use the test file for the next few examples.

1. Print required columns using the print statement

Command Syntax: awk '{print $3}' test

Here $3 means: print the 3rd column of every row in the test file. To print multiple columns, list the field references separated by commas, for example $1,$2,$3. Below is the output, which contains the 3rd column of every row

[root@TechTutorial awk]# awk '{ print $3 }' test
root

[root@TechTutorial awk]# awk '{ print $1,$3,$6 }' test
-rw-r--r--. root Apr

To print the 4th and 6th columns of a file, use awk '{print $4,$6}' test

Here the BEGIN and END blocks are not used, so the print statement is executed for each row awk reads from the file. In the next example we will see how to use the BEGIN and END blocks.
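If you do not have the test file at hand, the same idea can be tried on inline data. The rows below (name, department, amount) and the helper name sample_rows are hypothetical, chosen only to keep the sketch self-contained.

```shell
# Hypothetical sample rows: name, department, amount (space separated).
sample_rows() {
    printf '%s\n' 'ravi sales 100' 'arkit admin 200' 'tech support 300'
}

# Print the 1st and 3rd columns of every row.
sample_rows | awk '{ print $1, $3 }'
```

Each output row should hold the name and the amount, separated by a single space.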

2. Print sum of the column value

Command Syntax:   awk 'BEGIN {sum=0} {sum=sum+$7} END {print sum}' test

The above example prints the sum of the values in the 7th column. In the BEGIN block the variable sum is assigned the value 0. In the next block the value of the 7th column is added to sum, and this repeats for every row processed. When all the rows have been processed, sum holds the total of the 7th column, which is printed in the END block as shown below:

[root@TechTutorial awk]# awk 'BEGIN {sum=0} {sum=sum+$7} END {print sum}' test
300
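A shorter variant is sketched below. It relies on the fact that an uninitialized awk variable behaves as 0 in arithmetic, so the BEGIN block can be dropped. The input rows and the function name sum_third_column are made up for this illustration, and the 3rd column is summed here rather than the 7th.

```shell
# Sum the 3rd column of whatever arrives on standard input.
# "sum + 0" forces a numeric result, so empty input prints 0 instead of a blank line.
sum_third_column() {
    awk '{ sum += $3 } END { print sum + 0 }'
}

printf '%s\n' 'ravi sales 100' 'arkit admin 200' 'tech support 300' | sum_third_column
```

With the three hypothetical rows above, the command should print 600.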

3. Sum of column value using awk script

In the 2nd example we saw how to sum the values of the 7th column. Instead of writing it as a one-line statement, we can write the same logic as a script. Create a file named sumofcolumn and paste the script below into it

#!/usr/bin/awk -f
BEGIN {sum=0} 
{sum=sum+$7} 
END {print sum}

Now execute the script using awk command as shown below

[root@TechTutorial awk]# awk -f sumofcolumn test
300

This runs the script in the sumofcolumn file and displays the sum of the 7th column of the test file.

4. Find string and print matched line

Command Syntax:    awk '{if($3 == "arkit") print $0;}' test

The above example checks for the string "arkit" in the 3rd column and, if it finds a match, prints the entire line. The awk command is run as below

[root@TechTutorial awk]# awk '{ if($3 == "arkit") print $0;}' test
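The same filter can also be written in awk's pattern form: a condition with no action prints every line for which it is true. The sketch below uses made-up rows and a hypothetical function name, match_owner.

```shell
# A bare condition is a pattern; the default action is to print the whole line.
match_owner() {
    awk '$3 == "arkit"'
}

printf '%s\n' 'file1 100 root' 'file2 200 arkit' 'file3 300 arkit' | match_owner
```

Only the two rows whose 3rd column is exactly arkit should be printed.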

5. For loop that squares a value incremented by 1

Command Syntax:       awk 'BEGIN { for(i=1;i<=10;i++) print "Multiplied value of", i, "is",i*i; }'

The above command prints the square of each number from 1 to 10. i++ adds 1 to i on each pass, so the loop keeps running until i reaches 10. The output of the command is below

[root@TechTutorial awk]# awk 'BEGIN { for(i=1;i<=10;i++) print "Multiplied value of", i, "is",i*i; }'
Multiplied value of 1 is 1

6. Input field separator

You have already seen $0, $1, $2, ... which print the entire line, the first column, the second column, and so on. Now we will see other built-in variables with examples.

In our example file test the columns are separated by spaces. If your file uses another symbol instead, such as : , or -, you can tell awk to split on that symbol.

For example, if you have : (colon) as the separator, use the command below

awk 'BEGIN {FS=":"} {print $2}' test

OR

awk -F: '{print $2}' test

This will print the output as below

[root@TechTutorial awk]# awk -F: '{print $2}' test
41 file12
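A colon-separated input makes the effect easier to see. The two lines below are hypothetical and only imitate the layout of a passwd-style file; colon_fields is likewise a made-up name.

```shell
# Split on ":" and print the 1st and 3rd fields of each line.
colon_fields() {
    awk -F: '{ print $1, $3 }'
}

printf '%s\n' 'root:x:0:0' 'arkit:x:1000:1000' | colon_fields
```

This should print "root 0" and "arkit 1000", one per line.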

7. OFS – Output field separator variable

By default, when we print fields with the print statement, they are displayed with a space character as the delimiter. For example

Command syntax:     awk '{print $4,$5}' test

[root@TechTutorial awk]# awk '{print $4,$5}' test
root 0

We can change this default behavior using the OFS variable:

Command Syntax:    awk 'BEGIN {OFS=":"} {print $4,$5}' test

[root@TechTutorial awk]# awk 'BEGIN {OFS=":"} {print $4,$5}' test
root:0

Note: print $4,$5 and print $4$5 do not work the same way. The first one separates the fields with the value of OFS (a space by default). The second one concatenates the fields, so the output has no delimiter.
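The difference between the comma form and the concatenated form can be seen side by side. The one-line input and the function name ofs_demo are made up for this sketch.

```shell
# With OFS set to ":", the comma inserts ":" while plain juxtaposition inserts nothing.
ofs_demo() {
    printf 'root 0\n' | awk 'BEGIN { OFS = ":" } { print $1, $2; print $1 $2 }'
}

ofs_demo
```

The first output line should be root:0 and the second root0.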

8. NF – Number of fields count

NF gives the number of fields in each line. Below is a command example

[root@TechTutorial awk]# awk '{print NF}' test
9
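Because NF holds the field count, $NF refers to the last field of a line, whatever its length. A minimal sketch on two hypothetical lines with different numbers of columns (count_and_last is a made-up name):

```shell
# Print the field count followed by the last field of each line.
count_and_last() {
    awk '{ print NF, $NF }'
}

printf '%s\n' 'file1 root 100' 'file2 arkit' | count_and_last
```

This should print "3 100" for the first line and "2 arkit" for the second.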

9. NR – number of records count

NR gives the number of the line currently being processed; in an END block it holds the count of lines in the file

[root@TechTutorial awk]# awk '{print NR}' test
1
2
3
4
5
6
7
8
9
10

The above example prints the line number of each line; the test file has ten lines.

10. Print number of records in particular file

The example in section 9 printed every line number, but suppose the requirement is to see only the count of records.

[root@TechTutorial awk]# awk 'END {print NR}' test
10

This will display the total number of lines in the test file.
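Both uses of NR can be combined in a single command: as a running line number in the main block and as the total in the END block. The input lines and the function name number_lines are hypothetical.

```shell
# Prefix each line with its number, then report the total once input is exhausted.
number_lines() {
    awk '{ print NR ": " $0 } END { print "total:", NR }'
}

printf '%s\n' 'first' 'second' 'third' | number_lines
```

This should print the three numbered lines followed by "total: 3".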

String functions in Awk:

Some of the string functions in awk are:

  • index(string,search)
  • length(string)
  • split(string,array,separator)
  • substr(string,position)
  • substr(string,position,max)
  • tolower(string)
  • toupper(string)
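The functions listed above can be tried out in a BEGIN block, which needs no input file. The string "TechTutorial" and the function name string_demo are picked only for illustration; string positions in awk start at 1.

```shell
string_demo() {
    awk 'BEGIN {
        s = "TechTutorial"
        print length(s)            # 12
        print index(s, "Tut")      # 5
        print substr(s, 5)         # Tutorial
        print substr(s, 1, 4)      # Tech
        print toupper(s)           # TECHTUTORIAL
        print tolower(s)           # techtutorial
        n = split("U,N,ARKIT", parts, ",")   # split returns the number of pieces
        print n, parts[3]          # 3 ARKIT
    }'
}

string_demo
```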

Advanced Examples:

11. Filtering lines using Awk split function

The awk split function splits a string into an array using the delimiter.

The syntax of split function is
split(string, array, delimiter)

Now we will see how to filter the lines using the split function with an example.

The input “advanced.txt” contains the data in the following format

[root@TechTutorial awk]# cat advanced.txt
1 U,N,ARKIT,000
2 A,B,TEST,111
3 I,M,ARKIT,222
4 C,D,TECH,333
5 T,I,RAVI,444

Required output: Now we have to print only the lines in which the 2nd field has the string "ARKIT" as its 3rd comma-separated value.

The output is:
1 U,N,ARKIT,000
3 I,M,ARKIT,222

The awk command for getting this output is given below

Command:
awk '{
        split($2,arr,",");
        if(arr[3] == "ARKIT")
        print $0
} ' advanced.txt
[root@TechTutorial awk]# awk '{
        split($2,arr,",");
        if(arr[3] == "ARKIT")
        print $0
} ' advanced.txt

1 U,N,ARKIT,000
3 I,M,ARKIT,222
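As an alternative sketch, not the approach used above, the same lines can be selected without split by giving awk a field separator that matches either a space or a comma. The string ARKIT then lands in the 4th field. The function name filter_arkit is made up, and the rows are the advanced.txt data shown earlier, fed on standard input.

```shell
# "[ ,]" is a regular expression: a single space or a single comma ends a field.
filter_arkit() {
    awk -F'[ ,]' '$4 == "ARKIT"'
}

printf '%s\n' '1 U,N,ARKIT,000' '2 A,B,TEST,111' '3 I,M,ARKIT,222' \
    '4 C,D,TECH,333' '5 T,I,RAVI,444' | filter_arkit
```

The split version keeps the default whitespace splitting for the rest of the script, which is useful when other fields may contain commas. The separator version is shorter when every field can be split the same way.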

We will cover a few more awk examples in an upcoming post. Stay tuned.

