Advanced Linux Text Processing: gawk printf and Functions

2020-06-16 17:22:15

I. Formatting output with printf

printf lets you produce output in exactly the format you want, flexibly and simply.

Syntax:

printf "print format", variable1,variable2,etc.

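Special characters in printf:

printf does not use OFS or ORS; it prints data only as specified by the "format" string, so a newline must be written explicitly.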
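The difference from print can be seen in a minimal sketch like the following. The helper name demo_ofs and the sample input are assumptions made for illustration only; any POSIX awk should behave the same way.

```shell
# demo_ofs is a hypothetical helper name used only for this illustration.
demo_ofs() {
  echo "a b c" | awk '
    BEGIN { OFS = "-"; ORS = ";\n" }
    {
      print $1, $2, $3                 # print joins its arguments with OFS and appends ORS
      printf "%s %s %s\n", $1, $2, $3  # printf emits only what the format string says
    }'
}
demo_ofs
```

The first output line is built with OFS and ORS, while the second contains only the spaces and the newline written in the format string.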

printf format specifiers:

Example 1:

[root@localhost ~]# cat pri.awk 
BEGIN {
    printf "s--> %s\n", "String"
    printf "c--> %c\n", "String"
    printf "s--> %s\n", 101.23
    # In the next four lines the second argument (23) has no matching
    # format specifier, so only 101 is formatted.
    printf "d--> %d\n", 101,23
    printf "e--> %e\n", 101,23
    printf "f--> %f\n", 101,23
    printf "g--> %g\n", 101,23
    printf "o--> %o\n", 0x8
    printf "x--> %x\n", 16
    printf "percentage--> %%\n", 17
}
[root@localhost ~]# awk -f pri.awk 
s--> String
c--> S
s--> 101.23
d--> 101
e--> 1.010000e+02
f--> 101.000000
g--> 101
o--> 10
x--> 10
percentage--> %

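printf modifiers:

Modifier #[.#]: the first number sets the display width; the second number sets the precision after the decimal point

- left-justify (the default is right-justified), e.g. %-15s

+ show the sign of a numeric value, e.g. %+d; zero is also printed with a plus sign

$ to put a dollar sign before a price, simply write $ in the format string in front of the specifier (before the %)

0 pad on the left with zeros instead of spaces by writing 0 before the width of a numeric field, e.g. "%05d" instead of "%5d"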

Example 2:

[root@localhost ~]# awk 'BEGIN { printf "|%6s%7.3f|\n", "Good","2.1" }'  
|  Good  2.100|
[root@localhost ~]# awk 'BEGIN { printf "|%-6s%-7.3f|\n", "Good","2.1" }'
|Good  2.100  |
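The modifiers can also be combined. The following is a small sketch rather than part of the original examples; demo_modifiers and the sample values are assumptions for illustration.

```shell
# demo_modifiers is a hypothetical helper name used only for this illustration.
demo_modifiers() {
  awk 'BEGIN {
    printf "|%-10s|%8.2f|\n", "Widget", 3.14159   # left-justified string, then width 8 with 2 decimals
    printf "|%+d|%+d|\n", 7, 0                    # explicit sign, also for zero
    printf "|%05d|\n", 42                         # zero padding up to width 5
    printf "|$%.2f|\n", 19.5                      # literal dollar sign in front of the value
  }'
}
demo_modifiers
```

The vertical bars are there only to make the padding visible in the output.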

Redirecting results to a file:

In awk, the output of print and printf statements can be redirected to a specified file.

Example 3:

[root@localhost ~]# awk 'BEGIN{a=5;printf "%3d\n",a> "report.txt"}'
[root@localhost ~]# cat report.txt 
  5

Another approach is to let the shell redirect all of the output: awk -f script.awk file > redirectfile
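Inside awk, > truncates the file only the first time it is opened; further writes to the same open file append, and >> appends even to a file that already exists. A minimal sketch, where demo_redirect and the temporary file are assumptions for illustration:

```shell
# demo_redirect is a hypothetical helper name used only for this illustration.
demo_redirect() {
  out=$(mktemp)                  # temporary file; mktemp chooses the name
  awk -v out="$out" 'BEGIN {
    printf "%3d\n", 5 > out      # the first write with > truncates the file
    printf "%3d\n", 6 > out      # later writes to the same open file are appended
    close(out)
    printf "%3d\n", 7 >> out     # >> appends even after the file was closed
    close(out)
  }'
  cat "$out"
  rm -f "$out"
}
demo_redirect
```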

Running an awk script:

Example 4:

[root@localhost ~]# cat fz.awk      
#!/bin/awk -f
BEGIN {
FS=",";
OFS=",";
total1 = total2 = total3 = total4 = total5 = 10;
total1 += 5; print total1;
total2 -= 5; print total2;
total3 *= 5; print total3;
total4 /= 5; print total4;
total5 %= 5; print total5;
}
[root@localhost ~]# chmod +x fz.awk   
[root@localhost ~]# ./fz.awk        
15
5
50
2
0

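II. awk built-in and user-defined functions

Numeric functions:

The rand() function

rand() returns a pseudo-random number between 0 and 1: the value may be 0 but is never 1. The numbers vary within a single awk run, but the sequence is predictable from one run to the next, because the generator starts from the same seed unless srand() is called.

Example 1: generate 1000 random integers in the range 0 to 99 and count how often each one occurs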

[root@localhost ~]# cat occ.awk 
BEGIN {
    while(i<1000)
    {
        n = int(rand()*100);
        rnd[n]++;
        i++;
    }
    for(i=0;i<=100;i++)
    {
        print i,"Occured",rnd[i],"times";
    }
}
[root@localhost ~]# awk -f occ.awk 
0 Occured 11 times
1 Occured 8 times
2 Occured 9 times
3 Occured 15 times
4 Occured 16 times
5 Occured 5 times
6 Occured 8 times
7 Occured 9 times
8 Occured 7 times
9 Occured 7 times
10 Occured 11 times
11 Occured 7 times
12 Occured 10 times
13 Occured 9 times
14 Occured 6 times
15 Occured 18 times
16 Occured 10 times
17 Occured 10 times
18 Occured 9 times
19 Occured 8 times
20 Occured 11 times
21 Occured 13 times
22 Occured 10 times
23 Occured 9 times
24 Occured 15 times
25 Occured 8 times
26 Occured 3 times
27 Occured 17 times
28 Occured 9 times
29 Occured 13 times
30 Occured 11 times
31 Occured 9 times
32 Occured 12 times
33 Occured 12 times
34 Occured 9 times
35 Occured 6 times
36 Occured 13 times
37 Occured 15 times
38 Occured 6 times
39 Occured 9 times
40 Occured 7 times
41 Occured 8 times
42 Occured 6 times
43 Occured 8 times
44 Occured 10 times
45 Occured 7 times
46 Occured 10 times
47 Occured 8 times
48 Occured 16 times
49 Occured 12 times
50 Occured 6 times
51 Occured 15 times
52 Occured 6 times
53 Occured 12 times
54 Occured 8 times
55 Occured 13 times
56 Occured 6 times
57 Occured 16 times
58 Occured 5 times
59 Occured 7 times
60 Occured 11 times
61 Occured 12 times
62 Occured 14 times
63 Occured 11 times
64 Occured 9 times
65 Occured 6 times
66 Occured 7 times
67 Occured 10 times
68 Occured 8 times
69 Occured 12 times
70 Occured 13 times
71 Occured 9 times
72 Occured 10 times
73 Occured 11 times
74 Occured 7 times
75 Occured 13 times
76 Occured 13 times
77 Occured 10 times
78 Occured 5 times
79 Occured 12 times
80 Occured 17 times
81 Occured 8 times
82 Occured 7 times
83 Occured 10 times
84 Occured 12 times
85 Occured 12 times
86 Occured 11 times
87 Occured 14 times
88 Occured 4 times
89 Occured 8 times
90 Occured 15 times
91 Occured 10 times
92 Occured 15 times
93 Occured 8 times
94 Occured 11 times
95 Occured 5 times
96 Occured 12 times
97 Occured 11 times
98 Occured 7 times
99 Occured 11 times
100 Occured  times

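Note: with 1000 draws spread over only 100 possible values, every value is bound to repeat many times. The count for 100 is empty because int(rand()*100) never exceeds 99.

The srand(n) function

srand(n) initializes the random number generator with the given seed n. Whenever awk is started with the same n, it produces the same sequence of random numbers. If n is omitted, awk uses the current time of day as the seed.

Example 2: draw 5 distinct random integers with srand(5) and print those that fall between 5 and 50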

[root@localhost ~]# cat srand.awk 
BEGIN {
    # Initialize the seed with 5.
    srand(5);
    # Generate 5 distinct numbers in total.
    total = 5;
    # Upper bound: int(rand()*max) yields integers from 0 to max-1.
    max = 50;
    count = 0;
    while(count < total)
    {
        rnd = int(rand()*max);
        if( array[rnd] == 0 )
        {
            count++;
            array[rnd]++;
        }
    }
    # Only values from 5 upward are printed.
    for ( i=5;i<=max;i++)
    {
        if (array[i])
            print i;
    }
}
[root@localhost ~]# awk -f srand.awk 
14
16
23
33
35
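To map rand() onto an inclusive integer range directly, a common formula is lo + int(rand() * (hi - lo + 1)). The sketch below assumes a hypothetical helper rand_range taking a seed, the two bounds and a count. Unlike Example 2 it does not filter out duplicates, and the actual numbers depend on the awk implementation.

```shell
# rand_range is a hypothetical helper: rand_range SEED LOW HIGH COUNT
rand_range() {
  awk -v seed="$1" -v lo="$2" -v hi="$3" -v n="$4" 'BEGIN {
    srand(seed)                                # same seed, same sequence
    for (i = 0; i < n; i++)
      print lo + int(rand() * (hi - lo + 1))   # rand() < 1, so the result never exceeds hi
  }'
}
rand_range 5 5 50 5
```

Running the same command twice with the same seed should print the same five numbers, which is handy for reproducible tests.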

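Common string functions:

The length function:

length([s]) returns the length of the given string; without an argument it returns the length of $0.

Example 1: the length function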

[root@bash ~]# awk 'BEGIN{print length("young")}'
5

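The sub function:

sub(r,s,[t]) searches the string t for text matching the pattern r (a regular expression may be used) and replaces the first match with the string s. If t is omitted, $0 is used.

Example 1: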

[root@bash ~]# awk 'BEGIN{a="geek young";sub("young","xixi",a);print a}' 
geek xixi  # note: string literals must be quoted

Example 2:

[root@bash ~]# echo "geek young hahahaha"|awk '
>{sub(/\<young\>/,"xixi",$2);  # a regex literal between slashes is not quoted
>print $2}'   
xixi

Example 3:

[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'sub(/:/,"",$1)'
200808:08:08 08:08:08

Example 4:

[root@bash ~]# cat sub.awk
BEGIN {
state="CA is California"
sub("C[Aa]","KA",state);
print state;
}
[root@bash ~]# awk -f sub.awk
KA is California

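The gsub function:

gsub(r,s,[t]) searches the string t for text matching the pattern r (a regular expression may be used) and replaces every match with s. If t is omitted, $0 is used.

Example 1: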

[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'gsub(/:/,"",$1)'
2008080808 08:08:08
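Both sub and gsub return the number of replacements made, and an & in the replacement string stands for the matched text. A minimal sketch; demo_gsub is a hypothetical helper name and the replacement strings are chosen only for illustration:

```shell
# demo_gsub is a hypothetical helper name used only for this illustration.
demo_gsub() {
  echo "2008:08:08:08 08:08:08" | awk '{
    n = gsub(/:/, "-", $1)      # gsub returns how many replacements were made
    gsub(/[0-9]+/, "<&>", $2)   # & in the replacement stands for the matched text
    print n, $1, $2
  }'
}
demo_gsub
```

This prints 3 2008-08-08-08 <08>:<08>:<08>.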

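The split function:

split(s,array,[r]) splits the string s using r as the separator and stores the pieces in array; the first piece gets index 1, the second index 2, and so on.

Example 1: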

[root@bash ~]# echo "192.168.1.1:80"|awk '
>{split($1,ip,":");
>print ip[1],"----",ip[2]}'                       
192.168.1.1 ---- 80

Example 2:

[root@bash ~]# netstat -tan | awk '
>/^tcp\>/{split($5,ip,":");
>count[ip[1]]++}  # using a value from one array as the index of another and incrementing it is a common way to count occurrences
>END{for (i in count){print i,count[i]}}'
116.211.167.193 3
0.0.0.0 4
192.168.1.116 1

Example 3:

[root@bash ~]# cat items-sold1.txt   
101:2,10,5,8,10,12
102:0,1,4,3,0,2
103:10,6,11,20,5,13
104:2,3,4,0,6,5
105:10,2,5,7,12,6
[root@bash ~]# cat split.awk
BEGIN {
FS=":"
} {
split($2,quantity,",");
total=0;
for(x in quantity)
total=total+quantity[x];
print "Item",$1,":",total,"quantities sold";
}
[root@bash ~]# awk -f split.awk items-sold1.txt
Item 101 : 47 quantities sold
Item 102 : 10 quantities sold
Item 103 : 65 quantities sold
Item 104 : 20 quantities sold
Item 105 : 42 quantities sold
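The separator may also be a regular expression, and split returns the number of pieces, which allows an index loop that keeps the original order (the order of a for (x in array) loop is unspecified, which does not matter when summing). A sketch with a hypothetical helper demo_split and made-up input:

```shell
# demo_split is a hypothetical helper name used only for this illustration.
demo_split() {
  echo "10, 20;30 ,40" | awk '{
    n = split($0, part, /[ ]*[,;][ ]*/)   # separator: comma or semicolon with optional spaces
    total = 0
    for (i = 1; i <= n; i++)              # an index loop visits the pieces in order
      total += part[i]
    print n, total
  }'
}
demo_split
```

This prints 4 100: four pieces whose sum is 100.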

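The substr function

Syntax:

substr(input-string,location,length)

substr extracts the specified part (a substring) of a string. In the syntax above:
input-string: the string that contains the substring
location: the starting position of the substring
length: the number of characters to extract, starting at location. This argument is optional; if it is omitted, everything from location to the end of the string is extracted

Example 1: print everything from the 5th character to the end of the string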

[root@localhost ~]# cat items.txt 
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk '{ print substr($0,5) }' items.txt
HD Camcorder,Video,210,10
Refrigerator,Appliance,850,2
MP3 Player,Audio,270,15
Tennis Racket,Sports,190,20
Laser Printer,Office,475,5

Example 2: print 5 characters starting from the 1st character of the 2nd field

[root@localhost ~]# awk -F"," '{ print substr($2,1,5) }' items.txt
HD Ca
Refri
MP3 P
Tenni
Laser
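substr combines naturally with length, for instance to take the tail of a string. The sketch below is an illustration only; demo_substr is a hypothetical helper and the record is copied from items.txt.

```shell
# demo_substr is a hypothetical helper name used only for this illustration.
demo_substr() {
  echo "101,HD Camcorder,Video,210,10" | awk -F, '{
    name = $2
    print substr(name, length(name) - 8)   # the last 9 characters of the field
    print substr(name, 4, 100)             # a length past the end stops at the end of the string
  }'
}
demo_substr
```

Both statements print Camcorder.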

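Calling shell commands

Two-way pipe |&

awk (gawk, to be precise) can communicate with an external process through "|&"; the communication works in both directions.

Example 1: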

[root@localhost ~]# cat doub.awk 
BEGIN {
    command = "sed 's/Awk/Sed and Awk/'"
    print "Awk is Great!" |& command
    close(command,"to");  # close the write end so that sed sees end-of-input and emits its output
    command |& getline tmp
    print tmp;
    close(command);
}
[root@localhost ~]# awk -f doub.awk 
Sed and Awk is Great!

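Notes:

"|&" denotes a two-way pipe: the command on the right of "|&" reads as its input what the statement on the left writes.
close(command,"to") - once everything has been sent, close the "to" end of the pipe so that the command sees end-of-input.
command |& getline tmp - now that the command has finished, use getline to fetch its output, which is stored in the variable "tmp".
close(command) - finally, close the command.

The system function

system() executes an operating system command. Any string can be passed as the command, and it is executed exactly as given. The command's output goes straight to standard output instead of back into awk, and system() returns the command's exit status (this is how it differs from the two-way pipe).

Example 1:

[root@localhost ~]# awk 'BEGIN{system("hostname");}' # no print statement is needed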
localhost.localdomain  
[root@localhost ~]# awk 'BEGIN{system("pwd")}'
/root
[root@localhost ~]# awk 'BEGIN{system("date")}'
Fri Jan 20 23:57:55 CST 2017
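The value returned by system() can be used to check whether a command succeeded. A minimal sketch using the standard true and false utilities; demo_system is a hypothetical helper name:

```shell
# demo_system is a hypothetical helper name used only for this illustration.
demo_system() {
  awk 'BEGIN {
    rc = system("true")      # exit status 0 means success
    print "true ->", rc
    rc = system("false")     # a non-zero exit status means failure
    print "false ->", rc
  }'
}
demo_system
```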

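The getline function

The getline command lets you control how awk reads data from the input file (or from other files). Note that once a plain getline completes, awk resets the built-in variables NF, NR, FNR and $0 to reflect the line just read.

Example 1: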

[root@localhost ~]# cat items.txt 
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -F"," '
>{getline;print $0;}' items.txt # like the n command in sed, this changes the normal awk execution flow
102,Refrigerator,Appliance,850,2
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5

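When the body block starts, before any statement runs, awk reads the first line of items.txt and stores it in $0
getline - forces awk to read the next line and store it in $0 (overwriting the previous content)
print $0 - since $0 now holds the second line, print $0 prints the second line of the file (not the first)
The body block keeps executing this way, so only the even-numbered lines are printed. (Note that the last line, 105, is printed as well: at end of file getline reads nothing and leaves $0 unchanged.)

Besides placing the line read by getline into $0, you can also store it in a variable.

Example 2: print the odd-numbered lines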

[root@localhost ~]# awk -F"," '{getline tmp; print $0;}' items.txt
101,HD Camcorder,Video,210,10
103,MP3 Player,Audio,270,15
105,Laser Printer,Office,475,5

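Notes:

When the body block starts, before any statement runs, awk reads the first line of items.txt and stores it in $0
getline tmp - forces awk to read the next line and store it in the variable tmp
print $0 - $0 still holds the first line, because getline tmp does not overwrite $0, so the first line is printed (not the second)
The body block keeps executing this way, so only the odd-numbered lines are printed.

Example 3: getline from another file into a variable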

[root@localhost ~]# cat items.txt 
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# cat items-sold.txt 
101 2 10 5 8 10 12
102 0 1 4 3 0 2
103 10 6 11 20 5 13
104 2 3 4 0 6 5
105 10 2 5 7 12 6
[root@localhost ~]# awk -F"," '{
>print $0; 
>getline tmp < "items-sold.txt";
>print tmp;}' items.txt
101,HD Camcorder,Video,210,10
101 2 10 5 8 10 12
102,Refrigerator,Appliance,850,2
102 0 1 4 3 0 2
103,MP3 Player,Audio,270,15
103 10 6 11 20 5 13
104,Tennis Racket,Sports,190,20
104 2 3 4 0 6 5
105,Laser Printer,Office,475,5
105 10 2 5 7 12 6
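getline returns 1 when it reads a line, 0 at end of file and -1 on error, so a loop over a whole file should test for > 0. A minimal sketch; demo_getline_loop, the temporary file and its contents are assumptions for illustration:

```shell
# demo_getline_loop is a hypothetical helper name used only for this illustration.
demo_getline_loop() {
  data=$(mktemp)                               # temporary sample file
  printf '101 2\n102 0\n103 10\n' > "$data"
  awk -v file="$data" 'BEGIN {
    # Comparing with > 0 stops the loop at end of file (0) and also on
    # errors such as a missing file (-1), which would otherwise loop forever.
    while ((getline line < file) > 0)
      count++
    close(file)
    print count, "lines read"
  }'
  rm -f "$data"
}
demo_getline_loop
```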

Example 4: getline from an external command

[root@localhost ~]# cat get.awk 
BEGIN {
    FS=",";
    "date" | getline
    close("date")
    print "Timestamp:" $0
} {
if ( $5 <= 5)
    print "Buy More:Order",$2,"immediately!"
else
    print "Sell More:Give discount on",$2,"immediately!"
}
[root@localhost ~]# cat items.txt 
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -f get.awk items.txt 
Timestamp:Sat Jan 21 00:23:53 CST 2017
Sell More:Give discount on HD Camcorder immediately!
Buy More:Order Refrigerator immediately!
Sell More:Give discount on MP3 Player immediately!
Sell More:Give discount on Tennis Racket immediately!
Buy More:Order Laser Printer immediately!

Example 5: instead of $0, the command output can be stored in any awk variable

[root@localhost ~]# cat get2.awk              
BEGIN {FS=",";
    "date" | getline timestamp
    close("date")
    print "Timestamp:" timestamp
} {
if ( $5 <= 5)
    print "Buy More: Order",$2,"immediately!"
else
    print "Sell More: Give discount on",$2,"immediately!"
}
[root@localhost ~]# awk -f get2.awk items.txt 
Timestamp:Sat Jan 21 00:26:29 CST 2017
Sell More: Give discount on HD Camcorder immediately!
Buy More: Order Refrigerator immediately!
Sell More: Give discount on MP3 Player immediately!
Sell More: Give discount on Tennis Racket immediately!
Buy More: Order Laser Printer immediately!
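When a command prints more than one line, the same form can be used in a loop, reading one line per call. A minimal sketch; demo_cmd_getline and the echo command string are assumptions for illustration:

```shell
# demo_cmd_getline is a hypothetical helper name used only for this illustration.
demo_cmd_getline() {
  awk 'BEGIN {
    cmd = "echo one; echo two; echo three"
    while ((cmd | getline line) > 0)   # one line of command output per iteration
      n++
    close(cmd)                         # close it so the command could be run again later
    print n, line                      # line still holds the last line that was read
  }'
}
demo_cmd_getline
```

This prints 3 three.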

awk user-defined functions

Format:

function name ( parameter, parameter, ... ) {
statements
return expression
}

Example 1:

[root@localhost ~]# cat fun.awk 
function max(v1,v2) {
    # return the larger argument without setting any global variable
    return v1 > v2 ? v1 : v2
}
BEGIN{a=3;b=2;print max(a,b)}
[root@localhost ~]# awk -f fun.awk 
3
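Variables assigned inside a function are global unless they appear in the parameter list, so the usual convention is to declare locals as extra parameters that the caller never passes. A sketch of that convention; demo_function, max_of and the sample numbers are hypothetical names and data:

```shell
# demo_function and max_of are hypothetical names used only for this illustration.
demo_function() {
  awk '
    # The extra parameters "best" and "i" act as local variables;
    # the caller simply does not pass them.
    function max_of(arr, n,    best, i) {
      best = arr[1] + 0                 # + 0 forces a numeric comparison below
      for (i = 2; i <= n; i++)
        if (arr[i] + 0 > best)
          best = arr[i] + 0
      return best
    }
    BEGIN {
      n = split("3 17 8 2", nums, " ")
      print max_of(nums, n)
      print (best == "" ? "best is not set globally" : "best leaked")
    }'
}
demo_function
```

The second output line confirms that best stays local to the function.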

Permanent link to this article: http://www.linuxidc.com/Linux/2017-02/140274.htm
