2021-05-12 14:32:11
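Advanced Linux Text Processing: gawk printf and Functions
I. Formatting Output with printf
printf lets you print results in exactly the layout you want, flexibly and with little effort.
Syntax:
printf "print format", variable1, variable2, etc.
Special characters in printf: escape sequences such as \n (newline) and \t (tab) are interpreted inside the format string.
printf does not use OFS or ORS; it prints data only as the "format" string dictates, so every separator and newline must appear in the format itself.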
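The difference from print is easiest to see side by side. The following is a minimal sketch, assuming any POSIX-compatible awk; the separator values "-" and "!" are chosen only for illustration.

```shell
# print honors OFS and ORS; printf ignores both and follows its format string.
out=$(echo "a b" | awk 'BEGIN { OFS = "-"; ORS = "!\n" }
{
    print $1, $2              # uses OFS between fields and ORS at the end
    printf "%s %s\n", $1, $2  # separator and newline come from the format
}')
echo "$out"
```

The first output line carries the custom separators, while the printf line shows only what its format string spells out.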
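printf format specifiers (each appears in Example 1): %s string, %c single character, %d integer, %e scientific notation, %f floating point, %g the shorter of %e and %f, %o octal, %x hexadecimal, %% a literal percent sign.
Example 1:
[root@localhost ~]# cat pri.awk
BEGIN {
    printf "s--> %s\n", "String"
    printf "c--> %c\n", "String"
    printf "s--> %s\n", 101.23
    # "101,23" below passes two arguments (101 and 23); the unused 23 is ignored
    printf "d--> %d\n", 101,23
    printf "e--> %e\n", 101,23
    printf "f--> %f\n", 101,23
    printf "g--> %g\n", 101,23
    printf "o--> %o\n", 0x8
    printf "x--> %x\n", 16
    printf "percentage--> %%\n", 17
}
[root@localhost ~]# awk -f pri.awk
s--> String
c--> S
s--> 101.23
d--> 101
e--> 1.010000e+02
f--> 101.000000
g--> 101
o--> 10
x--> 10
percentage--> %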
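printf modifiers:
Modifier #[.#]: the first number sets the display width; the second sets the precision after the decimal point
-  Left-justify (the default is right-justify), e.g. %-15s
+  Show the sign of a numeric value, e.g. %+d; 0 is printed with a plus sign too
$  To put a dollar sign in front of a price, write $ immediately before the % in the format string, e.g. "$%.2f"
0  Pad with zeros on the left instead of spaces by putting a 0 in front of the width, e.g. "%05d" instead of "%5d" (zero padding is defined only for numeric conversions)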
Example 2:
[root@localhost ~]# awk 'BEGIN { printf "|%6s%7.3f|\n", "Good","2.1" }'
|  Good  2.100|
[root@localhost ~]# awk 'BEGIN { printf "|%-6s%-7.3f|\n", "Good","2.1" }'
|Good  2.100  |
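To see the modifiers individually, here is a small sketch that applies each one to a single value. It assumes a POSIX-compatible awk; the values are arbitrary, and the vertical bars only make the padding visible.

```shell
out=$(awk 'BEGIN {
    printf "|%8.2f|\n", 3.14159   # width 8, two digits after the decimal point
    printf "|%-8s|\n", "left"     # left-justified in a field of width 8
    printf "|%+d|%+d|\n", 7, 0    # explicit sign, also for zero
    printf "|$%.2f|\n", 9.5       # a literal dollar sign before the number
    printf "|%05d|\n", 42         # zero padding up to width 5
}')
echo "$out"
```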
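Redirecting output to a file:
In awk, the output of a print or printf statement can be redirected to a named file.
Example 3:
[root@localhost ~]# awk 'BEGIN{a=5;printf "%3d\n",a > "report.txt"}'
[root@localhost ~]# cat report.txt
  5
Alternatively, redirect in the shell: awk -f script.awk file > redirectfile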
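The sketch below shows how > and >> behave inside one awk program. It assumes a POSIX-compatible awk and a temporary directory from mktemp; the variable name f is only an illustration.

```shell
tmp=$(mktemp -d)
awk -v f="$tmp/report.txt" 'BEGIN {
    printf "%3d\n", 5 > f   # the first write to f truncates the file
    printf "%3d\n", 6 > f   # f is still open, so this line is added after the first
    close(f)
    print "done" >> f       # after close(), >> reopens the file in append mode
    close(f)
}'
out=$(cat "$tmp/report.txt")
rm -r "$tmp"
echo "$out"
```

Within a single run, > truncates the file only when it is first opened, which is why the second printf does not overwrite the first.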
Running awk scripts: a script that begins with a #!/bin/awk -f line can be made executable and run directly.
Example 4:
[root@localhost ~]# cat fz.awk
#!/bin/awk -f
BEGIN {
    FS=",";
    OFS=",";
    total1 = total2 = total3 = total4 = total5 = 10;
    total1 += 5; print total1;
    total2 -= 5; print total2;
    total3 *= 5; print total3;
    total4 /= 5; print total4;
    total5 %= 5; print total5;
}
[root@localhost ~]# chmod +x fz.awk
[root@localhost ~]# ./fz.awk
15
5
50
2
0
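Besides the executable-script form, the same program can be given inline or loaded with -f. The following sketch compares the two, assuming a POSIX-compatible awk; the file name sum.awk is hypothetical.

```shell
tmp=$(mktemp -d)
cat > "$tmp/sum.awk" <<'EOF'
# sum.awk (hypothetical name): add up the first field of every line
{ total += $1 }
END { print total }
EOF
# Inline program text on the command line
inline=$(printf '1\n2\n3\n' | awk '{ total += $1 } END { print total }')
# The same program loaded from a file with -f
fromfile=$(printf '1\n2\n3\n' | awk -f "$tmp/sum.awk")
rm -r "$tmp"
echo "$inline $fromfile"
```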
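II. awk Built-in and User-Defined Functions
Numeric functions:
rand() function
rand() returns a random number between 0 and 1: the value may be 0 but is never 1. The numbers are random within a single awk run, but because gawk starts from the same seed each time, the sequence is the same from one run to the next unless srand() is called.
Example 1: generate 1000 random integers (int(rand()*100) yields values from 0 to 99)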
[root@localhost ~]# cat occ.awk
BEGIN {
    while (i < 1000) {
        n = int(rand()*100);   # n is an integer from 0 to 99
        rnd[n]++;
        i++;
    }
    # index 100 is never hit, so its count prints as empty
    for (i = 0; i <= 100; i++) {
        print i,"Occurred",rnd[i],"times";
    }
}
[root@localhost ~]# awk -f occ.awk
0 Occurred 11 times
1 Occurred 8 times
2 Occurred 9 times
3 Occurred 15 times
4 Occurred 16 times
5 Occurred 5 times
6 Occurred 8 times
7 Occurred 9 times
8 Occurred 7 times
9 Occurred 7 times
10 Occurred 11 times
11 Occurred 7 times
12 Occurred 10 times
13 Occurred 9 times
14 Occurred 6 times
15 Occurred 18 times
16 Occurred 10 times
17 Occurred 10 times
18 Occurred 9 times
19 Occurred 8 times
20 Occurred 11 times
21 Occurred 13 times
22 Occurred 10 times
23 Occurred 9 times
24 Occurred 15 times
25 Occurred 8 times
26 Occurred 3 times
27 Occurred 17 times
28 Occurred 9 times
29 Occurred 13 times
30 Occurred 11 times
31 Occurred 9 times
32 Occurred 12 times
33 Occurred 12 times
34 Occurred 9 times
35 Occurred 6 times
36 Occurred 13 times
37 Occurred 15 times
38 Occurred 6 times
39 Occurred 9 times
40 Occurred 7 times
41 Occurred 8 times
42 Occurred 6 times
43 Occurred 8 times
44 Occurred 10 times
45 Occurred 7 times
46 Occurred 10 times
47 Occurred 8 times
48 Occurred 16 times
49 Occurred 12 times
50 Occurred 6 times
51 Occurred 15 times
52 Occurred 6 times
53 Occurred 12 times
54 Occurred 8 times
55 Occurred 13 times
56 Occurred 6 times
57 Occurred 16 times
58 Occurred 5 times
59 Occurred 7 times
60 Occurred 11 times
61 Occurred 12 times
62 Occurred 14 times
63 Occurred 11 times
64 Occurred 9 times
65 Occurred 6 times
66 Occurred 7 times
67 Occurred 10 times
68 Occurred 8 times
69 Occurred 12 times
70 Occurred 13 times
71 Occurred 9 times
72 Occurred 10 times
73 Occurred 11 times
74 Occurred 7 times
75 Occurred 13 times
76 Occurred 13 times
77 Occurred 10 times
78 Occurred 5 times
79 Occurred 12 times
80 Occurred 17 times
81 Occurred 8 times
82 Occurred 7 times
83 Occurred 10 times
84 Occurred 12 times
85 Occurred 12 times
86 Occurred 11 times
87 Occurred 14 times
88 Occurred 4 times
89 Occurred 8 times
90 Occurred 15 times
91 Occurred 10 times
92 Occurred 15 times
93 Occurred 8 times
94 Occurred 11 times
95 Occurred 5 times
96 Occurred 12 times
97 Occurred 11 times
98 Occurred 7 times
99 Occurred 11 times
100 Occurred  times
Note: with 1000 draws spread over only 100 possible values, every value inevitably comes up many times, as the counts show.
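Because rand() stays below 1, scaling it to an inclusive integer range needs a "+ 1" in the multiplier. The sketch below checks that mapping, assuming a POSIX-compatible awk; the bounds lo and hi are hypothetical.

```shell
out=$(awk 'BEGIN {
    lo = 5; hi = 50                            # hypothetical inclusive bounds
    for (i = 0; i < 10000; i++) {
        n = lo + int(rand() * (hi - lo + 1))   # integer in lo..hi
        if (n < lo || n > hi) bad++
    }
    print bad + 0                              # number of out-of-range draws
}')
echo "$out"
```

Without the "+ 1", the upper bound hi could never be produced.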
srand(n) function
srand(n) initializes the random number generator with the seed n. Each time awk starts with the same seed n, rand() yields the same sequence. If n is omitted, awk uses the current time of day as the seed.
Example 2: generate 5 random numbers between 5 and 50
[root@localhost ~]# cat srand.awk
BEGIN {
    # Initialize the seed with 5.
    srand(5);
    # Generate 5 numbers in total.
    total = 5;
    # The maximum number is 50.
    max = 50;
    count = 0;
    while (count < total) {
        rnd = int(rand()*max);
        # Count only new values of at least 5, because the loop below prints from 5 up.
        if (rnd >= 5 && array[rnd] == 0) {
            count++;
            array[rnd]++;
        }
    }
    for (i = 5; i <= max; i++) {
        if (array[i]) print i;
    }
}
[root@localhost ~]# awk -f srand.awk
14
16
23
33
35
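The effect of a fixed seed can be checked directly: two separate runs seeded with the same value should print identical numbers. This is a minimal sketch assuming a POSIX-compatible awk; the actual values differ between awk implementations, so only equality is compared.

```shell
# Two independent runs with the same seed
a=$(awk 'BEGIN { srand(5); print rand(), rand(), rand() }')
b=$(awk 'BEGIN { srand(5); print rand(), rand(), rand() }')
if [ "$a" = "$b" ]; then
    echo "same seed, same sequence"
fi
```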
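Common string functions:
length function:
length([s]) returns the length of the string s; with no argument it returns the length of $0.
Example 1: the length function
[root@bash ~]# awk 'BEGIN{print length("young")}'
5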
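The optional argument can be illustrated with a short sketch, assuming a POSIX-compatible awk: length() with no argument measures the whole record, while passing a field or a literal measures just that string.

```shell
# Whole record, first field, and the empty string
out=$(echo "geek young" | awk '{ print length(), length($1), length("") }')
echo "$out"
```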
sub function:
sub(r,s,[t]) searches the string t for the pattern r (a regular expression may be used) and replaces the first match with the string s. If t is omitted, $0 is searched.
Example 1:
[root@bash ~]# awk 'BEGIN{a="geek young";sub("young","xixi",a);print a}'   # note: string literals must be quoted
geek xixi
Example 2:
[root@bash ~]# echo "geek young hahahaha"|awk '
>{sub(/\<young\>/,"xixi",$2);   # a regex literal goes between slashes, without quotes
>print $2}'
xixi
Example 3:
[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'sub(/:/,"",$1)'
200808:08:08 08:08:08
Example 4:
[root@bash ~]# cat sub.awk
BEGIN {
    state="CA is California"
    sub("C[Aa]","KA",state);
    print state;
}
[root@bash ~]# awk -f sub.awk
KA is California
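Example 3 works as a pattern because sub() returns the number of substitutions it made (1 or 0). The sketch below prints that return value and also uses & in the replacement, which stands for the matched text. It assumes a POSIX-compatible awk; the sample string is arbitrary.

```shell
out=$(awk 'BEGIN {
    s = "geek young young"
    n = sub(/young/, "[&]", s)   # & is the matched text; only the first match changes
    m = sub(/zzz/, "x", s)       # no match: returns 0 and leaves s alone
    print n, m, s
}')
echo "$out"
```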
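gsub function:
gsub(r,s,[t]) searches the string t for the pattern r (a regular expression may be used) and replaces every match with s. If t is omitted, $0 is searched.
Example 1:
[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'gsub(/:/,"",$1)'
2008080808 08:08:08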
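Like sub(), gsub() returns the number of replacements, which is handy for counting occurrences. A minimal sketch, assuming a POSIX-compatible awk and using "-" as an illustrative replacement:

```shell
# With no third argument, gsub works on the whole record $0
out=$(echo "2008:08:08:08 08:08:08" | awk '{ n = gsub(/:/, "-"); print n; print $0 }')
echo "$out"
```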
split function:
split(s,array,[r]) splits the string s on the separator r and stores the pieces in array: the first piece at index 1, the second at index 2, and so on. If r is omitted, FS is used.
Example 1:
[root@bash ~]# echo "192.168.1.1:80"|awk '
>{split($1,ip,":");
>print ip[1],"----",ip[2]}'
192.168.1.1 ---- 80
Example 2:
[root@bash ~]# netstat -tan | awk '
>/^tcp\>/{split($5,ip,":");
>count[ip[1]]++}   # use the value from one array as the index of another and increment it: a common way to count repeats
>END{for (i in count){print i,count[i]}}'
116.211.167.193 3
0.0.0.0 4
192.168.1.116 1
Example 3:
[root@bash ~]# cat items-sold1.txt
101:2,10,5,8,10,12
102:0,1,4,3,0,2
103:10,6,11,20,5,13
104:2,3,4,0,6,5
105:10,2,5,7,12,6
[root@bash ~]# cat split.awk
BEGIN {
    FS=":"
}
{
    split($2,quantity,",");
    total=0;
    for (x in quantity)
        total=total+quantity[x];
    print "Item",$1,":",total,"quantities sold";
}
[root@bash ~]# awk -f split.awk items-sold1.txt
Item 101 : 47 quantities sold
Item 102 : 10 quantities sold
Item 103 : 65 quantities sold
Item 104 : 20 quantities sold
Item 105 : 42 quantities sold
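split() also returns the number of pieces, which allows an index loop that visits them in order (a "for (x in array)" loop makes no promise about order). The sketch below assumes a POSIX-compatible awk and reuses the quantities of item 101; the array name q is illustrative.

```shell
out=$(awk 'BEGIN {
    n = split("2,10,5,8,10,12", q, ",")   # n is the number of pieces
    for (i = 1; i <= n; i++)              # indices run from 1 to n, in order
        total += q[i]
    print n, q[1], q[n], total
}')
echo "$out"
```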
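substr function
Syntax:
substr(input-string,location,length)
substr extracts a specified part (a substring) of a string. In the syntax above:
- input-string: the string that contains the substring
- location: the starting position of the substring (positions count from 1)
- length: the number of characters to extract, counting from location. This argument is optional; if it is omitted, the substring runs from location to the end of the string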
Example 1: starting at the 5th character of the line, print everything up to the end
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk '{ print substr($0,5) }' items.txt
HD Camcorder,Video,210,10
Refrigerator,Appliance,850,2
MP3 Player,Audio,270,15
Tennis Racket,Sports,190,20
Laser Printer,Office,475,5
Example 2: print 5 characters, starting at the 1st character of the 2nd field
[root@localhost ~]# awk -F"," '{ print substr($2,1,5) }' items.txt
HD Ca
Refri
MP3 P
Tenni
Laser
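The three arguments can be tried on a single string. This is a small sketch assuming a POSIX-compatible awk; the sample string is the start of the first record of items.txt.

```shell
out=$(awk 'BEGIN {
    s = "101,HD Camcorder"
    print substr(s, 5)      # from position 5 to the end
    print substr(s, 5, 2)   # two characters starting at position 5
    print substr(s, 1, 3)   # the first three characters
}')
echo "$out"
```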
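Calling shell commands
Two-way pipe |&
gawk can communicate with an external process through "|&"; the communication runs in both directions.
Example 1:
[root@localhost ~]# cat doub.awk
BEGIN {
    command = "sed 's/Awk/Sed and Awk/'"
    print "Awk is Great!" |& command
    close(command,"to");   # close the write end so that sed sees end-of-input
    command |& getline tmp
    print tmp;
    close(command);
}
[root@localhost ~]# awk -f doub.awk
Sed and Awk is Great!
Explanation:
- "|&" marks a two-way pipe: the output of the statement on its left becomes the input of the command on its right.
- close(command,"to"): once everything has been sent, close the "to" end of the pipe, the end that writes to the command.
- command |& getline tmp: now that the command has all of its input, use getline to fetch its output, which is stored in the variable "tmp".
- close(command): finally, close the command.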
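"|&" is a gawk extension. When the command's output does not have to come back into awk, an ordinary one-way pipe is enough and works in any POSIX awk. The sketch below is that simpler one-way variant, not a two-way pipe; the command "sort -n" is chosen only for illustration.

```shell
out=$(awk 'BEGIN {
    cmd = "sort -n"
    print 3 | cmd    # each print feeds the same sort process
    print 1 | cmd
    print 2 | cmd
    close(cmd)       # closing the pipe lets sort finish and write its result
}')
echo "$out"
```

Here sort writes straight to standard output; awk never reads the sorted lines, which is exactly the part that "|&" adds.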
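system function
system() takes any string and runs it as an operating-system command exactly as given. The command writes its output directly to standard output, and system() returns the command's exit status (unlike the two-way pipe, which hands the output back to awk).
Example 1:
[root@localhost ~]# awk 'BEGIN{system("hostname");}'   # no print statement is needed
localhost.localdomain
[root@localhost ~]# awk 'BEGIN{system("pwd")}'
/root
[root@localhost ~]# awk 'BEGIN{system("date")}'
Fri Jan 20 23:57:55 CST 2017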
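The value returned by system() can be captured and tested. A minimal sketch, assuming gawk or another awk that reports the plain exit status; the command "exit 3" is only a stand-in for a failing command.

```shell
out=$(awk 'BEGIN {
    rc = system("exit 3")   # the shell runs the command; rc is its exit status
    print "status:", rc
}')
echo "$out"
```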
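getline function
The getline command lets you control how awk reads data from the input file (or from other files). Note that a plain getline, once it completes, sets the built-in variables $0, NF, NR and FNR.
Example 1:
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -F"," '
>{getline;print $0;}' items.txt   # changes the awk flow much like the n command in sed
102,Refrigerator,Appliance,850,2
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
- When the body block starts, before any statement runs, awk reads the first line of items.txt into $0
- getline: forces awk to read the next line into $0, overwriting the previous contents
- print $0: $0 now holds the second line, so print $0 prints the second line of the file, not the first
- The body block keeps running this way and prints only the even-numbered lines. (The last line, 105, is printed as well: the final getline finds no more input and leaves $0 unchanged.)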
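The stray last line can be avoided by checking the return value of getline, which is 1 when a line was read and 0 at end of input. The sketch below assumes a POSIX-compatible awk and uses five hypothetical input lines, l1 to l5.

```shell
# Print only when getline really read another line
out=$(printf 'l1\nl2\nl3\nl4\nl5\n' | awk '{ if ((getline) > 0) print $0 }')
echo "$out"
```

With an odd number of lines, the last record has no partner, so nothing is printed for it.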
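Besides placing the line in $0, getline can store it in a variable.
Example 2: print the odd-numbered lines
[root@localhost ~]# awk -F"," '{getline tmp; print $0;}' items.txt
101,HD Camcorder,Video,210,10
103,MP3 Player,Audio,270,15
105,Laser Printer,Office,475,5
Explanation:
- When the body block starts, before any statement runs, awk reads the first line of items.txt into $0
- getline tmp: forces awk to read the next line and store it in the variable tmp
- print $0: $0 still holds the first line, because getline tmp does not overwrite $0, so the first line is printed, not the second
- The body block keeps running this way and prints only the odd-numbered lines.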
Example 3: use getline to read a line from another file into a variable
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# cat items-sold.txt
101 2 10 5 8 10 12
102 0 1 4 3 0 2
103 10 6 11 20 5 13
104 2 3 4 0 6 5
105 10 2 5 7 12 6
[root@localhost ~]# awk -F"," '{
>print $0;
>getline tmp < "items-sold.txt";
>print tmp;}' items.txt
101,HD Camcorder,Video,210,10
101 2 10 5 8 10 12
102,Refrigerator,Appliance,850,2
102 0 1 4 3 0 2
103,MP3 Player,Audio,270,15
103 10 6 11 20 5 13
104,Tennis Racket,Sports,190,20
104 2 3 4 0 6 5
105,Laser Printer,Office,475,5
105 10 2 5 7 12 6
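To read a whole side file rather than one line per record, the usual idiom is a while loop on the return value of getline. This is a sketch assuming a POSIX-compatible awk; the file name extra.txt and its contents are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/extra.txt"
out=$(awk -v f="$tmp/extra.txt" 'BEGIN {
    # getline returns 1 per line read, 0 at end of file, -1 on error
    while ((getline line < f) > 0) {
        n++
        all = all line
    }
    close(f)          # allows the file to be read again from the start later
    print n, all
}')
rm -r "$tmp"
echo "$out"
```

Comparing with "> 0" matters: a bare "while (getline line < f)" would loop forever if the file could not be opened, because -1 counts as true.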
Example 4: use getline to run an external command
[root@localhost ~]# cat get.awk
BEGIN {
    FS=",";
    "date" | getline
    close("date")
    print "Timestamp:" $0
}
{
    if ( $5 <= 5)
        print "Buy More:Order",$2,"immediately!"
    else
        print "Sell More:Give discount on",$2,"immediately!"
}
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -f get.awk items.txt
Timestamp:Sat Jan 21 00:23:53 CST 2017
Sell More:Give discount on HD Camcorder immediately!
Buy More:Order Refrigerator immediately!
Sell More:Give discount on MP3 Player immediately!
Sell More:Give discount on Tennis Racket immediately!
Buy More:Order Laser Printer immediately!
Example 5: instead of $0, the command output can be stored in any awk variable
[root@localhost ~]# cat get2.awk
BEGIN {
    FS=",";
    "date" | getline timestamp
    close("date")
    print "Timestamp:" timestamp
}
{
    if ( $5 <= 5)
        print "Buy More: Order",$2,"immediately!"
    else
        print "Sell More: Give discount on",$2,"immediately!"
}
[root@localhost ~]# awk -f get2.awk items.txt
Timestamp:Sat Jan 21 00:26:29 CST 2017
Sell More: Give discount on HD Camcorder immediately!
Buy More: Order Refrigerator immediately!
Sell More: Give discount on MP3 Player immediately!
Sell More: Give discount on Tennis Racket immediately!
Buy More: Order Laser Printer immediately!
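The command form of getline returns a status as well, so the result can be guarded before use. A minimal sketch, assuming a POSIX-compatible awk; "echo hello" stands in for a real command such as date, and the variable name reply is illustrative.

```shell
out=$(awk 'BEGIN {
    cmd = "echo hello"
    # Use the output only if a line was actually read
    if ((cmd | getline reply) > 0)
        print "got:", reply
    close(cmd)
}')
echo "$out"
```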
awk user-defined functions
Format:
function name ( parameter, parameter, ... ) {
    statements
    return expression
}
Example 1:
[root@localhost ~]# cat fun.awk
function max(v1,v2) {
    var = (v1 > v2) ? v1 : v2
    return var
}
BEGIN{a=3;b=2;print max(a,b)}
[root@localhost ~]# awk -f fun.awk
3
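Variables assigned inside an awk function are global unless they are listed as parameters, so a common convention is to declare extra, unused parameters as locals. The sketch below shows that convention, assuming a POSIX-compatible awk; the helper name maxof and the sample numbers are hypothetical.

```shell
out=$(awk '
function max(a, b) { return (a > b) ? a : b }
# The extra parameters i and m act as local variables; callers omit them.
function maxof(arr, n,    i, m) {
    m = arr[1]
    for (i = 2; i <= n; i++)
        m = max(m, arr[i])
    return m
}
BEGIN {
    n = split("3 9 2 7", v, " ")
    # m was local to maxof, so it is still unset here
    print maxof(v, n), (m == "" ? "m is unset outside" : "m leaked")
}')
echo "$out"
```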
Permanent link to this article: http://www.linuxidc.com/Linux/2017-02/140274.htm