2021-05-12 14:32:11
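The Role of Linux Pipes: Using Pipeline Commands in Scripts
Most administrative files on UNIX are plain text files that can be edited, printed, and read without any special-purpose tools.
Most of these files live in the standard directory /etc. For example:
The familiar password and group files: (passwd, group)
File system mount tables: (fstab, vfstab)
The hosts file: (hosts)
The default shell startup file: (profile)
Shell scripts for system startup and shutdown: (stored in the subdirectory trees rc0.d, rc1.d ... rc6.d)
Extracting data from structured text files
Exercise 1: extract the first and seventh fields of passwd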
[linuxidc@test ~]$ vi patest.sh
#!/bin/bash
umask 077
PERSON=/tmp/pd.key.person.$$
OFFICE=/tmp/pd.key.office.$$
TELEPHONE=/tmp/pd.key.telephone.$$
USER=/tmp/pd.key.user.$$
trap "exit 1" HUP INT PIPE QUIT TERM
trap "rm -f $PERSON $OFFICE $TELEPHONE $USER " EXIT
awk -F: '{ print $1 ":" $7 }' /etc/passwd > $USER
awk -F: '{ print $1}' < $USER | sort >$PERSON
sed -e 's=^\([^:]*\):[^/]*/\([^/]*\)/.*$=\1:\2=' < $USER | sort >$OFFICE
sed -e 's=^\([^:]*\):[^/]*/[^/]*/\([^/]*\)=\1:\2=' < $USER | sort >$TELEPHONE
join -t: $PERSON $OFFICE |
join -t: - $TELEPHONE |
sort -t: -k1,1 -k2,2 -k3,3 |
awk -F: '{ printf("%-39s\t%s\t%s\n", $1,$2,$3) }'
[linuxidc@test ~]$ chmod +x patest.sh
[linuxidc@test ~]$ bash +x patest.sh
adm sbin nologin
alert2system bin bash
alert2systemtest bin bash
avahi sbin nologin
bin sbin nologin
cvsroot bin bash
dbus sbin nologin
dovecot sbin nologin
ftp sbin nologin
ftpuser bin bash
games sbin nologin
gdm sbin nologin
git_test usr local/git/bin/git-shell
gopher sbin nologin
gup sbin nologin
....
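Exercise 2: suppose the fifth field of /etc/passwd holds the name, office number, telephone number, and so on,
as in the file below. Build an office directory from it.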
[linuxidc@test ~]$ vi passwd1
gz_willwu:x:843:843:Will wu/SN091/555-6728:/home/gz_willwu:/bin/bash
ninf_thomaschan:x:853:853:Thomas chan/INF002/554-4565:/home/sninf_thomaschan:/bin/bash
llwu:x:843:843:Will wu/SN091/555-6728:/home/gz_willwu:/bin/bash
sninf_thomaschan:x:853:853:Thomas chan/INF002/554-4565:/home/sninf_thomaschan:/bin/bash
sninf_tonyhung:x:856:856:Tonny huang/HK0501/553-6465:/home/sninf_tonyhung:/bin/bash
gz_kinma:x:857:857:Kin ma/SN021/555-6733:/home/gz_kinma:/bin/bash
linuxidc:x:859:859:Field yang/SN001/555-6765:/home/linuxidc:/bin/bash
gz_hilwu:x:843:843:hil wu/SN021/555-6744:/home/gz_willwu:/bin/bash
Step-by-step walkthrough:
①[linuxidc@test ~]$ awk -F: '{ print $1 ":" $5 }' passwd1 |
> sed -e 's=/.*==' -e 's=^\([^:]*\):\(.*\) \([^ ]*\)=\1:\3, \2='
ninf_thomaschan:chan, Thomas
llwu:wu, Will
sninf_thomaschan:chan, Thomas
sninf_tonyhung:huang, Tonny
gz_kinma:ma, Kin
linuxidc:yang, Field
gz_willwu:wu, Will
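# ^\([^:]*\) matches the username field, e.g. gz_willwu
# \(.*\)□ matches the text up to the last space (□ stands for a space), e.g. Will
# \([^ ]*\) matches the remaining run of non-space characters, e.g. wu
# \1:\3, \2 writes the text captured by the first group, a colon, the third group, a comma and a space, then the second group
# The result looks like sninf_thomaschan:chan, Thomas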
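The back-references described above can be checked on a single record. This is a minimal sketch, assuming a sed that supports POSIX basic regular expressions; the helper function name_key is hypothetical and is not part of the original script.

```shell
# name_key is a hypothetical helper used only for this illustration.
# It reads "user:First Last/office/phone" lines and prints "user:Last, First".
name_key() {
    sed -e 's=/.*==' \
        -e 's=^\([^:]*\):\(.*\) \([^ ]*\)=\1:\3, \2='
}

printf '%s\n' 'gz_willwu:Will wu/SN091/555-6728' | name_key
# prints: gz_willwu:wu, Will
```

The first expression strips the office and telephone parts, so the second one only has to split the remaining name at its last space.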
②[linuxidc@test ~]$ awk -F: '{ print $1 ":" $5 }' passwd1 |
> sed -e 's=^\([^:]*\):[^/]*/\([^/]*\)/.*$=\1:\2='
ninf_thomaschan:INF002
llwu:SN091
sninf_thomaschan:INF002
sninf_tonyhung:HK0501
gz_kinma:SN021
linuxidc:SN001
gz_willwu:SN091
③[linuxidc@test ~]$ awk -F: '{ print $1 ":" $5 }' passwd1 |
> sed -e 's=^\([^:]*\):[^/]*/[^/]*/\([^/]*\)=\1:\2='
ninf_thomaschan:554-4565
llwu:555-6728
sninf_thomaschan:554-4565
sninf_tonyhung:553-6465
gz_kinma:555-6733
linuxidc:555-6765
gz_willwu:555-6728
The complete script that builds the office directory:
[linuxidc@test ~]$ vi patest2.sh
#!/bin/bash
# Filter an input stream formatted like /etc/passwd
# and derive an office directory from that data.
#
#
umask 077
PERSON=/tmp/pd.key.person.$$
OFFICE=/tmp/pd.key.office.$$
TELEPHONE=/tmp/pd.key.telephone.$$
USER=/tmp/pd.key.user.$$
trap "exit 1" HUP INT PIPE QUIT TERM
trap "rm -f $PERSON $OFFICE $TELEPHONE $USER " EXIT
awk -F: '{ print $1 ":" $5 }' passwd1 > $USER
# s=/.*== deletes everything from the first / to the end of the line, leaving e.g. gz_willwu:Will wu
sed -e 's=/.*==' \
-e 's=^\([^:]*\):\(.*\) \([^ ]*\)=\1:\3, \2=' < $USER | sort >$PERSON
sed -e 's=^\([^:]*\):[^/]*/\([^/]*\)/.*$=\1:\2=' < $USER | sort >$OFFICE
sed -e 's=^\([^:]*\):[^/]*/[^/]*/\([^/]*\)=\1:\2=' < $USER | sort >$TELEPHONE
join -t: $PERSON $OFFICE |
# combine the personal information with the office location
join -t: - $TELEPHONE |
# add the telephone number
cut -d: -f 2- |
# drop the key: cut keeps field 2 through the end of the line
sort -t: -k1,1 -k2,2 -k3,3 |
# with : as the field separator, sort on fields 1, 2 and 3 in turn
awk -F: '{ printf("%-39s\t%s\t%s\n", $1,$2,$3) }'
# reformat the output
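To see how the join steps merge the three keyed files, here is a minimal sketch on two records taken from passwd1. The function join_demo and its temporary directory are hypothetical and exist only for this illustration; it assumes mktemp accepts a template argument, as the GNU and BSD versions do.

```shell
# join_demo is a hypothetical function; the sample records come from passwd1.
join_demo() {
    dir=$(mktemp -d "${TMPDIR:-/tmp}/joindemo.XXXXXX") || return 1

    # Each file is keyed on the username and already sorted on that key.
    printf '%s\n' 'gz_kinma:ma, Kin'  'linuxidc:yang, Field' > "$dir/person"
    printf '%s\n' 'gz_kinma:SN021'    'linuxidc:SN001'       > "$dir/office"
    printf '%s\n' 'gz_kinma:555-6733' 'linuxidc:555-6765'    > "$dir/telephone"

    # Merge on the first field (the username), then drop that key.
    join -t: "$dir/person" "$dir/office" |
        join -t: - "$dir/telephone" |
        cut -d: -f 2-

    rm -rf "$dir"
}

join_demo
# prints:
# ma, Kin:SN021:555-6733
# yang, Field:SN001:555-6765
```

join expects both inputs to be sorted on the join field, which is why the script pipes each intermediate file through sort before writing it.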
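Appendix:
$# is the number of arguments passed to the script
$0 is the name of the script itself
$1 is the first argument passed to the shell script
$2 is the second argument passed to the shell script
$@ is the list of all arguments passed to the script
$* presents all arguments passed to the script as a single string; unlike the positional parameters, it can hold more than nine arguments
$$ is the process ID of the shell running the script
$? is the exit status of the last command; 0 means success, any other value indicates an error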
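The parameters listed above can be tried out in a few lines. This is a small sketch; the function show_params is hypothetical, and it relies on the fact that inside a shell function $#, $1, $@ and $* refer to the function's own arguments in the same way they refer to a script's arguments.

```shell
# show_params is a hypothetical function used only for illustration.
show_params() {
    echo "count: $#"
    echo "first: $1"
    for arg in "$@"; do      # "$@" keeps each argument as a separate word
        echo "arg: $arg"
    done
    echo "joined: $*"        # "$*" joins all arguments into one string
}

show_params alpha "beta gamma"
# prints:
# count: 2
# first: alpha
# arg: alpha
# arg: beta gamma
# joined: alpha beta gamma

false || echo "status: $?"   # prints: status: 1
```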
[linuxidc@test ~]$ ./patest2.sh
chan, Thomas INF002 554-4565
chan, Thomas INF002 554-4565
huang, Tonny HK0501 553-6465
ma, Kin SN021 555-6733
wu, hil SN021 555-6744
wu, Will SN091 555-6728
yang, Field SN001 555-6765
[linuxidc@test ~]$
Exercise 3: create a script that finds words matching a given pattern
[linuxidc@test ~]$ vi puzzle-help.sh
#!/bin/bash
# Match a pattern against a collection of word lists
# Usage: ./puzzle-help.sh egrep-pattern [word-list-file]
FILES="/usr/share/dict/words
/usr/dict/words
/usr/share/lib/dict/words
/usr/local/share/dict/words.biology
/usr/local/share/dict/words.chemistry
/usr/local/share/dict/words.general
/usr/local/share/dict/words.knuth
/usr/local/share/dict/words.latin
/usr/local/share/dict/words.manpages
/usr/local/share/dict/words.mathematics
/usr/local/share/dict/words.physics
/usr/local/share/dict/words.roget
/usr/local/share/dict/words.sciences
/usr/local/share/dict/words.UNIX
/usr/local/share/dict/words.webster
"
# The FILES variable holds the built-in list of word-list files; each site can customize it
pattern="$1"
egrep -h -i "$pattern" $FILES 2>/dev/null | sort -u -f
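# grep -h: do not print file names in the results; -i: ignore case
# sort -u: keep only unique records, discarding all others with the same key
# sort -f: ignore case when sorting, treating all letters as uppercase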
①[linuxidc@test ~]$ ./puzzle-help.sh '^b.....[xz]...$' | fmt
Babelizing bamboozled bamboozler bamboozles baronizing Bellinzona
Belshazzar bigamizing bilharzial Birobizhan botanizing Brontozoum
Buitenzorg bulldozers bulldozing
# Matches words that start with b, followed by any five characters, then x or z, then any three more characters
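The same pattern can be tried without any dictionary files by feeding a few words on standard input. This is a minimal sketch; the helper match_words is hypothetical, and it uses grep -E, the POSIX spelling of egrep.

```shell
# match_words is a hypothetical helper: it filters words on standard input
# with an extended regular expression, ignoring case as the script does.
match_words() {
    grep -E -i "$1"
}

printf '%s\n' bamboozled banana bulldozers buzz | match_words '^b.....[xz]...$'
# prints:
# bamboozled
# bulldozers
```

Because the pattern is anchored with ^ and $, only ten-letter words can match: one b, five arbitrary characters, an x or z, and three more characters.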
②[linuxidc@test ~]$ ./puzzle-help.sh '[^aeiouy]{7}' /usr/dict/words |fmt
2,4,5-t A.M.D.G. arch-christendom arch-christianity A.R.C.S.
branch-strewn B.R.C.S. bright-striped drought-stricken earth-sprung
earth-strewn first-string K.C.M.G. latch-string light-spreading
light-struck Llanfairpwllgwyngyll night-straying night-struck
Nuits-St-Georges pgnttrp R.C.M.P. rock-'n'-roll R.S.V.P. scritch-scratch
scritch-scratching strength-bringing substrstrata thought-straining
tight-stretched tsktsks witch-stricken witch-struck world-schooled
world-spread world-strange world-thrilling
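# Finds words containing seven consecutive characters that are not vowels (mostly consonants; punctuation such as . and - also counts)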
[linuxidc@test ~]$ ./puzzle-help.sh '[^aeiouy]{8}' /usr/dict/words |fmt
B.R.C.S. K.C.M.G. R.C.M.P. rock-'n'-roll R.S.V.P.
③[linuxidc@test ~]$ ./puzzle-help.sh '[aeiouy]{6}' /usr/dict/words |fmt
AAAAAA euouae
# Finds words containing six consecutive vowels
[linuxidc@test ~]$ ./puzzle-help.sh '[aeiouy]{5}' /usr/dict/words |fmt
AAAAAA Aeaea Aeaean AIEEE ayuyu Bayeau Blueeye cadiueio Chaouia cooeeing
cooeyed cooeying euouae fooyoung gayyou Guauaenok Iyeyasu Jayuya
Liaoyang Mayeye miaoued miaouing Pauiie queueing Taiyuan taoiya theyaou
trans-Paraguayian ukiyoye Waiyeung
[linuxidc@test ~]$
Exercise 4: create a script that acts as a word-frequency filter
[linuxidc@test ~]$ vi wf.sh
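#!/bin/bash
# Read a text stream from standard input and output a list of the n most frequent words,
# each with its frequency count, sorted by count from highest to lowest.
# Output goes to standard output.
# Usage: ./wf.sh [n] < file
#
tr -cs A-Za-z\' '\n' |
# Replace each run of characters other than letters and apostrophes with a newline
# (-c complements the set, -s squeezes repeated newlines into one)
tr A-Z a-z |
sort |
uniq -c |
# Collapse duplicates and prefix each word with its count
sort -k1,1nr -k2 |
# Sort by count from highest to lowest, then by word in ascending order
# sort -k: define the sort key, i.e. which field to sort on
# sort -n: sort by numeric value
# sort -r: reverse the order, highest first
# sort -k1,1nr: the key starts at the beginning of field 1 and ends at the end of field 1, compared numerically in reverse
sed ${1:-25}q
# Print the first n lines, 25 by default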
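The pipeline is easier to follow on a one-line input. This is a minimal sketch that wraps the same stages in a hypothetical function word_freq so that it can be called directly from a shell session.

```shell
# word_freq is a hypothetical function wrapping the same pipeline as wf.sh.
word_freq() {
    tr -cs "A-Za-z'" '\n' |      # turn every run of other characters into one newline
        tr A-Z a-z |             # fold everything to lowercase
        sort |
        uniq -c |                # collapse duplicates, prefixing a count
        sort -k1,1nr -k2 |       # by count descending, then by word
        sed "${1:-25}q"          # keep the first n lines (25 by default)
}

printf '%s\n' 'The cat and the hat and the bat' | word_freq 3
# lists 3 the, 2 and, 1 bat, one per line
# (the column padding before each count depends on the uniq implementation)
```

The words cat and hat also occur once, but bat sorts first among the ties because the second sort key orders equal counts alphabetically.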
[linuxidc@test ~]$ vi test # create a test file from any passage of text
Patent interference cases are historically rare; but they’ve become basically
non-existent since a change in the patent law in 2013. Today, patents are
awarded on a “first to file” basis. However, prior to 2013, patents were granted
on a “first to invent” basis, meaning whoever could prove they invented the idea
first would have rights to the patent. Since Doudna’s and Zhang’s patents were filed
before the switch went into effect, the case falls under the “first to invent” standard.
In the past, patent interference cases like this were concluded within a year,
Sherkow said, but given the value of this patent, it seems more than likely that
the losing party will appeal the decision. That process could stretch out for years.
Test examples:
① Formatted output with the default settings
[linuxidc@test ~]$ ./wf.sh < test | pr -c4 -t -w80
10 the 3 were 2 interferenc 2 they
5 patent 2 are 2 invent 2 this
5 to 2 basis 2 on 1 and
4 a 2 but 2 s 1 appeal
4 first 2 cases 2 since 1 awarded
3 in 2 could 2 that 1 basically
3 patents
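# pr -cn: produce n-column output; can be abbreviated to -n
# pr -t: omit the page header
# pr -wn: at most n characters per line
② Take the first 12 lines and format the output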
[linuxidc@test ~]$ ./wf.sh 12 < test | pr -c4 -t -w80
10 the 4 a 3 patents 2 basis
5 patent 4 first 3 were 2 but
5 to 3 in 2 are 2 cases
③ Count how many distinct words appear once duplicates are collapsed
[linuxidc@test ~]$ ./wf.sh 9999 < test | wc -l
82
[linuxidc@test ~]$ ./wf.sh 9999 < test | wc -w
164
[linuxidc@test ~]$ ./wf.sh 999 < test | wc -c
1153
# wc -l: count lines; -c: count bytes; -w: count words
④ Show the least frequent words
[linuxidc@test ~]$ ./wf.sh 999 < test | tail -n -12 | pr -c4 -t -w80
1 today 1 ve 1 will 1 year
1 under 1 went 1 within 1 years
1 value 1 whoever 1 would 1 zhang
⑤ Count the words that occur exactly once in the test file
[linuxidc@test ~]$ ./wf.sh 999 < test | grep -c '^ *1.'
62
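# The . after the digit 1 stands for a tab character. Typed as a literal dot it matches any character, so a count such as 10 is matched as well. The argument 999 is arbitrary: any number larger than the number of words in the file will do
# grep -c: print the count of matching lines for each file
⑥ Count the frequently occurring core words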
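Counting singletons by comparing the count field numerically avoids depending on which separator uniq -c writes after the count. This is a sketch of that alternative, not the article's method; the helper count_singletons is hypothetical.

```shell
# count_singletons is a hypothetical helper: it counts input lines whose first
# field (the count written by uniq -c) is exactly 1.
count_singletons() {
    awk '$1 == 1 { n++ } END { print n + 0 }'
}

printf '%s\n' '     10 the' '      1 today' '      1 ve' | count_singletons
# prints: 2

printf '%s\n' '     10 the' '      1 today' '      1 ve' | grep -c '^ *1.'
# prints: 3, because the dot also matches the 0 in 10
```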
[linuxidc@test ~]$ ./wf.sh 999 < test | awk '$1 >=3' | wc -l
8
[linuxidc@test ~]$
Permanent link to this article: http://www.linuxidc.com/Linux/2016-04/130078.htm