Linux Wildcards and Regular Expressions

2020-06-16 17:14:15

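Wildcards

* matches any characters, repeated any number of times (including zero)

? matches any single character, exactly once
[] matches a single character from the set inside the brackets

Example: [abc] matches any one of a, b, or c (write it without commas, since a comma inside the brackets would itself be matched literally)

Wildcards are used to match file names.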
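The three wildcards can be tried out safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the file names ab, ac, and ad are made up for illustration and do not come from the session shown later.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch aa aab aabb ab ac ad

star=$(echo aa*)        # * matches zero or more characters
question=$(echo aa?)    # ? matches exactly one character
bracket=$(echo a[bc])   # [bc] matches one character: b or c

echo "aa*   -> $star"
echo "aa?   -> $question"
echo "a[bc] -> $bracket"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

The shell expands each pattern into the matching file names before echo ever runs, which is why echo is enough to see the result.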

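Regular expressions

Regular expressions match strings that satisfy a pattern inside a file's contents.

ls and cp do not support regular expressions, and the -name test of find also takes a wildcard pattern rather than a regular expression.

grep, awk, and sed, on the other hand, do support regular expressions.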
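As a small illustration that grep, awk, and sed all read the same pattern as a regular expression, here is a sketch that runs one anchored pattern through each tool. The three sample lines are invented stand-ins for /etc/passwd entries.

```shell
# Invented sample data, shaped like shortened /etc/passwd lines.
sample='root:x:0:0
bin:x:1:1
wang:x:501:501'

# The same regular expression, ^r (line starts with r), in three tools.
with_grep=$(printf '%s\n' "$sample" | grep '^r')
with_awk=$(printf '%s\n' "$sample" | awk '/^r/')
with_sed=$(printf '%s\n' "$sample" | sed -n '/^r/p')

echo "grep: $with_grep"
echo "awk:  $with_awk"
echo "sed:  $with_sed"
```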

[root@Hadoop-bigdata01 test]# touch aa
[root@hadoop-bigdata01 test]# touch aab aabb
[root@hadoop-bigdata01 test]# ll
total 0
-rw-r--r-- 1 root root 0 May 16 19:47 aa
-rw-r--r-- 1 root root 0 May 16 19:47 aab
-rw-r--r-- 1 root root 0 May 16 19:47 aabb
[root@hadoop-bigdata01 test]# ls aa
aa
[root@hadoop-bigdata01 test]# ls aa?
aab
[root@hadoop-bigdata01 test]# ls aa*
aa  aab  aabb

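Regular expression special characters

Regular expression matching ranges

Regular expression standard characters

Using regular expressions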

grep "1" /etc/passwd

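This prints the lines that contain the keyword 1. grep only needs the line to contain the pattern somewhere, unlike a wildcard, which has to match the whole name.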

[root@hadoop-bigdata01 test]# grep "1" /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
wang:x:501:501::/home/wang:/bin/bash
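The "contains is enough" behaviour can be compared with a whole-line match using the standard -x option of grep. This is only a sketch on invented sample lines, not on the real /etc/passwd.

```shell
# Invented sample lines; the first one is exactly "1".
sample_lines() {
    printf '%s\n' '1' 'bin:x:1:1:bin' 'root:x:0:0:root'
}

# Default: a line matches if the pattern occurs anywhere in it.
contains=$(sample_lines | grep -c '1')

# -x: the pattern must match the entire line.
whole_line=$(sample_lines | grep -c -x '1')

echo "lines containing 1: $contains"
echo "lines equal to 1:   $whole_line"
```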

grep 'root' /etc/passwd

cat /etc/passwd | grep 'root'

Both give the same result, but the pipe version uses more resources, because it starts an extra cat process.
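A quick way to convince yourself that the two forms are equivalent is to compare their output on a scratch file. This sketch assumes mktemp is available and uses two sample lines instead of the real /etc/passwd.

```shell
# Scratch file with two passwd-style sample lines.
tmpfile=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'bin:x:1:1:bin:/bin:/sbin/nologin' > "$tmpfile"

# grep reads the file itself: one process.
direct=$(grep 'root' "$tmpfile")

# cat feeds grep through a pipe: two processes, same result.
piped=$(cat "$tmpfile" | grep 'root')

if [ "$direct" = "$piped" ]; then
    echo "same output: $direct"
fi

rm -f "$tmpfile"
```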

Some examples:

1. Match lines that contain a digit

grep '[0-9]' /etc/passwd

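2. Match lines that contain three consecutive digits

grep '[0-9][0-9][0-9]' /etc/passwd, or grep ':[0-9][0-9][0-9]:' /etc/passwd to match only a colon-delimited field of exactly three digits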

[root@hadoop-bigdata01 test]# grep '[0-9][0-9][0-9]'  /etc/passwd
games:x:12:100:games:/usr/games:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
saslauth:x:498:76:"Saslauthd user":/var/empty/saslauth:/sbin/nologin
pulse:x:497:496:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
liucheng:x:500:500::/home/liucheng:/bin/bash
wang:x:501:501::/home/wang:/bin/bas
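Writing [0-9] three times can also be expressed with a repetition count. The sketch below compares the spelled-out pattern, the interval form \{3\} of basic regular expressions, and the colon-delimited variant, on three sample lines copied from the listings above.

```shell
sample_lines() {
    printf '%s\n' \
        'games:x:12:100:games:/usr/games:/sbin/nologin' \
        'bin:x:1:1:bin:/bin:/sbin/nologin' \
        'nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin'
}

# Three digits in a row anywhere in the line (65534 qualifies too).
spelled_out=$(sample_lines | grep -c '[0-9][0-9][0-9]')

# Same meaning, written with an interval expression.
interval=$(sample_lines | grep -c '[0-9]\{3\}')

# Exactly three digits between two colons: only the 100 field qualifies.
field_only=$(sample_lines | grep ':[0-9]\{3\}:')

echo "spelled out: $spelled_out line(s)"
echo "interval:    $interval line(s)"
echo "field only:  $field_only"
```

The colon-delimited form skips nfsnobody because 65534 has five digits between its colons.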

3. Match lines that start with r and end with n

grep '^r.*n$' /etc/passwd

.* matches any sequence of characters: . is any single character and * repeats it zero or more times

[root@hadoop-bigdata01 test]# grep '^r.*n$'  /etc/passwd               
rpc:x:32:32:Rpcbind Daemon:/var/cache/rpcbind:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
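To see what the two anchors contribute, the pattern can be run on a few sample lines where only one of the conditions holds. This is a sketch; the two-character line rn is an invented edge case showing that .* may match nothing at all.

```shell
sample_lines() {
    printf '%s\n' \
        'rpc:x:32:32:Rpcbind Daemon:/var/cache/rpcbind:/sbin/nologin' \
        'root:x:0:0:root:/root:/bin/bash' \
        'bin:x:1:1:bin:/bin:/sbin/nologin' \
        'rn'
}

# ^r : line starts with r;  n$ : line ends with n;  .* : anything in between.
matches=$(sample_lines | grep '^r.*n$')
count=$(sample_lines | grep -c '^r.*n$')

echo "$matches"
echo "matched $count line(s)"
```

The root line starts with r but ends with h, and the bin line ends with n but starts with b, so neither is printed.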

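4. Filter the ifconfig output and extract the IP address

grep -v inverts the match, that is, it drops the lines containing the given keyword. sed performs substitution.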

[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:'
          inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
          inet addr:127.0.0.1  Mask:255.0.0.0
[root@hadoop-bigdata01 test]# 
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1'
          inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | sed 's/inet addr://g'
          192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | sed 's/inet addr://g' | sed 's/Bcast.*//g'
          192.168.126.191
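The same extraction can be written a little more tightly by letting sed capture just the address. The sketch below runs on a hard-coded sample in the old net-tools ifconfig format shown above (newer systems print a different layout, so the pattern would need adjusting there); the function name sample_ifconfig is made up for this example.

```shell
# Hypothetical stand-in for real ifconfig output, in the format shown above.
sample_ifconfig() {
cat <<'EOF'
eth0      Link encap:Ethernet
          inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
EOF
}

# 1. keep the address lines, 2. drop loopback (dots escaped so they are literal),
# 3. replace the whole line with the captured run of digits and dots.
ip=$(sample_ifconfig \
    | grep 'inet addr:' \
    | grep -v '127\.0\.0\.1' \
    | sed 's/.*inet addr:\([0-9.]*\).*/\1/')

echo "$ip"
```

Escaping the dots matters because an unescaped . in a regular expression matches any character, not just a literal dot.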

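A common confusion

There is a trap here that took me a long time to work out: the difference between regular expressions and wildcards.

As a wildcard, * means any characters, repeated any number of times. In a regular expression, * means the preceding character matched zero or more times.

These are completely different, so how do I know whether the * I typed is a wildcard or a regular expression?

At first I got this wrong. Look at the following commands: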

[root@hadoop-bigdata01 test]# touch ac aac abc abbc
[root@hadoop-bigdata01 test]# ll
total 0
-rw-r--r-- 1 root root 0 May 16 19:55 aac
-rw-r--r-- 1 root root 0 May 16 19:55 abbc
-rw-r--r-- 1 root root 0 May 16 19:55 abc
-rw-r--r-- 1 root root 0 May 16 19:55 ac
[root@hadoop-bigdata01 test]# ls | grep 'a*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep 'a.*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep '^a.*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep '^a*c' 
aac
ac

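Why do grep 'a*c' and grep '^a*c' give different results? I assumed one was a wildcard and the other a regular expression, because the four results shown for a*c look exactly like "a, then any number of characters, then c".

That is not what is happening.

Wildcards are used to match file names.

Regular expressions match strings that satisfy a pattern inside text.

Once the listing is piped into grep, nothing is matching file names any more. grep is filtering lines of text, so the pattern is purely a regular expression.

grep 'a*c' matches zero or more a followed by c, so any line that contains c matches.

grep '^a*c' is also a regular expression: the line must begin with zero or more a, immediately followed by c.

So only aac and ac match.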
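The difference becomes visible once the directory also holds a name that contains c but does not start with a. The sketch below recreates the four files from the session in a scratch directory and adds one invented file, xc, for exactly that purpose.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch ac aac abc abbc xc    # xc is an extra, made-up file

# Wildcard: expanded by the shell; the name must start with a and end with c.
as_glob=$(echo a*c)

# Regular expression: zero or more a, then c, anywhere in the line.
as_regex=$(ls | grep 'a*c' | xargs)

# Anchored regular expression: the whole line is only a's followed by one c.
anchored=$(ls | grep '^a*c$' | xargs)

# The regular expression that behaves like the wildcard a*c.
like_glob=$(ls | grep '^a.*c$' | xargs)

echo "glob  a*c     -> $as_glob"
echo "regex a*c     -> $as_regex"
echo "regex ^a*c\$   -> $anchored"
echo "regex ^a.*c\$  -> $like_glob"

cd / && rm -rf "$tmpdir"
```

The unanchored regular expression picks up xc while the wildcard does not, which shows that the two a*c patterns only looked alike on the original four files.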

Now look at this example:

[root@hadoop-bigdata01 test]# ls
a  aac  abb  abbc  abc  ac  b  bb  c  cb
[root@hadoop-bigdata01 test]# ls | grep 'a*b'
abb
abbc
abc
b
bb
cb

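Here grep 'a*b' does not mean "contains a and b". It means a repeated zero or more times, followed by b, so every name that contains b matches.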
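If the intent really is "an a, and somewhere after it a b", the pattern needs .* between the two letters. The following sketch feeds the same ten names to both patterns without creating any files; the helper name list_names is made up for this example.

```shell
# The ten names from the listing above, one per line.
list_names() {
    printf '%s\n' a aac abb abbc abc ac b bb c cb
}

# Zero or more a, then b: effectively "contains b".
star_b=$(list_names | grep 'a*b' | xargs)

# An a, then any characters, then b.
a_then_b=$(list_names | grep 'a.*b' | xargs)

echo "a*b  -> $star_b"
echo "a.*b -> $a_then_b"
```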

Permanent link to this article: http://www.linuxidc.com/Linux/2017-05/144126.htm
