Recognizing Credit Card Numbers with Python and OpenCV: A Detailed Tutorial

2021-11-19 19:01:31

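OCR via template matching with OpenCV

In this section, we will implement a template matching algorithm in Python + OpenCV to automatically recognize credit card digits.

To accomplish this, we need to apply a number of image processing operations, including thresholding, computing a gradient magnitude representation, morphological operations, and contour extraction. These techniques have been used in other blog posts to detect barcodes in images and to recognize the machine-readable zone in passport images.

Because many image processing operations are applied to help us detect and extract the credit card digits, I have included a number of intermediate screenshots of the input image as it passes through our image processing pipeline.

These additional screenshots will give you deeper insight into how basic image processing techniques can be chained together to build a solution to a computer vision project. Let's get started.

Open a new file, name it ocr_template_match.py, and we'll get to work: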

# import the necessary packages
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2

To install/upgrade imutils, simply use pip:

pip install --upgrade imutils

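Note: If you are using Python virtual environments (as all of my OpenCV install tutorials do), make sure you first use the workon command to access your virtual environment, and then install/upgrade imutils.

Now that our packages are installed and imported, we can parse our command line arguments: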

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-r", "--reference", required=True,
	help="path to reference OCR-A image")
args = vars(ap.parse_args())

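We create an argument parser, add two arguments, and then parse them, storing the result in the variable args. The two required command line arguments are:

--image: The path to the image to be OCR'd.

--reference: The path to the reference OCR-A image. This image contains the digits 0-9 in the OCR-A font, allowing us to perform template matching later in the pipeline.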
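As a quick check of how these two flags are parsed, the sketch below feeds the parser an explicit argument list instead of reading the shell command line. The helper name build_parser and the file names card.png and ocr_a.png are placeholders assumed for illustration.

```python
import argparse

def build_parser():
    # same two required flags as in the script above
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
        help="path to input image")
    ap.add_argument("-r", "--reference", required=True,
        help="path to reference OCR-A image")
    return ap

# hypothetical file names, passed explicitly rather than via the shell
example_args = vars(build_parser().parse_args(
    ["--image", "card.png", "--reference", "ocr_a.png"]))
print(example_args["image"], example_args["reference"])
```

Because vars() turns the parsed namespace into a plain dictionary, the rest of the script can read the paths as args["image"] and args["reference"].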

Next, let's define the credit card types:

# define a dictionary that maps the first digit of a credit card
# number to the credit card type
FIRST_NUMBER = {
	"3": "American Express",
	"4": "Visa",
	"5": "MasterCard",
	"6": "Discover Card"
}

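The credit card type, such as American Express or Visa, can be identified by examining the first digit of the credit card number. We define a dictionary, FIRST_NUMBER, which maps the first digit to the corresponding credit card type. Let's start our image processing pipeline by loading the reference OCR-A image: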

# load the reference OCR-A image from disk, convert it to grayscale,
# and threshold + invert it, such that the digits appear as *white*
# on a *black* background
ref = cv2.imread(args["reference"])
ref = cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY)
ref = cv2.threshold(ref, 10, 255, cv2.THRESH_BINARY_INV)[1]

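First, we load the reference OCR-A image, then convert it to grayscale and threshold + invert it. In each of these operations we store or overwrite ref, our reference image.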
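To make the threshold + invert step concrete, here is a NumPy restatement of what an inverse binary threshold computes: pixels above the threshold become 0 and everything else becomes the maximum value. The function name threshold_binary_inv and the tiny sample array are assumptions for illustration; this is a sketch of the rule, not OpenCV's implementation.

```python
import numpy as np

def threshold_binary_inv(gray, thresh, max_val=255):
    # pixels above `thresh` become 0, all others become `max_val`
    return np.where(gray > thresh, 0, max_val).astype("uint8")

# dark digit strokes (values 0 and 5) on a white page (value 255)
sample = np.array([[255, 0, 255],
                   [255, 5, 255]], dtype="uint8")
inverted = threshold_binary_inv(sample, 10)
print(inverted)
```

With a threshold of 10, only the nearly black strokes end up white, which is the form the contour detection in the next step expects.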

Now let's locate the contours in the OCR-A font image:

# find contours in the OCR-A image (i.e., the outlines of the digits),
# sort them from left to right, and initialize a dictionary to map
# digit name to the ROI
refCnts = cv2.findContours(ref.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
refCnts = imutils.grab_contours(refCnts)
refCnts = contours.sort_contours(refCnts, method="left-to-right")[0]
digits = {}

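We find the contours in the reference image. Because OpenCV 2.4, 3, and 4 return contour information differently, we pass the result through imutils.grab_contours, which extracts the contour list regardless of version. Next, we sort the contours from left to right and initialize a dictionary, digits, which maps each digit name to its region of interest (ROI).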
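Sorting contours from left to right amounts to ordering them by the x coordinate of their bounding boxes. The pure-Python sketch below shows the idea on hypothetical (x, y, w, h) boxes; the function name and the sample data are assumptions, and this is not the imutils source.

```python
def sort_left_to_right(items, boxes):
    # pair each item with its bounding box, then order by the box's x
    paired = sorted(zip(items, boxes), key=lambda pair: pair[1][0])
    sorted_items = [item for item, _ in paired]
    sorted_boxes = [box for _, box in paired]
    return sorted_items, sorted_boxes

# hypothetical (x, y, w, h) boxes standing in for three digit contours
names = ["c", "a", "b"]
boxes = [(120, 8, 30, 50), (10, 8, 30, 50), (65, 8, 30, 50)]
ordered, ordered_boxes = sort_left_to_right(names, boxes)
print(ordered)
```

The ordering matters here because the position of each contour in the sorted list is later used as the digit's name.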

At this point, we loop over the contours, extract each ROI, and associate it with its corresponding digit:

# loop over the OCR-A reference contours
for (i, c) in enumerate(refCnts):
	# compute the bounding box for the digit, extract it, and resize
	# it to a fixed size
	(x, y, w, h) = cv2.boundingRect(c)
	roi = ref[y:y + h, x:x + w]
	roi = cv2.resize(roi, (57, 88))
	# update the digits dictionary, mapping the digit name to the ROI
	digits[i] = roi

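We loop over the reference image contours. In the loop, i holds the digit name/number and c holds the contour. We compute a bounding box around each contour c, which gives us the (x, y) coordinates and the width/height of the rectangle. Using the bounding rectangle parameters, we extract the roi from ref (the reference image). This ROI contains the digit.

We resize each ROI to a fixed size of 57×88 pixels. Every digit must be resized to the same fixed size so that we can apply template matching for digit recognition later in this tutorial.

We associate each digit 0-9 (the dictionary keys) with its roi image (the dictionary values).

At this point, we are done extracting the digits from the reference image and associating them with their corresponding digit names.

Our next goal is to isolate the 16-digit credit card number in the input --image. We need to find and isolate the digits before we can run template matching to identify each one. These image processing steps are quite interesting and insightful, especially if you have never developed an image processing pipeline before, so be sure to pay close attention.

Let's go ahead and initialize a couple of structuring kernels: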

# initialize a rectangular (wider than it is tall) and square
# structuring kernel
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))
sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

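You can think of a kernel as a small matrix that we slide across the image to perform operations (such as convolution) like blurring, sharpening, edge detection, or other image processing operations.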
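For a rectangular structuring element, that small matrix is simply filled with ones. The NumPy stand-ins below show the shapes involved; the variable names rect_kernel and sq_kernel are assumptions for illustration.

```python
import numpy as np

# cv2.getStructuringElement takes the size as (width, height), so a
# (9, 3) rectangle corresponds to an array with 3 rows and 9 columns
rect_kernel = np.ones((3, 9), dtype="uint8")
sq_kernel = np.ones((5, 5), dtype="uint8")
print(rect_kernel.shape, sq_kernel.shape)
```

Note the swapped order: OpenCV sizes are given as (width, height), while NumPy shapes are (rows, columns).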

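We construct two such kernels, one rectangular and one square. We will use the rectangular one for the top-hat morphological operator and the square one for a closing operation. We'll see both in action shortly. Now let's prepare the image we are going to OCR: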

# load the input image, resize it, and convert it to grayscale
image = cv2.imread(args["image"])
image = imutils.resize(image, width=300)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

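We load the image given by the --image command line argument, which contains the photo of the credit card. We then resize it to width=300, maintaining the aspect ratio, and convert it to grayscale. Let's take a look at our input image:

Next come our resize and grayscale operations:

Now that our image is grayscale and consistently sized, let's perform a morphological operation: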

# apply a tophat (whitehat) morphological operator to find light
# regions against a dark background (i.e., the credit card numbers)
tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, rectKernel)

Using our rectKernel and our grayscale image, we perform a top-hat morphological operation, storing the result as tophat.
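To see why a top-hat highlights thin bright strokes, it helps to work through it on a single row of pixels. A top-hat is the original signal minus its opening (an erosion followed by a dilation). The sketch below is a 1-D, pure-Python illustration under simplified border handling, so values near the edges may differ from what OpenCV produces; the function names and the sample row are assumptions.

```python
def erode(signal, size):
    # each sample becomes the minimum of its neighborhood
    half = size // 2
    return [min(signal[max(i - half, 0):i + half + 1])
            for i in range(len(signal))]

def dilate(signal, size):
    # each sample becomes the maximum of its neighborhood
    half = size // 2
    return [max(signal[max(i - half, 0):i + half + 1])
            for i in range(len(signal))]

def tophat(signal, size):
    # top-hat = original minus its opening (erosion, then dilation)
    opened = dilate(erode(signal, size), size)
    return [s - o for s, o in zip(signal, opened)]

# one row of pixels: a narrow bright stroke and a wide bright band
row = [10, 10, 200, 10, 10, 10, 200, 200, 200, 200, 200, 10, 10]
print(tophat(row, 3))
```

The one-pixel stroke is narrower than the window and survives, while the five-pixel band is at least as wide as the window and is removed along with the background.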

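The top-hat operation reveals light regions against a dark background (i.e., the credit card numbers), as shown in the figure below:

Given our top-hat image, let's compute the gradient along the x direction: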

# compute the Scharr gradient of the tophat image, then scale
# the rest back into the range [0, 255]
gradX = cv2.Sobel(tophat, ddepth=cv2.CV_32F, dx=1, dy=0,
	ksize=-1)
gradX = np.absolute(gradX)
(minVal, maxVal) = (np.min(gradX), np.max(gradX))
gradX = (255 * ((gradX - minVal) / (maxVal - minVal)))
gradX = gradX.astype("uint8")

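The next step in our effort to isolate the digits is to compute the Scharr gradient of the top-hat image in the x direction (passing ksize=-1 to cv2.Sobel selects the Scharr kernel). We store the result as gradX.

After computing the absolute value of each element in the gradX array, we take a few steps to scale the values into the range [0, 255], since the image is currently a floating point data type. To do this, we compute the minVal and maxVal of gradX and then apply the min/max normalization equation shown in the code. The final step is to convert gradX to uint8, which has a range of [0, 255]. The result is shown in the figure below: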
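The min/max normalization step can be tried on a small array in isolation. The sketch below follows the same sequence (absolute value, scale, cast) and adds a guard for a constant input, where the maximum equals the minimum and the division would otherwise fail. The function name min_max_scale and the guard are assumptions added for illustration.

```python
import numpy as np

def min_max_scale(values):
    # take magnitudes, then map the smallest to 0 and the largest to 255
    values = np.absolute(values.astype("float32"))
    min_val, max_val = values.min(), values.max()
    span = max_val - min_val
    if span == 0:
        # a constant image has no gradient; avoid dividing by zero
        return np.zeros(values.shape, dtype="uint8")
    return (255 * (values - min_val) / span).astype("uint8")

grad = np.array([[-4.0, 0.0], [2.0, 1.0]], dtype="float32")
scaled = min_max_scale(grad)
print(scaled)
```

The cast to uint8 truncates rather than rounds, so 127.5 becomes 127.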

Let's continue to improve our credit card digit finding algorithm:

# apply a closing operation using the rectangular kernel to help
# close gaps in between credit card number digits, then apply
# Otsu's thresholding method to binarize the image
gradX = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, rectKernel)
thresh = cv2.threshold(gradX, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# apply a second closing operation to the binary image, again
# to help close gaps between credit card number regions
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, sqKernel)

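To close the gaps, we perform a closing operation. Notice that we use rectKernel again. We then apply Otsu's method with a binary threshold to the gradX image, followed by another closing operation, this time with sqKernel. The result of these steps is shown below: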
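When cv2.THRESH_OTSU is set, the threshold value of 0 passed to cv2.threshold is ignored and a threshold is chosen automatically from the image histogram. The NumPy sketch below illustrates the idea behind Otsu's method: pick the cut that maximizes the variance between the dark and bright classes. It is a simplified sketch with an assumed function name, and its tie-breaking may differ from OpenCV's.

```python
import numpy as np

def otsu_threshold(gray):
    # histogram of the 256 possible intensities, as probabilities
    hist = np.bincount(gray.ravel(), minlength=256).astype("float64")
    prob = hist / hist.sum()
    levels = np.arange(256)
    # weight and cumulative mean of the "dark" class for every cut
    w0 = np.cumsum(prob)
    m0 = np.cumsum(prob * levels)
    w1 = 1.0 - w0
    total_mean = m0[-1]
    # between-class variance; cuts with an empty class score zero
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - m0) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

# hypothetical image with two clearly separated intensity groups
pixels = np.array([[10, 10, 200], [200, 10, 200]], dtype="uint8")
t = otsu_threshold(pixels)
binary = np.where(pixels > t, 255, 0).astype("uint8")
print(t, binary.tolist())
```

Any cut between the two intensity groups separates them equally well, which is why the exact value returned can vary between implementations while the binarized image stays the same.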

Next, let's find the contours and initialize the list of digit group locations.

# find contours in the thresholded image, then initialize the
# list of digit locations
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
locs = []

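We find the contours and store them in a list, cnts. Then we initialize a list to hold the digit group locations.

Now let's loop over the contours while filtering on the aspect ratio of each one, which lets us prune the digit group locations from other, irrelevant regions of the credit card: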

# loop over the contours
for (i, c) in enumerate(cnts):
	# compute the bounding box of the contour, then use the
	# bounding box coordinates to derive the aspect ratio
	(x, y, w, h) = cv2.boundingRect(c)
	ar = w / float(h)
	# since credit cards use a fixed-size font with 4 groups
	# of 4 digits, we can prune potential contours based on the
	# aspect ratio
	if ar > 2.5 and ar < 4.0:
		# contours can further be pruned on minimum/maximum width
		# and height
		if (w > 40 and w < 55) and (h > 10 and h < 20):
			# append the bounding box region of the digits group
			# to our locations list
			locs.append((x, y, w, h))

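We loop over the contours the same way we did for the reference image. After computing the bounding rectangle for each contour c, we compute the aspect ratio ar by dividing the width by the height. Using the aspect ratio, we analyze the shape of each contour. If ar is between 2.5 and 4.0 (wider than it is tall), w is between 40 and 55 pixels, and h is between 10 and 20 pixels, we append the bounding rectangle parameters to locs as a tuple. These pixel bounds assume the image was resized to a width of 300 pixels earlier in the script.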
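The filtering rule can be exercised on its own with plain tuples. In the sketch below, the function name keep_digit_groups and the candidate boxes are hypothetical; the default ranges are the ones used in the loop above.

```python
def keep_digit_groups(boxes, ar_range=(2.5, 4.0),
                      w_range=(40, 55), h_range=(10, 20)):
    kept = []
    for (x, y, w, h) in boxes:
        # aspect ratio: how many times wider than tall the box is
        ar = w / float(h)
        if (ar_range[0] < ar < ar_range[1]
                and w_range[0] < w < w_range[1]
                and h_range[0] < h < h_range[1]):
            kept.append((x, y, w, h))
    return kept

# hypothetical boxes: a digit group, a square logo, and a thin strip
candidates = [(30, 100, 48, 15), (200, 20, 60, 60), (10, 150, 120, 8)]
print(keep_digit_groups(candidates))
```

Only the first box passes: the logo has an aspect ratio of 1.0 and the strip has an aspect ratio of 15.0.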

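The figure below shows the groupings we found. For demonstration purposes, I had OpenCV draw a bounding box around each group:

Next, we sort the groupings from left to right and initialize the list of credit card digits: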

# sort the digit locations from left-to-right, then initialize the
# list of classified digits
locs = sorted(locs, key=lambda x:x[0])
output = []

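We sort locs by x value so that the groups are ordered from left to right. We initialize a list, output, which will hold the credit card number from the image. Now that we know where each group of four digits is, let's loop over the four sorted groups and determine the digits in each.

This loop is fairly long and is split into three code blocks. Here is the first block: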

# loop over the 4 groupings of 4 digits
for (i, (gX, gY, gW, gH)) in enumerate(locs):
	# initialize the list of group digits
	groupOutput = []
	# extract the group ROI of 4 digits from the grayscale image,
	# then apply thresholding to segment the digits from the
	# background of the credit card
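	# clamp the padded start indices at 0 so a group near the image
	# border does not wrap around to a negative index
	group = gray[max(gY - 5, 0):gY + gH + 5,
		max(gX - 5, 0):gX + gW + 5]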
	group = cv2.threshold(group, 0, 255,
		cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
	# detect the contours of each individual digit in the group,
	# then sort the digit contours from left to right
	digitCnts = cv2.findContours(group.copy(), cv2.RETR_EXTERNAL,
		cv2.CHAIN_APPROX_SIMPLE)
	digitCnts = imutils.grab_contours(digitCnts)
	digitCnts = contours.sort_contours(digitCnts,
		method="left-to-right")[0]

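In the first block of this loop, we extract the group and pad it by 5 pixels on each side, apply thresholding, and find and sort the contours. Be sure to refer to the code for details. Shown below is a single group that has been extracted:

Let's continue the loop with a nested loop that performs template matching and extracts the similarity scores:

	# loop over the digit contours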
	for c in digitCnts:
		# compute the bounding box of the individual digit, extract
		# the digit, and resize it to have the same fixed size as
		# the reference OCR-A images
		(x, y, w, h) = cv2.boundingRect(c)
		roi = group[y:y + h, x:x + w]
		roi = cv2.resize(roi, (57, 88))
		# initialize a list of template matching scores
		scores = []
		# loop over the reference digit name and digit ROI
		for (digit, digitROI) in digits.items():
			# apply correlation-based template matching, take the
			# score, and update the scores list
			result = cv2.matchTemplate(roi, digitROI,
				cv2.TM_CCOEFF)
			(_, score, _, _) = cv2.minMaxLoc(result)
			scores.append(score)
		# the classification for the digit ROI will be the reference
		# digit name with the *largest* template matching score
		groupOutput.append(str(np.argmax(scores)))

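Using cv2.boundingRect, we obtain the parameters needed to extract the ROI containing each digit. For template matching to work with any degree of accuracy, we resize the roi to the same size as our reference OCR-A digit images (57×88 pixels).

We initialize a list of scores. Think of this as our confidence score: the higher it is, the more likely it is the correct template.

Now, let's loop over each reference digit (the third, innermost loop) and perform template matching. This is where the heavy lifting is done for this script.

OpenCV has a handy function called cv2.matchTemplate to which you supply two images: one is the template and the other is the input image. The goal of applying cv2.matchTemplate to these two images is to determine how similar they are.

In this case, we supply the reference digitROI image and the roi from the credit card containing a candidate digit. Using these two images, we call the template matching function and store the result. Next, we extract the score from the result and append it to our scores list. This completes the innermost loop.

Using the scores (one for each digit 0-9), we take the maximum score, which should belong to the correctly identified digit. We find the digit with the highest score by grabbing its index via np.argmax. The integer name of this index represents the most likely digit based on the comparisons to each template (again, remember that the templates were stored in sorted order, 0-9).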
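When the ROI and the template have the same size, as they do here, the matching result contains a single value. For the correlation coefficient method, that value is the sum of the elementwise products of the two mean-subtracted images. The NumPy sketch below restates that score and the argmax classification on two tiny hypothetical glyphs; the function names and sample arrays are assumptions, and this is not OpenCV's implementation.

```python
import numpy as np

def ccoeff_score(roi, template):
    # subtract each image's mean, then sum the elementwise products
    roi = roi.astype("float64")
    template = template.astype("float64")
    return float(np.sum((roi - roi.mean()) * (template - template.mean())))

def classify(roi, templates):
    # `templates` maps a digit name to an array shaped like `roi`
    names = list(templates.keys())
    scores = [ccoeff_score(roi, templates[name]) for name in names]
    return names[int(np.argmax(scores))], scores

# two tiny hypothetical 3x3 glyphs: a vertical and a horizontal bar
bar_v = np.array([[0, 255, 0]] * 3, dtype="uint8")
bar_h = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 0]], dtype="uint8")
templates = {0: bar_v, 1: bar_h}
best, scores = classify(bar_v, templates)
print(best, scores)
```

Because this score is not normalized, its absolute value depends on image contrast; only the comparison between templates for the same ROI is meaningful.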

Finally, let's draw a rectangle around each group and show the credit card number on the image in red text:

	# draw the digit classifications around the group
	cv2.rectangle(image, (gX - 5, gY - 5),
		(gX + gW + 5, gY + gH + 5), (0, 0, 255), 2)
	cv2.putText(image, "".join(groupOutput), (gX, gY - 15),
		cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 0, 255), 2)
	# update the output digits list
	output.extend(groupOutput)

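For the third and final block of this loop, we draw a rectangle with 5 pixels of padding around the group and then draw the text on the image.

The last step is to add the digits to the output list. The Pythonic way to do this is with the list's extend method, which appends each element of an iterable (in this case, a list) to the end of the list.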
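The difference between extend and append is easy to see on a small example. The digit values below are made up for illustration.

```python
output = ["4", "0", "0", "0"]
group_output = ["1", "2", "3", "4"]

appended = list(output)
appended.append(group_output)   # adds the whole list as one element

extended = list(output)
extended.extend(group_output)   # adds each digit separately

print(len(appended), len(extended))
print("".join(extended))
```

With append, the later "".join(output) call would fail because one of the elements would be a list rather than a string.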

To see the script in action, let's print the results to the terminal and display our image on screen.

# display the output credit card information to the screen
print("Credit Card Type: {}".format(FIRST_NUMBER[output[0]]))
print("Credit Card #: {}".format("".join(output)))
cv2.imshow("Image", image)
cv2.waitKey(0)

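We print the credit card type to the console, followed by the credit card number.

In the last two lines, we display the image on screen and wait for any key to be pressed before exiting the script.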
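One caveat: the lookup FIRST_NUMBER[output[0]] raises an error if no digits were recognized (IndexError) or if the first digit is not a key in the dictionary (KeyError). A more defensive variant might look like the sketch below; the function name card_type and the "Unknown" fallback are assumptions, not part of the original script.

```python
FIRST_NUMBER = {
    "3": "American Express",
    "4": "Visa",
    "5": "MasterCard",
    "6": "Discover Card",
}

def card_type(digits):
    # `digits` is the list of recognized digit strings, possibly empty
    if not digits:
        return "Unknown"
    return FIRST_NUMBER.get(digits[0], "Unknown")

print(card_type(list("4000123456789010")))
print(card_type([]))
```

This keeps the script from crashing on an image where the digit groups could not be located.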

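Take a moment to congratulate yourself: you made it. To recap (at a high level), this script:

Stores the credit card types in a dictionary.
Takes a reference image and extracts the digits.
Stores the digit templates in a dictionary.
Localizes the four credit card number groups, each containing four digits (16 digits in total).
Extracts the digits to be "matched".
Performs template matching on each digit, comparing each individual ROI to each of the digit templates 0-9 while storing a score for each attempted match.
Finds the highest score for each candidate digit and builds a list called output that contains the credit card number.
Outputs the credit card number and credit card type to our terminal and displays the output image on our screen.

Now it's time to see the script in action and check our results.

Credit card OCR results

Now that we have coded our credit card OCR system, let's give it a try.

We obviously cannot use real credit card numbers for this example, so I gathered a few example credit card images using Google.

These credit cards are obviously fake and are for demonstration purposes only. However, you can apply the same techniques from this blog post to recognize the digits on actual credit cards.

To see our credit card OCR system in action, open a terminal and execute the following command: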

$ python ocr_template_match.py --reference ocr_a_reference.png \
	--image images/credit_card_05.png
Credit Card Type: MasterCard
Credit Card #: 5476767898765432

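Our first result image, 100% correct:

Notice how we were able to correctly label the credit card as MasterCard simply by examining the first digit of the credit card number. Let's try a second image, this time a Visa: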

$ python ocr_template_match.py --reference ocr_a_reference.png \
	--image images/credit_card_01.png
Credit Card Type: Visa
Credit Card #: 4000123456789010

Once again, we were able to correctly OCR the credit card using template matching.

$ python ocr_template_match.py --reference ocr_a_reference.png \
	--image images/credit_card_02.png
Credit Card Type: Visa
Credit Card #: 4020340002345678