Python machine learning: recognizing digits with sklearn

2022-03-29 19:01:00

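Introduction

This article briefly shows how to train a model and make predictions with the sklearn module, and then how to present the results as plots, which are easier to read at a glance.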

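Dataset

Training data

Test data

Data processing

Separating the data

When we open the training dataset, the last column of each row is the true value (the label). As readers of Xiao Tang's previous post will know, the usual first step is to split the data: the feature columns go in one group and the true values in another. Because the data has no column names, we use iloc[] to do the split by position.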

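import pandas as pd

def shuju(tr_path,ts_path,sep='\t'):
    # header=None: the files have no column names, so the first row is data
    train=pd.read_csv(tr_path,sep=sep,header=None)
    test=pd.read_csv(ts_path,sep=sep,header=None)
    # Separate the features (all but the last column) from the labels (last column)
    train_features=train.iloc[:,:-1].values
    train_labels=train.iloc[:,-1].values
    test_features = test.iloc[:, :-1].values
    test_labels = test.iloc[:, -1].values
    return train_features,test_features,train_labels,test_labels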
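
To make the split concrete, here is a minimal sketch on a tiny made-up table. The values and the in-memory tab-separated text are assumptions for illustration only, not the article's data. It shows that `iloc[:, :-1]` keeps every column except the last, while `iloc[:, -1]` keeps only the last one.

```python
import io
import pandas as pd

# Hypothetical tab-separated data: three feature columns, then the label.
raw = "0\t255\t0\t7\n255\t0\t255\t1\n0\t0\t255\t4\n"

# header=None because the data has no column names.
df = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

features = df.iloc[:, :-1].values  # all rows, every column but the last
labels = df.iloc[:, -1].values     # all rows, last column only

print(features.shape)  # (3, 3)
print(labels)          # [7 1 4]
```

Selecting by position rather than by name is what lets the same code work on files that have no header row.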

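Training the model

Here we use sklearn directly: we choose a model and call fit so that it learns its classification rules from the training data.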

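from sklearn.tree import DecisionTreeClassifier

# Train the model
def train_tree(*data):
    x_train, x_test, y_train, y_test=data
    clf=DecisionTreeClassifier()
    clf.fit(x_train,y_train)
    # score() returns the mean accuracy on the given data
    print("Training accuracy: {:.4f}".format(clf.score(x_train, y_train)))
    print("Test accuracy: {:.4f}".format(clf.score(x_test, y_test)))
    # Return the trained model
    return clf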
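
The two scores are worth comparing, because a decision tree with no depth limit tends to fit its training data almost perfectly while doing worse on unseen data. The sketch below illustrates the same fit-and-score pattern on scikit-learn's bundled `load_digits` dataset. That dataset (8x8 images), the 25% hold-out split, and `random_state=0` are assumptions standing in for the article's CSV files, so the numbers it prints will not match the article's.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: digit images of 8x8 pixels, flattened to 64 features each.
X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fixing random_state makes the tree reproducible between runs.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(x_train, y_train)

train_acc = clf.score(x_train, y_train)  # accuracy on data the tree has seen
test_acc = clf.score(x_test, y_test)     # accuracy on held-out data

print("Training accuracy: {:.4f}".format(train_acc))
print("Test accuracy: {:.4f}".format(test_acc))
```

If the training score is far above the test score, limiting the tree with a parameter such as `max_depth` is one common thing to try.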

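Data visualization

To make the results easier to inspect, we can also use matplotlib to display each test image together with its actual and predicted values.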

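import matplotlib.pyplot as plt

def plot_imafe(test,test_labels,preds):
    plt.ion()  # interactive mode, so drawing does not block the loop
    # Show at most the first 50 test images
    for i in range(min(50,len(test))):
        label,pred=test_labels[i],preds[i]
        title='actual: {}, predicted: {}'.format(label,pred)
        # Each row holds 784 pixel values; reshape it into a 28x28 image
        img=test[i].reshape(28,28)
        plt.clf()
        plt.imshow(img,cmap="binary")
        plt.title(title)
        plt.pause(0.5)  # redraw and keep the image on screen briefly
    print('done')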
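
Stepping through images one at a time can be slow to review. As an alternative, the following sketch draws several digits in a single figure with `plt.subplots`. The helper name `plot_grid`, the 2x5 layout, and the random stand-in pixels are all assumptions for illustration; the `Agg` backend is chosen only so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

def plot_grid(images, labels, preds, rows=2, cols=5, side=28):
    """Draw up to rows*cols images in one figure, titled with label and prediction."""
    # squeeze=False guarantees a 2D array of axes, even for a 1x1 grid.
    fig, axes = plt.subplots(rows, cols, figsize=(2 * cols, 2.2 * rows), squeeze=False)
    for ax, img, label, pred in zip(axes.ravel(), images, labels, preds):
        ax.imshow(img.reshape(side, side), cmap="binary")
        ax.set_title("actual: {}, predicted: {}".format(label, pred), fontsize=8)
        ax.axis("off")
    fig.tight_layout()
    return fig

# Random stand-in data with the article's layout: 784 pixel values per row.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 784))
labels = rng.integers(0, 10, size=10)
preds = labels.copy()  # pretend every prediction is correct

fig = plot_grid(images, labels, preds)

# Render to an in-memory PNG; pass a filename instead to write a file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

With real data, the same function could be called with `test_features`, `test_labels`, and `preds` from the article's code.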

Results

Complete code

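import pandas as pd
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt

def shuju(tr_path,ts_path,sep='\t'):
    # header=None: the files have no column names, so the first row is data
    train=pd.read_csv(tr_path,sep=sep,header=None)
    test=pd.read_csv(ts_path,sep=sep,header=None)
    # Separate the features (all but the last column) from the labels (last column)
    train_features=train.iloc[:,:-1].values
    train_labels=train.iloc[:,-1].values
    test_features = test.iloc[:, :-1].values
    test_labels = test.iloc[:, -1].values
    return train_features,test_features,train_labels,test_labels

# Train the model
def train_tree(*data):
    x_train, x_test, y_train, y_test=data
    clf=DecisionTreeClassifier()
    clf.fit(x_train,y_train)
    # score() returns the mean accuracy on the given data
    print("Training accuracy: {:.4f}".format(clf.score(x_train, y_train)))
    print("Test accuracy: {:.4f}".format(clf.score(x_test, y_test)))
    # Return the trained model
    return clf

def plot_imafe(test,test_labels,preds):
    plt.ion()  # interactive mode, so drawing does not block the loop
    # Show at most the first 50 test images
    for i in range(min(50,len(test))):
        label,pred=test_labels[i],preds[i]
        title='actual: {}, predicted: {}'.format(label,pred)
        # Each row holds 784 pixel values; reshape it into a 28x28 image
        img=test[i].reshape(28,28)
        plt.clf()
        plt.imshow(img,cmap="binary")
        plt.title(title)
        plt.pause(0.5)  # redraw and keep the image on screen briefly
    print('done')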

# File paths on the author's machine; replace them with your own CSV files
train_features,test_features,train_labels,test_labels=shuju(
    r"C:\Users\twy\PycharmProjects\1\train_images.csv",
    r"C:\Users\twy\PycharmProjects\1\test_images.csv")
clf=train_tree(train_features,test_features,train_labels,test_labels)
preds=clf.predict(test_features)
plot_imafe(test_features,test_labels,preds)
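
A single accuracy figure does not show which digits the model confuses with each other. As a possible follow-up to the `clf.predict` call above, this sketch builds a confusion matrix with `sklearn.metrics.confusion_matrix` and lists the misclassified samples. It again uses the bundled `load_digits` data and a fixed `random_state` as assumptions, not the article's CSV files.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data and split (assumptions for illustration).
X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)
preds = clf.predict(x_test)

# Rows are the true digits, columns are the predicted digits.
cm = confusion_matrix(y_test, preds, labels=list(range(10)))
print(cm)

# Indices of the test samples the tree got wrong.
wrong = np.flatnonzero(preds != y_test)
print("misclassified: {} of {}".format(len(wrong), len(y_test)))
```

The indices in `wrong` can be used to select only the misclassified images for plotting, which is usually more informative than viewing the first 50.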
