<em>Mac</em>Book项目 2009年学校开始实施<em>Mac</em>Book项目,所有师生配备一本<em>Mac</em>Book,并同步更新了校园无线网络。学校每周进行电脑技术更新,每月发送技术支持资料,极大改变了教学及学习方式。因此2011
2021-06-01 09:32:01
前幾天搗鼓了一下Ubuntu,正是想用一下我舊電腦上的N卡,可以用GPU來跑程式碼,體驗一下多核的快樂。
還好我這破電腦也是支援Cuda的:
$ sudo lshw -C display *-display description: 3D controller product: GK208M [GeForce GT 740M] vendor: NVIDIA Corporation physical id: 0 bus info: pci@0000:01:00.0 version: a1 width: 64 bits clock: 33MHz capabilities: pm msi pciexpress bus_master cap_list rom configuration: driver=nouveau latency=0 resources: irq:35 memory:f0000000-f0ffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:6000(size=128)
首先安裝一下Cuda的開發工具,命令如下:
$ sudo apt install nvidia-cuda-toolkit
檢視一下相關資訊:
$ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2021 NVIDIA Corporation Built on Thu_Nov_18_09:45:30_PST_2021 Cuda compilation tools, release 11.5, V11.5.119 Build cuda_11.5.r11.5/compiler.30672275_0
通過Conda安裝相關的依賴包:
conda install numba & conda install cudatoolkit
通過pip安裝也可以,一樣的。
簡單測試了一下,發覺報錯了:
$ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/test1.py Traceback (most recent call last): File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 246, in ensure_initialized self.cuInit(0) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 319, in safe_cuda_api_call self._check_ctypes_error(fname, retcode) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 387, in _check_ctypes_error raise CudaAPIError(retcode, msg) numba.cuda.cudadrv.driver.CudaAPIError: [100] Call to cuInit results in CUDA_ERROR_NO_DEVICE During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/larry/code/pkslow-samples/python/src/main/python/cuda/test1.py", line 15, in <module> gpu_print[1, 2]() File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 862, in __getitem__ return self.configure(*args) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 857, in configure return _KernelConfiguration(self, griddim, blockdim, stream, sharedmem) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 718, in __init__ ctx = get_context() File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 220, in get_context return _runtime.get_or_create_context(devnum) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 138, in get_or_create_context return self._get_or_create_context_uncached(devnum) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 153, in _get_or_create_context_uncached with driver.get_active_context() as ac: File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 487, in __enter__ driver.cuCtxGetCurrent(byref(hctx)) File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 284, in __getattr__ self.ensure_initialized() File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 250, in ensure_initialized raise CudaSupportError(f"Error at driver init: {description}") numba.cuda.cudadrv.error.CudaSupportError: Error at driver init: Call to cuInit results in CUDA_ERROR_NO_DEVICE (100)
網上搜了一下,發現是驅動問題。通過Ubuntu自帶的工具安裝顯示卡驅動:
還是失敗:
$ nvidia-smi NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
最後,通過命令列安裝驅動,成功解決這個問題:
$ sudo apt install nvidia-driver-470
檢查後發現正常了:
$ nvidia-smi Wed Dec 7 22:13:49 2022 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.4 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |===============================+======================+======================| | 0 NVIDIA GeForce ... Off | 00000000:01:00.0 N/A | N/A | | N/A 51C P8 N/A / N/A | 4MiB / 2004MiB | N/A Default | | | | N/A | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | No running processes found | +-----------------------------------------------------------------------------+
測試程式碼也可以跑了。
準備以下程式碼:
from numba import cuda import os def cpu_print(): print('cpu print') @cuda.jit def gpu_print(): dataIndex = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x print('gpu print ', cuda.threadIdx.x, cuda.blockIdx.x, cuda.blockDim.x, dataIndex) if __name__ == '__main__': gpu_print[4, 4]() cuda.synchronize() cpu_print()
這個程式碼主要有兩個函數,一個是用CPU執行,一個是用GPU執行,執行列印操作。關鍵在於@cuda.jit
這個註解,讓程式碼在GPU上執行。執行結果如下:
$ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/print_test.py
gpu print 0 3 4 12
gpu print 1 3 4 13
gpu print 2 3 4 14
gpu print 3 3 4 15
gpu print 0 2 4 8
gpu print 1 2 4 9
gpu print 2 2 4 10
gpu print 3 2 4 11
gpu print 0 1 4 4
gpu print 1 1 4 5
gpu print 2 1 4 6
gpu print 3 1 4 7
gpu print 0 0 4 0
gpu print 1 0 4 1
gpu print 2 0 4 2
gpu print 3 0 4 3
cpu print
可以看到GPU總共列印了16次,使用了不同的Thread來執行。這次每次列印的結果都可能不同,因為提交GPU是非同步執行的,無法確保哪個單元先執行。同時也需要呼叫同步函數cuda.synchronize()
,確保GPU執行完再繼續往下跑。
我們通過這個函數來看GPU並行的力量:
from numba import jit, cuda import numpy as np # to measure exec time from timeit import default_timer as timer # normal function to run on cpu def func(a): for i in range(10000000): a[i] += 1 # function optimized to run on gpu @jit(target_backend='cuda') def func2(a): for i in range(10000000): a[i] += 1 if __name__ == "__main__": n = 10000000 a = np.ones(n, dtype=np.float64) start = timer() func(a) print("without GPU:", timer() - start) start = timer() func2(a) print("with GPU:", timer() - start)
結果如下:
$ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/time_test.py
without GPU: 3.7136273959999926
with GPU: 0.4040513340000871
可以看到使用CPU需要3.7秒,而GPU則只要0.4秒,還是能快不少的。當然這裡不是說GPU一定比CPU快,具體要看任務的型別。
以上就是一文詳解如何用GPU來執行Python程式碼的詳細內容,更多關於用GPU執行Python程式碼的資料請關注it145.com其它相關文章!
相關文章
<em>Mac</em>Book项目 2009年学校开始实施<em>Mac</em>Book项目,所有师生配备一本<em>Mac</em>Book,并同步更新了校园无线网络。学校每周进行电脑技术更新,每月发送技术支持资料,极大改变了教学及学习方式。因此2011
2021-06-01 09:32:01
综合看Anker超能充系列的性价比很高,并且与不仅和iPhone12/苹果<em>Mac</em>Book很配,而且适合多设备充电需求的日常使用或差旅场景,不管是安卓还是Switch同样也能用得上它,希望这次分享能给准备购入充电器的小伙伴们有所
2021-06-01 09:31:42
除了L4WUDU与吴亦凡已经多次共事,成为了明面上的厂牌成员,吴亦凡还曾带领20XXCLUB全队参加2020年的一场音乐节,这也是20XXCLUB首次全员合照,王嗣尧Turbo、陈彦希Regi、<em>Mac</em> Ova Seas、林渝植等人全部出场。然而让
2021-06-01 09:31:34
目前应用IPFS的机构:1 谷歌<em>浏览器</em>支持IPFS分布式协议 2 万维网 (历史档案博物馆)数据库 3 火狐<em>浏览器</em>支持 IPFS分布式协议 4 EOS 等数字货币数据存储 5 美国国会图书馆,历史资料永久保存在 IPFS 6 加
2021-06-01 09:31:24
开拓者的车机是兼容苹果和<em>安卓</em>,虽然我不怎么用,但确实兼顾了我家人的很多需求:副驾的门板还配有解锁开关,有的时候老婆开车,下车的时候偶尔会忘记解锁,我在副驾驶可以自己开门:第二排设计很好,不仅配置了一个很大的
2021-06-01 09:30:48
不仅是<em>安卓</em>手机,苹果手机的降价力度也是前所未有了,iPhone12也“跳水价”了,发布价是6799元,如今已经跌至5308元,降价幅度超过1400元,最新定价确认了。iPhone12是苹果首款5G手机,同时也是全球首款5nm芯片的智能机,它
2021-06-01 09:30:45