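Machine learning and deep learning work often involves traversing a directory to read files, and in particular reading only files with certain extensions; for images, that might mean jpg, png, and bmp files. Python's standard library functions are not tailored to this use case, so they need a thin wrapper.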
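Suppose we have the following directory structure, where names ending in bmp are files and everything else is a folder. All of the code below uses this structure as its example.

D:\data
├── 1.bmp
├── 2.bmp
├── a
│   └── a1.bmp
└── b
    └── b1.bmp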
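os.listdir reads only the files and folders directly under the given path and returns their names as a list. The code for reading the demo directory, and its result, are as follows:

import os

path = r'D:\data'
items = os.listdir(path)
# ==> ['1.bmp', '2.bmp', 'a', 'b']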
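Because os.listdir returns bare names without saying which are files and which are folders, a common next step is to join each name with the path and test it. The following is a minimal sketch of that idea; the helper names `split_listdir` and `make_demo_tree` are hypothetical, and the demo layout is rebuilt in a temporary directory so the example runs anywhere.

```python
import os
import tempfile


def split_listdir(path):
    """Return (files, folders) found directly inside path, each sorted by name."""
    files, folders = [], []
    for name in os.listdir(path):
        # os.listdir returns bare names, so join with path before testing
        if os.path.isfile(os.path.join(path, name)):
            files.append(name)
        else:
            folders.append(name)
    return sorted(files), sorted(folders)


def make_demo_tree(root):
    """Recreate the demo layout (1.bmp, 2.bmp, a/a1.bmp, b/b1.bmp) under root."""
    for rel in ('1.bmp', '2.bmp',
                os.path.join('a', 'a1.bmp'), os.path.join('b', 'b1.bmp')):
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, 'w').close()  # create an empty placeholder file


with tempfile.TemporaryDirectory() as root:
    make_demo_tree(root)
    print(split_listdir(root))  # (['1.bmp', '2.bmp'], ['a', 'b'])
```

The results are sorted explicitly because os.listdir makes no promise about the order of the names it returns.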
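os.walk already traverses recursively and covers all subfolders and files. Unlike os.listdir, however, its result is not a list: it is a generator that yields one tuple per directory, which is awkward to use directly, so it usually needs some post-processing. We can print it with a for loop to see what it yields:

import os

path = r'D:\data'

# part 1
for items in os.walk(path):
    print(items)

# part 2
for main_dir, sub_dir_list, sub_file_list in os.walk(path):
    print(main_dir, sub_dir_list, sub_file_list)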
The result is:

# part 1
('D:\\data', ['a', 'b'], ['1.bmp', '2.bmp'])
('D:\\data\\a', [], ['a1.bmp'])
('D:\\data\\b', [], ['b1.bmp'])
# part 2
D:\data ['a', 'b'] ['1.bmp', '2.bmp']
D:\data\a [] ['a1.bmp']
D:\data\b [] ['b1.bmp']
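Iterating over the result of os.walk() shows that each entry has three parts (part 1). In part 2 we name them main_dir (the directory currently being visited), sub_dir_list (the names of its subfolders), and sub_file_list (the names of its files). Briefly:
Joining main_dir with the names in sub_file_list gives every file under the path.
sub_dir_list is not needed here. There is no need to traverse it ourselves, because os.walk visits each of those subfolders in turn and yields it as main_dir.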
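The joining step described above can be sketched as follows. This is a minimal illustration rather than the final wrapper: the function name `walk_files` is hypothetical, and the demo layout is rebuilt in a temporary directory (an assumption made so the example does not depend on D:\data existing).

```python
import os
import tempfile


def walk_files(path):
    """Collect every file below path by joining main_dir with each file name."""
    files = []
    for main_dir, _sub_dir_list, sub_file_list in os.walk(path):
        # _sub_dir_list is ignored: os.walk descends into those folders itself
        for filename in sub_file_list:
            files.append(os.path.join(main_dir, filename))
    return files


with tempfile.TemporaryDirectory() as root:
    # Rebuild the demo layout in a scratch directory
    for sub in ('a', 'b'):
        os.mkdir(os.path.join(root, sub))
    for rel in ('1.bmp', '2.bmp', 'a/a1.bmp', 'b/b1.bmp'):
        open(os.path.join(root, *rel.split('/')), 'w').close()

    # Show paths relative to root, with '/' separators, for a portable result
    found = sorted(os.path.relpath(f, root).replace(os.sep, '/')
                   for f in walk_files(root))
    print(found)  # ['1.bmp', '2.bmp', 'a/a1.bmp', 'b/b1.bmp']
```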
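The code is designed around the following requirements:
It must filter by file extension, and it must accept several extensions at once.
It must support both recursive and non-recursive reading.
Returned paths are prefixed with the path argument, so a full path in gives full paths out, and a relative path in gives relative paths out.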
# -*- coding: utf-8 -*-
import os


def file_ext(filename, level=1):
    """
    return extension of filename

    Parameters:
    -----------
    filename: str
        name of file, path can be included
    level: int
        level of extension.
        for example, if filename is 'sky.png.bak', the 1st level extension
        is 'bak', and the 2nd level extension is 'png'

    Returns:
    --------
    extension of filename, or '' if the file has no extension at that level
    """
    # use the base name only, so dots in directory names are ignored
    parts = os.path.basename(filename).split('.')
    # parts[0] is the stem, so valid levels are 1 .. len(parts) - 1
    if level < 1 or level >= len(parts):
        return ''
    return parts[-level]


def _contain_file(path, extensions):
    """
    check whether path contains any file whose extension is in extensions list

    Parameters:
    -----------
    path: str
        path to be checked
    extensions: str or list/tuple of str
        extension or extensions list

    Returns:
    --------
    return True if contains, else return False
    """
    assert os.path.exists(path), 'path must exist'
    assert os.path.isdir(path), 'path must be dir'

    if isinstance(extensions, str):
        extensions = [extensions]

    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            if (extensions is None) or (file_ext(file) in extensions):
                return True
    return False


def _process_extensions(extensions=None):
    """
    preprocess and check extensions, if extensions is str, convert it to list.

    Parameters:
    -----------
    extensions: str or list/tuple of str
        file extensions

    Returns:
    --------
    extensions: list/tuple of str
        file extensions
    """
    if extensions is not None:
        if isinstance(extensions, str):
            extensions = [extensions]
        assert isinstance(extensions, (list, tuple)), \
            'extensions must be str or list/tuple of str'
        for ext in extensions:
            assert isinstance(ext, str), 'extension must be str'
    return extensions


def get_files(path, extensions=None, is_recursive=True):
    """
    read files in path. if extensions is None, read all files; if extensions
    are specified, only read the files that have one of the extensions. if
    is_recursive is True, recursively read all files; if is_recursive is
    False, only read files in current path.

    Parameters:
    -----------
    path: str
        path to be read
    extensions: str or list/tuple of str
        file extensions
    is_recursive: bool
        whether to read files recursively. read recursively if True, only
        read files in current path if False

    Returns:
    --------
    files: the obtained files in path
    """
    extensions = _process_extensions(extensions)
    files = []

    # get files in current path
    if not is_recursive:
        for name in os.listdir(path):
            fullname = os.path.join(path, name)
            if os.path.isfile(fullname):
                if (extensions is None) or (file_ext(fullname) in extensions):
                    files.append(fullname)
        return files

    # get files recursively
    for main_dir, _, sub_file_list in os.walk(path):
        for filename in sub_file_list:
            fullname = os.path.join(main_dir, filename)
            if (extensions is None) or (file_ext(fullname) in extensions):
                files.append(fullname)
    return files


def get_folders(path, extensions=None, is_recursive=True):
    """
    read folders in path. if extensions is None, read all folders; if
    extensions are specified, only read the folders that contain at least one
    file with one of the extensions. if is_recursive is True, recursively
    read all folders; if is_recursive is False, only read folders in current
    path.

    Parameters:
    -----------
    path: str
        path to be read
    extensions: str or list/tuple of str
        file extensions
    is_recursive: bool
        whether to read folders recursively. read recursively if True, only
        read folders in current path if False

    Returns:
    --------
    folders: the obtained folders in path
    """
    extensions = _process_extensions(extensions)
    folders = []

    # get folders in current path
    if not is_recursive:
        for name in os.listdir(path):
            fullname = os.path.join(path, name)
            if os.path.isdir(fullname):
                if (extensions is None) or (_contain_file(fullname, extensions)):
                    folders.append(fullname)
        return folders

    # get folders recursively (os.walk yields path itself first)
    for main_dir, _, _ in os.walk(path):
        if (extensions is None) or (_contain_file(main_dir, extensions)):
            folders.append(main_dir)
    return folders


if __name__ == '__main__':
    path = r'D:\data'

    files = get_files(path)
    print(files)
    # ==> ['D:\\data\\1.bmp', 'D:\\data\\2.bmp', 'D:\\data\\a\\a1.bmp', 'D:\\data\\b\\b1.bmp']

    folders = get_folders(path)
    print(folders)
    # ==> ['D:\\data', 'D:\\data\\a', 'D:\\data\\b']
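For comparison, the same file-listing behavior can be approximated with the standard library's pathlib module. This is an alternative sketch, not the approach the article takes: the name `get_files_pathlib` is hypothetical, it returns Path objects rather than strings, and it matches extensions case-insensitively, which the os-based code above does not do.

```python
import tempfile
from pathlib import Path


def get_files_pathlib(path, extensions=None, is_recursive=True):
    """Hypothetical pathlib variant: list files under path, optionally by extension."""
    if isinstance(extensions, str):
        extensions = [extensions]
    # Normalize 'bmp' / '.BMP' to '.bmp' so matching ignores case and a leading dot
    wanted = None if extensions is None else {
        '.' + ext.lower().lstrip('.') for ext in extensions
    }
    root = Path(path)
    # rglob('*') walks the whole tree; iterdir() stays in the top-level folder
    candidates = root.rglob('*') if is_recursive else root.iterdir()
    return sorted(
        p for p in candidates
        if p.is_file() and (wanted is None or p.suffix.lower() in wanted)
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'a').mkdir()
    for rel in ('1.bmp', '2.png', 'a/a1.bmp'):
        (root / rel).touch()

    names = [p.relative_to(root).as_posix()
             for p in get_files_pathlib(root, 'bmp')]
    print(names)  # ['1.bmp', 'a/a1.bmp']
```

Note that Path.suffix only reports the last extension, so this sketch has no equivalent of the level parameter in file_ext.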