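Projects often need to read and write CSV files, and the hard part is mostly parsing them. This article covers three ways to parse CSV files: CsvHelper, TextFieldParser, and regular expressions. Along the way it also shows how to write CSV files.
Before looking at how to read and write CSV files, we need to understand the CSV file format.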
A simple CSV file:
Test1,Test2,Test3,Test4,Test5,Test6
str1,str2,str3,str4,str5,str6
str1,str2,str3,str4,str5,str6
A not-so-simple CSV file:
"Test1
"",""","Test2
"",""","Test3
"",""","Test4
"",""","Test5
"",""","Test6
"","""
" 中文,D23 ","3DFD4234""""""1232""1S2","ASD1"",""23,,,,213
23F32","
",,asd
" 中文,D23 ","3DFD4234""""""1232""1S2","ASD1"",""23,,,,213
23F32","
",,asd
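You read that right: both of the above are CSV files, and each holds only three CSV records. In the second one, the field values themselves contain commas, double quotes, and line breaks. It hurts to look at, but files like this are unavoidable in real projects.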
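To make the difficulty concrete, here is a minimal sketch of what happens when a quoted field is split naively on commas. The class and method names (`NaiveSplitDemo`, `CountNaiveFields`) are hypothetical and exist only for this illustration.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical helper: counts the pieces produced by a plain Split(','),
    // which knows nothing about double-quoted fields.
    public static int CountNaiveFields(string record)
    {
        return record.Split(',').Length;
    }

    public static void Main()
    {
        // Two CSV fields; the first one is quoted and contains a comma.
        string record = "\" 中文,D23 \",\"x\"";

        // The comma inside the quotes is treated as a separator,
        // so the two fields are cut into three pieces.
        Console.WriteLine(CountNaiveFields(record));
    }
}
```

A parser therefore has to track whether it is inside a quoted field, which is what the three approaches in this article do for you.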
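There is no official standard for CSV files, but most projects follow RFC 4180. RFC 4180 is not an official standard either (it is an informational RFC). Its rules are as follows: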
Each record is located on a separate line, delimited by a line break (CRLF).
The last record in the file may or may not have an ending line break.
There may be an optional header line appearing as the first line of the file with the same format as normal record lines. This header will contain names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file (the presence or absence of the header line should be indicated via the optional "header" parameter of this MIME type).
Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma.
Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields.
Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.
If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
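The quoting and escaping rules above can be sketched in a few lines. This is only an illustration under the rules just quoted, not code from the article; `CsvQuoteDemo` and `EscapeField` are hypothetical names.

```csharp
using System;

public static class CsvQuoteDemo
{
    // Hypothetical helper: encloses a field in double quotes only when it
    // contains a comma, a double quote, or a line break, and doubles any
    // double quote inside it.
    public static string EscapeField(string field)
    {
        if (field == null) return string.Empty;

        bool mustQuote = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!mustQuote) return field;

        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        Console.WriteLine(EscapeField("plain"));       // plain
        Console.WriteLine(EscapeField("a,b"));         // "a,b"
        Console.WriteLine(EscapeField("say \"hi\""));  // "say ""hi"""
    }
}
```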
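The wording of the standard can be hard to follow, so we simplify it a little. Note that simplifying does not mean simply deleting rules; it means merging similar rules so that they are easier to understand.
The code later in this article follows the simplified standard. As the write code shows, its most visible rule is that every field is enclosed in double quotes.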
Before we actually read or write any CSV files, we need to define a Test class to test with. The code is as follows:
class Test
{
    public string Test1 { get; set; }
    public string Test2 { get; set; }
    public string Test3 { get; set; }
    public string Test4 { get; set; }
    public string Test5 { get; set; }
    public string Test6 { get; set; }

    // Parse is used later by the custom CSV read code
    public static Test Parse(string[] fields)
    {
        try
        {
            Test ret = new Test();
            ret.Test1 = fields[0];
            ret.Test2 = fields[1];
            ret.Test3 = fields[2];
            ret.Test4 = fields[3];
            ret.Test5 = fields[4];
            ret.Test6 = fields[5];
            return ret;
        }
        catch (Exception)
        {
            // Handle the error here, e.g. write a log entry
            return null;
        }
    }
}
Generate some test data. The code is as follows:
static void Main(string[] args)
{
    // Path of the output file
    string path = "tset.csv";
    // Remove the file left over from a previous run
    File.Delete(path);

    Test test = new Test();
    test.Test1 = " 中文,D23 ";
    test.Test2 = "3DFD4234\"\"\"1232\"1S2";
    test.Test3 = "ASD1\",\"23,,,,213\r23F32";
    test.Test4 = "\r";
    test.Test5 = string.Empty;
    test.Test6 = "asd";

    // Test data
    var records = new List<Test> { test, test };

    // Write the CSV file
    /*
     * Paste the CSV write code from the later sections here
     */

    // Read the CSV file
    /*
     * Paste the CSV read code from the later sections here
     */

    Console.ReadLine();
}
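CsvHelper is a library for reading and writing CSV files. It supports reading and writing objects of custom classes.
It is the most-starred C# library for reading and writing CSV files on GitHub, and it is released under the MS-PL and Apache 2.0 open-source licenses.
Install CsvHelper with NuGet. The code for reading and writing CSV files is as follows: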
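// Namespaces used: CsvHelper, System.Globalization, System.IO, System.Linq

// Write the CSV file (header row plus all records)
using (var writer = new StreamWriter(path))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    csv.WriteRecords(records);
}

// Append to the existing file
using (var writer = new StreamWriter(path, true))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    foreach (var record in records)
    {
        csv.WriteRecord(record);
        // End the current row; without this the records run together on one line
        csv.NextRecord();
    }
}

// Read the CSV file
using (var reader = new StreamReader(path))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    records = csv.GetRecords<Test>().ToList();
    // To read record by record instead, call csv.Read() and csv.ReadHeader() once,
    // then call csv.Read() before each GetRecord:
    //records.Add(csv.GetRecord<Test>());
}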
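If all you want is a library that works out of the box, the article basically ends here.
To keep it separate from CsvHelper, we create a CsvFile class to hold the custom CSV read and write code. The complete source of the class is given at the end. The CsvFile class is defined as follows: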
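/// <summary>
/// Utility class for reading and writing CSV files
/// </summary>
public class CsvFile
{
    #region Write CSV files
    // Code goes here...
    #endregion

    #region Read CSV files (using TextFieldParser)
    // Code goes here...
    #endregion

    #region Read CSV files (using regular expressions)
    // Code goes here...
    #endregion
}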
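Writing CSV files based on the simplified standard
Following the simplified standard described earlier, the code for writing a CSV file is as follows: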
#region Write CSV files
// Convert a sequence of fields into one CSV record line
private static string FieldsToLine(IEnumerable<string> fields)
{
    if (fields == null) return string.Empty;
    fields = fields.Select(field =>
    {
        if (field == null) field = string.Empty;
        // Simplified standard: enclose every field in double quotes
        field = string.Format("\"{0}\"", field.Replace("\"", "\"\""));

        // Without the simplification: quote only when needed
        //field = field.Replace("\"", "\"\"");
        //if (field.IndexOfAny(new char[] { ',', '"', '\n', '\r' }) != -1)
        //{
        //    field = string.Format("\"{0}\"", field);
        //}
        return field;
    });
    string line = string.Format("{0}{1}", string.Join(",", fields), Environment.NewLine);
    return line;
}

// Default field conversion method: property names for the title row, property values otherwise
private static IEnumerable<string> GetObjFields<T>(T obj, bool isTitle) where T : class
{
    IEnumerable<string> fields;
    if (isTitle)
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.Name);
    }
    else
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.GetValue(obj)?.ToString());
    }
    return fields;
}

/// <summary>
/// Writes a CSV file; the first line is the title row by default
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="list">Data list</param>
/// <param name="path">File path</param>
/// <param name="append">Append records to an existing file</param>
/// <param name="func">Field conversion method</param>
/// <param name="defaultEncoding">File encoding (UTF-8 when null)</param>
public static void Write<T>(List<T> list, string path, bool append = true, Func<T, bool, IEnumerable<string>> func = null, Encoding defaultEncoding = null) where T : class
{
    if (list == null || list.Count == 0) return;
    if (defaultEncoding == null)
    {
        defaultEncoding = Encoding.UTF8;
    }
    if (func == null)
    {
        func = GetObjFields;
    }
    // Start a new file with a title row when the file is missing or when not appending
    if (!File.Exists(path) || !append)
    {
        var fields = func(list[0], true);
        string title = FieldsToLine(fields);
        File.WriteAllText(path, title, defaultEncoding);
    }
    using (StreamWriter sw = new StreamWriter(path, true, defaultEncoding))
    {
        list.ForEach(obj =>
        {
            var fields = func(obj, false);
            string line = FieldsToLine(fields);
            sw.Write(line);
        });
    }
}
#endregion
It is used as follows:
// Write the CSV file
// Use a custom field conversion method; this is also the conversion used for the complex CSV file at the start of the article
CsvFile.Write(records, path, true, new Func<Test, bool, IEnumerable<string>>((obj, isTitle) =>
{
    IEnumerable<string> fields;
    if (isTitle)
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.Name + Environment.NewLine + "\",\"");
    }
    else
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.GetValue(obj)?.ToString());
    }
    return fields;
}));

// Use the default field conversion method
//CsvFile.Write(records, path);
You can also use the default field conversion method:
CsvFile.Write(records, path);
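Parsing CSV files with TextFieldParser
TextFieldParser is a Visual Basic class for parsing CSV files (it lives in the Microsoft.VisualBasic.FileIO namespace). C# has no class of its own with similar functionality, but C# code can call the VB TextFieldParser to get the job done.
The code for parsing a CSV file with TextFieldParser is as follows: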
#region Read CSV files (using TextFieldParser)
/// <summary>
/// Reads a CSV file; the first line is treated as the title row by default
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="path">File path</param>
/// <param name="func">Field parsing rule</param>
/// <param name="defaultEncoding">File encoding</param>
/// <returns></returns>
public static List<T> Read<T>(string path, Func<string[], T> func, Encoding defaultEncoding = null) where T : class
{
    if (defaultEncoding == null)
    {
        defaultEncoding = Encoding.UTF8;
    }
    List<T> list = new List<T>();
    using (TextFieldParser parser = new TextFieldParser(path, defaultEncoding))
    {
        parser.TextFieldType = FieldType.Delimited;
        // Use a comma as the delimiter
        parser.SetDelimiters(",");
        // Keep leading and trailing spaces in fields
        parser.TrimWhiteSpace = false;
        bool isLine = false;
        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            if (isLine)
            {
                var obj = func(fields);
                if (obj != null) list.Add(obj);
            }
            else
            {
                // Skip the title row
                isLine = true;
            }
        }
    }
    return list;
}
#endregion
It is used as follows:
// Read the CSV file
records = CsvFile.Read(path, Test.Parse);
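Parsing CSV files with regular expressions
If you have a problem and decide to solve it with regular expressions, you now have two problems.
Regular expressions have a learning curve, and they are easily forgotten when not used regularly. Most of the problems they solve have requirements that rarely change, which means a stable, working regular expression can be handed down for generations.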
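The regular expression in this section comes from the simple loop-unrolling example in Chapter 6, "Crafting an Efficient Expression", of Mastering Regular Expressions (3rd edition). It is worth a read if you are interested.
Note: the book's final CSV-parsing regular expression is the Java version, which uses possessive quantifiers in place of atomic grouping. It is also the version most often found on Baidu. However, I could not get possessive quantifiers to work in C#, so the code here uses the atomic-grouping version instead. The two versions should not differ in performance.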
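As a reading aid, here is the atomic-grouping expression laid out piece by piece with `RegexOptions.IgnorePatternWhitespace`, applied to a single record. This is a sketch for illustration: the class and method names (`CsvRegexDemo`, `ParseLine`) are hypothetical, and it has only been checked against inputs like the one shown.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CsvRegexDemo
{
    // The field pattern, spread out with comments.
    // Inside a verbatim string, "" stands for one double-quote character.
    private static readonly Regex FieldReg = new Regex(@"
        \G (?: ^ | , )              # continue where the previous match ended: record start or a comma
        (?:
            ""                      # opening quote of a quoted field
            (                       # group 1: raw text between the quotes
                (?> [^""]* )        #   a run of non-quote characters (atomic group)
                (?> """" [^""]* )*  #   then any number of: doubled quote + non-quote run
            )
            ""                      # closing quote
          |
            ( [^"",]* )             # group 2: unquoted field without quotes or commas
        )", RegexOptions.IgnorePatternWhitespace);

    // Splits one complete CSV record into its fields.
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        for (Match m = FieldReg.Match(line); m.Success; m = m.NextMatch())
        {
            fields.Add(m.Groups[1].Success
                ? m.Groups[1].Value.Replace("\"\"", "\"")  // undo the quote doubling
                : m.Groups[2].Value);
        }
        return fields;
    }

    public static void Main()
    {
        // Record text: "a,b","say ""hi""",plain
        foreach (string field in ParseLine("\"a,b\",\"say \"\"hi\"\"\",plain"))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

The two atomic groups are the unrolled loop: a normal run of non-quote characters, followed by repetitions of "special case (a doubled quote) plus another normal run".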
The code for parsing a CSV file with regular expressions is as follows:
#region Read CSV files (using regular expressions)
/// <summary>
/// Reads a CSV file; the first line is treated as the title row by default
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="path">File path</param>
/// <param name="func">Field parsing rule</param>
/// <param name="defaultEncoding">File encoding</param>
/// <returns></returns>
public static List<T> Read_Regex<T>(string path, Func<string[], T> func, Encoding defaultEncoding = null) where T : class
{
    if (defaultEncoding == null)
    {
        defaultEncoding = Encoding.UTF8;
    }
    List<T> list = new List<T>();
    StringBuilder sbr = new StringBuilder(100);
    Regex lineReg = new Regex("\"");
    Regex fieldReg = new Regex("\\G(?:^|,)(?:\"((?>[^\"]*)(?>\"\"[^\"]*)*)\"|([^\",]*))");
    Regex quotesReg = new Regex("\"\"");

    bool isLine = false;
    string line = string.Empty;
    using (StreamReader sr = new StreamReader(path, defaultEncoding))
    {
        while (null != (line = ReadLine(sr)))
        {
            sbr.Append(line);
            string str = sbr.ToString();
            // A complete CSV record always contains an even number of double quotes
            if (lineReg.Matches(str).Count % 2 == 0)
            {
                if (isLine)
                {
                    var fields = ParseCsvLine(str, fieldReg, quotesReg).ToArray();
                    var obj = func(fields);
                    if (obj != null) list.Add(obj);
                }
                else
                {
                    // Skip the title row
                    isLine = true;
                }
                sbr.Clear();
            }
            else
            {
                // The record continues on the next line; restore the CRLF that ReadLine removed
                sbr.Append("\r\n");
            }
        }
    }
    if (sbr.Length > 0)
    {
        // Some text could not be parsed; report an error or ignore it
    }
    return list;
}

// Custom ReadLine: only \r\n counts as the end of a line
private static string ReadLine(StreamReader sr)
{
    StringBuilder sbr = new StringBuilder();
    char c;
    int cInt;
    while (-1 != (cInt = sr.Read()))
    {
        c = (char)cInt;
        if (c == '\n' && sbr.Length > 0 && sbr[sbr.Length - 1] == '\r')
        {
            sbr.Remove(sbr.Length - 1, 1);
            return sbr.ToString();
        }
        else
        {
            sbr.Append(c);
        }
    }
    return sbr.Length > 0 ? sbr.ToString() : null;
}

// Split one complete CSV record into fields
private static List<string> ParseCsvLine(string line, Regex fieldReg, Regex quotesReg)
{
    var fieldMatch = fieldReg.Match(line);
    List<string> fields = new List<string>();
    while (fieldMatch.Success)
    {
        string field;
        if (fieldMatch.Groups[1].Success)
        {
            // Quoted field: turn each doubled quote back into a single quote
            field = quotesReg.Replace(fieldMatch.Groups[1].Value, "\"");
        }
        else
        {
            field = fieldMatch.Groups[2].Value;
        }
        fields.Add(field);
        fieldMatch = fieldMatch.NextMatch();
    }
    return fields;
}
#endregion
It is used as follows:
// Read the CSV file
records = CsvFile.Read_Regex(path, Test.Parse);
So far I have not found any bugs in the regex-based parsing, but I still do not recommend using it.
The complete CsvFile utility class
The complete code of the CsvFile class is as follows:
using Microsoft.VisualBasic.FileIO;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApp4
{
    /// <summary>
    /// Utility class for reading and writing CSV files
    /// </summary>
    public class CsvFile
    {
        #region Write CSV files
        // Convert a sequence of fields into one CSV record line
        private static string FieldsToLine(IEnumerable<string> fields)
        {
            if (fields == null) return string.Empty;
            fields = fields.Select(field =>
            {
                if (field == null) field = string.Empty;
                // Enclose every field in double quotes
                field = string.Format("\"{0}\"", field.Replace("\"", "\"\""));

                // Without the simplification: quote only when needed
                //field = field.Replace("\"", "\"\"");
                //if (field.IndexOfAny(new char[] { ',', '"', '\n', '\r' }) != -1)
                //{
                //    field = string.Format("\"{0}\"", field);
                //}
                return field;
            });
            string line = string.Format("{0}{1}", string.Join(",", fields), Environment.NewLine);
            return line;
        }

        // Default field conversion method
        private static IEnumerable<string> GetObjFields<T>(T obj, bool isTitle) where T : class
        {
            IEnumerable<string> fields;
            if (isTitle)
            {
                fields = obj.GetType().GetProperties().Select(pro => pro.Name);
            }
            else
            {
                fields = obj.GetType().GetProperties().Select(pro => pro.GetValue(obj)?.ToString());
            }
            return fields;
        }

        /// <summary>
        /// Writes a CSV file; the first line is the title row by default
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="list">Data list</param>
        /// <param name="path">File path</param>
        /// <param name="append">Append records to an existing file</param>
        /// <param name="func">Field conversion method</param>
        /// <param name="defaultEncoding">File encoding (UTF-8 when null)</param>
        public static void Write<T>(List<T> list, string path, bool append = true, Func<T, bool, IEnumerable<string>> func = null, Encoding defaultEncoding = null) where T : class
        {
            if (list == null || list.Count == 0) return;
            if (defaultEncoding == null)
            {
                defaultEncoding = Encoding.UTF8;
            }
            if (func == null)
            {
                func = GetObjFields;
            }
            // Start a new file with a title row when the file is missing or when not appending
            if (!File.Exists(path) || !append)
            {
                var fields = func(list[0], true);
                string title = FieldsToLine(fields);
                File.WriteAllText(path, title, defaultEncoding);
            }
            using (StreamWriter sw = new StreamWriter(path, true, defaultEncoding))
            {
                list.ForEach(obj =>
                {
                    var fields = func(obj, false);
                    string line = FieldsToLine(fields);
                    sw.Write(line);
                });
            }
        }
        #endregion

        #region Read CSV files (using TextFieldParser)
        /// <summary>
        /// Reads a CSV file; the first line is treated as the title row by default
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="path">File path</param>
        /// <param name="func">Field parsing rule</param>
        /// <param name="defaultEncoding">File encoding</param>
        /// <returns></returns>
        public static List<T> Read<T>(string path, Func<string[], T> func, Encoding defaultEncoding = null) where T : class
        {
            if (defaultEncoding == null)
            {
                defaultEncoding = Encoding.UTF8;
            }
            List<T> list = new List<T>();
            using (TextFieldParser parser = new TextFieldParser(path, defaultEncoding))
            {
                parser.TextFieldType = FieldType.Delimited;
                // Use a comma as the delimiter
                parser.SetDelimiters(",");
                // Keep leading and trailing spaces in fields
                parser.TrimWhiteSpace = false;
                bool isLine = false;
                while (!parser.EndOfData)
                {
                    string[] fields = parser.ReadFields();
                    if (isLine)
                    {
                        var obj = func(fields);
                        if (obj != null) list.Add(obj);
                    }
                    else
                    {
                        // Skip the title row
                        isLine = true;
                    }
                }
            }
            return list;
        }
        #endregion

        #region Read CSV files (using regular expressions)
        /// <summary>
        /// Reads a CSV file; the first line is treated as the title row by default
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="path">File path</param>
        /// <param name="func">Field parsing rule</param>
        /// <param name="defaultEncoding">File encoding</param>
        /// <returns></returns>
        public static List<T> Read_Regex<T>(string path, Func<string[], T> func, Encoding defaultEncoding = null) where T : class
        {
            if (defaultEncoding == null)
            {
                defaultEncoding = Encoding.UTF8;
            }
            List<T> list = new List<T>();
            StringBuilder sbr = new StringBuilder(100);
            Regex lineReg = new Regex("\"");
            Regex fieldReg = new Regex("\\G(?:^|,)(?:\"((?>[^\"]*)(?>\"\"[^\"]*)*)\"|([^\",]*))");
            Regex quotesReg = new Regex("\"\"");

            bool isLine = false;
            string line = string.Empty;
            using (StreamReader sr = new StreamReader(path, defaultEncoding))
            {
                while (null != (line = ReadLine(sr)))
                {
                    sbr.Append(line);
                    string str = sbr.ToString();
                    // A complete CSV record always contains an even number of double quotes
                    if (lineReg.Matches(str).Count % 2 == 0)
                    {
                        if (isLine)
                        {
                            var fields = ParseCsvLine(str, fieldReg, quotesReg).ToArray();
                            var obj = func(fields);
                            if (obj != null) list.Add(obj);
                        }
                        else
                        {
                            // Skip the title row
                            isLine = true;
                        }
                        sbr.Clear();
                    }
                    else
                    {
                        // The record continues on the next line; restore the CRLF that ReadLine removed
                        sbr.Append("\r\n");
                    }
                }
            }
            if (sbr.Length > 0)
            {
                // Some text could not be parsed; report an error or ignore it
            }
            return list;
        }

        // Custom ReadLine: only \r\n counts as the end of a line
        private static string ReadLine(StreamReader sr)
        {
            StringBuilder sbr = new StringBuilder();
            char c;
            int cInt;
            while (-1 != (cInt = sr.Read()))
            {
                c = (char)cInt;
                if (c == '\n' && sbr.Length > 0 && sbr[sbr.Length - 1] == '\r')
                {
                    sbr.Remove(sbr.Length - 1, 1);
                    return sbr.ToString();
                }
                else
                {
                    sbr.Append(c);
                }
            }
            return sbr.Length > 0 ? sbr.ToString() : null;
        }

        // Split one complete CSV record into fields
        private static List<string> ParseCsvLine(string line, Regex fieldReg, Regex quotesReg)
        {
            var fieldMatch = fieldReg.Match(line);
            List<string> fields = new List<string>();
            while (fieldMatch.Success)
            {
                string field;
                if (fieldMatch.Groups[1].Success)
                {
                    // Quoted field: turn each doubled quote back into a single quote
                    field = quotesReg.Replace(fieldMatch.Groups[1].Value, "\"");
                }
                else
                {
                    field = fieldMatch.Groups[2].Value;
                }
                fields.Add(field);
                fieldMatch = fieldMatch.NextMatch();
            }
            return fields;
        }
        #endregion
    }
}
It is used as follows:
// Write the CSV file
CsvFile.Write(records, path, true, new Func<Test, bool, IEnumerable<string>>((obj, isTitle) =>
{
    IEnumerable<string> fields;
    if (isTitle)
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.Name + Environment.NewLine + "\",\"");
    }
    else
    {
        fields = obj.GetType().GetProperties().Select(pro => pro.GetValue(obj)?.ToString());
    }
    return fields;
}));

// Read the CSV file (TextFieldParser)
records = CsvFile.Read(path, Test.Parse);

// Read the CSV file (regular expressions)
records = CsvFile.Read_Regex(path, Test.Parse);