(1)OleDB方式
優(yōu)點(diǎn):將Excel直接當(dāng)做數(shù)據(jù)源處理,通過SQL直接讀取內(nèi)容,讀取速度較快。
缺點(diǎn):讀取數(shù)據(jù)方式不夠靈活,無法直接讀取某一個單元格,只有將整個Sheet頁讀取出來后(結(jié)果為Datatable)再在Datatable中根據(jù)行列數(shù)來獲取指定的值。
當(dāng)Excel數(shù)據(jù)量很大時。會非常占用內(nèi)存,當(dāng)內(nèi)存不夠時會拋出內(nèi)存溢出的異常。
讀取代碼如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
|
public DataTable GetExcelTableByOleDB( string strExcelPath, string tableName) { try { DataTable dtExcel = new DataTable(); //數(shù)據(jù)表 DataSet ds = new DataSet(); //獲取文件擴(kuò)展名 string strExtension = System.IO.Path.GetExtension(strExcelPath); string strFileName = System.IO.Path.GetFileName(strExcelPath); //Excel的連接 OleDbConnection objConn = null ; switch (strExtension) { case ".xls" : objConn = new OleDbConnection( "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 8.0;HDR=NO;IMEX=1;\"" ); break ; case ".xlsx" : objConn = new OleDbConnection( "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1;\"" ); break ; default : objConn = null ; break ; } if (objConn == null ) { return null ; } objConn.Open(); //獲取Excel中所有Sheet表的信息 //System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null); //獲取Excel的第一個Sheet表名 //string tableName = schemaTable.Rows[0][2].ToString().Trim(); string strSql = "select * from [" + tableName + "]" ; //獲取Excel指定Sheet表中的信息 OleDbCommand objCmd = new OleDbCommand(strSql, objConn); OleDbDataAdapter myData = new OleDbDataAdapter(strSql, objConn); myData.Fill(ds, tableName); //填充數(shù)據(jù) objConn.Close(); //dtExcel即為excel文件中指定表中存儲的信息 dtExcel = ds.Tables[tableName]; return dtExcel; } catch { return null ; } } |
下面說明一下連接字符串
HDR=Yes,這代表第一行是標(biāo)題,不做為數(shù)據(jù)使用(但是我在實(shí)際使用中,如果第一行存在復(fù)雜數(shù)值,那么讀取得到的Datatable列標(biāo)題會自動設(shè)置為F1、F2等方式命名,與實(shí)際應(yīng)用不符,所以當(dāng)時是通過HDR=No方式將所有內(nèi)容讀取到Datatable中,然后手動將第一行設(shè)置成標(biāo)題的);IMEX ( IMport EXport mode )設(shè)置
IMEX 有三種模式:
0 is Export mode
1 is Import mode
2 is Linked mode (full update capabilities)
我這里特別要說明的就是 IMEX 參數(shù)了,因?yàn)椴煌哪J酱碇煌淖x寫行為:
當(dāng) IMEX=0 時為“匯出模式”,這個模式開啟的 Excel 檔案只能用來做“寫入”用途。
當(dāng) IMEX=1 時為“匯入模式”,這個模式開啟的 Excel 檔案只能用來做“讀取”用途。
當(dāng) IMEX=2 時為“鏈接模式”,這個模式開啟的 Excel 檔案可同時支援“讀取”與“寫入”用途。
---------------------------------
另外,讀取Excel2007版本的文件時,版本應(yīng)該從8.0改為12.0,同時驅(qū)動不能再用Jet,而應(yīng)該用ACE。負(fù)責(zé)會造成“找不到可安裝的 ISAM”的錯誤。
---------------------------------
在網(wǎng)上還發(fā)現(xiàn)采用這種方式存在取出的Sheet表的個數(shù)多于實(shí)際Excel表中的Sheet表個數(shù)的情況,其原因有二:
1. 取出的名稱中,包括了XL命名管理器中的名稱(參見XL2007的公式--命名管理器, 快捷鍵Crtl+F3);
2. 取出的名稱中,包括了FilterDatabase后綴的, 這是XL用來記錄Filter范圍的。
對于第一點(diǎn)比較簡單, 刪除已有命名管理器中的內(nèi)容即可;第二點(diǎn)處理起來比較麻煩, Filter刪除后這些名稱依然保留著,簡單的做法是新增Sheet然后將原Sheet Copy進(jìn)去。但實(shí)際情況并不能為每個Excel做以上檢查。下面給出了過濾的方案。(此問題我們有驗(yàn)證過,大家自己驗(yàn)證一下吧)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
|
//objConn為讀取Excel的鏈接,下面通過過濾來獲取有效的Sheet頁名稱集合 System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null ); List< string > lstSheetNames = new List< string >(); for ( int i = 0; i < schemaTable.Rows.Count; i++) { string strSheetName = ( string )dtSheetName.Rows[i][ "TABLE_NAME" ]; if (strSheetName.Contains( "$" ) && !strSheetName.Replace( "'" , "" ).EndsWith( "$" )) { //過濾無效SheetName完畢.... continue ; } if (lstSheetNames != null && !lstSheetNames.Contains(strSheetName)) lstSheetNames.Add(strSheetName); } |
因?yàn)樽x取出來無效SheetName一般情況最后一個字符都不會是$。如果SheetName有一些特殊符號,讀取出來的SheetName會自動加上單引號。比如在Excel中將SheetName編輯成MySheet(1),此時讀取出來的SheetName就為:'MySheet(1)$',所以判斷最后一個字符是不是$之前最好過濾一下單引號。
---------------------------------
(2)Com組件的方式(通過添加 Microsoft.Office.Interop.Excel引用實(shí)現(xiàn))
優(yōu)點(diǎn):能夠非常靈活的讀取Excel中的數(shù)據(jù),用戶可以靈活的調(diào)用各種函數(shù)進(jìn)行處理。
缺點(diǎn):基于單元格的處理,讀取速度較慢,對于數(shù)據(jù)量較大的文件最好不要使用此種方式讀取。
需要添加相應(yīng)的DLL引用,必須存在此引用才可使用,如果是Web站點(diǎn)部署在IIS上時,還需要服務(wù)器機(jī)子已安裝了Excel,有時候還需要為配置IIS權(quán)限。
讀取代碼如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
|
private Stopwatch wath = new Stopwatch(); /// <summary> /// 使用COM讀取Excel /// </summary> /// <param name="excelFilePath">路徑</param> /// <returns>DataTabel</returns> public System.Data.DataTable GetExcelData( string excelFilePath) { Excel.Application app = new Excel.Application(); Excel.Sheets sheets; Excel.Workbook workbook = null ; object oMissiong = System.Reflection.Missing.Value; System.Data.DataTable dt = new System.Data.DataTable(); wath.Start(); try { if (app == null ) { return null ; } workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong); //將數(shù)據(jù)讀入到DataTable中——Start sheets = workbook.Worksheets; Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1); //讀取第一張表 if (worksheet == null ) return null ; string cellContent; int iRowCount = worksheet.UsedRange.Rows.Count; int iColCount = worksheet.UsedRange.Columns.Count; Excel.Range range; //負(fù)責(zé)列頭Start DataColumn dc; int ColumnID = 1; range = (Excel.Range)worksheet.Cells[1, 1]; while (range.Text.ToString().Trim() != "" ) { dc = new DataColumn(); dc.DataType = System.Type.GetType( "System.String" ); dc.ColumnName = range.Text.ToString().Trim(); dt.Columns.Add(dc); range = (Excel.Range)worksheet.Cells[1, ++ColumnID]; } //End for ( int iRow = 2; iRow <= iRowCount; iRow++) { DataRow dr = dt.NewRow(); for ( int iCol = 1; iCol <= iColCount; iCol++) { range = (Excel.Range)worksheet.Cells[iRow, iCol]; cellContent = (range.Value2 == null ) ? "" : range.Text.ToString(); dr[iCol - 1] = cellContent; } dt.Rows.Add(dr); } wath.Stop(); TimeSpan ts = wath.Elapsed; //將數(shù)據(jù)讀入到DataTable中——End return dt; } catch { return null ; } finally { workbook.Close( false , oMissiong, oMissiong); System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook); workbook = null ; app.Workbooks.Close(); app.Quit(); System.Runtime.InteropServices.Marshal.ReleaseComObject(app); app = null ; GC.Collect(); GC.WaitForPendingFinalizers(); } } /// <summary> /// 使用COM,多線程讀取Excel(1 主線程、4 副線程) /// </summary> /// <param name="excelFilePath">路徑</param> /// <returns>DataTabel</returns> public System.Data.DataTable ThreadReadExcel( string excelFilePath) { Excel.Application app = new Excel.Application(); Excel.Sheets sheets = null ; Excel.Workbook workbook = null ; object oMissiong = System.Reflection.Missing.Value; System.Data.DataTable dt = new System.Data.DataTable(); wath.Start(); try { if (app == null ) { return null ; } workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong); //將數(shù)據(jù)讀入到DataTable中——Start sheets = workbook.Worksheets; Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1); //讀取第一張表 if (worksheet == null ) return null ; string cellContent; int iRowCount = worksheet.UsedRange.Rows.Count; int iColCount = worksheet.UsedRange.Columns.Count; Excel.Range range; //負(fù)責(zé)列頭Start DataColumn dc; int ColumnID = 1; range = (Excel.Range)worksheet.Cells[1, 1]; while (iColCount >= ColumnID) { dc = new DataColumn(); dc.DataType = System.Type.GetType( "System.String" ); string strNewColumnName = range.Text.ToString().Trim(); if (strNewColumnName.Length == 0) strNewColumnName = "_1" ; //判斷列名是否重復(fù) for ( int i = 1; i < ColumnID; i++) { if (dt.Columns[i - 1].ColumnName == strNewColumnName) strNewColumnName = strNewColumnName + "_1" ; } dc.ColumnName = strNewColumnName; dt.Columns.Add(dc); range = (Excel.Range)worksheet.Cells[1, ++ColumnID]; } //End //數(shù)據(jù)大于500條,使用多進(jìn)程進(jìn)行讀取數(shù)據(jù) if (iRowCount - 1 > 500) { //開始多線程讀取數(shù)據(jù) //新建線程 int b2 = (iRowCount - 1) / 10; DataTable dt1 = new DataTable( "dt1" ); dt1 = dt.Clone(); SheetOptions sheet1thread = new SheetOptions(worksheet, iColCount, 2, b2 + 1, dt1); Thread othread1 = new Thread( new ThreadStart(sheet1thread.SheetToDataTable)); othread1.Start(); //阻塞 1 毫秒,保證第一個讀取 dt1 Thread.Sleep(1); DataTable dt2 = new DataTable( "dt2" ); dt2 = dt.Clone(); SheetOptions sheet2thread = new SheetOptions(worksheet, iColCount, b2 + 2, b2 * 2 + 1, dt2); Thread othread2 = new Thread( new ThreadStart(sheet2thread.SheetToDataTable)); othread2.Start(); DataTable dt3 = new DataTable( "dt3" ); dt3 = dt.Clone(); SheetOptions sheet3thread = new SheetOptions(worksheet, iColCount, b2 * 2 + 2, b2 * 3 + 1, dt3); Thread othread3 = new Thread( new ThreadStart(sheet3thread.SheetToDataTable)); othread3.Start(); DataTable dt4 = new DataTable( "dt4" ); dt4 = dt.Clone(); SheetOptions sheet4thread = new SheetOptions(worksheet, iColCount, b2 * 3 + 2, b2 * 4 + 1, dt4); Thread othread4 = new Thread( new ThreadStart(sheet4thread.SheetToDataTable)); othread4.Start(); //主線程讀取剩余數(shù)據(jù) for ( int iRow = b2 * 4 + 2; iRow <= iRowCount; iRow++) { DataRow dr = dt.NewRow(); for ( int iCol = 1; iCol <= iColCount; iCol++) { range = (Excel.Range)worksheet.Cells[iRow, iCol]; cellContent = (range.Value2 == null ) ? "" : range.Text.ToString(); dr[iCol - 1] = cellContent; } dt.Rows.Add(dr); } othread1.Join(); othread2.Join(); othread3.Join(); othread4.Join(); //將多個線程讀取出來的數(shù)據(jù)追加至 dt1 后面 foreach (DataRow dr in dt.Rows) dt1.Rows.Add(dr.ItemArray); dt.Clear(); dt.Dispose(); foreach (DataRow dr in dt2.Rows) dt1.Rows.Add(dr.ItemArray); dt2.Clear(); dt2.Dispose(); foreach (DataRow dr in dt3.Rows) dt1.Rows.Add(dr.ItemArray); dt3.Clear(); dt3.Dispose(); foreach (DataRow dr in dt4.Rows) dt1.Rows.Add(dr.ItemArray); dt4.Clear(); dt4.Dispose(); return dt1; } else { for ( int iRow = 2; iRow <= iRowCount; iRow++) { DataRow dr = dt.NewRow(); for ( int iCol = 1; iCol <= iColCount; iCol++) { range = (Excel.Range)worksheet.Cells[iRow, iCol]; cellContent = (range.Value2 == null ) ? "" : range.Text.ToString(); dr[iCol - 1] = cellContent; } dt.Rows.Add(dr); } } wath.Stop(); TimeSpan ts = wath.Elapsed; //將數(shù)據(jù)讀入到DataTable中——End return dt; } catch { return null ; } finally { workbook.Close( false , oMissiong, oMissiong); System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook); System.Runtime.InteropServices.Marshal.ReleaseComObject(sheets); workbook = null ; app.Workbooks.Close(); app.Quit(); System.Runtime.InteropServices.Marshal.ReleaseComObject(app); app = null ; GC.Collect(); GC.WaitForPendingFinalizers(); } } |
(3)NPOI方式讀取Excel(此方法未經(jīng)過測試)
NPOI 是 POI 項(xiàng)目的 .NET 版本。POI是一個開源的Java讀寫Excel、WORD等微軟OLE2組件文檔的項(xiàng)目。使用 NPOI 你就可以在沒有安裝 Office 或者相應(yīng)環(huán)境的機(jī)器上對 WORD/EXCEL 文檔進(jìn)行讀寫。
優(yōu)點(diǎn):讀取Excel速度較快,讀取方式操作靈活性
缺點(diǎn):需要下載相應(yīng)的插件并添加到系統(tǒng)引用當(dāng)中。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
|
/// <summary> /// 將excel中的數(shù)據(jù)導(dǎo)入到DataTable中 /// </summary> /// <param name="sheetName">excel工作薄sheet的名稱</param> /// <param name="isFirstRowColumn">第一行是否是DataTable的列名</param> /// <returns>返回的DataTable</returns> public DataTable ExcelToDataTable( string sheetName, bool isFirstRowColumn) { ISheet sheet = null ; DataTable data = new DataTable(); int startRow = 0; try { fs = new FileStream(fileName, FileMode.Open, FileAccess.Read); if (fileName.IndexOf( ".xlsx" ) > 0) // 2007版本 workbook = new XSSFWorkbook(fs); else if (fileName.IndexOf( ".xls" ) > 0) // 2003版本 workbook = new HSSFWorkbook(fs); if (sheetName != null ) { sheet = workbook.GetSheet(sheetName); } else { sheet = workbook.GetSheetAt(0); } if (sheet != null ) { IRow firstRow = sheet.GetRow(0); int cellCount = firstRow.LastCellNum; //一行最后一個cell的編號 即總的列數(shù) if (isFirstRowColumn) { for ( int i = firstRow.FirstCellNum; i < cellCount; ++i) { DataColumn column = new DataColumn(firstRow.GetCell(i).StringCellValue); data.Columns.Add(column); } startRow = sheet.FirstRowNum + 1; } else { startRow = sheet.FirstRowNum; } //最后一列的標(biāo)號 int rowCount = sheet.LastRowNum; for ( int i = startRow; i <= rowCount; ++i) { IRow row = sheet.GetRow(i); if (row == null ) continue ; //沒有數(shù)據(jù)的行默認(rèn)是null DataRow dataRow = data.NewRow(); for ( int j = row.FirstCellNum; j < cellCount; ++j) { if (row.GetCell(j) != null ) //同理,沒有數(shù)據(jù)的單元格都默認(rèn)是null dataRow[j] = row.GetCell(j).ToString(); } data.Rows.Add(dataRow); } } return data; } catch (Exception ex) { Console.WriteLine( "Exception: " + ex.Message); return null ; } } |
下面是一些相關(guān)的文章,大家可以參考下