A complete solution to inconsistent Excel data types when connecting through OLE DB
Using Microsoft.Jet.OLEDB.4.0 to connect to Excel and read data is very efficient compared with the traditional COM approach. Compared with operating Excel through COM, however, it suffers from data type conversion problems.
When you read Excel data through OLE DB, the driver has to determine the data type of each column. The connection string typically used by default is:
    string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFile + ";Extended Properties='Excel 8.0;'";
When you connect to Excel with the connection string above, you may run into inconsistent data types. "Inconsistent" here means that a single column contains values of several types, such as floating-point numbers, strings, and dates. When that happens, some of the values are read as empty, or the read even fails with an exception such as "Illegal date format". The obvious idea is to read everything as character data, but the type used for reading is decided by OLE DB, not by us; at least so far I have not found a way to control the output data type. I tried applying the Convert and Cast functions to the output columns, but OLE DB does not support these functions when it is connected to Excel. The problem therefore has to be approached from another angle. I searched the Internet for solutions, and the most comprehensive one I found is http://www.douban.com/note/18510346/. The following compares the methods found on the web for this problem:
Solution: COM
Description: Access Excel through the Excel COM interface.
Disadvantages: Unmanaged, resources are hard to release, and it is inefficient.

Solution: Add IMEX=1 to the connection string
Description: Build a connection string such as:

    string strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFile + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'";

HDR indicates whether the first row of the sheet holds the field names. With HDR=Yes, the first row is used as the field names and the data starts at the second row. With HDR=No, the driver assigns default field names (F1, F2, F3, and so on) and the data starts at the first row. IMEX tells the driver which mode to use for the Excel file; the values 0, 1, and 2 stand for export, import, and mixed mode respectively. Setting IMEX=1 is meant to force mixed data to be converted to text, but the setting is not reliable: it only takes effect when at least one of the first 8 rows of a column contains text, so it only slightly changes how the driver picks the best data type from the first 8 rows. For example, if the first 8 rows of a column are all pure numbers, the column is still typed as numeric, and text values in later rows still come back empty. (Excerpted from: http://www.douban.com/note/18510346/)
Disadvantages: Whether a column is treated as character data is decided from the first 8 rows only.

Solution: IMEX=1 together with the registry value TypeGuessRows
Description: The TypeGuessRows value determines how many leading rows the ISAM driver samples to decide the data type; the default is 8. You can change the number of sampled rows by modifying this value under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel in the registry. However, this does not fundamentally solve the problem. Even with IMEX set to 1 and TypeGuessRows set to a larger value such as 1000, suppose the sheet has 1001 rows, the first 1000 rows of a column are all pure numbers, and row 1001 of that column is text: the ISAM driver still returns that text value as empty. (Excerpted from: http://www.douban.com/note/18510346/)
Disadvantages: Modifying the registry is inconvenient, and the number of rows in a sheet cannot be known in advance, so the approach is still limited by the row count.

Solution: Convert Excel to CSV plain text first
Description:
(1) Before reading the text data of an .xls file, convert it to .csv format; saving it in this format directly from Excel is enough. A CSV file, also known as a comma-delimited file, is a plain text file that separates data columns with ",". Note that a CSV file can also be read through OLE DB or ODBC, but the ISAM mechanism applies there too, so reading it that way leads back to the same lost data.
(2) Open the file as an ordinary text file, read the first line, split it on "," to obtain the field names, and create the corresponding fields in a DataTable. All fields can be created uniformly as String.
(3) Read the data line by line, split each line on "," to get the value of each column, and fill the corresponding fields of the DataTable.
Brief code:

    // Requires: using System.Data; using System.IO;
    string line;
    string[] split = null;
    DataTable table = new DataTable("Auto");
    DataRow row = null;
    StreamReader sr = new StreamReader("C:/auto.csv", System.Text.Encoding.Default);
    // Create a string column for each field name in the first line
    line = sr.ReadLine();
    split = line.Split(',');
    foreach (string colName in split) {
        table.Columns.Add(colName, typeof(string));
    }
    // Fill in the data table
    int j = 0;
    while ((line = sr.ReadLine()) != null) {
        j = 0;
        row = table.NewRow();
        split = line.Split(',');
        foreach (string value in split) {
            row[j] = value;
            j++;
        }
        table.Rows.Add(row);
    }
    sr.Close();
    // Display the data
    dataGrid1.DataSource = table.DefaultView;
    dataGrid1.DataBind();

(Excerpted from: http://www.douban.com/note/18510346/)
Disadvantages: The Excel file must be converted to a CSV file beforehand.
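
The brief CSV code above splits each line on every comma, so a quoted field that itself contains a comma would be broken into two columns. The following is a minimal sketch of one way around that, not part of the original article: the class CsvSketch and its methods SplitLine and ReadAllAsText are hypothetical names. It assumes one record per line (no line breaks inside quoted fields) and unique header names, and it still creates every column as a string.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not from the original article.
public static class CsvSketch
{
    // Splits one CSV line into fields, honoring double-quoted fields
    // and "" as an escaped quote. Assumes no line breaks inside fields.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote inside a quoted field
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }

    // Reads CSV text into a DataTable whose columns are all strings.
    public static DataTable ReadAllAsText(TextReader reader)
    {
        var table = new DataTable("Auto");
        string line = reader.ReadLine();
        if (line == null) return table;

        foreach (string name in SplitLine(line))
            table.Columns.Add(name, typeof(string));

        while ((line = reader.ReadLine()) != null)
        {
            List<string> values = SplitLine(line);
            DataRow row = table.NewRow();
            // Values beyond the header width are ignored;
            // missing values stay DBNull.
            for (int j = 0; j < values.Count && j < table.Columns.Count; j++)
                row[j] = values[j];
            table.Rows.Add(row);
        }
        return table;
    }
}
```

It could be called with any TextReader, for example `CsvSketch.ReadAllAsText(new StreamReader("C:/auto.csv", Encoding.Default))`.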
This article provides a more convenient approach, on one condition: the first row must be of character type, whether it serves as the field names or as the first row of data. The reason will become clear below. First change the connection string to:
    string strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFile + ";Extended Properties='Excel 8.0;HDR=No;IMEX=1'";
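
The connection strings in this article differ only in their HDR and IMEX options. As a small sketch, the hypothetical helper ExcelConnSketch.Build below (a name introduced here, not from the original article) makes those two options explicit parameters. It assumes the file path contains no semicolons or quotes, since it does no escaping.

```csharp
using System;

// Hypothetical helper for illustration; not from the original article.
public static class ExcelConnSketch
{
    // Builds a Jet 4.0 connection string for an .xls file.
    // firstRowIsHeader maps to HDR=Yes/No; imex is the IMEX value (0, 1, or 2).
    // Assumption: excelFile needs no escaping (no ';' or quote characters).
    public static string Build(string excelFile, bool firstRowIsHeader, int imex)
    {
        if (string.IsNullOrEmpty(excelFile))
            throw new ArgumentException("A file path is required.", "excelFile");
        if (imex < 0 || imex > 2)
            throw new ArgumentOutOfRangeException("imex", "IMEX must be 0, 1, or 2.");

        string extended = string.Format(
            "Excel 8.0;HDR={0};IMEX={1}",
            firstRowIsHeader ? "Yes" : "No",
            imex);

        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFile +
               ";Extended Properties='" + extended + "'";
    }
}
```

With this helper, `ExcelConnSketch.Build(excelFile, false, 1)` produces the same HDR=No;IMEX=1 string as the one written out by hand.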
HDR is set to No so that the first row is read as data. With IMEX=1, the driver decides each column's data type from the first 8 rows and converts mixed data to text when character data is present. This is why the first row must be of character type: because it is text, every column is forced to a character type, and it no longer matters what kinds of values appear further down the column. What remains is to reset the field names from the first record and then delete that record once the data has been fetched. The code is as follows:
    // Requires: using System.Data; using System.Data.OleDb;
    // conn is an OleDbConnection opened with strConn; sheetName is the worksheet name.
    DataTable dt = new DataTable();

    using (OleDbCommand cmd = new OleDbCommand()) {
        cmd.Connection = conn;
        cmd.CommandType = CommandType.Text;
        cmd.CommandTimeout = 6;
        cmd.CommandText = string.Format("select * from [{0}$]", sheetName);

        OleDbDataAdapter adapter = new OleDbDataAdapter(cmd);
        adapter.Fill(dt);
    }

    if (dt.Rows.Count > 0) {
        DataRow dr = dt.Rows[0];

        // Use the values of the first record as the field names
        for (int col = 0; col < dt.Columns.Count; col++) {
            dt.Columns[col].ColumnName = dr[col].ToString();
        }

        // Remove the first record, which held the field names
        dt.Rows[0].Delete();
        dt.AcceptChanges();
    }
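
The renaming loop above assigns each first-row value directly to ColumnName. A DataTable does not accept an empty column name or a name that another column already uses, so a blank or repeated header cell may raise an exception. The sketch below is one possible way to guard against that; the class HeaderSketch, the method PromoteFirstRow, and the fallback naming scheme ("Column3", "Name_2") are assumptions introduced here, not part of the original article.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper for illustration; not from the original article.
public static class HeaderSketch
{
    // Renames the columns from the first row's values, then removes that row.
    // Blank cells fall back to "Column<n>"; repeated names get "_2", "_3", ...
    public static void PromoteFirstRow(DataTable dt)
    {
        if (dt == null) throw new ArgumentNullException("dt");
        if (dt.Rows.Count == 0) return;

        DataRow header = dt.Rows[0];
        var used = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var names = new string[dt.Columns.Count];

        for (int col = 0; col < dt.Columns.Count; col++)
        {
            // DBNull.Value.ToString() yields an empty string.
            string baseName = header[col].ToString().Trim();
            if (baseName.Length == 0) baseName = "Column" + (col + 1);

            string name = baseName;
            int suffix = 2;
            while (!used.Add(name))
                name = baseName + "_" + suffix++;
            names[col] = name;
        }

        // First pass: temporary unique names, so a new name can never
        // collide with an old name that has not been replaced yet.
        for (int col = 0; col < dt.Columns.Count; col++)
            dt.Columns[col].ColumnName = "__tmp_" + Guid.NewGuid().ToString("N");

        // Second pass: final names.
        for (int col = 0; col < dt.Columns.Count; col++)
            dt.Columns[col].ColumnName = names[col];

        dt.Rows[0].Delete();
        dt.AcceptChanges();
    }
}
```

After `adapter.Fill(dt)`, a call to `HeaderSketch.PromoteFirstRow(dt)` would stand in for the `if (dt.Rows.Count > 0) { ... }` block.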