A bug when you read an Excel table from ADO (OLE DB)

Source: Internet
Author: User
Tags ole

if we have one of the following Excel tables, for example:
 
now you want to read its contents using a C # program:
usingSystem;
usingSystem.Data.OleDb;

namespaceSkyiv.Ben.Test
{
Sealed classExceltest
{
Static voidMain ()
{
Try
{
using(OleDbConnection conn= NewOleDbConnection ("provider=microsoft.jet.oledb.4.0; Extended properties=\"Excel8.0; HDR=yes\";D ata source=std"))
{
Conn. Open ();
OleDbCommand Comm= NewOleDbCommand ("SELECT [Id],[name] from [sheet1$]", conn);
using(OleDbDataReader R=Comm. ExecuteReader ())
{
while(R.read ())
{
intID=Convert.ToInt32 (R.getvalue (0));
stringname=convert.tostring (R.getvalue (1));
Console.WriteLine ("{0}:\t{1}", ID, name);
}
}
}
}
Catch(Exception ex)
{
Console.WriteLine (" error: " + ex.) Message);
}
}
}
}

The results of the execution of the program are as follows:

It appears that "DBNull" was returned when reading the cell "B3" instead of the correct "1768".

Using an essay published in my Blog park, "Database Gadget (C #)", gives the OleDb.exe a look:

Indeed, there are 2 "DBNull" values in the Execl table, and the tool looks at its structure:

The data type of the first column "ID" is found to be "double" and the data type of the second column "NAME" is "string".

After analysis. "DBNull" is in a cell whose data type does not match the data type of the column today.

Looks like. The crux of the problem is here. We know that Excel tables are not real databases, and unlike real databases, each field (column) has a specific data type. Instead, ADO guesses the data type of each column by scanning the first few lines of the table, so. When some cells in a column have data types that are inconsistent with the column data type. There's a problem. The value of the cell becomes the "DBNull".


The problem stems from my actual work. At work, you need to analyze the data in an Excel table provided by the business unit, which has thousands of rows of data. Some of the columns are mostly numeric. But some of these numbers are stored in text format. While some of the columns are most of the value of character type, but a few cell values are numbers. In this way, my analysis program will not work. My solution now is to save the Excel table as a text file (tab-delimited) and read the text file in a C # program.
Another option is to select the entire column in the Excel table. Then "convert numbers saved as text to numbers." For example, see the following:

However, I did not find an easy way to "convert numbers to text".
I do not know in ADO there is no way to change the original Excel table without changing the case. Correctly read data in Excel tables whose columns have inconsistent data types for cells? If anyone knows, kindly tell me. Thank you.
Not sure what way to report this BUG to Microsoft?
I think there are two ways to solve this BUG:
1. Suppose a column is guessed to be a digital type, assuming that there is a character type of data in the column, assuming that the data is stored as a number of text, it is converted directly to the number returned to the caller.

Suppose the data cannot be converted to a number, can return "DBNull", or throw an exception.


If a column is guessed to be a character type, only the cells in the column are not empty. is returned to the caller by converting to a character type.
2. Provide a property in the OLE DB connection string that specifies that all columns in the Excel table have a data type of "string", and that only the contents of the cell are not empty, return "DBNull" and Return "string". The Convert.toxxx () method is then used by the caller to convert to the appropriate number type.
I don't know if you think so.

A bug when you read an Excel table from ADO (OLE DB)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.