Problems arising from reading Excel files (I)

Source: Internet
Author: User

This all started when I saw a question on Baidu. Someone had an Excel file with hundreds of thousands of rows, and wanted to remove the duplicate rows and then split the result into separate Excel files of 5,000 rows each. I began testing, and perhaps because my thinking wandered, I somehow ended up working on reading images out of an Excel file instead. That was not what I had set out to do, and I was clearly not focused enough, but it did bring an unexpected gain.
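The original question (remove duplicates, then split into 5,000-row files) is not pursued in the rest of this post. As a rough sketch of that part only, assuming the rows have already been read into memory as string arrays and that two rows count as duplicates when all their cells match, something like the following could work. The names `RowBatcher` and `DeduplicateAndSplit` are hypothetical, and writing each batch back to an Excel file is left out.

```csharp
using System;
using System.Collections.Generic;

public static class RowBatcher
{
    // Removes duplicate rows (keeping the first occurrence, in order) and
    // splits what is left into batches of at most batchSize rows.
    public static List<List<string[]>> DeduplicateAndSplit(
        IEnumerable<string[]> rows, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var seen = new HashSet<string>();
        var batches = new List<List<string[]>>();

        foreach (string[] row in rows)
        {
            // Assumption: the unit-separator character never occurs in a cell,
            // so joining the cells with it gives a unique key per distinct row.
            string key = string.Join("\u001F", row);
            if (!seen.Add(key))
                continue; // duplicate row, skip it

            // Start a new batch when there is none yet or the last one is full.
            if (batches.Count == 0 || batches[batches.Count - 1].Count == batchSize)
                batches.Add(new List<string[]>());

            batches[batches.Count - 1].Add(row);
        }

        return batches;
    }
}
```

Calling `RowBatcher.DeduplicateAndSplit(rows, 5000)` would give one list per output file. Keeping a key for every distinct row in a `HashSet` costs memory, but for a few hundred thousand rows that should be manageable.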

Whether you use OleDbConnection or the Excel Application object, the Excel file has to be loaded first (which goes without saying). Here is the code.

Code

// Requires: using System.Data; using System.Data.OleDb;
string connection = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=Excel 8.0;", filepath);
OleDbConnection conn = new OleDbConnection(connection);
conn.Open();
// List the tables in the file; each worksheet appears as one table.
DataTable dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });

 

The code above lists all the worksheets in an Excel file. More precisely, it retrieves the file's schema, that is, its sheets, so that the content of every sheet can then be read dynamically.

Next, let's see how to fill a DataSet.

Code

for (int i = 0; i < dt.Rows.Count; i++)
{
    // Sheet names such as Sheet1$ must be wrapped in square brackets.
    string text = string.Format("SELECT * FROM [{0}]", dt.Rows[i]["TABLE_NAME"]);
    // Reuse the connection that is already open.
    OleDbDataAdapter myCommand = new OleDbDataAdapter(text, conn);
    DataSet myDataSet = new DataSet();
    myCommand.Fill(myDataSet);
}

The code above fills every worksheet into a DataSet. It is only a demonstration, because I know my Excel file has just one worksheet; in practice each sheet would be filled separately. After filling, I opened the DataSet and was shocked: it contained only the string content and none of the images. That drove me crazy.

So let's approach it from another angle and read the data through COM objects. The idea is the same as above: load the Excel file and read the string data.

 

Code

// Requires: using System; using System.Drawing; using System.Windows.Forms;
// using Excel = Microsoft.Office.Interop.Excel;
Excel.Application excel = new Excel.Application();
Excel.Workbook workbook = excel.Workbooks.Add(filepath);
excel.UserControl = true;
excel.Visible = false;
for (int i = 0; i < workbook.Worksheets.Count; i++)
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder();
    // Worksheet indexes are 1-based.
    Excel.Worksheet sheet = workbook.Worksheets.get_Item(i + 1) as Excel.Worksheet;
    // Start at row 2, skipping the first row.
    for (int row = 2; row <= sheet.UsedRange.Rows.Count; row++)
    {
        // Take the cell values.
        for (int col = 1; col <= sheet.UsedRange.Columns.Count; col++)
        {
            Excel.Range range = sheet.Cells[row, col] as Excel.Range;
            sb.Append("," + col.ToString() + ":" + range.Text);
        }
        sb.Append(System.Environment.NewLine);
        Console.WriteLine(sb.ToString());
        // Save images.
        if (sheet.Shapes.Count > 0)
        {
            Bitmap picture;
            IDataObject data = null;
            foreach (Excel.Shape item in sheet.Shapes)
            {
                // Copy the shape to the clipboard, then read it back as a bitmap.
                item.Copy();
                data = Clipboard.GetDataObject();
                if (null != data)
                {
                    picture = (Bitmap)data.GetData(DataFormats.Bitmap);
                    // Every shape is saved under the same row-based name,
                    // so a later shape overwrites an earlier one.
                    picture.Save(string.Format(@"D:\temp\AA\{0}.jpg", row));
                }
            }
        }
    }
}
workbook.Close(false, null, null);
excel.Quit();

 

 

The code above does read both the data and the images, but it is far too slow, especially for the images. Each image has to be copied to the clipboard and read back out one at a time, which I find unbearable. What I want is right there in front of me, yet getting it takes two passes through memory. It leaves me speechless.

After more debugging, I decided to use the Shape.CopyPicture() method, because I found that Copy and CopyPicture actually call different implementations. Copy places the object on the clipboard whatever it is, whereas CopyPicture copies it as a picture with a specified appearance and format. In principle, of course, Copy should be the faster of the two.
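The post does not show the CopyPicture call itself. A minimal sketch of what it might look like, assuming the Microsoft.Office.Interop.Excel assembly and Windows Forms are referenced and the code runs on an STA thread (the clipboard requires one), is given here. `ShapeExport` and `CopyShapeAsBitmap` are hypothetical names, and the choice of `xlScreen` and `xlBitmap` is an assumption, not something the original states.

```csharp
using System.Drawing;
using System.Windows.Forms;
using Excel = Microsoft.Office.Interop.Excel;

public static class ShapeExport
{
    // Copies one shape to the clipboard as a bitmap picture and reads it back.
    // Returns null if the clipboard holds no image afterwards.
    public static Image CopyShapeAsBitmap(Excel.Shape shape)
    {
        // Ask Excel for an on-screen rendering in bitmap format.
        shape.CopyPicture(Excel.XlPictureAppearance.xlScreen,
                          Excel.XlCopyPictureFormat.xlBitmap);

        return Clipboard.ContainsImage() ? Clipboard.GetImage() : null;
    }
}
```

This still goes through the clipboard, so it does not remove the copy-and-paste round trip; it only changes what is placed there.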

At one point I also tried an extremely naive and obvious approach: casting the COM object directly to a managed object, as shown below:

 
IDataObject data = null;
Bitmap img = (Bitmap)data;

 

This fails outright: it does not even compile, let alone run.

Another idea was to serialize the object to a stream, in the hope that a stream would be faster. It turned out that the Shape object does not support serialization, so that path is closed as well. For now, it seems that copying through the clipboard is the only method that works. Still, I refuse to believe there is no other way. I will come back and deal with this later.
