Big Data Import Excel

Source: Internet
Author: User

During the Big data import implementation, there are two most common problems: exceeding the line limit and memory overflow!

18 days of data, a total of 500w, how to store 500w records in Excel, I thought of two ways to implement: Plsql developer and Java poi!

Plsql DEVELOPER

There are two ways to implement this:

1, in the new SQL WINDOW, execute the query statement you want to export data, after the query, click the down arrow in the place where the results appear, let it all show, this may take a little time, after the display is finished, right click on the display of the results of the place, select Copy to Excel (XLS and xlsx, the former is 03 and previous versions, each sheet can only display 65,535 records, the latter is 07 and later versions each sheet can display 1,048,576 records).

2, in a new report WINDOW, execute the query statement you want to export data, after the query, click on the right side of the screen green pie icon (export results), the following operation is very simple, no more nonsense.

The two ways to achieve simple, easy to operate, but there are serious drawbacks: first, Plsql developer once exported Excel data Limited, only hundreds of thousands of, out of range, then memory overflow; second, if a paged query or conditional query, Then batches of data cannot be imported into the same Excel. It's kind of painful ~

When the simple method does not work, it can only go to the more complex process of the road ...

Java Poi

Before using Java poi, tried JXL, but personally think Java poi more comfortable, this is not to say that JXL is not good, jxl more oriented to the bottom, more trouble, but more flexible, and Java POI encapsulation more, more comfortable to use.

In fact, these are not the focus!

The focus is on how to handle the top two most common problems in the implementation process: exceeding the line limit and memory overflow!

Memory overflow:

A often deal with big data, the company's hardware has not kept up with the weakness, really time-consuming! The most common solution is batch processing, combined with the Java Virtual machine to observe a processing in the premise of not causing memory overflow, the maximum amount of data can be processed to achieve full utilization of virtual machines.

In the Oracle query data This paragraph, write a paging query, after paging query, all put into a set, the implementation process, for the time being no table!

Line limit exceeded:

If you query 500w data peremptorily in Oracle into Excel, you will encounter another tricky issue: exceeding the line limit.

If it is the XLS format, I let the program loop run, Loop once, import 65,535, xlsx format, let it loop once, import 1,048,576, so loop down until the program runs out!

Take a look at my code example:

 Public classXlsdto2excel {@AutowiredPrivateTodbdao Todbdao; /**     *     * @paramxls * Xlsdto an object of entity class *@throwsException * Throws an exception during Excel import*/     Public voidToexcel (String date,intcount) {        intPAGESIZE = 65535; //declare a new workbook declares a workbookHssfworkbook WB =NewHssfworkbook (); //declare a Row object reference declares a new rowHssfrow r =NULL; //declare a Cell object reference declares a cellsHssfcell C0, C1, C2, C3, C4, c5 =NULL; Hssfcell[] FirstCell=NewHssfcell[6]; //Create 2 cell styles creates 2 cells stylesHssfcellstyle cs =Wb.createcellstyle (); Hssfcellstyle CS2=Wb.createcellstyle (); //Create 2 Fonts objects creating 2 cell fontsHssffont f =Wb.createfont (); Hssffont F2=Wb.createfont (); //Set font 1 to the point type, blue and bold font types 1 to 12th numbers, blues and boldF.setfontheightinpoints (( Short) 12);        F.setcolor (HSSFColor.RED.index);        F.setboldweight (Hssffont.boldweight_bold); //Set Font 2 to ten point type, red and bold font types 2 to 10th numbers, black and boldF2.setfontheightinpoints (( Short) 10);        F2.setcolor (HSSFColor.BLACK.index);        F2.setboldweight (Hssffont.boldweight_bold); //set cell style and formatting format cellsCs.setfont (f); //Horizontal layout: Centercs.setalignment (Hssfcellstyle.align_center); //Cs.setdataformat (Df.getformat ("#,# #0.0")); //set the other cell style and formatting format other cellsCs2.setborderbottom (CS2.        Border_thin); Cs2.setdataformat (Hssfdataformat.getbuiltinformat ("Text"));        Cs2.setfont (F2); //Horizontal layout: Centercs2.setalignment (Hssfcellstyle.align_center); //get the total collection size from the databaseList List = This. Todbdao.selectfatherdata (date); //gets the number of cycles (on this result +1), one cycle, the sub-loop pagesize times, the last loop, the sub-cycle mod times        intCirclecount = List.size ()/PAGESIZE; intMoD = list.size ()%PAGESIZE; String Firstorderid= ""; String Ordertime= "";  for(inti = 0; I < Circlecount + 1; i++) {                      //Create a new sheet creates a fresh worksheet, but after a sheet loads 65,535 records, a new sheet is automatically generated to ensure that the line limit is not exceededHssfsheet sheet = wb.createsheet ("p" + i+ "page"); /** Set table header*/Hssfrow FirstRow= Sheet.createrow (0);//start with a row labeled 0string[] Names=NewString[6]; names[0] = "Access number"; names[1] = "Browse Number"; names[2] = "Average length of Visit"; names[3] = "Order Number"; names[4] = "Order Time"; names[5] = "Initial Time";  for(intj = 0; J < 6; J + +) {Firstcell[j]=Firstrow.createcell (j); Firstcell[j].setcellvalue (Newhssfrichtextstring (Names[j]));            Firstcell[j].setcellstyle (CS2); }            //last Loop            if(i = =Circlecount) {                                 for(intRowNum = 1; RowNum < mod; rownum++) {                    //Get Row ObjectR =Sheet.createrow (rownum); HashMap Father= (HASHMAP) list.get (rownum + PAGESIZE *i);  for(intCellnum = 0; Cellnum < 6; cellnum++) {                        /** Get Column objects*/C0= R.createcell (0); C1= R.createcell (1); C2= R.createcell (2); C3= R.createcell (3); C4= R.createcell (4); C5= R.createcell (5); /** Assigning values to column objects*/C0.setcellvalue (Father.get ("SessionId"). toString ());                        C0.setcellstyle (CS2); C1.setcellvalue (Father.get ("Visitpages"). toString ());                        C1.setcellstyle (CS2); C2.setcellvalue (Father.get ("Pervisitstime"). toString ());                        C2.setcellstyle (CS2); if(NULL! = Father.get ("Firstorderid") ) {Firstorderid= Father.get ("Firstorderid"). toString ();                        } c3.setcellvalue (Firstorderid);                        C3.setcellstyle (CS2); if(NULL! = Father.get ("Ordertime") ) {Ordertime= Father.get ("Ordertime"). toString ();                        } c4.setcellvalue (Ordertime);                        C4.setcellstyle (CS2); C5.setcellvalue (Father.get ("Inittime"). toString ());                    C5.setcellstyle (CS2); }                }            } Else {                 for(intRowNum = 1; RowNum <= PAGESIZE; rownum++) {                    //Get Row ObjectR =Sheet.createrow (rownum); HashMap Father= (HASHMAP) list.get (rownum + PAGESIZE *i);  for(intCellnum = 0; Cellnum < 6; cellnum++) {                        /** Get Column objects*/C0= R.createcell (0); C1= R.createcell (1); C2= R.createcell (2); C3= R.createcell (3); C4= R.createcell (4); C5= R.createcell (5); /** Assigning values to column objects*/C0.setcellvalue (Father.get ("SessionId"). toString ());                        C0.setcellstyle (CS2); C1.setcellvalue (Father.get ("Visitpages"). toString ());                        C1.setcellstyle (CS2); C2.setcellvalue (Father.get ("Pervisitstime"). toString ());                        C2.setcellstyle (CS2); if(NULL! = Father.get ("Firstorderid") ) {Firstorderid= Father.get ("Firstorderid"). toString ();                        } c3.setcellvalue (Firstorderid);                        C3.setcellstyle (CS2); if(NULL! = Father.get ("Ordertime") ) {Ordertime= Father.get ("Ordertime"). toString ();                        } c4.setcellvalue (Ordertime);                        C4.setcellstyle (CS2); C5.setcellvalue (Father.get ("Inittime"). toString ());                    C5.setcellstyle (CS2); }}} sheet.autosizecolumn (( Short) 0);//adjusts the width of the first column according to the content, but not set by default, automatically adjusts the width by the table headerSheet.autosizecolumn (( Short) 4);//adjusts the width of the fifth column according to the content, but not set by default, automatically adjusts the width by the table headerSheet.autosizecolumn (( Short) 5);//adjusts the width of the sixth column according to the content, but not set by default, automatically adjusts the width by the table header        }        //Save SavedFileOutputStream out; Try{ out=NewFileOutputStream ("D://workbook.xls");            Wb.write (out);        Out.close (); } Catch(FileNotFoundException e) {//TODO auto-generated Catch blockE.printstacktrace (); } Catch(IOException e) {//TODO auto-generated Catch blockE.printstacktrace (); } System.out.println ("--Execution complete--"); }}
View Code

The above code is imported in XLS format.

Note: Although you can import 100w records one time in xlsx format, the code execution is much slower, and I don't know why. However, you can compare it with XLS and xlsx in two ways when you import the same number of data.

In fact, in addition to the above two ways, after I find information, but also found another more convenient way, that is to build data source ODBC, connect Excel, data transmission.

How to build Oracle ODBC, online information a lot of, here I will not repeat!

After the build, we will operate Excel, create a new version of the excel,07 better, check the data.

Select other/advanced from Data Connection Wizard, select the option with Oracle, then enter the user name, password, and connection address, and when the connection succeeds, all the table names in Oracle are displayed, noting that the permutations are not sequential, But we can quickly index to the name of the table we want to export data, such as a table name is called Father_user_behavior, we can position: First press F, then A, then press T, so you can basically directly locate the name of the table you want, the next operation is very simple, not much to say!

It is important to note that when the data exceeds 1,048,576, the problem also arises (it does not automatically generate a second sheet):

So this way, there is its drawbacks.

Big Data Import Excel

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.