Solr DataImport source code analysis (8)

Source: Internet
Author: User
Tags: solr

Data Reading Class: JdbcDataSource.java

ResultSetIterator is an inner class of JdbcDataSource that reads data from the database.

private class ResultSetIterator {
    ResultSet resultSet;

    Statement stmt = null;

    List<String> colNames;

    Iterator<Map<String, Object>> rSetIterator;

    public ResultSetIterator(String query) {

        try {
            Connection c = getConnection();
            stmt = c.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                    ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(batchSize);
            stmt.setMaxRows(maxRows);
            LOG.debug("Executing SQL: " + query);
            long start = System.currentTimeMillis();
            if (stmt.execute(query)) {
                resultSet = stmt.getResultSet();
            }
            LOG.trace("Time taken for SQL: "
                    + (System.currentTimeMillis() - start));
            colNames = readFieldNames(resultSet.getMetaData());
        } catch (Exception e) {
            wrapAndThrow(SEVERE, e, "Unable to execute query: " + query);
        }
        if (resultSet == null) {
            rSetIterator = new ArrayList<Map<String, Object>>().iterator();
            return;
        }

        rSetIterator = new Iterator<Map<String, Object>>() {
            public boolean hasNext() {
                // delegates to the enclosing class's private hasnext() below
                return hasnext();
            }

            public Map<String, Object> next() {
                return getARow();
            }

            public void remove() { /* do nothing */
            }
        };
    }

    private Iterator<Map<String, Object>> getIterator() {
        return rSetIterator;
    }

    private Map<String, Object> getARow() {
        if (resultSet == null)
            return null;
        Map<String, Object> result = new HashMap<String, Object>();
        for (String colName : colNames) {
            try {
                if (!convertType) {
                    // Use underlying database's type information
                    result.put(colName, resultSet.getObject(colName));
                    continue;
                }

                Integer type = fieldNameVsType.get(colName);
                if (type == null)
                    type = Types.VARCHAR;
                switch (type) {
                    case Types.INTEGER:
                        result.put(colName, resultSet.getInt(colName));
                        break;
                    case Types.FLOAT:
                        result.put(colName, resultSet.getFloat(colName));
                        break;
                    case Types.BIGINT:
                        result.put(colName, resultSet.getLong(colName));
                        break;
                    case Types.DOUBLE:
                        result.put(colName, resultSet.getDouble(colName));
                        break;
                    case Types.DATE:
                        result.put(colName, resultSet.getDate(colName));
                        break;
                    case Types.BOOLEAN:
                        result.put(colName, resultSet.getBoolean(colName));
                        break;
                    case Types.BLOB:
                        result.put(colName, resultSet.getBytes(colName));
                        break;
                    default:
                        result.put(colName, resultSet.getString(colName));
                        break;
                }
            } catch (SQLException e) {
                logError("Error reading data", e);
                wrapAndThrow(SEVERE, e, "Error reading data from database");
            }
        }
        return result;
    }

    private boolean hasnext() {
        if (resultSet == null)
            return false;
        try {
            if (resultSet.next()) {
                return true;
            } else {
                close();
                return false;
            }
        } catch (SQLException e) {
            close();
            wrapAndThrow(SEVERE, e);
            return false;
        }
    }

    private void close() {
        try {
            if (resultSet != null)
                resultSet.close();
            if (stmt != null)
                stmt.close();
        } catch (Exception e) {
            logError("Exception while closing result set", e);
        } finally {
            resultSet = null;
            stmt = null;
        }
    }

}

Here, List<String> colNames holds the column labels of the result set, read from its metadata (ResultSetMetaData also exposes field types and other details, but readFieldNames collects only the labels):

colNames = readFieldNames(resultSet.getMetaData());

private List<String> readFieldNames(ResultSetMetaData metaData)
        throws SQLException {
    List<String> colNames = new ArrayList<String>();
    int count = metaData.getColumnCount();
    for (int i = 0; i < count; i++) {
        // JDBC column indexes are 1-based, hence i + 1
        colNames.add(metaData.getColumnLabel(i + 1));
    }
    return colNames;
}

Iterator<Map<String, Object>> rSetIterator is the data iterator:

rSetIterator = new Iterator<Map<String, Object>>() {

    public boolean hasNext() {
        return hasnext();
    }

    public Map<String, Object> next() {
        return getARow();
    }

    public void remove() { /* do nothing */
    }
};

As this shows, Solr's built-in data import reads rows through an iterator rather than loading the whole table at once, which prevents out-of-memory errors when the data volume of a table is very large.
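The lazy-iterator pattern above can be sketched independently of Solr and JDBC: hasNext() advances a cursor-like source at most once per row, and next() materializes only the current row, so the full result set is never held in memory. In this sketch, RowSource and LazyRowIterator are illustrative stand-ins (not names from the Solr code) for ResultSet and ResultSetIterator; the "database" is a small in-memory array.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Minimal sketch of the lazy-iterator pattern used by ResultSetIterator.
// RowSource stands in for a JDBC ResultSet: next() advances the cursor,
// getRow() returns the current row. Only one row is in memory at a time.
public class LazyRowIterator implements Iterator<Map<String, Object>> {

    public interface RowSource {
        boolean next();                // advance cursor, like ResultSet.next()
        Map<String, Object> getRow();  // build current row, like getARow()
    }

    private final RowSource source;
    private boolean advanced;  // have we already called source.next()?
    private boolean hasRow;    // result of the last source.next()

    public LazyRowIterator(RowSource source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        if (!advanced) {           // advance lazily, at most once per row
            hasRow = source.next();
            advanced = true;
        }
        return hasRow;
    }

    @Override
    public Map<String, Object> next() {
        if (!hasNext()) throw new NoSuchElementException();
        advanced = false;          // consume the current cursor position
        return source.getRow();
    }

    // Demo: stream three rows without building the full list up front.
    public static List<String> demo() {
        final String[] ids = {"a", "b", "c"};
        RowSource src = new RowSource() {
            int pos = -1;
            public boolean next() { return ++pos < ids.length; }
            public Map<String, Object> getRow() {
                Map<String, Object> row = new HashMap<>();
                row.put("id", ids[pos]);
                return row;
            }
        };
        List<String> seen = new ArrayList<>();
        Iterator<Map<String, Object>> it = new LazyRowIterator(src);
        while (it.hasNext()) {
            seen.add((String) it.next().get("id"));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints [a, b, c]
    }
}
```

In the real class, the same double-advance hazard is avoided differently: Solr's hasnext() moves the cursor directly and getARow() reads the already-positioned row, which works because the DataImportHandler always calls them in strict hasNext()/next() alternation.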

When adding data read from a database to the Solr index programmatically, it is recommended to read the data page by page before indexing each batch. This class is a useful reference for that approach using raw JDBC data access. I will post an example for sharing when I have time.
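The page-by-page approach suggested above can be sketched as a loop: fetch one bounded batch, hand it to the indexer, and stop when a page comes back empty. This is a hedged illustration, not the author's promised code: the in-memory list stands in for the database (in real code fetchPage would run something like "SELECT ... LIMIT ? OFFSET ?" over JDBC), and the indexBatch consumer stands in for a Solr client call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of page-by-page indexing: fetch a bounded batch, hand it to an
// indexer, then fetch the next page, so memory use is capped at one page.
public class PagedIndexer {

    // Simulated "SELECT ... LIMIT pageSize OFFSET offset" over an
    // in-memory table; a real version would use a PreparedStatement.
    static List<Integer> fetchPage(List<Integer> table, int offset, int pageSize) {
        int to = Math.min(offset + pageSize, table.size());
        if (offset >= to) return new ArrayList<>();
        return new ArrayList<>(table.subList(offset, to));
    }

    // Drives the paging loop; indexBatch stands in for the call that sends
    // a batch of documents to Solr. Returns the number of batches indexed.
    static int indexAllPages(List<Integer> table, int pageSize,
                             Consumer<List<Integer>> indexBatch) {
        int offset = 0;
        int batches = 0;
        while (true) {
            List<Integer> page = fetchPage(table, offset, pageSize);
            if (page.isEmpty()) break;  // no more rows
            indexBatch.accept(page);    // build and add documents here
            offset += page.size();
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) table.add(i);
        List<Integer> indexed = new ArrayList<>();
        int batches = indexAllPages(table, 4, indexed::addAll);
        // 10 rows at page size 4 -> pages of 4, 4, 2
        System.out.println(batches + " batches, " + indexed.size() + " rows");
    }
}
```

Note that plain LIMIT/OFFSET paging re-scans skipped rows on each query; for very large tables, keyset pagination on an indexed column (WHERE id > lastSeen ORDER BY id LIMIT n) is usually faster, though the loop structure stays the same.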
