Recently I have been working on the UBD project. Put simply, it is one large, wide table: about 200 fields with a very large data volume (on the order of 100 million rows). A traditional database is not a good fit, so we went with the Lucene-based Solr, and recommended using the SolrCloud feature for high availability and sharding (I will post source-code notes on Solr and Lucene later).
The data is computed in Hive and then inserted into Solr; I implemented the hive-to-solr piece based on code found on GitHub. Ultimately, inserting the data still comes down to calling the corresponding methods of the SolrInputDocument class.
By default, Solr adds and updates data through the setField and addField methods of SolrInputDocument.
(You can think of a SolrInputField as a database column, so a SolrInputDocument is a row.)
For example:
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1);
doc.setField("name", "xxxxx");
However, both setField and addField overwrite existing values. In our case the data is inserted from six tables, so each insert overwrote the previous one, and the final documents contained only the last table's fields.
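A minimal sketch of the overwrite behavior (the URL and field names below are placeholders, not the real project values): two plain inserts with the same id do not merge; the second document replaces the first one wholesale.

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class OverwriteDemo {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://xxxxx:8888/solr/userinfo");

        // fields coming from the first table
        SolrInputDocument d1 = new SolrInputDocument();
        d1.addField("id", "1");
        d1.addField("name1_s", "from-table-1");
        server.add(d1);
        server.commit();

        // fields coming from the second table: same id, plain add
        SolrInputDocument d2 = new SolrInputDocument();
        d2.addField("id", "1");
        d2.addField("name2_s", "from-table-2");
        server.add(d2);
        server.commit();
        // the stored document now contains only id and name2_s;
        // name1_s from the first insert is gone
    }
}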
Solr provides atomic updates, which give us the append behavior we need (under the hood it is still implemented as a delete plus a re-add of the whole document).
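For reference, set is not the only operation: per the Solr documentation, atomic updates also support add (append to a multi-valued field) and inc (increment a numeric field). A minimal sketch with made-up field names:

import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.solr.common.SolrInputDocument;

public class AtomicOps {
    public static SolrInputDocument buildUpdate() {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");

        Map<String, Object> setOper = new LinkedHashMap<String, Object>();
        setOper.put("set", "new-value");   // replace the field value
        doc.addField("name1_s", setOper);

        Map<String, Object> addOper = new LinkedHashMap<String, Object>();
        addOper.put("add", "extra-tag");   // append to a multi-valued field
        doc.addField("tags_ss", addOper);

        Map<String, Object> incOper = new LinkedHashMap<String, Object>();
        incOper.put("inc", 1);             // increment a numeric field
        doc.addField("visits_i", incOper);

        return doc;
    }
}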
Reference:
http://stackoverflow.com/questions/16234045/solr-how-to-use-the-new-field-update-modes-atomic-updates-with-solrj
Demo
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class User {
    public static void main(String[] args) throws Exception {
        String[] fields = {"name1_s", "name2_s", "name4_s"};
        Map<String, Object> setOper = null;
        String url = "http://xxxxx:8888/solr/userinfo";
        SolrServer server = new HttpSolrServer(url);
        SolrInputDocument doc = new SolrInputDocument(); // construct a SolrInputDocument object
        System.out.println(doc.keySet().size()); // 0
        doc.addField("id", "1");
        for (int i = 0; i < fields.length; i++) {
            setOper = new LinkedHashMap<String, Object>();
            setOper.put("set", "a2"); // a Map-typed field value is treated as an atomic update
            System.out.println(fields[i]);
            if (!doc.keySet().contains(fields[i])) { // prevent duplicate columns
                doc.addField(fields[i], setOper);
            }
        }
        System.out.println(doc.keySet().size()); // 4
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc);
        server.add(docs);
        server.commit();
    }
}
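One usage note (my understanding from the Solr documentation, not verified in this post): atomic updates require the update log (<updateLog/> in solrconfig.xml) to be enabled, and the schema fields should be stored, because Solr rebuilds the full document from its stored fields before re-indexing it; that is exactly why the operation ends up as a delete + add.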
This article is from the "Food and Light Blog"; please keep this source when reposting: http://caiguangguang.blog.51cto.com/1652935/1599137
Solr Atomic Update