Lucene uses the field as its key-value storage unit. A field value can be stored as a String, int, long, double, float, or byte[], but development often requires storing complex data types such as lists and maps. This article shows how to convert a complex object into simple key-value pairs that Lucene can store.
Lucene supports multi-valued fields: a single document can hold several values under the same field name. Put simply, Lucene can store both key=value and key=[value1,value2]. All we need to do is convert the object into the key=value or key=[value1,value2,...] form before storing it.
For example, take one row of data from a user table:
{
  "user_id": "00000001",
  "user_name": "Test1",
  "age": 30,
  "sex": 1,
  "emails": ["[Email protected]", "[email protected]"],
  "families": {
    "children": [
      {
        "name": "Son1",
        "age": 5,
        "sex": 1,
        "birth": "2013-08-08"
      },
      {
        "name": "Son2",
        "age": 1,
        "sex": 1,
        "birth": "2017-01-01"
      }
    ],
    "partner": {
      "name": "Wife",
      "age": 28,
      "sex": 2,
      "birth": "1990-01-01"
    }
  },
  "state": "A",
  "create_time": 15648784644,
  "update_time": 15648784644
}
Every field in this record except families can be stored directly. families could be serialized to a JSON string and stored as a single value, but then the data inside it cannot be used as a query filter, for example to find users who have a child aged 5 or older. Instead, families can be split into families.children and families.partner sub-fields. After splitting, the key-value pairs are:
user_id="00000001"
user_name="Test1"
age=30
sex=1
emails=["[Email protected]", "[email protected]"]
families.children.name=["Son1", "Son2"]
families.children.age=[5,1]
families.children.sex=[1,1]
families.children.birth=["2013-08-08", "2017-01-01"]
families.partner.name="Wife"
families.partner.age=28
families.partner.sex=2
families.partner.birth="1990-01-01"
state="A"
create_time=15648784644
update_time=15648784644
In this way, a complex object is converted into multiple key-value pairs. To find users who have a child aged 5 or older, the only condition needed is NumericRangeQuery.newIntRange("families.children.age", 5, Integer.MAX_VALUE, true, true).
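The splitting step described above can be sketched in plain Java. This is only an illustration under stated assumptions: FieldFlattener, flatten and anyInRange are hypothetical names, not part of Lucene, and the input is assumed to be already parsed into nested Map and List objects. Turning each flattened value into an actual Lucene field (one field instance per value, all sharing the same name) is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Lucene class): flattens a nested object into
// dotted field names, each mapped to the list of values found under it.
final class FieldFlattener {

    static Map<String, List<Object>> flatten(Map<String, ?> source) {
        Map<String, List<Object>> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> e : source.entrySet()) {
            collect(e.getKey(), e.getValue(), out);
        }
        return out;
    }

    private static void collect(String name, Object value, Map<String, List<Object>> out) {
        if (value instanceof Map) {
            // Nested object: extend the name with "." and the child key.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                collect(name + "." + e.getKey(), e.getValue(), out);
            }
        } else if (value instanceof Iterable) {
            // Array: every element contributes values under the same name.
            for (Object item : (Iterable<?>) value) {
                collect(name, item, out);
            }
        } else if (value != null) {
            out.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    // Plain-Java stand-in for an inclusive range condition on a multi-valued
    // int field: the document matches if any one of its values is in range.
    static boolean anyInRange(List<Object> values, int min, int max) {
        if (values == null) {
            return false;
        }
        for (Object v : values) {
            if (v instanceof Integer) {
                int n = (Integer) v;
                if (n >= min && n <= max) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> son1 = new LinkedHashMap<>();
        son1.put("name", "Son1");
        son1.put("age", 5);
        Map<String, Object> son2 = new LinkedHashMap<>();
        son2.put("name", "Son2");
        son2.put("age", 1);
        Map<String, Object> families = new LinkedHashMap<>();
        families.put("children", Arrays.asList(son1, son2));
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("user_id", "00000001");
        user.put("families", families);

        Map<String, List<Object>> flat = flatten(user);
        System.out.println(flat);
        System.out.println(anyInRange(flat.get("families.children.age"), 5, Integer.MAX_VALUE));
    }
}
```

Note that flattening loses the grouping of values per child: the name and age of one child are no longer tied together, so a condition combining two child properties can match across different children.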
The above explains how to split a complex type into multiple fields so that it can be queried. If Lucene is also used to store the data so that the original object can be read back, the original value can be kept in a separate stored field. For example, a field name starting with "_l" can mark a stored JSON array and one starting with "_m" a stored JSON object. The user object above can then be split into:
user_id="00000001"
user_name="Test1"
age=30
sex=1
emails=["[Email protected]", "[email protected]"]
families.children.name=["Son1", "Son2"]
families.children.age=[5,1]
families.children.sex=[1,1]
families.children.birth=["2013-08-08", "2017-01-01"]
families.partner.name="Wife"
families.partner.age=28
families.partner.sex=2
families.partner.birth="1990-01-01"
state="A"
create_time=15648784644
update_time=15648784644
_mfamilies="{\"children\":[{\"name\":\"Son1\",\"age\":5,\"sex\":1,\"birth\":\"2013-08-08\"},{\"name\":\"Son2\",\"age\":1,\"sex\":1,\"birth\":\"2017-01-01\"}],\"partner\":{\"name\":\"Wife\",\"age\":28,\"sex\":2,\"birth\":\"1990-01-01\"}}"
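When reading a document back, any field whose name (field.name()) contains "." can be skipped, since those fields exist only for querying. The value of a field whose name starts with "_m" is converted to a map, and the value of a field whose name starts with "_l" is converted to a list.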
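The read-side rule can be sketched as a small name classifier. This is a minimal illustration: StoredFieldNames, Kind, classify and originalKey are hypothetical names, and it assumes that ordinary top-level keys never start with "_m" or "_l". Parsing the JSON string itself is left to whatever JSON library the project already uses.

```java
// Hypothetical helper (not a Lucene class): decides how to treat a field
// when rebuilding the original object from a stored document.
final class StoredFieldNames {

    enum Kind { SKIP, MAP, LIST, PLAIN }

    static Kind classify(String fieldName) {
        if (fieldName.contains(".")) {
            return Kind.SKIP;   // query-only field produced by splitting
        }
        if (fieldName.startsWith("_m")) {
            return Kind.MAP;    // value is a JSON object string
        }
        if (fieldName.startsWith("_l")) {
            return Kind.LIST;   // value is a JSON array string
        }
        return Kind.PLAIN;      // simple value, use as-is
    }

    // Strips the "_m" or "_l" marker to recover the original key.
    static String originalKey(String fieldName) {
        Kind kind = classify(fieldName);
        if (kind == Kind.MAP || kind == Kind.LIST) {
            return fieldName.substring(2);
        }
        return fieldName;
    }
}
```

With this rule, "_mfamilies" is restored under the key "families", while "families.children.age" and its sibling fields are ignored during reconstruction.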
Lucene Complex Data type storage