Using ElasticSearch from Java to query nearby users among millions of records

Source: Internet
Author: User
Tags createindex

The previous article showed how to build complex query conditions in ElasticSearch with Repository and ElasticsearchTemplate, and briefly covered the use of geographic locations.

This article looks at using ElasticSearch to find nearby people, that is, records within N meters of a given point, when the data volume is large.

Prepare the environment

The local test uses ElasticSearch 5.5.1 (the latest version at the time), Spring Boot 1.5.4, and spring-data-elasticsearch 2.1.4.

Create a Spring Boot project and select the Elasticsearch and Web dependencies.

The pom file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.tianyalei</groupId>
  <artifactId>elasticsearch</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>elasticsearch</name>
  <description>Demo project for Spring Boot</description>

  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.5.4.RELEASE</version>
    <relativePath/> <!-- lookup parent from repository -->
  </parent>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <java.version>1.8</java.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-test</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.sun.jna</groupId>
      <artifactId>jna</artifactId>
      <version>3.0.9</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
      </plugin>
    </plugins>
  </build>

</project>

Create the model class Person:

package com.tianyalei.elasticsearch.model;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.GeoPointField;

import java.io.Serializable;

/**
 * Model class.
 */
@Document(indexName = "elastic_search_project", type = "person", indexStoreType = "fs", shards = 5, replicas = 1, refreshInterval = "-1")
public class Person implements Serializable {
    @Id
    private int id;

    private String name;

    private String phone;

    /**
     * Geographic location as latitude and longitude.
     * As a String the order is "lat,lon", e.g. "40.715,-74.011".
     * As an array the order is reversed, [lon, lat], e.g. [-73.983, 40.719].
     */
    @GeoPointField
    private String address;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPhone() {
        return phone;
    }

    public void setPhone(String phone) {
        this.phone = phone;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}

I use the address field to hold the latitude and longitude. Note that the coordinate order differs depending on whether you use a String or an array to represent the position: a String is "lat,lon", while an array is [lon, lat]. See the comment on the field for details.
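The ordering difference is easy to get wrong, so here is a minimal sketch of the two forms in plain Java. The class GeoPointFormats and its method names are hypothetical helpers for illustration and are not part of the original project or of any Elasticsearch API.

```java
/** Hypothetical helper that illustrates the two coordinate orderings. */
final class GeoPointFormats {

    private GeoPointFormats() {
    }

    /** String form: latitude first, then longitude, e.g. "40.715,-74.011". */
    static String asString(double lat, double lon) {
        return lat + "," + lon;
    }

    /** Array form: longitude first, then latitude, e.g. [-74.011, 40.715]. */
    static double[] asArray(double lat, double lon) {
        return new double[] {lon, lat};
    }

    /** Parses the "lat,lon" string form back into {lat, lon}. */
    static double[] parseString(String latLon) {
        String[] parts = latLon.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat,lon\" but got: " + latLon);
        }
        return new double[] {
            Double.parseDouble(parts[0].trim()),
            Double.parseDouble(parts[1].trim())
        };
    }
}
```

With this sketch, GeoPointFormats.asString(lat, lon) produces the kind of value that setAddress receives in this article, while asArray shows the order you would need if the field were an array instead.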

package com.tianyalei.elasticsearch.repository;

import com.tianyalei.elasticsearch.model.Person;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface PersonRepository extends ElasticsearchRepository<Person, Integer> {

}

Next is the Service class, which inserts the test data. I put the query logic in the Controller to make it easier to read here; normally it belongs in the Service.

package com.tianyalei.elasticsearch.service;

import com.tianyalei.elasticsearch.model.Person;
import com.tianyalei.elasticsearch.repository.PersonRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.IndexQuery;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.List;

@Service
public class PersonService {
    @Autowired
    PersonRepository personRepository;

    @Autowired
    ElasticsearchTemplate elasticsearchTemplate;

    // These must match the indexName and type declared in @Document on Person,
    // otherwise the query (which resolves the index from Person.class) finds nothing.
    private static final String PERSON_INDEX_NAME = "elastic_search_project";
    private static final String PERSON_INDEX_TYPE = "person";

    public Person add(Person person) {
        return personRepository.save(person);
    }

    public void bulkIndex(List<Person> personList) {
        int counter = 0;
        try {
            // Create the index by its name (not its type) if it does not exist yet.
            if (!elasticsearchTemplate.indexExists(PERSON_INDEX_NAME)) {
                elasticsearchTemplate.createIndex(PERSON_INDEX_NAME);
            }
            List<IndexQuery> queries = new ArrayList<>();
            for (Person person : personList) {
                IndexQuery indexQuery = new IndexQuery();
                indexQuery.setId(person.getId() + "");
                indexQuery.setObject(person);
                indexQuery.setIndexName(PERSON_INDEX_NAME);
                indexQuery.setType(PERSON_INDEX_TYPE);
                // The step above can also be written with IndexQueryBuilder:
                // IndexQuery index = new IndexQueryBuilder().withId(person.getId() + "").withObject(person).build();
                queries.add(indexQuery);
                counter++;
                // Send one bulk request each time the counter reaches a multiple of 500.
                if (counter % 500 == 0) {
                    elasticsearchTemplate.bulkIndex(queries);
                    queries.clear();
                    System.out.println("bulkIndex counter: " + counter);
                }
            }
            // Send whatever is left over after the last full batch.
            if (queries.size() > 0) {
                elasticsearchTemplate.bulkIndex(queries);
            }
            System.out.println("bulkIndex completed.");
        } catch (Exception e) {
            System.out.println("PersonService.bulkIndex error: " + e.getMessage());
            throw e;
        }
    }
}

Note the bulkIndex method, which inserts data in batches. The bulk API is also what ES officially recommends for batch inserts. Here a bulk request is sent each time the counter reaches a multiple of 500, and any remainder is sent at the end.
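The batching idea is independent of Elasticsearch, so it can be sketched in plain Java. In the sketch below, BatchSender and sendInBatches are hypothetical names, and the Consumer stands in for a call such as elasticsearchTemplate.bulkIndex; this is an illustration of the pattern, not the article's service code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that splits a list into fixed-size batches. */
final class BatchSender {

    private BatchSender() {
    }

    /**
     * Passes the items to the sender in batches of at most batchSize
     * and returns the number of batches that were sent.
     */
    static <T> int sendInBatches(List<T> items, int batchSize, Consumer<List<T>> sender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the sender may keep it after this call returns.
            sender.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }
}
```

For example, 1,201 items with a batch size of 500 are sent as three batches of 500, 500, and 201 items. Slicing by index avoids the separate counter and the explicit "send the remainder" step.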

package com.tianyalei.elasticsearch.controller;

import com.tianyalei.elasticsearch.model.Person;
import com.tianyalei.elasticsearch.service.PersonService;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.GeoDistanceQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.sort.GeoDistanceSortBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.SearchQuery;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

@RestController
public class PersonController {
    @Autowired
    PersonService personService;

    @Autowired
    ElasticsearchTemplate elasticsearchTemplate;

    @GetMapping("/add")
    public Object add() {
        double lat = 39.929986;
        double lon = 116.395645;

        List<Person> personList = new ArrayList<>(900000);
        // Create the generator and the formatter once instead of once per iteration.
        Random random = new Random();
        DecimalFormat df = new DecimalFormat("######0.000000");
        double max = 0.00001;
        double min = 0.000001;
        for (int i = 100000; i < 1000000; i++) {
            // nextDouble() is in [0, 1), so the offset s is at most about one degree.
            // The same offset is added to both the latitude and the longitude.
            double s = random.nextDouble() % (max - min + 1) + max;
            String lons = df.format(s + lon);
            String lats = df.format(s + lat);
            Double dlon = Double.valueOf(lons);
            Double dlat = Double.valueOf(lats);

            Person person = new Person();
            person.setId(i);
            person.setName("name" + i);
            person.setPhone("phone" + i);
            // String form of a geo point: "lat,lon".
            person.setAddress(dlat + "," + dlon);
            personList.add(person);
        }
        personService.bulkIndex(personList);

        return "add data";
    }

    /**
     * geo_distance: find locations within a given distance of a center point.
     * geo_bounding_box: find locations inside a rectangular area.
     * geo_distance_range: find locations between a min and a max distance from a center point.
     * geo_polygon: find locations inside a polygon.
     * sort: results can be sorted by distance.
     */
    @GetMapping("/query")
    public Object query() {
        double lat = 39.929986;
        double lon = 116.395645;

        long nowTime = System.currentTimeMillis();
        // Match points within 100 meters of the given latitude and longitude.
        GeoDistanceQueryBuilder builder = QueryBuilders.geoDistanceQuery("address").point(lat, lon).distance(100, DistanceUnit.METERS);

        // Sort the matches by distance from the same point, nearest first.
        GeoDistanceSortBuilder sortBuilder = SortBuilders.geoDistanceSort("address").point(lat, lon).unit(DistanceUnit.METERS).order(SortOrder.ASC);

        Pageable pageable = new PageRequest(0, 50);

        NativeSearchQueryBuilder builder1 = new NativeSearchQueryBuilder().withFilter(builder).withSort(sortBuilder).withPageable(pageable);
        SearchQuery searchQuery = builder1.build();

        // queryForList is paged by default (it goes through queryForPage); the default page size is 10.
        List<Person> personList = elasticsearchTemplate.queryForList(searchQuery, Person.class);

        System.out.println("Time consumed: " + (System.currentTimeMillis() - nowTime));
        return personList;
    }
}

Now look at the Controller class. The add method inserts 900,000 test records, each with a randomly generated latitude and longitude.
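If you want test points that are spread around the center rather than sharing one offset for both coordinates, a variation is to draw independent offsets for latitude and longitude. The sketch below is such a variation, not what the add method in this article does; RandomGeoPoints and its method names are hypothetical.

```java
import java.util.Locale;
import java.util.Random;

/** Hypothetical generator for test coordinates around a center point. */
final class RandomGeoPoints {

    private RandomGeoPoints() {
    }

    /**
     * Returns {lat, lon}, where each coordinate is moved independently
     * by an offset in [-maxOffsetDegrees, +maxOffsetDegrees) from the center.
     */
    static double[] randomPointNear(double centerLat, double centerLon,
                                    double maxOffsetDegrees, Random random) {
        double dLat = (random.nextDouble() * 2 - 1) * maxOffsetDegrees;
        double dLon = (random.nextDouble() * 2 - 1) * maxOffsetDegrees;
        return new double[] {centerLat + dLat, centerLon + dLon};
    }

    /** Formats a {lat, lon} pair as a "lat,lon" string with six decimals. */
    static String toLatLonString(double[] point) {
        return String.format(Locale.ROOT, "%.6f,%.6f", point[0], point[1]);
    }
}
```

Passing a seeded Random, for example new Random(42), makes the generated data set reproducible between runs, which helps when comparing query timings.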

The query method builds a query for points within 100 meters of the center point, sorted by distance, with 50 entries per page. If no Pageable is specified, the queryForList method of ElasticsearchTemplate returns 10 results by default, as the source code shows.
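To make the semantics of "within 100 meters, sorted by distance, first 50 results" concrete, here is a minimal in-memory sketch using the haversine formula. It is only an illustration: Elasticsearch answers such queries from its index rather than by scanning every point, and its distance calculation may differ slightly. NearbySketch, its methods, and the Earth radius constant are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory version of a distance filter plus distance sort. */
final class NearbySketch {

    /** Mean Earth radius in meters (an approximation). */
    static final double EARTH_RADIUS_METERS = 6_371_000.0;

    private NearbySketch() {
    }

    /** Great-circle distance in meters between two lat/lon points (haversine). */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    /**
     * Keeps the points ({lat, lon}) within radiusMeters of the center,
     * sorts them nearest first, and returns at most limit of them.
     */
    static List<double[]> nearby(List<double[]> points, double lat, double lon,
                                 double radiusMeters, int limit) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (haversineMeters(lat, lon, p[0], p[1]) <= radiusMeters) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble(
                (double[] p) -> haversineMeters(lat, lon, p[0], p[1])));
        return hits.size() > limit ? new ArrayList<>(hits.subList(0, limit)) : hits;
    }
}
```

Calling NearbySketch.nearby(points, 39.929986, 116.395645, 100, 50) mirrors the parameters of the query method: a 100 meter radius around the same center and a page of 50 results.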

Start the project and call /add, then wait for the 900,000 records to be inserted, which takes a few dozen seconds.

Then call /query and check the result.

The first query takes more than 300 ms. From the second query onward, the time drops sharply to about 30 ms, because ES has automatically cached data in memory by then.

As you can see, Elasticsearch queries geographic locations very quickly. It is well suited to nearby-people search, range queries, and similar features.
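To compare a cold first query with later warm ones more systematically than a single println, you can time several consecutive runs. The sketch below is a generic helper in plain Java; QueryTimer and timeRuns are hypothetical names, and the Runnable is a stand-in for whatever performs the query.

```java
/** Hypothetical helper that times repeated runs of a task. */
final class QueryTimer {

    private QueryTimer() {
    }

    /** Runs the task the given number of times and returns each run's duration in milliseconds. */
    static long[] timeRuns(Runnable task, int runs) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }
}
```

The first element of the returned array corresponds to the cold run, and the remaining elements show how the time settles once caches are warm.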

Note from later use: with Elasticsearch 2.3, the geo field is not indexed correctly by the approach above, and the value ends up in ES as a String rather than a geo_point. The solution, recorded here, is to change the field type from String to GeoPoint, that is, org.springframework.data.elasticsearch.core.geo.GeoPoint. You then need to call putMapping explicitly when creating the index so that the field is mapped as a geo_point.

As follows:

if (!elasticsearchTemplate.indexExists("abc")) {
    elasticsearchTemplate.createIndex("abc");
    elasticsearchTemplate.putMapping(Person.class);
}

That is all for this article. I hope it helps with your learning.
