How Uber uses the Go language to create efficient query services


At the beginning of 2015 we built a microservice that does one thing, and does it very well: geo-fence lookups. A year later it had become Uber's highest queries-per-second (QPS) service. This is the story of why we built the service and how Go, a relative newcomer among programming languages, helped us build and scale it quickly.

Background

At Uber, a geo-fence is a human-defined geographic area (a polygon) on the Earth's surface. Geo-fences are used widely at Uber for location-based configuration. Showing users which products are available at a given location, defining areas with special requirements (such as airports), and applying dynamic pricing in neighborhoods during peak demand are all important use cases for our products.


Example of a Colorado geo fence.

The first step is to obtain the user's location (latitude and longitude) from their mobile phone and then determine which geo-fences the user falls in. This functionality used to be scattered across multiple services and modules. As we migrated from a monolithic architecture to a microservices architecture, we chose to move it into a single new microservice.

Why Go

Node.js was the main development language of our real-time marketplace team, so we had a lot of Node.js knowledge and experience in house. But Go fit our needs better in the following ways:

1. High throughput and low latency. Every request from the Uber mobile app requires a geo-fence lookup, so the service must respond quickly (99% of requests in under 100 milliseconds) and at a high rate (thousands of requests per second).
2. CPU-intensive workload. Geo-fence lookups are CPU-intensive because they rely on point-in-polygon computations. Node.js works great for our other, I/O-intensive services, but because it is an interpreted, dynamically typed language, it is not a good fit for this kind of workload.
3. Non-disruptive background loading. To serve the most up-to-date geo-fence data, the service has to keep refreshing its in-memory data from multiple data sources in the background. Because Node.js is single-threaded, a background refresh can tie up the CPU for long stretches (for example, during CPU-intensive JSON parsing), which hurts query response times. Go does not have this problem: goroutines can run on multiple cores, so background tasks and foreground queries execute in parallel.


To geo-index or not: that is the question

Given a location specified as a latitude/longitude pair, how do we find which of our thousands of geo-fences it falls in? The brute-force approach is to run a point-in-polygon test, such as the ray casting algorithm, against every geo-fence. But that is too slow. So how do we narrow the search space to make lookups efficient?
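The ray casting test mentioned above can be sketched in Go as follows. This is a minimal illustration, not the service's actual code: the `Point` type and `Contains` function are hypothetical names, and the sketch treats longitude and latitude as planar coordinates, which is only a reasonable approximation for small areas.

```go
package main

import "fmt"

// Point is a location; X is longitude and Y is latitude (hypothetical type).
type Point struct{ X, Y float64 }

// Contains reports whether p lies inside polygon using ray casting:
// it counts how many edges a horizontal ray from p toward +X crosses.
// An odd number of crossings means the point is inside.
func Contains(polygon []Point, p Point) bool {
	inside := false
	n := len(polygon)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := polygon[i], polygon[j]
		// Does the edge a-b straddle the horizontal line through p?
		if (a.Y > p.Y) != (b.Y > p.Y) {
			// X coordinate where the edge crosses that line.
			crossX := a.X + (p.Y-a.Y)*(b.X-a.X)/(b.Y-a.Y)
			if p.X < crossX {
				inside = !inside
			}
		}
	}
	return inside
}

func main() {
	square := []Point{{0, 0}, {4, 0}, {4, 4}, {0, 4}}
	fmt.Println(Contains(square, Point{2, 2})) // point inside the square
	fmt.Println(Contains(square, Point{5, 2})) // point outside the square
}
```

Each test costs time proportional to the number of polygon vertices, which is why running it against every geo-fence on every request does not scale.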

Rather than indexing the geo-fences with an R-tree or the more complex S2, we took a simpler route based on one observation: Uber's business model is city-centric, and business rules and the geo-fences that define them are usually tied to a single city. This lets us organize the geo-fences into a two-level hierarchy, where the first level holds the city geo-fences (which define city boundaries) and the second level holds the geo-fences within each city.

For each lookup, we first do a linear scan of all city geo-fences to find the city, and then do a linear scan of the geo-fences within that city. The computational complexity of this solution is still O(N), but this simple technique reduces N from the order of 10,000s to the order of 100s.
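A rough sketch of this two-level scan is shown below. The `Fence`, `City`, and `Lookup` names, the rectangular test data, and the planar point-in-polygon helper are all assumptions made for illustration; the source does not describe the actual data structures.

```go
package main

import "fmt"

type Point struct{ X, Y float64 }

// Fence is a named polygon (hypothetical type for this sketch).
type Fence struct {
	Name    string
	Polygon []Point
}

// City pairs a city boundary with the geo-fences defined inside that city.
type City struct {
	Boundary Fence
	Fences   []Fence
}

// contains is a standard ray casting point-in-polygon test.
func contains(poly []Point, p Point) bool {
	inside := false
	for i, j := 0, len(poly)-1; i < len(poly); j, i = i, i+1 {
		a, b := poly[i], poly[j]
		if (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < a.X+(p.Y-a.Y)*(b.X-a.X)/(b.Y-a.Y) {
			inside = !inside
		}
	}
	return inside
}

// Lookup scans the city boundaries first, then only the fences that
// belong to a city whose boundary contains p.
func Lookup(cities []City, p Point) []string {
	var hits []string
	for _, c := range cities {
		if !contains(c.Boundary.Polygon, p) {
			continue // skip every fence of a city that does not contain p
		}
		for _, f := range c.Fences {
			if contains(f.Polygon, p) {
				hits = append(hits, f.Name)
			}
		}
	}
	return hits
}

// rect builds an axis-aligned rectangle, used here only as test data.
func rect(x0, y0, x1, y1 float64) []Point {
	return []Point{{x0, y0}, {x1, y0}, {x1, y1}, {x0, y1}}
}

func main() {
	cities := []City{
		{Boundary: Fence{"city-a", rect(0, 0, 10, 10)}, Fences: []Fence{
			{"a-airport", rect(1, 1, 3, 3)},
			{"a-downtown", rect(2, 2, 6, 6)},
		}},
		{Boundary: Fence{"city-b", rect(20, 0, 30, 10)}, Fences: []Fence{
			{"b-airport", rect(21, 1, 23, 3)},
		}},
	}
	fmt.Println(Lookup(cities, Point{2.5, 2.5})) // prints [a-airport a-downtown]
	fmt.Println(Lookup(cities, Point{22, 2}))    // prints [b-airport]
}
```

The first-level scan costs one test per city, and the second-level scan only touches the fences of the matching city, which is where the reduction in N comes from.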

Architecture

We want this service to be stateless, so that any request can be sent to any instance and get the same result. This means that each instance must hold the full data set rather than a partition of it. We use a unified polling schedule so that the geo-fence data held by different service instances stays in sync. As a result, the architecture of the service is simple. Background tasks periodically poll geo-fence data from the various data stores. The data is kept in memory to serve queries quickly, and it is also serialized to a local file so that the service can reload it when it restarts.


Our geo-fence data Query architecture


Working with Go's memory model

In our architecture, the in-memory geo index is read and written concurrently. While a background polling task writes to the index, the foreground query engine may be reading from it at the same time. For people coming from the single-threaded world of Node.js, Go's memory model can be a challenge, and we were also concerned about the performance cost of synchronization. We tried to manage memory barriers ourselves using the StorePointer/LoadPointer primitives from the sync/atomic package, but that made the code brittle and hard to maintain.

In the end, we settled on a compromise: a read-write lock that synchronizes access to the geo index. To reduce lock contention, each update builds a new index segment on the side and then atomically swaps it into the main index. This adds slightly more latency than the StorePointer/LoadPointer approach, but we believe the gain in simplicity and maintainability of the code is worth the small performance cost.
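The read-write lock approach might look roughly like the sketch below. The `Index` type, its methods, and the map-based layout are hypothetical; the point is only that the slow work of building a segment happens without any lock held, while the write lock covers just the final merge.

```go
package main

import (
	"fmt"
	"sync"
)

// Index maps a city name to the fence names in that city (hypothetical layout).
type Index struct {
	mu     sync.RWMutex
	byCity map[string][]string
}

func NewIndex() *Index { return &Index{byCity: make(map[string][]string)} }

// Query takes the read lock, so many queries can run concurrently.
func (ix *Index) Query(city string) []string {
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	return ix.byCity[city]
}

// Merge swaps a fully built segment into the main index. The write lock is
// held only for the map assignments, not while the segment is being built.
func (ix *Index) Merge(segment map[string][]string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	for city, fences := range segment {
		ix.byCity[city] = fences
	}
}

// buildSegment stands in for the slow work (fetching, JSON parsing)
// that is done without holding any lock.
func buildSegment(version int) map[string][]string {
	return map[string][]string{
		"city-a": {fmt.Sprintf("airport-v%d", version)},
	}
}

func main() {
	ix := NewIndex()
	ix.Merge(buildSegment(1))

	var wg sync.WaitGroup
	// Background writer: build each segment outside the lock, then merge it.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for v := 2; v <= 5; v++ {
			ix.Merge(buildSegment(v))
		}
	}()
	// Foreground readers run in parallel with the writer.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				_ = ix.Query("city-a")
			}
		}()
	}
	wg.Wait()
	fmt.Println(ix.Query("city-a")) // prints [airport-v5]
}
```

Merged slices are replaced wholesale and never modified in place, so a reader that obtained a slice before a merge can keep using it safely after releasing the read lock.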

Our experience

Looking back, we are glad we took the chance on Go and built our service in a new language. The highlights:

1. High developer productivity. Developers coming from C++, Java, or Node.js need only a short time to learn Go, and the code is easy to maintain. (Static typing makes things clearer, with no unpleasant surprises.)
2. Good performance in both throughput and latency. In our main data center, which serves non-China traffic, 40 machines ran at only 35% CPU usage under a peak load of 170k QPS in 2015. Response time was under 5 milliseconds at the 95th percentile and under 50 milliseconds at the 99th percentile.
3. Very reliable. The service has had 99.99% uptime since launch. The downtime we did have was caused mainly by beginner programming errors and a file descriptor leak in a third-party library. We have not yet run into any problem with the Go runtime itself.


What's next?

Uber has historically relied mostly on Node.js and Python, but many of Uber's new services are now being built in Go. Go is the future at Uber, so if you're passionate about Go, whether as an expert or a beginner, you're welcome to apply: we're hiring Go developers.

Photo credit: "Golden Gate Gopher" by Conor Myhrvold, taken in San Francisco's Golden Gate Park. About the title: the gopher (Go Gopher) is the mascot and logo of the Go project.

 
