Data-driven startups have an endless future

"We're going to use data to make every decision. We're going to build a data-driven company." Go to Silicon Valley and you'll hear similar rhetoric everywhere, at least since Google became the world's most powerful company.

That quote comes from Mike Curtis, Airbnb's vice president of engineering. He joined the apartment-sharing startup six months ago, after nearly two years as a director of engineering at Facebook. We talked last week about what becoming data-driven really means for Airbnb, and how Curtis and his engineering team are making it happen. Like his peers at other data-rich internet companies, Curtis believes the work of Airbnb's data scientists is intrinsically linked to the work of the company's strategic leaders.

"We think we're pushing data science in the travel space, and so far we've probably done more than anyone," Curtis said. "In the long run, doing this, and making money in the process, will require some sophisticated tools."

The grand vision: making apartment sharing more human

Airbnb's product depends on human connection, and one of its largest data problems is finding the best way to make search work for each individual guest. "We want guests to find places that closely match what they are looking for," he said.

(Photo: Mike Curtis)

However, he added, figuring out how to rank search results for each user is a very hard algorithmic problem. We didn't go much further into the details, but the problem seems clear. It's easy to sort results by group or by geography, but accurately accounting for factors specific to a user, such as preferences, social connections, rental history, reviews, and other data points, adds another level of complexity. (Airbnb's data about specific cities, guest and host neighborhoods, and other metadata are also factors to consider.)
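To make the ranking problem concrete, here is a minimal sketch of per-guest scoring in Python. It is not Airbnb's method: the `Listing` and `Guest` fields, the `WEIGHTS` values, and the scoring formula are all hypothetical, chosen only to show how the same listings can rank differently once guest-specific signals are mixed in.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    distance_km: float   # distance from the searched location
    avg_review: float    # average review score, 0 to 5
    price: float         # nightly price


@dataclass
class Guest:
    max_price: float           # hypothetical stated budget
    friends_stayed: frozenset  # names of listings booked by social connections


# Hypothetical weights; a real system would learn these from data.
WEIGHTS = {"proximity": 0.4, "reviews": 0.3, "budget": 0.2, "social": 0.1}


def score(listing: Listing, guest: Guest) -> float:
    """Combine listing-level and guest-level signals into one relevance score."""
    proximity = 1.0 / (1.0 + listing.distance_km)  # closer is better
    reviews = listing.avg_review / 5.0             # normalise to 0..1
    budget = 1.0 if listing.price <= guest.max_price else 0.0
    social = 1.0 if listing.name in guest.friends_stayed else 0.0
    return (WEIGHTS["proximity"] * proximity
            + WEIGHTS["reviews"] * reviews
            + WEIGHTS["budget"] * budget
            + WEIGHTS["social"] * social)


def rank(listings, guest):
    """Return listings sorted from most to least relevant for this guest."""
    return sorted(listings, key=lambda item: score(item, guest), reverse=True)
```

With one central but pricey listing and one cheaper listing a kilometre away, a guest on a tight budget sees the cheaper place first, while a guest with a larger budget and a friend who stayed at the central one sees the opposite order. Even this toy version needs hand-picked weights for four signals, which hints at why the real problem is hard.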

Twitter's personalized search engine, which leans on data science because so many factors go into determining relevance, is a good example of how hard this is to get right.

Curtis says Airbnb is also doing a lot of number crunching to help hosts find the best rental price.

Inside the company, Airbnb wants to match Curtis's former employer, Facebook. Facebook has a reputation for building tools on top of Hadoop, and almost everyone at Facebook now uses Hadoop directly or indirectly. Curtis says Facebook "really lets employees get in touch with data and identify key issues. ... I want to do the same as we build Airbnb's products."

Mesos for all kinds of workloads

One of the key strategic tools in Airbnb's pocket is an open-source cluster management project called Mesos, which Airbnb is using to realize its data ambitions. Mesos, which came out of the AMPLab at the University of California, Berkeley, lets users run multiple types of computing frameworks (or possibly just a few different Hadoop clusters) on a single shared pool of resources. Mesos made its name on the web largely through Twitter, and it became a top-level Apache project last week.
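The idea of several frameworks sharing one resource pool can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the Mesos API: the `Pool` and `Framework` classes are hypothetical, only CPUs are tracked, and free CPUs are offered to frameworks in list order, whereas a real cluster manager applies a more sophisticated allocation policy.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    name: str
    cpus_per_task: int
    pending_tasks: int
    running: int = 0

    def accept(self, offered_cpus: int) -> int:
        """Launch as many pending tasks as the offer allows; return CPUs used."""
        launch = min(self.pending_tasks, offered_cpus // self.cpus_per_task)
        self.pending_tasks -= launch
        self.running += launch
        return launch * self.cpus_per_task


@dataclass
class Pool:
    free_cpus: int
    frameworks: list = field(default_factory=list)

    def offer_round(self) -> None:
        """Offer the currently free CPUs to each framework in turn."""
        for fw in self.frameworks:
            self.free_cpus -= fw.accept(self.free_cpus)


# Two different frameworks draw from the same 16-CPU pool.
pool = Pool(16, [Framework("hadoop", cpus_per_task=4, pending_tasks=2),
                 Framework("spark", cpus_per_task=2, pending_tasks=10)])
pool.offer_round()
```

After one round the Hadoop-style framework has taken what it needs and the Spark-style framework has used the remainder, with its leftover tasks waiting for the next offer. Without a shared pool, each framework would need its own separately sized cluster.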

(Figure: the Mesos architecture)

For Airbnb, the key to Mesos is that it lets the company's engineers get the most out of its Amazon Web Services-based infrastructure for more than just Hadoop. Curtis explained that Airbnb uses Hadoop in many places, but it wants to experiment with Storm for stream processing, and it wants to run Hive queries with Spark (which also came out of the AMPLab) faster than Hadoop allows.

In fact, Spark could be particularly useful for search ranking, pricing, and services that detect "bad behavior," says Curtis. "Many of these things involve machine learning models," and Spark's performance advantage over Hadoop means Airbnb can run those models over and over in a relatively short time.

Chronos, a distributed job scheduler that Airbnb built for cloud environments, also runs on Mesos.

Resource management and efficiency are certainly a big reason Airbnb uses Mesos, but Curtis says Mesos can also help Airbnb go a step further in its overall engineering strategy, which is to build small teams that move quickly. The better Airbnb gets at allocating resources automatically, the more time its engineers have for other work. Ideally, he said, the idea is to use the automation Mesos provides so that a very small number of engineers can have a big impact.

The cloud: yes! Elastic MapReduce: not so much?

While Airbnb is still using the AWS cloud, it is considering migrating from the popular Elastic MapReduce Hadoop service to Mesos. According to Curtis, there are several reasons for the move, but the main one is to use Mesos to manage all the frameworks Airbnb needs to run, which would give Airbnb more granular control over its Hadoop environment. Elastic MapReduce, he says, is largely Amazon's own Hadoop distribution, which means users have to rely on AWS for patches and similar changes, and Elastic MapReduce only does Hadoop.

Another engineer, Brenden Matthews, gave a talk at Twitter headquarters last week about Airbnb's move from Elastic MapReduce to Mesos. His slides list more reasons for the switch, along with the common challenges of running Hadoop in the cloud.

Still, AWS is generally reliable, and the flexibility of the cloud, combined with Mesos, means Airbnb can do whatever it wants, whenever it wants, says Curtis. Airbnb's ad hoc analytic queries do not interfere with its long-running batch workflows, and vice versa.

"How fast a cluster job runs comes down entirely to resource provisioning," Curtis said. "How many resources do we put in the pool?"

Curtis got his start at AltaVista in the late '90s and later spent time at AOL, Yahoo and Facebook. He smiles when he talks about resources: startups like Airbnb have a lot of room to grow when they need to invest so little in buying and managing servers. "Think about it today, it's all abstracted ..." he said. "It's so beautiful and amazing."
