Scaling the Messages Application Back End

Facebook Messages seamlessly integrates many communication channels: email, SMS, Facebook Chat, and the existing Facebook Inbox. Combining all this functionality and offering a powerful user experience involved building an entirely new infrastructure stack from the ground up.

To simplify the product and present a powerful user experience, integrating and supporting all of the above communication channels requires a number of services to run together and interact. The system needs to:

    • Scale, as we need to support millions of users with their existing message history.
    • Operate in real time.
    • Be highly available.

To overcome these challenges, we started laying down a new architecture. At the heart of the application back end are the application servers, which are responsible for answering all queries and accepting all writes into the system. They also interact with a number of other services to achieve this.

Each application server comprises:

    • API: the entry point for all get and set operations, which every client calls. An application server is the sole entry point into the system for any given user, and any data written to or read from the system needs to go through this API.
    • Distributed logic: to understand the distributed logic, we first need to understand what a cell is. The entire system is divided into cells, and each cell contains only a subset of users.

Understanding Cells

Cells give us many advantages:

    • They help us scale incrementally while limiting failure scenarios
    • Easy upgrades
    • Metadata store failures affect only a few users
    • Easy rollout
    • Flexibility to host cells in different data centers with multi-homing for disaster recovery

Each cell consists of a cluster of application servers, and each application server cluster is controlled by a set of ZooKeeper machines.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.

ZooKeeper is open source software that we use mainly for two purposes: as the controller for implementing sharding and failover of application servers, and as a store for our discovery service. Since ZooKeeper provides us with a highly available repository and notification mechanism, it goes a long way towards helping us build a highly available service.
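
To make the controller role concrete, here is a minimal sketch of how an application server might register itself in ZooKeeper and watch cluster membership. The znode path, session timeout, and class names are illustrative assumptions for this example, not details from the post.

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    import java.util.List;

    // Registers an application server under an ephemeral znode and watches the
    // membership list, so peers are notified when a server joins or dies.
    public class ServerRegistry implements Watcher {
        // Hypothetical znode layout; assumes the parent path already exists.
        private static final String MEMBERS_PATH = "/cell-01/app-servers";

        private final ZooKeeper zk;

        public ServerRegistry(String zkQuorum) throws Exception {
            // 15-second session timeout is an arbitrary illustrative value.
            this.zk = new ZooKeeper(zkQuorum, 15000, this);
        }

        public void register(String serverId) throws KeeperException, InterruptedException {
            // EPHEMERAL: the node disappears automatically if this server's session
            // dies, which is what lets neighbours detect failures and take over.
            zk.create(MEMBERS_PATH + "/" + serverId,
                      new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE,
                      CreateMode.EPHEMERAL);
            watchMembers();
        }

        private void watchMembers() throws KeeperException, InterruptedException {
            // Re-arm a watch on the children; ZooKeeper watches are one-shot.
            List<String> members = zk.getChildren(MEMBERS_PATH, true);
            System.out.println("Live application servers: " + members);
        }

        @Override
        public void process(WatchedEvent event) {
            if (event.getType() == Watcher.Event.EventType.NodeChildrenChanged) {
                try {
                    watchMembers(); // membership changed: rebuild the hash ring, etc.
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }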

Each application server registers itself in ZooKeeper by generating N tokens. The server uses these tokens to take N virtual positions on a consistent hash ring, which is used to shard users across these nodes. In case of failure, the neighboring nodes take over the load for those users, hence distributing the load evenly. This also allows easy addition and removal of nodes to and from the application server cluster.
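
The sharding itself can be pictured as a consistent hash ring with virtual nodes. The sketch below is a generic implementation under assumed choices (MD5 as the hash function, a configurable token count); the post does not specify the actual hash function or token count used.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // A minimal consistent-hash ring with virtual nodes ("tokens"), in the spirit
    // of the sharding described above.
    public class ConsistentHashRing {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private final int tokensPerServer;

        public ConsistentHashRing(int tokensPerServer) {
            this.tokensPerServer = tokensPerServer;
        }

        public void addServer(String serverId) {
            for (int i = 0; i < tokensPerServer; i++) {
                ring.put(hash(serverId + "#" + i), serverId);
            }
        }

        public void removeServer(String serverId) {
            // On failure, removing the dead server's tokens makes its users fall
            // through to the next (neighbouring) tokens on the ring.
            for (int i = 0; i < tokensPerServer; i++) {
                ring.remove(hash(serverId + "#" + i));
            }
        }

        public String serverFor(String userId) {
            if (ring.isEmpty()) {
                throw new IllegalStateException("No application servers registered");
            }
            long h = hash(userId);
            // The owning server is the first token at or after the user's position,
            // wrapping around to the start of the ring if necessary.
            SortedMap<Long, String> tail = ring.tailMap(h);
            return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
        }

        private static long hash(String key) {
            try {
                byte[] d = MessageDigest.getInstance("MD5")
                                        .digest(key.getBytes(StandardCharsets.UTF_8));
                // Fold the first 8 bytes of the digest into a long ring position.
                long h = 0;
                for (int i = 0; i < 8; i++) {
                    h = (h << 8) | (d[i] & 0xff);
                }
                return h;
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

Because each server owns many small token ranges rather than one contiguous range, removing a failed server's tokens spreads its users across several neighbours instead of overloading a single node, which is what keeps the load distribution even during failover.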

    • Application business logic: This is where the magic happens. The business logic is responsible for making sense of all user data, storing and retrieving it, and applying all the complex product operations to it. It also has a dedicated cache that acts as a write-through cache, since the application servers are the only entry points for reading and writing any given user's data. This cache stores the entire recent image for the user and gives us a very high cache hit rate. The business logic also interacts with the web servers to respect user privacy and to apply any policies.
    • Data access layer: The data access layer is the schema used to store the user's metadata. It consists mainly of a time-sequenced log, which is the absolute source of truth for the user's data and is used to back up, retrieve, and regenerate user data. The schema also includes snapshots that represent the serialized user objects understood by the business logic. This layer was designed to present a generic interface to the application servers while keeping the underlying store pluggable.
    • Metadata store: Each cell also has a dedicated metadata store; we use HBase, and the data access layer interacts with HBase to provide storage functionality. Late last year we talked about our Messages storage infrastructure, which is built on top of Apache HBase. (A rough sketch of how the log-plus-snapshot schema might map onto HBase follows this list.)
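
As an illustration of how the time-sequenced log and snapshots described above might be laid out, here is a minimal sketch against the standard HBase client API. The table name, column families, qualifiers, and class names are assumptions for the example; the actual schema behind the data access layer is not described in this post.

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // One row per user: a "log" family holding time-sequenced actions (the source
    // of truth) and a "snap" family holding the serialized user snapshot.
    public class MetadataStore implements AutoCloseable {
        private static final TableName TABLE = TableName.valueOf("user_metadata");
        private static final byte[] CF_LOG = Bytes.toBytes("log");
        private static final byte[] CF_SNAP = Bytes.toBytes("snap");

        private final Connection connection;

        public MetadataStore() throws Exception {
            this.connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
        }

        // Append one action to the user's time-sequenced log.
        public void appendAction(String userId, long timestamp, byte[] action) throws Exception {
            try (Table table = connection.getTable(TABLE)) {
                Put put = new Put(Bytes.toBytes(userId));
                put.addColumn(CF_LOG, Bytes.toBytes(Long.toString(timestamp)), action);
                table.put(put);
            }
        }

        // Store a serialized snapshot so the business logic can avoid replaying the log.
        public void writeSnapshot(String userId, byte[] serializedUser) throws Exception {
            try (Table table = connection.getTable(TABLE)) {
                Put put = new Put(Bytes.toBytes(userId));
                put.addColumn(CF_SNAP, Bytes.toBytes("latest"), serializedUser);
                table.put(put);
            }
        }

        public byte[] readSnapshot(String userId) throws Exception {
            try (Table table = connection.getTable(TABLE)) {
                Result result = table.get(new Get(Bytes.toBytes(userId)));
                return result.getValue(CF_SNAP, Bytes.toBytes("latest"));
            }
        }

        @Override
        public void close() throws Exception {
            connection.close();
        }
    }

Keeping the log as the source of truth and treating the snapshot as a derived object means a stale or corrupted snapshot can always be regenerated by replaying the user's log.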

Finally, the whole system is made up of a number of such cells.

Other Messages Services

The Messages application back end needs to parse email messages and attachments, and also to discover the right application servers for a given user. This is achieved with the following services:

    • MTA proxy: This service receives all incoming email messages and is responsible for parsing the email RFCs, attachments, large email bodies, and so forth. These parsed-out values are stored in a dedicated Haystack cluster (the same key/value store that we use for photos). Once the proxy has created a lightweight email object, it talks to the appropriate application server and delivers the message. Talking to the appropriate application server involves figuring out the cell and machine a particular user resides on, which brings us to the discovery service.
    • Discovery service: This consists of a set of user-to-cell mappings. Every client needs to query the discovery service before it can contact an application server for any request. Given the stringent requirements, this service needs to be very highly available, scalable, and performant.
    • Distributed logic client: These clients listen for ZooKeeper notifications and watch for all changes in application server cluster state. Each application server cluster, or cell, has a dedicated client. These clients live in the discovery service process; once the discovery service has mapped the user to a cell, it queries that cell's client, which runs the consistent hash algorithm to figure out the correct application server node for the user (the lookup path is sketched after this list).
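
Putting those pieces together, a lookup resolves the user to a cell and then asks that cell's ring for the owning node. The sketch below is a simplified illustration; the class and method names are ours, and ConsistentHashRing refers to the hypothetical ring sketched earlier.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // user -> cell, then the cell's consistent-hash ring -> application server.
    public class DiscoveryService {
        private final Map<String, String> userToCell = new ConcurrentHashMap<>();
        private final Map<String, ConsistentHashRing> cellRings = new ConcurrentHashMap<>();

        public void registerCell(String cellId, ConsistentHashRing ring) {
            cellRings.put(cellId, ring);
        }

        public void assignUser(String userId, String cellId) {
            userToCell.put(userId, cellId);
        }

        // Every client calls this before talking to an application server.
        public String applicationServerFor(String userId) {
            String cell = userToCell.get(userId);
            if (cell == null) {
                throw new IllegalStateException("No cell mapping for user " + userId);
            }
            // The cell's distributed-logic client keeps its ring in sync via
            // ZooKeeper watches (see the registration sketch above).
            return cellRings.get(cell).serverFor(userId);
        }
    }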

The Messages application back end also relies on the following services:

    • Memcache dirty service: Message counts are queried very frequently from the home page to accurately display the message notification jewels, and these counts are cached in memcache so that the home page can render as quickly as possible. As new messages arrive, these cache entries need to be dirtied from the application servers, so this dedicated service runs in every data center to dirty these caches (a minimal invalidation sketch follows this list).
    • User index service: This provides the social information for each user, such as friends, friends of friends, and so forth. This information is used to implement the social features of messaging. For example, for every message added to the system, the application server node queries this service to determine whether the message is from a friend or a friend of a friend, and directs it to the appropriate folder.
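
As an illustration of the dirtying step, the sketch below deletes a cached count entry through the spymemcached client. The library choice and key format are assumptions for the example, not details from the post.

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    // When a new message arrives, invalidate ("dirty") the cached unread-count
    // entry so the next home-page load recomputes it from the application server.
    public class MemcacheDirtier {
        private final MemcachedClient memcache;

        public MemcacheDirtier(String host, int port) throws Exception {
            this.memcache = new MemcachedClient(new InetSocketAddress(host, port));
        }

        public void dirtyUnreadCount(String userId) {
            // Key format is illustrative; deleting forces a cache miss on the next read.
            memcache.delete("msg_unread_count:" + userId);
        }

        public void shutdown() {
            memcache.shutdown();
        }
    }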

The clients of the application back end include MTAs for email traffic, IMAP, web servers, SMS clients, and web chat clients. Apart from the MTAs, which talk to the MTA proxies, all other clients talk directly to the application servers.

Given that we built this services infrastructure from scratch, one of the most important things was to have the appropriate tools and monitoring in place to push this software almost daily without any service disruption. So we ended up building a number of useful tools that give us a view of the various cells, enable and disable cells, manage the addition and removal of hardware, do rolling deployments without disrupting service, and give us a view of the performance and bottlenecks in various parts of the system.

All these services need to work in tandem and be available and reliable for messaging to work. We are in the process of importing millions of users into this system every day; very soon, every Facebook user will have access to the new Messages product.

At Facebook, taking on big challenges is the norm, and building this infrastructure and getting it up and running is a prime example of that. A lot of sweat has gone into bringing this to production. I would like to thank every individual who has contributed to this effort and continues to do so. This effort involved not only the Messages team but also a number of interns and various teams across the company.

We spent a few weeks setting up a test framework to evaluate clusters of MySQL, Apache Cassandra, Apache HBase, and a couple of other systems. We ultimately chose HBase. MySQL proved unable to handle the long tail of data well; as indexes and data sets grew large, performance suffered. We found Cassandra's eventual consistency model to be a difficult pattern to reconcile with our new Messages infrastructure.

HBase comes with very good scalability and performance for this workload, as well as a simpler consistency model than Cassandra. While we've done a lot of work in HBase itself over the past year, when we started we also found it to be the most feature-rich in terms of our requirements (auto load balancing and failover, compression support, multiple shards per server, etc.). HDFS, the underlying filesystem used by HBase, provides several nice features such as replication, end-to-end checksums, and automatic rebalancing. Additionally, our technical teams already had a lot of development and operational expertise in HDFS from data processing with Hadoop. Since we started working on HBase, we've been focused on committing our changes back to HBase itself and working closely with the community. The open source release of HBase is what we're running today.
