At JavaOne 2008, developers from the well-known social networking site LinkedIn gave two presentations on the architecture of the LinkedIn website:
LinkedIn - A Professional Social Network Built with Java Technologies and Agile Practices
LinkedIn Communication Architecture
Here are some basic figures about the LinkedIn website:
LinkedIn is among the world's highest-traffic sites
22 million users
4 million unique visitors per month
40 million page views per day
2 million searches per day
250,000 invitations sent per day
1 million responses submitted per day
2 million email messages per day
LinkedIn System Architecture
Operating system: Solaris (running on Sun x86 and SPARC platforms)
Application servers: Tomcat and Jetty
Databases: Oracle and MySQL
No ORM (such as Hibernate); they use straight JDBC
ActiveMQ for JMS messaging (partitioned by message type and backed by MySQL)
Lucene as the foundation for search
Spring as the glue between components
Hudson for continuous integration and testing
2003-2005
2006 Architecture Changes
Read/write separation: a replica of the database was added to reduce the direct load on the core database, and a separate server manages data updates to the replica.
Search was moved out of the Cloud and runs on its own server.
Databus, a data bus for propagating database updates, was added. It is the core component for distributing updates, and any component that needs updates subscribes to the Databus.
[Figure: LinkedIn 2006 architecture. Image: http://www.jdon.com/artichect/images/linkedin.png]
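The 2006 changes above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not LinkedIn's actual Databus: the names `DatabusSketch`, `Bus`, `ChangeEvent`, `CoreDb`, and `Replica` are hypothetical, and in-memory maps stand in for the core and replica databases. Writes go to the core store and are published on the bus; the replica subscribes to the bus and serves reads.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: every name here is illustrative, not LinkedIn's API.
class DatabusSketch {

    // One change to a row, as it might travel over the bus.
    static final class ChangeEvent {
        final String key;
        final String value;

        ChangeEvent(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // The bus delivers each published change to every subscriber, in order.
    static final class Bus {
        private final List<Consumer<ChangeEvent>> subscribers = new ArrayList<>();

        void subscribe(Consumer<ChangeEvent> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(ChangeEvent event) {
            for (Consumer<ChangeEvent> subscriber : subscribers) {
                subscriber.accept(event);
            }
        }
    }

    // Stand-in for the core database: accepts writes and publishes them.
    static final class CoreDb {
        private final Map<String, String> rows = new HashMap<>();
        private final Bus bus;

        CoreDb(Bus bus) {
            this.bus = bus;
        }

        void write(String key, String value) {
            rows.put(key, value);
            bus.publish(new ChangeEvent(key, value));
        }
    }

    // Stand-in for the replica: applies changes from the bus and serves reads.
    static final class Replica {
        private final Map<String, String> rows = new HashMap<>();

        void apply(ChangeEvent event) {
            rows.put(event.key, event.value);
        }

        String read(String key) {
            return rows.get(key);
        }
    }

    public static void main(String[] args) {
        Bus bus = new Bus();
        CoreDb core = new CoreDb(bus);
        Replica replica = new Replica();
        bus.subscribe(replica::apply);

        core.write("member:1", "Alice");
        System.out.println(replica.read("member:1")); // prints "Alice"
    }
}
```

Because the replica only learns about changes through the bus, other consumers (a search indexer, for example) could subscribe the same way without touching the core database.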
2008 Architecture Changes
The WebApp no longer does everything itself; the business logic is split into many parts, each handled by a group of servers (services).
The WebApp still presents the user interface, but it manages user data, groups, and so on through these services.
Each service has its own domain database.
The new architecture allows other applications to connect to LinkedIn, such as the newly added recruiting and advertising businesses.
[Figure: LinkedIn 2008 architecture. Image: http://www.jdon.com/artichect/images/linkedin2008.png]
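A minimal sketch of the 2008 split, assuming hypothetical names throughout (`ServiceSplitSketch`, `ProfileService`, `GroupService`, `WebApp`): each service owns its own data store, and the web application only renders pages from what the services return. In-memory maps stand in for the per-service domain databases, and method calls stand in for remote calls to a service cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and method signatures are illustrative only.
class ServiceSplitSketch {

    // Owns profile data; no other class touches profileDb.
    static final class ProfileService {
        private final Map<Integer, String> profileDb = new HashMap<>();

        void save(int memberId, String name) {
            profileDb.put(memberId, name);
        }

        String nameOf(int memberId) {
            return profileDb.getOrDefault(memberId, "unknown");
        }
    }

    // Owns group membership data (one group per member, for brevity).
    static final class GroupService {
        private final Map<Integer, String> groupDb = new HashMap<>();

        void join(int memberId, String group) {
            groupDb.put(memberId, group);
        }

        String groupOf(int memberId) {
            return groupDb.getOrDefault(memberId, "none");
        }
    }

    // The web application holds no data of its own; it asks the services.
    static final class WebApp {
        private final ProfileService profiles;
        private final GroupService groups;

        WebApp(ProfileService profiles, GroupService groups) {
            this.profiles = profiles;
            this.groups = groups;
        }

        String renderProfilePage(int memberId) {
            return profiles.nameOf(memberId)
                    + " (group: " + groups.groupOf(memberId) + ")";
        }
    }

    public static void main(String[] args) {
        ProfileService profiles = new ProfileService();
        GroupService groups = new GroupService();
        profiles.save(1, "Alice");
        groups.join(1, "Java Developers");

        WebApp webApp = new WebApp(profiles, groups);
        System.out.println(webApp.renderProfilePage(1));
        // prints "Alice (group: Java Developers)"
    }
}
```

Since the services are the only path to the data, a second application (recruiting or advertising, say) could reuse `ProfileService` without going through the web application.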
LinkedIn Performance Figures
LinkedIn Clusters: Web event tracking and online search
6 nodes, up to GB of data, clients
Mixed load (% Get,% Put)
Throughput
1433 QPS (node)
4299 QPS (cluster)
Latency
GET
percentile 0.05 ms
% percentile 36.07 ms
percentile 60.65 ms
PUT
percentile 0.09 ms
% percentile 0.41 ms
percentile 1.22 ms
Cloud Cache
[Figure: LinkedIn cache. Image: http://www.jdon.com/artichect/images/linkedincache.png]
Cloud Cache Size
22M nodes, 120M edges
Requires 12 GB of RAM
40 instances run in the production environment
Rebuilding a Cloud instance from disk at startup takes 8 hours.
The cache is implemented in C++ and is accessed through JNI.
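To show why a graph of this size can be held in RAM, here is a rough sketch of a compact adjacency layout built from two flat `int` arrays. The layout, the class name `GraphCacheSketch`, and its methods are assumptions for illustration; the source gives only the node and edge counts and says the real cache is written in C++ behind JNI. Java is used here to keep the example self-contained, and edges are assumed to be undirected.

```java
import java.util.Arrays;

// Hypothetical sketch of an in-memory connection graph; not LinkedIn's code.
class GraphCacheSketch {
    private final int[] offsets;   // neighbors of v sit at offsets[v]..offsets[v+1]-1
    private final int[] neighbors; // all adjacency lists, concatenated

    // Builds the arrays from an undirected edge list; each edge is {a, b}.
    GraphCacheSketch(int nodeCount, int[][] edges) {
        offsets = new int[nodeCount + 1];
        for (int[] e : edges) {               // count the degree of each node
            offsets[e[0] + 1]++;
            offsets[e[1] + 1]++;
        }
        for (int v = 0; v < nodeCount; v++) { // prefix sums turn counts into offsets
            offsets[v + 1] += offsets[v];
        }
        neighbors = new int[offsets[nodeCount]];
        int[] next = Arrays.copyOf(offsets, nodeCount); // next free slot per node
        for (int[] e : edges) {               // store each edge in both directions
            neighbors[next[e[0]]++] = e[1];
            neighbors[next[e[1]]++] = e[0];
        }
    }

    // Returns a copy of the direct connections of the given node.
    int[] connectionsOf(int node) {
        return Arrays.copyOfRange(neighbors, offsets[node], offsets[node + 1]);
    }

    // True if a and b are directly connected.
    boolean connected(int a, int b) {
        for (int i = offsets[a]; i < offsets[a + 1]; i++) {
            if (neighbors[i] == b) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 2}, {2, 3} };
        GraphCacheSketch graph = new GraphCacheSketch(4, edges);
        System.out.println(Arrays.toString(graph.connectionsOf(0))); // prints "[1, 2]"
        System.out.println(graph.connected(1, 3));                   // prints "false"
    }
}
```

Flat primitive arrays avoid per-object overhead, which is one plausible reason to keep such a structure in native code or in primitive arrays rather than in ordinary Java collections.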
Voldemort
Used at LinkedIn; it is not a relational database.
It is a storage system with built-in in-memory caching, so no separate cache is required.
Cloud storage: Voldemort is used to serve read-only indexes, with the data files built in Hadoop. This supports data processing at the terabyte scale.
[Figure: Voldemort. Image: http://www.jdon.com/artichect/images/voldemort.png]
Data model
Compact, compressed binary data
Supported types include int, double, float, String, Map, List, Date, etc.
An example schema for member data:
{
'member_id': 'int32',
'first_name': 'string',
'last_name': 'string',
'age': 'int32'
...
}
Data is stored in Hadoop as sequence files of serialized records.
The data format (schema) is also saved in the sequence file.
The schema is read dynamically by Java/Pig jobs.
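The idea of a compact, schema-driven binary record can be illustrated with a short Java sketch. This is not Voldemort's actual wire format: the class `MemberRecordSketch` and its methods are hypothetical, only the `int32` and `string` types from the member schema above are handled, and the encoding is simply what `java.io.DataOutputStream` produces. The point is that field names live in the schema, so only the values are written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an illustrative encoding, not Voldemort's format.
class MemberRecordSketch {

    // Field name -> type name, in a fixed order, mirroring the member schema.
    static Map<String, String> schema() {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("member_id", "int32");
        s.put("first_name", "string");
        s.put("last_name", "string");
        s.put("age", "int32");
        return s;
    }

    // Writes only the values, in schema order; field names are not stored.
    static byte[] serialize(Map<String, String> schema, Map<String, Object> record) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<String, String> field : schema.entrySet()) {
                Object value = record.get(field.getKey());
                switch (field.getValue()) {
                    case "int32":
                        out.writeInt((Integer) value);
                        break;
                    case "string":
                        out.writeUTF((String) value);
                        break;
                    default:
                        throw new IllegalArgumentException("unsupported type: " + field.getValue());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in schema order and re-attaches the field names.
    static Map<String, Object> deserialize(Map<String, String> schema, byte[] data) {
        Map<String, Object> record = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (Map.Entry<String, String> field : schema.entrySet()) {
                switch (field.getValue()) {
                    case "int32":
                        record.put(field.getKey(), in.readInt());
                        break;
                    case "string":
                        record.put(field.getKey(), in.readUTF());
                        break;
                    default:
                        throw new IllegalArgumentException("unsupported type: " + field.getValue());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> member = new LinkedHashMap<>();
        member.put("member_id", 42);
        member.put("first_name", "Ada");
        member.put("last_name", "Lovelace");
        member.put("age", 36);

        byte[] data = serialize(schema(), member);
        System.out.println(data.length + " bytes"); // prints "23 bytes"
        System.out.println(deserialize(schema(), data));
    }
}
```

Storing the schema once alongside the data, rather than repeating field names in every record, is what lets a reader decode such records dynamically.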
LinkedIn Architecture--2008