[Article author: Sun Li. Link: http://www.cnblogs.com/sunli/]
Today I attended the Baidu Technology Salon organized by InfoQ, on the topic "App Engine Technology Application". Over the last two years I have heard a lot of talk about "cloud computing" and "virtualization", but this session offered really good technical sharing, and the topic was very attractive. Because of WiFi problems at the venue there was no live Weibo coverage, so I decided to write a blog post to summarize it.
The first talk was "Revealing the Secrets of Baidu Application Development Engine (Baidu App Engine)". Xiao Wei repeatedly stressed that Baidu App Engine (BAE) is not GAE, and that technically it is not the same type of system as GAE.
Below are some PPT records:
BAE is a standalone environment for developers.
A distributed environment for program execution
An automated tool chain for O&M
Distributed Multi-language programming framework
Unified management of all distributed resources of Baidu
BAE is a network-oriented operating system that provides online service development and operation platforms for Internet users.
You can compile code on it and use a debugger, and a BAE development library is provided. BAE is a cluster and is naturally distributed. It is a network operating system that manages and schedules distributed resources and supports the migration of Baidu's old services.
Unified cluster: multiple products share it, and each product uses its own account with independent resources
-- CPU, bandwidth, memory capacity, and storage capacity can be adjusted dynamically according to how popular a service is
Supports porting of traditional services
-- Direct porting
-- Porting with a small number of modifications
BAE can be used to develop complex enterprise-level applications
All shipping services are reclamation and support multi-account access quota
Server Manager:
-- Locates the process group it started; these processes may be local or on other machines
-- Manages the status of each process in the process group
-- Lets you configure dynamic monitoring of process changes in the process group
-- Pushes process changes to listener programs in real time
-- Provides DNS services
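The idea of watching a process group and pushing changes to listeners can be sketched as follows. This is only a minimal illustration, not BAE's actual interface: the class name `ProcessGroup`, the methods `subscribe`, `report`, and `status`, and the state strings are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# (process id, old state, new state); all names here are hypothetical
Change = Tuple[str, str, str]


class ProcessGroup:
    """Tracks the state of each process in a group and notifies listeners."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._states: Dict[str, str] = {}
        self._listeners: List[Callable[[Change], None]] = []

    def subscribe(self, listener: Callable[[Change], None]) -> None:
        """Register a listener that is called on every state change."""
        self._listeners.append(listener)

    def report(self, process_id: str, state: str) -> None:
        """Record a state report; push it to listeners only if it changed."""
        old = self._states.get(process_id, "unknown")
        if old == state:
            return
        self._states[process_id] = state
        for listener in self._listeners:
            listener((process_id, old, state))

    def status(self) -> Dict[str, str]:
        """Return a copy of the current state of every known process."""
        return dict(self._states)
```

A listener here is any callable, so a list's `append` method is enough to collect the pushed changes while experimenting. Process ids such as `"host2:7"` stand for processes on other machines.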
Compute node functions: an execution environment for user processes, with PHP support
Dynamic scheduling details: 1. How to scale resources, with dedicated code to manage servers
One machine can allocate 1000 processes at the same time to provide services for 1000 small websites
Processes of different services are isolated from each other, and each process is limited (virtual container, virtualized process). Each process accesses only its own directory and has its own limits on memory capacity, CPU time, disk capacity, and NIC
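As a rough illustration of per-process limits, the sketch below starts a child process in its own working directory with a CPU-time limit and a file-size limit, using Python's standard `resource` module on a Unix system. It is not how BAE implements its virtual containers. The helper names `_apply_limits` and `run_limited` and the default values are assumptions, and real isolation of directories, memory, and the NIC would need mechanisms such as chroot, cgroups, or containers.

```python
import resource
import subprocess


def _apply_limits(cpu_seconds: int, max_file_bytes: int) -> None:
    """Runs in the child just before exec and lowers the child's own limits."""
    for kind, wanted in ((resource.RLIMIT_CPU, cpu_seconds),
                         (resource.RLIMIT_FSIZE, max_file_bytes)):
        _, hard = resource.getrlimit(kind)
        # An unprivileged process cannot exceed the existing hard limit.
        if hard != resource.RLIM_INFINITY:
            wanted = min(wanted, hard)
        # Set soft and hard limits together so the child cannot raise them.
        resource.setrlimit(kind, (wanted, wanted))


def run_limited(argv, workdir, cpu_seconds=5, max_file_bytes=1 << 20):
    """Run argv in workdir with CPU-time and file-size limits (Unix only)."""
    return subprocess.run(
        argv,
        cwd=workdir,  # working directory only; this is not a chroot
        capture_output=True,
        text=True,
        preexec_fn=lambda: _apply_limits(cpu_seconds, max_file_bytes),
    )
```

The limits apply only to the child, so the parent (the scheduler in this picture) keeps its own limits unchanged.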
Kernel Time Scheduling
Distributed database, distributed KV
Message queue, cache service, crontab
Internet proxy
Current work
Supports migration of old Linux services
C Language
Virtual machines are used to replace FastCGI processes. PHP processes hardly communicate with each other, whereas C programs start multiple processes with a lot of inter-process communication. PHP code packages are relatively large (video, Flash, etc.), while C processes are very small, and the program and data are separated
Optimization of Process Communication
-- IPC: optimized for communication speed by switching to shared-memory communication
-- RPC uses TCP communication: the number of effective connections between hosts is optimized by establishing a set of pipelines, so that processes on the same host share a unified pipeline
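Letting the processes on one host share a single pipeline amounts to multiplexing their messages over one connection. The sketch below shows one simple way to do that with length-prefixed frames tagged by sender. The frame layout (`!II` header) and the function names are assumptions for illustration, not the format used at Baidu.

```python
import struct
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Frame header: sender process id and payload length, both unsigned 32-bit,
# in network byte order. This layout is hypothetical.
HEADER = struct.Struct("!II")


def pack_frame(sender: int, payload: bytes) -> bytes:
    """Prefix a payload with its sender id and length."""
    return HEADER.pack(sender, len(payload)) + payload


def multiplex(messages: Iterable[Tuple[int, bytes]]) -> bytes:
    """Merge messages from many processes into one byte stream."""
    return b"".join(pack_frame(sender, payload) for sender, payload in messages)


def demultiplex(stream: bytes) -> Dict[int, List[bytes]]:
    """Split a shared byte stream back into per-process message lists."""
    per_process: Dict[int, List[bytes]] = defaultdict(list)
    offset = 0
    while offset < len(stream):
        sender, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        per_process[sender].append(stream[offset:offset + length])
        offset += length
    return dict(per_process)
```

With this scheme the number of TCP connections between two hosts no longer grows with the number of processes on each host, at the cost of eight header bytes per message.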
Baidu Infrastructure Department iis@baidu.com
The goals are to improve machine utilization and unify the development model of the entire company. When the time is ripe, it may be opened to the public.
Next was the talk by Sina's Cong Lei (@ Kobe), "In-Depth SAE Cloud Computing Architecture". This is a topic he had not presented before, and it goes deep into the internals of the SAE system architecture, which was very impressive. Below are some PPT records: Sina App Engine
This section focuses on the following:
1. Cloud service
2. RDC
3. Memcachex
4. TaskQueue
Currently, SAE provides the following services:
PHP, stor, memcachex, DB, RDC, taskqueue, cron, deferredjobs, fetchurl, tmpfs, appconfig, Smail, image, xhprof
Synchronous computing pool and asynchronous computing pool
RDC target:
1. Monitor millions of databases, including heartbeat check, master-slave synchronization check, and node load
2. Manage millions of databases, including starting, stopping, migrating, restarting, and switching
3. HA for passive Replication
4. Supports the MySQL 5 communication protocol. The proxy is completely transparent and has low proxy overhead.
5. No dependency on state, so it supports horizontal scaling
6. Provides database isolation to ensure the security of the entire cluster.
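The monitoring target in item 1 (heartbeat check, master-slave synchronization check, node load) can be pictured as a per-node check like the one below. This is only a sketch: the `NodeSample` fields, the threshold values, and the problem labels are assumptions chosen for the example, not RDC's real rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeSample:
    """One monitoring sample for a database node (hypothetical fields)."""
    name: str
    seconds_since_heartbeat: float
    replication_lag_seconds: float
    load: float  # 0.0 (idle) to 1.0 (saturated)


def check_node(sample: NodeSample,
               heartbeat_timeout: float = 10.0,
               max_lag: float = 30.0,
               max_load: float = 0.9) -> List[str]:
    """Return the list of problems found for one node; empty means healthy."""
    problems: List[str] = []
    if sample.seconds_since_heartbeat > heartbeat_timeout:
        problems.append("heartbeat missed")
    if sample.replication_lag_seconds > max_lag:
        problems.append("replication lagging")
    if sample.load > max_load:
        problems.append("overloaded")
    return problems
```

A monitor covering millions of databases would run such a check for every node on a schedule and hand the non-empty results to the management side (migrate, restart, switch).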
RDC: RDC is a subset of MySQL.
1. Multithreading vs multi-process (RDC is multi-process)
2. SQL parsing: lexical analysis vs syntax analysis
3. Query cache
RDC is not responsible for horizontal scaling of user databases, so horizontal scaling requires you to perform table sharding on your own.
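Since sharding is left to the application, a common approach is to derive the table name from a hash of the sharding key. The sketch below is one minimal way to do it. The function name `shard_table`, the `name_index` naming scheme, and the use of CRC32 are assumptions for illustration and are not prescribed by SAE.

```python
import zlib


def shard_table(base_name: str, key: str, shard_count: int) -> str:
    """Map a sharding key to one of shard_count tables, e.g. orders_3."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # CRC32 is stable across runs and machines, unlike Python's built-in
    # hash() for strings, so the same key always maps to the same table.
    index = zlib.crc32(key.encode("utf-8")) % shard_count
    return f"{base_name}_{index}"
```

The application then builds its SQL against the returned table name. Changing `shard_count` later moves most keys to different tables, so the count should be chosen with growth in mind.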
RDC provides a master-slave DB structure. The upper layer only supports read/write splitting, to ensure the security and reliability of the entire database platform.
RDC makes predictions based on its own algorithm and blocks some SQL statements in advance.
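Read/write splitting over a master-slave structure can be sketched as a small router that sends SELECT statements to a replica and everything else to the master. This is a simplified illustration, not RDC's implementation: the class name `ReadWriteRouter` is hypothetical, and classifying a statement by its first keyword is a shortcut where a real proxy parses the SQL.

```python
import itertools
from typing import Optional, Sequence


class ReadWriteRouter:
    """Routes reads to replicas in round-robin order and writes to the master."""

    def __init__(self, master: str, replicas: Sequence[str]) -> None:
        self.master = master
        self._replicas: Optional[itertools.cycle] = (
            itertools.cycle(replicas) if replicas else None
        )

    def route(self, sql: str) -> str:
        """Return the name of the server that should execute this statement."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Anything that is not clearly a read goes to the master, which is
        # the safe default for statements this sketch does not recognize.
        return self.master
```

Because replication is asynchronous, a read issued right after a write may not see that write on a replica. Applications that need to read their own writes have to send those reads to the master.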
RDC strongly recommends that you follow correct MySQL calling habits and check the return value of every MySQL function call.
RDC pre-judgment mechanism select update insert **
Three blocking lines, current SQL,
SQL concurrent execution time and
Horizontal scaling,
Monitoring and system integration
Memcachex
1. Low overhead
2. HA
3. Statistics
4. Connection protector
5. Data dump
Taskqueue (callback): simple offline processing of tasks
Deferredjobs: system-level invocation of heavyweight asynchronous tasks, such as database import and export. Ordered queue and concurrent queue
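The difference between an ordered queue and a concurrent queue can be illustrated with the standard `concurrent.futures` module. This is a sketch of the two behaviours only, not SAE's queue implementation, and the function names `run_ordered` and `run_concurrent` are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence

Task = Callable[[], Any]


def run_ordered(tasks: Sequence[Task]) -> List[Any]:
    """Ordered queue: a single worker runs tasks strictly one after another."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def run_concurrent(tasks: Sequence[Task], workers: int = 4) -> List[Any]:
    """Concurrent queue: several workers run tasks at the same time.

    Results come back in submission order, but the tasks themselves may
    start and finish in any order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

An ordered queue suits tasks that depend on each other, while a concurrent queue gives better throughput for independent tasks.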
Hard hash multi-process
Non-blocking timeout
Master-slave passive replication, memory-level master-slave Replication
Worker delay wait time
Worker death Wake-up check
InfoQ will release the detailed materials later; please watch: http://www.infoq.com/cn/
Summary:
Baidu App Engine is used internally and is not yet open. BAE is lower-level and more transparent.
SAE is open to users, so its security is higher than that of Baidu's platform, and its maturity has also been verified by users. SAE is closer to the application and provides a lot of services, which is very good. In effect it helps users end up with a high-reliability architecture naturally. It also provides resource monitoring to detect and control inefficient code, which in fact is also a benefit for users.
In the open discussion session after the meeting, I raised the topic "How to quickly build an enterprise's internal AE". From the discussion I learned that some people are actually doing similar things. However, unlike BAE and SAE, they are still thinking in terms of complex, heavyweight cloud computing.
I personally think that a company, especially a medium-sized one, can first turn its applications into services and then abstract those services into clusters. That is the enterprise's internal AE.
Most of this article is PPT records, and the layout is not very good. The notes may not express everything clearly, but I believe they will be helpful to interested friends.