Internet Server Technology -- How to Learn (Part 2)


Today I'd like to talk with you about distributed components. When it comes to components, Lao Wang used to think they were something lofty: hearing that so-and-so at such-and-such company had written some component, I would feel a surge of admiration. Later, after writing a few so-called components myself, I realized there is nothing mysterious about them. Here is how Lao Wang understands the term: a component is a relatively independent, targeted, and potentially reusable function, class, library, module, system, or service that helps a larger environment get its work done. For example, if you write a print function for a text-editing program, that function can be called a print component; if you provide a print service to a word-processing system, that service can also be called a print component. That is Lao Wang's understanding, and it is not necessarily complete.

Now, in Internet distributed systems specifically, some components come up again and again and may play a key role in our architecture design. Lao Wang brought a few of them along to discuss today. Okay, let's cut to the chase.

Naming Service:

In the previous article on distributed architecture (if you haven't read it yet, follow Lao Wang's account, Simplemain, and look it up), the naming service was mentioned many times. What on earth is it? We all know DNS (the thing that turns www.baidu.com into an IP when you ping it); a naming service does something similar, turning the names of our internal services into IPs and ports. Why do we need it, and how important is it? Don't be impatient; read on.

First stage:


When we don't have many services or machines, say two services called Service-A (hereafter A) and Service-B (hereafter B), each with two servers on different IPs and ports, the simplest approach is to write the IP and port of every machine of B into the configuration on every machine of A. Done. It seems perfect.

But once we scale up, the problems start.

Second stage:

When the service grows or the load increases, we need to add machines. Say we add server number 3 for B. What do we have to do? First bring up B's server 3, then add its IP and port to the B section of the configuration on every machine of A. Doesn't seem like a big deal, does it?

But what if we want to add many machines to both A and B?

And what if we add new services C, D, E..., each with an average of ten servers? Then we will go crazy!

Similarly, if a server fails and has to be replaced or taken offline, do we really have to update all those configurations again?

When I was at Baidu, the relationships between company modules were such that adding or modifying one module meant every module connected to it had to change as well. Sometimes, for one small feature, you had to bother N teams. It was maddening. Later, after moving to Baicizhan, the first piece of systematic work Lao Wang did was to build and launch a naming service. Looking back now, that decision was absolutely right.

So how does it work?

Registration:

Every service registers its name, IP, and port with the naming service (hereafter NS), and NS maintains N queues:


Queue 1 (service A): (192.168.1.101:80) (192.168.1.102:80)

Queue 2 (service B): (192.168.2.101:80) (192.168.2.102:80)

When a new machine is added, it registers itself with NS; nobody else needs to care at all.

Query:

When A needs to call B, it asks NS: "Buddy, I want to call B's service; give me one of its IPs and ports." NS then returns an available IP and port to A according to some load-balancing strategy (random, consistent hashing, and so on). Even if B adds new machines, A doesn't care, because the NS nanny takes care of it.

Health Check:

If a machine breaks down or a service crashes, NS detects the problem within a short time through heartbeat packets. It then removes the problematic IP and port from the available queue, keeping the entries in the queue reliable and stable.
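To make the three operations concrete, here is a minimal, in-memory sketch in Java. It is purely illustrative: the class NamingService, the 10-second staleness threshold, and all method names are assumptions made up for this example, not any real naming-service API. It registers endpoints, answers queries with a simple random load-balancing pick, and evicts entries whose heartbeats have gone stale.

import java.util.*;
import java.util.concurrent.*;

// Toy, in-memory naming service: register / query / heartbeat-based eviction.
// All names here are illustrative, not a real API.
public class NamingService {

    static final class Endpoint {
        final String ip;
        final int port;
        volatile long lastHeartbeatMs;                 // refreshed by heartbeats
        Endpoint(String ip, int port) {
            this.ip = ip;
            this.port = port;
            this.lastHeartbeatMs = System.currentTimeMillis();
        }
        @Override public String toString() { return ip + ":" + port; }
    }

    private static final long STALE_MS = 10_000;       // evict after 10s of silence (arbitrary)
    private final ConcurrentHashMap<String, List<Endpoint>> services = new ConcurrentHashMap<>();
    private final Random random = new Random();

    // Registration: a new server adds itself under its service name.
    public void register(String service, String ip, int port) {
        services.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>())
                .add(new Endpoint(ip, port));
    }

    // Query: return one available endpoint, here with a random load-balancing strategy.
    public Endpoint query(String service) {
        List<Endpoint> list = services.getOrDefault(service, Collections.emptyList());
        if (list.isEmpty()) return null;
        return list.get(random.nextInt(list.size()));
    }

    // Health check, part 1: servers send heartbeats to refresh their timestamp.
    public void heartbeat(String service, String ip, int port) {
        for (Endpoint e : services.getOrDefault(service, Collections.emptyList())) {
            if (e.ip.equals(ip) && e.port == port) e.lastHeartbeatMs = System.currentTimeMillis();
        }
    }

    // Health check, part 2: drop endpoints whose heartbeats have gone stale.
    public void evictStale() {
        long now = System.currentTimeMillis();
        for (List<Endpoint> list : services.values()) {
            for (Endpoint e : list) {
                if (now - e.lastHeartbeatMs > STALE_MS) list.remove(e);
            }
        }
    }

    public static void main(String[] args) {
        NamingService ns = new NamingService();
        ns.register("service-b", "192.168.2.101", 80);
        ns.register("service-b", "192.168.2.102", 80);
        System.out.println("B endpoint for A to call: " + ns.query("service-b"));
    }
}

A production NS would also persist its state, replicate itself, and push change notifications to callers, but the register / query / heartbeat triangle above is the core idea.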

With these three features, the naming service can basically do its job. One question remains: how do the other services find the naming service itself? We usually put the naming service behind a virtual server (such as LVS) and expose a virtual IP, so every server only needs one fixed IP in its configuration. If you want more safety, you can configure a small set of such IPs (say 3), giving you hot standby at the IP level.

NS occupies a central place in our distributed architecture and is a centralized service. In most architectures that works perfectly well. But for systems that must avoid a center (inside the NS cluster itself, for example), centralization is unacceptable; there are alternatives, for instance the Gossip protocol, which helps remove the central point from the design. I won't go into details here; I'll find time to cover it properly later.

Message Queue:

When Lao Wang first encountered message queues, his first reaction was: isn't this a system for pushing messages to phones? Looking back, that's funny. Such a system can indeed be used to push messages to phones, but message queues have a much wider range of uses and meanings.

At Baidu there was an internal system called Cm-transfer, which shipped data between systems; it was in fact a message queue. Later, after moving to Baicizhan, Lao Wang made another good decision: he wrote a message queue (hereafter MQ) to solve many problems with heavy traffic and bursty submissions, and we have been enjoying its benefits ever since. So what exactly is MQ, and what problems does it solve?

First stage:

When the service is small and users submit very little (say a few posts per minute), we can simply insert the data into the database directly, with no pressure at all. The architecture is trivial: the logic layer talks straight to the database.

What if traffic suddenly surges (for example, the user base explodes, or we run a flash-sale business)? There are many ways to cope, roughly as follows:

1. Split the business and the data: cut the business into orthogonal pieces, or shard the DB vertically or horizontally;

2. Cache data in the logic layer and periodically merge and write it back to the database;

3. Write a dedicated storage system of your own to replace the database;

4. If you do not need strong consistency, introduce an MQ and turn synchronous commits into asynchronous commits.

The first three approaches are not discussed here; let's focus on how the fourth one works.

Second stage:

Students who know C will remember that fwrite is a buffered write function: when you call it to write data into a file, it does not really write to disk right away. It keeps a buffer, writes your data into memory, and flushes it to disk at some later point.

(Narrator: you thought it wrote your data to disk. Actually, no. It tricked you!)

It does this for a reason: if you perform many small, frequent writes, buffering reduces how often the disk is touched and thus improves write efficiency. The downside is that your data may be lost if the process dies before the buffer is flushed.
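The same buffering idea exists in Java's standard library. Here is a tiny illustration (the file name demo.log is made up for the example); the data may sit in memory until the buffer fills or is flushed, which is exactly the trade-off described above.

import java.io.*;
import java.nio.charset.StandardCharsets;

// Same idea as C's fwrite: writes land in an in-memory buffer first,
// and only reach the disk when the buffer fills up or is flushed.
public class BufferedWriteDemo {
    public static void main(String[] args) throws IOException {
        try (BufferedOutputStream out =
                 new BufferedOutputStream(new FileOutputStream("demo.log"), 8192)) {
            for (int i = 0; i < 1000; i++) {
                out.write(("message " + i + "\n").getBytes(StandardCharsets.UTF_8));
                // At this point much of the data may still be in memory, not on disk.
            }
            out.flush();  // now the buffered data is handed to the OS
        }
        // If the process crashes before flush/close, the buffered data is lost.
    }
}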

In real Internet server development we can borrow this idea. When there are many write operations, we first write the data into a buffer space, and that buffer then slowly forwards the data to the real writer, which writes it to a file or database. That buffer space is our message queue (Message Queue): every piece of data to be written can be seen as a message placed into the queue.

As just mentioned, fwrite has a problem: if the program dies, the data still in memory is lost. To avoid the same problem, an MQ usually serializes the messages it receives and writes them to disk sequentially as they arrive, which protects the integrity of the data.
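Putting the two ideas together, here is a toy in-process sketch of an MQ doing asynchronous commit with a sequential append-to-disk log. To be clear, this is not Lao Wang's MQ and not any real product; the names (ToyMessageQueue, send, writeToDatabase) and the single-consumer design are assumptions made up for illustration.

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.*;

// Toy message queue: producers enqueue and return immediately (asynchronous commit);
// a background thread appends each message to a log file (sequential write, for
// durability) and then hands it to the "real writer" (a stand-in for the DB).
public class ToyMessageQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writer;

    public ToyMessageQueue(File logFile) {
        writer = new Thread(() -> {
            try (OutputStream log = new BufferedOutputStream(new FileOutputStream(logFile, true))) {
                while (true) {
                    String msg = queue.take();                       // block until a message arrives
                    log.write((msg + "\n").getBytes(StandardCharsets.UTF_8));
                    log.flush();                                     // sequential append to disk
                    writeToDatabase(msg);                            // the slow, "real" write
                }
            } catch (IOException | InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.setDaemon(true);
        writer.start();
    }

    // The logic layer calls this and returns right away; no waiting on the DB.
    public void send(String message) {
        queue.offer(message);
    }

    private void writeToDatabase(String msg) {
        // placeholder for the real storage write
        System.out.println("persisted: " + msg);
    }

    public static void main(String[] args) throws Exception {
        ToyMessageQueue mq = new ToyMessageQueue(new File("mq.log"));
        for (int i = 0; i < 5; i++) mq.send("post-" + i);
        Thread.sleep(500);   // give the background writer time to drain
    }
}

The caller's send returns immediately, which is exactly the "synchronous commit turned asynchronous" effect of approach 4; durability comes from the sequential append to the log before the slow database write.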

Some students will ask: if MQ also writes to disk, and the original storage also writes to disk, why can MQ improve efficiency and handle high-concurrency writes? There are roughly two reasons:

1. MQ writes messages to disk sequentially, while logical writes are mostly random. On traditional mechanical disks, sequential writes are far more efficient than random writes; even with today's drives, random write performance is still generally not as good as the sequential write performance of a mechanical disk.

2. MQ's writes carry no business logic and consume almost no CPU or memory, whereas a logical write usually involves computation, so its CPU, memory, and disk costs are all much higher than MQ's simple sequential append.


Besides smoothing out write peaks, MQ can also forward data, sending the same data to multiple receivers.

This capability can also be used for data backup, data replay, and so on.
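A rough sketch of that fan-out idea (again just a toy; the subscriber names are invented): each published message is copied into every subscriber's queue, so the same data can feed the main writer, a backup job, and a replay job independently.

import java.util.*;
import java.util.concurrent.*;

// Fan-out sketch: one published message is copied into every subscriber's queue,
// so several receivers can consume the same data independently.
public class FanOutTopic {
    private final Map<String, BlockingQueue<String>> subscribers = new ConcurrentHashMap<>();

    public BlockingQueue<String> subscribe(String name) {
        return subscribers.computeIfAbsent(name, k -> new LinkedBlockingQueue<>());
    }

    public void publish(String message) {
        for (BlockingQueue<String> q : subscribers.values()) q.offer(message);
    }

    public static void main(String[] args) throws InterruptedException {
        FanOutTopic topic = new FanOutTopic();
        BlockingQueue<String> dbWriter = topic.subscribe("db-writer");
        BlockingQueue<String> backup   = topic.subscribe("backup");
        topic.publish("user posted: hello");
        System.out.println(dbWriter.take() + " / " + backup.take());
    }
}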

Popular open-source MQs today include RabbitMQ, ActiveMQ, Kafka, and so on. Lao Wang evaluated them roughly, but for various reasons (the need to integrate with Thrift, changes in the naming service, and the need for several delivery modes, replay, and so on that they did not satisfy), he ended up spending two weeks writing a Java MQ (which may be open-sourced later). It has now been running stably in production for nearly two years and should have carried tens of billions of messages (Baicizhan users: the word data you review is in there).

I won't explain its design in detail here; if you are interested, Lao Wang may do a dedicated share on MQ in the future.

While talking about MQ, Lao Wang casually mentioned one thing: Thrift. It is Facebook's RPC framework, used to handle logical calls and data exchange between distributed systems. So what exactly is RPC?

Remote Procedure Call (RPC):

When two systems (or two modules) want to exchange data, how do they do it? The simplest way: System-1 establishes a socket connection to System-2, and the two sides exchange data with send and recv.

But that raises a pile of questions:

1. Isn't that code tedious to write?

2. Don't both sides have to negotiate and agree on a data format for every send and recv?

3. What if establishing the socket fails? Should you retry?

4. How do you manage connect and read timeouts?

5. ...

To solve these problems, distributed systems generally adopt a remote-communication protocol layer that makes exchanging data feel like calling a local function. The underlying layer handles the network connection, the protocol format, and read/write management for you; you only care about your logic. That greatly reduces development cost and improves system stability.
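As a rough illustration of what such a layer hides, here is a hand-written Java sketch. It is not Thrift or protobuf code; the PrintService interface, the 4-byte-length wire format, the 1-second timeouts, and the single retry are all assumptions made up for this example. The caller simply invokes print() as if it were local; the stub owns the socket, the wire format, the timeouts, and the retry.

import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;

// Hand-rolled RPC sketch: the caller programs against an interface,
// and the client stub hides all the networking plumbing.
public class RpcSketch {

    // The "remote" interface the caller sees.
    interface PrintService {
        String print(String text) throws IOException;
    }

    // Client stub: everything network-related lives here, not in caller code.
    static final class PrintServiceStub implements PrintService {
        private final String host;
        private final int port;
        PrintServiceStub(String host, int port) { this.host = host; this.port = port; }

        public String print(String text) throws IOException {
            IOException last = null;
            for (int attempt = 0; attempt < 2; attempt++) {          // one simple retry
                try (Socket socket = new Socket()) {
                    socket.connect(new InetSocketAddress(host, port), 1000); // connect timeout
                    socket.setSoTimeout(1000);                               // read timeout
                    DataOutputStream out = new DataOutputStream(socket.getOutputStream());
                    DataInputStream in = new DataInputStream(socket.getInputStream());
                    byte[] req = text.getBytes(StandardCharsets.UTF_8);
                    out.writeInt(req.length);                 // agreed wire format:
                    out.write(req);                           // 4-byte length + UTF-8 bytes
                    out.flush();
                    byte[] resp = new byte[in.readInt()];
                    in.readFully(resp);
                    return new String(resp, StandardCharsets.UTF_8);
                } catch (IOException e) {
                    last = e;
                }
            }
            throw last;
        }
    }

    public static void main(String[] args) throws Exception {
        // A tiny in-process server so the example runs end to end.
        ServerSocket server = new ServerSocket(9090);
        Thread serverThread = new Thread(() -> {
            try (Socket s = server.accept()) {
                DataInputStream in = new DataInputStream(s.getInputStream());
                DataOutputStream out = new DataOutputStream(s.getOutputStream());
                byte[] req = new byte[in.readInt()];
                in.readFully(req);
                byte[] resp = ("printed: " + new String(req, StandardCharsets.UTF_8))
                        .getBytes(StandardCharsets.UTF_8);
                out.writeInt(resp.length);
                out.write(resp);
                out.flush();
            } catch (IOException ignored) { }
        });
        serverThread.start();

        PrintService service = new PrintServiceStub("127.0.0.1", 9090);  // looks like a local object
        System.out.println(service.print("hello rpc"));
        serverThread.join();
        server.close();
    }
}

A real RPC framework generates the stub (and the matching server skeleton) from an interface definition, so you never write this plumbing by hand.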

Common RPC tools today include Google's protobuf and Facebook's Thrift, among others. As far as I know, the "goose factory" (Tencent) uses protobuf internally.

Comparing protobuf and Thrift: both provide cross-platform data conversion, serializing data into binary and turning binary back into each language's data types. The difference is that Thrift ships with its own client-server framework, so you can build services directly on top of it, while with protobuf you have to write the client-server framework yourself (or use a third-party one). In addition, Thrift supports more languages than protobuf.

After a long and careful comparison, Lao Wang finally introduced Thrift into our systems. It has been running for a long time now and is quite stable. Lao Wang has read the Thrift code roughly three times and has modified parts of it to fit the project's needs.

A dedicated share introducing Thrift is also planned (the Thrift code is very well written, with a nicely layered structure; reading it is highly recommended).

Distributed Cache:

When we talk about the Internet we have to talk about caches, because without them many of our servers simply could not carry the load.

A cache stores data that rarely changes, replacing slow disk reads with fast memory access so that information gets back to the user as quickly as possible. The typical read path works as follows (a small code sketch follows the steps):

1. The logic layer first checks whether the data is in the cache. If it is, return it directly; if not, go to step 2;

2. Query the database. If the data is not there either, return empty; otherwise go to step 3;

3. Write the data into the cache, so the next query finds it there.
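Here is the same read path as a minimal Java sketch. The map-backed cache and the lookupFromDatabase stub are stand-ins: in production the cache would be a memcache or Redis client, and the lookup a real query.

import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside read path matching steps 1-3 above.
public class CacheAsideDemo {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public Optional<String> get(String key) {
        String cached = cache.get(key);                 // step 1: try the cache first
        if (cached != null) return Optional.of(cached);

        String fromDb = lookupFromDatabase(key);        // step 2: fall back to the DB
        if (fromDb == null) return Optional.empty();    //         not in the DB either

        cache.put(key, fromDb);                         // step 3: populate the cache
        return Optional.of(fromDb);                     //         so the next read is fast
    }

    private String lookupFromDatabase(String key) {
        // placeholder for the real (slow) database query
        return "post-1".equals(key) ? "hello world" : null;
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo();
        System.out.println(demo.get("post-1"));   // miss -> DB -> cached
        System.out.println(demo.get("post-1"));   // hit, served from memory
    }
}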

Because cached data generally lives in memory, reads and writes are one to two orders of magnitude faster than the DB, i.e. 10 to 100 times the speed. Most sites use a cache today; it is hard for a site to withstand its traffic with the DB alone (unless the traffic is small). Caching also brings its own problems, such as more complex architecture and code logic, cache invalidation, and so on, but overall its benefits outweigh its costs.

Also because the cache lives in memory, a single-machine cache quickly hits the memory limit of one box. We therefore usually use a distributed cache cluster, spreading the cache over many machines. That raises a question: which machine holds the data for a given key, and how do you find it when you want it back?

There are two ways of doing this:

1. Centralized management: add a service similar to the naming service that tells callers where each piece of data lives.

2. Decentralized management: the caller runs some algorithm itself (consistent hashing, Gossip-based membership, and so on) to work out where data should be stored or fetched.

Our common distributed caches are memcached, Redis, and so on. They are well-known NoSQL stores and are very well suited as caches. They generally follow the decentralized model: the client decides where data lives through a consistent hashing algorithm. If you are interested, read their source code (Lao Wang hasn't yet, but plans to).
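Since the decentralized model comes up again here, a minimal consistent-hash ring in Java may help. This is a toy sketch, not memcached's or Redis's client code; the 100 virtual nodes per machine and the MD5-based hash are arbitrary choices for the example.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;

// Minimal consistent-hash ring: every client runs the same calculation, so all of
// them agree on which cache node owns a key without asking a central service.
// Virtual nodes smooth the key distribution across machines.
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 100;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.remove(hash(node + "#" + i));
    }

    // Walk clockwise from the key's hash to the first node on the ring.
    public String nodeFor(String key) {
        if (ring.isEmpty()) return null;
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16)
                 | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("cache-1"); ring.addNode("cache-2"); ring.addNode("cache-3");
        System.out.println("user:42 -> " + ring.nodeFor("user:42"));
        ring.removeNode("cache-2");                    // only keys owned by cache-2 move
        System.out.println("user:42 -> " + ring.nodeFor("user:42"));
    }
}

Because only the keys that hashed to the removed node move elsewhere, adding or removing a cache machine disturbs far less data than a plain hash(key) % N scheme would.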

===== The dividing line =====


Well, that's this week's share of some common distributed components. They give our architecture a great deal of flexibility and improve the performance and capacity of our systems. There are many other components not covered here. Going forward, Lao Wang will walk you through each component's working principles and the design problems encountered while building them. Interested students, hurry up and follow Lao Wang: simplemain.

PS: Next week we'll talk about some common Internet feature systems, such as content aggregation for Weibo and friend feeds, content retrieval systems, and so on. It should be very interesting.

That's it for today. Next Sunday Lao Wang will continue to bring you more technology.

