Kaushik Sathupadi
Programmer. Creator. Co-founder. Dad.

A plain English introduction to CAP theorem
You'll often hear about the CAP theorem, which specifies a kind of upper limit on what you can achieve when designing distributed systems. As with most of my other introductory tutorials, let's try to understand CAP by comparing it with a real-world situation.
Chapter 1: "Remembrance Inc" Your new venture:
Last night, when your spouse thanked you for remembering her birthday and bringing her a gift, a strange idea struck you. People are so bad at remembering things, and you're sooo good at it. So why not start a venture that puts your talent to use? The more you think about it, the more you like it. In fact, you even come up with a newspaper ad that explains your idea:
Remembrance Inc! Never forget, even without remembering!
Ever felt bad that you forget so much? Don't worry. Help is just a phone call away!
When you need to remember something, just call (555)-55-REMEM and tell us what you need to remember. For example, call us, let us know your boss's phone number, and forget about remembering it. When you need it back, call the same number, (555)-55-REMEM, and we'll tell you your boss's phone number.
Charges: only $0.1 per request
So, your typical phone conversation would look like this:
- Customer: Hey, can you store my neighbor's birthday?
- You: Sure. When is it?
- Customer: 2nd of Jan
- You: (write it down on the customer's page in your paper notebook) Stored. Call us any time to get your neighbor's birthday again!
- Customer: Thank you!
- You: No problem! We charged your credit card $0.1
Chapter 2: You scale up:
Your venture gets funded by YCombinator. Your idea is so simple, needing nothing but a paper notebook and a phone, yet so effective that it spreads like wildfire. You start getting hundreds of calls every day.
And there starts the problem. You see that more and more of your customers have to wait in the queue to speak to you. Most of them even hang up, tired of the waiting tone. Besides, when you were sick the other day and could not come to work, you lost a whole day's business. Not to mention all those dissatisfied customers who wanted information on that day.
You decide it's time for you to scale up and bring in your wife to help you.
You start with a simple plan:
- You and your wife both get an extension phone
- Customers still dial (555)-55-REMEM and need to remember only one number
- A PBX routes each customer's call to whoever is free, splitting the calls equally between the two of you
Chapter 3: You have your first "bad service":
Shortly after you implement the new system, you get a call from your trusted customer Jhon. This is how it goes:
- Jhon: Hey
- You: Glad you called "Remembrance Inc!". What can I do for you?
- Jhon: Can you tell me when my flight to New Delhi is?
- You: Sure. 1 sec, sir
(You look up your notebook)
(Wow! There is no entry for "flight date" in Jhon's page!)
- You: Sir, I think there is a mistake. You never told us about your flight to Delhi
- Jhon: What! I just called you guys yesterday! (Cuts the call!)
How did that happen? Could Jhon be lying? You think about it for a second and the reason hits you! Could Jhon's call yesterday have reached your wife? You go to your wife's desk and check her notebook. Sure enough, it's there. You tell your wife, and she realizes the problem too.
What a terrible flaw in your distributed design! Your distributed system is not consistent! There is always a chance that a customer's update goes to either you or your wife, and when the next call from that customer is routed to the other person, there will not be a consistent reply from Remembrance Inc!
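To make the flaw concrete, here is a minimal sketch of the two-notebook setup in Python. The `Operator` class, the strictly alternating `pbx`, and the flight date value are all assumptions made up for illustration; a real PBX would route to whoever is free rather than strictly alternate.

```python
import itertools


class Operator:
    """One person answering calls, with a private paper notebook."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}  # maps (customer, thing to remember) -> value

    def update(self, customer, key, value):
        self.notebook[(customer, key)] = value
        return "Stored"

    def search(self, customer, key):
        # Returns None if this operator never wrote the entry down.
        return self.notebook.get((customer, key))


you = Operator("you")
wife = Operator("wife")

# Hypothetical stand-in for the PBX: strictly alternate between the operators.
pbx = itertools.cycle([wife, you])

# Yesterday: Jhon's update happens to reach your wife.
next(pbx).update("Jhon", "flight date", "15 March")  # made-up date

# Today: Jhon's search happens to reach you.
answer = next(pbx).search("Jhon", "flight date")
print(answer)                              # None
print(wife.search("Jhon", "flight date"))  # 15 March
```

The information exists in the system, but the person who takes the second call cannot see it.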
Chapter 4: You fix the consistency problem:
Well, your competitors may ignore bad service, but not you. You think all night in bed while your wife is sleeping and come up with a beautiful plan in the morning. You wake up your wife and tell her:
"Darling, this is what we are going to do from now on:"
- Whenever either of us gets a call for an update (when the customer wants us to remember something), we tell the other person before completing the call
- This way both of us note down every update
- When there is a call for a search (when the customer wants information he has already stored), we don't need to talk to the other person. Since both of us have the latest information in our notebooks, we can just refer to our own.
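The plan in the list above can be sketched roughly as follows. This is only an illustration: the `Operator` class, its `peer` attribute, and the stored value are assumptions, not anything from a real system.

```python
class Operator:
    """One person with a notebook who tells the other about every update."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}
        self.peer = None  # the other person; set once both operators exist

    def update(self, key, value):
        # The call is completed only after BOTH notebooks hold the update.
        self.notebook[key] = value
        self.peer.notebook[key] = value
        return "Stored"

    def search(self, key):
        # A search is answered from the local notebook alone.
        return self.notebook.get(key)


you = Operator("you")
wife = Operator("wife")
you.peer, wife.peer = wife, you

status = wife.update("Jhon's flight date", "15 March")  # made-up date
answer = you.search("Jhon's flight date")
print(status, answer)
```

An update now touches both notebooks, so it no longer matters who takes the next call.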
"There is a problem though," you say, "and that is that an "update" request has to involve both of us, so we cannot work in parallel during that time. For example, when you get an update request and are telling me to update too, I cannot take other calls. But that's okay, because most of the calls we get are "search" anyway (a customer updates once and asks many times). Besides, we cannot give wrong information at any cost."
"Neat," your wife says, "but there is one more flaw in this system that you haven't thought of. What if one of us doesn't report to work on a particular day? On that day we won't be able to take "any" update calls, because the other person cannot be updated! We will have an availability problem. For example, if an update request comes to me, I will never be able to complete that call, because even though I have written the update in my notebook, I can never update you. So I can never complete the call!"
Chapter 5: You come up with the greatest solution ever:
You begin to realize a little bit of why a distributed system might not be as easy as you thought at first. Is it that difficult to come up with a solution that could be both "consistent and available"? It could be difficult for others, but not for you!! The next morning you come up with a solution that your competitors cannot think of in their dreams! You eagerly wake your wife up again.
"Look," you tell her. "This is what we can do to be consistent and available. The plan is mostly similar to what I told you yesterday:"
- i) Whenever either of us gets a call for an update (when the customer wants us to remember something), if the other person is available, we tell the other person before completing the call. This way both of us note down every update
- ii) But if the other person is not available (doesn't report to work), we send the other person an email about the update
- iii) The next day, when the other person comes to work after taking a day off, he first goes through all the emails and updates his notebook accordingly, before taking his first call
"Genius!" your wife says. "I can't find any flaws in this system. Let's put it to use. Remembrance Inc! is now both consistent and available!"
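Steps i) to iii) can be sketched as below. The `inbox` list stands in for the emails, and `at_work`, `come_to_work`, and the stored value are hypothetical names and data chosen for this illustration only.

```python
class Operator:
    """An operator who emails updates to a colleague who is away."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}
        self.inbox = []      # emailed updates not yet copied into the notebook
        self.at_work = True
        self.peer = None     # the other person; set once both operators exist

    def update(self, key, value):
        self.notebook[key] = value
        if self.peer.at_work:
            self.peer.notebook[key] = value       # step i: tell them directly
        else:
            self.peer.inbox.append((key, value))  # step ii: send an email
        return "Stored"

    def come_to_work(self):
        # Step iii: read every email, in order, before taking the first call.
        for key, value in self.inbox:
            self.notebook[key] = value
        self.inbox.clear()
        self.at_work = True

    def search(self, key):
        return self.notebook.get(key)


you = Operator("you")
wife = Operator("wife")
you.peer, wife.peer = wife, you

wife.at_work = False                           # your wife takes a day off
you.update("Jhon's flight date", "15 March")   # made-up date
pending = list(wife.inbox)                     # one email is waiting for her

wife.come_to_work()
answer = wife.search("Jhon's flight date")
print(pending, answer)
```

Update calls can now be completed even when one person is away, and the returning person catches up before answering anyone.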
Chapter 6: Your wife gets angry:
Everything goes well for a while. Your system is consistent. Your system works well even when one of you doesn't report to work. But what if both of you report to work and one of you doesn't update the other person? Remember all those days you've been waking your wife up early with your greatest-idea-ever-bullshit? What if your wife decides to take calls, but is too angry with you and decides not to update you? Your idea totally breaks! Your idea so far is good for consistency and availability, but it is not partition tolerant!
You can decide to be partition tolerant by deciding not to take any calls until you patch up with your wife. But then your system will not be "available" during that time...
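The choice you face during the quarrel can be sketched like this. The `Office` class, the `talking` flag, and the `refuse_calls_when_partitioned` switch are hypothetical names invented for this sketch; it only illustrates the two options described above.

```python
REFUSED = "Sorry, we are not taking calls right now"


class Office:
    """Two notebooks plus a policy for when you two are not talking."""

    def __init__(self, refuse_calls_when_partitioned):
        self.notebooks = {"you": {}, "wife": {}}
        self.talking = True  # False means a communication loss (partition)
        self.refuse_calls_when_partitioned = refuse_calls_when_partitioned

    def _other(self, taken_by):
        return "wife" if taken_by == "you" else "you"

    def update(self, taken_by, key, value):
        if not self.talking:
            if self.refuse_calls_when_partitioned:
                return REFUSED                    # consistent, not available
            self.notebooks[taken_by][key] = value  # available, notebooks diverge
            return "Stored"
        self.notebooks[taken_by][key] = value
        self.notebooks[self._other(taken_by)][key] = value
        return "Stored"

    def search(self, taken_by, key):
        if not self.talking and self.refuse_calls_when_partitioned:
            return REFUSED
        return self.notebooks[taken_by].get(key)


# Option 1: keep taking calls during the quarrel.
available_office = Office(refuse_calls_when_partitioned=False)
available_office.talking = False
available_office.update("wife", "Jhon's flight date", "15 March")  # made-up date
stale = available_office.search("you", "Jhon's flight date")

# Option 2: take no calls until you patch up.
consistent_office = Office(refuse_calls_when_partitioned=True)
consistent_office.talking = False
refused = consistent_office.update("wife", "Jhon's flight date", "15 March")
print(stale, refused)
```

In the first office Jhon gets an answer that misses his update; in the second he gets no service at all, but never a wrong answer.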
Chapter 7: Conclusion:
So let's look at the CAP theorem now. It states that when you are designing a distributed system, you cannot achieve all three of consistency, availability and partition tolerance. You can pick only two of:
- Consistency: Your customers, once they have updated information with you, will always get the most up-to-date information when they call subsequently, no matter how quickly they call back
- Availability: Remembrance Inc will always be available for calls as long as any one of you (you or your wife) reports to work
- Partition tolerance: Remembrance Inc will work even if there is a communication loss between you and your wife!
Bonus: Eventual consistency with a run around clerk:
Here's some more food for thought. You can have a run around clerk who updates the other person's notebook whenever yours or your wife's notebook is updated. The greatest benefit of this is that he can work in the background, so an "update" taken by you or your wife doesn't have to block, waiting for the other one to update. This is how many NoSQL systems work: one node updates itself locally and a background process synchronizes all the other nodes accordingly. The only problem is that you will lose consistency for some time. For example, a customer's call reaches your wife first, and before the clerk has a chance to update your notebook, the customer calls back and the call reaches you. Then he won't get a consistent reply. But that said, this is not at all a bad idea if such cases are limited. For example, you can assume a customer won't forget things so quickly that he calls back within 5 minutes.
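A minimal sketch of the clerk, assuming a simple to-do queue that he drains whenever he gets around to it. The `Office` class, `clerk_todo`, `clerk_runs`, and the stored value are hypothetical names and data for this illustration, and a real background process would run concurrently rather than being called by hand.

```python
from collections import deque


class Office:
    """Two notebooks kept in sync later by a run around clerk."""

    def __init__(self):
        self.notebooks = {"you": {}, "wife": {}}
        self.clerk_todo = deque()  # updates the clerk has not copied over yet

    def update(self, taken_by, key, value):
        self.notebooks[taken_by][key] = value
        other = "wife" if taken_by == "you" else "you"
        self.clerk_todo.append((other, key, value))
        return "Stored"  # the call completes without waiting for the clerk

    def search(self, taken_by, key):
        return self.notebooks[taken_by].get(key)

    def clerk_runs(self):
        # The clerk copies every pending update into the other notebook.
        while self.clerk_todo:
            target, key, value = self.clerk_todo.popleft()
            self.notebooks[target][key] = value


office = Office()
office.update("wife", "Jhon's flight date", "15 March")  # made-up date

too_soon = office.search("you", "Jhon's flight date")  # clerk has not run yet
office.clerk_runs()
later = office.search("you", "Jhon's flight date")
print(too_soon, later)  # None 15 March
```

The window between `update` and `clerk_runs` is exactly the period during which a quick call-back can get an inconsistent reply.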
That's CAP and eventual consistency for you in simple English :)
Post by Kaushik Sathupadi, founder of Cull.io, a platform to recruit web developers by having them develop a web application. Copyright © Kaushik Sathupadi 2009-2012