1.1 A case study: design and implementation of a simple Twitter clone using only the Redis key-value store as database, and PHP
This article describes the design and implementation of a simple Twitter clone written in PHP, with Redis as its only database. The programming community often regards key-value stores as special-purpose databases that cannot replace a relational database in web application development. This article tries to prove the opposite.
Our Twitter clone, called Retwis, is structurally simple, performs very well, and can be distributed across N web servers and M Redis servers at very little cost. You can find the source code here.
PHP is used for the example because almost everyone can read it; the same (or better) results can be obtained with Ruby, Python, Erlang, and so on.
1.2 Key-value Stores Basics
The essence of a key-value store is the ability to store some data, called a value, inside a key. The data can be retrieved later only if we know the exact key it was stored under. There is no way to search for something by value. For example, I can use the SET command to store the value bar at the key foo:
SET foo bar
Redis will store our data permanently, so we can later ask "what is the value stored at the key foo?" and Redis will reply with bar:
GET foo => bar
Other common operations provided by key-value stores are DEL, which deletes a given key and its associated value; SET-if-not-exists (called SETNX in Redis), which sets a key only if it does not already exist; and INCR, which atomically increments a number stored at a given key:
SET foo 10
INCR foo => 11
INCR foo => 12
INCR foo => 13
1.3 Atomic Operations
So far this is simple, but there is something special about INCR. Think about why it is provided at all, when we could implement it ourselves with a few commands. After all, it is as simple as:
x = GET foo
x = x + 1
SET foo x
The problem is that incrementing this way only works as long as a single client is working with the value x at a time. See what happens if two clients access the key at the same time:
x = GET foo (yields 10)
y = GET foo (yields 10)
x = x + 1 (x is now 11)
y = y + 1 (y is now 11)
SET foo x (foo is now 11)
SET foo y (foo is now 11)
Something is wrong. We incremented the value twice, but instead of going from 10 to 12 the key now holds 11. This is because an increment done with GET / increment / SET is not an atomic operation. The INCR provided by Redis, Memcached, and similar stores is atomic: the server protects the key for the whole time the get-increment-set takes, preventing simultaneous access.
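The lost update above can be replayed in a few lines of Python. This is only a sketch: a plain dict stands in for the key-value store, and a threading lock stands in for whatever protection the server applies during INCR. The names `store` and `incr` are hypothetical and are not part of Redis or Retwis.

```python
import threading

# A plain dict stands in for the key-value store (assumption for illustration).
store = {"foo": 10}

# Replay the interleaved schedule from the text by hand: both clients read
# before either one writes, so one of the two increments is lost.
x = store["foo"]        # client A reads 10
y = store["foo"]        # client B reads 10
x = x + 1               # A computes 11
y = y + 1               # B computes 11
store["foo"] = x        # A writes 11
store["foo"] = y        # B writes 11
lost_update_result = store["foo"]

# Atomic variant: the whole get-increment-set runs while holding a lock,
# so no other client can slip in between the read and the write.
_lock = threading.Lock()

def incr(key):
    """Increment the integer stored at key and return the new value."""
    with _lock:
        store[key] = store.get(key, 0) + 1
        return store[key]

store["foo"] = 10
clients = [threading.Thread(target=incr, args=("foo",)) for _ in range(2)]
for client in clients:
    client.start()
for client in clients:
    client.join()
atomic_result = store["foo"]
```

With the hand-written interleaving the key ends at 11; with the locked `incr` it ends at 12, whatever order the two threads run in.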
What makes Redis different from other key-value stores is that it provides more operations like INCR, which can be combined to model complex problems. That is why you can use Redis to write a whole web application without a SQL database and without going crazy.
1.4 Beyond Key-value Stores
In this section we will see which Redis features we need to build our Twitter clone. The first thing to know is that Redis values can be more than strings. Redis supports lists and sets as values, and there are atomic operations to work on these more advanced values, so concurrent access to the same key is safe. Let's start with lists:
LPUSH mylist a (now mylist holds the one-element list 'a')
LPUSH mylist b (now mylist holds 'b,a')
LPUSH mylist c (now mylist holds 'c,b,a')
LPUSH means left push: it adds an element to the left (the head) of the list stored at mylist. If the key mylist does not exist, an empty list is created automatically before the push. As you can imagine, the RPUSH command adds the element to the right (the tail) of the list.
This is very useful for our Twitter clone. For example, a user's updates can be stored in a list at the key username:updates. There are of course operations to get data or information out of lists. For example, LRANGE returns a range of the list, or the whole list.
LRANGE mylist 0 1 => c,b
LRANGE uses zero-based indexes: the first element has index 0, the second has index 1, and so on. The command arguments are LRANGE key first-index last-index. The last-index argument can be negative: -1 is the last element of the list, -2 the penultimate, and so on. So we can get the whole list like this:
LRANGE mylist 0 -1 => c,b,a
Other important commands are LLEN, which returns the length of the list, and LTRIM, which is like LRANGE but trims the list instead of returning the specified range. It is like "get a range from mylist, then set this range as the new value", but atomic. We will use only these list operations, but make sure to check the Redis documentation to discover all the list operations Redis supports.
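To make the index rules concrete, here is a small Python sketch that mimics LPUSH, LRANGE, and LTRIM on an in-memory dict. It is not how Redis implements lists; the helper names `lpush`, `lrange`, and `ltrim` are hypothetical and only borrow the command names.

```python
def lpush(store, key, value):
    """Insert value at the head of the list at key, creating the list if needed."""
    store.setdefault(key, []).insert(0, value)
    return len(store[key])

def lrange(store, key, first, last):
    """Return elements first..last inclusive; negative indexes count from the tail."""
    items = store.get(key, [])
    length = len(items)
    if first < 0:
        first = max(length + first, 0)
    if last < 0:
        last = length + last
    # The last index is inclusive, unlike the end of a Python slice.
    return items[first:max(last + 1, 0)]

def ltrim(store, key, first, last):
    """Keep only the given range of the list, discarding everything else."""
    if key in store:
        store[key] = lrange(store, key, first, last)

store = {}
for element in ("a", "b", "c"):
    lpush(store, "mylist", element)

first_two = lrange(store, "mylist", 0, 1)
whole_list = lrange(store, "mylist", 0, -1)
ltrim(store, "mylist", 0, 1)
after_trim = lrange(store, "mylist", 0, -1)
```

Here `first_two` is `['c', 'b']`, `whole_list` is `['c', 'b', 'a']`, and after the trim only `['c', 'b']` remains.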
1.5 The Set data type
Besides lists, Redis also supports sets, which are unordered collections of elements. You can add, remove, and test for the existence of members, and compute the intersection of different sets. You can of course also ask for the number of elements in a list or a set. A few examples will make this clearer. Keep in mind that SADD adds a member to a set, SREM removes a member from a set, SISMEMBER tests whether an element is a member, SINTER computes an intersection, SCARD returns the cardinality (number of members) of a set, and SMEMBERS returns all the members of a set.
SADD myset a
SADD myset b
SADD myset foo
SADD myset bar
SCARD myset => 4
SMEMBERS myset => bar,a,foo,b
Note that SMEMBERS does not return the members in the order we added them, because a set is an unordered collection of elements. If you need to store things in order, a list is the better choice. Some more operations on sets:
SADD mynewset b
SADD mynewset foo
SADD mynewset hello
SINTER myset mynewset => foo,b
SINTER returns the intersection of sets, and it is not limited to two sets: you can ask for the intersection of 4, 5, or 10,000 sets. Finally, let's see how SISMEMBER works:
SISMEMBER myset foo => 1
SISMEMBER myset notamember => 0
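For readers more used to a programming language than to the Redis command line, the same session can be mirrored with Python's built-in set type. This only illustrates the semantics; the variable names are chosen to match the example and nothing here talks to Redis.

```python
# SADD: build the two sets from the example.
myset = set()
for member in ("a", "b", "foo", "bar"):
    myset.add(member)
mynewset = {"b", "foo", "hello"}

cardinality = len(myset)                 # SCARD myset
common = myset & mynewset                # SINTER myset mynewset
foo_is_member = "foo" in myset           # SISMEMBER myset foo
stranger_is_member = "notamember" in myset

# Like SINTER, intersection accepts any number of sets.
three_way = set.intersection(myset, mynewset, {"foo", "zap"})

myset.discard("a")                       # SREM myset a
cardinality_after_removal = len(myset)
```

As with SMEMBERS, iterating over a Python set gives no ordering guarantee, which is why the comparison below sorts the result first.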
Ok, I think we can start coding.
(The code listings are long and add little here, so they have been omitted.)
1.1 Prerequisites
If you have not downloaded the source code yet, please download Retwis first. It is a simple tar.gz file containing a few PHP files. The implementation is very simple. Inside you will find a PHP client library (redis.php) used to talk to the Redis server. The library was written by Ludovico Magnocavallo and you are free to use it in your own projects, but for an updated version of the library please download it from the Redis distribution.
The other thing you will need is a working Redis server. Get the source, compile it with make, and run it with ./redis-server. No configuration is needed to play with it on your own machine.
1.2 Data Layout
With a relational database, this is the stage where you would design the tables, indexes, and so on. We have no tables, so what is there to design? We need to identify which keys are needed to represent our objects and what kind of values those keys will hold.
Let's start with users. We need to represent a user, of course, with a username, user id, password, followers, the users it follows, and so on. The first question is how to identify a user inside our system. The username could be a good idea since it is unique, but it is also too big, and we want to keep memory usage low. So, just as if our database were relational, we can associate a unique id with every user. Every other reference to the user is made by id. This is simple to do because we have the atomic INCR operation. When we create a new user we can do something like this, assuming the user is called "antirez":
INCR global:nextUserId => 1000
SET uid:1000:username antirez
SET uid:1000:password p1pp0
We use the global:nextUserId key to always get a unique id for every new user. Then we use this unique id to populate all the other keys holding the user's data. This is a design pattern with key-value stores. Keep it in mind. Besides the fields already defined, we need a few more things to fully define a user. For example, it is sometimes useful to be able to get the user id from the username, so we set this key too:
SET username:antirez:uid 1000
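The pattern of reserving an id with INCR and then writing every field under that id can be sketched as follows. This is a Python illustration with a dict in place of Redis, not the Retwis registration code; `create_user`, `uid_for`, and the starting counter value are assumptions, and the password is stored in clear text only to mirror the keys shown above.

```python
# A plain dict stands in for Redis; the counter starts at 999 so that the
# first user gets id 1000, as in the example (assumed starting value).
store = {"global:nextUserId": 999}

def incr(key):
    """Stand-in for INCR: add one to the counter and return the new value."""
    store[key] = int(store.get(key, 0)) + 1
    return store[key]

def create_user(username, password):
    """Reserve a fresh id, then write each user field under a key built from it."""
    uid = incr("global:nextUserId")
    store[f"uid:{uid}:username"] = username
    store[f"uid:{uid}:password"] = password   # clear text, as in the text above
    # Reverse mapping: we can only look things up by key, so the
    # username -> id direction needs a key of its own.
    store[f"username:{username}:uid"] = uid
    return uid

def uid_for(username):
    """Return the id for a username, or None if no such user exists."""
    return store.get(f"username:{username}:uid")

new_uid = create_user("antirez", "p1pp0")
```

The reverse key is what makes `uid_for("antirez")` possible at all, since the store offers no way to search by value.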
This may look strange at first, but remember that we can only access data by key. It is not possible to ask Redis to return the key that holds a given value. This is also our strength: the new paradigm forces us to organize the data so that everything is accessible by primary key, to put it in relational database terms.
1.3 Following, followers and updates
This is another central need of our system. Each user has followers, and users that he or she follows. We have a perfect data structure for this: the set. So we add two new fields to the schema:
uid:1000:followers => Set of uids of all the followers users
uid:1000:following => Set of uids of all the following users
Another important thing we need is a place to add the updates shown on the user's home page. We will need to access this data in chronological order, from the most recent update to the oldest, so the perfect kind of value for this job is a list. Basically every new update is LPUSHed onto the user's updates key, and thanks to LRANGE we can implement paging. Note that we use the words "update" and "post" interchangeably, since an update is in some sense a small post:
uid:1000:posts => a List of post ids, every new post is LPUSHed here.
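A rough Python sketch of these three keys in use is shown below: following a user touches two sets, and paging is a slice of the posts list, which is what an LRANGE call with computed indexes would return. The helpers `follow` and `page_of_posts`, and the page size of 10, are assumptions for illustration.

```python
# A plain dict stands in for Redis (assumption for illustration).
store = {}

def follow(follower_uid, target_uid):
    """Record the relationship on both sides, one SADD-like call per set."""
    store.setdefault(f"uid:{target_uid}:followers", set()).add(follower_uid)
    store.setdefault(f"uid:{follower_uid}:following", set()).add(target_uid)

def page_of_posts(uid, page, per_page=10):
    """Return one page of post ids, newest first.

    Equivalent in spirit to LRANGE with first = page * per_page and
    last = first + per_page - 1 (both inclusive).
    """
    posts = store.get(f"uid:{uid}:posts", [])
    start = page * per_page
    return posts[start:start + per_page]

follow(1001, 1000)

# Pretend 25 posts were pushed at the head over time: the newest id comes first.
store["uid:1000:posts"] = list(range(25, 0, -1))
first_page = page_of_posts(1000, 0)
last_page = page_of_posts(1000, 2)
```

Because new posts always go to the head, page 0 is always the most recent page and no sorting step is needed.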
1.4 Authentication
OK, we have more or less everything about the user, except authentication. We will handle authentication in a simple but robust way: we don't want to use PHP sessions or anything like that. Our system must be ready to be distributed across different servers, so we keep the whole state in the Redis database. All we need is a random string to set as the cookie of an authenticated user, and a key that tells us which user id belongs to the client holding that random string. We need two keys to make this work in a robust way:
SET uid:1000:auth fea5e81ac8ca77622bed1c2132a021f9
SET auth:fea5e81ac8ca77622bed1c2132a021f9 1000
To authenticate a user we follow these simple steps (login.php):
Get the username and password from the login form.
Check whether the username:<username>:uid key actually exists.
If it exists, we have the user id (e.g. 1000).
Check whether uid:1000:password matches; if not, show an error message.
OK, authenticated! Set "fea5e81ac8ca77622bed1c2132a021f9" (the value of uid:1000:auth) as the "auth" cookie.
This is the actual code:
Code
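The login steps can also be sketched in Python with a dict standing in for Redis. This is not the Retwis login.php listing. The function name `login` is hypothetical, and generating the random string on first login with `secrets.token_hex` is an assumption of this sketch (the text only says that a random string is stored in uid:1000:auth).

```python
import secrets

# A plain dict stands in for Redis, preloaded with one registered user.
store = {
    "username:antirez:uid": 1000,
    "uid:1000:password": "p1pp0",
}

def login(username, password):
    """Return the value to set as the "auth" cookie, or None if login fails."""
    # Steps 2 and 3: does username:<username>:uid exist, and which id is it?
    uid = store.get(f"username:{username}:uid")
    if uid is None:
        return None
    # Step 4: compare the stored password with the submitted one.
    if store.get(f"uid:{uid}:password") != password:
        return None
    # Step 5: hand back the user's auth string, creating both keys
    # (uid:<id>:auth and auth:<string>) if they do not exist yet.
    token = store.get(f"uid:{uid}:auth")
    if token is None:
        token = secrets.token_hex(16)   # 32 hexadecimal characters
        store[f"uid:{uid}:auth"] = token
        store[f"auth:{token}"] = uid
    return token

cookie_value = login("antirez", "p1pp0")
```

A caller would set `cookie_value` as the "auth" cookie when it is not None and show an error otherwise.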
This happens every time a user logs in, but we also need a function isLoggedIn to check whether a user is already authenticated. These are the logical steps performed by isLoggedIn:
Get the "auth" cookie from the user. If there is no cookie, the user is not logged in, of course. Let's call the value of this cookie <authcookie>.
Check whether auth:<authcookie> exists, and what its value (the user id) is (1000 in the example).
To be sure, check that uid:1000:auth matches.
OK, the user is authenticated, and we load a bit of information into the $User global variable.
The code is simpler than the description, probably:
Code
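As a rough Python counterpart of the isLoggedIn steps (again with a dict standing in for Redis, and not the original PHP listing), the check could look like this. The function name `is_logged_in`, the `cookies` dict, and the shape of the returned user record are assumptions.

```python
TOKEN = "fea5e81ac8ca77622bed1c2132a021f9"

# A plain dict stands in for Redis, holding the two auth keys from the text.
store = {
    "uid:1000:username": "antirez",
    "uid:1000:auth": TOKEN,
    f"auth:{TOKEN}": 1000,
}

def is_logged_in(cookies):
    """Return a small user record if the auth cookie is valid, else None."""
    authcookie = cookies.get("auth")
    if not authcookie:
        return None                      # no cookie: not logged in
    uid = store.get(f"auth:{authcookie}")
    if uid is None:
        return None                      # unknown random string
    # Double check: the string must still be the user's current auth value.
    if store.get(f"uid:{uid}:auth") != authcookie:
        return None
    return {"id": uid, "username": store.get(f"uid:{uid}:username")}

current_user = is_logged_in({"auth": TOKEN})
anonymous = is_logged_in({})
```

The second lookup matters when a stale auth:<something> key is left behind: the stale key still resolves to a user id, but the comparison with uid:<id>:auth rejects it.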
Having loadUserInfo as a separate function is overkill for our application, but it is a good template for a more complex one. The only thing missing from the authentication is logout. What do we do on logout? Simple: we change the random string stored in uid:1000:auth, remove the old auth:<oldauthstring> key, and add a new auth:<newauthstring> key.
Important: the logout procedure explains why we don't just authenticate the user after looking up auth:<randomstring>, but double check it against uid:1000:auth. The true authentication string is the latter. The auth:<randomstring> key is just a lookup key that may even be volatile, and if there are bugs in the program or a script gets interrupted, we may end up with multiple auth:<something> keys pointing to the same user id. The logout code is the following (logout.php):
Code
This is just what we described, so it should be easy to understand.
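For comparison, the logout procedure can be sketched in Python with a dict standing in for Redis. This is an illustration of the three steps described, not the logout.php listing; the function name `logout` and the use of `secrets.token_hex` for the new random string are assumptions.

```python
import secrets

OLD_TOKEN = "fea5e81ac8ca77622bed1c2132a021f9"

# A plain dict stands in for Redis, holding the two auth keys from the text.
store = {
    "uid:1000:auth": OLD_TOKEN,
    f"auth:{OLD_TOKEN}": 1000,
}

def logout(uid):
    """Rotate the user's auth string so that the old cookie stops working."""
    old_token = store.get(f"uid:{uid}:auth")
    new_token = secrets.token_hex(16)
    # 1. Change the random string stored in uid:<id>:auth.
    store[f"uid:{uid}:auth"] = new_token
    # 2. Add the new auth:<newauthstring> key.
    store[f"auth:{new_token}"] = uid
    # 3. Remove the old auth:<oldauthstring> key.
    if old_token is not None:
        store.pop(f"auth:{old_token}", None)
    return new_token

new_token = logout(1000)
```

Even if step 3 never ran, for example because the script was interrupted, the old cookie would still be rejected by a check against uid:1000:auth, which is the point made in the note above.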
1.5 Updates
Updates, also known as posts, are even simpler. To create a new post in the database we do something like this:
INCR global:nextPostId => 10343
SET post:10343 "$owner_id|$time|I'm having fun with Retwis"
As you can see, the user id and the time of the post are stored directly inside the string. In this example application we never need to look up posts by time or by user id, so it is better to pack everything into the post string.
After we create a post we obtain its post id. We need to LPUSH this id onto the posts list of every user who follows the author, and of course onto the author's own list of posts. This is update.php, which shows how this is done:
Code
The core of the function is the foreach: we use SMEMBERS to get all the followers of the current user, then the loop LPUSHes the post onto uid:<userid>:posts for every follower.
Note that we also maintain a timeline of all posts. For this we just LPUSH the post onto global:timeline. Honestly, doesn't it start to feel a bit strange to sort data that was added in chronological order using ORDER BY in SQL? I think so.
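The fan-out described here can be sketched in Python with a dict standing in for Redis. It is an illustration of the steps, not the update.php listing; `post_update`, the `now` parameter, and the sample follower ids are assumptions.

```python
import time

# A plain dict stands in for Redis; user 1000 has two followers (sample data).
store = {
    "global:nextPostId": 10342,
    "uid:1000:followers": {1001, 1002},
}

def incr(key):
    """Stand-in for INCR."""
    store[key] = int(store.get(key, 0)) + 1
    return store[key]

def lpush(key, value):
    """Stand-in for LPUSH: insert at the head, creating the list if needed."""
    store.setdefault(key, []).insert(0, value)

def post_update(owner_id, text, now=None):
    """Create a post and push its id to every follower, the author, and the timeline."""
    post_id = incr("global:nextPostId")
    timestamp = int(time.time()) if now is None else now
    # Owner id and time are packed into the post string itself.
    store[f"post:{post_id}"] = f"{owner_id}|{timestamp}|{text}"
    # SMEMBERS-like read of the followers, plus the author.
    recipients = set(store.get(f"uid:{owner_id}:followers", set()))
    recipients.add(owner_id)
    for uid in recipients:
        lpush(f"uid:{uid}:posts", post_id)
    lpush("global:timeline", post_id)
    return post_id

post_id = post_update(1000, "I'm having fun with Retwis", now=1)
owner, when, body = store[f"post:{post_id}"].split("|", 2)
```

Splitting with a maximum of two splits keeps any "|" characters in the message text intact when the post string is unpacked.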