Riak introduction, Part 2: Integrating Riak into web applications as a high-load cache server

Source: Internet
Author: User
Tags apache solr riak solr ruby on rails

Reprinted: http://www.ibm.com/developerworks/cn/opensource/os-riak2/

Introduction

Some types of data lend themselves to a cache-style access pattern. An online betting site, for example, has an interesting load profile: users frequently request odds and betting slips, yet this information is rarely updated.

Situations like this call for a highly scalable system with the following characteristics to cope with the load:

  • The system acts as a reliable cache to reduce the load on application servers and databases.
  • Cached items are searchable, so you can update or invalidate them.
  • The solution can be easily integrated into existing sites.

Riak is a good choice for such a solution.

Riak is not the only candidate for implementing such a cache; many different caches are available. Memcached is a popular one. Unlike Riak, however, memcached does not provide any form of data replication, which means that if the server holding a particular item goes down, that item becomes unavailable. Another popular key/value store is redis, which can also be used as a cache and supports replication through a master-slave configuration. Riak has no concept of a master node, which makes the system more resilient to failures.

Website integration

Any solution needs to be easy to integrate into an existing website. This matters because it is not always possible (or even necessary) to migrate all of your existing data to Riak. As mentioned above, some types of data are well suited to caching. With a key/value store, this is especially true of data that you access through a single primary key. That is the kind of data best suited to migration to Riak.

As described in Part 1 of this series, which covers the language-independent HTTP API, Riak has a large number of client libraries for PHP, Ruby, Java, and other languages. These libraries provide an API that makes integrating Riak straightforward. In this example, I use the PHP library to show how to integrate Riak with an existing website.

Figure 1 shows the setup considered in this example. I have left out details such as load balancers and firewalls. The servers themselves are simple front-end boxes with only the LAMP stack installed.

I will assume that Riak is used only internally (it cannot be reached from outside) and runs in a non-hostile environment, so there are no security concerns such as authentication. This assumption is not as bad as it sounds, because Riak does not have any built-in authorization in any case; you should delegate security measures such as authentication to the application.

Figure 1. A simple website integration

The following basic example shows how you can integrate Riak into your existing website. You will create a simple form. When the form is submitted, a PHP script uses the PHP client to store an object in Riak based on the values entered in the form.

Figure 2 shows a simple example form that an administrator might use to create a bet in the system. The form is written in HTML and POSTs to the PHP script in Listing 1; a similar form in the source code serves as a starting point. The "key" field entered in the form is used as the key of the object stored in the bucket.

Figure 2. Example form for creating a bet

The sample PHP code in Listing 1 shows how to use the PHP client library to integrate Riak. Change the path of the PHP client library (specified in require_once) to the location where you installed it. In this example, I simply put it in the same directory as the PHP script. By default, all client libraries expect Riak to be available on port 8098.

Listing 1. Example PHP code integrating Riak

<?php
require_once('./riak.php');

# Could do check here to see if the current user has the
# appropriate credentials; delegated to application.
$client = new RiakClient('192.168.1.1', 8098);
$bucket = $client->bucket('odds');
$bet = $bucket->newObject($_POST['key']);

$data = array(
    'odds' => $_POST['odds'],
    'description' => $_POST['description']
);
$bet->setData($data);

# Save the object to Riak
$bet->store();
echo "Thanks!";
?>

Save the code as a PHP file (name it whatever you like) and upload it, together with the form, to a location on your website, for example, http://www.yoursite.com/riak-test.php. Fill in the sample form and submit it. To verify that it worked, try retrieving the object directly from Riak using the key you entered in the form (see Listing 2).

Listing 2. Retrieving items from Riak

$ curl -i http://localhost:8098/riak/odds/<key>
...
{ "odds":"", "description":"" }

Although the integration example uses the PHP client, the method is similar to other languages or application frameworks such as Java or Ruby on Rails.

Serving requests directly

In addition to using a client library to integrate Riak into your current setup, you can serve requests to users directly from Riak, using it as a simple HTTP engine. To demonstrate this, I will create a simple demo that shows how to request a page directly from Riak.

Download the source code. Make sure that Riak is running, and then execute the script load.sh. This script copies all the HTML and JavaScript files into a bucket named demo. This example uses the JavaScript client.

To view the demo, open the following URL in your browser: http://localhost:8098/riak/demo/demo.html.

If you enter some values in the form to create a bet and submit the form, a JSON object is stored in Riak. The properties of the object correspond to the fields in the form. You are then redirected to a page that displays the value of the object you just created.

Listing 3 shows the code that creates an object from the values you entered. The values for key, odds, and description come from the values entered in the form.

Listing 3. Example usage of the JavaScript client library for Riak

client.bucket("odds", function(bucket) {
    var key = $('#key').val();
    bucket.get_or_new(key, function(status, object) {
        object.contentType = 'application/json';
        object.body = { 'odds': $('#odds').val(), 'description': $('#desc').val() };
        object.store(function(status, object, request) {
            if (status == 'ok') {
                window.location = "http://localhost:8098/riak/odds/" + key;
            } else {
                alert("Failed to create object.");
            }
        });
    });
});

As mentioned above, I assume that Riak runs in a trusted environment. In that case, serving pages from Riak that store and retrieve items does not raise security concerns. You would not, however, want this functionality fully exposed to the Internet without some form of authentication.

Although this is a simple example, it shows how Riak can serve page requests directly. For example, you could include data stored in Riak directly in your existing web pages by using techniques such as JSONP or Cross-Origin Resource Sharing (Ajax requests are restricted by the same-origin policy to the server that served the page). Alternatively, you could use a proxy that forwards requests through your server to Riak to obtain the required data.

Use Riak as cache

A cache is used to provide quick access to data. If the cache contains the requested data (a cache hit), the application can serve the request quickly by reading the value from the cache, which is faster than retrieving it from the database. If the data is not in the cache (a cache miss), the application usually has to retrieve it from the database. In general, the more requests that can be served from the cache, the faster the system will be. Riak has several features that make it a good choice for implementing a cache.
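The hit/miss behavior described above can be sketched in a few lines of JavaScript. This is only an illustration of the general cache-aside pattern: the `cache` map and the `loadFromDatabase` function are hypothetical stand-ins, not part of Riak or its client libraries, and a real application would most likely do both lookups asynchronously.

```javascript
// Cache-aside lookup: try the cache first, fall back to the database on a miss.
// `cache` and `loadFromDatabase` are hypothetical stand-ins, not Riak APIs.
function getWithCache(cache, loadFromDatabase, key) {
  if (cache.has(key)) {
    return { value: cache.get(key), hit: true };  // cache hit: fast path
  }
  const value = loadFromDatabase(key);            // cache miss: slow path
  cache.set(key, value);                          // populate the cache for next time
  return { value: value, hit: false };
}

// Usage: an in-memory Map stands in for the cache, a counter for the database.
const cache = new Map();
let databaseReads = 0;
const loadFromDatabase = function (key) {
  databaseReads += 1;
  return { odds: '5/1', description: 'bet ' + key };
};

const first = getWithCache(cache, loadFromDatabase, 'race-1');
const second = getWithCache(cache, loadFromDatabase, 'race-1');
console.log(first.hit, second.hit, databaseReads);
```

The second lookup for the same key is served from the cache, so the stand-in database is read only once.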

One of these features is pluggable storage backends. The storage backend determines how data is stored. Several storage backends are available, but I do not plan to cover them all here (for more information, see References). The default storage backend is bitcask, an Erlang application that provides an API for storing and retrieving data backed by a hash table, which provides fast data access; the data is persistent.

One backend is particularly relevant to this article: the memory backend. The memory backend stores all of its data in in-memory tables (it uses Erlang ETS tables internally) and, when enabled, makes Riak behave like an LRU cache with a set time to live (TTL). The advantage of storing data in memory is that it is much faster to retrieve than data that must be read from disk. The drawback is that data held in memory is not persistent, so when a node fails, the data stored on that node is lost. If you are using Riak as a cache, this is less of a problem than it would be if Riak were your primary data store, because the application can always retrieve the data from the database. In addition, Riak replicates data across multiple nodes in the cluster, so the data remains available.
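To make the "LRU cache with a TTL" behavior concrete, here is a toy JavaScript sketch. It is not how Riak's memory backend is implemented (that uses ETS tables and a memory limit rather than an entry count); the class name, the entry-count limit, and the injectable clock are all assumptions made for illustration.

```javascript
// A toy LRU cache with a time to live (TTL), for illustration only.
// This is NOT Riak's implementation; it limits entries by count, not by memory.
class LruTtlCache {
  constructor(maxEntries, ttlSeconds, now = () => Date.now() / 1000) {
    this.maxEntries = maxEntries;
    this.ttlSeconds = ttlSeconds;
    this.now = now;            // injectable clock in seconds, handy for testing
    this.entries = new Map();  // a Map iterates in insertion order: oldest first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAt >= this.ttlSeconds) {
      this.entries.delete(key);  // the entry has outlived its TTL
      return undefined;
    }
    // Re-insert the entry so it becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Evict the least recently used entry (the first key in the Map).
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, { value: value, storedAt: this.now() });
  }
}

// Usage with a fake clock so that expiry can be demonstrated deterministically.
let clock = 0;
const lru = new LruTtlCache(2, 60, () => clock);
lru.set('a', 1);
lru.set('b', 2);
lru.get('a');                       // 'a' becomes the most recently used entry
lru.set('c', 3);                    // the cache is full, so 'b' is evicted
const evicted = lru.get('b');       // 'b' is gone
const kept = lru.get('a');          // 'a' is still present
clock = 61;                         // move past the 60-second TTL
const expired = lru.get('a');       // 'a' has now expired
```

Reading an entry refreshes its position in the eviction order but not its age, so an entry can survive eviction and still expire once its TTL has passed.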

Riak ships with the memory backend. To use it, open the app.config file on each node in the cluster, locate the storage_backend property, and change its value from riak_kv_bitcask_backend to riak_kv_memory_backend. Then add the code in Listing 4 to the end of the file.

Listing 4. Using the memory backend

{memory_backend, [
    {max_memory, 4096}, %% 4GB of memory
    {ttl, 86400}        %% Time in seconds
]}

Change the values to suit your setup, and then restart the nodes in the cluster.

You can run multiple storage backends in a Riak cluster. This is useful because it means that different backends can be used for different buckets. For example, you can configure one bucket (called cache, say) to use the memory backend, while bitcask is used for the other buckets (the ones whose data should be persisted).

Now that you have set Riak up to behave like a cache, you need some way to get at the data in the cluster, either to update it or to invalidate it for some reason (before its TTL expires).

Search for something?

As you have seen, to retrieve data stored in Riak through the HTTP interface, you construct a URL that includes the bucket name and the key of the object you want, and then perform an HTTP GET. This is fine when you know the key. Sometimes, however, you do not know the key of the object you want, or you want to retrieve a group of objects that match certain criteria. In that case, you need a way to search the objects stored in the cluster.
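As a small illustration of the URL construction described above, the following JavaScript sketch builds the address of an object from its bucket and key, using the same /riak/<bucket>/<key> layout as Listing 2. The function name `riakObjectUrl` and the example key are made up for this sketch; escaping the bucket and key is a precaution I have added, not something the article prescribes.

```javascript
// Build the URL for an object stored in Riak, following the
// /riak/<bucket>/<key> layout used in Listing 2.
// `riakObjectUrl` is a hypothetical helper, not part of any Riak client.
function riakObjectUrl(host, port, bucket, key) {
  return 'http://' + host + ':' + port + '/riak/' +
    encodeURIComponent(bucket) + '/' + encodeURIComponent(key);
}

// Usage: a key containing a space is escaped so the URL stays valid.
const objectUrl = riakObjectUrl('localhost', 8098, 'odds', 'grand national');
console.log(objectUrl);
```

An HTTP GET on the resulting URL (with curl, for example) then returns the stored object, provided you know the key.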

You have seen how to query data by running a map/reduce job over the documents stored in the cluster. In general, the time such a query takes is proportional to the number of documents in the cluster: the more documents, the longer the query takes. This is not a problem for queries that are not time-sensitive, where the user does not expect an immediate reply. For operations such as search, however, scanning every document on the fly is not feasible; the results could take minutes or even hours to arrive.

Fortunately, Riak already has a solution to this problem: Riak search. Riak search provides what you need to search the documents stored across the cluster. Search is too large a topic to cover in depth in this article, but at a high level it works like this: documents are tokenized (Riak search uses standard Lucene analyzers) and added to an inverted index. The index is then queried using the search terms entered by the user. When new documents are added, they are also tokenized and added to the index.
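The tokenize, index, and query steps can be sketched as a tiny in-memory inverted index in JavaScript. This is a deliberately naive illustration of the idea rather than Riak search itself: the whitespace tokenizer, the function names, and the sample documents are all assumptions made for this example.

```javascript
// Split text into lowercase tokens on whitespace (a deliberately naive analyzer).
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter((token) => token.length > 0);
}

// Build an inverted index: each term maps to the set of document keys containing it.
function buildIndex(documents) {
  const index = new Map();
  for (const [key, text] of Object.entries(documents)) {
    for (const term of tokenize(text)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(key);
    }
  }
  return index;
}

// Look up a single term; no document has to be scanned at query time.
function search(index, term) {
  const matches = index.get(term.toLowerCase());
  return matches ? Array.from(matches).sort() : [];
}

// Usage with a few made-up bet descriptions.
const documents = {
  'bet-1': 'Horse racing at Ascot',
  'bet-2': 'Football cup final',
  'bet-3': 'Grand National horse race',
};
const index = buildIndex(documents);
const horseBets = search(index, 'horse');
console.log(horseBets);
```

Because the work of tokenizing happens when a document is stored, a query is a single lookup in the index rather than a pass over every document, which is what makes search fast compared with a map/reduce job.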

Riak search is disabled by default, so you need to enable it before you can use it. On each node in the cluster, open rel/riakN/etc/app.config, locate the riak_search property, and set it to true. Then restart the nodes in the cluster.

Riak provides pre-commit and post-commit hooks, which let you specify the names of functions to run before or after a document is added to a bucket. For example, before a document is added to a bucket, you might want to check that it has certain required fields. To make a document searchable, you must first index it, and to do that you need to install a pre-commit hook on the bucket where the document is stored. Run the following command:

$ rel/riak/bin/search-cmd install <bucket name>

This installs the pre-commit hook riak_search_kv_hook on the bucket. From now on, whenever a document is added to the bucket, it is analyzed and added to the index. The whitespace analyzer is the default analyzer: it splits the text into tokens on whitespace and then indexes those tokens. You can also define your own analyzer.

Out of the box, Riak search knows how to index your data in many cases. For example, if a JSON object is added to a bucket, the value of each property is indexed and can be queried by using the property name in the query string. For a search example, see Listing 5. For more complex structures, you can define your own schema to tell Riak search how to index the data.

After you have indexed some documents, you need to be able to query them. One way is to run the query from the Erlang shell. For example, Listing 5 searches the odds bucket for all bets related to horse racing by querying the description property of the stored items.

Listing 5. Searching the odds bucket for bets related to horse racing

$ rel/riak/bin/riak attach
search:search(<<"odds">>, <<"description:horse">>).

In addition, Riak search provides a Solr-compatible HTTP API for searching documents. Apache Solr is a popular enterprise search server with a REST-like API. Because the API is compatible with Solr, you should be able to swap out Solr (if you use it) and use Riak search to power your searches instead. For example, to use the Solr interface to search for the odds of a specific event, you can do this:

$ curl "http://localhost:8098/solr/odds/select?start=0&q=description:horse"
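If an application builds such query URLs itself, the pieces can be assembled as in the JavaScript sketch below, which mirrors the /solr/<index>/select?start=...&q=<field>:<term> shape of the curl command above. The helper name `solrSelectUrl` and its parameters are hypothetical; it uses the standard URLSearchParams class, which percent-encodes the colon in the query, so the output differs cosmetically from the hand-written URL while decoding to the same query.

```javascript
// Build a query URL for the Solr-compatible interface described above.
// `solrSelectUrl` is a hypothetical helper; only the URL shape comes from the article.
function solrSelectUrl(host, port, indexName, field, term, start = 0) {
  const params = new URLSearchParams({
    start: String(start),
    q: field + ':' + term,   // field-qualified query, e.g. description:horse
  });
  return 'http://' + host + ':' + port + '/solr/' +
    encodeURIComponent(indexName) + '/select?' + params.toString();
}

// Usage: search the odds index for bets whose description mentions "horse".
const searchUrl = solrSelectUrl('localhost', 8098, 'odds', 'description', 'horse');
console.log(searchUrl);
```

Letting URLSearchParams do the encoding means search terms that contain spaces or other special characters still produce a valid URL.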

With search set up, you can locate items in the data store even when you do not know their primary keys.

Conclusion

Riak's ability to scale and to replicate data reliably, together with other features such as search, makes it a strong choice as a cache for heavily loaded sites. You can easily integrate it into an existing site, and because it can serve requests directly, you can use Riak to reduce or eliminate load on your application and database servers.
