How to Partition Redis: A Translation (Original)


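A note up front: I have recently been studying how to use Redis, including its use cases, performance tuning, and feasibility. I came across this page through a link on the official Redis site. It explains Redis data partitioning, and since it is officially recommended, I translated it to share with everyone.

Partitioning: how to split data among multiple Redis instances.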

Partitioning is the process of splitting your data across multiple Redis instances, so that every instance contains only a subset of your keys. The first part of this document introduces the concept of partitioning; the second part shows the alternatives for Redis partitioning.

Why partitioning is useful

Partitioning in Redis serves two main goals:

  • It allows for much larger databases, using the sum of the memory of many computers. Without partitioning you are limited to the amount of memory a single computer can support.
  • It allows computational power to scale across multiple cores and multiple computers, and network bandwidth to scale across multiple computers and network adapters.

Partitioning basics

There are different partitioning criteria. Imagine we have four Redis instances R0, R1, R2, R3, and many keys representing users, such as user:1, user:2, and so forth. We can find different ways to select the instance in which we store a given key. In other words, there are different systems to map a given key to a given Redis server.

One of the simplest ways to perform partitioning is called range partitioning, and is accomplished by mapping ranges of objects to specific Redis instances. For example, I could say that users from ID 0 to ID 10000 go into instance R0, while users from ID 10001 to ID 20000 go into instance R1, and so forth.

This system works and is actually used in practice. However, it has the disadvantage that a table mapping ranges to instances must be kept. This table needs to be managed, and we need a table for every kind of object we have. With Redis this is usually not a good idea.
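
As a minimal sketch of range partitioning, the lookup below keeps the range table in memory and finds the instance with a binary search. The table contents beyond R1, and the names `RANGE_TABLE` and `instance_for_user`, are assumptions made for illustration; they are not part of the original article.

```python
import bisect

# Hypothetical range table: (inclusive upper bound of the user ID, instance).
# The first two rows follow the example in the text; R2 and R3 are assumed.
RANGE_TABLE = [(10000, "R0"), (20000, "R1"), (30000, "R2"), (40000, "R3")]
UPPER_BOUNDS = [upper for upper, _ in RANGE_TABLE]


def instance_for_user(user_id: int) -> str:
    """Return the instance whose ID range contains user_id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # Index of the first range whose upper bound is >= user_id.
    index = bisect.bisect_left(UPPER_BOUNDS, user_id)
    if index == len(RANGE_TABLE):
        raise KeyError(f"no range configured for user ID {user_id}")
    return RANGE_TABLE[index][1]
```

The table itself is the maintenance burden the text describes: every new range, and every new kind of object, needs its own entry.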

An alternative to range partitioning is hash partitioning. This scheme works with any key, with no need for a key in the form object_name:<id>, and is as simple as this:

  • Take the key name and use a hash function to turn it into a number. For instance, I could use the crc32 hash function. So if the key is foobar, I compute crc32(foobar), which outputs something like 93024922.
  • Apply a modulo operation to this number to turn it into a number between 0 and 3, so that it can be mapped to one of the four Redis instances I have. 93024922 modulo 4 equals 2, so I know my key foobar should be stored in the R2 instance. Note: the modulo operation is just the remainder of the division; in many programming languages it is implemented by the % operator.
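
The two steps above can be sketched in a few lines of Python using `zlib.crc32` from the standard library. The instance list and the function name `instance_for_key` are assumptions for illustration. The real CRC32 of a key will differ from the illustrative number 93024922 used in the text, so the instance chosen for foobar here need not be R2.

```python
import zlib

# The four instances from the example in the text.
INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_key(key: str) -> str:
    """Map a key to an instance using crc32(key) modulo the instance count."""
    # In Python 3, zlib.crc32 returns an unsigned 32-bit integer.
    checksum = zlib.crc32(key.encode("utf-8"))
    return INSTANCES[checksum % len(INSTANCES)]
```

Because the result depends on `len(INSTANCES)`, changing the number of instances remaps most keys, which is why the later sections discuss consistent hashing and presharding.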

There are many other ways to perform partitioning, but with these two examples you should get the idea. One advanced form of hash partitioning is called consistent hashing, and it is implemented by a few Redis clients and proxies.
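
To make the idea of consistent hashing concrete, here is a minimal hash ring sketch. It is not the algorithm of any particular Redis client or proxy: the class name `HashRing`, the use of MD5, and the default of 100 virtual points per node are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """A minimal consistent-hash ring with virtual points per node."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._points = []  # sorted positions on the ring
        self._owner = {}   # position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Place several virtual points per node to even out the distribution.
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        if not self._points:
            raise LookupError("the ring has no nodes")
        # First point clockwise from the key's position, wrapping around.
        index = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[index]]
```

With this layout, removing a node only remaps the keys that node owned; every other key keeps its instance, unlike the plain modulo scheme.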

Different implementations of partitioning

Partitioning can be the responsibility of different parts of a software stack.

  • Client side partitioning means that the clients directly select the right node on which to write or read a given key. Many Redis clients implement client side partitioning.
  • Proxy assisted partitioning means that our clients send requests to a proxy that is able to speak the Redis protocol, instead of sending requests directly to the right Redis instance. The proxy makes sure to forward each request to the right Redis instance according to the configured partitioning schema, and sends the replies back to the client. The Redis and Memcached proxy Twemproxy implements proxy assisted partitioning.
  • Query routing means that you can send your query to a random instance, and the instance will make sure to forward your query to the right node. Redis Cluster implements a hybrid form of query routing, with the help of the client (the request is not directly forwarded from one Redis instance to another; instead the client is redirected to the right node).

Disadvantages of partitioning

Some features of Redis don't play very well with partitioning:

  • Operations involving multiple keys are usually not supported. For instance, you can't perform the intersection between two sets if they are stored in keys that are mapped to different Redis instances (there are ways to do this, but not directly).
  • Redis transactions involving multiple keys cannot be used.
  • The partitioning granularity is the key, so it is not possible to shard a dataset that consists of a single huge key, such as a very big sorted set.
  • When partitioning is used, data handling is more complex. For instance, you have to handle multiple RDB / AOF files, and to make a backup of your data you need to aggregate the persistence files from multiple instances and hosts.
  • Adding and removing capacity can be complex. For instance, Redis Cluster plans to support mostly transparent rebalancing of data, with the ability to add and remove nodes at runtime, but other systems such as client side partitioning and proxies don't support this feature. However, a technique called Presharding helps in this regard.

Data store or cache?

Partitioning when using Redis as a data store or as a cache is conceptually the same, but there is a significant difference in practice. When Redis is used as a data store, a given key must always map to the same instance. When Redis is used as a cache, it is not a big problem if a given node is unavailable and we start using a different node, altering the key-instance map as we wish in order to improve the availability of the system (that is, its ability to reply to our queries). (Translator's note: a cache should always be ready to sacrifice itself.)

Consistent hashing implementations are often able to switch to other nodes if the preferred node for a given key is not available. Similarly, if you add a new node, part of the new keys will start to be stored on the new node.

The main concept here is the following:

  • If Redis is used as a cache, scaling up and down using consistent hashing is easy.
  • If Redis is used as a store, we need to keep the map between keys and nodes fixed, and use a fixed number of nodes. Otherwise we need a system that is able to rebalance keys between nodes when we add or remove nodes. At the time of writing only Redis Cluster is able to do this, and Redis Cluster is not yet production ready.

Presharding

We learned that a problem with partitioning is that, unless we are using Redis as a cache, adding and removing nodes can be tricky, and it is much simpler to use a fixed keys-instances map.

However, data storage needs may vary over time. Today I can live with 10 Redis nodes (instances), but tomorrow I may need 50 nodes.

Since Redis has an extremely small footprint and is lightweight (a spare instance uses 1 MB of memory), a simple approach to this problem is to start with a lot of instances from the beginning. Even if you start with just one server, you can decide to live in a distributed world from your first day, and run multiple Redis instances on your single server, using partitioning.

You can choose this number of instances to be quite big from the start. For example, 32 or 64 instances could do the trick for most users, and will provide enough room for growth.

In this way, as your data storage needs increase and you need more Redis servers, all you have to do is move instances from one server to another. Once you add the first additional server, you will need to move half of the Redis instances from the first server to the second, and so forth.
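
A minimal sketch of the presharding idea: the key-to-instance map is fixed at 64 instances, and only the table of instance addresses changes when a server is added. The host addresses, the port numbering scheme, and the function names are hypothetical and chosen only for illustration.

```python
import zlib

NUM_INSTANCES = 64  # chosen once and never changed


def instance_index(key: str) -> int:
    """Fixed key-to-instance map; independent of how many servers exist."""
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


def build_address_table(servers, base_port=6379):
    """Spread the instances over the servers in contiguous blocks.

    Returns {instance index: (host, port)}. The hosts and the port layout
    are assumptions made for this sketch.
    """
    return {
        i: (servers[i * len(servers) // NUM_INSTANCES], base_port + i)
        for i in range(NUM_INSTANCES)
    }


one_server = build_address_table(["10.0.0.1"])
two_servers = build_address_table(["10.0.0.1", "10.0.0.2"])
moved = [i for i in range(NUM_INSTANCES) if one_server[i][0] != two_servers[i][0]]
```

Here `moved` lists the 32 instances that change host when the second server is added. No key changes its instance index, so no data has to be rehashed, only relocated together with its instance.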

Using Redis replication you will likely be able to do the move with minimal or no downtime for your users:

  • Start empty instances on your new server.
  • Move the data by configuring these new instances as slaves of your source instances.
  • Stop your clients.
  • Update the configuration of the moved instances with the new server IP address.
  • Send the SLAVEOF NO ONE command to the slaves on the new server.
  • Restart your clients with the new updated configuration.
  • Finally, shut down the no longer used instances on the old server.
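
The steps above can be sketched as a small helper that prints the `redis-cli` commands for a set of moves. It does not connect to anything; it only builds the command list. The function name `migration_plan`, the tuple layout, and the example addresses are assumptions for illustration. It also assumes the empty instances on the new server are already running, and that you wait for replication to finish before stopping clients.

```python
def migration_plan(moves):
    """Build an ordered list of commands for moving instances.

    moves: list of (old_host, old_port, new_host, new_port) tuples.
    Lines starting with '#' are manual steps, not commands.
    """
    plan = []
    # Make each new instance a slave of the instance it replaces.
    for old_host, old_port, new_host, new_port in moves:
        plan.append(
            f"redis-cli -h {new_host} -p {new_port} SLAVEOF {old_host} {old_port}"
        )
    plan.append("# stop clients and update their configuration with the new addresses")
    # Promote the new instances to masters.
    for _, _, new_host, new_port in moves:
        plan.append(f"redis-cli -h {new_host} -p {new_port} SLAVEOF NO ONE")
    plan.append("# restart clients with the updated configuration")
    # Shut down the instances that are no longer used.
    for old_host, old_port, _, _ in moves:
        plan.append(f"redis-cli -h {old_host} -p {old_port} SHUTDOWN")
    return plan


for line in migration_plan([("10.0.0.1", 6380, "10.0.0.2", 6380)]):
    print(line)
```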

Implementations of Redis partitioning

So far we have covered Redis partitioning in theory, but what about practice? What system should you use?

Redis Cluster

Unfortunately Redis Cluster is currently not production ready. However, you can get more information about it by reading the specification or checking the partial implementation in the unstable branch of the Redis GitHub repository.

Once Redis Cluster is available, and if a Redis Cluster compliant client is available for your language, Redis Cluster will be the de facto standard for Redis partitioning.

Redis Cluster is a mix between query routing and client side partitioning.

Twemproxy

Twemproxy is a proxy developed at Twitter for the Memcached ASCII and the Redis protocols. It is single threaded, written in C, and extremely fast. It is open source software released under the terms of the Apache 2.0 license.

Twemproxy supports automatic partitioning among multiple Redis instances, with optional node ejection if a node is not available (this changes the keys-instances map, so you should use this feature only if you are using Redis as a cache).

It is not a single point of failure, since you can start multiple proxies and instruct your clients to connect to the first one that accepts the connection.

Basically, Twemproxy is an intermediate layer between clients and Redis instances that reliably handles partitioning for us with minimal additional complexity. Currently it is the suggested way to handle partitioning with Redis.

You can read more about Twemproxy in this antirez blog post.

Clients supporting consistent hashing

An alternative to Twemproxy is to use a client that implements client side partitioning via consistent hashing or other similar algorithms. There are multiple Redis clients with support for consistent hashing, notably Redis-rb and Predis.

Please check the full list of Redis clients to see whether there is a mature client with a consistent hashing implementation for your language.

Please credit the source when reposting: http://www.cnblogs.com/eric-z/p/3995502.html
