zhanhailiang Date: 2014-12-11
This article introduces common data storage solutions and the criteria for choosing among them.
Guideline: choose the storage method that fits each application scenario.
1. Data Storage Solutions
SQL:
MySQL 5.5/5.6/MariaDB (transparent to developers in the vast majority of scenarios); Oracle and MS SQL are not considered for now;
NoSQL:
Memcached 1.4.21; Redis 2.8; MongoDB 2.6.6; HBase 0.96/0.98;
2. Evaluation Criteria
RDBMS (MySQL):
- Data persistence is required, and user-submitted data must not be lost;
- Transaction guarantees are required;
- The application logic and data structures are complex, and the consistency requirements are high;
- Distributed deployment is complex to implement, and splitting data across multiple databases is costly;
- Suitable for OLTP and MIS systems that need strict transaction guarantees;
Typical scenario:
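Taking an e-commerce site as an example: all back-end subsystems (such as ERP, logistics, finance, warehousing, HR, VIS, etc.); core site data storage (such as users, products, inventory, shopping carts, orders);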
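The transaction requirement above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module purely as a stand-in for MySQL so that it runs without a server; the table names, the `place_order` function, and the `InsufficientStock` exception are hypothetical and chosen only for this example. The point is that the order row and the inventory update either both take effect or neither does.

```python
import sqlite3


class InsufficientStock(Exception):
    """Raised when the requested quantity exceeds the available stock."""


def make_demo_db():
    # In-memory database with hypothetical inventory and orders tables.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "product_id INTEGER NOT NULL, quantity INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO inventory VALUES (1, 5)")
    conn.commit()
    return conn


def place_order(conn, product_id, quantity):
    """Record an order and decrement stock atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? "
                "WHERE product_id = ? AND stock >= ?",
                (quantity, product_id, quantity),
            )
            if cur.rowcount != 1:
                # Not enough stock: the order row inserted above is rolled back too.
                raise InsufficientStock(product_id)
        return True
    except InsufficientStock:
        return False


def stock(conn, product_id):
    row = conn.execute(
        "SELECT stock FROM inventory WHERE product_id = ?", (product_id,)
    ).fetchone()
    return row[0]


def order_count(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

A second call that asks for more than the remaining stock returns False and leaves both tables unchanged, which is the kind of guarantee a plain KV store does not provide.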
KV (Memcached/Redis):
- The data structure is simple: records are queried and updated by a simple key;
- The data does not need persistent storage (persistence on disk); it is secondary data, generally not written directly by users (for example, it is generated by back-end jobs, or maintained by application-level double writes);
- No transaction support is needed;
- QPS/TPS may be very high (for example, 10k+ queries/transactions per second);
- Very low response latency is required (typically <1 ms); with Redis, operations within the same data center generally complete in tens of microseconds;
Typical scenario:
Counters of all kinds; cache layers of all kinds (product listing pages, configuration data, product descriptions, etc.);
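The two typical KV uses above, a cache layer and a counter, can be sketched as follows. `TinyKV` is a hypothetical in-memory stand-in written for this example; it is not the Memcached or Redis client API, and `get_product_description` and its key format are likewise assumptions. The sketch shows the common read-through pattern: look in the cache first, fall back to the primary store on a miss, then populate the cache with an expiry.

```python
import time


class TinyKV:
    """Hypothetical in-memory stand-in for a KV store such as Memcached or Redis."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expires_at or None)
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def incr(self, key, amount=1):
        # Counter use case: a missing key starts from zero.
        new_value = (self.get(key) or 0) + amount
        self._data[key] = (new_value, None)
        return new_value


def get_product_description(kv, product_id, load_from_db, ttl=60):
    """Return the description from the cache, loading it from the database on a miss."""
    key = f"product:desc:{product_id}"
    cached = kv.get(key)
    if cached is not None:
        return cached
    value = load_from_db(product_id)  # the primary copy lives in the RDBMS
    kv.set(key, value, ttl=ttl)
    return value
```

Because the cached value is secondary data, losing it only costs one extra database read, which is why persistence is not required for this layer.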
Analytics Platform:
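- Hadoop: ETL; scientific analysis;
- GP: BI analysis; reports of all kinds;
- HBase: online systems; OLAP analysis;
- DocDB: information processing systems where the application logic is relatively simple but the data structures are relatively complex, rapid development matters, and processing is non-transactional, such as Q&A sites and communities;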
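The RDBMS and KV criteria in this section can be condensed into a small decision helper. This is only a rough sketch: the `Requirements` fields, the `suggest_store` function, and the order in which the rules are checked are assumptions made for illustration, not a rule stated in the original criteria.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_persistence: bool   # user-submitted data must not be lost
    needs_transactions: bool  # strict transaction guarantees
    simple_key_access: bool   # records are read and written by a simple key


def suggest_store(req):
    """Map the criteria to a storage family; 'undecided' means the criteria do not settle it."""
    if req.needs_persistence or req.needs_transactions:
        return "RDBMS"
    if req.simple_key_access:
        return "KV"
    return "undecided"
```

For example, an order table needs both persistence and transactions and maps to "RDBMS", while a page-view counter needs neither and is accessed by key, so it maps to "KV".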
3. Performance Optimization
When an existing system hits a performance bottleneck, optimize in this order:
- Capacity assessment
- Performance optimization (system tuning, code logic optimization, SQL optimization)
- Hardware upgrades (from low-end hardware to high-end hardware, from low-end storage to high-end storage)
- Split vertically (split database according to different modules)
- Split horizontally (when a single module can no longer run on one system, split that module's data by primary key or other logic)
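The last two steps can be sketched as a routing function. This is a minimal illustration under assumed conventions: the `<module>_db` naming scheme, the `route` and `shard_for_key` functions, and the use of modulo (or CRC32 for non-integer keys) are all hypothetical choices, and real deployments often use other schemes such as range-based or directory-based sharding.

```python
import zlib


def shard_for_key(primary_key, shard_count):
    """Pick a shard index for a primary key (horizontal split)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    if isinstance(primary_key, int):
        return primary_key % shard_count
    # Non-integer keys: use a stable hash so the same key always maps to the same shard.
    return zlib.crc32(str(primary_key).encode("utf-8")) % shard_count


def route(module, primary_key, shard_counts):
    """Return the database name for a record.

    Vertical split: each module gets its own database, named '<module>_db'.
    Horizontal split: modules listed in shard_counts with more than one shard
    are further divided by primary key.
    """
    base = f"{module}_db"
    count = shard_counts.get(module, 1)
    if count == 1:
        return base
    return f"{base}_{shard_for_key(primary_key, count)}"
```

With `shard_counts = {"order": 4}`, an order with primary key 10 goes to `order_db_2`, while users stay in the single `user_db`. Simple modulo routing like this moves most keys when the shard count changes, which is one reason splitting is left until the cheaper steps are exhausted.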
Appendix
I have started using GitHub issues to write this blog. Very cool!
Corresponding GitHub link: #4
Data Storage Solution Evaluation Criteria: RDBMS or KV