We live in an era of massive data storage, and big data and one of its underlying technologies, NoSQL, have become Internet buzzwords. At global Internet companies such as Google, Facebook, and IBM, the use of NoSQL, a class of highly scalable non-relational data stores, often exceeds that of relational databases. A series of new database products has emerged to handle massive and semi-structured data, and these databases are collectively called NoSQL.
On June 23, 2013, the Global Big Data Technology Summit, sponsored by the WOT (World of Tech) brand of 51CTO Media Group, was held at the Renaissance Beijing Hotel. NoSQL products are constantly changing, and their features and value propositions differ, so choosing among them is often difficult. The reporter spoke at length with Ming Lei, a software development expert from Silicon Valley who has worked at Oracle, Microsoft, and Google, about NoSQL in practice, and summarized some of his ideas for readers' reference.
Ming Lei (left)
Distributed Systems and NoSQL
A distributed system consists of several layers, including the presentation layer, the application layer, and the data layer. Here we focus on the application layer and the data layer, both of which are important components of a distributed system. In general, the application layer is stateless, while the data layer is responsible for storing state. The data layer is the most difficult and most complex layer in a distributed system.
According to Ming Lei, NoSQL is the storage part of a distributed system: it is itself a type of distributed system, or one layer of a larger distributed system.
NoSQL Caches Compared with CDN Caches
On the NoSQL side, the typical cache is Memcached. The biggest difference between a NoSQL cache and a CDN cache is that a NoSQL cache sits at the data layer, not at the application layer or the network layer, so what it caches is relatively raw data: for example, intermediate content used by the application logic rather than the final result shown to the user. Caching at the network layer is most commonly done with a CDN (content delivery network), which generally caches specific web pages on servers at the edge of the network, close to the user.
Memcached:
- Free and open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
- Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from the results of database calls, API calls, or page rendering.
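The idea of relieving database load with a key-value cache can be illustrated with a minimal cache-aside sketch. It does not use a real Memcached client: `TinyCache` is an in-process stand-in, and `FakeDatabase`, `get_user`, the key format, and the 60-second expiry are all hypothetical names and values chosen for illustration.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached-style key-value cache (hypothetical)."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry time or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # entry has expired, treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._items[key] = (value, expires_at)


class FakeDatabase:
    """Hypothetical slow backing store that counts how often it is queried."""

    def __init__(self):
        self.calls = 0

    def load_user(self, user_id):
        self.calls += 1
        return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, db, user_id):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = db.load_user(user_id)
        cache.set(key, user, ttl_seconds=60)  # assumed expiry, tune per workload
    return user
```

Calling `get_user` twice for the same id reaches the database only once; the second read is served from the cache.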
Memcached Architecture
- Sharding in client code to select the server.
- Peer-to-peer server instances.
- Servers use in-memory storage.
- Can potentially be extended with a persistent store.
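Client-side sharding, as listed above, can be sketched as follows. This is only an illustration under stated assumptions: the server names are made up, each "server" is a plain dictionary rather than a remote instance, and simple hash-modulo selection is assumed, which is not necessarily what any particular client library does.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server by hashing it (simple modulo scheme, an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


class ShardedCache:
    """Client that routes each key to exactly one of several independent stores."""

    def __init__(self, server_names):
        self.server_names = list(server_names)
        # One dict per server name stands in for a remote in-memory instance.
        self.stores = {name: {} for name in self.server_names}

    def set(self, key, value):
        self.stores[pick_server(key, self.server_names)][key] = value

    def get(self, key):
        return self.stores[pick_server(key, self.server_names)].get(key)
```

Because the servers never talk to each other, all routing knowledge lives in the client. One known drawback of plain modulo hashing is that adding or removing a server remaps most keys; consistent hashing is a common alternative.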
Memcached Usage Characteristics
- Object-level consistency, isolation, and atomicity.
- No persistent storage.
- No replication for load balancing or failover.
- Consistency + partition tolerance in CAP.
NoSQL Security Analysis
In fact, security issues can be solved at different levels of a system, and they do not need to be solved at every level. For example, a distributed storage system is generally a storage service: a remote network call is required to send the request and obtain the result. A better approach is to handle security at the network call, for example by adding security management such as user authorization there, rather than inside the distributed storage operations themselves.
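A minimal sketch of this separation, assuming a token-based permission table: the storage core contains no security logic, and a gateway at the call boundary authorizes each request before forwarding it. `KeyValueStore`, `AuthorizingGateway`, and the token scheme are hypothetical and only illustrate where the check lives.

```python
import hmac


class KeyValueStore:
    """Hypothetical storage core: it contains no security logic at all."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class AuthorizingGateway:
    """Sits at the call boundary and authorizes each request before forwarding it."""

    def __init__(self, store, permissions):
        self._store = store
        self._permissions = permissions  # token -> set of allowed operations

    def _check(self, token, operation):
        for known_token, allowed in self._permissions.items():
            # Constant-time comparison avoids leaking token contents via timing.
            same = hmac.compare_digest(known_token.encode(), token.encode())
            if same and operation in allowed:
                return
        raise PermissionError(f"{operation} not permitted for this caller")

    def get(self, token, key):
        self._check(token, "get")
        return self._store.get(key)

    def put(self, token, key, value):
        self._check(token, "put")
        self._store.put(key, value)
```

With this layout the storage operations stay simple, and the authorization policy can change without touching the storage code.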
Hadoop Multidimensional Analysis Platform
NoSQL and SQL
According to Ming Lei, the two have different application scenarios. In our experience, applications aimed at Internet users and consumers have low requirements for transactions, while enterprise applications have high requirements. For example, the finance, logistics, and personnel departments in an enterprise usually share the same set of databases, so the requirements for transactions are high.
For example, if you build a website as a service, the scope of a transaction may be just a single account, so your transactional requirements on the database are low, while your data volume is very large. In that case you need a solution different from a relational database, and that solution is called NoSQL. The biggest difference is that NoSQL targets large data volumes with low transactional requirements.
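To make the enterprise side of this contrast concrete, here is a minimal sketch of a multi-record transaction using Python's built-in `sqlite3` module. The `accounts` table, the department names, and the `transfer` helper are hypothetical; the point is only that both updates succeed together or are rolled back together, which is the kind of guarantee a relational database provides.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical department accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("finance", 100), ("logistics", 0)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False
```

By contrast, a consumer website that only ever updates one account record at a time needs no cross-record guarantee, which is why a simple key-value store can be enough there.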
NoSQL database comparison diagram
The Future of NoSQL
I think the most common application scenario on the Internet is one where the data volume is very large, the requirements for transactions are relatively low or the scope of each transaction is relatively narrow, and the structure is relatively simple. For such applications, NoSQL is a future direction of development.
However, some enterprise-level applications still need relational databases. Currently, there is no trend in the industry toward moving the relational databases of enterprise applications to NoSQL.