Wang Tao: How traditional IT people use database thinking to understand blockchain


Blockchain, as both a concept and a technology, has been around for a long time, but only with the hype of the past two years has it become widely known to the market and to many engineers. As a veteran of the database industry, Wang Tao has observed that, amid the craze, engineers from a traditional IT background have remained very rational about blockchain technology, and sometimes even hostile to it. Rather than taking either extreme, enthusiasm or rejection, Wang Tao believes we should examine blockchain technology from a perspective that IT people can understand. Because blockchain is by nature very similar to database technology, many of its mechanisms can be understood intuitively and accurately through database concepts. In this article, Wang Tao compares blockchain with databases to help traditional IT people better understand the technology. He also proposes the “decentralized database” as a possible direction for the future convergence of blockchain and database technology. His talk follows in full; we hope you find it helpful.


About Wang Tao:


Wang Tao is the co-founder and CTO of SequoiaDB and was previously a core R&D member of the IBM DB2 Lab in North America. He has more than 10 years of experience in database core architecture design, database engine development and enterprise database applications. He also has an in-depth understanding of blockchain technology, distributed architecture, distributed algorithms and decentralized business applications. Since the company was founded in 2012, he has led the architecture design and development of SequoiaDB products and has worked to advance database, big data and blockchain technology in the industry.


At the BTA blockchain technology and application summit, Wang Tao gave a talk titled “Blockchain: decentralized database” under the blockchain core technology topic. This article is based on that talk.


The following is a record of Wang Tao's talk:

As a veteran of the database industry, I have seen that, amid the craze, engineers from a traditional IT background have remained very rational about blockchain technology, and sometimes even hostile to it. Rather than taking either extreme, enthusiasm or rejection, I think we should examine blockchain technology from a perspective that IT people can understand. Because blockchain is by nature very similar to database technology, many of its mechanisms can be understood intuitively and accurately through database concepts.


For blockchain and traditional data technologies, I think the future of blockchain technology lies in “fusion”. We will interpret the technical components of the blockchain stack from a database perspective, and use the concept of the “decentralized database” to bring blockchain and database technology closer together.


Blockchain technology status


The blockchain world is said to have gone through 1.0 and 2.0, and even to have reached the 3.0 era. But judged by how polished its products and technology are, I think today's blockchain is where databases were in the 1980s: an era in which a hundred flowers bloom and a hundred schools of thought contend.


For technical people, this is the best of times: all kinds of fresh ideas are bursting out and bringing breakthroughs to an otherwise dull technical field. It is also the worst of times: no product or direction is certain to become the future mainstream, and any fresh idea may be proven infeasible within a few months.


Therefore, to understand correctly how blockchain technology is changing and developing, let's compare it with the road that databases have travelled and see how the blockchain world may develop in the future.


1) Technology evolution


First, I believe that blockchain will evolve from today's special-purpose products to general-purpose ones. Nearly every public chain today is implemented and optimized for a specific scenario. But I think the future will not be one chain per application; it will be a general development paradigm, as with traditional databases: whatever kind of application you develop, a small number of general-purpose products can cover most business scenarios.


Second, evolution toward standardization. Today each chain has its own development paradigm, and many public chains even imitate Ethereum and try to create their own programming language. This is a sign that the industry is still in its primitive period. How do we judge whether an industry is starting to mature? Its business model is basically fixed and its development method is basically fixed, so that it can be rolled out to a large number of programmers.


Third, productization and modularity will keep getting stronger. At present, whether we look at Ethereum, Bitcoin or the many newer public chains, most architectures are very tightly coupled. Compare Hadoop in the big data world, where essentially every module can be configured and customized as a stand-alone plug-in. I therefore believe that as blockchain technology matures and stabilizes, a mature product will emerge that supports a variety of consensus algorithms and security mechanisms through pluggable configuration.


Finally, performance and scalability will improve. This, too, is a path that databases have travelled. Through sidechains and sharding, the blockchain world is now trying to cover in a short time the transformation that took databases decades.


Later I will look at blockchain from a database perspective and discuss where the bottlenecks that limit its maximum performance and scalability lie, and how they should be optimized.


2) Development status


Next, let's look at the current state of the blockchain industry.


I have always followed the applications built on top of blockchain and the innovations in the financial field. From a technical point of view, the biggest innovation is the establishment of a peer-to-peer data storage mechanism.


In the database industry, everyone has followed the master-slave architecture. A fully "multi-active" system has been the stuff of legend since it was proposed decades ago: no product has ever been truly multi-active.


When we look at current blockchain technology as an innovative multi-active database, we find three issues that need improvement.


First, blockchain architecture is currently very confusing. It has not been broken down into modules such as transactions, stored procedures, authentication and master-slave synchronization, as most traditional databases have. For most people, blockchain is still at the mysterious black box stage.


Second, blockchain development languages are completely fragmented. After their own early "Warring States era", databases gradually unified the industry around SQL. Blockchain is clearly still in its "Warring States era", with no unified standard for development and use.


Third, requirements vary wildly. Some of the requirements and business descriptions in white papers are sound, while others are completely fanciful. This is related to the new business models that blockchain brings: many people are still exploring them, so no standard paradigm has formed.


Blockchain vs database technology: similarities


From a database perspective, blockchain technology is decentralized, multi-active database technology; there is no essential difference between the two.


Here I list some of the more important technical points in blockchain and the form each one takes in the database field. These concepts correspond one-to-one with database concepts as follows:


Consensus mechanism


Consistency control - consensus mechanism


In distributed databases this is called consistency control, and it includes traditional master-slave replication as well as newer algorithms such as Raft and Paxos. To handle the additional Byzantine problem that blockchain faces, blockchains turn to protocols such as PBFT, PoW and PoS.
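
One way to see what the Byzantine problem costs is to compare cluster sizes. The sketch below uses the standard bounds: majority-quorum protocols such as Raft and Paxos need 2f + 1 nodes to survive f crashed nodes, while PBFT-style protocols need 3f + 1 nodes to survive f nodes that may lie. The function names are illustrative and do not come from any library.

```python
def crash_fault_cluster_size(f: int) -> int:
    """Minimum nodes to tolerate f crashed nodes with majority quorums (Raft/Paxos)."""
    return 2 * f + 1


def byzantine_cluster_size(f: int) -> int:
    """Minimum nodes to tolerate f Byzantine (arbitrarily faulty) nodes (PBFT-style)."""
    return 3 * f + 1


def majority_quorum(n: int) -> int:
    """Votes needed for a decision in a majority-quorum protocol with n nodes."""
    return n // 2 + 1


def pbft_quorum(n: int) -> int:
    """Matching messages needed in a PBFT-style protocol: 2f + 1, with f = (n - 1) // 3."""
    f = (n - 1) // 3
    return 2 * f + 1


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f"f={f}: crash-fault nodes={crash_fault_cluster_size(f)}, "
              f"Byzantine nodes={byzantine_cluster_size(f)}")
```

PoW and PoS are not quorum protocols in this sense, so the comparison only covers the PBFT family.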

Storage mechanism


Database log - ledger


The blockchain structure is essentially equivalent to a database transaction log. Its main addition is structures such as the Merkle tree, which make it possible to verify the correctness of data quickly. A database log, on the other hand, also carries enterprise-level capabilities such as transaction control, which the blockchain data structure lacks.
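
As a rough illustration of how a Merkle tree lets a node check a batch of log records against a single hash, here is a minimal sketch. It is a simplification: it uses a single SHA-256 pass and duplicates the last hash on odd-sized levels, whereas real chains differ in the details (Bitcoin, for instance, hashes twice). The function names are illustrative.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(records: List[bytes]) -> bytes:
    """Fold a list of records into one root hash by pairwise hashing."""
    if not records:
        return sha256(b"")
    level = [sha256(r) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd-sized levels with the last hash
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    log = [b"alice->bob:30", b"bob->carol:10", b"alice->carol:5"]
    root = merkle_root(log)
    tampered = [b"alice->bob:31", b"bob->carol:10", b"alice->carol:5"]
    print(root.hex())
    print(merkle_root(tampered) == root)  # any changed record changes the root
```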


Smart contract


Stored procedure - smart contract


Smart contracts, like database stored procedures, are pieces of managed code. In essence the two are no different: a piece of code is executed through an external call, typically inside a virtual machine, and the managed code can be shared with other users.
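
The analogy can be sketched in a few lines. The `ContractStore` class below is hypothetical and only illustrates the idea: code is deployed once under a name, lives alongside the data, and any caller can then invoke it against the shared state, much like calling a stored procedure. Real contract platforms add a virtual machine, fees and signatures, none of which is modelled here.

```python
from typing import Callable, Dict, Optional


class ContractStore:
    """Hypothetical store that keeps managed code next to the state it operates on."""

    def __init__(self, initial_state: Optional[Dict[str, int]] = None) -> None:
        self.state: Dict[str, int] = dict(initial_state or {})
        self._contracts: Dict[str, Callable[..., None]] = {}

    def deploy(self, name: str, code: Callable[..., None]) -> None:
        if name in self._contracts:
            raise ValueError(f"contract {name!r} is already deployed")
        self._contracts[name] = code

    def call(self, name: str, *args) -> None:
        # The stored code runs against the shared state, like CALL proc(...).
        self._contracts[name](self.state, *args)


def transfer(state: Dict[str, int], sender: str, receiver: str, amount: int) -> None:
    """Example contract: validate first, then mutate, so failures leave no trace."""
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError("invalid amount or insufficient balance")
    state[sender] -= amount
    state[receiver] = state.get(receiver, 0) + amount


if __name__ == "__main__":
    store = ContractStore({"alice": 100})
    store.deploy("transfer", transfer)
    store.call("transfer", "alice", "bob", 40)
    print(store.state)
```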


Sharding


Database sharding has existed since the era of MPP databases. By dividing a large data set into shards, it limits the amount of data held by each shard and increases total throughput and storage capacity.
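
A minimal sketch of hash-based sharding, one common way to split data, is shown below. The key format and shard count are made up for the example, and real systems (databases and sharded chains alike) also have to deal with rebalancing and cross-shard operations, which are left out.

```python
import hashlib
from typing import Iterable, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash, so every node agrees on placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys: Iterable[str], num_shards: int) -> List[List[str]]:
    """Group keys by shard; each shard holds only part of the total data."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    accounts = [f"account-{i}" for i in range(1000)]
    for index, shard in enumerate(partition(accounts, 4)):
        print(f"shard {index}: {len(shard)} keys")
```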


Application development interface


Blockchain today resembles the early days of databases, and its interfaces are not standardized. Depending on the project, the interface may be defined along the lines of a database, object storage, API calls, or even a PaaS platform.


Security


Blockchain security mechanisms are similar to database security mechanisms. Database security is generally divided into two modules, authentication and authorization, which cover user login and access rights respectively. Blockchains currently support only record-level write authorization, while reads are fully shared. In terms of security policy, databases are therefore far more complete than current blockchains.


Blockchain vs database technology: differences


Database and blockchain functional architecture diagram


1) Functional architecture


In the diagram, the yellow parts are functions found in both blockchain and database architectures. The white parts are features currently unique to databases.


SQL, as mentioned above, is an important part of what makes databases general-purpose, and it will be very important for the development of the blockchain model.


Index management in a database mainly improves the efficiency of data management and queries. Once concrete application scenarios emerge, performance will become the next thing that needs improving, and indexes over stored data will become an important component.


In terms of mechanism, the main differences between blockchain and database are as follows:


2) Consistency


The biggest difference between blockchain design and traditional database design is multi-active operation, that is, the different consistency model that decentralization brings.


Traditional relational databases follow the ACID strong consistency model: once a record is written, it can be read immediately. Some newer distributed databases use eventual consistency, the BASE model: data that has been written may not be readable right away, but it will eventually be there.


Blockchains, or decentralized databases, are designed very differently: no operation is ever "permanently confirmed". Even in Bitcoin, by its core principle, content more than 6 blocks deep is only "very unlikely to be rolled back". To give an extreme example, if the WAN link between China and the United States were cut for three days and then restored, Bitcoin would fork on a large scale. If the same account had spent heavily in both China and the United States during that time, then to restore a single main chain, a large number of people's transactions would have to be sacrificed and rolled back.


Since a peer-to-peer architecture cannot guarantee strong consistency, consistency in a blockchain system is essentially different from that of a traditional database, and this leads to a series of design differences. In any database with a traditional master-slave architecture, people do their utmost to prevent "split-brain" in the cluster, where two nodes in the same cluster both believe they are the master. In a peer-to-peer database system, however, this can happen at any time. In blockchain it is called a fork, and it is very different from the traditional database consistency model.
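
To put a number on "very unlikely", the sketch below evaluates the attacker-catch-up estimate from the Bitcoin white paper. This model is brought in here for illustration and is not part of the talk: `q` is the assumed share of hash power held by an attacker and `z` is the number of blocks built on top of the transaction. The result is a probability that shrinks quickly with depth but never reaches zero.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q ever overtakes
    the honest chain from z blocks behind (Bitcoin white paper, section 11)."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # an attacker with half the hash power or more always catches up
    lam = z * (q / p)
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


if __name__ == "__main__":
    for depth in range(0, 7):
        print(depth, attacker_success_probability(0.10, depth))
```

With a 10% attacker, six confirmations push the probability well below 0.1%, which is why the rollback risk is treated as negligible in practice rather than as impossible.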


3) Lock mechanism


Of these, the locking mechanism is arguably the biggest difference between blockchain and databases in how they guarantee data consistency.


Anyone who has studied databases will have heard of locks. When we run a transaction, all records changed by the session before commit are locked and cannot be modified by other sessions.


In a decentralized database, each accounting node operates on local data and changes are propagated asynchronously, so there is no global lock that could notify others while a record is being changed. Without locks, how does a decentralized database, that is, a blockchain, keep its data consistent?


Bitcoin uses the UTXO structure, which is somewhat similar to "optimistic locking" in databases: nothing is locked while the operation is performed, and only at final commit is it checked whether the record has changed.


Bitcoin detects conflicting transactions by checking whether a coin has already been spent. Ethereum uses a nonce as an incrementing counter on each account to detect duplicate transactions from that account. This is effectively a row-level lock in disguise.
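
The two conflict checks described above can be sketched as follows. Both classes are simplified illustrations, not the actual Bitcoin or Ethereum data structures: `UtxoSet` rejects a second spend of the same coin, and `Account` rejects any transaction whose nonce does not match the account's current counter. In both cases nothing is locked in advance; the conflict is detected when the change is applied.

```python
from typing import Iterable, Set


class UtxoSet:
    """Simplified set of unspent outputs: a coin can be consumed only once."""

    def __init__(self, unspent: Iterable[str]) -> None:
        self.unspent: Set[str] = set(unspent)

    def spend(self, coin_id: str) -> bool:
        if coin_id not in self.unspent:
            return False  # already spent or unknown: conflicting transaction
        self.unspent.remove(coin_id)
        return True


class Account:
    """Simplified account with a nonce that acts like a row version number."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.nonce = 0

    def apply(self, tx_nonce: int, amount: int) -> bool:
        if tx_nonce != self.nonce:
            return False  # replayed or out-of-order transaction
        if amount > self.balance:
            return False
        self.balance -= amount
        self.nonce += 1  # the next transaction must carry the new nonce
        return True


if __name__ == "__main__":
    coins = UtxoSet({"coin-1"})
    print(coins.spend("coin-1"), coins.spend("coin-1"))  # second spend is rejected

    account = Account(balance=100)
    print(account.apply(0, 30), account.apply(0, 30))  # replayed nonce is rejected
```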


4) Security mechanism


Another topic the blockchain industry talks about a great deal is security.


First, I am not a cryptography expert, so I will not discuss specific encryption algorithms here. Instead, I will look at the security model of the storage system as a whole and discuss how blockchain technology keeps data secure under a fully peer-to-peer architecture.


In my view, blockchain security is divided into three levels: record level, block level and chain level.


Record-level security mainly determines whether an operation record is legal. In some implementations, it also covers whether a record is visible to, and readable or writable by, different users.


Block-level security concerns how a node that receives a block from another node can judge that the block itself has not been tampered with. This can be done through mechanisms such as the Merkle tree and the mining result.


Finally, how is the integrity of the chain guaranteed? For example, each block must contain a checksum of the previous block in the chain, and there must be a way to roll back when a fork occurs. Together these ensure the integrity of the whole chain structure.
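
The chain-level check can be sketched as a list of blocks in which each block stores the hash of its predecessor. The block layout and function names below are illustrative assumptions, and fork handling is not modelled.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_PREV_HASH = "0" * 64


def block_hash(block: Dict) -> str:
    """Hash a block's contents, including the link to its predecessor."""
    payload = json.dumps(
        {"prev_hash": block["prev_hash"], "records": block["records"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict], records: List[str]) -> None:
    prev = block_hash(chain[-1]) if chain else GENESIS_PREV_HASH
    chain.append({"prev_hash": prev, "records": list(records)})


def verify_chain(chain: List[Dict]) -> bool:
    """Walk the chain and check that every block points at the real hash of the one before."""
    expected = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != expected:
            return False
        expected = block_hash(block)
    return True


if __name__ == "__main__":
    chain: List[Dict] = []
    append_block(chain, ["alice->bob:30"])
    append_block(chain, ["bob->carol:10"])
    append_block(chain, ["alice->carol:5"])
    print(verify_chain(chain))
    chain[0]["records"][0] = "alice->bob:3000"  # rewrite history
    print(verify_chain(chain))
```

Note that this check only protects blocks that have a successor; the newest block has to be validated by other means, such as the mining result mentioned above.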


Decentralized database architecture


What would the convergence of blockchain technology and database technology look like?


Can we organize existing blockchains along the lines of a database architecture, divided into modules such as kernel, runtime, plug-ins, and SQL parsing and optimization?


At its core, a database is still an immutable transaction log, and this part is equivalent to the chain structure of a blockchain. If we build a SQL engine on top of the state store, or even let the SQL engine access the on-chain data directly, doesn't that give us a common programming and access interface?
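
As a thought experiment, the idea can be tried with Python's built-in `sqlite3` module: replay an append-only log into a state table and then query both with ordinary SQL. The table names, the sample log and the starting balance are invented for the example, and SQLite merely stands in for whatever SQL engine a real decentralized database would embed.

```python
import sqlite3
from typing import Dict, List, Tuple

# Append-only log of (sender, receiver, amount), standing in for on-chain records.
LOG: List[Tuple[str, str, int]] = [
    ("alice", "bob", 30),
    ("bob", "carol", 10),
    ("alice", "carol", 5),
]


def build_state(log: List[Tuple[str, str, int]],
                initial: Dict[str, int]) -> sqlite3.Connection:
    """Replay the log into a 'balances' state table, keeping the log queryable too."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (seq INTEGER PRIMARY KEY, "
                 "sender TEXT, receiver TEXT, amount INTEGER)")
    conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, "
                 "balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO balances VALUES (?, ?)", list(initial.items()))
    for seq, (sender, receiver, amount) in enumerate(log):
        conn.execute("INSERT INTO ledger VALUES (?, ?, ?, ?)",
                     (seq, sender, receiver, amount))
        conn.execute("UPDATE balances SET balance = balance - ? WHERE account = ?",
                     (amount, sender))
        conn.execute("INSERT OR IGNORE INTO balances VALUES (?, 0)", (receiver,))
        conn.execute("UPDATE balances SET balance = balance + ? WHERE account = ?",
                     (amount, receiver))
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_state(LOG, {"alice": 100})
    # Query the derived state ...
    print(conn.execute(
        "SELECT account, balance FROM balances ORDER BY account").fetchall())
    # ... and the log itself, through the same interface.
    print(conn.execute(
        "SELECT SUM(amount) FROM ledger WHERE sender = 'alice'").fetchone())
```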


As another example, for the security component, can we provide column-level, row-level, table-level and node-level security controls? Can we specify which tables must be digitally signed, and that some fields of a table are shared while other fields must be multi-signed or encrypted?


In addition, for consistency, can we specify that some tables are globally shared tables and others are local tables, so that a single system can replace today's separate deployments of blockchains and databases?


I believe that in the future there will be a “decentralized database” that combines the two.


Decentralized database basic functions


The basic features of the decentralized database:


Decentralization: the architecture is completely decentralized, with no central control node. Every node can read and write, and the data on all nodes is consistent.


No global lock: because of the peer-to-peer architecture across the WAN, a decentralized database cannot implement global locks, so the system has to weaken locking and consistency to some degree in order to meet high-availability requirements.


Logs generated by non-fixed nodes: the log is the log of the entire database, and any node in the decentralized architecture has the right to write it. This yields an architecture with no master node, in which any node may temporarily become the accounting node that produces a block.


Asynchronous transaction acknowledgment: since there is no global lock, some transaction mechanisms must be adjusted compared with traditional databases. A more feasible approach may be to make the commit and rollback of transactions asynchronous.


Consistency policy adjustment: in a multi-active blockchain, the data consistency policy will differ from the consistency mechanism of a traditional database.


Row-level security and triggers: for data security, a decentralized database will guarantee security at the row or even column level.


Conclusion: Blockchain and Database Technology Integration: Decentralized Database


For blockchain and traditional data technology, I think the theme of blockchain's future development is "fusion"!


The business concepts around blockchain are developing rapidly. From a purely technical perspective, however, I believe that today's blockchain technology is still at a stage similar to database technology in the 1980s: a technology that is still growing up. As mentioned above, it has a long way to go in generality and standardization.


Given the similarity of their technical routes and architecture designs, the convergence of database technology and blockchain technology is the trend of the times. By incorporating blockchain techniques and mechanisms, decentralized databases will be an important direction for future technological development.
