This article is from the "Blockchain Technical Guide".
Hash algorithm
Definition
A hash (or hashing) algorithm is a basic and very important technique in information technology. It maps a binary value of arbitrary length (the plaintext) to a shorter, fixed-length binary value (the hash value), and it is difficult for different plaintexts to map to the same hash value.
For example, the MD5 hash value of the string "Hello Blockchain world, this is Yeasy@github" is 89242549883a2ef85dc81b90fb606046:
$ echo "Hello Blockchain world, this is Yeasy@github" |md5
89242549883a2ef85dc81b90fb606046
This means that if the MD5 hash of a file comes out as 89242549883a2ef85dc81b90fb606046, the file's content is, with overwhelming probability, "Hello Blockchain world, this is Yeasy@github". The core idea of hashing is thus very similar to content-based addressing or naming.
NOTE: MD5 is a classic hash algorithm; both it and SHA-1 have been shown to be insufficiently secure for commercial scenarios.
A good hash algorithm should achieve:
Fast forward computation: given the plaintext and the hash algorithm, the hash value can be computed in finite time with finite resources.
Hard to reverse: given a hash value, it is difficult (practically impossible) to deduce the plaintext within any finite time.
Input sensitivity: modifying even a small part of the original input should produce a hash value that looks completely different.
Collision avoidance: it is difficult to find two different plaintexts whose hash values are identical (i.e., that collide).
Collision avoidance is also called "collision resistance". If, given one plaintext, no second plaintext colliding with it can be found, the algorithm has "weak collision resistance"; if no colliding pair of plaintexts can be found at all, the algorithm has "strong collision resistance".
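These properties are easy to observe with Python's standard hashlib module. The sketch below (illustrative only, not part of the original text) demonstrates input sensitivity by changing a single character. Note that the shell example above pipes through echo, which appends a trailing newline, so hashing the bare string here produces a different digest than the one shown there.

```python
import hashlib

msg1 = b"Hello Blockchain world, this is Yeasy@github"
msg2 = b"hello Blockchain world, this is Yeasy@github"  # one letter changed

h1 = hashlib.md5(msg1).hexdigest()
h2 = hashlib.md5(msg2).hexdigest()

print(h1)
print(h2)
# The two digests share no visible relationship even though the
# inputs differ in a single character (input sensitivity).
```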
Popular algorithms
Currently popular hash algorithms include MD5 (which has been proven insecure) and SHA-1; both are based on the design of MD4.
MD4 (RFC 1320) was designed by Ronald L. Rivest of MIT in 1990; MD stands for Message Digest. Its output is 128 bits. MD4 is not secure enough.
MD5 (RFC 1321) is Rivest's 1991 improvement on MD4. It still processes input in 512-bit blocks, and its output is 128 bits. MD5 is more complex and slower to compute than MD4, but more secure. MD5 is still not secure enough.
SHA-1 (Secure Hash Algorithm) was designed by the NSA and published by NIST; its output is a 160-bit hash value, making it more resistant to brute-force search. The SHA-1 design is based on the same principles as MD4 and imitates that algorithm.
To improve security, the NSA also designed the SHA-224, SHA-256, SHA-384, and SHA-512 algorithms (collectively known as SHA-2), which are similar in principle to SHA-1.
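The differing output lengths of these algorithms can be checked directly with Python's hashlib, as in this short sketch:

```python
import hashlib

data = b"Hello Blockchain world"
# hashlib exposes each algorithm by name; digest length in hex chars * 4 = bits.
digests = {name: hashlib.new(name, data).hexdigest()
           for name in ("sha1", "sha224", "sha256", "sha384", "sha512")}
for name, dg in digests.items():
    print(f"{name}: {len(dg) * 4}-bit digest")
```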
Performance
In general, hash algorithms are compute-bound: computing power is the bottleneck, and the higher the CPU frequency, the faster the hashing.
Some hash algorithms, however, are not compute-bound. scrypt, for example, requires a large amount of memory, so a node cannot improve its hash performance simply by adding more CPUs.
Digital digest
As the name implies, a digital digest is the result of hashing digital content to obtain a unique digest value that refers to the original content.
The digital digest solves the problem of ensuring that content has not been tampered with (relying on the collision resistance of the hash function).
The digital digest is one of the most important applications of hash algorithms.
When downloading software or files from the network, a digest value is often provided alongside. After downloading the original file, the user can compute its digest locally and compare it with the published value, to confirm that the content has not been modified.
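This verification workflow can be sketched as follows (the file content and "published" digest here are made up for illustration; real downloads publish the digest out of band):

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algo: str = "sha256") -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path: str, published_digest: str, algo: str = "sha256") -> bool:
    """Compare the locally computed digest with the published one."""
    return file_digest(path, algo) == published_digest

# Demo: pretend this temp file is a downloaded installer.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"pretend this is a downloaded installer")
    path = f.name
published = hashlib.sha256(b"pretend this is a downloaded installer").hexdigest()
ok = verify_download(path, published)
os.unlink(path)
```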
Encryption algorithm
Public Key Private Key system
The typical components of a modern cryptographic scheme are: an encryption algorithm, a decryption algorithm, a public key, and a private key.
In the process of encryption, the plaintext is encrypted by encryption algorithm and public key, and ciphertext is obtained.
In the process of decryption, the ciphertext is decrypted by decryption algorithm and private key, and the plaintext is obtained.
Depending on whether the encryption key and the decryption key are the same, these algorithms are divided into symmetric and asymmetric encryption. The two models suit different needs and complement each other; they are often combined into hybrid mechanisms.
Symmetric encryption
As the name implies, in symmetric encryption the encryption key and the decryption key are the same.
Its advantages are fast encryption and decryption, small space overhead, and high secrecy.
Its disadvantage is that all participating parties must hold the key; if anyone leaks it, security is broken. Distributing the key to the other parties in the first place is also a problem.
Representative algorithms include DES, 3DES, AES, and IDEA.
It is suitable for encrypting and decrypting large amounts of data, but cannot be used for signature scenarios.
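As a toy illustration only (real systems use vetted ciphers such as AES, not this construction), the sketch below shows the defining property of symmetric encryption: the same key both encrypts and decrypts. The "cipher" is a simple XOR against a hash-derived keystream.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Derive a pseudo-random keystream by hashing key + counter blocks.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so the same function encrypts and decrypts.
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"shared-secret"
ciphertext = xor_cipher(key, b"attack at dawn")
plaintext = xor_cipher(key, ciphertext)   # same key recovers the message
```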
Asymmetric encryption
As the name implies, the public and private keys are different.
Public keys are generally public and accessible to all, and private keys are generally held by individuals and cannot be acquired by others.
The advantage is that the public and private keys are separate, which makes them easy to manage and makes key distribution easy to accomplish.
The disadvantage is that the encryption and decryption speed is slow.
Representative algorithms include RSA, ElGamal, and the elliptic-curve family of algorithms.
Asymmetric encryption is generally suitable for signature scenarios or key negotiation, but not for encrypting and decrypting large amounts of data.
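A minimal textbook-RSA sketch with tiny primes (utterly insecure, purely to illustrate the public/private key split) shows the encrypt-with-public, decrypt-with-private flow:

```python
# Textbook RSA with tiny primes -- insecure, for illustration only.
p, q = 61, 53
n = p * q                 # modulus: 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    # Anyone holding the public key (e, n) can encrypt.
    return pow(m, e, n)

def decrypt(c: int) -> int:
    # Only the holder of the private key (d, n) can decrypt.
    return pow(c, d, n)

m = 65
c = encrypt(m)
recovered = decrypt(c)
```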
Combination mechanism
A typical approach is to first use (computationally expensive) asymmetric encryption to negotiate a temporary symmetric session key, and then use symmetric encryption to encrypt and decrypt the bulk of the data being transferred.
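A self-contained toy sketch of this hybrid mechanism follows (textbook RSA with tiny primes for the key exchange, an XOR keystream as the stand-in symmetric cipher; both are insecure and for illustration only):

```python
import hashlib
import secrets

# Toy RSA key pair for the receiver (insecure textbook parameters).
p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR with a hash-derived keystream.
    ks = b""
    ctr = 0
    while len(ks) < len(data):
        ks += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, ks))

# 1. Sender picks a random session key (small enough for the toy modulus).
session_key_int = secrets.randbelow(n - 2) + 2
# 2. Sender wraps the session key with the receiver's RSA public key.
wrapped_key = pow(session_key_int, e, n)
# 3. Bulk data is encrypted cheaply with the symmetric session key.
session_key = session_key_int.to_bytes(4, "big")
ciphertext = xor_cipher(session_key, b"a large message ...")
# 4. Receiver unwraps the session key with the RSA private key ...
recovered_key = pow(wrapped_key, d, n).to_bytes(4, "big")
# ... and decrypts the bulk data symmetrically.
plaintext = xor_cipher(recovered_key, ciphertext)
```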
Digital signatures and digital certificates
Digital signatures
Similar to signing a paper contract to confirm its content, a digital signature is used to confirm the integrity and origin of digital content.
Suppose A sends a file to B. A computes a digest of the file, encrypts the digest with A's own private key, and sends both the file and the encrypted string to B. On receiving them, B decrypts the encrypted string with A's public key to recover the original digest, and compares it with a digest B computes from the file. If they match, the file was indeed sent by A and its contents have not been modified.
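The signing flow just described can be sketched with textbook RSA (tiny, insecure parameters; the SHA-256 digest is reduced modulo n only so it fits the toy modulus):

```python
import hashlib

# Toy RSA pair for the signer A (insecure textbook parameters).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def digest(data: bytes) -> int:
    # Reduce the SHA-256 digest modulo n so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    # A "encrypts" the digest with the *private* key.
    return pow(digest(data), d, n)

def verify(data: bytes, signature: int) -> bool:
    # B "decrypts" the signature with A's *public* key and compares digests.
    return pow(signature, e, n) == digest(data)

doc = b"contract contents"
sig = sign(doc)
ok = verify(doc, sig)
# A tampered document will fail verification (with overwhelming probability
# at real key sizes; this toy modulus is far too small for actual security).
```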
Multiple signatures
For n key holders, a message carrying valid signatures from at least m of them (n ≥ m ≥ 1) is considered lawful; this is called a multi-signature.
Here n is the number of public keys provided, and m is the minimum number of signatures that must match those public keys.
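The m-of-n counting logic can be sketched as below (toy_sign here is a keyed hash standing in for a real signature scheme, purely for illustration):

```python
import hashlib

def toy_sign(message: bytes, key: bytes) -> str:
    # Toy stand-in for a real signature scheme: keyed hash of the message.
    return hashlib.sha256(key + message).hexdigest()

def multisig_valid(message: bytes, signatures, keys, m: int) -> bool:
    """Return True if at least m signatures match m *distinct* keys."""
    matched = set()
    for sig in signatures:
        for i, key in enumerate(keys):
            if i not in matched and toy_sign(message, key) == sig:
                matched.add(i)
                break
    return len(matched) >= m

keys = [b"key-A", b"key-B", b"key-C"]                  # n = 3 holders
msg = b"transfer 10 coins"
sigs = [toy_sign(msg, keys[0]), toy_sign(msg, keys[2])]
two_of_three = multisig_valid(msg, sigs, keys, m=2)    # enough signers
one_only = multisig_valid(msg, sigs[:1], keys, m=2)    # too few signers
```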
Group Signature
Ring Signature
The ring signature was first proposed by the three cryptographers Rivest, Shamir, and Tauman in 2001. Ring signatures are a simplified kind of group signature.
The signer first selects a temporary set of signers that includes himself. He then generates the signature independently, using his own private key and the public keys of the others in the set, without needing their help. The other members of the set may not even know they are included.
Digital certificates
A digital certificate is used to prove to whom a public key belongs.
For digital signature applications, it is important to distribute the public key. Once the public key is replaced, the entire security system is destroyed.
How can we make sure that a public key really is the original public key of the person it claims to belong to?
This requires a digital certificate mechanism.
As the name implies, a digital certificate is like a certificate that attests to information and its legitimacy. It is issued by a Certification Authority (CA).
The content of a digital certificate may include the version, serial number, signature algorithm type, issuer information, validity period, subject, the certified public key, the CA's digital signature, and other information.
The most important items are the certified public key and the CA's digital signature. As long as the certificate verifies, it proves that the public key is legitimate, because the certificate carries the CA's digital signature.
Going further, how do we know the CA's own signature is legitimate?
Similarly, whether a CA's digital signature is legitimate is certified by that CA's own certificate. Mainstream operating systems and browsers pre-install the certificates of certain CAs (accepting them as legitimate), and all signatures verified against those certificates are then naturally considered legitimate.
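This chain of trust can be sketched as a toy (textbook RSA with tiny primes; the names make_keypair and the certificate layout are invented for illustration and match no real standard such as X.509):

```python
import hashlib

def make_keypair(p: int, q: int, e: int):
    # Toy textbook-RSA key pair; returns (public, private).
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def sign(priv, data: bytes) -> int:
    d, n = priv
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(pub, data: bytes, sig: int) -> bool:
    e, n = pub
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(sig, e, n) == h

# The CA's public key is pre-trusted (like the certificates shipped
# with an operating system or browser).
ca_pub, ca_priv = make_keypair(61, 53, 17)
subject_pub, _ = make_keypair(101, 113, 3)

# A "certificate": the subject's identity and public key, signed by the CA.
cert_body = b"subject=alice;pubkey=" + repr(subject_pub).encode()
certificate = (cert_body, sign(ca_priv, cert_body))

# Anyone holding the trusted CA public key can now check that
# subject_pub really was certified for alice.
body, ca_sig = certificate
trusted = verify(ca_pub, body, ca_sig)
```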
The PKI system presented in the later chapters provides a complete set of certificate management frameworks.
PKI system
PKI (Public Key Infrastructure) is not a single technology but a framework and set of specifications that integrates multiple cryptographic methods to achieve secure and reliable transmission of messages and identities.
In general, the following components are included:
CA (Certification Authority): responsible for issuing and revoking certificates, and for receiving requests from the RA;
RA (Registration Authority): verifies the identity of users and the legitimacy of their data, handles registration, and forwards audited requests to the CA;
Certificate database: stores certificates, generally implemented with an LDAP directory service; the standard format follows the X.500 series.
The CA is the core component, mainly responsible for managing public keys. As covered in the previous section, there are two kinds of key pairs: those used for signing and those used for encryption/decryption, called signing key pairs and encryption key pairs respectively.
When a user requests a certificate under the PKI system, the certificate and private key can in general be generated by the CA, or the user can generate the public/private key pair himself and then have the public key certified by the CA.
Merkle tree
A Merkle tree (also called a hash tree) is typically a binary tree consisting of a root node, a set of intermediate nodes, and a set of leaf nodes. The bottom leaf nodes contain the stored data or its hash values; each intermediate node is the hash of the contents of its two children, and the root node likewise is the hash of the contents of its two children.
Furthermore, the Merkle tree can be generalized to the multi-way (k-ary) case.
The key property of a Merkle tree is that any change in the underlying data propagates to the parent node, and onward all the way up to the root.
Typical application scenarios of Merkle trees include:
Quickly comparing large amounts of data: when two Merkle tree roots are the same, the data they represent must be the same.
Quickly locating modifications: for example, if the data in D1 is modified, N1, N4, and the Root are affected. Therefore, by following Root -> N4 -> N1, the change to D1 can be located quickly.
Zero-knowledge proof: for example, to prove that a data set (D0 ... D3) includes a given item D0, one simply constructs a Merkle tree and publishes N0, N1, N4, and Root. The existence of D0 can then easily be verified, without revealing anything about the other data.
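The structure can be sketched in a few lines of Python. The pairing convention H(left + right) and duplicating an odd trailing node are assumptions of this sketch; real systems fix their own conventions.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves) -> bytes:
    """Compute a Merkle root: hash the leaves, then repeatedly hash
    adjacent pairs, duplicating the last node when a level is odd."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"D0", b"D1", b"D2", b"D3"]
root = merkle_root(blocks)
# Any change to a leaf propagates all the way up to a different root:
changed = merkle_root([b"D0", b"D1-modified", b"D2", b"D3"])
```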
Homomorphic encryption
Defined
Homomorphic encryption is a special encryption method that allows ciphertext to be processed directly: the output is still encrypted, and decrypting it yields the same result as performing the same processing on the plaintext and then encrypting. From the algebraic point of view, the scheme is a homomorphism.
If, for some operator \triangle{}, the encryption algorithm E and the decryption algorithm D satisfy:
$$
E(x \triangle{} y) = E(x) \triangle{} E(y)
$$
then the scheme is homomorphic with respect to that operation.
Algebraically, homomorphisms include additive homomorphism, multiplicative homomorphism, subtractive homomorphism, and divisive homomorphism. A scheme satisfying both additive and multiplicative homomorphism is an algebraic homomorphism, i.e., fully homomorphic. One satisfying all four is called an arithmetic homomorphism.
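Textbook RSA gives a concrete example of multiplicative homomorphism: multiplying two ciphertexts yields the ciphertext of the product of the plaintexts (tiny insecure parameters, illustration only):

```python
# Textbook RSA is multiplicatively homomorphic:
#   E(x) * E(y) mod n == E(x * y mod n)
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def E(m: int) -> int:
    return pow(m, e, n)

def D(c: int) -> int:
    return pow(c, d, n)

x, y = 7, 9
# Multiplying ciphertexts == encrypting the product of plaintexts.
product_cipher = (E(x) * E(y)) % n
```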
History
The problem of homomorphic encryption was first posed by Ron Rivest, Leonard Adleman, and Michael L. Dertouzos in 1978, but the first fully homomorphic scheme was not demonstrated until Craig Gentry's construction in 2009.
Algorithms satisfying only additive homomorphism include Paillier and Benaloh; algorithms satisfying only multiplicative homomorphism include RSA and ElGamal.
Homomorphic encryption matters greatly in the cloud era. At present, for security reasons, users dare not place sensitive information directly on a third-party cloud for processing. With a practical homomorphic encryption scheme, they could use all kinds of cloud services with confidence.
Unfortunately, the known homomorphic encryption schemes consume enormous computation time and are still far from practical.
Functional encryption
A problem related to homomorphic encryption is functional encryption.
Homomorphic encryption protects the data itself; functional encryption, as the name implies, protects the processing function itself, i.e., it allows data to be processed without a third party seeing how the processing is done.
It has been proven that no scheme exists that supports arbitrary general functions with arbitrarily many keys; at present only schemes with a single key for one specific function are achievable.
Zero-knowledge proof
A prover convinces a verifier that an assertion is correct, without providing the verifier with any other useful information.
For example, A proves to B that he possesses a certain item, yet B gains nothing beyond that fact: B cannot obtain the item, and cannot use A's proof to convince anyone else that B owns it.