Make your software behave: Cryptography essentials, using hash algorithms for data integrity and authentication

Gary McGraw and John Viega
Reliable Software Technologies
July 7, 2000

Content:
Unidirectional Functions
Hash Function
Internet distribution
Authentication Problems
Message authentication code
Replay attacks
Telnet protocol
Other Attacks
Digital Signature
Digital signature Problems
What is next?
References
About the author

So far in this series on cryptography, Gary and John have discussed the two most common forms of encryption algorithm: public key cryptosystems, such as RSA, and symmetric algorithms, such as DES. These are the most common means of addressing data confidentiality. They also discussed the importance of using well-known algorithms rather than inventing your own, and introduced the risks commonly encountered when implementing cryptography in programs. In this installment, they introduce hash algorithms and examine common approaches to data integrity and authentication.

Unidirectional Functions
A hash algorithm is a unidirectional (one-way) function. That is, it takes a plaintext string and transforms it into a small piece of ciphertext that cannot be used to reconstruct the original plaintext. Clearly, some data must be lost in the transformation for such a function to work.

At first glance, one-way functions seem useless, because you cannot recover the plaintext from the ciphertext that a one-way computation produces. Why would you want to compute a cipher that cannot be undone? In fact, functions that are almost one-way are very useful, because in essence all public key functions are one-way functions with "trapdoors." A good candidate function for public key cryptography is one that is easy to compute in one direction and extremely difficult to compute in the other direction unless you know some secret. This is why public key algorithms are based on factoring and other hard mathematical problems.

Hash Function
As it turns out, true one-way functions are useful too. These functions are usually called hash functions, and their result is typically called a cryptographic hash value, cryptographic checksum, cryptographic fingerprint, or message digest. These functions play an important role in many cryptographic protocols.

The idea is to take a piece of plaintext and transform it into a (usually smaller) piece of ciphertext in a way that cannot be reversed. Ideally, every possible plaintext would hash to a unique ciphertext, but that is not what actually happens. In most cases, an infinite number of different strings produce exactly the same hash value. With a good cryptographic hash function, however, it is difficult in practice to find two intelligible strings that hash to the same value. Another property of a good hash function is that the output does not reflect the input in any discernible way.

Hash functions usually produce digests of a constant size. Many algorithms produce very small digests. However, the security of an algorithm depends largely on the size of its digest. We recommend choosing an algorithm that provides a digest of no less than 128 bits. SHA-1, which provides a 160-bit hash, is a good hash function.
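The constant digest size, and the way the output hides the input, can be seen in a short sketch. It uses Python's standard hashlib module; the helper names digest_bits and differing_bits are ours, and SHA-256 appears alongside SHA-1 only for comparison.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length, in bits, that `algorithm` produces for `data`."""
    return len(hashlib.new(algorithm, data).digest()) * 8

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

short_input = b"a"
long_input = b"a" * 1_000_000

# The digest size depends on the algorithm, never on the input length.
print(digest_bits("sha1", short_input), digest_bits("sha1", long_input))      # 160 160
print(digest_bits("sha256", short_input), digest_bits("sha256", long_input))  # 256 256

# Changing a single character of the input changes the digest unpredictably.
d1 = hashlib.sha1(b"pay Bob $5").digest()
d2 = hashlib.sha1(b"pay Bob $1").digest()
print(differing_bits(d1, d2), "of 160 bits differ")
```

For a well-behaved hash function, roughly half of the output bits should differ between the two nearly identical inputs.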

Hash functions can be used to ensure data integrity, much like a traditional checksum. If you publish a cryptographic hash along with a document, anyone can verify the hash, assuming they know the hash algorithm. Most hash algorithms used in practice are published and well understood. Once again, using a proprietary cryptographic algorithm, hash functions included, is usually a bad idea.

Internet distribution
Consider the distribution of software packages on the Internet. In the past, software packages available via FTP came with an associated checksum. The idea was to download the software and run a program to compute your own version of the checksum. You could then compare the self-computed checksum with the one available on the FTP site to make sure the two matched, ensuring data integrity (of a sort) over the wire. The problem is that this old-fashioned approach is not cryptographic at all. First, with many checksum techniques it is possible to maliciously modify the downloaded program so that the modified program still produces exactly the same checksum. Second, a Trojan-horse version of a software package can easily be posted on the FTP site together with its associated (poorly protected) checksum. Cryptographic hash functions can be used as drop-in replacements for old-fashioned checksum algorithms. They have the advantage of making it extremely difficult to tamper with the delivered code.
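As a minimal sketch of the consumer's side of this check, the following computes a file's hash and compares it with the published value. The function names are hypothetical, and SHA-256 is our choice of default; any hash supported by Python's hashlib could be substituted.

```python
import hashlib
import hmac

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_hash(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the locally computed digest with the one published by the distributor."""
    computed = file_digest(path, algorithm)
    return hmac.compare_digest(computed, published_hex.strip().lower())
```

A mismatch means the download differs from what the publisher hashed. A match says nothing about whether the published hash itself can be trusted.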

Be forewarned: this distribution scheme still has a problem. What if you, as the software consumer, download the wrong checksum? For example, suppose we distribute the "xybench" software package. One night, some attackers break into the distribution machine and replace xybench with a slightly modified version containing a malicious Trojan horse. The attackers can also replace the publicly distributed hash with the hash of the Trojan-horse copy. When an innocent user downloads the target package, they get the malicious copy. The victim also downloads the cryptographic checksum and tests it against the package. It checks out, and the malicious code appears safe to use. Clearly, a hash alone is not a complete solution if we cannot ensure that the hash itself has not been modified. In short, we need a way to authenticate the hash.

Authentication Problems
There are two cases to consider when it comes to authentication. We may want everyone to be able to verify the hash. If so, we can use digital signatures based on PKI, discussed below. Or we may want to restrict who can verify the hash. For example, suppose we post an anonymous message containing complete source code to the sci.crypt newsgroup, but we want only our closest friends to be able to verify that we posted the message. A message authentication code (MAC) can be used for this purpose.

Message authentication code
A MAC uses a shared secret key. The receiver holds one copy, which can be used to authenticate the data in question, and the sender must hold another copy. A MAC can work in several ways. The first is to append the key to the end of the data before computing the digest. Without the key, nobody can confirm that the data has not been altered. Another, more computationally involved approach is to compute the hash as usual and then encrypt the hash with a symmetric algorithm (such as DES). To authenticate the hash, it must first be decrypted.
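The keyed-hash idea can be sketched with Python's standard hmac module. Note that this uses the HMAC construction with SHA-256, which is a substitution on our part for the simpler "append the key, then hash" scheme described above; the function names make_mac and verify_mac are hypothetical.

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a keyed digest over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag with the shared key and compare."""
    return hmac.compare_digest(make_mac(key, message), tag)

shared_key = b"known only to sender and receiver"  # placeholder key for illustration
message = b"xybench-1.0.tar.gz contents"
tag = make_mac(shared_key, message)

print(verify_mac(shared_key, message, tag))               # True
print(verify_mac(shared_key, message + b" trojan", tag))  # False
print(verify_mac(b"some other key", message, tag))        # False
```

Only someone holding the shared key can produce or check the tag, which is what restricts verification to the intended parties.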


Applied Cryptography
Many MAC constructions do not depend on hashing. Bruce Schneier's Applied Cryptography covers the topic in more detail (see References).

MACs are useful in many other settings as well. If you want basic message authentication without using encryption (perhaps for efficiency reasons), a MAC is the right tool for the job. Even if you already use encryption, a MAC is an excellent way to ensure that the encrypted bit stream is not maliciously modified in transit.

If designed carefully, a good MAC can help solve other common protocol problems. Many protocols suffer from so-called replay attacks (or capture-replay attacks). Suppose we send a request to our bank to transfer $50 from our account to John Doe's account. If John Doe intercepts the communication, he can send a copy of the same message to the bank later. The bank may well believe that we sent two requests.

Replay attacks
Replay attacks have proven to be a common problem in many real-world systems. Fortunately, clever use of a MAC can mitigate them. In the bank transfer example, assume we use a primitive MAC that hashes the request together with the key. To defend against replay, we can make sure the hash is always different. One obvious way to do this is to include a timestamp.


Be careful!
If the code is not written carefully, a subtle error can creep in. If an attacker can easily pick out the bits in your message that encode the time and replace them with a different, random set of bits, the code may not behave as intended. If your code only checks that the timestamp falls within the previous 60 seconds and does not check whether the timestamp comes from the future, the attacker may get lucky (and steal funds).
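The warning above amounts to checking both ends of the acceptance window. A minimal sketch follows; the function name, the 60-second constant, and the decision to reject every future timestamp outright (with no allowance for clock skew) are assumptions made for illustration.

```python
import time
from typing import Optional

MAX_AGE_SECONDS = 60.0  # hypothetical acceptance window

def timestamp_is_fresh(request_time: float,
                       now: Optional[float] = None,
                       max_age: float = MAX_AGE_SECONDS) -> bool:
    """Accept a timestamp only if it is neither expired nor from the future."""
    if now is None:
        now = time.time()
    age = now - request_time
    # A negative age means the timestamp lies in the future: reject it too.
    return 0 <= age <= max_age

print(timestamp_is_fresh(100.0, now=130.0))  # True: 30 seconds old
print(timestamp_is_fresh(100.0, now=161.0))  # False: expired
print(timestamp_is_fresh(200.0, now=130.0))  # False: from the future
```

The check only helps if the timestamp is covered by the MAC, so that an attacker cannot swap in different time bits without invalidating the tag.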

If the server finds that a request's timestamp has expired (after 60 seconds, say), it rejects the request. This may or may not be enough, because it still leaves a 60-second window in which a replay attack can occur. You might consider disallowing two requests in the same time unit and caching information about the valid requests that have arrived in the past 60 seconds. If you can live with one special case, namely that the same transaction can never legitimately be performed twice in the same time unit, this solution may be workable. There is a simpler solution, however. When you compute the MAC, hash not only the data and the key but also a unique, ordered sequence number. The remote host only needs to remember the last sequence number it processed and make sure it never processes a request older than the next expected sequence number. This is a commonly used approach.
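The sequence-number scheme might be sketched as follows. The class and function names are hypothetical, HMAC with SHA-256 stands in for the article's unspecified MAC, and the 8-byte big-endian encoding of the sequence number is an arbitrary choice made so that the number and the payload cannot be confused.

```python
import hashlib
import hmac

def tag_request(key: bytes, sequence: int, payload: bytes) -> bytes:
    """Compute a MAC over the sequence number and the payload together."""
    message = sequence.to_bytes(8, "big") + payload
    return hmac.new(key, message, hashlib.sha256).digest()

class Receiver:
    """Remembers the next expected sequence number and rejects anything older."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._next_expected = 0

    def accept(self, sequence: int, payload: bytes, tag: bytes) -> bool:
        expected = tag_request(self._key, sequence, payload)
        if not hmac.compare_digest(expected, tag):
            return False  # forged or altered request
        if sequence < self._next_expected:
            return False  # replay of a request already processed
        self._next_expected = sequence + 1
        return True

key = b"shared bank key"  # placeholder key for illustration
bank = Receiver(key)
request = b"transfer $50 to John Doe"
tag = tag_request(key, 0, request)

print(bank.accept(0, request, tag))  # True: first delivery
print(bank.accept(0, request, tag))  # False: captured and replayed
```

Because the sequence number is part of the MAC input, an attacker cannot simply bump the number on a captured request without knowing the key.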

In many cases, authentication is not really an issue. For example, consider using cryptographic hashes to authenticate users who log in to a machine from the console. On many systems, when a user first enters a password, the password itself is never actually stored. Instead, a cryptographic hash of the password is stored. Most users feel better knowing that system administrators cannot retrieve their passwords at will. Assuming the operating system can be trusted (a laughable assumption), we can take the database of cryptographically hashed passwords to be correct. When a user tries to log in and enters a password, the login program hashes it and compares the newly hashed password with the stored hash. If the two are equal, the user is assumed to have typed the right password, and login proceeds.
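A login check along these lines might look like the sketch below. It departs from the plain hashing described above by using a random salt and the PBKDF2 key-derivation function from Python's hashlib; that substitution, the iteration count, and the function names are all our assumptions.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # hypothetical work factor chosen for this sketch

def enroll(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, hash) to store; the password itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_login(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the typed password the same way and compare with the stored hash."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)

salt, stored = enroll("correct horse")
print(check_login("correct horse", salt, stored))  # True
print(check_login("wrong guess", salt, stored))    # False
```

The stored record contains only the salt and the derived hash, so an administrator reading the database cannot read the password back out of it.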

Telnet protocol
Unfortunately, architects and developers sometimes assume that the security of an authentication mechanism is not a problem when in fact it is. For example, consider the Telnet protocol. Most Telnet servers take a user name and password as input. They then hash the password, or perform some similar transformation, and compare the result with the one in a local database. The problem is that with the Telnet protocol, the password travels over the network in plaintext. Anyone who can listen on the network with a packet sniffer can discover the password. Telnet authentication provides a weak barrier that potential attackers can easily clear. Many well-known protocols (including FTP, POP3, and most versions of IMAP) have similar authentication mechanisms.

Of course, there are many other important security considerations surrounding the implementation of a password authentication system. That topic is big enough to fill an article of its own.

Other Attacks
Any good cryptographic hash algorithm should have the following property: even given a known message and its associated message hash, it should be difficult to find another plaintext that produces the same hash. Deliberately searching for collisions means mounting a brute-force attack, which is usually hard. It is especially hard for an attacker to come up with a second plaintext document that is anything other than a jumbled, meaningless string.

Another type of attack on cryptographic hashes is much easier to carry out than the average brute-force attack. Consider the following scenario. Alice shows Bob a document, along with a cryptographic hash to authenticate it, stating that Alice agrees to pay Bob $5 per widget. Bob does not want to keep the document on his server, so he stores only the cryptographic hash. Alice wants to pay only $1 per widget, so she would like to create a second document that yields the same hash value as the $5 document, then take it to court and sue Bob for overcharging her. When she goes to court, Bob will present the hash value, believing that Alice's document cannot possibly hash to that value because it is not the original document she showed him. If the attack succeeds, Alice will be able to demonstrate that her forged document does hash to the value Bob stored, and she will win her case.

But how does she do this? Alice uses what is called a birthday attack. In this attack, she creates two documents, one stating $5 per widget and the other stating $1 per widget. Then, in each document, she identifies N places where a cosmetic change can be made (for example, places where a space can be replaced by a tab). A good value for N is usually half the bit length of the resulting hash output, plus 1 (so if we call the bit length of the hash output M, then N = M/2 + 1). For a 64-bit hash algorithm, she would select 33 places in each document. She then iterates through the permutations of each document, computing and storing the hash values. In general, she can expect to find two documents that hash to the same value after hashing about 2^(M/2) messages. This is far more efficient than a brute-force attack, where the expected number of messages to hash is 2^(M-1). If Alice would need 1 million years to carry out a successful brute-force attack, she might be able to complete a successful birthday attack within a week. Bob should therefore insist that Alice use an algorithm whose digest is large enough that she cannot complete a birthday attack in any reasonable amount of time.
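The attack is easy to demonstrate against a deliberately weakened hash. The sketch below truncates SHA-256 to M = 16 bits so that it finishes in moments; the truncation, the sample contract wording, and all function names are assumptions for illustration. It uses 12 variable places rather than the M/2 + 1 = 9 suggested above, purely to make a collision all but certain in a small demo.

```python
import hashlib
from itertools import product
from typing import Iterator, Optional, Tuple

def short_hash(text: str, bits: int = 16) -> int:
    """A deliberately weak hash: the top `bits` bits of SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def variants(document: str, places: int) -> Iterator[str]:
    """Yield 2**places variants, writing each of the first `places` spaces as ' ' or a tab."""
    words = document.split(" ")
    if len(words) - 1 < places:
        raise ValueError("document has too few spaces")
    for choice in product(" \t", repeat=places):
        parts = []
        for i, word in enumerate(words[:-1]):
            parts.append(word + (choice[i] if i < places else " "))
        parts.append(words[-1])
        yield "".join(parts)

def birthday_collision(doc_a: str, doc_b: str, places: int,
                       bits: int = 16) -> Optional[Tuple[str, str]]:
    """Return a variant of each document with equal short hashes, if one exists."""
    seen = {}
    for v in variants(doc_a, places):
        seen.setdefault(short_hash(v, bits), v)
    for v in variants(doc_b, places):
        match = seen.get(short_hash(v, bits))
        if match is not None:
            return match, v
    return None

DOC_5 = "Alice agrees to pay Bob $5 for each widget delivered under this contract, payable on receipt."
DOC_1 = "Alice agrees to pay Bob $1 for each widget delivered under this contract, payable on receipt."

pair = birthday_collision(DOC_5, DOC_1, places=12)
if pair is not None:
    five, one = pair
    print(short_hash(five) == short_hash(one))  # the two contracts share a hash
```

Both colliding documents read identically to a human apart from invisible whitespace, which is exactly what makes the attack practical against short digests.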

To get the same security against birthday attacks that a symmetric cipher with a key length of P provides, you should choose a digest algorithm that produces digests of size P * 2. For applications with high security requirements, it is therefore a good idea to require hash algorithms that produce 256-bit or even 512-bit message digests.

Which hash algorithm should you use?
We particularly like SHA-1, and Bruce Schneier recommends it as well. However, if you need a hash longer than 160 bits, SHA-1 is not sufficient. For larger hashes, try a symmetric cipher adapted to perform hashing. The GOST hash algorithm is a good example: it is derived from the GOST cipher and has a length of 256 bits. Longer hash lengths may require some coding to adapt a symmetric cipher. Schneier's Applied Cryptography describes how to construct such algorithms. Neither SHA-1 nor the GOST hash algorithm carries any intellectual property restrictions.

Digital Signature
The idea behind digital signatures is to mimic traditional handwritten signatures. The goal is to be able to "sign" a digital document in such a way that the signature carries the same legal weight as a physical signature. To serve that purpose, digital signatures must be at least as good as handwritten signatures in the following respects:

    • The signature should be proof of authenticity. Its presence on a document should be convincing evidence that the person whose signature appears there signed the document deliberately.
    • The signature should be impossible to forge. Consequently, the person who signed the document should not be able to claim that the signature is not theirs.
    • Once the document is signed, it should be impossible to alter the document without detection, so that the signature can then be invalidated.
    • It should be impossible to transplant the signature onto another document.

Even for handwritten signatures, these goals are only conceptual and do not really reflect reality. For example, it is possible to forge a signature, even though few people are skilled enough to do it. Nevertheless, signatures are rarely abused and tend to hold up well in court. In short, ink signatures are a good enough solution.

Digital signatures can be at least as good as physical signatures. This often comes as a surprise to people who think of a digital signature as a signature file (a string of ASCII) of the kind often placed at the end of an e-mail message. If that were what digital signatures amounted to, they would be useless. It would be easy to copy the signature out of one file and attach it directly to another to produce a forgery. It would also be easy to modify a signed document without anyone noticing. Thankfully, this is not how digital signatures work.

Most digital signature systems use a combination of public key cryptography and cryptographic hash algorithms. As we have explained, public key cryptosystems typically encrypt a message with the receiver's public key, and the receiver then decrypts it with the corresponding private key. The private key can also be used to encrypt a message that can be decrypted only with the corresponding public key. If a person keeps their private key completely private (you haven't been hacked lately, have you?), the ability to decrypt a message using the corresponding public key constitutes proof that the person in question encrypted the original message.

Digital signatures are useful for more than signing documents. They can be used for almost any authentication need. For example, they are often used together with encryption to provide both data privacy and data authentication.

A digital signature for a document is usually constructed by cryptographically hashing the document and then encrypting the hash with a private key. The resulting ciphertext is called the signature. Anyone can validate the signature by hashing the document themselves, decrypting the signature (using the public key or a shared key), and comparing the two hashes. If the two hashes are equal, the signature is considered valid (assuming the person validating the signature believes that the public key they used really belongs to you).
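The hash-then-encrypt construction can be illustrated with textbook RSA on small numbers. This is a toy sketch only: the primes (two known Mersenne primes), the lack of padding, and the function names are our assumptions, and nothing here is suitable for real use. As the article advises elsewhere, real applications should rely on the signature primitives of an established package.

```python
import hashlib
from math import gcd

# Toy key material: two known Mersenne primes. Far too small and too
# structured for real use; chosen only so the example is self-contained.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                # public modulus
PHI = (P - 1) * (Q - 1)
E = 65537                # public exponent
assert gcd(E, PHI) == 1
D = pow(E, -1, PHI)      # private exponent (needs Python 3.8+)

def document_hash(document: bytes) -> int:
    """Hash the document and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """'Encrypt' the hash with the private key; the result is the signature."""
    return pow(document_hash(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == document_hash(document)

contract = b"Alice agrees to pay Bob $5 per widget."
signature = sign(contract)
print(verify(contract, signature))                                   # True
print(verify(b"Alice agrees to pay Bob $1 per widget.", signature))  # False
```

The second check fails because the altered document hashes to a different value than the one recovered from the signature.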

The signature does not need to be stored with the document. Moreover, the signature applies to any identical digital copy of the document. The signature can also be copied, but it cannot be applied to any other document, because the hash computed from that document will not match the decrypted hash.

Digital signature Problems
One problem with digital signatures is non-repudiation. People can always claim that their key was stolen. Nevertheless, digital signatures are becoming widely accepted as a legal alternative to physical signatures, because they come at least as close to meeting the goals listed above as physical signatures do. At present, at least 30 states have digital signature laws, and more states are likely to legislate (unless the United States Congress passes a national law first).

Most public key algorithms, including RSA and ElGamal, can easily be extended to support digital signatures. In fact, a good software package that supports one of these algorithms should also support digital signatures. We recommend using your favorite public key cryptography algorithm for digital signatures. Try to use the built-in primitives rather than building your own construction out of an encryption algorithm and a hash function.


Cool tricks in Applied Cryptography
Bruce Schneier discusses plenty of cool things you can do with cryptography. For example, he outlines how to create your own digitally authenticated e-mail system. He also discusses ways to split a cryptographic secret among N people so that the secret can be revealed only when M of those N people pool their resources. The book is well worth a careful read if you want to understand what cryptography has to offer.

What is next?
In future articles, we will cover several topics. The most important is key management: how do you securely generate, store, change, destroy, or transmit cryptographic keys? We will also get down to practical matters and examine some real cryptographic software packages, complete with code examples. We will look at implementations such as the Cryptix package for Java applications, SSL for C, and Kerberos. We will also provide links to other cryptographic software packages that are considered robust.

In the next column, we will dig into the details of how to add cryptography to software applications and (perhaps more importantly) how not to.

Cryptography is a huge field, yet it is only one aspect of software security. We cannot even pretend to understand it fully.

References

    • See the previous developerWorks columns on cryptography: "Make your software behave: Hide everything" and "Make your software behave: Proven cryptography".
    • Read Bruce Schneier's "Applied Cryptography".
    • Visit the Reliable Software Technologies website.

About the author

Gary McGraw is vice president of corporate technology at Reliable Software Technologies, based in Dulles, Virginia. He works in consulting services and research and helps set the direction of technical research and development. McGraw began as a research scientist at Reliable Software Technologies, focusing on software engineering and computer security. He holds a dual Ph.D. in cognitive science and computer science from Indiana University and a bachelor's degree in philosophy from the University of Virginia. He has written more than 40 peer-reviewed articles for technical publications and consults for major e-commerce vendors, including Visa and the Federal Reserve. He has also served as principal investigator on grants from the Air Force Research Laboratory, DARPA, the National Science Foundation, and NIST's Advanced Technology Program.

McGraw is a well-known authority on mobile code security. In partnership with Princeton professor Ed Felten, McGraw wrote "Java Security: Hostile Applets, Holes, & Antidotes" (Wiley, 1996) and "Securing Java: Getting Down to Business with Mobile Code" (Wiley, 1999). McGraw and RST co-founder Dr. Jeffrey Voas wrote "Software Fault Injection: Inoculating Programs Against Errors" (Wiley, 1998). McGraw writes regularly for several popular trade publications, and his articles are often quoted in national publications.

John Viega is a senior research associate at Reliable Software Technologies, co-founder of its software security group, and a senior consultant. He is the principal investigator on a DARPA-funded project on security extensions for standard programming languages. John has written over 30 technical publications on software security and testing. He is responsible for finding several well-publicized security vulnerabilities in major network and e-commerce products, including a recent break in Netscape's security. He is also an active member of the open source software community and wrote Mailman, the GNU mailing list manager, and, more recently, ITS4, a tool for finding security vulnerabilities in C and C++ code. Viega holds a master's degree in computer science from the University of Virginia.
