Personal privacy matters to everyone, but delivering it is a job full of trade-offs. For example, nobody would be happy with a plan to install cameras in the shower just to reorder soap automatically.
In the early days, people found email, online ordering, and smartphone apps so magical that they paid little attention to what happened to their private information.
Privacy-enhancing technologies let people control how much private information they share, while giving up only as much control as is needed to preserve functionality. They combine encryption with clever algorithms to build databases that can answer certain questions correctly, but only for the right people.
The field has matured considerably over the years, and there are now many methods and strategies that protect personal privacy well. They work by storing only enough information for the company to deliver the product, which avoids some of the dangers that arise when hackers or insiders gain access.
These methods have their limits. They withstand generic attacks, but they may collapse if attackers are better equipped or more targeted. In general, the amount of protection is proportional to the computing power the encryption requires. Basic protection may add no noticeable load to the system, but complete security may be out of reach even for cloud computing providers.
But these limits should not stop anyone from adding basic protections. No method is completely secure, but a few simple measures can guard against some of the attacks that come with adopting cloud computing services.
Here are nine methods and strategies for balancing personal privacy and functionality:
1. Use the features
Cloud computing providers understand that customers are concerned about security, and they are steadily adding features that make it easier to lock data down. For example, Amazon offers more than twenty products that help improve security. AWS Firewall Manager helps ensure that firewalls let in only the right packets. Amazon Macie scans your data for sensitive information that is too exposed. Google Cloud and Microsoft Azure each have their own security toolsets. Understanding all of these products may take a team, but they are the best place to start protecting cloud workloads.
2. Protect the secrets
Keeping passwords, encryption keys, and authentication parameters safe is hard enough when people only have to secure their own computers. It is much more complicated for cloud applications, especially when a team manages them. Vendors have designed a variety of tools to help. You still have to be careful with source code management, but these tools help handle secrets so that they can be added to cloud applications safely. Tools such as HashiCorp's Vault, Doppler's Enclave, AWS's Key Management Service, and Okta's API management tools can simplify the process. They still need to be used with care, but that beats writing the password in a notebook and locking it in an office.
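One common pattern behind these tools is to keep secrets out of source code and read them at run time from the environment, where a secrets manager can inject them. The following is a minimal sketch of that pattern only. The helper `load_secret`, the exception `MissingSecretError`, and the variable name `EXAMPLE_DB_PASSWORD` are hypothetical names chosen for illustration and do not come from any of the products above.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when an expected secret has not been provided."""


def load_secret(name: str) -> str:
    """Read a secret from the process environment instead of source code.

    Assumption: a secrets manager or deployment tool has already placed
    the value in the environment under the given name.
    """
    value = os.environ.get(name)
    if not value:
        # Report only the name of the secret, never its value.
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


# Example usage with a placeholder value set only for this demonstration.
os.environ["EXAMPLE_DB_PASSWORD"] = "placeholder-value"
password = load_secret("EXAMPLE_DB_PASSWORD")
print("secret loaded, length:", len(password))
```

Failing loudly when a secret is missing, and never echoing its value, keeps the secret out of both the repository and the logs.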
3. Consider dedicated hardware
It is best not to share server hardware with others. It is hard to believe that attackers would manage to land on the same machine as their target and then exploit an extreme technique such as Rowhammer, but some data may be worth that much effort. Cloud computing providers offer dedicated hardware for just such occasions. If the computing load is fairly stable, it may even be more cost-effective to run servers in an on-premises facility. Some people use the hybrid tools offered by cloud providers, while others prefer to run their own on-premises servers. Either way, full control of a server costs more than a shared one, but it also rules out many attacks.
4. Hash algorithms
One of the simplest solutions is to use a one-way function to hide personal information. These mathematical functions are designed to be easy to compute but practically impossible to reverse. If you replace someone's name with f(name), people browsing the database will see only the random-looking output of the one-way function.
The data may be unintelligible to a casual browser, but it is still useful. For example, to search for Bob's records, you can compute f(Bob) and use that value in the query. This approach is safe against casual browsers, who may find an interesting row in the database and try to work out what f(name) stands for. It will not stop a targeted attacker who already knows to look for Bob, because that attacker can compute f(Bob) too. More sophisticated methods can add further layers of protection.
The most common one-way functions are probably the Secure Hash Algorithms (SHA), a collection of functions approved by the National Institute of Standards and Technology. There are several versions, and weaknesses have been found in the earlier ones, so be sure to use a newer version.
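The f(name) lookup described above might be sketched as follows, using SHA-256 from Python's standard `hashlib` module. The function names, the normalization step (trimming and lowercasing), and the sample records are assumptions made for illustration. The keyed variant, built on the standard `hmac` module, is one possible "additional layer": without the key, an attacker cannot simply compute f(Bob).

```python
import hashlib
import hmac


def pseudonymize(name: str) -> str:
    """Return f(name): the SHA-256 digest of a normalized name, as hex."""
    normalized = name.strip().lower()  # assumption: names match case-insensitively
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def keyed_pseudonymize(name: str, key: bytes) -> str:
    """Keyed variant (HMAC-SHA-256): useless to anyone who lacks the key."""
    normalized = name.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# The table stores f(name) instead of the name itself.
records = {
    pseudonymize("Bob"): {"orders": 3},
    pseudonymize("Alice"): {"orders": 7},
}

# To find Bob's record, compute f("Bob") and use it as the lookup key.
print(records[pseudonymize("Bob")])
```

A plain hash of a guessable value such as a name can be defeated by hashing every likely name and comparing, which is why the keyed version is worth considering when the key can be stored separately from the database.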
5. Pure encryption
Good encryption is built into many layers of the operating system and file system. Turning it on is a great way to add basic protection against attackers and against anyone who might gain physical access to a device. If data is stored on a laptop, keeping it encrypted removes some of the worry about losing the machine.
Conventional encryption, however, is not one-way: there is a way to decrypt the data. Choosing it is usually unavoidable because people plan to use the data, but it opens another path for attackers. If the right key can decrypt the data, a copy of that key can be found and used.
6. Fake data
Although some people complain that "fake news" is disrupting the world order, fake data can offer protection. Instead of handing the real data set to partners or insiders who need it for artificial intelligence training or planning, some developers create fake versions of the data that share many of the same statistical properties.
For example, RTI created a fake version of the US Census data containing 110 million households and more than 300 million people, with no real personal information in it. The 300 million fake Americans live in more or less the same parts of the country as the real ones, and their personal details are very close to the real information. Researchers who predict the path of infectious diseases can do their work without access to real personal data.
An artificial intelligence company called Hazy is offering a Python-based tool that can run in a secure data center and generate synthetic versions of data that people can share more freely.
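The idea of fake data with similar statistical properties can be illustrated with a deliberately simple sketch. This is not how RTI or Hazy build their data sets; it is an assumed toy approach that fits a mean and standard deviation for one numeric column and category frequencies for another, then samples new records from those summaries. The records, field names, and functions are all hypothetical.

```python
import random
import statistics
from collections import Counter

# Hypothetical "real" records; in practice these would stay inside the secure environment.
real_records = [
    {"age": 23, "state": "OH"}, {"age": 35, "state": "CA"},
    {"age": 41, "state": "CA"}, {"age": 47, "state": "TX"},
    {"age": 52, "state": "OH"}, {"age": 58, "state": "CA"},
    {"age": 64, "state": "TX"}, {"age": 29, "state": "CA"},
]


def fit(records):
    """Summarize the real data: age mean and spread, plus state frequencies."""
    ages = [r["age"] for r in records]
    return {
        "age_mean": statistics.mean(ages),
        "age_stdev": statistics.stdev(ages),
        "states": Counter(r["state"] for r in records),
    }


def sample(model, count, rng):
    """Draw synthetic records from the fitted summaries only."""
    names = list(model["states"])
    weights = [model["states"][s] for s in names]
    synthetic = []
    for _ in range(count):
        age = max(0, round(rng.gauss(model["age_mean"], model["age_stdev"])))
        state = rng.choices(names, weights=weights, k=1)[0]
        synthetic.append({"age": age, "state": state})
    return synthetic


model = fit(real_records)
fake_records = sample(model, 1000, random.Random(42))
print(fake_records[:3])
```

Sampling each column independently discards correlations between columns (for example, between age and state), so realistic generators model the joint distribution. Even simple summaries can also leak information about small groups, so this sketch should not be read as a privacy guarantee.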
7. Differential privacy
The term differential privacy describes a general approach that adds just enough noise to the data to protect the private information in the data set while leaving enough information to be useful. For example, randomly adding or subtracting a few years from each person's age hides the exact birth year, but the average age stays almost the same.
The approach is most useful for large-scale statistical work by research groups. Individual entries may be corrupted by noise, but the overall results remain accurate.
Microsoft has begun to share White Noise, an open source tool built with Rust and Python to add fine-tuned noise to users' SQL queries.
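One standard way to add calibrated noise is the Laplace mechanism, sketched here for an average-age query. This is an illustrative example, not the API of White Noise or any other library. It assumes the number of records is public, that ages are clipped to a known range, and that `epsilon` is the privacy budget (smaller values mean more noise). All function names are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with epsilon-differential privacy.

    Values are clipped to [lower, upper], so changing one record can move
    the mean by at most (upper - lower) / n. That bound is the sensitivity,
    and the Laplace noise scale is sensitivity / epsilon.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [rng.randint(18, 90) for _ in range(1000)]  # made-up data
print("true mean: ", sum(ages) / len(ages))
print("noisy mean:", private_mean(ages, 0, 100, epsilon=1.0, rng=rng))
```

With a thousand records the noise on the average is small, which matches the point above: the aggregate stays accurate while any single person's contribution is masked. Production systems would use a cryptographically secure noise source rather than Python's `random` module.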
8. Homomorphic encryption
Most encryption algorithms scramble the data so thoroughly that no one can make sense of the result without the correct key. Homomorphic methods use a more sophisticated framework in which many basic arithmetic operations can be performed on the encrypted data without the key. People can add or multiply values without knowing the underlying information.
The simplest schemes are practical, but they are limited.
IBM is now sharing an open source toolkit for embedding homomorphic encryption in iOS and macOS applications, and promises Linux and Android versions soon. The tools are preliminary, but they make it possible to explore calculations as complex as training machine learning models without ever accessing unencrypted data.
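To make the idea of adding without knowing the values concrete, here is a toy sketch of the Paillier cryptosystem, a classic scheme that is homomorphic for addition only. It is not IBM's toolkit and is not secure as written: the primes are small, fixed, and publicly known, chosen only so the example runs quickly. The function names are hypothetical.

```python
import math
import secrets


def keygen(p: int = 2**31 - 1, q: int = 2**61 - 1):
    """Build a toy Paillier key pair from two (far too small) known primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n using generator g = n + 1 and a fresh random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n, which avoids one exponentiation.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
c_total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
print(decrypt(n, private_key, c_total))  # the sum, computed without decrypting the inputs
```

A server holding only `n` and the ciphertexts could compute `c_total` without ever seeing 1200 or 345. Schemes that support both addition and multiplication on the same ciphertexts are considerably more complex and slower.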
9. Keep nothing
Programmers can be pack rats, keeping data around in case it proves useful for debugging later. One of the simplest solutions is to design algorithms to be as stateless and log-free as possible. Once debugging is done, stop filling the disk with information. Just return the result and stop.
Keeping as little information as possible has its own risks, because it becomes harder to detect abuse or fix errors. On the other hand, there is no need to worry that attackers will dig through this digital clutter, because they cannot steal personal data that does not exist.
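A stateless, log-free design can be as plain as a function that computes its answer from its inputs and retains nothing about them. The sketch below is a hypothetical shipping-quote handler with made-up pricing rules. It keeps only an aggregate counter for monitoring, as one assumed compromise between observability and retention.

```python
from collections import Counter

# Aggregate metrics only: no inputs, no per-request records.
metrics = Counter()


def shipping_quote(postal_code: str, weight_kg: float) -> float:
    """Compute a quote from the inputs and keep nothing about the caller."""
    base = 4.0 if postal_code.startswith("9") else 5.5  # made-up pricing rule
    metrics["quotes_served"] += 1  # count the request, not its contents
    return round(base + 1.25 * weight_kg, 2)


print(shipping_quote("94110", 2.0))
print(dict(metrics))
```

The counter shows that the service is being used without recording who used it or which postal code was supplied, so there is nothing personal to steal from it later.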