I have said in many talks that the best way to improve your system is to first avoid doing "stupid things". I am not saying that you or the things you build are "stupid", but the consequences of some decisions are easy to overlook, and people do not realize how much trouble they cause for system maintenance, especially system upgrades. As a consultant I see this kind of thing everywhere, and I have yet to see it turn out well for the people who made such decisions.
Images, files, and binary data
The database supports BLOB data, so there is nothing wrong with inserting files into a BLOB column, right? Wrong! For one thing, large fields are awkward to work with in many database languages.
There are many problems with storing files in the database:
- Reading from and writing to a database will never be as fast as reading from and writing to the file system.
- Database backups become huge and time-consuming.
- Access to the files has to go through both your application layer and your database layer.
The last two are the real killers. Storing thumbnail images in the database? Then you cannot use nginx or another lightweight web server to serve them.
Make it easy on yourself. Simply store the relative path of each file on disk in the database, or use a service such as S3 or a CDN.
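As a rough illustration of that advice, here is a minimal Python sketch that writes the file to disk and keeps only its relative path in the database. It assumes SQLite and a local media directory purely to stay self-contained; the `uploads` table and the `save_upload`, `load_upload`, and `make_connection` names are made up for this example and do not come from the original article.

```python
import sqlite3
from pathlib import Path


def make_connection():
    """Create an in-memory SQLite database with a hypothetical uploads table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE uploads ("
        "id INTEGER PRIMARY KEY, "
        "relative_path TEXT NOT NULL, "
        "size_bytes INTEGER NOT NULL)"
    )
    return conn


def save_upload(conn, media_root, relative_path, payload):
    """Write the bytes to disk and record only the relative path in the database."""
    target = Path(media_root) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    cur = conn.execute(
        "INSERT INTO uploads (relative_path, size_bytes) VALUES (?, ?)",
        (relative_path, len(payload)),
    )
    conn.commit()
    return cur.lastrowid


def load_upload(conn, media_root, upload_id):
    """Look up the path in the database, then read the bytes from the file system."""
    row = conn.execute(
        "SELECT relative_path FROM uploads WHERE id = ?", (upload_id,)
    ).fetchone()
    if row is None:
        raise KeyError(upload_id)
    return (Path(media_root) / row[0]).read_bytes()
```

Because the row holds only a path, the files under the media root can be served directly by a web server such as nginx, and the database backup stays small.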
Short-lived data
Usage statistics, metrics, GPS locations, session data: any data that is only useful to you for a short period of time, or that changes frequently. If you find yourself using a scheduled task to delete rows that are only valid for an hour, a day, or a few weeks, you are going about it the wrong way. Tools such as redis, statsd/graphite, and Riak are better suited to this job. This advice also applies when you are collecting data that is only short-lived.
Of course, you can plant potatoes in your back garden with an excavator, but booking one and waiting for it to arrive is obviously slower than taking a shovel out of the shed. Pick the right tool for the task at hand.
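To make the contrast with a scheduled cleanup job concrete, here is a small Python sketch of per-key expiry. It is an in-process stand-in, not a Redis client: the `ExpiringStore` class and its methods are invented for illustration, and a real deployment would rely on the expiry that a store such as redis already provides (for example its `SET` command with the `EX` option).

```python
import time


class ExpiringStore:
    """Tiny in-process key-value store where each key carries its own expiry time."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        """Store a value together with the moment it stops being valid."""
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        """Return the value if it is still valid, otherwise drop it and return default."""
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: removed lazily on read, so no scheduled delete job is needed.
            del self._items[key]
            return default
        return value
```

The point of the sketch is the design choice: the lifetime is declared when the data is written, so nothing has to sweep a table afterwards.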
Log files
Storing log data in a database seems fine on the surface, and the argument "I may need to run complex queries on this data in the future" tends to win people over. Doing so is not terrible in itself, but storing log data in the same database as your production data is very bad.
Maybe your logging is quite conservative and writes only one entry per web request. Across every event on the entire site, that still produces a large number of database inserts, which compete for the database resources your users need. If your log level is set to verbose or debug, watch your database catch fire.
You should use something like Splunk, Loggly, or plain text files to store your log data. It may be less convenient to look through them this way, but you will not need to do so very often. Sometimes you will have to write some code to dig out the answer you want, but on the whole it is worth it.
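For the plain text file option, a minimal Python sketch using the standard library's `RotatingFileHandler` might look like the following. The logger name, size limit, and backup count are arbitrary assumptions for the example, and `build_logger` is a hypothetical helper, not something from the original article.

```python
import logging
import logging.handlers


def build_logger(log_path, max_bytes=1_000_000, backup_count=5):
    """Return a logger that appends to a text file and rotates it by size."""
    logger = logging.getLogger("app.requests")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these entries out of the root logger

    # Drop handlers left over from an earlier call so lines are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    handler = logging.handlers.RotatingFileHandler(
        log_path,
        maxBytes=max_bytes,        # start a new file once this size is reached
        backupCount=backup_count,  # keep at most this many rotated files
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A call such as `build_logger("app.log").info("GET /items/%d 200", 7)` appends one line to the file, and no database insert competes with user traffic.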
But wait a moment: you are a unique snowflake, and your problem is so different that putting one of the three things above into your database will be no problem. No, you are wrong. No, you are not special. Believe me.
Original article: Three things you should never put in your database