Relational database management systems still lead the industry. But even if you are a die-hard Oracle fan, deeply attached to PL/SQL and the venerable RAC architecture, keep an open mind and stay calm. Times have changed: we need to think carefully before starting a task, and we must never choose a solution based on personal likes and dislikes. This article lists 10 things you should avoid doing with a relational database.
I am a NoSQL enthusiast with a broad background in big data. That combination is no coincidence: as you can see, the topic database practitioners are most likely to be discussing today is "runaway data growth".
1. Search: Even the most dedicated Oracle experts try to avoid Oracle Text. Although Oracle has invested heavily in it as part of its database products, its real-world performance is lackluster. As a result, many users still build complex queries out of LIKE and OR, then complain about heavy load and poor performance, and the way Oracle sets up data access for search is itself extremely annoying. Oracle is of course not the only vendor that falls short on search: most other RDBMS products do not offer real search capabilities either.
With Hibernate Search, Apache Solr, or Autonomy, we can get much better retrieval performance. Don't hesitate to put them to work for full-text search.
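To see why dedicated search engines avoid the LIKE-and-OR pattern, it may help to look at the data structure they are generally built around: an inverted index that maps each term to the documents containing it. The following is only a minimal sketch of that idea in plain Python, not how Solr or Hibernate Search are actually implemented; the function names and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data, for illustration only.
docs = {
    1: "Red wool socks",
    2: "Blue cotton socks",
    3: "Red cotton shirt",
}
index = build_index(docs)
print(search(index, "red socks"))  # {1}
```

A lookup touches only the entries for the query terms, whereas a `LIKE '%red%'` predicate has to scan every row.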
2. Recommendations: I have used ATG and many other commercial products, and this feature is by far the most painful I have seen. The product tracks a large amount of everyday information about each user and tries to recommend other products that user may want. At every place I have worked, it was usually the first feature to be switched off because of scalability concerns.
Think about how a social network works. If I want to recommend to a user the socks that his or her friends, and friends of those friends, have bought, these multi-hop relationships put the RDBMS at a real disadvantage. To meet the requirement we need a self-joined table and several layers of queries, whereas in a graph database such as Neo4j it takes about two lines of code. You can reach the same goal by flattening the social graph and reshaping the data ahead of time, but then the relational database loses its real-time character.
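The self-join mentioned above can be sketched with Python's built-in `sqlite3` module. The `friendship` table, its columns, and the sample names are assumptions made up for this illustration; a real schema would differ. Note that this query covers exactly two hops, and each additional hop would require another join.

```python
import sqlite3

# In-memory database with a hypothetical friendship table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friendship (user_id TEXT, friend_id TEXT)")

edges = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "eve")]
# Friendship is symmetric here, so store each edge in both directions.
conn.executemany(
    "INSERT INTO friendship VALUES (?, ?)",
    edges + [(b, a) for a, b in edges],
)


def friends_of_friends(conn, user):
    """Return users exactly two hops from `user`, excluding direct friends."""
    rows = conn.execute(
        """
        SELECT DISTINCT f2.friend_id
        FROM friendship AS f1
        JOIN friendship AS f2 ON f2.user_id = f1.friend_id
        WHERE f1.user_id = ?
          AND f2.friend_id <> ?
          AND f2.friend_id NOT IN (
              SELECT friend_id FROM friendship WHERE user_id = ?
          )
        ORDER BY f2.friend_id
        """,
        (user, user, user),
    ).fetchall()
    return [row[0] for row in rows]


print(friends_of_friends(conn, "ann"))  # ['cat']
```

A graph database expresses the same question as a path pattern of a given length, which is why the article describes it as a couple of lines.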
3. High-frequency trading: You might think trading systems are the RDBMS's forte, since the data is transactional by nature, right? I even suspect that the first people to run high-frequency trading on NoSQL were developers who built NoSQL stores themselves. In high-frequency trading, low latency is the most important and most valuable property. Yes, with enough out-of-the-box thinking you can push latency down in an RDBMS, but keep in mind that relational databases were not designed for this kind of work.
Oracle tried to address the problem by acquiring TimesTen, which attempts to combine an in-memory database with the RDBMS. But a car with wings bolted on is still not a plane, so this can only be regarded as a modest improvement. Meanwhile, many high-frequency trading firms have chosen key-value stores such as Riak on their own initiative, or the more sophisticated GemFire.
4. Product catalog: This one may seem mundane, but as I mentioned in a previous article, one of the worst SQL query nightmares of my career involved mapping product data. I was working for a mobile phone manufacturer. As everyone knows, a single phone model "XYZ" can cover many variants, and those variants are given different nicknames in different markets. Even the same model ships with a range of different components. Such complex, fuzzy "classes" are very hard to flatten into tables, which makes a graph database such as Neo4j the right choice for this kind of work.
I ran into similar problems when I worked for a chemical company. The mapping scheme we chose at the time was clumsy and largely manual. Once the product information was moved to a graph database, the mapping became simple. Even document databases such as Couchbase 2.0 or MongoDB handle this better than relational databases.
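As a rough illustration of why a document model suits this kind of catalog, the sketch below keeps each model's variants, market nicknames, and components nested in a single record instead of spreading them over several joined tables. It uses plain Python dictionaries rather than any particular database, and every field name, nickname, and component value is invented for the example.

```python
# Hypothetical catalog: one document per model, variants nested inside.
catalog = [
    {
        "model": "XYZ",
        "variants": [
            {
                "market": "US",
                "nickname": "Falcon",
                "components": {"radio": "CDMA", "camera_mp": 8},
            },
            {
                "market": "EU",
                "nickname": "Kestrel",
                # Variants may carry fields that others lack entirely.
                "components": {"radio": "GSM", "camera_mp": 5, "fm_tuner": True},
            },
        ],
    },
]


def find_variants(catalog, model, **component_filters):
    """Return (market, nickname) for variants whose components match all filters."""
    matches = []
    for product in catalog:
        if product["model"] != model:
            continue
        for variant in product["variants"]:
            components = variant["components"]
            if all(components.get(k) == v for k, v in component_filters.items()):
                matches.append((variant["market"], variant["nickname"]))
    return matches


print(find_variants(catalog, "XYZ", radio="GSM"))  # [('EU', 'Kestrel')]
```

Because each variant can carry its own set of component fields, no schema change is needed when one market's version gains a part the others lack.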
5. Users/groups and ACLs: In some ways, LDAP is the original NoSQL database. It was designed specifically for users, groups, and ACLs. Unfortunately, many people misunderstand it, and companies use it for absurd and even terrible tasks. A number of companies have wrapped it in so much bureaucracy that developers resort to their own database tables just to get their daily work done without dealing with it. This obviously defeats the purpose of centralized access control. In my opinion, "user" and "role" tables are unnecessary in any enterprise environment and should be retired as soon as possible.
6. Log analysis: If you doubt how painful this is, turn on the log analysis feature of Hadoop or RHQ/JBoss ON for a small cluster of servers and set the log level to capture anything other than errors. The more complex the workload, the worse things get. Large volumes of loosely structured data such as logs are exactly where MapReduce on Hadoop and languages like Pig excel. Regrettably, most mainstream monitoring tools are still built around an RDBMS, even though this work does not need what a relational database offers: it is mostly analysis and summarization, and low latency is the primary requirement.
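The MapReduce style of log analysis can be sketched in a few lines: a map step turns each raw line into a partial count, and a reduce step merges the partial counts. This is a single-process illustration of the idea only, not Hadoop or Pig code; the log line format, the regular expression, and the sample lines are assumptions chosen for the example.

```python
import re
from collections import Counter
from functools import reduce

# Assumed line format: "<date> <time> <LEVEL> <logger> <message>".
LOG_LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<logger>\S+) ")


def map_line(line):
    """Map step: emit a count of 1 keyed by (level, logger), or nothing."""
    match = LOG_LINE.match(line)
    if match is None:
        return Counter()  # unparseable lines contribute nothing
    return Counter({(match["level"], match["logger"]): 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


# Hypothetical sample log lines.
lines = [
    "2012-11-01 10:00:00 INFO app.web request served",
    "2012-11-01 10:00:01 WARN app.db slow query",
    "2012-11-01 10:00:02 INFO app.web request served",
    "garbage line",
]

totals = reduce(reduce_counts, map(map_line, lines), Counter())
print(totals[("INFO", "app.web")])  # 2
```

Because each line is mapped independently and the merge is associative, the same two functions could be spread across many machines, which is what makes this shape of problem a good fit for a cluster.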
7. Media repository: An RDBMS stores the metadata acceptably (although Couchbase 2.0 or MongoDB does this better), but BLOB handling in relational databases is still inefficient after years of evolution. For images and other binaries, it is better to choose a distributed storage system or a clustered file system. Despite the disappointing performance, many CMS engines still push all storage into the RDBMS, so watch out for it.
8. Email: I know this is almost a consensus by now. Having finished a project that put email into an RDBMS, I found that many people already understand that email is really moderately unstructured data plus metadata, and relational databases are not good at storing it. We tuned the RDBMS as far as we could to get the most out of features such as BLOBs. But email management is about metadata, search, and content; there is no meaningful relational algebra among them, and transactions are not involved. A file system works well enough here, but a document database handles the metadata better.
9. Classified ads: Classifieds are a large collection of short but highly engaging entries that many users search and post. Craigslist, the well-known classifieds site, uses the document database MongoDB, which is strong at search and metadata management and offers as much consistency as classifieds actually need. When a document database is practically tailor-made for classified ads, the best thing an RDBMS can do is step aside.
10. Time series/forecasting: This is the broadest entry on the list, and it takes many forms, from commodities to data analysis to sunspot prediction. How well relational databases handle time-based problems has long been debated. The situation has certainly improved: over the past decade or more, RDBMSs have made real progress in the efficiency and functionality of time-related processing and have left their most serious flaws behind. However, if time series are your main workload, a solution such as Cassandra that integrates well with MapReduce clusters is the better fit. DataStax has made it clear that its Cassandra distribution will support time-series data, and other vendors have introduced similar features.
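One common way to model time series in a wide-column store is to partition points by series and day, and to keep each partition sorted by timestamp so that a time range becomes a contiguous slice. The sketch below imitates that layout in memory with plain Python. It is an assumption about the general modeling pattern, not Cassandra's API or DataStax's implementation, and the class, method names, and sample values are hypothetical.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timedelta


class TimeSeriesStore:
    """In-memory stand-in for a store partitioned by (series, day)."""

    def __init__(self):
        # (series, date) -> list of (timestamp, value), kept sorted by timestamp
        self._rows = defaultdict(list)

    def insert(self, series, ts, value):
        """Insert a point into its day partition, preserving sort order."""
        bisect.insort(self._rows[(series, ts.date())], (ts, value))

    def query_range(self, series, start, end):
        """Return points with start <= timestamp < end, in time order."""
        points = []
        day = start.date()
        while day <= end.date():
            row = self._rows.get((series, day), [])
            # A 1-tuple sorts before any (timestamp, value) pair with the
            # same timestamp, so these bounds give a half-open range.
            lo = bisect.bisect_left(row, (start,))
            hi = bisect.bisect_left(row, (end,))
            points.extend(row[lo:hi])
            day += timedelta(days=1)
        return points


# Hypothetical sample data, inserted out of order on purpose.
store = TimeSeriesStore()
store.insert("sunspots", datetime(2012, 11, 2, 5, 0), 9)
store.insert("sunspots", datetime(2012, 11, 1, 23, 0), 10)
store.insert("sunspots", datetime(2012, 11, 2, 1, 0), 12)

window = store.query_range(
    "sunspots", datetime(2012, 11, 1, 22, 0), datetime(2012, 11, 2, 2, 0)
)
print([value for _, value in window])  # [10, 12]
```

The point of the layout is that a range query reads a few partitions sequentially instead of filtering an entire table by timestamp.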
Where else do you use an RDBMS besides the areas above? Like everyone else, I cannot get through a day without an RDBMS, but is it really the best solution? Some die-hards will try to defend the RDBMS, but we have to admit that habit alone is not a good enough reason to use a relational database. The wise approach is to use the right tool for the job.