Big Data Entry-level learning: SQL and NoSQL databases

Source: Internet
Author: User

The big data boom of the past few years has produced a large number of Hadoop learners. Some are self-taught; others enroll in training courses. Anyone who has touched Hadoop knows that setting up each Hadoop component means preparing a runtime environment, editing configuration files, testing, and so on. It is a pit for all of us entry-level novices. With so many domestic Hadoop distributions, why does none of them seem to have filled this hole? Is it that they cannot solve it, or that they never thought of it?
Setting up the runtime environment is exactly such a pit. If the domestic companies doing low-level big data development cannot solve this problem, I do not think they count as qualified big data platform developers. Fortunately, in March I applied for and received a three-node release of DKHadoop, the open source Hadoop distribution from "Big Fast". This domestic distribution bundles a range of commonly used components, such as HDFS, HBase, Storm, Flume, Kafka, Mahout, and Elasticsearch (ES), so you no longer need to rack your brains over building and configuring the underlying platform; installation is simple. This is good news for Hadoop beginners.
I have digressed a little. I will share the installation and use of DKHadoop another time; today I want to share some basic big data content on databases: SQL and NoSQL. To understand these two types of database, it is enough to clarify the concepts and how they differ.
The two concepts:
1. SQL databases are relational databases. Main representatives: SQL Server, Oracle, MySQL (open source), PostgreSQL (open source).
2. NoSQL databases are non-relational databases. Main representatives: MongoDB, Redis, CouchDB.
The difference:
The differences between SQL and NoSQL databases are fairly large. They can be compared from the following aspects:
(1) Usage scenario: SQL is "digital": it is best suited to clearly defined, discrete items with precise specifications. Typical use cases are online marketplaces and banking systems. NoSQL is "analog": it is best suited to organic data without fixed requirements. Typical use cases are social networks, customer management, and web analytics systems.
(2) Storage: SQL data lives in tables with a fixed structure. For example, a table that stores a student's borrowing data:
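The original table is not preserved here, so the following is only a minimal sketch of what such a fixed-structure table might look like, using Python's built-in sqlite3 module. The column names and the sample row (including the borrower name "Bear Big" taken from point (4), the item, and the date) are assumptions made for illustration.

```python
import sqlite3

# In-memory database; the schema and sample row are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE borrowing (
        id       INTEGER PRIMARY KEY,
        student  TEXT NOT NULL,
        item     TEXT NOT NULL,
        borrowed TEXT NOT NULL  -- ISO date string
    )
    """
)
conn.execute(
    "INSERT INTO borrowing (id, student, item, borrowed) VALUES (?, ?, ?, ?)",
    (1, "Bear Big", "Hadoop handbook", "2018-03-01"),
)
rows = conn.execute(
    "SELECT id, student, item, borrowed FROM borrowing"
).fetchall()

# The table structure is fixed: a column that was never declared is rejected.
try:
    conn.execute(
        "INSERT INTO borrowing (id, student, nickname) VALUES (2, 'Bear Two', 'BB')"
    )
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True
```

Every row has the same columns, and adding a new kind of field means changing the table definition first.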

NoSQL storage is more flexible: data can be kept as JSON documents, hash tables (key-value pairs), or in other forms. For example, a JSON-like document can store the borrowing data from the table above:
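As a rough illustration of the document style, here is a sketch that uses plain Python dictionaries and the standard json module in place of a real document store. The field names, the second borrower "Bear Two", and all values are hypothetical.

```python
import json

# One borrowing record as a JSON-like document (hypothetical fields).
borrowing_doc = {
    "id": 1,
    "student": "Bear Big",
    "item": "Hadoop handbook",
    "borrowed": "2018-03-01",
}

# A second document can carry fields the first one lacks;
# no schema change is needed beforehand.
another_doc = {
    "id": 2,
    "student": "Bear Two",
    "items": ["SQL primer", "NoSQL primer"],
    "note": "borrowed two items at once",
}

collection = [borrowing_doc, another_doc]

# Round-trip through JSON text, as a document store would persist it.
encoded = json.dumps(collection, indent=2)
decoded = json.loads(encoded)
```

The two documents sit in the same collection even though their fields differ, which is the flexibility described above.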

(3) External data: In SQL, if you need to attach related external data, the normalized approach is to add a foreign key in the original table that points to a separate table holding that data. For example, to add reviewer information to the borrowing table, first create a reviewer table:

Then add a reviewer foreign key to the original borrowing table. If we need to update the reviewer's personal information, we only update the reviewer table; the borrowing table does not change.
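The two steps above can be sketched with sqlite3 as follows. The table layout, the phone numbers, and the second borrower are assumptions for illustration; only the reviewer name "Bear Three" and the borrower "Bear Big" come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Step 1: a separate reviewer table (hypothetical columns).
    CREATE TABLE reviewer (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        phone TEXT
    );
    -- Step 2: the borrowing table points at it through a foreign key.
    CREATE TABLE borrowing (
        id          INTEGER PRIMARY KEY,
        student     TEXT NOT NULL,
        item        TEXT NOT NULL,
        reviewer_id INTEGER REFERENCES reviewer(id)
    );
    INSERT INTO reviewer VALUES (1, 'Bear Three', '555-0100');
    INSERT INTO borrowing VALUES (1, 'Bear Big', 'Hadoop handbook', 1);
    INSERT INTO borrowing VALUES (2, 'Bear Two', 'SQL primer', 1);
    """
)

# The reviewer's details change in exactly one row of one table.
conn.execute("UPDATE reviewer SET phone = ? WHERE id = ?", ("555-0199", 1))

# Reading a borrowing together with its reviewer requires a join.
joined = conn.execute(
    """
    SELECT b.student, r.name, r.phone
    FROM borrowing AS b
    JOIN reviewer AS r ON r.id = b.reviewer_id
    ORDER BY b.id
    """
).fetchall()
```

Both borrowing rows see the new phone number after a single UPDATE, at the cost of a join on every read.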

In NoSQL, besides this normalized approach with a separate data set, we can also denormalize: put the external data directly into the original data set to improve query efficiency. The drawback is just as obvious: updating the reviewer data becomes more troublesome.
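A small sketch of the denormalized layout, again with plain Python dictionaries standing in for a document store. The helper update_reviewer_phone and all field values are hypothetical.

```python
# Each borrowing document embeds its own copy of the reviewer data.
borrowings = [
    {
        "id": 1,
        "student": "Bear Big",
        "item": "Hadoop handbook",
        "reviewer": {"name": "Bear Three", "phone": "555-0100"},
    },
    {
        "id": 2,
        "student": "Bear Two",
        "item": "SQL primer",
        "reviewer": {"name": "Bear Three", "phone": "555-0100"},
    },
]


def update_reviewer_phone(docs, reviewer_name, new_phone):
    """Rewrite every embedded copy of the reviewer; return how many documents changed."""
    touched = 0
    for doc in docs:
        reviewer = doc.get("reviewer")
        if reviewer and reviewer["name"] == reviewer_name:
            reviewer["phone"] = new_phone
            touched += 1
    return touched


# Reading needs no join: the reviewer is already inside the document.
phone_before = borrowings[0]["reviewer"]["phone"]

# Updating has to visit every document that embeds this reviewer.
touched = update_reviewer_phone(borrowings, "Bear Three", "555-0199")
```

The read is a single lookup, but one change to the reviewer touches as many documents as reference that reviewer.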

(4) Data coupling: With a foreign key constraint in place, SQL by default does not allow deleting external data that is still in use. For example, if "Bear Three" in the reviewer table has been assigned to the borrower "Bear Big", the reviewer table will not allow the "Bear Three" row to be deleted, which protects data integrity. NoSQL has no such strong coupling; you can delete any data at any time.
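The contrast can be sketched like this. The SQL half uses sqlite3, where foreign key enforcement has to be switched on per connection; the NoSQL half uses plain dictionaries as a stand-in for a document store, which is an assumption of this sketch rather than the behavior of any specific product. Table and field names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE reviewer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE borrowing (
        id          INTEGER PRIMARY KEY,
        student     TEXT NOT NULL,
        reviewer_id INTEGER NOT NULL REFERENCES reviewer(id)
    );
    INSERT INTO reviewer VALUES (1, 'Bear Three');
    INSERT INTO borrowing VALUES (1, 'Bear Big', 1);
    """
)

# SQL side: the reviewer row is still referenced, so the delete is refused.
try:
    conn.execute("DELETE FROM reviewer WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

reviewers_left = conn.execute("SELECT COUNT(*) FROM reviewer").fetchone()[0]

# Document side: nothing stops the delete, and a dangling reference remains.
reviewer_docs = {"r1": {"name": "Bear Three"}}
borrowing_docs = [{"student": "Bear Big", "reviewer_id": "r1"}]
del reviewer_docs["r1"]
dangling = [b for b in borrowing_docs if b["reviewer_id"] not in reviewer_docs]
```

In the document case it is up to the application to notice and clean up the dangling reference.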
(5) Query performance: Given the same level of system design, NoSQL avoids the cost of join queries, so in theory it reads faster than SQL for data that would otherwise have to be joined.
