Using MongoDB as the Database for a Web Crawler

Source: Internet
Author: User


One. Introduction: MongoDB is a powerful, flexible, and easily scalable general-purpose database


1. Ease of Use

MongoDB is a document-oriented database, not a relational one.
The main reason for abandoning the relational model is to achieve better scalability, though there are other benefits as well. A document-oriented database replaces the concept of a "row" with a more flexible "document" model.
By embedding documents and arrays inside a document, the document-oriented approach can represent a complex hierarchical relationship in a single record, which matches the way developers working in modern object-oriented languages think about their data.
In addition, there is no predefined schema: the keys and values of a document do not have fixed types or sizes. Without a fixed schema, it is easier to add or remove fields as needed.
Development usually goes faster because developers can iterate quickly. Experimenting is also easier: developers can try many data models and choose the best one.
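
To make the embedded-document idea concrete, here is a minimal sketch in plain Python (no MongoDB driver is involved). The crawled-page record, its field names, and the get_path helper are hypothetical, chosen only to show how one record can hold an array and a nested document, and how fields can be added or removed freely.

```python
import json

# A hypothetical crawled-page record: one document holds the page, its
# outgoing links (an array) and response metadata (an embedded document).
page = {
    "url": "http://example.com/",
    "title": "Example Domain",
    "links": ["http://example.com/a", "http://example.com/b"],
    "response": {"status": 200, "headers": {"content-type": "text/html"}},
}

# No predefined schema: a field can be added or removed at any time.
page["crawled_at"] = "2017-05-01T12:00:00Z"
del page["title"]


def get_path(doc, dotted_path):
    """Walk an embedded document using a dotted path such as 'response.status'."""
    value = doc
    for part in dotted_path.split("."):
        value = value[part]
    return value


print(json.dumps(page, indent=2))
print(get_path(page, "response.headers.content-type"))
```

In a relational design the links and the response headers would probably live in separate tables; here they travel with the page record itself.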





2. Easy to Scale

The size of application data sets is growing at an incredible rate. As available bandwidth grows and storage prices fall, even a small application may need to store an astonishing amount of data, sometimes more than many databases can handle. Terabyte-scale data, once very rare, is now commonplace.
Because the amount of data to store keeps growing, developers face a question: how should the database be scaled? There are two options, scaling up (vertical scaling: buying a more powerful machine) and scaling out (horizontal scaling: spreading the data across more machines). Scaling up takes the least effort, but large machines are generally very expensive, and once the data reaches the physical limit of a single machine, no amount of money will buy a bigger one. At that point scaling out is the better choice, although it brings a problem of its own: there are many more machines to manage.
MongoDB is designed to scale out. The document-oriented data model makes it easier to split data across multiple servers. MongoDB automatically balances data and load across the cluster, redistributes documents automatically, and routes user requests to the correct machine. Developers can therefore focus on writing the application rather than on how to scale it. If a cluster needs more capacity, just add a new server and MongoDB will automatically move existing data onto it.
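
The routing idea can be illustrated with a deliberately simplified sketch. It hashes a key and takes the result modulo the number of servers. This is an assumption made for illustration, not MongoDB's actual mechanism, and the server names and URLs are hypothetical.

```python
import hashlib
from collections import Counter


def pick_server(shard_key, servers):
    """Map a key to one server by hashing it (illustrative only)."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


servers = ["server-a", "server-b"]
urls = [f"http://example.com/page/{i}" for i in range(1000)]

# Distribution of 1000 documents over two servers.
before = Counter(pick_server(u, servers) for u in urls)

# "Just add a new server": the same documents now spread over three.
servers.append("server-c")
after = Counter(pick_server(u, servers) for u in urls)

print(before)
print(after)
```

A naive modulo scheme like this one moves most keys when a server is added, which is one reason a real cluster needs the more careful automatic redistribution described above.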

 


3. Rich Features

As a general-purpose database, MongoDB offers a number of distinctive features on top of creating, reading, updating, and deleting data.
#1, indexes
MongoDB supports general-purpose secondary indexes, which allow a variety of fast queries, and also provides unique, compound, geospatial, and full-text indexes.

#2, aggregation
MongoDB supports an aggregation pipeline: users build complex aggregations out of simple stages, and the database optimizes them automatically.

#3, special collection types
MongoDB supports time-to-live (TTL) collections for data that should expire at some point, such as sessions. It also supports fixed-size (capped) collections for keeping only recent data, such as logs.
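
The behaviour of the two special collection types can be imitated in memory. The sketch below is only an analogy in plain Python, not MongoDB code: a deque with a maximum length stands in for a fixed-size collection, and the hypothetical drop_expired helper stands in for time-based expiry.

```python
from collections import deque

# Fixed-size collection analogy: only the 3 most recent log entries are kept.
recent_logs = deque(maxlen=3)
for i in range(5):
    recent_logs.append({"line": i})
print(list(recent_logs))  # the two oldest entries have been dropped


def drop_expired(sessions, now, ttl_seconds):
    """Expiring collection analogy: keep only sessions younger than the TTL."""
    return [s for s in sessions if now - s["created_at"] < ttl_seconds]


sessions = [
    {"user": "alice", "created_at": 0},
    {"user": "bob", "created_at": 50},
]
print(drop_expired(sessions, now=70, ttl_seconds=60))
```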



4. Excellent Performance

One of MongoDB's main goals is to provide excellent performance, and this goal shapes much of its design. MongoDB uses as much memory as it can as a cache and tries to select the right index for each query automatically.
In short, every aspect of the design is aimed at keeping performance high.
Although MongoDB is very powerful and tries to keep many features of relational databases, it does not aim to provide all of them. Whenever possible, the database server hands processing logic over to the client.
This streamlined design is one of the reasons MongoDB can achieve such high performance.

 





Two. MongoDB Basics






1. The document is the core concept of MongoDB. A document is an ordered set of key/value pairs, for example {'msg': 'Hello', 'foo': 3}, similar to an ordered dictionary in Python.

Points to be aware of:
#1, the key/value pairs in a document are ordered.
#2, a value can be not only a double-quoted string but also one of several other data types, or even an entire embedded document.
#3, MongoDB is both type-sensitive and case-sensitive.
#4, a document cannot contain duplicate keys.
#5, the keys of a document are strings. With a few exceptions, a key can use any UTF-8 character.

Document key naming conventions:
#1, a key cannot contain \0 (the null character), which is used to mark the end of a key.
#2, . and $ have special meaning and can only be used in certain circumstances.
#3, keys starting with an underscore "_" are reserved, although this is not strictly enforced.
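
The points above can be tried out in plain Python. This is a sketch under stated assumptions: the key_problems helper is hypothetical and only encodes the three naming conventions listed here, and OrderedDict stands in for a document because its equality check is order-sensitive.

```python
from collections import OrderedDict


def key_problems(key):
    """Return the naming conventions from the list above that a key violates."""
    problems = []
    if "\0" in key:
        problems.append("contains the null character")
    if "." in key or "$" in key:
        problems.append("contains '.' or '$', which have special meaning")
    if key.startswith("_"):
        problems.append("starts with '_', which is reserved by convention")
    return problems


# Ordered: same pairs in a different order are a different document.
doc_a = OrderedDict([("msg", "Hello"), ("foo", 3)])
doc_b = OrderedDict([("foo", 3), ("msg", "Hello")])
print(doc_a == doc_b)

# Type-sensitive and case-sensitive: neither pair of documents is equal.
print({"foo": 3} == {"foo": "3"})
print({"foo": 3} == {"Foo": 3})

print(key_problems("msg"))
print(key_problems("user.name"))
print(key_problems("_id"))
```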

2. A collection is a group of documents. If a document in MongoDB is compared to a row in a relational database, then a collection is equivalent to a table

#1, collections live inside a database. To make management easier, data of different formats and types should usually go into different collections. A collection has no fixed structure, however, so nothing prevents us from inserting data of completely different formats and types into a single collection.

#2, collections can be organized into subcollections by using "." to separate the parts of the name.
For example, an application with a blog feature might have two collections, blog.posts and blog.authors, to make the organization clearer. The blog collection itself (which does not even need to exist) has no relationship to these two subcollections.
Organizing data with subcollections in MongoDB is very effective and is recommended

#3, a collection is created when the first document is inserted into it. Rules for a legal collection name:
The collection name cannot be the empty string "".
The collection name cannot contain \0 (the null character), which marks the end of the collection name.
The collection name cannot begin with "system.", a prefix reserved for system collections.
User-created collection names should not contain the reserved character $. Some drivers do allow $ in collection names, because some system-generated collections contain it. Never use $ in a name unless you are accessing one of these system-created collections.
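
The four rules can be collected into a small check. The is_valid_collection_name function below is a hypothetical helper written for this article, not part of any MongoDB driver, and it covers only the rules listed above.

```python
def is_valid_collection_name(name):
    """Check a user-created collection name against the rules listed above."""
    if name == "":
        return False  # must not be the empty string
    if "\0" in name:
        return False  # must not contain the null character
    if name.startswith("system."):
        return False  # prefix reserved for system collections
    if "$" in name:
        return False  # reserved character
    return True


for candidate in ["blog.posts", "blog.authors", "", "system.users", "pages$old"]:
    print(repr(candidate), is_valid_collection_name(candidate))
```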


3. Database: in MongoDB, multiple documents make up a collection, and multiple collections make up a database

A database is also identified by its name. The name can be any UTF-8 string that satisfies the following conditions:
#1, it cannot be the empty string ("").
#2, it must not contain ' ' (space), ., $, /, \, or \0 (the null character).
#3, it should be all lowercase.
#4, it can be at most 64 bytes long.

Some database names are reserved. These special-purpose databases can be accessed directly:
#1, admin: from an authentication point of view, this is the "root" database. A user added to the admin database automatically gets access to all databases. In addition, some server-wide commands can only be run from the admin database, such as listing all databases or shutting down the server.
#2, local: this database is never replicated. Collections that should stay local to a single server can be stored here.
#3, config: when MongoDB is set up for sharding, the information about the shards is stored in the config database.
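
As a sketch of the database naming rules, the hypothetical helper below checks a candidate name against the four conditions and flags the three reserved names. It reflects only what this article states, not an official validation routine.

```python
FORBIDDEN_CHARS = {" ", ".", "$", "/", "\\", "\0"}
RESERVED_NAMES = {"admin", "local", "config"}


def is_valid_database_name(name):
    """Check a database name against the four conditions listed above."""
    if name == "":
        return False  # cannot be empty
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False  # no space, ., $, /, \ or null character
    if name != name.lower():
        return False  # should be all lowercase
    if len(name.encode("utf-8")) > 64:
        return False  # at most 64 bytes
    return True


for candidate in ["crawler", "My DB", "web.pages", "admin"]:
    print(candidate, is_valid_database_name(candidate), candidate in RESERVED_NAMES)
```

Note that the length limit is counted in bytes, so the check encodes the name as UTF-8 before measuring it.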


4. Emphasis: prefixing a collection name with the name of its database gives the fully qualified name of the collection, which is called the namespace
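
A namespace is therefore just the database name, a dot, and the collection name. The two helpers below are a hypothetical sketch, and the database name cms is an assumption made for the example. Because a database name cannot contain ".", the first dot in a namespace always separates the database from the collection.

```python
def namespace(database, collection):
    """Build the fully qualified collection name."""
    return f"{database}.{collection}"


def split_namespace(ns):
    """Split a namespace at its first dot into (database, collection)."""
    database, _, collection = ns.partition(".")
    return database, collection


ns = namespace("cms", "blog.posts")
print(ns)
print(split_namespace(ns))
```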






Three. The installation and use of MongoDB

[01] Windows installation: first download the installer from the official website:
https://www.mongodb.com/dr/fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl-3.4.4-signed.msi/download

[02] Run the installer. When the installation is complete, add MongoDB's bin directory to the system PATH environment variable.

[03] Create the following folders and file under the installation directory:
First create a new data directory:
D:\ProfessionalSoftwares\mongodb\data
Then create two new folders inside the data directory: db and logs
Finally create a new file in the logs directory: mongo.log
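
The folders and the log file from step [03] can also be created with a short script instead of by hand. The prepare_data_dirs function is a hypothetical helper, and the demonstration runs against a temporary directory so that it is safe to execute anywhere. On the machine described here the argument would be D:\ProfessionalSoftwares\mongodb\data.

```python
import tempfile
from pathlib import Path


def prepare_data_dirs(data_root):
    """Create <data_root>/db, <data_root>/logs and an empty logs/mongo.log."""
    data_root = Path(data_root)
    db_dir = data_root / "db"
    logs_dir = data_root / "logs"
    db_dir.mkdir(parents=True, exist_ok=True)
    logs_dir.mkdir(parents=True, exist_ok=True)
    log_file = logs_dir / "mongo.log"
    log_file.touch(exist_ok=True)
    return db_dir, log_file


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    db_dir, log_file = prepare_data_dirs(Path(tmp) / "data")
    print(db_dir.is_dir(), log_file.is_file())
```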

[04] Now install MongoDB as a Windows service that runs in the background, so that the MongoDB server is available every time the machine boots:
First open a cmd command line as an administrator, then run the following command (because the bin directory is in the environment variable, the mongod command can be used directly):
mongod --bind_ip 0.0.0.0 --logpath D:\ProfessionalSoftwares\mongodb\data\logs\mongo.log ^
--logappend --dbpath D:\ProfessionalSoftwares\mongodb\data\db --port 27017 ^
--serviceName "MongoDB" --serviceDisplayName "MongoDB" --install --auth

The --install parameter installs MongoDB as a service. --auth means that a client must supply a valid username and password when connecting to the MongoDB server. The ^ at the end of a line is the cmd line-continuation character; the whole command can also be typed on a single line.

After the service is installed, start it from the same administrator cmd window:
net start MongoDB
Here MongoDB is the service name given by --serviceName in the command above.

To stop the MongoDB service, run net stop MongoDB.

[05] Finally, open the following URL in a browser: http://localhost:27017/
If the page shows this sentence, the installation succeeded:
It looks like you are trying to access MongoDB over HTTP on the native driver port.
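
The same check can be scripted. The sketch below only tests whether something accepts TCP connections on the port, using Python's standard socket module. It does not verify that the listener is actually MongoDB, and is_port_open is a hypothetical helper name.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 27017 is the port passed to mongod with --port in step [04].
print(is_port_open("localhost", 27017))
```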


2. Account Management



Like most databases, MongoDB has accounts and passwords:

#Account management, see the official documentation: https://docs.mongodb.com/master/tutorial/enable-authentication/

#1, create an account. Here we create an administrator account named root. When logging in to the database with Robomongo, you then have to enter these credentials; otherwise you get the error: Failed to execute "listdatabases" command. I was stuck on this for a long time.
By default, MongoDB has two databases: admin and test. An administrator account is usually created under the admin database.

Note that all of the following steps must be run from a cmd command line with administrator privileges:

As described above, after logging in to MongoDB in its initial state, switch directly to the admin database and then create the account.
use admin
db.createUser(
   {
     user: "root",
     pwd: "cisco123",
     roles: [ { role: "root", db:
