1.MongoDB Introduction
1.1.NoSQL Database
Database: software that stores data persistently in an efficient, organized way
NoSQL database: "Not only SQL"; refers to non-relational databases
Benefits: high scalability, distributed computing, low cost, flexible architecture, support for semi-structured data, no complex relationships between records
Disadvantages: no standardization, limited query capabilities, less intuitive
Common NoSQL databases
Column storage: HBase, Cassandra, Hypertable
Document storage: MongoDB, CouchDB
Key-value storage: Tokyo Cabinet, Berkeley DB, MemcacheDB, Redis
Graph storage: Neo4j
Object storage: Versant
XML database: Berkeley DB XML, BaseX
1.2.MongoDB Overview
MongoDB is a database based on distributed file storage, written in C++. It is designed to provide a scalable, high-performance data storage solution for web applications.
MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most similar to a relational database.
Advantages:
- High performance; written in C++
- Schema-free
- Collection-oriented storage
- Full index support
- Replication and high availability
1.3.MongoDB Terminology
Each entry reads: SQL term -> MongoDB term: meaning
- database -> database: database
- table -> collection: database table / collection
- row -> document: data record / document
- column -> field: data column / field
- index -> index: index
- table join -> none: no equivalent of a table join
- primary key -> primary key: primary key (MongoDB uses the _id field as the primary key)
1.4.MongoDB Basic Syntax: Data Types
A collection corresponds to a table in a relational database.
A document corresponds to a row in a relational database.
Document: a JSON object composed of key-value pairs
{"name": "admin", "gender": "Male"}
Collection: stores multiple documents, with no fixed structure
{"name": "admin", "gender": "Male"}
{"name": "Manager", "age": ...}
{"name": "Manager", "phone": "16868686868"}
Common data types:
- ObjectId: document ID
- String: string
- Boolean: boolean value
- Integer: integer
- Double: floating-point number
- Array: array or list
- Object: embedded document
- Null: null value
- Timestamp: timestamp
- Date: date and time
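To make the document and collection model concrete, here is a minimal sketch in plain Python; it needs no MongoDB server or driver. It assumes a document can be pictured as a dict and a collection as a list of dicts. The fields active, score, tags, address, nickname and created are made up for illustration. ObjectId and Timestamp are left out because they are BSON-specific types supplied by a driver.

```python
from datetime import datetime, timezone

# One document: a dict whose values are the Python counterparts of the types above.
doc = {
    "name": "admin",            # String
    "active": True,             # Boolean
    "age": 18,                  # Integer
    "score": 89.5,              # Double
    "tags": ["a", "b"],         # Array
    "address": {"city": "X"},   # Object (embedded document)
    "nickname": None,           # Null
    "created": datetime(2020, 1, 1, tzinfo=timezone.utc),  # Date
}

# A collection: documents do not have to share the same fields.
collection = [
    {"name": "admin", "gender": "Male"},
    {"name": "Manager", "phone": "16868686868"},
    doc,
]

def field_names(documents):
    """Return the sorted union of field names used across all documents."""
    names = set()
    for d in documents:
        names.update(d)
    return sorted(names)

print(field_names(collection))
```

The only field all three documents share is name, which is what "no fixed structure" means in practice.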
2.MongoDB Download and Installation
- Download from the official website
Note: an even minor version number (such as 1.6) indicates a stable release; an odd one (such as 1.7) indicates a development release.
2.1.Installing MongoDB on Windows
Download the MSI file directly, or click the "all version binaries" link to choose another package.
I used the zip package here. After unpacking it, create a db folder and a log folder inside the MongoDB folder.
- Start the database by entering this command in a CMD window:
mongod --dbpath "d:\Software Installation\mongodb\data\db" (use the path to your own db folder)
- Open a new CMD window and enter the mongo command to work with the database
2.2.Installing MongoDB on Linux
Installing MongoDB is simple: there is no need to download the source files, because it can be installed directly with the apt-get command.
1. Open a terminal and enter the following command:
sudo apt-get install mongodb
2. After the installation completes, enter the following command in the terminal to check the MongoDB version:
mongo --version
If version information is displayed, the installation succeeded.
3. Start the mongo shell
Enter this command in the terminal:
sudo mongo
3.Installing PyMongo
PyMongo is the Python driver package for MongoDB and is the recommended way to work with MongoDB from Python.
To use MongoDB from Python, install PyMongo with pip:
pip install pymongo (install the default version)
pip install pymongo==2.8 (install a specific version)
pip install --upgrade pymongo (upgrade PyMongo)
The installation succeeded if the command completes without errors.
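Besides watching the pip output, you can check from Python whether the package is importable. The sketch below uses only the standard library; the helper name is_installed is my own, not part of PyMongo.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None

if is_installed("pymongo"):
    import pymongo
    print("PyMongo version:", pymongo.version)
else:
    print("PyMongo is not installed; run: pip install pymongo")
```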
4.MongoDB Basic Use
4.1.Basic operations
MongoDB stores data as documents.
Data is made up of key-value pairs.
Data operations: create, read, update, delete
The three levels of NoSQL data: database -> collection -> document [-> field]
4.2. Basic syntax
Database operations
db: show the database currently in use
show dbs: list all databases
use <database name>: switch to a database
use <database name> does not create the database by itself; the database is created automatically once data is written to it
db.dropDatabase(): delete the database currently in use
Collection operations
show collections: list all collections in the current database
db.createCollection(<c_name> [, options]): create a collection
db.<collection name>.drop(): delete the specified collection
options: capped defaults to false, meaning the collection has no upper bound; true sets an upper bound and requires the size parameter. When the limit is reached, the oldest data is overwritten.
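The "overwrite the oldest data" behavior of a capped collection can be pictured with a bounded queue. This is only an analogy in plain Python, not MongoDB code: a real capped collection is bounded by size in bytes (and optionally by a maximum number of documents), whereas this sketch assumes a made-up limit of 3 documents.

```python
from collections import deque

# Hypothetical limit of 3 documents; a real capped collection is limited by size in bytes.
capped = deque(maxlen=3)

for i in range(1, 6):
    capped.append({"seq": i})   # once full, each append drops the oldest entry

remaining = [d["seq"] for d in capped]
print(remaining)  # [3, 4, 5]
```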
- Add data
Syntax: db.<collection name>.insert(document)
The collection may already exist or not; if it does not exist, it is created
document: data in JSON format
Simple query: db.<collection name>.find() returns the data in the specified collection
db.student.insert({name: "Jerry", gender: "Male"})
db.student.insert({_id: "1", name: "Tom", gender: "Female", age: 18})
- Update data
Syntax: db.<collection name>.update(<query>, <update>, {multi: <boolean>})
Update specific fields: $operator (for example $set)
multi: defaults to false, which updates only the first matching document; set it to true to update every matching document
// Replace the whole document that matches the condition
db.student.update({name: "Tom"}, {name: "Jerry"})
// Update only the given fields of the document that matches the condition
db.student.update({name: "Tom"}, {$set: {name: "Jerry"}})
// Update the given field in all matching documents
db.student.update({}, {$set: {name: "Donghua"}}, {multi: true})
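The difference between replacing a document, updating fields with $set, and the multi flag is easy to mix up. Here is a rough pure-Python imitation of those three behaviors on a list of dicts. It is a sketch of the semantics only, not how MongoDB implements them; the update function and the sample data are assumptions made for illustration, and only equality conditions are supported.

```python
def update(collection, query, change, multi=False):
    """Imitate update(): apply $set or replace the document; stop after one match unless multi."""
    modified = 0
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            if "$set" in change:
                doc.update(change["$set"])      # change only the listed fields
            else:
                kept_id = doc.get("_id")
                doc.clear()                     # a replacement drops every other field...
                doc.update(change)
                if kept_id is not None:
                    doc["_id"] = kept_id        # ...but the _id is kept
            modified += 1
            if not multi:
                break
    return modified

students = [
    {"_id": 1, "name": "Tom", "age": 18},
    {"_id": 2, "name": "Tom", "age": 20},
]

update(students, {"name": "Tom"}, {"$set": {"age": 19}})         # first Tom only
update(students, {"name": "Tom"}, {"name": "Jerry"})             # first Tom loses "age"
update(students, {}, {"$set": {"name": "Donghua"}}, multi=True)  # every document
print(students)
```

Note how the replacement form silently drops the age field of the first document, which is why $set is usually the safer choice.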
- Save data
Syntax: db.<collection name>.save(document)
Behavior: keyed on _id; if no document with that _id exists, the document is inserted, otherwise the existing document is replaced
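The insert-or-replace rule of save() can be sketched in a few lines of plain Python. This is an imitation on a list of dicts under my own assumptions: the function name save mirrors the shell command, and unlike MongoDB the sketch does not generate an _id when the document has none.

```python
def save(collection, document):
    """Imitate save(): replace the document with the same _id, or insert if there is none."""
    if "_id" in document:
        for i, existing in enumerate(collection):
            if existing.get("_id") == document["_id"]:
                collection[i] = dict(document)   # same _id: replace the whole document
                return "replaced"
    collection.append(dict(document))            # no match: insert as a new document
    return "inserted"

students = []
first = save(students, {"_id": "1", "name": "Tom", "age": 18})
second = save(students, {"_id": "1", "name": "Tom", "age": 19})
print(first, second, students)
```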
- Delete data
Syntax: db.<collection name>.remove(<query>, {justOne: <boolean>})
Parameter query: the condition a document must match to be deleted
Parameter justOne: set to true or 1 to delete only one document; the default, false, deletes all matching documents
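The effect of justOne can be imitated in plain Python as well. As before, this is a sketch of the semantics on a list of dicts, not MongoDB code; the function remove, its just_one parameter and the sample data are made up for illustration, and only equality conditions are supported.

```python
def remove(collection, query, just_one=False):
    """Imitate remove(): delete matching documents, only the first one if just_one is true."""
    removed = 0
    kept = []
    for doc in collection:
        is_match = all(doc.get(key) == value for key, value in query.items())
        if is_match and not (just_one and removed >= 1):
            removed += 1
        else:
            kept.append(doc)
    collection[:] = kept    # modify the list in place
    return removed

students = [{"name": "Tom"}, {"name": "Tom"}, {"name": "Jerry"}]
removed_one = remove(students, {"name": "Tom"}, just_one=True)
removed_rest = remove(students, {"name": "Tom"})
print(removed_one, removed_rest, students)
```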
- Query data
Basic queries
find([{condition}]): query all matching documents
findOne([{condition}]): query the first matching document
pretty(): format the query results
Comparison operators
Equality is the default comparison and needs no operator
$lt: less than (<)
$lte: less than or equal to (<=)
$gt: greater than (>)
$gte: greater than or equal to (>=)
// Query students named Jerry
db.student.find({name: "Jerry"})
// Query students of marriageable age (20 or older)
db.student.find({age: {$gte: 20}})
- Logical operators
Logical AND: the default operation; needs no operator
Logical OR: $or
// Query female students of marriageable age
db.student.find({age: {$gte: 20}, gender: "Female"})
// Query students who are older than 18 or female
db.student.find({$or: [{age: {$gt: 18}}, {gender: "Female"}]})
// Query students aged 18 or 20
db.student.find({age: {$in: [18, 20]}})
// Query students whose age is not 20
db.student.find({age: {$nin: [20]}})
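To see how these query documents are read, here is a small pure-Python matcher that evaluates the same operators against a list of dicts. It is a teaching sketch under my own assumptions, not MongoDB's implementation: the functions check, matches and find and the sample student Anna are made up, and a missing field is treated as failing every operator except $nin.

```python
import operator

# Comparison operators mapped to plain Python functions.
COMPARISONS = {
    "$lt": operator.lt,
    "$lte": operator.le,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$in": lambda value, options: value in options,
}

def check(op, value, operand):
    """Evaluate one operator; value is None when the field is missing."""
    if op == "$nin":
        return value not in operand
    if value is None:
        return False
    return COMPARISONS[op](value, operand)

def matches(doc, query):
    """Return True if doc satisfies every top-level condition (implicit AND)."""
    for key, cond in query.items():
        if key == "$or":
            ok = any(matches(doc, sub) for sub in cond)
        elif isinstance(cond, dict):
            ok = all(check(op, doc.get(key), operand) for op, operand in cond.items())
        else:
            ok = doc.get(key) == cond       # plain value: equality
        if not ok:
            return False
    return True

def find(collection, query=None):
    return [doc for doc in collection if matches(doc, query or {})]

students = [
    {"name": "Jerry", "gender": "Male", "age": 18},
    {"name": "Tom", "gender": "Female", "age": 20},
    {"name": "Anna", "gender": "Female", "age": 17},
]

def names(docs):
    return [d["name"] for d in docs]

print(names(find(students, {"age": {"$gte": 20}, "gender": "Female"})))
print(names(find(students, {"$or": [{"age": {"$gt": 18}}, {"gender": "Female"}]})))
print(names(find(students, {"age": {"$in": [18, 20]}})))
print(names(find(students, {"age": {"$nin": [20]}})))
```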
- Limit the number of results
<find>.limit(count)
- Sort the results
<find>.sort({field: 1/-1, ...})
db.student.find().sort({name: 1})
1 means ascending order and -1 means descending order; multiple fields can be specified
- Count the results
<find>.count()
db.<collection name>.count({condition})
Two ways to count: 1. run a query and count its results with count(); 2. pass the condition directly to count()
- Remove duplicates
db.<collection name>.distinct("field to deduplicate", {condition})
// List all distinct ages
db.student.distinct("age", {})
- Skip results
// Skip the first n documents and return the next m
db.hero.find().pretty().limit(m).skip(n)
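The cursor helpers above have close counterparts in plain Python, which may help when reasoning about what a query returns. The sketch below works on a list of dicts with made-up sample data (the students Anna and Bob are my own); it imitates the results only. In MongoDB the skip is applied before the limit regardless of the order in which the two are chained, and distinct() does not promise sorted output, so the sorting here is just to keep the printout stable.

```python
students = [
    {"name": "Tom", "age": 20},
    {"name": "Jerry", "age": 18},
    {"name": "Anna", "age": 18},
    {"name": "Bob", "age": 22},
]

# sort({name: 1}): ascending by name; reverse=True would correspond to -1
by_name = sorted(students, key=lambda d: d["name"])

# skip(n) then limit(m): drop the first n results, keep the next m
n, m = 1, 2
page = by_name[n:n + m]

# count({age: 18}): number of documents matching a condition
count_18 = sum(1 for d in students if d["age"] == 18)

# distinct("age"): unique values of one field
ages = sorted({d["age"] for d in students})

print([d["name"] for d in page])  # ['Bob', 'Jerry']
print(count_18)                   # 2
print(ages)                       # [18, 20, 22]
```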
5.Using MongoDB with Python
We covered web crawlers earlier; now we store the crawled data in MongoDB.
# Crawl League of Legends hero details and store them in MongoDB
# -*- coding: utf-8 -*-
import pymongo
import requests
from bs4 import BeautifulSoup

# Connect to MongoDB through MongoClient
client = pymongo.MongoClient('localhost', 27017)
# Get the database
hero = client['hero']
# Get the collection
sheet_tab = hero['sheet_tab']

url = 'http://lol.duowan.com/hero/'
req = requests.get(url)
soup = BeautifulSoup(req.text, 'html.parser')
# Collect the link to each hero's detail page
links = soup.find(id="champion_list").find_all('a')
for link in links:
    link = link['href']
    requ = requests.get(link)
    sop = BeautifulSoup(requ.text, 'html.parser')
    # Extract the fields to store from the detail page
    data = {
        'title': sop.find('h2', class_="hero-title").get_text(),
        'name': sop.find('h1', class_="hero-name").get_text(),
        'tags': sop.find('div', class_="hero-box ext-attr").find_all('span')[1].get_text(),
        'story': sop.find('div', class_="hero-popup").find_all('p')[0].get_text(),
    }
    # Insert one document per hero
    sheet_tab.insert_one(data)
Start MongoDB, then run the code.
With the Robo 3T GUI tool we can see that 137 documents have been fetched and stored in MongoDB.