MongoDB data model and index learning summary

Source: Internet
Author: User
Tags createindex

1. MongoDB Data Model:

  • MongoDB data storage structure:
    MongoDB stores and exchanges documents in BSON (Binary JSON, a binary encoding of JSON-like documents); large files are stored through the GridFS convention. BSON inherits the schema-less nature of JSON and has a loose storage structure: unlike a relational database (RDB), it does not require the structure of the stored data to be defined in advance. In addition, it supports multiple data types and is optimized to make reads and writes more efficient.
    (1) Data Types supported by BSON:

    Double, String, Object, Array, Binary Data, Undefined, ObjectId, Boolean, Date, Null, Regular Expression, JavaScript, Symbol, JavaScript (with scope), 32-bit integer, Timestamp, 64-bit integer, Min key, Max key

    (2) A BSON document looks like this:

    { "_id" : ObjectId("542c2b97bac0595474108b48"), "ts" : Timestamp(1412180887, 1),"name":"steven"}

    (3) BSON is both the communication format and the data storage format in MongoDB: the client and server exchange BSON documents. For example, to query a document, write:

    db.steven.find({"name":"steven"})

    To update a piece of data, write as follows:

    db.steven.update({"name":"steven"},{$set:{"name":"jianying"}})

    To delete a piece of data, write as follows:

    db.steven.remove({"name":"steven"})

    In short, the request and response messages for CRUD operations in MongoDB carry BSON documents. Stored documents use the BSON format as well:

    { "_id" : ObjectId("542c2b97bac0595474108b48"), "ts" : Timestamp(1412180887, 1),"name":"steven"}

    (4) BSON data format encoding:
    BSON strings are UTF-8 encoded. Both the keys of a document and its string values are stored as UTF-8; text in any other encoding must be transcoded first. A key may use any UTF-8 character, with the following exceptions:

    A. A key cannot contain \0 (the null character).
    B. $ and . have special meanings and should only be used in specific circumstances.
    C. Keys starting with an underscore (_) are reserved by convention (not strictly enforced).

    Values of other types are encoded according to the rules defined for each specific data type. For organizing data models, MongoDB supports both references between documents and nesting. The details are as follows.
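
    The key rules above can be sketched as a small check in plain JavaScript. This is only an illustration: `checkFieldName` and `utf8Length` are hypothetical helpers, not part of MongoDB or its drivers, and reading rule B as "does not start with $, does not contain ." is an assumption about how the restriction is usually applied.

```javascript
// Hypothetical helper illustrating rules A-C; not a MongoDB API.
function checkFieldName(key) {
  const problems = [];
  if (key.includes("\0")) problems.push("contains the null character"); // rule A
  if (key.startsWith("$")) problems.push("starts with $");              // rule B (assumed reading)
  if (key.includes(".")) problems.push("contains .");                   // rule B (assumed reading)

  // Rule C is only a convention, so it is reported as a warning, not an error.
  const warnings = [];
  if (key.startsWith("_") && key !== "_id") {
    warnings.push("starts with _ (reserved by convention)");
  }
  return { ok: problems.length === 0, problems, warnings };
}

// Number of bytes a string occupies once encoded as UTF-8.
function utf8Length(text) {
  return new TextEncoder().encode(text).length;
}

console.log(checkFieldName("name"));
console.log(checkFieldName("$set"));
console.log(checkFieldName("a.b"));
console.log(utf8Length("name"), utf8Length("名字"));
```

    The last line shows why the encoding matters for size estimates: a key's byte length depends on its UTF-8 encoding, not on its character count.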

  • Data model design patterns: reference and nesting
    The reference mode is one way MongoDB organizes stored data: a document stores the information needed to retrieve another document. For example:

    {
      _id: "joe",
      name: "Joe Bookreader"
    }
    {
      patron_id: "joe",
      street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345"
    }

    The first document holds the information for user joe, and the second records his address. To retrieve the address from joe's name, you first fetch the first document and then use its _id to fetch the second. The nested mode is designed as follows:

    {
      _id: "joe",
      name: "Joe Bookreader",
      addresses: [
        {
          street: "123 Fake Street",
          city: "Faketon",
          state: "MA",
          zip: "12345"
        }
      ]
    }

    Each design has its advantages and disadvantages. The reference mode is the normalized design: it reduces data redundancy and keeps the structure clean and simple, in line with general design principles. Its drawbacks are that fetching the complete data costs more round trips, and that MongoDB does not guarantee atomicity for operations spanning multiple documents. The denormalized nested design has the opposite characteristics: it lowers the communication cost and guarantees atomicity within a single document, at the price of data redundancy. Choosing between the two is ultimately a trade-off.
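
    To make the trade-off concrete, here is a minimal sketch in plain JavaScript that uses in-memory arrays as stand-ins for two collections. The function names `loadByReference` and `embedAddresses` are hypothetical, and real code would issue queries against MongoDB instead of filtering arrays.

```javascript
// In-memory stand-ins for two collections, using the example data above.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" },
];

// Reference mode: two lookups are needed to assemble the full record.
function loadByReference(patronId) {
  const patron = patrons.find((p) => p._id === patronId);          // lookup 1
  if (!patron) return null;
  const owned = addresses.filter((a) => a.patron_id === patronId); // lookup 2
  return { patron, addresses: owned };
}

// Nested mode: fold the referenced documents into the parent document.
function embedAddresses(patron, allAddresses) {
  const embedded = allAddresses
    .filter((a) => a.patron_id === patron._id)
    .map(({ patron_id, ...rest }) => rest); // the reference key is no longer needed
  return { ...patron, addresses: embedded };
}

const nested = embedAddresses(patrons[0], addresses);
console.log(JSON.stringify(nested, null, 2));
```

    With the nested form, a single read returns the user together with the addresses, which is where the lower communication cost comes from.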

  • Note:
    (1) A MongoDB document cannot exceed 16 MB. To store larger data, use GridFS.
    (2) If an update grows a document beyond the space originally allocated to it, MongoDB moves the document to another location on disk (this applies to the MMAPv1 storage engine). Moving a document is slower than an in-place update and leads to disk fragmentation.
    (3) MongoDB guarantees atomicity of operations at the single-document level.
    (4) BSON strings are UTF-8 encoded.

2. MongoDB index structure:
  • MongoDB supports the following index types:
    MongoDB organizes indexes as B-trees, which efficiently support both equality and range queries, and it can index any field in a document: single-value, array, text, and nested fields can all be indexed. In other words, MongoDB provides full index support for the BSON storage format. With so many powerful index types available, index design has a great impact on performance. As of release 3.0, the latest at the time of writing, the following index types are supported:

    Index type: description

    Default _id: the default ID index. MongoDB creates a unique index on the _id field by default. Every document has an _id field.
    Single Field: a single-field index, built on one field of a document or one field of a nested document.
    Compound Index: combines multiple fields into one index. The fields are indexed hierarchically: entries are ordered by the first field, then by the second field within equal values of the first, and so on.
    Multikey Index: for a field that holds an array, creates an index entry for each element of the array.
    Geospatial Index: built on geographic coordinates to locate a coordinate range efficiently. This is a useful extra.
    Text Index: text search similar to that of a search engine, which involves word segmentation. Unfortunately, Chinese is not supported, and the query syntax is relatively simple.
    Hashed Index: exists to support hash-based sharding (a deployment method). Hashed indexes support only equality searches, not range searches.
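
    As a rough illustration, the key patterns below show what would be passed to createIndex() for most of these types. The field names (user_id, item, qty, tags, location, comments) are hypothetical, and `describeKeyPattern` is just a helper written for this sketch that prints how each pattern reads.

```javascript
// Key patterns for the index types listed above (field names are hypothetical).
// Each object is what would be passed as the first argument of createIndex().
const indexSpecs = {
  singleField: { user_id: 1 },
  compound: { item: 1, qty: -1 },
  multikey: { tags: 1 },                 // multikey once "tags" holds arrays
  geospatial: { location: "2dsphere" },
  text: { comments: "text" },
  hashed: { user_id: "hashed" },
};

// Describe a key pattern: 1 and -1 are sort directions, strings name special index kinds.
function describeKeyPattern(spec) {
  return Object.entries(spec).map(([field, kind]) => {
    if (kind === 1) return `${field}: ascending`;
    if (kind === -1) return `${field}: descending`;
    return `${field}: ${kind}`;
  });
}

for (const [name, spec] of Object.entries(indexSpecs)) {
  console.log(name, "->", describeKeyPattern(spec).join(", "));
}
```

    Note that a multikey index has no special key pattern: it is an ordinary index that becomes multikey when the indexed field contains an array.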

    Indexes of these types can also carry the following properties:

  • Index attributes:

(1) Unique index: the same concept as a unique index in a relational database (RDB); it rejects duplicate values for the indexed field.
It is created as follows:

    db.members.createIndex( { "user_id": 1 }, { unique: true } )

(2) Sparse index: a sparse index creates index entries only for documents that contain the indexed field, and skips documents that do not.
It is created as follows:

    db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )

(3) TTL index: TTL stands for time to live. Stored documents carry an expiration time and are deleted automatically once it has passed, which suits log data, temporary data generated by the system, and session data.
It is created as follows:

    db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
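
The expiry rule behind a TTL index can be sketched in plain JavaScript as follows. This is an illustration of the rule only, not MongoDB's implementation; `isExpired` is a hypothetical helper, and the sample documents and timestamps are made up.

```javascript
// Illustration of the TTL rule: a document becomes eligible for deletion once
// (indexed date value + expireAfterSeconds) lies in the past.
function isExpired(doc, field, expireAfterSeconds, now = new Date()) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // no date value: the document never expires
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2015-01-01T12:00:00Z");
const events = [
  { _id: 1, createdAt: new Date("2015-01-01T10:30:00Z") }, // 90 minutes old
  { _id: 2, createdAt: new Date("2015-01-01T11:30:00Z") }, // 30 minutes old
  { _id: 3, message: "no createdAt field" },
];

// With expireAfterSeconds: 3600, only the 90-minute-old document is expired.
const survivors = events.filter((e) => !isExpired(e, "createdAt", 3600, now));
console.log(survivors.map((e) => e._id)); // [ 2, 3 ]
```

In the real server the deletion is carried out by a background task that runs periodically, so an expired document may remain visible for a short while after its expiration time.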
  • Index structure and features:

(1) B-tree structure and ordered storage: MongoDB indexes are organized as B-trees and support efficient equality and range queries. Index entries are kept in sorted order, so results read from an index come back in index order.
(2) Index sort order: index entries can be created in ascending or descending order. For a single-field index the choice makes no difference, because MongoDB can traverse the index in either direction. For a compound index it does matter: the fields are organized hierarchically, and choosing the wrong combination of ascending and descending order can hurt performance badly.
(3) Index intersection: since version 2.6, the query optimizer supports index intersection, that is, combining several indexes to answer one query. For example, you can build two separate indexes; when the conditions of a query involve both of them, the query planner can combine the two indexes automatically.
For example, suppose the following two indexes exist:

{ qty: 1 }
{ item: 1 }

The following query can use the intersection of the two indexes above:

db.orders.find( { item: "abc123", qty: { $gt: 15 } } )

Index intersection also covers the following case:

Index prefix intersection: this mainly concerns compound indexes. The query planner can intersect on a prefix of a compound index.
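
The point in (2) about ascending and descending order in compound indexes can be sketched as a small check. The rule encoded here is a simplification stated as an assumption: a sort can be served from an index when the sort keys are a prefix of the index keys and the directions either all match or are all reversed. `indexSupportsSort` is a hypothetical helper, not a MongoDB function, and it ignores cases where equality conditions on leading fields allow other sorts.

```javascript
// Simplified rule (assumption for illustration): the sort keys must be a prefix
// of the index keys, with directions either all identical or all inverted.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  let sameDirection = true;    // index can be scanned forward
  let reverseDirection = true; // index can be scanned backward
  for (let i = 0; i < sortKeys.length; i++) {
    const [indexField, indexDir] = indexKeys[i];
    const [sortField, sortDir] = sortKeys[i];
    if (indexField !== sortField) return false;
    if (indexDir !== sortDir) sameDirection = false;
    if (indexDir !== -sortDir) reverseDirection = false;
  }
  return sameDirection || reverseDirection;
}

const index = { item: 1, qty: -1 };
console.log(indexSupportsSort(index, { item: 1, qty: -1 })); // true: forward scan
console.log(indexSupportsSort(index, { item: -1, qty: 1 })); // true: backward scan
console.log(indexSupportsSort(index, { item: 1, qty: 1 }));  // false: mixed directions do not match
console.log(indexSupportsSort(index, { item: -1 }));         // true: single leading key, either direction
```

The third call is the costly case: the index order cannot produce that sort, so the results would have to be sorted after retrieval.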
  • Index analysis method:

(1) Evaluate RAM capacity and try to keep the indexes in memory:
Commands for querying the index size (in bytes):

db.collection.totalIndexSize()
db.collection.stats()

(2) Analyze and view the query plan:

You can use explain and hint in MongoDB to inspect how indexes are used:

db.collection.find().explain()

The output shows which index the query plan uses and whether index intersection is applied.

db.collection.find().hint({"name":1})

hint forces the query to use the specified index.

(3) Index metadata: each database contains a system.indexes collection, which records the metadata of the indexes built in that database.

db.system.indexes.find()
  • Note:
    (1) Each index requires at least 8 KB of space.
    (2) MongoDB automatically creates a unique index on the _id field.
    (3) A special index type implements TTL collections. TTL relies on a background thread in mongod that reads the date values in the index and deletes expired documents from the collection.
