Functions and costs
Indexes in relational databases were described in the previous article. In a Notes database, the index is hidden inside the concept of the view (this article discusses only Notes view indexes, not full-text indexes). A view created by a developer is simply a design document stored in the database, and the database engine creates and updates an index based on it. An index in a relational database is a data structure (usually a B-tree) built by sorting values taken from records, whereas a Notes view index also includes unsorted columns, computed values, categories, totals, and so on (the data structure is still a B-tree, and if you are lucky enough you will run into the Notes "B-tree structure is invalid" error). The view a user sees in the client is the index data plus the appearance settings. This difference stems from the different underlying designs of the two kinds of database.
Indexes in a relational database are built for querying. SELECT name, job FROM person WHERE country = 'China' is a simple query, and it runs much faster if the country column of the person table is indexed. For a non-selective query such as SELECT * FROM person, it makes no difference whether an index exists. Notes, as a document database, is slightly different. In a document database the unit of storage is the document. Its biggest difference from a record in a relational database is that it is not constrained by a table definition, which fixes how many columns a record has and each column's name, data type, length, and so on. Each document can have any number of fields with any names and data types. If all records in a table share one schema, then each document can be said to have its own schema.
In some document databases, such as the very popular MongoDB, documents are also grouped into collections that represent the same kind of entity, which gives collections a meaning and use similar to tables. In a Notes database, and in the equally popular CouchDB, there is no such container: all documents, whatever their content and purpose, sit directly at the database level (this is not the only similarity between CouchDB and Notes, which is unsurprising given that its original author, Damien Katz, came from the core Notes development team). As a result, even SELECT * FROM person, which reads all the data in one table, has no direct counterpart in Notes. It also means that any meaningful query in a Notes database (that is, anything other than fetching a document by its ID) has to go through a view.
Thus, the first function of the Notes view index is to select some documents from the database, take their field values, directly or through computation, as column values, and provide a table-like data collection or interface. When we use a NotesView object in LotusScript or Java to access NotesViewEntry objects, we are reading this index. In addition, response hierarchies, categorization, totals, and averages let the view take on the reporting role that in a relational database relies on SQL and other tools.
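The first function can be pictured with a small sketch. It is not the Notes implementation: the documents, the field names, and the build_view_index helper below are all hypothetical, and a plain sorted list stands in for the B-tree. It only shows the idea of selecting schema-free documents and turning them into table-like entries.

```python
# Hypothetical database: free-form documents, all at the database level.
database = [
    {"id": 1, "Form": "Person", "Name": "Li", "Job": "Engineer", "Country": "China"},
    {"id": 2, "Form": "Person", "Name": "Ann", "Country": "France"},  # no Job field
    {"id": 3, "Form": "Invoice", "Total": 120.0},
]


def build_view_index(docs, select, columns, sort_key):
    """Select documents, compute their column values, and sort the entries."""
    entries = []
    for doc in docs:
        if select(doc):  # plays the role of the view selection formula
            entries.append({"id": doc["id"], "columns": [col(doc) for col in columns]})
    entries.sort(key=sort_key)
    return entries


# A "people by name" view: two columns, one of them with a computed fallback.
people_by_name = build_view_index(
    database,
    select=lambda d: d.get("Form") == "Person",
    columns=[lambda d: d.get("Name", ""), lambda d: d.get("Job", "(none)")],
    sort_key=lambda e: e["columns"][0],
)

for entry in people_by_name:
    print(entry["id"], entry["columns"])
```

The invoice document never appears in the entries, and the person without a Job field still gets a column value, which is the table-like interface the paragraph above describes.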
The second function of the Notes view index is to make documents easy to find. The commonly used NotesView.GetAllDocumentsByKey and GetDocumentByKey essentially look up entries in the index and return the corresponding documents. Sorting, collapsing, expanding, and searching by typing the first characters in a view displayed by the client are also done through the index.
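As a rough illustration of a key lookup against a sorted index, here is a minimal sketch. It assumes a sorted Python list in place of the real B-tree, and the data and the get_all_ids_by_key function are hypothetical names, not the Notes API.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on a "country" column: (key, document id) pairs, kept sorted.
index = sorted([("China", 1), ("France", 2), ("China", 4), ("Japan", 3)])
keys = [key for key, _ in index]


def get_all_ids_by_key(key):
    """Return the ids of all entries whose first sorted column equals key."""
    lo = bisect_left(keys, key)   # first position where key could appear
    hi = bisect_right(keys, key)  # one past the last matching position
    return [doc_id for _, doc_id in index[lo:hi]]


print(get_all_ids_by_key("China"))
print(get_all_ids_by_key("Korea"))
```

Because the entries are already sorted, the lookup is a binary search rather than a scan of every document, which is why such calls need a sorted first column in the view.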
The cost of a Notes view index is the same as discussed for relational databases: it occupies storage space and consumes computing resources.
When updates happen
When is the index created? When is it updated? The discussion of the index's functions in the previous section shows that presenting a collection of documents and letting users find documents quickly both require the data in the index to be up to date, that is, consistent with the documents it is based on. In the previous article, we mentioned that there are in theory three options for updating an index: the first is to update it together with every document change (creation, modification, or deletion); the second is to delay until the index is read and then check whether an update is needed; the third is to update on a schedule. Option one ensures that the index is always consistent with the documents, but adds to the load on the database whenever a document changes. Option two also guarantees that the index is up to date every time it is read. Its advantage over option one is that the relevant documents may have been modified many times before the index is read: different users modify different documents, one user modifies a document several times, and so on. Option one requires an index update for every modification, whereas under option two all the modifications (if any) made between two reads of the index cost only one batch update, which reduces overhead. The disadvantage is that if the index is found to need updating when it is read, the wait is longer. Option three is really a supplement to option two: it is also a batch update, and it reduces the number of updates and the number of documents that have to be processed when the index is read.
From this analysis, option one gives users the fastest access to the index, at the cost of the most system resources. Options two and three defer the work, which saves system resources, but users sometimes have to put up with a delay. Relational databases basically use option one. Document databases make different choices: MongoDB uses option one, CouchDB uses option two, and Notes mixes options two and three.
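The trade-off between updating with every change and updating on read can be sketched as follows. This is a toy model with hypothetical class names (EagerIndex, LazyIndex), not how any of the databases mentioned implement it; it only counts how many index update runs each strategy performs.

```python
class EagerIndex:
    """Option one: update the index on every document change."""

    def __init__(self):
        self.entries = {}
        self.update_runs = 0

    def on_document_saved(self, doc_id, value):
        self.entries[doc_id] = value
        self.update_runs += 1  # one index update per modification

    def read(self):
        return dict(self.entries)


class LazyIndex:
    """Option two: remember changes and apply them in one batch on read."""

    def __init__(self):
        self.entries = {}
        self.pending = {}
        self.update_runs = 0

    def on_document_saved(self, doc_id, value):
        self.pending[doc_id] = value  # cheap: no index work yet

    def refresh(self):
        # Option three would call this from a timer instead of from read().
        if self.pending:
            self.entries.update(self.pending)
            self.pending.clear()
            self.update_runs += 1  # one batch update for all pending changes

    def read(self):
        self.refresh()  # the reader pays the waiting time here
        return dict(self.entries)


eager, lazy = EagerIndex(), LazyIndex()
for doc_id, value in [(1, "draft"), (1, "final"), (2, "new")]:
    eager.on_document_saved(doc_id, value)
    lazy.on_document_saved(doc_id, value)

print(eager.read() == lazy.read(), eager.update_runs, lazy.update_runs)
```

Both indexes return the same data when read, but the eager one has done three update runs and the lazy one only a single batch.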
The update process
Timed updates
Let's look first at timed updates of Notes views. These are carried out by two server tasks, Update and Updall. When each runs can be set in Notes.ini; by default Update runs continuously and Updall runs once a day at 2 a.m. Update maintains two queues, an immediate queue and a deferred queue. The queues hold requests to update the indexes of a database. Requests come from several sources. After the replicator finishes a replication, it sends a request to the queue if documents in a replica have changed. The mail router sends a request when it delivers a new message to a mailbox. Most often, a user creates, modifies, or deletes documents in a database, and a request is sent to the queue when the user exits the database. Requests from replication and user modifications go to the deferred queue, while a request such as one raised for a database whose full-text index is set to update immediately goes to the immediate queue. Update checks both queues every five seconds (this default can be changed with the Update_idle_time and Update_idle_time_ms settings in Notes.ini; the same applies to the settings named below). Requests in the immediate queue are processed at once. In the deferred queue, requests for the same database are compared, and all requests arriving within 15 minutes (Update_suppression_time) of the first one are ignored in order to reduce resource consumption. In other words, modifications to a database within one 15-minute period trigger only one index update by Update.
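The suppression of deferred requests can be sketched as below. This is a simplified reading of the paragraph above, not the actual Update task: the coalesce_deferred function, the tuple layout, and the choice to treat a request arriving exactly 15 minutes after the first as a new one are all assumptions made for illustration.

```python
SUPPRESSION_SECONDS = 15 * 60  # stands in for Update_suppression_time


def coalesce_deferred(requests, suppression=SUPPRESSION_SECONDS):
    """Return the deferred requests that would actually trigger an update.

    requests: list of (arrival_seconds, database_path) tuples.
    A request is ignored if it arrives within the suppression window that
    was opened by an earlier kept request for the same database.
    """
    window_start = {}  # database path -> arrival time of the last kept request
    kept = []
    for arrival, db in sorted(requests):
        start = window_start.get(db)
        if start is None or arrival - start >= suppression:
            window_start[db] = arrival
            kept.append((arrival, db))
    return kept


requests = [
    (0, "mail.nsf"),
    (60, "mail.nsf"),    # within 15 minutes of the first: ignored
    (300, "crm.nsf"),    # different database: kept
    (899, "mail.nsf"),   # still inside the window: ignored
    (900, "mail.nsf"),   # window over: kept
]
print(coalesce_deferred(requests))
```

Five requests shrink to three update runs, and a busier database would see a proportionally larger saving.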
A request records only the path of the database; Update decides which views' indexes to update when it processes the request. Here Notes once again shows its habit of saving resources wherever it can. Views that have not been opened in the last seven days (update_access_frequency) are skipped first. Update then searches, based on the index's last update time and the view's selection formula, for documents that have changed recently and belong in the view, and does not update the index if there are fewer than 20 of them (update_note_minimum).
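The two filters can be sketched as follows. The dictionary layout, the views_to_update function, and the sample views are hypothetical; in the real task the count of changed documents comes from a search against the view's selection formula, which is reduced here to a precomputed number.

```python
from datetime import datetime, timedelta

ACCESS_FREQUENCY_DAYS = 7  # stands in for update_access_frequency
NOTE_MINIMUM = 20          # stands in for update_note_minimum


def views_to_update(views, now):
    """Return the names of views whose index would be updated."""
    due = []
    for view in views:
        # Filter 1: skip views nobody has opened recently.
        if now - view["last_opened"] > timedelta(days=ACCESS_FREQUENCY_DAYS):
            continue
        # Filter 2: skip views with too few changed documents to be worth it.
        if view["changed_docs"] < NOTE_MINIMUM:
            continue
        due.append(view["name"])
    return due


now = datetime(2024, 1, 31)
views = [
    {"name": "By Customer", "last_opened": datetime(2024, 1, 30), "changed_docs": 25},
    {"name": "Archive", "last_opened": datetime(2024, 1, 10), "changed_docs": 100},
    {"name": "By Date", "last_opened": datetime(2024, 1, 29), "changed_docs": 5},
]
print(views_to_update(views, now))
```

Only the first view passes both filters; the other two are left for the update that happens when a user opens them, or for Updall.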
Because the Updall task runs in the early hours, Notes can afford to be generous. It ignores the queues and examines every view in every database, updating those that need updating and rebuilding those that need rebuilding.
Update on open
Timed updates cannot guarantee that the data is up to date when the index is accessed; they can only reduce the amount of updating left to do at that moment. When a user accesses a view through the client interface or a program, the Notes API function NIFOpenCollection() is called to open the index (NIF stands for Notes Indexing Facility, the component of Notes responsible for indexing). If the index data is not up to date, NIFOpenCollection() calls NIFUpdateCollection() to update the index. This happens every time we open a view in the client (except in the special cases described below). If few documents have changed since the Update task last updated the view index, the process finishes quickly. If many documents have changed in the meantime, or the view's index does not exist at all, the wait is long. Since R7, Notes builds and updates indexes in a separate background thread, so the user interface does not freeze and the user can do other things.
To make views open faster, or to reduce the number of index updates as far as possible, a Notes view also has an option that controls whether its index is updated when the view is opened.
The options are:
1. Automatic, after first use.
2. Automatic.
3. Manual.
4. Automatic, at most every n hours.
Manual means that the index is not updated automatically when the view is opened; it has to be refreshed with F9. The three automatic options differ as follows. With option 1 or 4, if a view has never been opened and therefore has no index, Update and Updall do not create one when they run, whereas with option 2 they do. With option 4, if the index has already been updated within the last n hours, opening the view behaves as if the option were set to 3.
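The behavior of the four options, as described above, can be written down as two small decision functions. This is only a sketch of the rules as stated in this article: the option names are hypothetical strings rather than Notes identifiers, and treating a view that has never been updated as due for an update under option 4 is an assumption.

```python
AUTOMATIC_OPTIONS = {"auto_after_first_use", "auto", "auto_at_most_every_n_hours"}


def refresh_on_open(option, hours_since_update=None, max_hours=None):
    """Decide whether opening the view updates its index."""
    if option == "manual":
        return False  # the user has to press F9
    if option == "auto_at_most_every_n_hours":
        # Assumption: an index that was never updated counts as due.
        return hours_since_update is None or hours_since_update >= max_hours
    if option in AUTOMATIC_OPTIONS:
        return True
    raise ValueError(f"unknown option: {option}")


def task_builds_unopened_view(option):
    """Decide whether Update/Updall create an index for a never-opened view.

    Only the three automatic options are covered, because the text says
    nothing about manual views in this respect.
    """
    if option not in AUTOMATIC_OPTIONS:
        raise ValueError(f"not covered by the text: {option}")
    return option == "auto"


print(refresh_on_open("auto_at_most_every_n_hours", hours_since_update=2, max_hours=6))
print(task_builds_unopened_view("auto_after_first_use"))
```

With n set to 6 and an update two hours ago, opening the view does not touch the index, just as it would not for a manual view.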
The space that indexes occupy in a database is not negligible. If a database has many complex views (many columns, a lot of categorizing and sorting), the indexes can take up more space than the documents themselves. For this reason a view also has a discard-index setting: the index can be discarded when the database is closed, or after it has not been accessed for more than a given number of days. The Updall task is responsible for deleting indexes that meet the configured condition.
So, when we design the view, we want to ...
Each view contains three types of indexes:
1. The default index, sorted by NoteID.
2. An index sorted by a column.
3. An index of the parent-child relationships between documents.
Each additional column sort (including categorization) adds another index. Multilevel categorization requires more computation than the same number of sorts.
The preceding discussion has shown how hard Notes works to reduce the number of index updates, because updating is a computationally expensive operation. If we ignore those efforts when designing views, and build a big pile of rarely used views stuffed with sorts that users seldom need and categorizations that are not really necessary, we will not only overwhelm the Update task on the Domino server but also reap the bitter fruit of views that are slow to open.
As I mentioned before, one benefit of the separation of the view layer from the data layer in XPages is that a view can be used purely as a collection of data, while different presentation needs, such as appearance, columns, and further filtering of documents, can be handled with the XPages view control. This makes it possible to reduce the number of views, and with it the number of indexes, improving performance and maintainability.
Useful reference articles in the Notes help:
Notes Help: Refreshing view indexes
Administrator Help: Indexer tasks: Update and Updall