How SQL Server Architecture Works

Abstract: This article discusses in depth how the SQL Server storage engine works. It describes database engine enhancements, offers usage tips, and provides links to related information. A thorough understanding of the SQL Server engine helps database administrators (database system engineers) take full advantage of SQL Server when designing, building, or improving database systems. Although this article is written for database professionals, it can also be used for teaching or marketing purposes.

I. Introduction

This article describes the storage engine features added in Microsoft SQL Server 2000, offers tips for using them, and discusses how the storage engine works. Understanding how the storage engine works helps you get the best possible performance from SQL Server.

Today the focus is on highly scalable applications. Database design and implementation cycles keep shrinking, and databases evolve continuously as development requirements change and product usage grows. Meeting scalability, availability, and ease-of-use requirements demands a data storage engine that is both flexible and adaptable.

Different editions of SQL Server 2000 support systems of very different sizes, from small mobile systems on the Pocket PC up to multi-terabyte, high-availability transaction processing or decision support systems running on clustered Windows 2000 Datacenter Servers. All of these systems meet the flexibility, security, and reliability requirements of mission-critical business systems.

Because storage engine operations are intelligent and automated, you can deploy SQL Server 2000 applications in projects of any purpose and size. A highly refined architecture improves performance, availability, and scalability.

Availability

New algorithms for interacting with the physical files improve reliability and concurrency. These algorithms reduce day-to-day maintenance work and largely remove the need to run Database Console Commands (DBCC). DBCC is still available, however, and the new DBCC CHECK commands can run without interfering with online processing.

Scalability

The storage subsystem, which consists of the physical database files and their layout on disk, supports everything from small databases to very large ones. SQL Server now supports up to 64 GB of physical memory (RAM) and 32 processors.

Easy to use

Enhanced administration features help database administrators (DBAs) automate and centralize server management. Remote servers and applications also become easier to maintain, without DBAs having to visit each site. Server configurations managed by sophisticated algorithms respond dynamically to changing server usage patterns, freeing DBAs to concentrate on database management, optimization, and other tasks.

II. Storage Engine Enhancements

The relational database server of SQL Server 2000 has two main parts: the relational engine and the storage engine. The two engines work independently and interact through native data access interfaces such as OLE DB. The relational engine provides an interface into the storage engine, which consists of services for interacting with the underlying database storage components and features.

The main tasks of the storage engine include:
Providing features that improve the manageability of storage components
Managing the data buffers and all I/O to the physical files
Controlling concurrency and managing transactions, locking, and logging
Managing the files and physical pages used to store data
Recovering from system failures

The storage engine in SQL Server 2000 offers new features that are conceptually simple yet flexible in practice, while reducing the need for detailed capacity planning and performance tuning. SQL Server 2000 responds to its environment and adapts accurately and quickly to changes in database usage. This technical advance shifts the focus of database administration toward delivering data as a service. A SQL Server 2000 DBA can concentrate on designing a system that responds to data flow and data usage, rather than spending time tuning individual parameters.


The changes in SQL Server 2000 build on the architectural enhancements introduced in SQL Server 7.0, which laid the foundation for later improvements and innovations. A primary goal of the storage engine team was to reduce the time and effort spent regularly tuning servers. Because most tuning parameters depend on database usage, the engine now uses adaptive algorithms to adjust these settings dynamically to fit the database environment. Tuning parameters that once required constant adjustment and testing are now adjusted automatically. Manual tuning is still possible, but SQL Server 2000 does more of it for you; only a few SQL Server customers need to adjust tuning parameters by hand, and doing so calls for careful testing and supervision by an experienced database administrator.

The following table summarizes the main enhancements to the SQL Server 2000 storage engine. Each is described in more detail later in this article.

Function: Description and benefits

Application lock manager: If you need to control concurrent access to application-defined resources (such as forms), new stored procedures let you use the SQL Server lock manager to lock those resources.
Database Console Commands (DBCC): DBCC CHECK commands can run during online processing without blocking updates. New options verify the consistency of physical pages to detect hardware errors. In SQL Server 2000 Enterprise Edition, DBCC runs in parallel across multiple processors.
Database options: All database options can now be modified with ALTER DATABASE, which simplifies management.
Differential backups: In SQL Server 2000, differential backups are faster because the server now tracks which data extents have changed.
Dynamic tuning: Dynamic adaptive algorithms let the server automatically adjust configuration settings that were formerly static. Administrative controls for system-wide resources are still available, but using them is rarely necessary. Manually set parameters also adapt dynamically within their constraints.
In-row text: In tables that contain small, frequently used text columns, smaller text values can be stored on the same page as the standard data row instead of on separate text-value pages. In tables with frequently accessed text data, this can reduce disk I/O.
Index creation: In the Enterprise Edition, index creation automatically uses all the processors configured for parallel processing, reducing the time needed to build an index; on an eight-processor server, for example, build time can drop to about one sixth. Index creation can also take advantage of available memory and tempdb resources.
Index read-ahead: Index read-ahead has been enhanced, improving index scan performance.
Index reorganization: DBCC SHOWCONTIG has been improved to report detailed information about index fragmentation. A new DBCC command, INDEXDEFRAG, reorganizes index pages online without disrupting database service and without risk to database consistency or recovery.
Descending index keys: Each key column in an index can be specified as either ascending or descending.
KILL command: This command now reports completion progress. If the command is waiting on another process (such as a rollback), you can view how far it has progressed. The command has also been improved so it can stop Microsoft Distributed Transaction Coordinator (MS DTC) transactions that are not associated with a specific session.
Large memory support: Windows 2000 technology improves the performance of enterprise systems that use large amounts of memory. Through the Windows 2000 AWE extensions, SQL Server 2000 supports up to 64 GB of physical memory (RAM).
Lock manager: The improved lock manager detects deadlocks across additional resources, such as threads and memory. Concurrency improvements also make deadlocks less frequent, further enhancing the scalability of SQL Server 2000.
Logical log marks: Transact-SQL commands can place bookmarks in the log, so the database can be restored to the point in time a bookmark identifies. This feature also lets you synchronize the restore of multiple databases used by the same application.
Optimized read-ahead I/O: SQL Server 2000 issues multiple serial read-ahead operations at once for each file involved in a scan. The query optimizer uses serial read-ahead I/O when scanning tables and indexes to improve performance.
Password-protected backups: Passwords can protect backup media and individual backup sets, preventing unauthorized users from restoring a backup and gaining access to the database.
Recovery models: Recovery models let you choose the level of logging for a database, making transaction log management more flexible. The recovery model can be changed online to match how the database is used at different times of day.
Shared table scans: In the Enterprise Edition, multiple scans of a table can join scans already in progress, reducing actual disk I/O.
Shrinking logs: Log shrinking now runs immediately in more situations. Even when the log cannot shrink immediately, SQL Server reports progress and indicates what must happen before the shrink can proceed or complete.
Snapshot backups: Support for snapshot backups provided by third-party vendors has been enhanced. Snapshot backups use storage technology to back up or restore an entire database in seconds. These backups can now be combined with conventional transaction log and differential backups to give OLTP databases complete protection. The feature is especially valuable for medium and large databases, where availability is critical.
Space-saving empty tables and indexes: SQL Server 2000 allocates no disk pages for empty tables and indexes; SQL Server 7.0 allocated up to three disk pages for them.
Top n sorting: This new capability optimizes retrieval of the top n values (for example, SELECT TOP 5 * FROM tablename).
XLOCK: SQL Server 2000 provides this new Transact-SQL lock hint, which explicitly requests an exclusive, transaction-held page or table lock.

SQL Server 2000 adds many features that make interacting with data more efficient and management more flexible. The following sections describe these enhancements in detail and provide related tips.

III. Interacting with Data

In SQL Server 2000, the storage engine is enhanced to provide better scalability and performance when interacting with data. Understanding these enhancements helps you use SQL Server more effectively.

Data interaction begins with a query, whether it is issued through a user interface or by an automated task. The data request is passed to the relational engine, which interacts with the storage engine to obtain the data and pass it back to the user. From the perspective of users and DBAs, the work of the storage and relational engines is indistinguishable.

More effective Data Reading

Data moves between the server and the user through a series of transactions. After the application or user starts a task, the database passes it to the query processor for processing and returns the final result. The query processor receives, interprets, and executes SQL statements to complete the task.

For example, when a user session issues a SELECT statement, the following steps are performed:

The relational engine compiles and optimizes the statement into an execution plan (the series of steps required to obtain the data) and then runs the plan. The steps include accessing tables and indexes through the storage engine.

The relational engine interprets the execution plan and calls the storage engine to collect the required information.

The relational engine combines all the data returned by the storage engine into the final result set, and then returns the result set to the user.

Two improvements speed up this process. In SQL Server 2000, the relational engine pushes qualifying query predicates down to the storage engine so that they are applied as early as possible, making the exchange of data between the storage and relational engines more efficient. This change significantly improves the performance of qualifying queries.

Enhanced Top n Functions

The other improvement is in how the storage engine selects the top n records of a result set. In SQL Server 2000, a new top n engine analyzes the best path of operation for statements such as the following:
SELECT TOP 5 * FROM orders ORDER BY date_ordered DESC
In this example, if the entire table must be scanned, the engine analyzes the data and tracks only the current top n values in a cache. This greatly improves the performance of such SELECT statements, because only the top n values, not the whole table, need to be sorted.

Shared Scan

In SQL Server 2000 Enterprise Edition, two or more queries can share a table scan that is already in progress, which improves performance in large SQL Server 2000 databases. For example, when a query performs an unordered scan of a large table, pages are flushed from the cache to make room for the incoming data. If another query is already running, a second scan of the same table has to re-read those pages from disk. In an environment with frequent table scans, two queries chasing the same data pages in this way can cause disk thrashing.

Figure 1: Shared scan results

This optimized process reduces the disk I/O that such access patterns cause. The first unordered scan of a table reads the data from disk; subsequent unordered scans of the same table do not touch the disk, but simply use the pages already in memory (see Figure 1). When multiple scans of the same table run at once, this synchronization can improve performance by as much as eight times. The improvement is most pronounced in large decision support queries, where the table is far larger than the cache.

When no more efficient execution plan exists for a query, the storage engine uses the shared scan capability to help the query along, which improves performance for frequently read large tables. The capability is invoked when the query processor determines that the best execution plan includes a table scan. Although you can force a shared scan with query or index hints, forcing a table scan does not improve performance; a well-chosen index doing the same job will be at least as good, and often better.

Concurrency

To keep transactions consistent while many users interact with the data, the storage engine locks resources, managing dependencies on rows, pages, keys, key ranges, indexes, tables, and databases. By locking a resource while it is being changed, the engine prevents multiple users from changing the same data at the same time. Locks in SQL Server are applied dynamically at different levels of granularity, choosing the least restrictive locks the transaction requires.

In SQL Server 2000, concurrency improvements further reduce deadlocks and avoid unnecessary resource locks. For example, the enhanced lock manager understands other contended resources, such as threads and memory. This new capability helps database administrators identify a wider range of design or hardware limitations.

 
The lock manager's new Transact-SQL interface supports custom lock logic in program code. A lock required by business logic can be created by calling the sp_getapplock stored procedure in a Transact-SQL batch, which lets you specify the application-defined resource being locked (for example, a form rather than data rows), the lock mode, the timeout value, and whether the lock's scope is the transaction or the session. Locks created through the application lock manager participate in SQL Server's normal lock management, just as if the storage engine had created them, so you need not worry that an application lock will remain open after the calling transaction ends.
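
As an illustration, here is a minimal sketch of application locking with sp_getapplock; the resource name CustomerForm42 and the timeout value are hypothetical:

-- Acquire an application-defined lock inside a transaction.
BEGIN TRANSACTION

DECLARE @result int
EXEC @result = sp_getapplock
    @Resource    = 'CustomerForm42',   -- application-defined resource (hypothetical)
    @LockMode    = 'Exclusive',        -- lock mode to acquire
    @LockOwner   = 'Transaction',      -- scope: released when the transaction ends
    @LockTimeout = 5000                -- wait up to 5 seconds

IF @result >= 0
BEGIN
    -- 0 = granted immediately, 1 = granted after waiting: do the protected work here.
    COMMIT TRANSACTION                 -- committing releases the transaction-scoped lock
END
ELSE
BEGIN
    -- Negative return codes indicate timeout, cancellation, or deadlock.
    ROLLBACK TRANSACTION
END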

In SQL Server 2000, the process that acquires locks considers whether all the data on a page has been committed. For example, running a SELECT against a table whose data has not changed recently (such as a table in the pubs database) generates no locks, because no active transactions have recently updated the table. The storage engine implements this by comparing the log sequence number on the data page with the currently active transactions. In databases where most of the data is older than the oldest active transaction, this feature dramatically reduces locking and thereby improves performance.

While locks protect data within transactions, another mechanism, latching, controls access to physical pages. A latch is a lightweight, short-term synchronization object that protects actions that do not need to be locked for the life of a transaction. When the storage engine scans a page, it latches the page, reads the row, returns the row to the relational engine, and then unlatches the page so that other processes can reach the same data. Through a process called lazy latching, the storage engine optimizes access to data pages: it releases a page latch only when another active process requests the page. If no other active process requests the same data page, a single latch remains in effect for the entire operation on the page.

To improve the system's concurrency, focus on the design of the database system and its associated code objects. SQL Server 2000 supports terabytes of data storage and scales nearly linearly. The database administrator's job is to manage the database life cycle, the cycle of design and optimization across all database components (from the code down to the data storage on disk), to ensure that the design keeps meeting the service level agreement.

IV. Tables and Indexes

The physical data structure has also been improved to improve the flexibility of design and maintenance.

As a table or index grows, SQL Server allocates new data pages in groups of eight, called extents. Although columns of type text, ntext, or image can be stored on separate pages, a data row cannot span pages, so a row can hold at most 8 KB of data. The keys of a table with a clustered index are stored on disk in sorted order. A heap, a table without a clustered index, is unordered: records are stored in the order they are inserted.

SQL Server 2000 supports indexed views, often called materialized views in other database products. Creating a clustered index on a view converts the view from a derived object into a base object stored in the database, with the same structure as a table that has a clustered index. An indexed view can store precomputed values or the results of complex joins, provided the maintenance cost does not outweigh the performance gains. In SQL Server 2000 Enterprise Edition, the query processor automatically uses an indexed view whenever it can improve the query plan. Indexed views can speed up queries over data that changes rarely but is frequently used in complex join or computation queries.
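
As an illustration, the following is a minimal sketch of an indexed view; the dbo.Sales table, its columns, and the view name are hypothetical. SCHEMABINDING and a unique clustered index are what materialize the view (certain SET options, such as ANSI_NULLS ON, must also be in effect when the view and index are created):

-- Define the view with schema binding over the (hypothetical) base table.
CREATE VIEW dbo.SalesByProduct
WITH SCHEMABINDING
AS
SELECT product_id,
       COUNT_BIG(*)  AS order_count,    -- COUNT_BIG(*) is required in a grouped indexed view
       SUM(quantity) AS total_quantity  -- assumes quantity is declared NOT NULL
FROM dbo.Sales
GROUP BY product_id
GO

-- The unique clustered index materializes the view's result set.
CREATE UNIQUE CLUSTERED INDEX IX_SalesByProduct
ON dbo.SalesByProduct (product_id)
GO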

In-row text

In-row text stores small text values on the same page as the rest of the row. For example, if a table has a text column whose values are usually small enough to fit on the page alongside the rest of the row, you can set a threshold on the text column. The threshold determines the maximum length of text that is stored on the data page rather than on a separate text page. If most of the values fit on the data page and only the occasional large value requires a separate text page, performance improves.

To decide when to use this feature, weigh storage density (the number of rows stored per data page) against the I/O gains. For example, suppose a table's text column stores comments, 20 percent of its values are large, and the other 80 percent are under 100 bytes. At first glance this looks like a case for in-row text; however, in-row text should be considered only when the column is accessed frequently. If the table is accessed often but the comment column is read only during occasional searches, in-row text is not the best choice: fewer rows fit on each page, storage density drops, the table occupies more pages, and table scan response time grows. In-row text works best, then, for a frequently accessed text column in which many values are small enough (under 8 KB) to be stored in the row.
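
In-row text is enabled per table with the sp_tableoption stored procedure. A minimal sketch follows; the Comments table name and the 256-byte limit are hypothetical:

-- Store text values of up to 256 bytes on the row's data page.
EXEC sp_tableoption 'Comments', 'text in row', '256'

-- Turn the option back off; text values return to separate text pages.
EXEC sp_tableoption 'Comments', 'text in row', 'OFF'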

New data types

SQL Server 2000 introduces three new data types. bigint is an 8-byte integer type. sql_variant stores data values of different data types. The third, table, can be used to optimize performance: table variables use tempdb more efficiently and are faster than temporary tables. Like other variables, their scope is the batch in which they are declared. Table variables behave much like temporary tables, but they perform better than temporary tables or cursors and make more effective use of server resources. In general, when writing code that interacts with the database, consider the best use of the resources available on the server.
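
As a sketch, the following declares and fills a table variable; the @RecentOrders name and the orders table columns are hypothetical:

-- Declare a table variable; its scope is the current batch, so no DROP is needed.
DECLARE @RecentOrders TABLE
(
    order_id     int      NOT NULL,
    date_ordered datetime NOT NULL
)

-- Fill and read it like an ordinary table.
INSERT INTO @RecentOrders (order_id, date_ordered)
SELECT TOP 10 order_id, date_ordered
FROM orders
ORDER BY date_ordered DESC

SELECT * FROM @RecentOrders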

Indexes

Indexes optimize data access. Because the right set of indexes depends on how the system is used, incorrect indexing is a primary cause of slow databases. Standard index maintenance should include periodically reviewing the current index scheme and adding or dropping indexes to match current system usage.

Several new features in SQL Server 2000 make index maintenance more effective and index management easier. These enhancements reduce disk I/O and improve index scan performance, which is particularly useful when range scans can be served by secondary indexes.

 
Creating indexes

When an index is created, the storage engine samples the rows and calculates the best way to build the index with the server's resources. Options let you control how the index is built, so you can decide how system resources are allocated. Using these options together with your knowledge of the specific database system, you can balance the resources most critical to overall system performance and minimize the impact of index creation on transaction processing.

Resource, command, and option: Description

Memory (sp_configure advanced option 'index create memory'): Specifies the total amount of memory that can be used to create an index.
tempdb (CREATE INDEX option SORT_IN_TEMPDB): Allocates disk space from tempdb for the sort performed during index creation. If tempdb is on a separate set of disks, this gives the build higher I/O bandwidth; if the database's free space is fragmented, it can also lay out the index pages more contiguously on disk.
CPU (sp_configure advanced option 'max degree of parallelism'): Limits the number of CPUs that can be used in parallel operations (server-wide).
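
A minimal sketch of these controls follows; the memory amount, degree of parallelism, and table and index names are hypothetical:

-- Reserve memory (in KB) for index builds and cap parallelism (advanced options).
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'index create memory', 4096
EXEC sp_configure 'max degree of parallelism', 4
RECONFIGURE

-- Perform the index-build sort in tempdb.
CREATE NONCLUSTERED INDEX IX_orders_date
ON orders (date_ordered)
WITH SORT_IN_TEMPDB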

Another scalability feature for large systems is parallel index creation, available in SQL Server 2000 Enterprise Edition. The process is invoked automatically by an ordinary CREATE INDEX statement: the storage engine calculates the data requirements and creates separate threads, each of which builds a portion of the index.

Figure 2: Parallel index creation

Index creation can also take advantage of shared table scans, further optimizing the process.

Defragmenting indexes

SQL Server 2000 supports online index defragmentation, a major improvement over earlier versions. Online defragmentation has little impact on transaction throughput and can be stopped and restarted at any time without losing the work completed so far. Defragmentation proceeds in small increments and is fully recoverable.

As data is inserted, deleted, and updated in a table, the pages of clustered and nonclustered indexes can become fragmented, reducing the efficiency of range queries. Defragmenting indexes regularly is therefore worthwhile. You can analyze and report fragmentation with DBCC SHOWCONTIG, which has been improved in SQL Server 2000.

If you confirm that an index has become fragmented, you can reorganize it with the DBCC INDEXDEFRAG command. This command orders the pages by logical key, compacts the free space, and moves rows within the existing extents to honor the fill factor setting. Denser pages mean fewer pages to read during scans, which improves read performance. If an index is maintained regularly and its layout has not deteriorated completely, running DBCC INDEXDEFRAG costs far less than rebuilding the index.
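
A minimal sketch of this maintenance follows; the database, table, and index names are hypothetical:

-- Report fragmentation for one index.
DBCC SHOWCONTIG (orders, IX_orders_date)

-- Reorganize the index online if fragmentation is high.
DBCC INDEXDEFRAG (mydb, orders, IX_orders_date)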

DBCC INDEXDEFRAG is one of many long-running administrative operations that internally use short transactions. These short transactions maximize concurrency on the server, allow the operation to be stopped without losing its work, and are fully logged so they can be recovered after a failure.

V. Logging and Recovery

A transaction log is a stream of records documenting the changes made to a database from the moment it was created to the current point in time. Every logged operation creates a log record. Log records generated by a transaction are written to disk when the transaction commits. By contrast, the data pages the transaction modified are not written to disk immediately; they are held first in the SQL Server buffer cache and written out later. Deferring the data writes gets the most out of repeated accesses to the pages and avoids disrupting scans. Forcing the log to disk at commit time guarantees that completed work is never lost if the server goes down.

Recovery guarantees that a database is transactionally consistent before it is brought online. A database is transactionally consistent when all committed work is present and any uncommitted work has been undone; the log always defines the correct view of the database. In short, recovery is the process of making the data consistent with the transaction log as of a given point in time.

Recovery runs automatically when SQL Server starts, when a database is attached, and as the final step of restoring a database from backup. Recovery performed at SQL Server startup is called restart recovery or startup recovery. Recovery from backups is usually needed after a disk failure; that form is called media recovery.

Restart recovery runs automatically and recovers to the most recent point in time, rolling back every transaction that was incomplete when the instance last shut down. When recovering from backups, the DBA can choose to restore to an earlier point in time, subject to certain restrictions. In either case, recovery works toward that target point in time.

 
Recovery runs in two phases:

Redo: roll forward all changes recorded in the transaction log up to the target point in time.

Undo: roll back everything done by transactions that were still active at the point where recovery stopped.

SQL Server uses checkpoints to speed up restart recovery. A checkpoint forces all modified data pages currently in the buffer cache to disk, creating a starting point for the redo phase of recovery. Because checkpoints are expensive, SQL Server manages them automatically, aiming to minimize restart time while preserving performance as much as possible.

SQL Server 2000 requires that writes reported as successful be durably stored on disk. If you use disk storage with a write cache, ask your storage vendor whether the cache is fault tolerant, meaning that it survives power failures and operator actions. A cache that is not fault tolerant should not be used.

Logical log marks

SQL Server 7.0 could restore a database to any specified point in time, which makes recovering from a hardware failure fairly simple. But a database can also be threatened by the entry of invalid data or by user actions that destroy valid data, and then you must determine when the problem began. In SQL Server 7.0, the only way to find that point was to restore log backups against a copy of the database until the problem reappeared, and then restore the production database to the point in time just before the problem.

In SQL Server 2000, you can mark transactions in the log. If you later need to recover, you can refer to the mark used at execution time rather than specifying a time. To do so, issue the BEGIN TRANSACTION statement with the WITH MARK [description] clause. The marks are stored in msdb. Recovery can include the marked transaction or stop just before it. For example, if a batch process changes many records, you can use a mark to guarantee that, should the process run against the wrong environment, the data can be rolled back to the point where the command began.
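
As a sketch, a marked transaction looks like the following; the transaction name and description are hypothetical:

-- The transaction name identifies the mark; the description is optional.
BEGIN TRANSACTION nightly_update WITH MARK 'Before nightly batch update'

-- ... the batch's data modifications go here ...

COMMIT TRANSACTION nightly_update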

Mark names need not be unique, so to designate the intended transaction you also supply a datetime value. The syntax is:
RESTORE LOG database_name FROM backup_device
WITH [STOPATMARK | STOPBEFOREMARK] = 'mark_name' AFTER datetime
You can also place marks in distributed transactions (distributed marks) to restore multiple related databases to a transactionally consistent point. The related databases can live on the same SQL Server instance or on different instances. Distributed marks can be set for a group of databases on a regular schedule (for example, every five minutes). If the transaction log of one database is damaged, the database must be restored to an earlier point in time, and the distributed mark supplies that point. With distributed marks, you do not have to work out a common backup time when backing up multiple related databases.

Shrinking transaction logs

In SQL Server 7.0, a log shrink could not run immediately; it was deferred until the next log backup or truncation, which worried many SQL Server 7.0 customers. SQL Server 2000 shrinks the log immediately where possible and otherwise indicates whether further shrinking will be possible after a log backup, in which case you can run the shrink command again once the log backup completes.

How large the log must be depends on the current recovery model and on the application design. If you find yourself shrinking the log on a schedule, investigate what is filling it; the cause of the log growth deserves attention, rather than simply maintaining the log with the shrink command.
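
A minimal sketch of a manual shrink follows; the database name, logical file name, target size, and backup path are hypothetical:

-- Back up the log first so that inactive log space can be reused.
BACKUP LOG mydb TO DISK = 'D:\backup\mydb_log.bak'

-- Shrink the log file to roughly 100 MB.
DBCC SHRINKFILE (mydb_log, 100)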

Recovery models

The recovery models added in SQL Server 2000 make it easy to put a data protection plan in place. Each model trades off performance, log space requirements, and protection against media (disk) failure. There are three models: Simple Recovery, Full Recovery, and Bulk-Logged Recovery.

When choosing a recovery model, consider the database's usage and availability requirements, and let the choice guide the appropriate backup and restore procedures. The recovery models apply only to media recovery, that is, recovery from backups. Restart recovery always recovers all committed work.

Switching between recovery models is easy. For example, a large database can use Full Recovery during the day and Bulk-Logged Recovery at night or while loading data with bulk inserts and rebuilding indexes. You can also switch to Bulk-Logged just for a data load, switch back to Full, back up the transaction log, and still restore to the point of the model switch without taking a full database backup. This makes handling bulk data more efficient; all that is required is backing up the preceding transaction logs.

To change the recovery model, use the following syntax:
ALTER DATABASE database_name SET RECOVERY { FULL | BULK_LOGGED | SIMPLE }
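
As a sketch, a nightly load might switch models like this; the database name and backup path are hypothetical:

-- Use bulk-logged recovery for the load, then return to full recovery.
ALTER DATABASE mydb SET RECOVERY BULK_LOGGED
-- ... run BULK INSERT, SELECT INTO, or CREATE INDEX here ...
ALTER DATABASE mydb SET RECOVERY FULL

-- Back up the log so later restores can cross the model switch.
BACKUP LOG mydb TO DISK = 'D:\backup\mydb_log.bak'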

Simple Recovery model

The Simple Recovery model usually requires the least log space, but it carries the greatest potential loss if data or log files are damaged. Only the events required for basic recovery are logged. Under Simple Recovery, only full and differential database backups are available, so after a failure you must redo all the work committed since the most recent backup. This model is the easiest to administer, but it is not suitable for mission-critical applications, which generally cannot tolerate the loss of committed work.

This model is similar to the truncate log on checkpoint option in SQL Server 7.0 and earlier versions.

 
Full Recovery model

Under the Full Recovery model, everything is logged. Full Recovery provides complete protection against the loss of work from a damaged data file. If the transaction log itself is damaged, the work committed since the most recent log backup is lost and must be redone by hand.

Even under Full Recovery, it is best to keep the transaction log on fault-tolerant disks to guard against data loss. The Full Recovery model allows restoring to a specified point in time.

Bulk-Logged Recovery model

The Bulk-Logged Recovery model provides the highest performance for bulk operations, which consume less log space under this model than under Full Recovery. For example, the allocation of a new page is logged, but the data inserted onto the page is not. In SQL Server 2000, bulk operations consist of bulk loads (BCP and BULK INSERT, including when they run inside DTS packages), SELECT INTO, CREATE INDEX, WRITETEXT, and UPDATETEXT.

Compared with Full Recovery, Bulk-Logged Recovery reduces the logging of bulk operations. Bear in mind that if the log is damaged, or if bulk operations have run since the most recent log backup, the changes made since that last log backup are lost.

This model does not support point-in-time recovery, but it does allow recovery to the end of a transaction log backup containing bulk changes. Transaction log backups taken under the Bulk-Logged model contain the extents modified by the bulk operations. This improves support for log shipping, because you need not worry that a bulk operation will invalidate a backup. SQL Server maintains a map that tracks the modified data extents, which lets it identify the changes efficiently.

Improved Backup Functions

In addition to introducing recovery models to simplify routine data protection, SQL Server 2000 improves its snapshot technology, differential backups, and backup security.

The transaction log backup chain is never broken. In SQL Server 7.0, some operations, such as adding a file to the database, broke the log chain and required a subsequent full database backup.

Backup operations do not conflict with applications or other administrative operations. For example, backups can run at the same time as bulk operations such as index creation and bulk loading.

Logs and file backups can be performed simultaneously.

No matter what activities the system is performing, SQL Server 2000 provides excellent support for unattended backup operations.

SQL Server supports snapshot backup and restore technologies in conjunction with independent hardware and software vendors. Snapshot backups minimize or eliminate the use of server resources during the backup. This is especially valuable for medium and large databases, where availability is critical. The main advantages of this technology are:

A backup can be created in a very short time (usually seconds) with essentially no impact on the server.

You can use disk backup to quickly restore the database.

Another host can create backups without affecting the production system.

You can immediately create copies of production databases for reporting or testing purposes.

Snapshot backup and restore are implemented in cooperation with third-party hardware and software vendors, who use features that SQL Server 2000 provides for the purpose. The backup technology typically creates an instantaneous copy of the data being backed up, usually by splitting a mirrored set of disks. At restore time the original data is available again immediately, and the underlying disks resynchronize in the background, making restores nearly instantaneous.

The running time of a differential database backup is proportional to the amount of data changed since the last full backup; the less data has changed, the faster the backup. SQL Server 2000 uses a map to track the data extents changed since the most recent database or file backup, so it can locate them efficiently. SQL Server 2000 also supports differential file backups.

A differential backup still captures all the changes made since the last full backup, and restoring works the same way. But these backups are very fast when only a small amount of data has changed, particularly for very large databases with small change sets, because only the changed information is recorded.

For security, you can protect backup media and backup sets with passwords. This prevents unauthorized users from adding data to a backup or restoring the database.

VI. Enhanced Management Functions

In SQL Server 2000, several management functions of the storage engine have been enhanced.

Database Verification

DBCC provides various management capabilities, including the CHECK commands for verifying database consistency.

Experience with SQL Server 7.0 and SQL Server 2000 shows that database inconsistencies are caused by hardware problems that the database engine or application may not detect during normal operation. This is especially likely with rarely accessed data. To address it, SQL Server 2000 introduces a check mode, PHYSICAL_ONLY, that detects most problems caused by hardware. It runs very fast, at roughly disk scan speed, and consumes few resources.

Thanks to the storage engine infrastructure improvements begun in SQL Server 7.0, database verification is no longer required as part of regular maintenance. Microsoft does still consider the verification tools an important part of managing critical data, and suggests:

Run PHYSICAL_ONLY checks occasionally, depending on your confidence in the underlying hardware (especially the disk subsystem).

 
Run a complete database check at critical times, such as a hardware or software upgrade, or whenever a problem is suspected.

Microsoft does not recommend performing a full database check during regular maintenance.

SQL Server 2000 has also made the following major improvements in database verification:

By default, checks run online. An online check has only a modest impact on the transaction workload; the impact varies with system load, hardware configuration, and the speed of tempdb. Microsoft's tests put the impact at 15 to 20 percent for a medium OLTP workload (50 percent CPU usage). The TABLOCK option forces the check to take a shared table lock, which makes the check run faster but blocks updates.

Check operations run in parallel on symmetric multiprocessing (SMP) computers, limited by the maximum degree of parallelism configured for the SQL Server instance.

The SQL Server 2000 check commands continue to support the repair features introduced in SQL Server 7.0. In some cases, offline repair can substitute for restoring from backup.
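
A minimal sketch of the checks described above follows; the database name mydb is hypothetical:

-- Fast, hardware-oriented consistency check for routine use.
DBCC CHECKDB ('mydb') WITH PHYSICAL_ONLY

-- Complete logical and physical check for critical moments.
DBCC CHECKDB ('mydb')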

Database State Control

SQL Server 2000 improves the ALTER DATABASE statement, giving Transact-SQL more control over database states. All database options can now be modified with the ALTER DATABASE command; in future versions, sp_dboption and DATABASEPROPERTY() will no longer be updated. The sp_helpdb procedure and the DatabasePropertyEx() function provide further information about database states.

The following table lists the database state options.

Option type: Available settings

User access: SINGLE_USER, RESTRICTED_USER, MULTI_USER
Availability: ONLINE, OFFLINE
Updatability: READ_ONLY, READ_WRITE

SQL Server also sets the following states according to conditions in the database: restoring, recovering, and suspect. Database options can be set in several ways: the SET clause of the ALTER DATABASE statement, the sp_dboption system stored procedure, or, in some cases, SQL Server Enterprise Manager.

After a database changes state, the session that changed the state remains connected, while sessions inconsistent with the new state can be terminated and their transactions rolled back. The termination options are:

Terminate now

Terminate after the specified time

Allow normal completion of active processes

Check for activity first, and skip the state change if active user sessions are found.

The following are two examples of syntax:

ALTER DATABASE accting SET READ_ONLY WITH ROLLBACK IMMEDIATE
ALTER DATABASE accting SET SINGLE_USER WITH ROLLBACK AFTER 60 SECONDS

System process ID and unit of work

Another management improvement is to the KILL command, which stops a process. The improved KILL command reports status; to check the status of a KILL, run:

KILL SPID WITH STATUSONLY

If you try to stop a system process ID (SPID) that is already being stopped by another KILL command, the same status information is returned.

In SQL Server 2000, an MS DTC transaction can exist without an associated connection or SPID, so the connection can be used by other processes before the transaction, or unit of work, completes. When the MS DTC transaction manager sends a message declaring the work finished, the transaction can be committed or rolled back. The transaction is identified by its unit of work (UOW), the transaction identifier that MS DTC assigns; a UOW has no SPID.
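
As a sketch, KILL accepts either form; the SPID value and the UOW GUID here are hypothetical:

-- Report the rollback progress of a session being killed.
KILL 53 WITH STATUSONLY

-- Stop an orphaned MS DTC transaction by its unit of work identifier.
KILL 'D5499C66-E398-45CA-BF7E-DC9C194B48CF'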

 
Dynamic Optimization

In SQL Server 2000, usage-based performance optimization is managed dynamically, without manual tuning. Static parameters have been removed, though some resource controls remain (for example, the maximum memory available to SQL Server). This approach is more accurate and faster than manual calculation from averages and estimates, and it lets you concentrate on database design and management. Traditional database systems require substantial manual management and tuning: to optimize for actual usage, the DBA must monitor the system and continuously record large amounts of statistics in order to choose the static settings that give the best performance, then re-evaluate the system to judge the effect of the new settings, and then begin the tuning cycle all over again.

SQL Server 2000 introduces dynamic algorithms into the storage engine that actively monitor server usage and adjust settings internally. Dynamic feedback and analysis keep the settings within 10 percent of the absolute optimum (see Figure 3), giving the system better performance and adaptability.

VII. Data Storage Components

SQL Server 2000 works with Windows 2000 to balance the workload across all available CPUs. If a SQL Server instance runs without other applications competing for the same resources, leave the processor settings at their defaults so that all processors can be used. SQL Server can use parallel processing on multiple processors for queries, index creation, DBCC, and other operations. SQL Server 2000 Standard Edition supports up to 4 processors and 2 GB of physical memory (RAM); the Enterprise Edition raises that to 32 processors and 64 GB of physical memory (RAM).

The main source of memory for a SQL Server instance is called its memory pool. Almost all data structures that use memory in a SQL Server instance are allocated from the memory pool. Examples of objects allocated from the memory pool include the buffer cache, which holds recently read data, and the procedure cache, which holds recent execution plans.

Allocation within the memory pool is highly dynamic. To optimize performance, SQL Server constantly adjusts how the memory pool is divided among its different uses. For example, when fewer execution plans need to be stored, the memory pool is rebalanced by giving the data cache more of the available memory.

SQL Server 2000 uses memory aggressively to reduce disk I/O. It uses the buffer cache to hold recently referenced data in physical memory (RAM) so that the data can be reused. A potential way to reduce disk I/O and speed up the database system is therefore to increase the physical memory (RAM) available to SQL Server.

Normally the memory settings require no adjustment, but they can be controlled in some cases. Memory needs particular attention when you run multiple instances of SQL Server on the same server, especially in a failover cluster. You should also monitor memory use if other applications run on the server alongside SQL Server.
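
In such cases, an instance's memory can be capped with sp_configure; a minimal sketch follows, and the 4096 MB ceiling is a hypothetical value:

-- Cap the memory pool of this instance (advanced option, value in MB).
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max server memory', 4096
RECONFIGURE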

 
SQL Server 2000 uses the Address Windowing Extensions (AWE) of Windows 2000 to address physical memory (RAM) beyond 3 GB (see Figure 4). SQL Server 2000 Enterprise Edition can use as much memory as Windows 2000 Advanced Server or Windows 2000 Datacenter Server allows.

Files, file groups, and disks

SQL Server stores data and logs in disk files. In a basic installation, the data and log files are created in the default location specified in the server configuration. For the best performance and manageability, however, you can apply a few basic principles:
Spread data across as many disks, channels, and controllers as possible.

In general, the more disks you have (regardless of their individual size) and the faster the controllers and channels, the faster the storage engine can read and write data. The heavier the system usage, the more important it is to separate the data files from the log files and place them on different physical drives. In addition, because tempdb usage has changed, tempdb should be placed on a large, fast disk set, for example, striped with the data files or on its own set of disks.

Use filegroups to make enterprise databases easier to manage.

Every database begins with a default filegroup. Because SQL Server 2000 works well without additional filegroups, many systems never need user-defined filegroups. As a system grows, however, additional filegroups offer greater manageability, in the hands of a capable DBA who implements and maintains them.

In SQL Server 2000, if a particular filegroup in a database is set to read-only, the data in that filegroup cannot be changed, but catalog information such as permissions can still be managed.

Note: In SQL Server 2000, the database engine's asynchronous I/O is managed dynamically and is unaffected by the number of files or filegroups in use, just as in SQL Server 7.0.

When implementing or optimizing a database design, the database administrator (database system engineer) needs to consider the configuration of the database storage components: in particular, the layout of the physical and logical disks and the arrangement of the database files across those disks.

VIII. Summary

With greater flexibility and control over performance, database administrators can combine their knowledge of the technology with their experience of the database, concentrating on the management of database code, design, and storage components as the best way to run a database system. The SQL Server 2000 database engine provides the scalability and flexibility needed for a wide range of database implementations.
