Exchange data management


As the volume of email keeps growing, you can no longer avoid managing Exchange data effectively. The size limit on an Exchange mailbox cannot satisfy users who want to keep more mail on the server. Should users be encouraged to solve the problem with PST files? This article compares the options and answers the question from its root causes.

Effective data management on Exchange Server is not easy. It is especially hard to balance user needs against Exchange performance and stability. Email has become a key application for companies and organizations, and administrators increasingly find themselves in a dilemma. Managing Exchange data effectively takes a solution that combines several methods, including user guidelines and appropriate technologies (for example, storage hardware, monitoring and reporting tools, and data management applications). So where should you start?

First, I need to clarify what I mean by effective data management. In my view, it means storing Exchange data securely and optimally while giving users the data access they need. The best starting point is to examine your company's financial, technical, and regulatory constraints. These factors strongly influence your decisions about storage groups (SGs), databases, and user mailboxes (including offline folders, or OSTs, and personal folders, or PSTs), and about how Exchange data is backed up, recovered, and archived.

Various constraints

Any administrator who manages Exchange data faces conflicting requirements that must be balanced. When you look for a data management solution on the market, three considerations stand out: the cost of the solution, its technical limitations, and the laws and regulations the company must comply with.
Financial constraints. As email data keeps growing (in both the number and the size of messages) and more companies decide to keep email in the Exchange database or in other retrievable offline storage, enterprise storage demand rises, and the required investment rises with it. Financial considerations include the cost of additional disks and storage infrastructure (for example, additional storage arrays, backup devices, and storage area networks, or SANs) and the cost of managing that storage. These costs vary with the size and needs of your business. A small enterprise may only need to buy a few disks to meet the growing storage needs of a few hundred users. For a large enterprise with thousands of users, it may not be that simple. If storage demand exceeds your budget, you may need a stricter backup policy that limits which data is backed up, for example by excluding less important or inactive data, or an archiving solution. This approach is typically much more cost-effective than buying storage devices to keep up with demand.
Technical constraints. Even if your enterprise is able and willing to pay for more storage, uncontrolled data growth can undermine your ability to back up effectively and restore quickly. Although tape drive technology keeps evolving, growing data volumes inevitably lengthen backup and recovery times. Meeting one requirement (for example, fast data access) may therefore mean failing another (for example, fast data recovery), so you may want to evaluate solutions that strike a compromise. These generally combine online Exchange data management with an offline archiving solution. Such a combination is attractive because it lets you cap the growth of Exchange storage and archive key data according to policy, while keeping the data easy to search and access.
Regulatory constraints. Many enterprises have adopted rules that mandate the archiving of email communication. Most companies that have implemented archiving did so to satisfy internal governance requirements rather than external ones. A well-designed system that meets these requirements should help the enterprise trace all inbound, outbound, and relayed email. Once such a system is in place, any information that enters or leaves your system can be retrieved, whether it sits in a PST file or on a handheld device.
After you understand the constraints that affect your enterprise, you can start defining separate policies for the database files on the Exchange server, Outlook's cached files (OSTs), and PST files. You also need to decide which backup, recovery, and archiving solutions best suit your environment.
Manage server-based data
Exchange stores email data in databases on the Exchange server. Data stored on an Exchange server is generally easier to access and manage than data stored in PST files. The best place for shared information is an Exchange public folder database. A server running Exchange Server 2003 or Exchange 2000 Server can support up to four SGs, and each SG can support up to five databases, so a single server can support up to 20 databases. According to Exchange database best practices, backup and recovery times remain acceptable as long as a database does not exceed 40 GB.
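The limits above multiply out as follows. This is a minimal sketch: the function and constant names are illustrative, and the 40 GB value is the best-practice ceiling quoted in the text, not a hard product limit.

```python
# Limits quoted in the text for Exchange 2000 Server / Exchange Server 2003.
MAX_STORAGE_GROUPS = 4
MAX_DATABASES_PER_GROUP = 5
RECOMMENDED_MAX_DB_SIZE_GB = 40  # best-practice ceiling, not a hard limit


def server_capacity(storage_groups=MAX_STORAGE_GROUPS,
                    databases_per_group=MAX_DATABASES_PER_GROUP,
                    db_size_gb=RECOMMENDED_MAX_DB_SIZE_GB):
    """Return (database count, total recommended storage in GB) for one server."""
    databases = storage_groups * databases_per_group
    return databases, databases * db_size_gb


databases, total_gb = server_capacity()
print(f"{databases} databases, up to {total_gb} GB within the recommended size")
```

With the defaults this gives 20 databases and 800 GB of mailbox data per server while every database stays inside the recommended size.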
Storage limits determine the maximum number of users each Exchange system can support. The Exchange storage subsystem must be able to handle the I/O load generated by the users it serves. The Microsoft Knowledge Base article on optimizing storage for Exchange Server 2003 (FamilyID=c6084d20-9730-4ffc-805d-b957327604c6&DisplayLang=zh-cn, Chinese version) recommends planning for an average of 0.75 I/O operations per second per user when sizing an Exchange server. This applies to most systems, including high-end SAN platforms, for which a maximum of 4,000 users per server is suggested.
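As a rough illustration of the planning figure above, the sketch below multiplies the quoted per-user average by a user count. The function name is hypothetical, and real workloads may differ from the 0.75 average.

```python
IOPS_PER_USER = 0.75  # planning average quoted in the text


def required_iops(users, iops_per_user=IOPS_PER_USER):
    """Estimate the sustained I/O operations per second the storage must deliver."""
    return users * iops_per_user


for users in (500, 2000, 4000):
    print(f"{users:>5} users -> {required_iops(users):.0f} IOPS")
```

At the suggested ceiling of 4,000 users, the storage subsystem would need to sustain about 3,000 I/O operations per second under this assumption.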
You need to track the size of these databases and the number of supported users, as well as other parameters such as the size of the transaction logs, the server hardware, the allocated storage, and the mailbox size. Figure 1 shows a typical spreadsheet tool for calculating storage requirements. For example, the tool shows whether a given mailbox size (the example cites 4000 MB) is appropriate for the number of users a server supports. Besides limiting mailbox size (for all mailboxes in a database or for individual users), you can manage Exchange-based data by using Group Policy and the Exchange Mailbox Manager to delete expired or very large messages from user mailboxes. This helps keep mailboxes from quickly exceeding their limits. If you are worried that users will often delete mail by mistake, the deleted item recovery feature of Exchange is very useful. With it enabled, users can restore deleted mail themselves; otherwise, they must ask an administrator to recover it from backup tape. You still need to watch the database growth this causes, however. There is ample evidence that setting deleted item retention to 7 days makes the database grow by 10% to 30%.
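The kind of calculation such a sizing tool performs can be sketched as follows. This is an assumption-laden illustration, not the tool from the article: the function name is hypothetical, and it ignores transaction logs, single-instance storage, and database white space.

```python
def max_mailbox_quota_mb(db_size_gb, mailboxes_per_db, retention_overhead):
    """Largest per-mailbox quota (MB) that keeps a database under db_size_gb.

    retention_overhead is the fractional growth caused by deleted item
    retention, e.g. 0.10 to 0.30 for the 7-day setting cited in the text.
    """
    usable_mb = db_size_gb * 1024 / (1 + retention_overhead)
    return usable_mb / mailboxes_per_db


# 4,000 users spread evenly over 20 databases gives 200 mailboxes per database.
for overhead in (0.10, 0.30):
    quota = max_mailbox_quota_mb(40, 200, overhead)
    print(f"retention overhead {overhead:.0%}: about {quota:.0f} MB per mailbox")
```

Under these assumptions, higher retention overhead directly lowers the quota you can offer each user within the same 40 GB database.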
Manage user-based data
Users usually keep mail on their local desktops or laptops in OST or PST files. This is the hardest Exchange data to manage, because the files are scattered and, from a system management point of view, inaccessible. OST files are the lesser problem, because an OST is always a copy of Exchange data. If Outlook 2003 runs in Cached Exchange Mode, the OST file is a complete copy of the online Exchange mailbox; in non-cached mode (or in earlier versions of Outlook), the local OST holds a subset of the mailbox data on the server. PST files are a different matter. Because each mailbox has a size limit, users are forced to save important mail to PST files. These files are usually large (several hundred megabytes or more) and usually sit on the local hard disk, which means the important information in them is not backed up. Some users put PST files in a private share on a server, which is at least better than keeping them on the user's computer. The server's daily backup then includes these PST files, but without a mechanism to monitor their size and growth they can still become a problem. Moving mail to PST files is therefore not worth the trouble. PST files also pose serious security risks. Users can choose to encrypt a PST file, but tools for decrypting PST files are readily available. If sensitive information is kept in a PST file and the laptop or the data is lost or stolen, the damage cannot be undone. Even a PST file stored on a server share must be protected against unauthorized access. Finally, if legal counsel asks the company to implement email archiving and retrieval mechanisms, these unmanageable PST files will cause endless trouble.
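One possible monitoring mechanism for PST files on a server share is a periodic scan that records file sizes. The sketch below uses only the Python standard library; the function name is hypothetical, and the demo runs against a throwaway directory rather than a real share.

```python
import os
import tempfile


def find_pst_files(root, min_size_bytes=0):
    """Return (path, size in bytes) for each .pst file under root, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pst"):
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if size >= min_size_bytes:
                    found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Demo on a temporary directory; point root at a real share in practice.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "archive.pst"), "wb") as handle:
        handle.write(b"\0" * 2048)
    with open(os.path.join(root, "notes.txt"), "wb") as handle:
        handle.write(b"\0" * 10)
    for path, size in find_pst_files(root):
        print(f"{size:>10} bytes  {os.path.basename(path)}")
```

Running such a scan on a schedule and comparing the results between runs would give a simple view of how fast the PST files are growing.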
Better backup and recovery
The most important factors in choosing a backup and recovery solution are the amount of data you must process and the speed at which you must process it. For server-based data, many enterprises choose a solution that can recover within one hour (the exact target depends on your company's service level agreement). For example, restoring 40 GB of data within one hour requires a tape drive that sustains roughly 11 MB per second or more. Many current backup solutions stage data on intermediate media before writing it to tape, which makes backup and recovery much faster than conventional direct-to-tape backup. SAN-based systems generally recover faster still, at rates commonly quoted in gigabytes per hour. Such speed helps your database design: the more data you can back up and restore in a fixed window, the more flexibility you have. You can raise the size limit of each mailbox or increase the number of users each server supports.
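The restore-window arithmetic can be checked with a short calculation. This sketch assumes 1 GB = 1024 MB and a constant sustained rate; the function names are illustrative.

```python
def required_throughput_mb_per_s(data_gb, window_hours):
    """Sustained rate (MB/s, 1 GB = 1024 MB) needed to move data_gb in window_hours."""
    return data_gb * 1024 / (window_hours * 3600)


def restore_time_hours(data_gb, throughput_mb_per_s):
    """Hours needed to restore data_gb at a sustained throughput in MB/s."""
    return data_gb * 1024 / throughput_mb_per_s / 3600


rate = required_throughput_mb_per_s(40, 1)
print(f"40 GB in 1 hour needs about {rate:.1f} MB/s")
print(f"At 10 MB/s, 40 GB takes about {restore_time_hours(40, 10):.2f} hours")
```

A 40 GB database therefore needs a little over 11 MB per second to meet a one-hour target, and a drive limited to 10 MB per second would overrun the window by several minutes.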
The Volume Shadow Copy Service (VSS) in Windows Server 2003, together with Exchange 2003, can take multiple successive snapshots of an Exchange database within seconds. Note, however, that a snapshot is only a point-in-time image of the original database on disk. If the source physical volume is damaged, the snapshots are useless (although many storage vendors are working on this problem). So even if you take snapshots, you still need to back up to tape. Because snapshot data can be restored very quickly, VSS-aware storage applications can greatly speed up backup and recovery and improve your data storage architecture. Test them carefully before deploying them in production.
Exchange 2003 (especially with SP1) provides a new feature called the Recovery Storage Group (RSG). The concept is simple: when a database in an SG fails and must be recovered from backup, affected users can work from an empty database in the meantime. Although users cannot reach the mail in the original database while it is being restored, the RSG approach still lets them send and receive new mail. After the damaged database is restored, the recent mail can be merged with the restored database. The new Recover Mailbox Data Wizard in SP1 simplifies merging the two databases.
Backing up user-based data such as PST files remains a challenge. Backing up PST files on users' local hard disks is almost impossible, because it is hard to control what users do on their workstations. PST files stored on network shares can be backed up centrally, but they offer little advantage over keeping the data in the Exchange database.
Strictly speaking, an archiving solution differs from a regulatory compliance solution in the following respects:
1. Archiving is usually initiated by users, who can also decide how information is moved from their Exchange mailbox to archive storage.
2. A dedicated archiving system usually applies fixed criteria, such as a policy-based content expiration time, to move content to archive storage.
3. An archiving solution generally cannot record every message that the system creates or processes.
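A policy-based expiration rule of the kind mentioned in item 2 can be sketched as an age cutoff. The `Message` type and `select_for_archive` function below are hypothetical and do not correspond to any product's API; they only illustrate the selection logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    subject: str
    received: datetime
    size_kb: int


def select_for_archive(messages, now, max_age_days):
    """Return messages whose age exceeds max_age_days (a policy-based expiry rule)."""
    cutoff = now - timedelta(days=max_age_days)
    return [m for m in messages if m.received < cutoff]


now = datetime(2004, 6, 1)
mailbox = [
    Message("Q1 report", datetime(2004, 1, 15), 850),
    Message("Lunch?", datetime(2004, 5, 28), 4),
]
for message in select_for_archive(mailbox, now, max_age_days=90):
    print(f"archive: {message.subject} ({message.size_kb} KB)")
```

A real archiving system would combine such a rule with other criteria, such as message size or folder, before moving content to archive storage.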
Outlook provides a very basic archiving function: it can move messages from your mailbox to a PST file, or delete them, once a threshold is exceeded. However, this function supports only limited transfers and cannot archive to a dedicated, protected archive store. For the kind of archiving discussed here, Outlook has little to offer.
Many mature solutions are already on the market. For example, KVS Enterprise Vault from VERITAS combines user-initiated and policy-based archiving to move messages to a second-tier (or higher-tier) storage location. These solutions are effective because, when a message and its attachments are archived, a stub is left in the Exchange mailbox. To access the archived message, the user clicks the stub. This improves the efficiency of Exchange storage, because large attachments can be moved to archive storage to free up space for Exchange.
This type of archiving solution generally integrates with the Exchange message journaling feature, which can capture and track mail sent through the Exchange server. However, as email volumes grow and regulations require that mail be kept in archive storage that is read-only and cannot be rewritten or deleted, even archiving solutions that integrate Exchange journaling may fall short, and more advanced technologies take their place.
These technologies include EMC Centera and HP's Reference Information Storage System (RISS). They let you store email as static content, usually in an unmodifiable format on RAID-like disk storage, and protect data consistency and content integrity with measures such as digital signatures and timestamps. In addition to content search and retrieval, these solutions apply Hierarchical Storage Management (HSM). HSM matters a great deal to large enterprises. Suppose each user sends an average of 20 messages a day with an average size of 25 KB. An enterprise with 10,000 users then generates about 200,000 messages a day, that is, roughly 4.7 GB per day and 1.7 TB per year. If inbound mail must also be archived, the storage requirement grows considerably. And these are only averages: in my experience with an enterprise of 9,400 users, the mail received each month is measured in gigabytes.
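The volume estimate above can be reproduced with a few lines of arithmetic. This sketch assumes binary units (1 GB = 1024 × 1024 KB) and a 365-day year; the function name is illustrative.

```python
def archive_volume(users, mails_per_user_per_day, avg_mail_kb, days_per_year=365):
    """Return (mails per day, GB per day, TB per year) using binary units."""
    mails_per_day = users * mails_per_user_per_day
    gb_per_day = mails_per_day * avg_mail_kb / 1024 ** 2
    tb_per_year = gb_per_day * days_per_year / 1024
    return mails_per_day, gb_per_day, tb_per_year


mails, gb_day, tb_year = archive_volume(10_000, 20, 25)
print(f"{mails:,} mails/day, {gb_day:.2f} GB/day, {tb_year:.2f} TB/year")
```

The result is about 4.77 GB per day and 1.70 TB per year for outbound mail alone, in line with the figures quoted in the text.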
Archiving is often a prelude to an Exchange migration, because it can significantly reduce the amount of data to migrate and speed up the migration.
Notes:
EMC Centera is a networked storage system designed for storing fixed content and accessing it quickly and conveniently. Its main advantages are its WORM (write once, read many) properties, meaning content cannot be rewritten or erased, together with disk performance and TCO. http://china.emc.com/products/systems/centera_ce.jsp
HP Reference Information Storage System (RISS): http://www.hp.com.cn/storage/mo_jukebox/RISS/default.asp
Hierarchical Storage Management (HSM) is the unified management of all your storage resources (such as tape drives, tape libraries, NAS, mid-range and low-end disk arrays, and high-end storage systems). It improves the utilization of each storage device and saves costs by placing data on online, nearline, or offline storage according to its value.
OST (offline folder file): stores a local copy of an Exchange Server mailbox on the local computer. When a connection is available, the items in the OST file are synchronized with the server.

