Understanding of Storage

Last Update:2018-12-05 Source: Internet

Author: User


Disk

An I/O access is roughly divided into three steps: first, the head moves to the specified track (seek); second, the disk waits for the requested data to rotate under the head (rotational latency); third, the data is read. Compared with the first two steps, the time taken by the third can be ignored, so the response time of an I/O is roughly equal to the seek time plus the latency. Because seeking is a mechanical action, it is difficult to improve significantly, but latency can be reduced by increasing the rotational speed. Therefore, a disk with a higher rotational speed can carry more IOPS, and disk IOPS is largely determined by rotational speed. For example, a 15000 rpm disk can sustain about 150 IOPS.
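
The relationship between rotational speed and IOPS can be sketched with a little arithmetic. The following is a rough estimate only: it assumes the average rotational delay is half a revolution, ignores the transfer time, and uses average seek times that are illustrative assumptions rather than vendor figures. The function names are hypothetical.

```python
def rotational_latency_ms(rpm):
    """Average rotational delay in ms: on average the platter turns half a revolution."""
    return 0.5 * 60_000.0 / rpm


def estimated_iops(rpm, avg_seek_ms):
    """Rough random IOPS for one disk: 1 / (seek time + rotational delay)."""
    return 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))


# The avg_seek_ms values are assumptions chosen for illustration.
for rpm, avg_seek_ms in [(15000, 4.5), (10000, 5.0), (7200, 8.5)]:
    print(rpm, "rpm:", round(estimated_iops(rpm, avg_seek_ms)), "IOPS")
```

With an assumed 4.5 ms average seek, the 15000 rpm case lands close to the 150 IOPS quoted above.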

Throughput is determined by the disk's rotational speed and its interface. The rotational speed determines the internal transfer rate, and the interface determines the external transfer rate. Obviously, the former is lower than the latter. Common interfaces are ATA, SCSI, SATA, SAS, FC, etc. FC interfaces are common in high-end storage, while SAS and SATA are common in servers and low-end storage.

Storage

For a storage system, IOPS mainly depends on the cache algorithm and the number of disks. We are sometimes fooled by vendor figures. The first factor is the cache hit rate: the vendor may use certain means to make the cache hit rate in its tests very high, so that the IOPS can be almost anything it wants. The other factor is the number of disks: the vendor's figure may be the test result of 1000 disks of the same model, while our actual system has only 100 disks.

When purchasing storage, avoid buying a high-end array and configuring it with only a small number of disks. Vendors very much like you to buy a high-end box, telling you that it scales well, that you can buy fewer disks now and expand later, and so on. We recommend against this. If you are pursuing high performance, choose disks with smaller capacity and buy more of them.

The number of disks can be calculated. In our experience, the cache hit rate of OLTP applications is usually about 20%, and the rest of the I/O still goes to disk. Based on the disk speed and type, we can work out the IOPS that one disk can carry, and from that estimate the number of disks required. To get a better response time, we recommend that the IOPS of each disk not exceed 100.
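
The estimate described above can be written as a short calculation. This is a minimal sketch: the 20% hit rate and the 100 IOPS per-disk ceiling come from the text, while the function name and the sample workload of 10000 IOPS are hypothetical.

```python
def disks_needed(app_iops, cache_hit_pct=20, iops_per_disk=100):
    """Estimate the disk count: I/O that misses the cache is spread over the disks.

    Integer arithmetic is used so the result is rounded up exactly.
    """
    missed = app_iops * (100 - cache_hit_pct)   # misses, scaled by 100
    per_disk = 100 * iops_per_disk              # same scale
    return -(-missed // per_disk)               # ceiling division


# Hypothetical workload: the application issues 10000 IOPS at peak.
print(disks_needed(10000))
```

The result does not account for RAID write overhead or spare disks, so it is better read as a lower bound.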

The factors that affect throughput are a little more complex, depending on the number of disks and the storage architecture. Once the number of disks reaches a certain level, throughput is mainly limited by the storage architecture. For example, the maximum throughput of one high-end array is 1.4 GB/s, which is determined by its internal architecture. Also pay attention to the interface cards between storage and host (such as Fibre Channel HBAs), which come in 4 Gb and 2 Gb versions (bits here, not bytes). Generally, both the host and the storage are equipped with multiple cards.
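
The bit-versus-byte distinction matters when sizing the host connection. As a rough sketch, the following converts a card's rated speed from gigabits to gigabytes per second and counts how many cards would be needed to carry the 1.4 GB array limit mentioned above. It assumes that figure is per second and ignores encoding and protocol overhead, so real cards deliver somewhat less; the function names are hypothetical.

```python
import math


def card_gbytes_per_s(rated_gbits_per_s):
    """Upper bound on one card's bandwidth in GB/s (8 bits per byte, no overhead)."""
    return rated_gbits_per_s / 8.0


def cards_needed(target_gbytes_per_s, rated_gbits_per_s):
    """Minimum number of cards whose combined bandwidth covers the target."""
    return math.ceil(target_gbytes_per_s / card_gbytes_per_s(rated_gbits_per_s))


for rated in (4, 2):
    print(rated, "Gb card:", card_gbytes_per_s(rated), "GB/s,",
          cards_needed(1.4, rated), "cards for 1.4 GB/s")
```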

RAID

RAID 10 and RAID 5 are the levels most commonly used. Database applications with high performance requirements generally use RAID 10. Do not place redo on RAID 5, because RAID 5 performs very poorly for small I/O such as redo, which can easily cause log file sync waits. The number of disks in one RAID group should not exceed 10, because the more disks a RAID group has, the higher the probability that one of them fails (a matter of probability). In some high-end storage, the number of disks in a RAID group is fixed, which is mainly related to the storage architecture. In using storage, you will find that the more high-end a product is, the more rigid it is, while mid-range and low-end storage is very flexible. It is not that high-end storage is poor; architecture determines everything.
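
The probability argument about RAID group size can be made concrete. This is a simplified sketch that assumes disks fail independently with the same probability over some period; the 3% figure is a hypothetical value chosen for illustration, not a measured failure rate.

```python
def p_any_disk_fails(n_disks, p_single):
    """Probability that at least one of n independent disks fails: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_single) ** n_disks


# p_single = 0.03 is an assumed per-disk failure probability.
for n in (4, 8, 10, 16):
    print(n, "disks:", round(p_any_disk_fails(n, 0.03), 3))
```

The probability grows with every disk added to the group, which is the reason given above for keeping groups small.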

Stripe

The role of striping is to spread I/O across disks as much as possible. The stripe size can be adjusted on some storage, but on many it cannot; it is generally between ... K. A common misconception is that once striping is configured on the storage, all disks will respond to a single database I/O. This is incorrect. For Oracle, the size of a random I/O is 8 K, and the stripe size is generally much larger than 8 K, so a random Oracle I/O will always land on only one disk. A disk can only respond to one I/O at a time; that is, a disk has no concept of concurrent I/O. From the perspective of the whole system, however, I/O is still spread out at the macro level because different disks respond to different requests, which is why all disks look busy when a database is running. In fact, each disk is serving a different I/O. For sequential I/O, the Oracle default is 128 K, and the maximum is determined by the OS, generally 1 M. If the sequential I/O size is larger than the stripe size, one I/O may be answered by several disks at the same time, but on many storage systems the stripe size is larger than 128 KB, in which case one I/O is still answered by only one disk. Because the read is a serial process, throughput can only be improved by adding parallelism.
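
How many disks answer one I/O can be worked out from the offset, the I/O size, and the stripe size. The sketch below assumes a simple round-robin layout in which consecutive stripe units go to consecutive disks; real arrays may lay data out differently, and the function name is hypothetical.

```python
KB = 1024


def disks_touched(offset, size, stripe_size, n_disks):
    """Number of distinct disks one I/O touches under round-robin striping."""
    first_unit = offset // stripe_size
    last_unit = (offset + size - 1) // stripe_size
    # Consecutive stripe units sit on consecutive disks, wrapping after n_disks.
    return min(last_unit - first_unit + 1, n_disks)


print(disks_touched(0, 8 * KB, 128 * KB, 8))      # 8 K random I/O, 128 K stripe
print(disks_touched(0, 128 * KB, 256 * KB, 8))    # 128 K sequential I/O, 256 K stripe
print(disks_touched(0, 1024 * KB, 128 * KB, 8))   # 1 M sequential I/O, 128 K stripe
```

An aligned 8 K I/O stays on one disk whenever the stripe is larger than 8 K, and only an I/O larger than the stripe size spreads over several disks.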

Someone may ask: what stripe size is best? If I make the stripe very small, isn't that good, since one I/O can then read from many disks at once and greatly improve throughput? Suppose the stripe size is 1 K. Oracle then has to spread one 8 K I/O over eight different disks, and here the problem arises. A disk has no concurrent I/O capability, so if each I/O occupies many disks, the concurrent I/O capability of the whole system is reduced. Reading an 8 K I/O from one disk and reading it in parallel from eight disks makes no big difference (reading from one disk may even be faster), so the stripe should not be very small. How big should it be? In my opinion, it is better to err on the large side, at around 256 KB or more. Data warehouse applications can go larger. The default stripe size of ASM is 1 M. I once changed it from 1 M to 128 K and found that 1 M performed better, and 1 M is also what Oracle recommends. This shows that the stripe size should be somewhat larger, rather than as fine and as dispersed as we might imagine.
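
The trade-off between stripe size and concurrency can be illustrated with a small calculation. It is a simplified model under two assumptions: each disk services one I/O at a time, and an I/O ties up every disk it touches for its whole duration. The function names and the eight-disk group are hypothetical.

```python
KB = 1024


def disks_per_io(io_size, stripe_size, n_disks):
    """Disks occupied by one aligned I/O under round-robin striping."""
    units = -(-io_size // stripe_size)   # ceiling division
    return min(units, n_disks)


def concurrent_ios(n_disks, io_size, stripe_size):
    """How many such I/Os the group can service at the same moment."""
    return n_disks // disks_per_io(io_size, stripe_size, n_disks)


for stripe in (1 * KB, 4 * KB, 256 * KB):
    print(stripe // KB, "K stripe:",
          concurrent_ios(8, 8 * KB, stripe), "concurrent 8 K I/Os on 8 disks")
```

With a 1 K stripe every 8 K I/O occupies all eight disks, so only one I/O runs at a time; with a 256 K stripe each I/O occupies one disk and eight can run together.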

Storage Division

Once the allocated LUNs are presented to the host, how should we use them? This is flexible and varies from case to case. First, consider the goal: are we pursuing IOPS or throughput? Are we using a file system, raw devices, or ASM? How many disks does each LUN presented by the storage span? Ordinary storage has no virtualization function, and a presented LUN sits within only one RAID group. In that case we often need to use LVM on the OS to divide it again. Consider the following example.

Each RAID group has four disks, and two LUNs are created on it. After the LUNs are presented to the host, two VGs are created, one from the blue LUNs and one from the red LUNs, and then striped LVs are created, so that each LV spans all disks. In practice there is more to consider: not only the disks, but also spreading the load across different controllers. Front-end cards, back-end cards, and multipathing make the problem quite complicated. Some storage has a virtualization function, so that even a single presented LUN can span many disks. For example, 3PAR can present a virtual volume that already spans all disks, and we can use it directly (but the actual ...).
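
The layout described above can be modelled in a few lines. This is only an illustrative sketch: the number of RAID groups is an assumption (the text does not state it), LUNs are represented as hypothetical (group, colour) pairs, and the LV is assumed to stripe round-robin over the LUNs of its VG.

```python
def build_vgs(n_raid_groups):
    """Each RAID group exposes one blue and one red LUN; group the LUNs by colour."""
    return {
        colour: [(group, colour) for group in range(n_raid_groups)]
        for colour in ("blue", "red")
    }


def raid_groups_touched(vg_luns, n_stripes):
    """RAID groups hit by the first n_stripes stripe units of a striped LV."""
    return {vg_luns[i % len(vg_luns)][0] for i in range(n_stripes)}


vgs = build_vgs(3)  # three RAID groups is an assumed figure
print(sorted(raid_groups_touched(vgs["blue"], 3)))
```

Because every VG holds one LUN from each RAID group, an LV striped over the VG reaches every group, and therefore every disk, after one full pass of stripe units.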

With ASM in Oracle, the problem becomes more complicated. My suggestion is that, if possible, the storage should only do RAID 1 and the striping should be left to ASM. If the storage has to stripe as well, that is not a problem. Storage division is a very technical task, and a good plan is only possible with a deep understanding of the storage, the host, and the database.

This article mainly discusses some misunderstandings that arise when using storage. With the gradual spread of SSDs, I think the overall storage market will change greatly. Next time we will discuss the opportunities that SSDs bring.

Note: storage is the bottom layer of the system. Because it is so important, the market is basically monopolized by a few major manufacturers. Every manufacturer has its own buzzwords and commercial hype, so we need to keep our eyes open and beware of being fooled.

-EOF-

Correction: This article incorrectly assumed that an Oracle scattered read is a fully serial process. In fact, there is a degree of parallelism between different multiblock reads. Oracle sends several multiblock read I/O requests to the operating system at the same time, and then merges and sorts the returned results. The scattered read as a whole is locally parallel and serial at the macro level.

Post address: http://www.hellodba.net/tag/%E5%AD%98%E5%82%A8

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
