Chapter 2 Storage System Environment
As one of the key elements of the data center, storage is recognized as a distinct resource that needs focused attention and specialized skills for its implementation and management. Data flows from an application to storage through various components, collectively called the storage system environment. The three main components in this environment are the host, connectivity, and storage. These entities have their own physical and logical components that facilitate data access. This chapter describes the storage system environment in detail and focuses on storage. It covers the hardware components of a disk, disk geometry, and the basic rules that govern disk performance. The bus technology used to connect the host and storage, and the interface protocols, are also explained.
This chapter also provides an opportunity to understand the various logical components of a host, including the file system, volume manager, and operating system, and introduces their roles in the storage system environment.
2.1 Storage System Environment Components
The storage system environment has three main components: host, connectivity, and storage. This section introduces each of them.
Users store and retrieve data through applications. The computers on which these applications run are called hosts. A host can range from a simple laptop to a complex cluster of servers. A host consists of physical components (hardware devices) that communicate with one another using logical components (software and protocols). Data access and the overall performance of the storage system environment depend on both the physical and logical components of the host. The logical components are introduced in section 2.5 of this chapter.
Physical Components
A host has three main physical components: the CPU, storage (memory and storage devices), and I/O devices.
The physical components communicate with one another over a communication pathway called a bus. A bus connects the CPU to other components, such as storage and I/O devices.
The CPU consists of four main components:
Memory and storage media are used to store data, either temporarily or persistently. Memory modules are implemented using semiconductor chips, whereas storage devices use either magnetic or optical media. Memory modules enable faster data access than storage media. Generally, a host has two types of memory:
Storage devices are less expensive than semiconductor memory. Examples of storage devices are as follows:
I/O devices enable a host to send and receive data. This communication may be one of the following types:
Connectivity refers to the interconnection between hosts, or between a host and any peripheral device such as a printer or storage device. The components of connectivity in a storage system environment can be classified as physical and logical. The physical components are the hardware elements that connect the host to storage, and the logical components are the protocols used for communication between the host and storage. Communication protocols are introduced in chapter 5.
Connectivity between the host and storage has three physical components: bus, port, and cable.
A bus is a collection of paths that carries data from one part of a computer to another, for example, from the CPU to memory. A port is a specialized outlet that connects the host to external devices. A cable connects the host to internal or external devices using copper or optical media.
Physical components communicate across a bus by sending bits of data (control, data, and address). These bits are transmitted over the bus in one of the following ways:
The size of a bus, known as its width, determines the amount of data that can be transmitted through the bus at one time. The width of a bus can be compared to the number of lanes on a highway. For example, a 32-bit bus can transmit 32 bits of data at a time, and a 64-bit bus can transmit 64 bits. Each bus also has a clock speed, measured in MHz, which represents the data transfer rate between the end points of the bus. A faster bus transfers data more quickly, which allows applications to run faster.
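To make the width comparison concrete, here is a minimal sketch that counts how many bus transfers a payload needs at a given width. The function name `transfers_needed` and the 4 KB payload are illustrative assumptions, and the model is idealized: it ignores addressing, control signals, and any protocol overhead.

```python
def transfers_needed(payload_bytes: int, bus_width_bits: int) -> int:
    """Number of transfers needed to move payload_bytes over a bus that
    carries bus_width_bits per transfer (idealized, no overhead)."""
    if payload_bytes < 0 or bus_width_bits <= 0:
        raise ValueError("payload must be >= 0 and bus width must be > 0")
    payload_bits = payload_bytes * 8
    # Integer ceiling division: a partially filled transfer still counts.
    return -(-payload_bits // bus_width_bits)


payload = 4096  # hypothetical 4 KB block
print(transfers_needed(payload, 32))  # 1024
print(transfers_needed(payload, 64))  # 512
```

Under these assumptions, doubling the width halves the number of transfers, which is the highway-lane analogy expressed in numbers.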
Buses, as the channels of data transfer in a computer system, can be classified into the following types:
Logical Components of Connectivity
The popular interface protocol used by the local bus to connect to peripheral devices is Peripheral Component Interconnect (PCI). The interface protocols that connect to disk systems are Integrated Device Electronics/Advanced Technology Attachment (IDE/ATA) and Small Computer System Interface (SCSI).
PCI is a specification that standardizes how PCI expansion cards, such as network cards or modems, exchange information with the CPU. PCI provides the interconnection between the CPU and attached devices. Its plug-and-play capability makes it easy for a host to recognize and configure new cards and devices. The PCI bus can be 32 or 64 bits wide. A 32-bit PCI bus provides a throughput of 133 MB per second. PCI Express is an enhanced version of the PCI bus that offers considerably higher throughput and clock speed.
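The 133 MB per second figure can be checked with simple arithmetic: peak throughput is the bus width in bytes multiplied by the clock rate. The sketch below assumes a clock of 33.33 MHz, the usual rate for conventional 32-bit PCI, which the text above does not state. The helper name `peak_throughput_mb_s` is hypothetical, and the result is a theoretical peak, not a measured rate.

```python
def peak_throughput_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak throughput in MB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz


# Assumption: 33.33 MHz clock (not given in the text).
print(round(peak_throughput_mb_s(32, 33.33)))  # 133
print(round(peak_throughput_mb_s(64, 33.33)))  # 267
```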
IDE/ATA is the most popular interface protocol used on modern disks. It offers excellent performance at a relatively low cost. IDE/ATA is described in detail in chapter 5.
SCSI has emerged as the preferred protocol in high-end computers. It is much less commonly used than IDE/ATA on personal computers because of its higher cost. SCSI was initially used as a parallel interface that enabled devices to communicate with the host. SCSI has since been enhanced and now includes a wide variety of related technologies and standards. Chapter 5 introduces SCSI in detail.
The storage device is the most important component in the storage system environment. A storage device uses magnetic, optical, or solid state media. Disks, tapes, and diskettes use magnetic media. A CD-ROM is an example of a storage device that uses optical media, and a removable flash memory card is an example of solid state media.
Tape is a popular medium for backup because of its relatively low cost. In the past, data centers hosted a large number of tape drives and processed several thousand tapes. However, tape has the following limitations:
■ Data is stored on the tape linearly along its length. Search and retrieval of data are done sequentially, and it can take several seconds to reach the data. As a result, random data access is slow and time-consuming. This limits tape as a viable option for applications that require real-time, rapid access to data.
■ In a shared computing environment, data stored on tape cannot be accessed by multiple applications simultaneously. Only one application can use the tape at a time.
■ On a tape drive, the read/write head touches the tape surface, so the tape degrades or wears out after repeated use.
■ The storage and retrieval requirements of data on tape, and the overhead associated with managing tape media, are significant.
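The cost of sequential access in the first limitation can be illustrated with a toy model. This is a sketch under stated assumptions, not a model of any real drive: record positions are abstract units along the tape, each access costs the distance the tape must wind from its current position, and the function name `tape_access_cost` and the request list are hypothetical.

```python
def tape_access_cost(positions, start=0):
    """Total distance the tape winds to visit records in the given order.

    Each access costs the linear distance from the current position,
    because the tape must be wound past everything in between.
    """
    cost, current = 0, start
    for pos in positions:
        cost += abs(pos - current)
        current = pos
    return cost


requests = [900, 10, 850, 20]  # hypothetical record positions on the tape
print(tape_access_cost(requests))          # 3460, requests served in arrival order
print(tape_access_cost(sorted(requests)))  # 900, same records read in one pass
```

In this model, serving the same four requests in arrival order winds the tape almost four times as far as reading them in a single pass, which is why tape suits streaming workloads such as backup better than random access.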
Despite these limitations, tape is still widely deployed because of its low cost. Continued advances in tape technology are producing high-capacity media and high-speed drives. Modern tape libraries come with additional memory (cache) or disk drives to increase data throughput. With this added intelligence, today's tape is part of an end-to-end data management solution, especially as a low-cost option for long-term storage of data that is accessed infrequently.
Optical disc storage is popular in small, single-user computing environments. It is often used by individuals to store photos or as a backup medium on personal or laptop computers. It is also used as a distribution medium for single applications, such as games, or as a means of transferring small amounts of data from one system to another. Optical discs have limited capacity and speed, which limits the use of optical media as a business data storage solution. The write once, read many (WORM) capability is one advantage of optical disc storage. A CD-ROM is an example of a WORM device. Optical discs guarantee, to some degree, that the content has not been altered, so they can be used as a low-cost option for long-term storage of relatively small amounts of fixed content that will not change after it is created. Collections of optical discs in an array, called jukeboxes, are still used as a fixed-content storage solution. Other forms of optical disc include CD-RW and variants of DVD.
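The write once, read many behavior can be sketched in a few lines. The class below is a toy in-memory illustration only: `WormDisc`, its block-based layout, and the choice of `PermissionError` are assumptions made for the example and do not describe how optical media enforce immutability physically.

```python
class WormDisc:
    """Toy write-once, read-many store: each block can be written only once."""

    def __init__(self):
        self._blocks = {}

    def write(self, block_id: int, data: bytes) -> None:
        # Reject any attempt to overwrite a block that already holds data.
        if block_id in self._blocks:
            raise PermissionError(f"block {block_id} is already written")
        self._blocks[block_id] = bytes(data)

    def read(self, block_id: int) -> bytes:
        # Reads are unrestricted and can be repeated any number of times.
        return self._blocks[block_id]


disc = WormDisc()
disc.write(0, b"archive record")
print(disc.read(0))
try:
    disc.write(0, b"tampered")
except PermissionError as err:
    print("rejected:", err)
```

The second write is rejected, so the original content of block 0 is still what a later read returns.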
Disk drives are the most popular storage medium in modern computers. They are used to store and access data for performance-intensive, online applications. Disks support rapid random access to data, which means that data can be written or retrieved quickly by a large number of simultaneous users or applications. In addition, disks have a large capacity, and disk storage arrays are configured with multiple disks to provide increased capacity and enhanced performance.
Translated from <Information Storage Management>