Introduction to Linux Cluster File Systems

Source: Internet
Author: User

Cluster Applications

Cluster applications vary in maturity and functionality. They include:
  • High-performance clusters, also known as parallel clusters or computing clusters, are usually used in systems that support large-scale computation. In these clusters, a parallel file system distributes processing resources among nodes so that every node can access the same file simultaneously through concurrent reads and writes. The Beowulf Linux clusters developed at NASA in the early 1990s are the most common example.
  • High-availability (HA) clusters are designed for fault tolerance or redundancy. Because these clusters usually use one or more servers for processing, when one or more servers go down, the remaining servers can take over their processing responsibilities.
  • Load-balancing, or server load-balancing, clusters distribute load as evenly as possible among multiple servers (usually web servers).
  • Storage clusters sit between a SAN and servers running different operating systems, providing shared access to data blocks on a common storage medium.
  • Database clusters, with Oracle RAC as the platform, bring many cluster file system features into the application itself.
These cluster applications have overlapping features, and one or more of them can be found in a single cluster application, especially in HA and load-balancing clusters. For example, Oracle RAC can be installed on an HA cluster file system to bring the advantages of a database cluster to an HA cluster application, such as:
  • Shared resources, including data, storage, hard disks, and metadata, which make multiple nodes look like a single file system and allow all members of the cluster to read and write the file system at the same time.
  • Storage devices pooled into a single disk volume, which improves performance because no data replication is required.
  • Scalable capacity, bandwidth, and connectivity.
  • A single system image that gives all nodes the same view of the data.
Now let's look at some cluster-aware Linux file systems that support Oracle RAC, and at how they can enhance Oracle RAC functionality.

Cluster File Systems That Can Run Oracle

Oracle RAC already provides features such as load balancing, redundancy, failover, scalability, caching, and locking. When Oracle data files sit on a block device formatted with a traditional Linux file system such as ext2/ext3, this functionality is therefore duplicated. Performance suffers in this case because both the Oracle cache and the file system cache consume memory.

At the time of writing, apart from third-party cluster file systems, there are four file system options for running Oracle RAC. In Oracle's order of recommendation, they are:

  1. Oracle Automatic Storage Management
  2. Oracle Cluster File System
  3. Network File System
  4. Raw devices

Oracle Automatic Storage Management (ASM)

One hallmark of Oracle is that, regardless of the environment it runs in, once you have the Oracle API, the look, feel, and operation are the same. ASM is a feature of Oracle Database 10g that extends this consistent environment to storage management: you create and manage storage content and metadata using SQL statements, Oracle Enterprise Manager Grid Control, or the Database Configuration Assistant. Using ASM for Oracle Database 10g data file storage is considered best practice.

The basic data structure in ASM is the disk group, which consists of one or more disks. In this context, a "disk" can be a disk partition, an entire disk, a concatenated disk, a partition of a storage device, or an entire storage device.

It is important to realize that ASM is not a general-purpose cluster file system. Rather, ASM is a cluster-aware file system designed specifically to handle Oracle database files, control files, and log files. ASM should not be used together with a logical volume manager (LVM), because the LVM will make the disks unrecognizable to ASM.

ASM performs the following functions:
  • Identifies disks by the ASM ID in the disk header.
  • Distributes data dynamically across all storage in the disk group, with optional redundancy protection and cluster awareness.
  • Allows major storage operations while the Oracle database is fully operational: disks can be added or removed, or a disk group even moved to a new storage array (a rare occurrence), without downtime.
  • Performs automatic load balancing and rebalancing when disks are added or removed.
  • Provides additional redundancy protection through failure groups.
  • Optimizes the use of storage resources.
When installed on raw devices, or on block devices using the Oracle-recommended ASM library driver, ASM itself runs as an instance that starts before the database instance. It enables DBAs to create, extend, and shrink disks, and it maps these changes to the disk groups on the other nodes that share access to those groups. Database instances can share the clustered pool of storage across multiple nodes in the cluster.

ASM is installed by the Oracle Universal Installer. If you add ASM to an existing database, make sure that the database is set up as dependent on the ASM instance, so that at boot the ASM instance starts before the dependent database. For example:

$ srvctl modify instance -d O10G -i O10G1 -s +ASM1

This makes the O10G1 instance dependent on the +ASM1 instance.

An ASM instance differs from an Oracle database instance in the following ways:
  1. There is no data dictionary, although several V$ views can be used to obtain information about the ASM instance: V$ASM_DISKGROUP, V$ASM_CLIENT, V$ASM_DISK, V$ASM_FILE, V$ASM_TEMPLATE, V$ASM_ALIAS, and V$ASM_OPERATION.
  2. You can connect to an ASM instance only as SYSDBA or SYSOPER.
  3. There are five initialization parameters for an ASM instance. INSTANCE_TYPE is required and must be set as follows: INSTANCE_TYPE = ASM.
Within an ASM instance, DBAs can use SQL syntax or Enterprise Manager to:
  1. Define a disk group for a storage pool using one or more disks.
  2. Add disks to and remove disks from a disk group.
  3. Define failure groups to add data redundancy protection. A failure group is typically a set of disks within a disk group that need to run continuously and that depend on a common resource, such as a controller.
You can use Enterprise Manager or the V$ASM views to monitor the status of ASM disk groups. You can also reference disk groups from a database instance when allocating storage while creating database structures.

When you create tablespaces, redo logs, archive log files, and control files, you can reference an ASM disk group from the database instance by specifying the disk group either in an initialization parameter or in the DDL.
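
As a minimal sketch of the two options just described, the following shell functions print the kind of SQL a DBA might run in the database instance. The disk group name DATA1, the tablespace name sales_ts, and the size are hypothetical; the functions only generate text and do not connect to a database.

```shell
#!/bin/sh
# Sketch: emit SQL that places new files in an ASM disk group.
# DATA1 and sales_ts are hypothetical names used for illustration only.

# tablespace_ddl NAME DISKGROUP SIZE
# Reference the disk group directly in the DDL.
tablespace_ddl() {
    printf "CREATE TABLESPACE %s DATAFILE '+%s' SIZE %s;\n" "$1" "$2" "$3"
}

# default_dest_ddl DISKGROUP
# Reference the disk group through an initialization parameter, making it
# the default destination for newly created data files.
default_dest_ddl() {
    printf "ALTER SYSTEM SET db_create_file_dest = '+%s';\n" "$1"
}

tablespace_ddl sales_ts DATA1 100M
default_dest_ddl DATA1
```

The leading "+" is what marks the name as an ASM disk group rather than a file system path.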

For more information about ASM, see Lannes Morris-Murphy's OTN article "Automatic Storage", the ASM installment of Arup Nanda's "Oracle Database 10g: The Top 20 Features for DBAs", and Chapter 1 of the Oracle Database Administrator's Guide 10g Release 1 (10.1).

Oracle Cluster File System (OCFS)

OCFS is designed to support data and disk sharing for Oracle RAC applications. It provides a consistent file system image across the server nodes in a RAC cluster and serves as a replacement for raw devices. In addition to simplifying cluster database administration, it overcomes the limitations of raw devices while retaining their performance advantages.

OCFS version 1 supports Oracle data files, spfiles, control files, quorum disk files, archive logs, configuration files, and the Oracle Cluster Registry (OCR) file (Oracle Database 10g). It is not designed for use with files from other file systems, or even with the Oracle software itself, which must be installed on each node of the cluster unless you use a third-party solution. In addition, OCFS does not provide LVM capabilities such as I/O distribution (striping), nor does it provide redundancy.

Oracle supports Oracle databases on OCFS version 1 for the 32-bit and 64-bit releases of Red Hat Advanced Server 2.1, Red Hat Enterprise Linux 3, and Novell SUSE (United Linux), provided that OCFS is installed from the downloadable binaries. Oracle does not support versions that you recompile yourself.

There are three different rpm packages:

  • The OCFS kernel module, which comes in different builds for the Red Hat and United Linux distributions. Verify your kernel version carefully:
    $ uname -a
    Linux linux 2.4.18-4GB #1 Wed Mar 27 13:57:05 UTC 2002 i686 unknown
  • OCFS support package
  • OCFS tool package.
After downloading these rpm packages, perform the following installation steps:
  1. In the directory where the rpm packages were downloaded, run rpm -Uhv ocfs*.rpm to install them.
  2. Make sure that automatic mounting at boot is enabled.
  3. Use ocfstool to configure OCFS automatically on each node in the cluster. Alternatively, you can configure it manually; for details, see the OCFS User's Guide. The end result of this step is the /etc/ocfs.conf file used to configure OCFS.
  4. Run ocfs load_ocfs to ensure that OCFS is loaded at startup.
  5. Format the OCFS partitions using either the ocfstool command and its GUI environment or mkfs.ocfs.
  6. Mount the OCFS partitions manually, or add an entry to /etc/fstab for automatic mounting.
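
To illustrate the last step, here is a small sketch that lists the OCFS entries in an fstab-style file, which can help confirm that automatic mounting has been set up. The device name, mount point, and mount options in the sample file are assumptions made for the demonstration; consult the OCFS User's Guide for the options appropriate to your system.

```shell
#!/bin/sh
# Sketch: list the OCFS entries in an fstab-style file.
# Usage: ocfs_fstab_entries [file]   (defaults to /etc/fstab)
ocfs_fstab_entries() {
    # Skip comment lines; field 3 of an fstab entry is the file system type.
    awk '!/^[[:space:]]*#/ && NF >= 3 && $3 == "ocfs" { print $1, $2 }' "${1:-/etc/fstab}"
}

# Demonstration against a temporary file rather than the real /etc/fstab.
# The device, mount point, and options below are hypothetical.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# device      mount point    type  options   dump pass
/dev/sda1     /              ext3  defaults  1    1
/dev/sdb1     /u02/oradata   ocfs  _netdev   0    0
EOF
ocfs_fstab_entries "$demo"
rm -f "$demo"
```

Each line of output is a device and mount point pair; an empty result means that no OCFS partition is configured to mount automatically.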

Because OCFS version 1 was not written to be POSIX compliant, file commands such as cp, dd, tar, and textutils require the coreutils to provide an O_DIRECT switch. This switch enables these commands to work as expected on Oracle data files, even while Oracle is operating on the same files (which is only an issue if you run third-party software for hot backups). Using RMAN avoids the problem entirely. If you still need these utilities for various maintenance tasks, you can download the OCFS tools that enable these commands from oss.oracle.com/projects/coreutils/files.

By contrast, OCFS version 2 (still in beta as of March 2005) is POSIX compliant and supports the Oracle database software, which can be installed on one node and shared across the other nodes in the cluster. Besides the shared ORACLE_HOME, other new features of OCFS version 2 include improved metadata caching, space allocation, and locking. Journaling and node recovery are also improved.

Network File System (NFS)

Although ASM and OCFS are the preferred file systems for Oracle RAC, Oracle also supports NFS on certified network file servers. NFS is a distributed file system and is outside the scope of this article. For more information, visit the NFS home page.

Raw Devices

In the past, raw devices were the only option for running Oracle RAC. A raw device is a disk drive with no file system on it; it can be divided into multiple raw partitions. Raw devices allow direct access to hardware partitions by bypassing the file system buffer cache.

To enable Oracle RAC to use a raw device, you must bind a block device to the raw device using the Linux raw command:

# raw /dev/raw/raw1 /dev/sda
/dev/raw/raw1: bound to major 8, minor 0
# raw /dev/raw/raw2 /dev/sda1
/dev/raw/raw2: bound to major 8, minor 1
# raw /dev/raw/raw3 /dev/sda2
/dev/raw/raw3: bound to major 8, minor 2

After binding, you can use the raw command to query all raw devices:

# raw -qa
/dev/raw/raw1: bound to major 8, minor 0
/dev/raw/raw2: bound to major 8, minor 1
/dev/raw/raw3: bound to major 8, minor 2

The major and minor numbers tell the kernel where the device is and which driver to use. The major number identifies the general device type, while the minor number identifies the individual device within that type. In the example above, major 8 is the device type for the SCSI disk /dev/sda.
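
The relationship between minor numbers and device names can be sketched as follows. For major 8, the Linux kernel reserves 16 minor numbers per SCSI disk, so minor = 16 × disk index + partition number. The helper names sd_name and resolve_raw_bindings are invented for this illustration, and the parser assumes input in the "bound to major M, minor N" form shown above.

```shell
#!/bin/sh
# Sketch: translate "major 8, minor N" into a SCSI disk name.
# For major 8 the kernel assigns 16 minors per disk: minor = 16*disk + partition,
# so minors 0-15 belong to sda, 16-31 to sdb, and so on up to sdp.
sd_name() {
    minor=$1
    if [ "$minor" -lt 0 ] || [ "$minor" -gt 255 ]; then
        echo "minor out of range: $minor" >&2
        return 1
    fi
    disk=$((minor / 16))
    part=$((minor % 16))
    letter=$(printf 'abcdefghijklmnop' | cut -c "$((disk + 1))")
    if [ "$part" -eq 0 ]; then
        echo "/dev/sd$letter"            # partition 0 is the whole disk
    else
        echo "/dev/sd$letter$part"
    fi
}

# Read lines in the format printed by "raw -qa" and resolve major-8 bindings.
resolve_raw_bindings() {
    sed -n 's|^\(/dev/raw/raw[0-9]*\):[[:space:]]*bound to major \([0-9]*\), minor \([0-9]*\)$|\1 \2 \3|p' |
    while read -r rawdev major minor; do
        if [ "$major" -eq 8 ]; then
            echo "$rawdev -> $(sd_name "$minor")"
        else
            echo "$rawdev -> major $major, minor $minor"
        fi
    done
}

printf '%s\n' \
    '/dev/raw/raw1: bound to major 8, minor 0' \
    '/dev/raw/raw2: bound to major 8, minor 1' \
    '/dev/raw/raw3: bound to major 8, minor 2' | resolve_raw_bindings
```

Run against the sample bindings, this maps raw1, raw2, and raw3 back to /dev/sda, /dev/sda1, and /dev/sda2, matching the block devices used in the raw commands.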

Note that the device does not need to be accessible for the commands above to run. When I ran them for this demonstration, my system had no SCSI disks attached. The effect of these commands disappears at the next reboot unless I put them in a boot script such as /etc/init.d/boot.local or /etc/init.d/dbora, which runs whenever my system boots.
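
A fragment along the following lines could be placed in such a boot script. This is only a sketch: the RUN variable is a hypothetical dry-run switch introduced here so that the fragment can be tried safely, and the device pairs are the ones from the example above.

```shell
#!/bin/sh
# Sketch of a fragment for a boot script such as /etc/init.d/boot.local.
# RUN is a hypothetical switch: left at its default ("echo") the commands are
# only printed; set RUN= (empty) as root to actually execute them.
RUN=${RUN-echo}

# Re-create the raw device bindings, one "rawdev blockdev" pair per line.
bind_raw_devices() {
    while read -r rawdev blockdev; do
        [ -n "$rawdev" ] || continue        # skip blank lines
        $RUN raw "$rawdev" "$blockdev"
    done <<'EOF'
/dev/raw/raw1 /dev/sda
/dev/raw/raw2 /dev/sda1
/dev/raw/raw3 /dev/sda2
EOF
}

bind_raw_devices
```

Keeping the pairs in one list makes it easy to review the bindings and to extend them when new partitions are added.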

After mapping block devices to raw devices, you still need to make sure that the raw devices belong to the oracle user and the oinstall group.

# ls -l /dev/raw/raw1
crw-rw----    1 root     disk     162,   1 Mar 23  2002 /dev/raw/raw1
# chown oracle:oinstall /dev/raw/raw1
# ls -l /dev/raw/raw1
crw-rw----    1 oracle   oinstall 162,   1 Mar 23  2002 /dev/raw/raw1

You can then use symbolic links between the Oracle data files and the raw devices to make things easier to manage.
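
The ownership change and the symbolic link can be combined in a small helper, sketched below. The data file path is a hypothetical example, and RUN is an illustrative dry-run switch: by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: hand a raw device to oracle:oinstall and link a data file name to it.
# RUN=echo (the default here) prints the commands instead of running them;
# set RUN= (empty) as root to execute them.
RUN=${RUN-echo}

# link_datafile RAWDEV LINKPATH
link_datafile() {
    $RUN chown oracle:oinstall "$1"
    $RUN ln -s "$1" "$2"
}

# /u01/oradata/O10G/system01.dbf is a hypothetical data file path.
link_datafile /dev/raw/raw1 /u01/oradata/O10G/system01.dbf
```

With links like this in place, the database can refer to a descriptive file name while the data itself lives on the raw device.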

In Linux kernel 2.4, raw device restrictions include a limit of one raw device per partition and a limit of 255 raw devices per system. Novell SUSE Enterprise Linux ships with 63 raw device files, but you can create up to 255 raw devices using the mknod command (root privileges required).

# ls /dev/raw/raw64
ls: /dev/raw/raw64: No such file or directory
# cd /dev/raw
linux:/dev/raw # mknod raw64 c 162 64
# ls /dev/raw/raw64
/dev/raw/raw64

The mknod command above takes a device name, a device type, and major and minor numbers. In this example, the device name is "raw64" and the device type is "c" (indicating a character device). The major and minor numbers of the new device are 162 and 64, respectively. Alternatively, Novell SUSE users can install these devices by running the orarun rpm.
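
To create the remaining device files in one pass, the mknod call can be wrapped in a loop, as in the sketch below. RAW_DIR and RUN are hypothetical variables introduced for this illustration; with RUN left at "echo" the script only prints the mknod commands it would run.

```shell
#!/bin/sh
# Sketch: create the raw device nodes that are missing, up to the limit of 255.
# RAW_DIR and RUN are hypothetical knobs; with RUN=echo nothing is created.
RAW_DIR=${RAW_DIR-/dev/raw}
RUN=${RUN-echo}

# make_raw_nodes FIRST LAST
make_raw_nodes() {
    n=$1
    while [ "$n" -le "$2" ]; do
        # As in the example above, the major number is 162 and the minor
        # number equals the raw device number.
        [ -e "$RAW_DIR/raw$n" ] || $RUN mknod "$RAW_DIR/raw$n" c 162 "$n"
        n=$((n + 1))
    done
}

make_raw_nodes 64 255
```

Existing device files are left untouched, so the loop can be rerun safely.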

Other disadvantages of using raw devices include:

  • The number of raw partitions on a disk is limited to 14.
  • Oracle Managed Files (OMF) is not supported.
  • Raw device partitions cannot be resized, so if you run out of space, you must create another partition to add a database file.
  • Raw devices appear as unused space, which can lead to other applications overwriting them.
  • The only way to write to a raw device is with the low-level command dd, which transfers raw data between devices or files. However, you need to take extra care that memory and disk I/O operations are properly coordinated.
  • A raw partition can hold only one data file, one control file, or one redo log. Unless you use ASM, you need a separate raw device for every data file associated with a tablespace. A tablespace can, however, have multiple data files on different raw device partitions.

Conclusion

Oracle RAC provides much of the functionality of a file system, clustered or not, minimizing the work required of the file system itself. As noted above, all we need is a file system that complements the existing, built-in database clustering functionality of Oracle RAC. Although OCFS, NFS, and raw devices can also be viable, in most cases ASM fulfills this role to the greatest extent and is therefore considered Oracle best practice. The options can also be combined: ASM for data files; OCFS for the voting disk, OCR, and Oracle home; and ASM on NFS storage.

In the future, we can look forward to another approach: improving shared storage on ASM by using a shared Oracle home on OCFS version 2.

