Oracle Cluster File System version 2 (OCFS2) is a general-purpose, high-performance, high-availability, shared-disk file system intended for use in clusters. It is also possible to mount an OCFS2 volume on a standalone, non-clustered system.
Although it might seem that there is no benefit in mounting ocfs2 locally as compared to alternative file systems such as ext4 or btrfs, you can use the reflink command with OCFS2 to create copy-on-write clones of individual files in a similar way to using the cp --reflink command with the btrfs file system. Typically, such clones allow you to save disk space when storing multiple copies of very similar files, such as VM images or Linux Containers. In addition, mounting a local OCFS2 file system allows you to subsequently migrate it to a cluster file system without requiring any conversion.
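For instance, on a locally mounted OCFS2 volume you might clone a virtual machine image as follows; this is a minimal sketch that assumes a hypothetical mount point /ocfs2 and uses the reflink utility shipped with ocfs2-tools:
# reflink /ocfs2/vm_base.img /ocfs2/vm_clone.img
The clone shares data blocks with the original file, so it initially consumes almost no additional disk space; blocks are copied only when either file is modified.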
Almost all applications can use OCFS2 because it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster, or they can make use of the available file-system functionality to fail over and run on another node in the event that a node fails. The following examples typify some use cases for OCFS2:
- Oracle VM to host shared access to virtual machine images.
- Oracle VM and VirtualBox to allow Linux guest machines to share a file system.
- Oracle Real Application Clusters (RAC) in database clusters.
- Oracle E-Business Suite in middleware clusters.
OCFS2 has a large number of features that make it suitable for deployment in an enterprise-level computing environment:
- Support for ordered and write-back data journaling that provides file system consistency in the event of a power failure or system crash.
- Block sizes ranging from 512 bytes to 4 KB, and file-system cluster sizes ranging from 4 KB to 1 MB (both in power-of-2 increments). The maximum supported volume size is 16 TB, which corresponds to the maximum possible for a cluster size of 4 KB. A volume size as large as 4 PB is theoretically possible for a cluster size of 1 MB, although this limit has not been tested.
- Extent-based allocations for efficient storage of very large files.
- Optimized allocation support for sparse files, inline-data, unwritten extents, hole punching, reflinks, and allocation reservation for high performance and efficient storage.
- Indexing of directories to allow efficient access to a directory even if it contains millions of objects.
- Metadata checksums for the detection of corrupted inodes and directories.
- Extended attributes to allow an unlimited number of name:value pairs to be attached to file system objects such as regular files, directories, and symbolic links (see the example after this list).
- Advanced security support for POSIX ACLs and SELinux in addition to the traditional file-access permission model.
- Support for user and group quotas.
- Support for heterogeneous clusters of nodes with a mixture of 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures.
- An easy-to-configure, in-kernel cluster-stack (O2CB) with a distributed lock manager (DLM), which manages concurrent access from the cluster nodes.
- Support for buffered, direct, asynchronous, splice, and memory-mapped I/O.
- A tool set that uses similar parameters to the ext3 file system.
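As an illustration of the extended-attributes feature mentioned above, the standard setfattr and getfattr utilities (from the attr package) work on OCFS2 as on other Linux file systems; the file path here is hypothetical:
# setfattr -n user.backup_policy -v nightly /ocfs2/data/archive.dat
# getfattr -n user.backup_policy /ocfs2/data/archive.dat
The first command attaches a user.backup_policy=nightly name:value pair to the file, and the second reads it back.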
Use yum to install or upgrade the following packages to the same version on each node, as in the example below:
- kernel-uek
- ocfs2-tools
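For example, a single yum transaction can install both packages on a node:
# yum install kernel-uek ocfs2-tools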
Creating the Configuration File for the Cluster Stack
You can create the configuration file by using the o2cb command or a text editor.
To configure the cluster stack by using the o2cb command:
- Use the following command to create a cluster definition:
# o2cb add-cluster cluster_name
For example, to define a cluster named mycluster with four nodes:
# o2cb add-cluster mycluster
The command creates the configuration file /etc/ocfs2/cluster.conf if it does not already exist.
- For each node, use the following command to define the node:
# o2cb add-node cluster_name node_name --ip ip_address
The name of the node must be the same as the value of the system's HOSTNAME that is configured in /etc/sysconfig/network. The IP address is the one that the node will use for private communication in the cluster.
For example, to define a node named node0 with the IP address 10.1.0.100 in the cluster mycluster:
# o2cb add-node mycluster node0 --ip 10.1.0.100
- If you want the cluster to use global heartbeat devices, use the following commands:
# o2cb add-heartbeat cluster_name device1
...
# o2cb heartbeat-mode cluster_name global
Note
You must configure global heartbeat to use whole disk devices. You cannot configure a global heartbeat device on a disk partition.
For example, to use /dev/sdd, /dev/sdg, and /dev/sdj as global heartbeat devices:
# o2cb add-heartbeat mycluster /dev/sdd
# o2cb add-heartbeat mycluster /dev/sdg
# o2cb add-heartbeat mycluster /dev/sdj
# o2cb heartbeat-mode mycluster global
- Copy the cluster configuration file /etc/ocfs2/cluster.conf to each node in the cluster.
Note
Any changes that you make to the cluster configuration file do not take effect until you restart the cluster stack.
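For example, assuming the other cluster members are reachable as node1, node2, and node3 (hypothetical host names), you could distribute the file with scp:
# scp /etc/ocfs2/cluster.conf node1:/etc/ocfs2/cluster.conf
# scp /etc/ocfs2/cluster.conf node2:/etc/ocfs2/cluster.conf
# scp /etc/ocfs2/cluster.conf node3:/etc/ocfs2/cluster.conf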
The following sample configuration file /etc/ocfs2/cluster.conf defines a 4-node cluster named mycluster with a local heartbeat:

node:
	name = node0
	cluster = mycluster
	number = 0
	ip_address = 10.1.0.100
	ip_port = 7777

node:
	name = node1
	cluster = mycluster
	number = 1
	ip_address = 10.1.0.101
	ip_port = 7777

node:
	name = node2
	cluster = mycluster
	number = 2
	ip_address = 10.1.0.102
	ip_port = 7777

node:
	name = node3
	cluster = mycluster
	number = 3
	ip_address = 10.1.0.103
	ip_port = 7777

cluster:
	name = mycluster
	heartbeat_mode = local
	node_count = 4
If you configure your cluster to use a global heartbeat, the file also includes entries for the global heartbeat devices:
node:
	name = node0
	cluster = mycluster
	number = 0
	ip_address = 10.1.0.100
	ip_port = 7777

node:
	name = node1
	cluster = mycluster
	number = 1
	ip_address = 10.1.0.101
	ip_port = 7777

node:
	name = node2
	cluster = mycluster
	number = 2
	ip_address = 10.1.0.102
	ip_port = 7777

node:
	name = node3
	cluster = mycluster
	number = 3
	ip_address = 10.1.0.103
	ip_port = 7777

cluster:
	name = mycluster
	heartbeat_mode = global
	node_count = 4

heartbeat:
	cluster = mycluster
	region = 7DA5015346C245E6A41AA85E2E7EA3CF

heartbeat:
	cluster = mycluster
	region = 4F9FBB0D9B6341729F21A8891B9A05BD

heartbeat:
	cluster = mycluster
	region = B423C7EEE9FC426790FC411972C91CC3
The cluster heartbeat mode is now shown as global, and the heartbeat regions are represented by the UUIDs of their block devices.
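One way to map a heartbeat region UUID back to its underlying block device is the mounted.ocfs2 utility from ocfs2-tools, which lists detected OCFS2 devices together with their UUIDs and labels (output details vary by version):
# mounted.ocfs2 -d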
If you edit the configuration file manually, ensure that you use the following layout:
- The cluster:, heartbeat:, and node: headings must start in the first column.
- Each parameter entry must be indented by one tab.
- A blank line must separate each section that defines the cluster, a heartbeat device, or a node.
To configure the cluster stack:
- Run the following command on each node of the cluster:
# /etc/init.d/o2cb configure
The command prompts you for a number of values, including whether to load the O2CB driver on boot, the name of the cluster to start on boot, and the heartbeat and network timeout settings.
To verify the settings for the cluster stack, enter the systemctl status o2cb command:
# systemctl status o2cb
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster "mycluster": Online
  Heartbeat dead threshold: 61
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
  Heartbeat mode: Local
Checking O2CB heartbeat: Active
In this example, the cluster is online and is using local heartbeat mode. If no volumes have been configured, the O2CB heartbeat is shown as Not active rather than Active.
The next example shows the command output for an online cluster that is using three global heartbeat devices:
# systemctl status o2cb
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster "mycluster": Online
  Heartbeat dead threshold: 61
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
  Heartbeat mode: Global
Checking O2CB heartbeat: Active
  7DA5015346C245E6A41AA85E2E7EA3CF /dev/sdd
  4F9FBB0D9B6341729F21A8891B9A05BD /dev/sdg
  B423C7EEE9FC426790FC411972C91CC3 /dev/sdj
- Configure the o2cb and ocfs2 services so that they start at boot time after networking is enabled:
# systemctl enable o2cb
# systemctl enable ocfs2
These settings allow the node to mount OCFS2 volumes automatically when the system starts.
For the correct operation of the cluster, you must configure the panic and panic_on_oops kernel settings, which ensure that a node resets itself automatically after a kernel panic or oops instead of hanging.
On each node, enter the following commands to set the recommended values for panic and panic_on_oops:
# sysctl -w kernel.panic=30
# sysctl -w kernel.panic_on_oops=1
To make the change persist across reboots, add the following entries to the /etc/sysctl.conf file:
# Define panic and panic_on_oops for cluster operation
kernel.panic = 30
kernel.panic_on_oops = 1
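To apply the settings from /etc/sysctl.conf immediately, without waiting for a reboot, you can then run:
# sysctl -p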
You can perform various operations on the cluster stack, such as starting, stopping, and checking it, by managing the o2cb service.
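For instance, a minimal sketch using the same systemctl verbs already used in this section:
# systemctl start o2cb
# systemctl stop o2cb
# systemctl status o2cb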
You can use the mkfs.ocfs2 command to create an OCFS2 volume on a device. If you want to label the volume and mount it by specifying the label, the device must correspond to a partition. You cannot mount an unpartitioned disk device by specifying a label. The examples that follow demonstrate the most useful options that you can use when creating an OCFS2 volume.
For example, create an OCFS2 volume on /dev/sdc1 labeled as myvol using all the default settings for generic usage (4 KB block and cluster size, eight node slots, a 256 MB journal, and support for default file-system features):
# mkfs.ocfs2 -L "myvol" /dev/sdc1
Create an OCFS2 volume on /dev/sdd2 labeled as dbvol for use with database files. In this case, the cluster size is set to 128 KB and the journal size to 32 MB:
# mkfs.ocfs2 -L "dbvol" -T datafiles /dev/sdd2
Create an OCFS2 volume on /dev/sde1 with a 16 KB cluster size, a 128 MB journal, 16 node slots, and support enabled for all features except refcount trees:
# mkfs.ocfs2 -C 16K -J size=128M -N 16 --fs-feature-level=max-features \
  --fs-features=norefcount /dev/sde1
Note
Do not create an OCFS2 volume on an LVM logical volume. LVM is not cluster-aware.
You cannot change the block and cluster size of an OCFS2 volume after it has been created. You can use the tunefs.ocfs2 command to modify other settings for the file system, with certain restrictions. For more information, see the tunefs.ocfs2(8) manual page.
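For example, one setting that can be changed after creation is the number of node slots; the following sketch (device name illustrative) uses the -N option of tunefs.ocfs2 to raise the slot count to 16:
# tunefs.ocfs2 -N 16 /dev/sdc1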
As shown in the following example, specify the _netdev option in /etc/fstab if you want the system to mount an OCFS2 volume at boot time after networking is started, and to unmount the file system before networking is stopped:
myocfs2vol /dbvol1 ocfs2 _netdev,defaults 0 0
Note
The file system will not mount unless you have enabled the o2cb and ocfs2 services to start after networking is started. See Section 20.2.5, “Configuring the Cluster Stack”.
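Because a labeled volume on a partition can also be mounted by its label (as described earlier), an equivalent entry could use the standard LABEL= syntax supported by /etc/fstab; the label myvol is illustrative:
LABEL=myvol /dbvol1 ocfs2 _netdev,defaults 0 0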
You can use the tunefs.ocfs2 command to query or change volume parameters. For example, to find out the label, UUID, and number of node slots for a volume:
# tunefs.ocfs2 -Q "Label = %V\nUUID = %U\nNumSlots = %N\n" /dev/sdb
Label = myvol
UUID = CBB8D5E0C169497C8B52A0FD555C7A3E
NumSlots = 4
Generate a new UUID for a volume:
# tunefs.ocfs2 -U /dev/sdb
# tunefs.ocfs2 -Q "Label = %V\nUUID = %U\nNumSlots = %N\n" /dev/sdb
Label = myvol
UUID = 48E56A2BBAB34A9EB1BE832B3C36AB5C
NumSlots = 4