My experience with GlusterFS performance.

Category : How-to

I have been using GlusterFS to replicate storage between two physical servers for two reasons: load balancing and data redundancy. I use this on top of a ZFS storage array, as described in this post, and the two technologies combined provide a fast and very redundant storage mechanism. At the ZFS layer, or whichever filesystem technology you may use, there are several functions we can leverage to provide fast performance. For ZFS specifically, we can add SSD disks for caching and tweak memory settings to provide the most throughput possible on any given system. With GlusterFS we also have several ways to improve performance, but before we look into those we need to be sure that it is the GlusterFS layer which is causing the problem. For example, if your disks or network are slow, what chance does GlusterFS have of giving you good performance? You also need to understand how the individual components work under the load of your expected environment. The disks may work perfectly well when you use dd to create a huge file, but what about when lots of users create lots of files all at the same time? You can break performance down into three key areas:

  • Networking – the network between each GlusterFS instance.
  • Filesystem IO performance – the file system local to each GlusterFS instance.
  • GlusterFS – the actual GlusterFS process.

Networking Performance

Before testing the disk and file system, it’s a good idea to make sure that the network connection between the GlusterFS nodes is performing as you would expect. Test the network bandwidth between all GlusterFS boxes using Iperf. See the Iperf blog post for more information on benchmarking network performance. Remember to test the performance over a period of several hours to minimise the effect of host and network load. If you make any network changes, remember to test between each change to make sure it has had the desired effect.
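
As a minimal example (the hostnames gfs1 and gfs2 are illustrative), run the Iperf server on one GlusterFS node and the client on the other. On gfs1, start the Iperf server:

iperf -s

On gfs2, run a 60 second test against gfs1:

iperf -c gfs1 -t 60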

Filesystem IO Performance

Once you have tested the network between all GlusterFS boxes, you should test the local disk speed on each machine. There are several ways to do this, but I find it’s best to keep it simple and use one of two options: dd or bonnie++. You must be sure to turn off any GlusterFS replication, as it is just the disks and filesystem we are trying to test here. Bonnie++ is a freely available IO benchmarking tool; dd is a Linux command line tool which can copy data streams and files. See this blog post for information on benchmarking the file system.
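
As a rough example (the target path and sizes are illustrative, and the test file should be deleted afterwards), a simple sequential write test with dd and a fuller run with bonnie++ could look like this:

dd if=/dev/zero of=/mnt/binaries/dd.test bs=1M count=1024 conv=fdatasync
bonnie++ -d /mnt/binaries -u root

Run the tests on each GlusterFS node so that the results can be compared.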

Technology, Tuning and GlusterFS

Once we are certain that disk I/O and network bandwidth are not the issue or, more importantly, understand what constraints they place on your environment, you can tune everything else to maximise performance. In our case, we are trying to maximise GlusterFS replication performance over two nodes.

We can aim to achieve replication speeds nearing that of the slowest performing component: the file system IO or the network.

See my blog post on GlusterFS performance tuning.
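
As an illustrative sketch (the volume name datastore and the values are placeholders rather than recommendations), GlusterFS volume options are inspected and changed with the gluster command line tool:

gluster volume info datastore
gluster volume set datastore performance.cache-size 256MB
gluster volume set datastore performance.io-thread-count 32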


ZFS and GlusterFS network storage

Category : How-to

Since ZFS was ported to the Linux kernel I have used it constantly on my storage server. With the ability to use SSD drives for caching and larger mechanical disks for the storage arrays you get great performance, even in I/O intensive environments. ZFS offers superb data integrity as well as compression, RAID-like redundancy and de-duplication. As a file system it is brilliant, created in the modern era to meet our current demands of huge, redundant data volumes. As you can see, I am an advocate of ZFS and would recommend its use for any environment where data integrity is a priority.

Please note, although ZFS on Solaris supports encryption, the current version of ZFS on Linux does not. If you are using ZFS on Linux, you will need to use a 3rd party encryption method such as LUKS or eCryptfs.
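
As a rough sketch of the LUKS approach (the device name is illustrative, and luksFormat will destroy any data on the device), each disk is encrypted with cryptsetup and the resulting mapper device is then handed to ZFS:

cryptsetup luksFormat /dev/vdb
cryptsetup luksOpen /dev/vdb crypt-vdb
zpool create datastore /dev/mapper/crypt-vdb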

The problem with ZFS is that it is not distributed. Distributed file systems can span multiple disks and multiple physical servers to produce one (or many) storage volumes. This gives your file storage added redundancy and load balancing, and is where GlusterFS comes in.

GlusterFS is a distributed file system which can be installed on multiple servers and clients to provide redundant storage. GlusterFS comes in two parts:

  • Server – the server is used to perform all the replication between disks and machine nodes to provide a consistent set of data across all replicas. The server also handles client connections with its built-in NFS service.
  • Client – this is the software required by all machines which will access the GlusterFS storage volume. Clients can mount storage from one or more servers and employ caching to help with performance; an example client mount is shown below.
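
As a minimal sketch, a client could mount a GlusterFS volume like this (the hostname gfs1, volume name datastore and mount point are illustrative and will depend on your own setup):

mount -t glusterfs gfs1:/datastore /mnt/datastore

For a replicated volume, the native GlusterFS client talks to all replica servers directly, so the mount should remain available if one server goes down; alternatively, the volume can be mounted over NFS using the server's built-in NFS service.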

The diagram below shows the high-level layout of the storage setup. Each node contains three disks which form a RAID-Z 1 virtual ZFS volume, which is similar to RAID 5. This provides redundant storage and allows recovery from a single disk failure with minor impact to service and zero downtime. The volume is then split into three sub-volumes which can have various properties applied; for example, compression and encryption. GlusterFS is then set up on top of these three volumes to provide replication to the second hardware node. GlusterFS handles this synchronisation seamlessly in the background, making sure both of the physical machines contain the same data at the same time.

[Diagram: ZFS and GlusterFS high-level structure]

For this storage architecture to work, two individual hardware nodes should have the same amount of local storage available presented as a ZFS pool. On top of this storage layer, GlusterFS will synchronise, or replicate, the two logical ZFS volumes to present one highly available storage volume.

See this post for setting up ZFS on Ubuntu. For the very latest ZFS binaries, you will need to use Solaris as the ZFS on Linux project is slightly behind the main release. Set up ZFS on both physical nodes with the same amount of storage, presented as a single ZFS storage pool. Configure the required ZFS datasets on each node, such as binaries, homes and backups in this example. At this point, you should have two physical servers presenting exactly the same ZFS datasets.

We now need to synchronise the storage across both physical machines. In Gluster terminology, this is called replication. To see how to set up GlusterFS replication on two nodes, see this article.
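
As a very brief sketch (the hostnames gfs1 and gfs2, the volume name and the brick paths are illustrative; the linked article covers the full procedure), replicating one of the datasets between the two nodes looks something like this, run from gfs1:

gluster peer probe gfs2
gluster volume create binaries replica 2 gfs1:/mnt/binaries gfs2:/mnt/binaries
gluster volume start binaries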

These two technologies combined provide a very stable, highly available storage solution with strong data integrity. ZFS handles disk-level corruption and hardware failure, whilst GlusterFS makes sure storage remains available in the event a node goes down and provides load balancing for performance.



ZFS dataset encryption

Category : How-to

ZFS datasets support a host of features to help you manage your storage mounts as effectively as possible. Dataset encryption was added to ZFS in pool version 30 and can be enabled on a ZFS dataset during dataset creation. As ZFS on Linux is behind the official Solaris release, encryption is not available; ZFS on Linux is currently only at pool version 28.

You cannot encrypt an existing dataset. You would have to create a new, encrypted dataset, and migrate your data.

To create a dataset with encryption enabled, use the following command. Replace [MOUNT POINT] with the location where the encrypted volume should be mounted, [ZPOOL] with the name of the existing pool to use and [DATASET NAME] with the name to give the new encrypted dataset.

zfs create -o encryption=on -o mountpoint=[MOUNT POINT] [ZPOOL]/[DATASET NAME]

For example:

zfs create -o encryption=on -o mountpoint=/mnt/homes datastore/homes

Now, you will be asked for a passphrase to use. Enter a passphrase, and then confirm it by typing it a second time. Your encrypted dataset will now be created.

Enter passphrase for 'datastore/homes': xxxxxxxxxxxxxxxxx
Enter again: xxxxxxxxxxxxxxxxx

Finally, check the dataset was created and encrypted:

zfs get encryption datastore/homes
NAME              PROPERTY    VALUE        SOURCE
datastore/homes   encryption  on           local



ZFS dataset compression

Category : How-to

ZFS datasets support a host of features to help you manage your storage mounts as effectively as possible. Compression is a feature common in many file systems and it’s also included in ZFS!

See this post for information on installing ZFS and setting up a volume with a data set.

First of all, we need to identify the dataset to apply compression to, using zfs list.

zfs list

The result will look similar to the output below. In my example, there are three datasets: backups, binaries and homes.

NAME                 USED  AVAIL  REFER  MOUNTPOINT
datastore            312K  62.6G  38.6K  /datastore
datastore/backups   38.6K  62.6G  38.6K  /mnt/backups
datastore/binaries  38.6K  62.6G  38.6K  /mnt/binaries
datastore/homes     38.6K  62.6G  38.6K  /mnt/homes

Let’s apply compression to the datastore/homes dataset. Check to see if compression is already applied to the dataset with zfs get compression.

zfs get compression datastore/homes
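
If compression has not yet been enabled, the output will look similar to the below (the default value may vary between ZFS versions):

NAME             PROPERTY     VALUE     SOURCE
datastore/homes  compression  off       default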

Our datastore/homes dataset is not currently compressed. To enable compression, use the zfs set command.

zfs set compression=on datastore/homes

Check the compression status again to make sure the change has taken effect.

zfs get compression datastore/homes
NAME             PROPERTY     VALUE     SOURCE
datastore/homes  compression  on        local

Setting compression=on uses ZFS’s default compression algorithm; there are various other options we can use here to override it. If gzip is specified without a level, gzip-6 is used.

  • The compression library gzip ranges from gzip-1, with the lowest compression, up to gzip-9 with the highest compression.
  • ZLE is the fastest compression method, but it only compresses one scenario – a run of zeros.
  • The older LZJB library is also available, however it usually offers inferior compression compared with gzip.

For a few more examples, see below.

zfs set compression=gzip-9 datastore/homes
zfs set compression=lzjb datastore/homes

Finally, to remove compression from a dataset, run:

zfs set compression=off datastore/homes



Create a ZFS volume on Ubuntu

Category : How-to

ZFS is a disk and logical volume manager combining RAID-like functionality with guaranteed data integrity. Every block of data read by ZFS is checksummed and recovered if an error is found. ZFS also periodically checks the entire file system for any silent corruption which may have occurred since the data was written.

ZFS was initially developed by Sun for use in Solaris and as such was not available on Linux distributions. Thanks to some clever guys over at ZFS on Linux, this has now changed. We can now install ZFS on most Linux distributions such as Debian/Ubuntu and Red Hat/CentOS.

ZFS provides a data volume which can have multiple mount points, spanning multiple disks. Disks can be combined into virtual groups to allow for various redundancy options:

  • Mirror – data will be mirrored across disks, equivalent to RAID 1. This is quite simply a copy of one disk to another every time data is changed. You require a minimum of two disks for a mirrored set. This provides the best redundancy but requires the most space. For example, if you use 2x 500GB disks, only 500GB will be available as the other 500GB will be a copy of the first disk.
  • Stripe – data will be stored across all available disks, equivalent to RAID 0. In a two disk striped array, half of a file would be on disk one and half of the file on disk two. This provides the fastest read and write speeds but it offers no redundancy. In the event of a failed disk, all data on the stripe will be lost.
  • RAID-Z – data will be written to all but one of the disks, with the remaining disk used for parity. This is equivalent to RAID 5. A minimum of three disks are required, with one disk always being used for parity. In the event of a single disk failure, all data can be recovered and, in fact, will still be accessible providing no further disks fail. In the event of a second disk failure, all data on the RAID-Z will be lost.
  • RAID-Z 2 and RAID-Z 3 – these are the same as RAID-Z but with two and three disks used for parity respectively. RAID-Z 3 is recommended for environments where data consistency is highly critical. RAID-Z 2 requires a minimum of four disks, and RAID-Z 3 requires a minimum of five disks.

[Diagram: ZFS high-level structure]

In addition to these virtual groups, multiple groups can be combined. For example, you can mirror a striped virtual volume to create a RAID 10. This gives the added performance of striped volumes with the redundancy of mirrored volumes.
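
For illustration (the pool name and device names are placeholders), the layouts above map to zpool create commands as follows: a two disk mirror, a two disk stripe, a three disk RAID-Z 1 and a striped mirror (RAID 10) respectively.

zpool create datastore mirror /dev/vdb /dev/vdc
zpool create datastore /dev/vdb /dev/vdc
zpool create datastore raidz /dev/vdb /dev/vdc /dev/vdd
zpool create datastore mirror /dev/vdb /dev/vdc mirror /dev/vdd /dev/vde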

For our example below, we are going to create a single RAID-Z 1 pool with three disks. This gives us two full disks of storage, and a further disk for parity.

Installing ZFS on Ubuntu

Before we can start using ZFS, we need to install it. Simply add the repository to apt-get with the following command:

apt-add-repository --yes ppa:zfs-native/stable

In a minimum package install, you may not have the apt-add-repository command installed.

The program 'apt-add-repository' is currently not installed.  You can install it by typing:
apt-get install python-software-properties

If this is the case, install it before running the apt-add-repository command.

apt-get install python-software-properties

Update the apt cache with the update argument

apt-get update

Install the ZFS binaries, tools and kernel modules. This may take a while due to the number of packages apt will have to download and the time needed to build the tools and the ZFS modules for the kernel.

apt-get install ubuntu-zfs

At this point, it is best to check that the kernel module was correctly compiled and loaded.

dmesg | grep ZFS

The output should look like the below. If it does not, try running modprobe zfs.

[  824.725076] ZFS: Loaded module v0.6.1-rc14, ZFS pool version 5000, ZFS filesystem version 5

Create a RAID-Z 1 3-disk array

Once ZFS is installed, we can create a virtual volume of our three disks. The three disks should all be the same size; if they are not, the smallest disk’s capacity will be used on all three disks.

Identify the disks you would like to use with fdisk. Some disk controllers may have their own naming conventions and administration tools, but we’ll use fdisk in this example. Whilst we are on this point, RAID controllers should not be set up with their own RAID functionality when using ZFS. Some of the mechanisms in ZFS can be fooled by an underlying layer that is also doing data parity, and data corruption can occur in that environment.

fdisk -l | grep /dev/

The output will look like:

Disk /dev/vdb doesn't contain a valid partition table
Disk /dev/vdc doesn't contain a valid partition table
Disk /dev/vdd doesn't contain a valid partition table

And there we have it! The three disks to add to our ZFS array. Note, I have removed the root volume in this example to avoid confusion.

Run the zpool create command, passing in the disks to use for the array as arguments. Specifying the -f argument removes the need to create partitions on the disks prior to creating the array. This command creates a zpool called datastore, however you can change this to suit your needs.

zpool create -f datastore raidz /dev/vdb /dev/vdc /dev/vdd

Confirm the zpool has been created with:

zpool status datastore

The output should be similar to:

  pool: datastore
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        datastore   ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            vdb1    ONLINE       0     0     0
            vdc1    ONLINE       0     0     0
            vdd1    ONLINE       0     0     0

errors: No known data errors

Create ZFS dataset

At this point, we now have a zpool spanning three disks. One of these is used for parity, giving us the chance to recover in the event of a single disk failure. The next step is to make the volume usable and add features such as compression, encryption or de-duplication.

Multiple datasets or mount points can be created on a single volume. Generally, you do not specify the size of these; put simply, the storage of the zpool will be available to any dataset as it requires it. You can set up quotas to manage dataset sizes, but that won’t be covered in this tutorial.
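
As a quick illustration only (the size is arbitrary), a limit can be applied to a dataset once it exists with the quota property:

zfs set quota=10G datastore/homes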

What we are interested in is creating three datasets: binaries, homes and backups. These will be mounted at /mnt/binaries, /mnt/homes and /mnt/backups respectively. Using the zfs create command, create the three required datasets.

We specify the mount point, zpool and dataset name in the command.

zfs create -o mountpoint=[MOUNT POINT] [ZPOOL NAME]/[DATASET NAME]

Example:

zfs create -o mountpoint=/mnt/binaries datastore/binaries
zfs create -o mountpoint=/mnt/homes datastore/homes
zfs create -o mountpoint=/mnt/backups datastore/backups

Test the datasets have been created with zfs list.

zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
datastore            312K  62.6G  38.6K  /datastore
datastore/backups   38.6K  62.6G  38.6K  /mnt/backups
datastore/binaries  38.6K  62.6G  38.6K  /mnt/binaries
datastore/homes     38.6K  62.6G  38.6K  /mnt/homes

And an ls in /mnt should give us the mount points.

ls /mnt/
backups/   binaries/   homes/

You can now use your mounted datasets as required. You can export them as NFS, CIFS or simply use them as local storage!
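
As a brief example, ZFS can share a dataset over NFS directly through the sharenfs property, provided an NFS server package is installed on the machine:

zfs set sharenfs=on datastore/binaries
zfs get sharenfs datastore/binaries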

See my other posts for compression and encryption. Please note, encryption is not currently available on ZFS for Linux.

