Set up GlusterFS with a replicated volume over 2 nodes
Category: How-to
This post will show you how to install GlusterFS on Ubuntu/Debian; the steps will be similar on Red Hat-based Linux operating systems, with minor changes to the commands.
Gluster File System (GlusterFS) is a distributed file system that allows you to create a single storage volume spanning multiple disks, multiple machines and even multiple data centres.
Before we get started, install the required packages using apt-get. On Red Hat/CentOS-based operating systems you will need to use yum, or download the package directly from http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/
apt-get install glusterfs-server
Perform this on both of your servers. If you have more than two servers, perform this command on all of the servers required for the volume.
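On a Red Hat/CentOS machine, assuming a Gluster repository such as the one linked above is available to yum, the equivalent install would be:
yum install glusterfs-server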
You will now need each of these servers to know about the others. Run gluster peer probe with the hostname or IP address of each of the other servers in your GlusterFS cluster.
gluster peer probe gfs2.jamescoyle.net
Each of the commands should return Probe successful, which means the node is now known to this machine. You will only need to do this on one node of your cluster.
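In a larger cluster you would simply probe every other node from the first one. A quick sketch for a hypothetical three-node cluster (gfs3.jamescoyle.net is a placeholder hostname, not part of this setup):
gluster peer probe gfs2.jamescoyle.net
gluster peer probe gfs3.jamescoyle.net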
Run gluster peer status to check each node in your cluster is aware of the other nodes:
gluster peer status
The result should look like:
Number of Peers: 1

Hostname: gfs2.jamescoyle.net
Uuid: a0977ca2-6e47-4c1a-822b-99df896080ee
State: Peer in Cluster (Connected)
Now we need to create the volume where the data will reside. The volume will be called datastore. First of all, we need to identify where on each host this storage is. For this example, it is /mnt/gfs_block on both nodes, but this could be any mount point of storage that you have. If the folder does not exist, it will be silently created, so be sure to get the correct path on all nodes.
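If you would rather not rely on that silent creation, you can create the brick directory yourself first. A minimal sketch, assuming the /mnt/gfs_block path used in this example; run it on both nodes:
mkdir -p /mnt/gfs_block
With the directory in place on both nodes, create the replicated volume: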
gluster volume create datastore replica 2 transport tcp gfs1.jamescoyle.net:/mnt/gfs_block gfs2.jamescoyle.net:/mnt/gfs_block
If this has been successful, you should see:
Creation of volume datastore has been successful. Please start the volume to access data.
As the message indicates, we now need to start the volume:
gluster volume start datastore
And wait for the message that it has started:
Starting volume datastore has been successful
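On newer GlusterFS releases you can also check the state of the volume's processes with gluster volume status; a quick example for this volume:
gluster volume status datastore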
Running either of the commands below should indicate that GlusterFS is up and running. The ps command should show the gluster processes running with both servers in the arguments. netstat should show a connection between both nodes.
ps aux | grep gluster
netstat -tap | grep glusterfsd
As a final test, to make sure the volume is available, run gluster volume info. An example output is below:
gluster volume info

Volume Name: datastore
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfs1.jamescoyle.net:/mnt/gfs_block
Brick2: gfs2.jamescoyle.net:/mnt/gfs_block
That’s it! You now have a GlusterFS volume which will maintain replication across two nodes. To see how to use your volume, see our guide to mounting a volume.
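As a quick preview of that guide, here is a minimal sketch of mounting the volume with the native client, assuming a Debian/Ubuntu client machine and /mnt/datastore as a hypothetical mount point:
apt-get install glusterfs-client
mkdir -p /mnt/datastore
mount -t glusterfs gfs1.jamescoyle.net:/datastore /mnt/datastore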
13 Comments
Srikanth
9-Jan-2014 at 10:41 am
Hi,
I followed the exact steps provided and everything is okay, but I cannot see the data replicated on the other node when I create a file on the 1st node. Kindly help me in this regard.
Vijay
16-Jan-2014 at 12:52 am
Hi Srikanth,
How are you accessing the gluster share? Through NFS or the gluster client?
Vijay
Sebastian
11-Feb-2014 at 12:30 pm
Hello James
Is the only way to get the files synchronized to copy them through the client?
Is there no way to create a 2-node setup and forget the third machine (the client)?
Best regards
james.coyle
11-Feb-2014 at 12:35 pm
Hi Sebastian,
Files will be synchronized between servers regardless of the client being connected. For example, if I change a file on server1, the file will be replicated to server2.
Sebastian
11-Feb-2014 at 2:12 pm
Great. Thanks for your time.
Best regards
Sebastian
11-Feb-2014 at 3:31 pm
Hi James
I tried to replicate files directly on one of the bricks and it doesn't work. Access to the files must be through the client… that's my experience.
Would it be possible to install the client on one of the brick servers?
My idea is to create a 2-server-only architecture.
Best regards
james.coyle
11-Feb-2014 at 5:19 pm
Sorry Sebastian, I understand your question now – the client being an actual GlusterFS client.
All file synchronization is done via the GlusterFS client. I haven’t tried it myself, but I’d be pretty sure you can install the GlusterFS client on the server and create a mount point. You can then use this mount point as the storage access point.
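Untested, but it should be as simple as using the same mount approach shown in the post, pointing the native client at localhost (with /mnt/datastore as an example mount point):
mount -t glusterfs localhost:/datastore /mnt/datastore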
Andrew
28-Apr-2015 at 4:03 pm
Hello James!
As far as I can see, I am facing the same situation as Sebastian.
I created two bricks (replica 2).
If I change something on the 1st or 2nd brick, the changes do not sync to the opposite brick while there are no active clients.
If the volume has at least one connected client, changes from the bricks or made by the client are saved on both bricks normally.
In this case I have two questions:
1. Is this normal behavior for Gluster? Why was this decision made?
2. How can I avoid the obligatory client connections? (i.e. sync changes between bricks without active local/remote clients)
Thanks in advance.
Ernie
26-Feb-2014 at 5:43 am
I have made a two-node replicated volume as per your example above and it's running fine. How can I now add a 3rd node to it for greater redundancy?
Shyam
6-Feb-2015 at 2:51 pm
Can we create and start multiple volumes?
How can the shared volume be managed for multiple users?
Aaron Toponce
12-Mar-2015 at 3:00 am
I'm not sure if this is a bug that has been fixed in a recent release, or if this is just how 2-peer replicated volumes work, but I have a bit of a situation I can't resolve.
I have 2 KVM hypervisors peered, with KVM images mounted to the GlusterFS client mount. The peers are in a 2-node replicated setup. Suppose the hypervisors have a kernel update. My steps will go something like this:
1. Migrate all the VMs off one KVM hypervisor to the other.
2. Reboot the free hypervisor.
At that reboot, GlusterFS disk IO will go through the roof on the live KVM hypervisor, creating an effective DoS of the running VMs. This will last well past the reboot of the other KVM (30 minutes or more). Once the IO has settled, and I migrate the VMs to the freshly rebooted KVM, the problem starts all over again.
The only way to do this in any sort of sane manner, is to shut down all running VMs on both hypervisors, reboot both hypervisors, and start the VMs back up. This is a problem, as I would like to use the shared storage GlusterFS provides for high availability, and no downtime for the VMs in this scenario. Unfortunately, I have not been successful in this regard.
Thoughts? I’m running Debian 7.8 stable on both hypervisors, with GlusterFS 3.2.7.
Toms Varghese
17-Apr-2015 at 3:25 pm
Hi James, your articles have been a great help for me in experimenting with GlusterFS. I would like to go through the source code of GlusterFS and understand how it works, but I am unable to find any proper documentation of the overall architecture and code structure. Do you know whether there are any code walk-through documents available for the GlusterFS source code? All the documentation I found was only for the administration of GlusterFS.
Sorry if this is not the right place to ask this..
Thanks
madhavi
6-Nov-2015 at 12:53 am
Hi, I am facing two issues:
1. Just two days back I installed glusterfs-server and it worked fine, but the 2nd brick didn't show the file automatically in sync with the 1st brick.
2. Today when I restarted my VM, a "connection failed" error is shown ("check gluster daemon is operational") and it is not restarting.
Is there any restart command on Ubuntu 14?
Please help …