Skyscraper I love you

Building a Solaris Cluster Express cluster in a VirtualBox on OpenSolaris

I’ve been wanting to have a play around with both the Solaris Cluster Express work that’s coming out of OpenSolaris and also VirtualBox, a virtualization platform that Sun recently acquired and have moved under their xVM banner. So wanting to kill two birds with one stone I thought I’d try setting up a VirtualBox Solaris Express cluster. Here’s a run through on how to get the same thing going if you’d like to try.

First an overview of what I set out to achieve.

  • Clustered MySQL server in a zone
  • Clustered Apache zone providing phpmyadmin front end for MySQL
  • On a two node Solaris Express cluster
  • With nodes running in VirtualBox virtual machines on a single physical machine
  • Shared storage provided over iSCSI from the host machine

VirtualBox does not support sharing the same (VDI) disk image between multiple hosts, unless the image is read-only. As such VirtualBox cannot natively provide storage appropriate for a clustered environment, so we’re going to present storage over iSCSI to the virtual nodes.

At the time of writing Solaris Cluster Express 6/08 has just been released. It has been updated to run on Solaris Express Community Edition (SXCE) Build 86, so we’ll be using that for our guest OS in the virtual machines.

Initially I was hoping to use OS X as the host platform for VirtualBox, however the networking support is far from complete in the OS X release. Specifically it does not support ‘internal networking’, and that’s needed for the cluster interconnects. So instead I chose OpenSolaris 2008.05 for the host. This has the advantage that the host can be used to provide the iSCSI services along with a Quorum Server, and it has the required networking flexibility.

If you don’t have an OpenSolaris host, then any recent Solaris 10 or Solaris Express server on your network should work fine. Ensure that ZFS supports the ‘shareiscsi’ option and that you can share two 8GB volumes from it. You will also need to install a Quorum Server on it. If you don’t have a suitable host on your network you could create a third VirtualBox machine to provide the required services.

To build this environment the following downloads were used

  • Solaris Express community edition, build 86, from here
  • VirtualBox 1.6.2 from here
  • Solaris Cluster Express 6/08 from here
  • OpenSolaris 2008.05 from here
  • phpMyAdmin 2.11.6 from here

You’ll need a machine with at least 2 GB of RAM, and around 50GB of hard disk space to follow this guide. If you have 2GB of RAM you’ll spend a fair amount of time swapping and may experience occasional cluster node panic, so more is recommended if you can get it.

Throughout this guide the prompt shown before each command indicates the server it should be typed on.

  • opensol-host# – The OpenSolaris host
  • cxnode1# – The first Cluster Express node
  • cxnode2# – The second Cluster Express node
  • both-nodes# – Repeat on both the first and second cluster nodes
  • apache-zone# – Within the Apache zone
  • mysql-zone# – Within the MySQL zone

The following IP addresses were used, modify these to match your local network environment

  • – The IPMP failover IP for cxnode1
  • – The IPMP failover IP for cxnode2
  • – The IP for the Apache zone
  • – The IP for the MySQL zone
  • – The first IPMP test address on cxnode1
  • – The first IPMP test address on cxnode2
  • – The second IPMP test address on cxnode1
  • – The second IPMP test address on cxnode2

Sun provide extensive documentation for Sun Cluster 3.2, which for these purposes should match Cluster Express closely enough. You can find a good starting point at this documentation centre page.

Installing and configuring VirtualBox

To begin download VirtualBox from the links above, the steps below reference the 64 bit version but they should be the same for 32 bit users. Download the package and unzip/tar it, inside you’ll find two packages, they both need installing, ie:

opensol-host# pkgadd -d VirtualBoxKern-1.6.2-SunOS-r31466.pkg
opensol-host# pkgadd -d VirtualBox-1.6.2-SunOS-amd64-r31466.pkg

Before the virtual machines can be created some network configuration on the host is required. To enable the front interfaces of the cluster nodes to connect to the local network some virtual interfaces on the host are required. These will have the desired effect of bridging the local VLAN with the virtual machines.

The instructions in the VirtualBox documentation say to use /opt/VirtualBox/ to create these interfaces, however I had problems getting that to work. Not only does it need some tweaks to work with OpenSolaris 2008.05, I found I couldn’t get the interfaces going even after they were created. Fortunately I came across this blog post, which pointed me in the direction of the steps below.

Two public interfaces are required for each cluster node. These can then be configured with IPMP, just as you would in a real cluster; whilst this doesn’t really provide more resiliency it’s a worthwhile exercise. To create these public interfaces a total of four virtual interfaces are required on the host. Each one needs a defined MAC address that you can choose yourself, or you can use the suggested ones here. To create the interfaces do this, replacing e1000g0 with the name of the physical interface on your host.

opensol-host# /usr/lib/vna e1000g0 c0:ff:ee:0:1:0
opensol-host# /usr/lib/vna e1000g0 c0:ff:ee:0:1:1
opensol-host# /usr/lib/vna e1000g0 c0:ff:ee:0:2:0
opensol-host# /usr/lib/vna e1000g0 c0:ff:ee:0:2:1

I’ve chosen these MAC addresses because c0:ff:ee is easy to remember, and the last two octets represent the cluster node number and then the interface number in that node.

Now plumb these interfaces, but don’t give them IP addresses.

opensol-host# ifconfig vnic0 plumb
opensol-host# ifconfig vnic1 plumb
opensol-host# ifconfig vnic2 plumb
opensol-host# ifconfig vnic3 plumb

These four ‘vnic’ interfaces will form the public interfaces of our cluster nodes. However these will not persist over a reboot so you may want to create a start up script to create these. There’s a sample script in the linked blog post above.
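
If you’d rather write your own start-up script, something along these lines might do. This is only a rough sketch and is not the script from the linked post. It assumes the physical interface is e1000g0, that the VNICs come up as vnic0 to vnic3 in the order they are created (as they did above), and it reuses the c0:ff:ee MAC scheme. The PHYS_IF and DRY_RUN variables and the function names are made up for this example. By default it only prints the commands so you can check them first.

```shell
#!/bin/sh
# Sketch of a start-up script to recreate the four VNICs after a reboot.
# Assumption: the VNICs are named vnic0..vnic3 in creation order.

PHYS_IF=${PHYS_IF:-e1000g0}   # physical interface on the host (assumed name)
DRY_RUN=${DRY_RUN-1}          # non-empty: print commands; empty: run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_vnics() {
    # MAC scheme from this guide: c0:ff:ee:0:<node>:<interface>
    for node in 1 2; do
        for intf in 0 1; do
            run /usr/lib/vna "$PHYS_IF" "c0:ff:ee:0:$node:$intf" || return 1
        done
    done
    # Plumb the new interfaces without giving them IP addresses.
    for vnic in vnic0 vnic1 vnic2 vnic3; do
        run ifconfig "$vnic" plumb || return 1
    done
}

create_vnics
```

Run it as is to preview the eight commands, then run it as root with DRY_RUN set to an empty value (DRY_RUN= ./yourscript) to actually create the interfaces.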

It’s now time to create the virtual machines. As your normal user run /opt/VirtualBox/VirtualBox and the main VirtualBox window should appear.

[Screenshot: the Sun xVM VirtualBox main window]

To start creating the virtual machine that will serve as the first node of the cluster, click ‘New’ to fire up the new server wizard. Enter ‘cxnode1’ for the name of the first node (or choose something more imaginative), pick ‘Solaris’ as the OS. Page through the rest of the wizard choosing 1GB for the memory allocation (this can be reduced later, at least 700MB is recommended) and create the default 16GB disk image.

The network interfaces need to be configured to support the cluster. Click into the ‘Network’ settings area and change the ‘Adapter 0’ type to ‘Intel PRO/1000 MT Desktop’ and change the ‘Attached to’ to ‘Host Interface’. Then enter the first MAC address you configured for the VNICs, if you followed the example above this will be c0ffee000100, then enter an interface name of ‘vnic0’. This Intel PRO/1000 MT Desktop adapter will appear as e1000g0 within the virtual machine. Generally I’ve grown to like these adapters, not least because they use a GLDv3 driver.
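
Note that VirtualBox wants the MAC as twelve hex digits with no separators, whereas the vna commands used the short colon-separated form. If you picked your own addresses, a small helper like this can do the conversion. It’s just a sketch; the function name vbox_mac is my own invention and it assumes the input is six colon-separated hex octets.

```shell
# Sketch: convert a colon-separated MAC (c0:ff:ee:0:1:0) into the
# 12-digit form VirtualBox expects (c0ffee000100).
# The body runs in a subshell so the IFS change does not leak out.
vbox_mac() (
    IFS=:
    set -- $1                       # split the address into its octets
    for octet in "$@"; do
        printf '%02x' "0x$octet"    # zero-pad each octet to two hex digits
    done
    echo
)
```

For example, vbox_mac c0:ff:ee:0:2:1 gives the value to enter for the second adapter of the second node.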

Now for ‘Adapter 1’ enable the adapter and change the type as before, and set ‘Attached to’ to ‘Host Interface’, the MAC to c0ffee000101 (or as appropriate) and the interface name to ‘vnic1’. Then enable ‘Adapter 2’ and set ‘Attached to’ to Internal Network and set ‘Network Name’ to ‘Interconnect 1’. Repeat for ‘Adapter 3’ but set the ‘Network Name’ to ‘Interconnect 2’.

Next point the CD/DVD-ROM to the ISO image you downloaded for Solaris Express Build 86. You need to add the ISO with the ‘Virtual Disk Manager’ to make it available to the machine. You can use the ‘/net’ NFS automounter to point to an NFS share where this image resides if required.

Finally change the boot order, in General / Advanced, so that ‘Hard Disk’ comes before ‘CD/DVD’. This means that it will initially boot the install media, but once installed will boot from the installed drive.

Repeat the above steps to create a second cluster node. Ensure that ‘Adapter 2’ and ‘Adapter 3’ are connected to the same networks as for the first cluster node. Adapters 0 and 1 should be connected to the 3rd and 4th VNICs created previously (vnic2 and vnic3, with MACs c0ffee000200 and c0ffee000201 if you followed the example above).

Installing Solaris

Solaris now needs to be installed to both the cluster nodes. Repeat the following steps for each node. To boot a virtual machine click ‘Start’ and the machine should boot and display the Grub menu. DON’T pick the default of ‘Solaris Express Developer Edition’ but rather choose ‘Solaris Express’. If you choose the Developer Edition option you’ll get the SXDE installer which does not offer the flexibility required around partition layout.

Pick one of the ‘Solaris Interactive’ install options as per your personal preference. If you’ve ever installed one of the main line Solaris releases then you’ll be at home here. Suggested settings for the system identification phase:

  • Networked
  • Configure e1000g0 (leave the others for the time being)
    • No DHCP
    • Hostname: cxnode1 (or choose something yourself)
    • IP: (or something else as appropriate)
    • Netmask (or as appropriate)
    • No IPV6
    • Default Route – Specify
    • (the IP of your default router)
    • Don’t enable Kerberos
    • ‘None’ for naming service
    • Default derived domain for NFSv4 domain
  • Specify time zone as required
  • Pick a root password

Then in the installation phase pick the following options

  • Reboot automatically: no
  • Eject additional CDs: yes (not that we’ll be using any)
  • Media: CD/DVD
  • Install Type: Custom
  • Add additional locales as required, we’re using C as the default.
  • Web Start Scan location: None
  • Software group: Entire Group Plus OEM / Default Packages
  • Leave the fdisk partition as the default, one single Solaris partition covering the whole disk.
  • To support the cluster a specific filesystem layout is required. Click ‘Modify’ then set as this: (If space is needed for a live upgrade in the future then an additional virtual disk can be attached)
    • Slice 0: ‘/’ – 13735 MB
    • Slice 1: ‘swap’ – 2048 MB
    • Slice 6 ‘/globaldevices’ – 512 MB
    • Slice 7 ‘(leave blank)’ – 32MB
  • Now just confirm and start the install

The installer should now run through in due course, once complete reboot the machine and check it boots up fine for the first time.

By default Solaris will boot up into a GNOME desktop. If you want to disable the graphic login prompt from launching then do ‘svcadm disable cde-login’

Before the cluster framework is installed the VirtualBox ‘Guest Additions’ need to be installed. These serve a similar role to VMware Tools in that they provide better integration with the host environment. Specifically the ‘Time synchronization’ facilities are required to assist with keeping the cluster nodes in sync.

If you still have the SXCE DVD image mounted then ‘eject’ this in the guest and ‘Unmount’ it from the ‘Devices’ menu. Then choose ‘Install Guest Additions…’ from the VirtualBox menu. The Guest Additions ISO should mount, then su to root and pkgadd the VBoxSolarisAdditions.pkg from the CD. If you’re running X you should log out and back in to activate the X11 features.

Repeat the above steps for both cluster nodes.

It’s worth considering taking a snapshot at this point so if you run into problems later you can just snap it back to this post install state.

Preparing for cluster install

Once the nodes are installed there are a few steps required to configure them before the cluster framework can be installed. Firstly the public network interfaces on the nodes need to be configured. To do this use the below /etc/hostname files modifying where appropriate to your local network.

  • cxnode1:/etc/hostname.e1000g0
    • netmask broadcast deprecated -failover group public up
      addif up
  • cxnode1:/etc/hostname.e1000g1
    • netmask broadcast deprecated -failover group public up
  • cxnode2:/etc/hostname.e1000g0
    • netmask broadcast deprecated -failover group public up
      addif up
  • cxnode2:/etc/hostname.e1000g1
    • netmask broadcast deprecated -failover group public up

Also check that /etc/defaultrouter is correct.

RPC communication with external hosts must be enabled for the cluster framework to function. To do this run the following on both cluster nodes:

both-nodes# svccfg
svc:> select network/rpc/bind
svc:/network/rpc/bind> setprop config/local_only=false
svc:/network/rpc/bind> quit
both-nodes# svcadm refresh network/rpc/bind:default
both-nodes# svcprop network/rpc/bind:default | grep local_only

The last command should return ‘false’.

Modify your path to include


Also check your ‘umask’ is set to 0022 and change it if not.

Finally we need to ensure the cluster nodes exist in /etc/inet/hosts on both hosts. For example cxnode1 cxnode2

After making the above changes bounce the nodes to check it all persists across a reboot.

Once the above has been repeated on both cluster nodes it is time to install the cluster framework.

Installing Cluster Express

Download and extract Solaris Cluster Express 6/08, inside the package cd into 'Solaris_x86' then run './installer'. If you are connected over ssh or on the console then run './installer -nodisplay' instead. The steps listed here are for the GUI installer, but the text one is much the same.

Wait for the GUI to launch, click through and accept the license. Then from the list of available services choose ‘Solaris (TM) Cluster Express 6/08’ and ‘Solaris (TM) Cluster Express Agents 6/08’ (enter ‘4,6’ if you’re in the text installer). Leave the other options disabled, and clear ‘Install multilingual packages’ unless you want them. Click ‘Next’ to start the installer. You’ll be informed that some packages are being upgraded from their existing versions (namely SUNWant, SUNWjaf and SUNWjmail). The cluster will now perform some pre installation checks and they should all pass ‘OK’, then proceed with the install. Choose ‘Configure Later’ when prompted, as that will be done once all the installation steps are finished. Repeat the install process on the second node.

Now the cluster is installed it can be configured and established. This is started by running /usr/cluster/bin/scinstall. I prefer to do this on the second node of the cluster (cxnode2), as the installer configures the partner server first and it will be assigned node id 1, the server running the install will be assigned node id 2. It doesn’t really matter I just prefer it to follow the ordering of the hostnames.

Once the installer is running choose option 1 (Create a new cluster or add a cluster node), then option 1 (Create a new cluster), answer ‘yes’ to continue, choose ‘2’ for custom configuration. Pick a cluster name eg ‘DEV1’. Then when prompted for the other nodes in the cluster enter ‘cxnode1’ (the partner of the node you are running the installer on) then ^D to complete. Confirm the list of nodes is correct, communication with the other node will now be tested and should complete fine. Answer ‘no’ for DES authentication.

Now you’ll be asked about the network configuration for the interconnect. The default network range is but you can change this if required. Answer ‘yes’ to use at least two private networks, then ‘yes’ to the question about switches. Although there are no switches to configure we’re considering each private network configured in VirtualBox to be an unmanaged switch. Accept the default names for ‘switch1’ and ‘switch2’. You’ll now be asked to configure the transport adapters (effectively the network interfaces). Pick ‘1’ (e1000g2) for the first cluster transport adapter, answer ‘yes’ to indicate it is a dedicated cluster transport adapter. Answer ‘switch1’ when asked about which switch it is connected to and accept the default name. Then pick ‘2’ (e1000g3) for the second adapter and again pick the default options. Answer ‘yes’ for auto discovery.

You’ll now be asked about Quorum configuration, this will be addressed later so choose ‘yes’ to disable automatic selection. The next question is about the global devices file system. The default is /globaldevices, accept it as this matches the layout created when initially installing Solaris. Accept this for cxnode2 as well.

You’ll be asked for final confirmation that it is ok to proceed so just answer ‘yes’ when you’re ready. You’ll now be asked if you want the creation to be interrupted if sccheck fails, go for ‘yes’ rather than the default of ‘no’. The cluster establishment will now start. You should see something similar to this when it discovers the cluster transports:

The following connections were discovered:
cxnode2:e1000g2  switch1  cxnode1:e1000g2
cxnode2:e1000g3  switch2  cxnode1:e1000g3

cxnode1 will reboot and establish the cluster with itself as the only node. Don’t worry about any errors about /etc/cluster/ccr/did_instances, the DID database hasn’t been created yet. The installer will then configure cxnode2 and reboot that. When it boots back up it will join the new cluster.

We now have an established cluster! However it’s in ‘installmode’ until a quorum device is configured.

Configuring a Quorum Server

To finish the base cluster configuration a quorum device must be assigned. Initially I was planning to do this by presenting an iSCSI LUN from the OpenSolaris host into the guests, then using that for the quorum device. However I found that, although it could be added fine, it would show as ‘offline’ and not function properly.

As such a quorum server running on the OpenSolaris host will be used. Full documentation for this can be found here, but this overview should be enough to get you going. Fortunately it can be installed on its own, without requiring a full cluster install. To install it make the Cluster Express package available on the host and run the installer again, you’ll need to use the ‘-nodisplay’ option as some packages won’t be available for the graphical installer.

When running the installer choose ‘No’ when asked if you want to install the full set of components. Then choose option 2 for ‘Quorum Server’ and install that. Choose ‘Configure Later’ when asked.

The default install creates a configuration for one quorum server, running on port 9000. You can see the configuration in /etc/scqsd/scqsd.conf. Start the quorum server by running

opensol-host# /usr/cluster/bin/clquorumserver start 9000

The quorum server can now be configured into the cluster to get it fully operational.

On one of the cluster nodes run clsetup. Answer ‘yes’ to confirm you have finished the initial cluster setup and ‘yes’ to add a quorum device. Now pick option 3 (Quorum Server). Answer ‘yes’ to continue then give a name for the device, such as ‘opensolhost’. Enter the IP of your host when prompted and 9000 as the port number. Allow it to proceed then choose ‘yes’ to reset installmode. The cluster is now properly established.

You can check the quorum registration on the OpenSolaris host by doing /usr/cluster/bin/clquorumserver show, you should see something like this:

opensol-host# /usr/cluster/bin/clquorumserver show
---  Cluster DEV1 (id 0x484E9DB9) Registrations ---
Node ID:                      1
Registration key:           0x484e9db900000001
Node ID:                      2
Registration key:           0x484e9db900000002

Provisioning storage

To enable the creation of the Apache and MySQL zones it’s necessary to present some storage to the cluster that can be shared between them.

As OpenSolaris is running on the host, take advantage of the built-in iSCSI support in ZFS. First create some ZFS volumes, one for each clustered service, eg:

opensol-host# zfs create -V 8g rpool/apache
opensol-host# zfs create -V 8g rpool/mysql

Now enable iSCSI export on them.

opensol-host# zfs set shareiscsi=on rpool/apache
opensol-host# zfs set shareiscsi=on rpool/mysql

And confirm this with iscsitadm list target -v

opensol-host# iscsitadm list target -v
Target: rpool/apache
iSCSI Name:
Alias: rpool/apache
Size: 8.0G
Backing store: /dev/zvol/rdsk/rpool/apache
Status: online
Target: rpool/mysql
iSCSI Name:
Alias: rpool/mysql
Size: 8.0G
Backing store: /dev/zvol/rdsk/rpool/mysql
Status: online

Now configure the nodes to see the presented storage. Do this on both of the nodes, replacing with the IP of your host.

both-nodes# iscsiadm modify discovery --sendtargets enable
both-nodes# iscsiadm add discovery-address
both-nodes# svcadm enable network/iscsi_initiator

And then to confirm:

both-nodes# iscsiadm list target -S
Alias: rpool/mysql
OS Device Name: /dev/rdsk/c2t01000017F202642400002A00484FCCC1d0s2
Alias: rpool/apache
OS Device Name: /dev/rdsk/c2t01000017F202642400002A00484FCCBFd0s2

Make a note of the OS Device Name to Alias matching as you need to put the right LUN into the correct resource group.
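
If you’d like the pairing printed on one line per LUN, a small awk filter can do it. This is only a sketch that assumes the output contains ‘Alias:’ and ‘OS Device Name:’ lines as shown above; the function name alias_to_device is made up for this example. It reads from standard input, so it can also be tried against saved output.

```shell
# Sketch: print "alias device" pairs from `iscsiadm list target -S` output.
# Usage on a node: iscsiadm list target -S | alias_to_device
alias_to_device() {
    awk '
        /Alias:/          { alias = $NF }        # remember the latest alias
        /OS Device Name:/ { print alias, $NF }   # pair it with its device
    '
}
```

With the output above this would print rpool/mysql next to the device ending in CCC1d0s2 and rpool/apache next to the one ending in CCBFd0s2.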

You can also confirm the storage is available by running format, eg:

both-nodes# format
Searching for disks...done
0. c0d0
1. c2t01000017F202642400002A00484FCCBFd0
2. c2t01000017F202642400002A00484FCCC1d0

To make this storage available to the cluster you must populate the DID device database. This is performed via cldevice and only needs to be run on one of the nodes:

cxnode1# cldevice populate
Configuring DID devices
cldevice: (C507896) Inquiry on device "/dev/rdsk/c0d0s2" failed.
did instance 5 created.
did subpath cxnode1:/dev/rdsk/c2t01000017F202642400002A00484FCCC1d0 created for instance 5.
did instance 6 created.
did subpath cxnode1:/dev/rdsk/c2t01000017F202642400002A00484FCCBFd0 created for instance 6.
Configuring the /dev/global directory (global devices)
obtaining access to all attached disks

Then list the device database to check the storage is available

cxnode1# cldevice list -v
DID Device          Full Device Path
----------          ----------------
d1                  cxnode1:/dev/rdsk/c1t0d0
d2                  cxnode1:/dev/rdsk/c0d0
d3                  cxnode2:/dev/rdsk/c1t0d0
d4                  cxnode2:/dev/rdsk/c0d0
d5                  cxnode1:/dev/rdsk/c2t01000017F202642400002A00484FCCC1d0
d5                  cxnode2:/dev/rdsk/c2t01000017F202642400002A00484FCCC1d0
d6                  cxnode1:/dev/rdsk/c2t01000017F202642400002A00484FCCBFd0
d6                  cxnode2:/dev/rdsk/c2t01000017F202642400002A00484FCCBFd0

Now the basic configuration of the cluster is fairly complete. It’s a good opportunity to shut it down and take a snapshot. When shutting down the entire cluster you must use cluster shutdown rather than just shutting down the individual nodes. If you don’t then you must bring up the nodes in the reverse order of shutting them down. For an immediate shutdown do cluster shutdown -y -g 0.

Configuring the Storage

Originally I was planning to create some SVM metasets and add the disks to those, however this doesn’t appear to be possible with iSCSI LUNs yet, not even if you use the VirtualBox built in iSCSI initiator support, which results in the storage appearing as local disks. So instead I settled on using ZFS-HA to manage the disks. The process for this, as with most zfs stuff, is fairly straightforward.

First create a ZFS storage pool for each clustered service. We’ll use the DID device number here, so make sure you trace each device back via cldevice list -v and iscsiadm list target -S to ensure you put the correct target into the correct ZFS pool.
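
To take some of the guesswork out of that tracing, here is a rough helper that looks up the DID instance for a given disk. It’s a sketch only: it assumes the two-column cldevice list -v layout shown earlier, and did_for_disk is a name I’ve made up. Pass it the disk name as reported by iscsiadm, without the /dev/rdsk/ prefix and without the trailing slice (s2).

```shell
# Sketch: find the DID instance for a disk in `cldevice list -v` output.
# Usage on a node: cldevice list -v | did_for_disk c2t...d0
did_for_disk() {
    # Reduce each row to "did path", keep rows whose path ends in the
    # given disk name, then print each matching DID once (the same DID
    # is normally listed once per node).
    awk '{ print $1, $2 }' | grep "/$1\$" | awk '{ print $1 }' | sort -u
}
```

In the listing above the disk ending in CCC1d0 (the rpool/mysql target) comes back as d5, which is why d5 ends up in the MySQL pool.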

In order for the storage to be added to ZFS it needs to have an ‘fdisk’ partition added to it. On one of the nodes run fdisk against the two devices, accepting the default partition layout:

cxnode1# fdisk /dev/rdsk/c2t01000017F202642400002A00484B5467d0p0
No fdisk table exists. The default partition for the disk is:
a 100% “SOLARIS System” partition
Type “y” to accept the default partition,  otherwise type “n” to edit the
partition table.

The standard SMI label needs to be replaced with an EFI one. When I’ve been working with ZFS previously this has always happened automatically but it didn’t work for me this time, possibly because we are going to add the DID device to the zpool rather than a traditional disk device. To change it manually run format with the -e option.

cxnode1# format -e
Searching for disks...done

0. c0d0
1. c2t01000017F202642400002A00484FCCBFd0
2. c2t01000017F202642400002A00484FCCC1d0
Specify disk (enter its number): 2
selecting c2t01000017F202642400002A00484FCCC1d0
[disk formatted]
Error occurred with device in use checking: No such device
format> label
Error occurred with device in use checking: No such device
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 1
Warning: This disk has an SMI label. Changing to EFI label will erase all
current partitions.
Continue? y
format> p
partition> p
Current partition table (default):
Total disk sectors available: 16760798 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size        Last Sector
0        usr    wm                34       7.99GB         16760798
1 unassigned    wm                 0          0              0
2 unassigned    wm                 0          0              0
3 unassigned    wm                 0          0              0
4 unassigned    wm                 0          0              0
5 unassigned    wm                 0          0              0
6 unassigned    wm                 0          0              0
7 unassigned    wm                 0          0              0
8   reserved    wm          16760799       8.00MB         16777182

If you’d like to suppress the “Error occurred with device in use checking: No such device” warning then set ‘NOINUSE_CHECK’ as an environment variable. It seems to be this bug that’s causing the warning.

Now the disks can be added to a ZFS pool. Make sure you add the correct DID device here:

cxnode1# zpool create apache-pool /dev/did/dsk/d6s0
cxnode1# zpool create mysql-pool /dev/did/dsk/d5s0

Then confirm these are available as expected:

cxnode1# zpool list
apache-pool  7.94G   111K  7.94G     0%  ONLINE  -
mysql-pool   7.94G   111K  7.94G     0%  ONLINE  -

Now create some filesystems in the pools. For each service three file systems are required

  • zone – This will become the zone root
  • data – This will become /data within the zone and be used to store application data
  • params – This will be used to store a cluster parameter file

cxnode1# zfs create apache-pool/zone
cxnode1# zfs create apache-pool/data
cxnode1# zfs create apache-pool/params
cxnode1# zfs create mysql-pool/zone
cxnode1# zfs create mysql-pool/data
cxnode1# zfs create mysql-pool/params

To make the storage usable with the cluster it needs to be configured into a resource group. Create two resource groups, one for each service

cxnode1# clresourcegroup create apache-rg
cxnode1# clresourcegroup create mysql-rg

SUNW.HAStoragePlus is the resource type that can manage ZFS storage, along with SVM and VxVM. It needs to be registered with the cluster with clresourcetype register, this only needs to be performed on one node.

cxnode1# clresourcetype register SUNW.HAStoragePlus

Create clustered resources to manage the ZFS pools:

cxnode1# clresource create -g apache-rg -t SUNW.HAStoragePlus -p Zpools=apache-pool apache-stor
cxnode1# clresource create -g mysql-rg -t SUNW.HAStoragePlus -p Zpools=mysql-pool mysql-stor

The resource groups will currently be in ‘Unmanaged’ state, you can confirm this with clresourcegroup status. To bring them under cluster management bring them online on the first node:

cxnode1# clresourcegroup online -M -n cxnode1 apache-rg

Then check the file systems are available on the first node but not on the second.

cxnode1# zfs list
apache-pool        158K  7.81G    21K  //apache-pool
apache-pool/data    18K  7.81G    18K  //apache-pool/data
apache-pool/params 18K  7.81G    18K  //apache-pool/params
apache-pool/zone    18K  7.81G    18K  //apache-pool/zone
cxnode2# zfs list
no datasets available

Now move the pool to the other node and check that it becomes available on that node

cxnode1# clresourcegroup switch -n cxnode2 apache-rg
cxnode1# zfs list
no datasets available
cxnode2#  zfs list
apache-pool        158K  7.81G    21K  //apache-pool
apache-pool/data    18K  7.81G    18K  //apache-pool/data
apache-pool/params 18K  7.81G    18K  //apache-pool/params
apache-pool/zone    18K  7.81G    18K  //apache-pool/zone

When you’re happy switch it back to the first node:

cxnode1# clresourcegroup switch -n cxnode1 apache-rg

Then repeat the above online and failover tests for mysql-rg and mysql-pool

As the ZFS pools have now been added as cluster managed resources you must now use the cluster to manage them. Don’t perform export/import operations manually.

Preparing Apache and MySQL zones for clustering

To provide a clustered Apache and MySQL service two zones are going to be created. Whilst these services could equally well be clustered without the use of zones I decided to go down this path so that the benefits of zones could be enjoyed in tandem with the clustering.

A full example run through for configuring MySQL in a clustered zone is provided by Sun in the official documentation if you require further information.

It’s worth pointing out that if you follow this plan it will create zones on ZFS devices. This is not currently supported by ‘Live Upgrade’, so you are restricting your future upgrade paths. If you do decide you need to ‘Live Upgrade’ the cluster at some point, you could remove the zones, do the upgrade, and then re-create the zones. If you don’t want to do this then consider using raw disk slices with the cluster rather than ZFS.

For each zone the pool/zone file system will be used for the zone root, then the pool/data file system will be delegated to the zone’s control, where the application will store its data, i.e. the MySQL databases or the Apache document root.

To provide IP addressing for the zones SUNW.LogicalHostname resources will be used, rather than directly configuring the zone with an IP address.

First entries need adding to /etc/hosts on both nodes, eg: apache-zone mysql-zone

Then create LogicalHostname resources for each address:

cxnode1# clreslogicalhostname create -g apache-rg -h apache-zone apache-addr
cxnode1# clreslogicalhostname create -g mysql-rg -h mysql-zone mysql-addr

Now it’s time to create the zones. This is a really simple zone configuration, you could add resource controls or other features as desired. autoboot must be set to false as the cluster will be managing the starting and stopping of the zone.

cxnode1# zonecfg -z apache
apache: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:apache> create
zonecfg:apache> set zonepath=/apache-pool/zone
zonecfg:apache> set autoboot=false
zonecfg:apache> add dataset
zonecfg:apache:dataset> set name=apache-pool/data
zonecfg:apache:dataset> end
zonecfg:apache> verify
zonecfg:apache> commit
zonecfg:apache> exit
cxnode1# zonecfg -z mysql
mysql: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:mysql> create
zonecfg:mysql> set zonepath=/mysql-pool/zone
zonecfg:mysql> set autoboot=false
zonecfg:mysql> add dataset
zonecfg:mysql:dataset> set name=mysql-pool/data
zonecfg:mysql:dataset> end
zonecfg:mysql> verify
zonecfg:mysql> commit
zonecfg:mysql> exit

To enable the zones to be installed we must change the permissions on the zone roots:

cxnode1# chmod 700 /apache-pool/zone
cxnode1# chmod 700 /mysql-pool/zone

Now install the zones:

cxnode1# zoneadm -z apache install
Preparing to install zone <apache>.
Creating list of files to copy from the global zone.
Copying <9668> files to the zone.
Initializing zone product registry. Determining zone package initialization order.
Preparing to initialize <1346> packages on the zone.
Initialized <1346> packages on zone.
Zone <apache> is initialized.
Installation of these packages generated warnings:
The file </apache-pool/zone/root/var/sadm/system/logs/install_log> contains a log of the zone installation.
# zoneadm -z mysql install

The documentation has this to say about configuring zones:

Caution: If the zone is to run in a failover configuration, each node being able to host that zone must have the exact same zone configuration for that zone. After installing the zone on the first node, the zone’s zone path already exists on the zones’s disk storage. Therefore it must get removed on the next node prior to successfully create and install the zone.[..] Only the zone’s zone path created on the last node will be kept as the final zone path for the failover zone. For that reason any configuration and customization within the failover zone should get performed after the failover zone is known to all nodes that should be able to host it.

To achieve this, the newly created zone must be destroyed and recreated on the second node. To my mind this is a really ugly way of doing it; the cluster should be able to manage this itself and make suitable configuration changes on a node when the zone is configured into the cluster.

In later releases of Sun Cluster 3.1 the recommended way to manage this configuration was to hack the /etc/zones files to replicate the configuration of the zones from one node to another. However, this method is no longer supported, so the official instructions will be followed.

To do this, migrate the storage for the zones to the other node:

cxnode1# clresourcegroup switch -n cxnode2 apache-rg
cxnode1# clresourcegroup switch -n cxnode2 mysql-rg

Then delete the previously installed zone roots on the second node

cxnode2# rm -rf /apache-pool/zone/*
cxnode2# rm -rf /mysql-pool/zone/*

Now repeat the zonecfg and zoneadm steps above to recreate both of the zones.

When the zone installs have completed again move the storage back to the first node.

cxnode1# clresourcegroup switch -n cxnode1 apache-rg
cxnode1# clresourcegroup switch -n cxnode1 mysql-rg
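The recreate sequence is easy to get out of order, so here is a small dry-run sketch that only prints the steps, with the node each one runs on as a prompt. It does not execute anything. recreate_plan is a made-up helper, and it assumes the ZONE-rg and ZONE-pool naming used in this walkthrough.

```shell
#!/bin/sh
# recreate_plan TARGET ORIGIN ZONE...: print (without running anything) the
# steps for recreating each ZONE on node TARGET and handing it back to ORIGIN.
recreate_plan() {
    target=$1
    origin=$2
    shift 2
    for zone in "$@"; do
        echo "${origin}# clresourcegroup switch -n $target $zone-rg"
        echo "${target}# rm -rf /$zone-pool/zone/*"
        echo "${target}# zonecfg -z $zone"
        echo "${target}# zoneadm -z $zone install"
        echo "${origin}# clresourcegroup switch -n $origin $zone-rg"
    done
}

recreate_plan cxnode2 cxnode1 apache mysql
```

The plan works through one zone at a time rather than switching both groups first, which comes to the same thing.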

The zones can now be booted and configured. Repeat these steps for both the Apache and the MySQL zone. Boot the zone:

cxnode1# zoneadm -z apache boot

I got this error when booting the zones: ‘Unable to set route for interface lo0 to *??9?x’. I’m not sure what this means, but it doesn’t seem to impact anything.

Login to the zone’s console to configure it:

cxnode1# zlogin -C apache

You’ll be asked a few questions to configure the zone. Choose language, terminal type and time zone information as appropriate. Enter the same hostname as you used above, e.g. ‘apache-zone’ or ‘mysql-zone’. I received some alerts about avahi-bridge-dsd failing to start when booting. As far as I can tell it’s some sort of Bonjour networking service; we don’t need it here, so it’s OK to disable. You can also disable some other services that are not required, to free up some resources:

apache-zone# svcadm disable cde-login
apache-zone# svcadm disable sendmail
apache-zone# svcadm disable webconsole
apache-zone# svcadm disable avahi-bridge-dsd
apache-zone# svcadm disable ppd-cache-update

Now mount the zfs file systems that have been delegated to the zone at an appropriate place. To do this on the Apache zone:

apache-zone# zfs set mountpoint=/data apache-pool/data

And on the MySQL zone:

mysql-zone# zfs set mountpoint=/data mysql-pool/data

Wait for the zone to finish booting and check you don’t have any failed services with svcs -xv. Then shut the zone down and repeat for the other zone.

Clustering the zones

Before proceeding further ensure the storage is available on the first node, fail it over if necessary. Also make sure the zones are shut down.

To enable clustering for the zones they must be registered with the cluster. To do this a script called sczbt_register is provided. To use this a configuration file must be completed and then registered. A sample configuration file is provided at /opt/SUNWsczone/sczbt/util/sczbt_config, this is also the file that will be read by default by sczbt_register. It is recommended to copy this file to some other place for future reference, then run sczbt_register against that. Comments are included in the file to explain the options, or see the official docs for more info.

Copy /opt/SUNWsczone/sczbt/util/sczbt_config to /etc/sczbt_config.apache and /etc/sczbt_config.mysql and edit as follows




The zones can now be registered. First you need to register the SUNW.gds resource type. On one node do:

cxnode1# clresourcetype register SUNW.gds

Then register the two zones

cxnode1# /opt/SUNWsczone/sczbt/util/sczbt_register -f /etc/sczbt_config.apache
sourcing /etc/sczbt_config.apache
Registration of resource apache-zone succeeded.
Validation of resource apache-zone succeeded.
cxnode1# /opt/SUNWsczone/sczbt/util/sczbt_register -f /etc/sczbt_config.mysql
sourcing /etc/sczbt_config.mysql
Registration of resource mysql-zone succeeded.
Validation of resource mysql-zone succeeded.

Then enable the Apache zone and log in to it. You should see the LogicalHostname resource has been assigned to the zone

cxnode1# clresource enable apache-zone
cxnode1# zoneadm list -cv
ID NAME             STATUS     PATH                           BRAND    IP
0 global           running    /                              native   shared
1 apache           running    /apache-pool/zone              native   shared
- mysql            installed  /mysql-pool/zone               native   shared
cxnode1# zlogin apache
[Connected to zone 'apache' pts/2]
Last login: Tue Jun 17 20:23:08 on pts/2
Sun Microsystems Inc.   SunOS 5.11      snv_86  January 2008
apache-zone# ifconfig -a
lo0:1: flags=2001000849 mtu 8232 index 1
inet netmask ff000000
e1000g1:1: flags=201040843 mtu 1500 index 3
inet netmask ffffff00 broadcast

Then test a failover and failback of the zone.

cxnode1# clresourcegroup switch -n cxnode2 apache-rg
cxnode1# clresourcegroup switch -n cxnode1 apache-rg
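After each switch it’s worth confirming on the target node that the zone really came up. A minimal sketch for pulling the STATUS column out of the zoneadm list -cv output follows; zone_status is a made-up helper, and the sample input is copied from the listing above.

```shell
#!/bin/sh
# zone_status NAME: read `zoneadm list -cv` output on stdin and print the
# STATUS column for zone NAME (prints nothing if the zone is not listed).
zone_status() {
    awk -v zone="$1" '$2 == zone { print $3 }'
}

# Sample input copied from the listing above.
sample='ID NAME             STATUS     PATH                           BRAND    IP
0 global           running    /                              native   shared
1 apache           running    /apache-pool/zone              native   shared
- mysql            installed  /mysql-pool/zone               native   shared'

printf '%s\n' "$sample" | zone_status apache   # prints: running
printf '%s\n' "$sample" | zone_status mysql    # prints: installed
```

On a node you would feed it live output instead, as in zoneadm list -cv | zone_status apache.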

Repeat the same checks for the MySQL zone.

Installing and configuring MySQL

MySQL 5.0.45 is installed by default with the Solaris Express install we have performed. This consists of the SUNWmysql5r, SUNWmysql5u and SUNWmysql5test packages.

The installation can be used pretty much unmodified; the only change needed is to repoint the MySQL data directory to the delegated zfs file system. This is done by modifying the SMF properties for the service; you cannot make this change by modifying my.cnf. Make the change with svccfg:

mysql-zone# svccfg
svc:> select svc:/application/database/mysql:version_50
svc:/application/database/mysql:version_50> setprop mysql/data = /data/mysql
svc:/application/database/mysql:version_50> refresh
svc:/application/database/mysql:version_50> end

Then create the directory for the databases and one for some logs

mysql-zone# mkdir /data/mysql
mysql-zone# mkdir /data/logs
mysql-zone# chown mysql:mysql /data/mysql
mysql-zone# chown mysql:mysql /data/logs

Now start up the database and check that it starts ok

mysql-zone# svcadm enable mysql:version_50
mysql-zone# svcs mysql:version_50
STATE          STIME    FMRI
online         10:38:54 svc:/application/database/mysql:version_50

Now set a password for the database root users; it’s set to ‘root’ in this case.

mysql-zone# /usr/mysql/5.0/bin/mysqladmin -u root password root
mysql-zone# /usr/mysql/5.0/bin/mysqladmin -u root -h localhost -p password root
Enter password:root

The /etc/hosts file in the zone needs to be modified so that ‘mysql-zone’ is the name for the clustered IP address for the zone rather than the localhost, also the address for the apache-zone needs to be added.       localhost       loghost   mysql-zone   apache-zone

Allow access to the root user to connect from the Apache zone.

mysql-zone# /usr/mysql/5.0/bin/mysql -p
Enter password: root
mysql> GRANT ALL ON *.* TO 'root'@'apache-zone' identified by 'root';

MySQL is now configured and ready to be clustered. We’ll be using a process loosely based on the one documented here. Alternatively it would be possible to use SMF to manage the service, you can see an example of that method in the Apache configuration later.

First a user for the fault monitor must be created along with a test database for it to use. A script is provided with the agent to do this for you. It will grant the fault monitor user ‘PROCESS, SELECT, RELOAD, SHUTDOWN, SUPER’ on all databases, then ALL privileges on the test database.

To create the required users you need to provide a config file. Copy the supplied template into /etc and edit it there

mysql-zone# cp /opt/SUNWscmys/util/mysql_config /etc
mysql-zone# vi /etc/mysql_config

Use these values. Note that MYSQL_DATADIR is the location of my.cnf, not the directory holding the databases. The meaning of DATADIR changed in MySQL 5.0.3 to be the location of the data rather than the config directory, but for this configuration it should point to the config directory.


Then run the registration script

mysql-zone# /opt/SUNWscmys/util/mysql_register -f /etc/mysql_config
sourcing /etc/mysql_config and create a working copy under /opt/SUNWscmys/util/
MySQL version 5 detected on 5.11/SC3.2
Check if the MySQL server is running and accepting connections
Add faulmonitor user (fmuser) with password (fmpass) with Process-,Select-, Reload- and Shutdown-privileges to user table for mysql database for host mysql-zone
Add SUPER privilege for fmuser@mysql-zone
Create test-database sc3_test_database
Grant all privileges to sc3_test_database for faultmonitor-user fmuser for host mysql-zone
Flush all privileges
Mysql configuration for HA is done

Now shut down the database so it can be brought online by the cluster.

mysql-zone# svcadm disable mysql:version_50

Drop back to the global zone and copy the MySQL agent configuration template to /etc

cxnode1# cp /opt/SUNWscmys/util/ha_mysql_config /etc/ha_mysql_config

Use these settings. This time ‘DATADIR’ should be set to point to the actual data location and not the location of the config. Descriptions of the configuration options are given in the file:


Now register this with the cluster:

cxnode1# /opt/SUNWscmys/util/ha_mysql_register -f /etc/ha_mysql_config
sourcing /etc/ha_mysql_config and create a working copy under /opt/SUNWscmys/util/
clean up the manifest / smf resource
sourcing /opt/SUNWscmys/util/ha_mysql_config
disabling the smf service svc:/application/sczone-agents:
removing the smf service svc:/application/sczone-agents:
removing the smf manifest /var/svc/manifest/application/sczone-agents/.xml
sourcing /tmp/
/var/svc/manifest/application/sczone-agents/mysql-server.xml successfully created
/var/svc/manifest/application/sczone-agents/mysql-server.xml successfully validated
/var/svc/manifest/application/sczone-agents/mysql-server.xml successfully imported
Manifest svc:/application/sczone-agents:mysql-server was created in zone mysql
Registering the zone smf resource
sourcing /opt/SUNWsczone/sczsmf/util/sczsmf_config
Registration of resource mysql-server succeeded.
Validation of resource mysql-server succeeded.
remove the working copy /opt/SUNWscmys/util/

Before bringing this online a tweak is needed to the supplied agent scripts. As mentioned briefly above the use of ‘DATADIR’ is a bit broken. If you try to bring MySQL online now it will fail as it won’t be able to find its configuration file. The agent scripts have this hard coded to ${MYSQL_DATADIR}/my.cnf which is no use for our purposes.

In the zone edit /opt/SUNWscmys/bin/functions and make this replacement, ensure you edit the copy in the MySQL zone and not the one in the global zone.




The mysql-server can now be enabled.

cxnode1# clresource enable mysql-server

Configuring Apache

Apache is going to be used to provide a web front end to the MySQL install, via the ubiquitous phpMyAdmin. The supplied Apache install (at /usr/apache2/2.2) is going to be used. As Apache will be running in a zone it can be used unmodified, keeping the configuration in /etc and not worrying about any potential conflicts with other Apache installs.

At present the supplied Apache resource type does not directly support running the resource in a zone, or at least I couldn’t figure it out. So instead some of the provided zone monitoring tools are going to be used to ensure Apache is up and running. This uses a combination of SMF and a shell script.

To begin Apache must be configured. Bring the zone online on one of the nodes and log in to it. The configuration file for Apache is at /etc/apache2/2.2/httpd.conf. Only a small tweak is required to move the document root onto the zfs file system we have prepared for it. You could, if desired, also move other parts of the configuration, such as the log location. For this example just change DocumentRoot to be /data/htdocs and update the Directory stanza a page or so below it. Then do a mkdir on /data/htdocs. That completes our very simple Apache configuration.

So start it up with ‘svcadm enable apache22’. Download phpMyAdmin from here. Solaris now ships with p7zip to manage 7z files, so you could download that version to save a bit of bandwidth if you like. You can extract it with p7zip -d [filename]. Once extracted, move the extracted directory to /data/htdocs/phpmyadmin.

Add mysql-zone to /etc/hosts eg    mysql-zone

Modify the phpMyAdmin configuration file (config.inc.php), setting these two values:

$cfg['Servers'][$i]['host'] = 'mysql-zone';
$cfg['blowfish_secret'] = 'enter a random value here';

To enable monitoring of the Apache instance we need a simple probe script. Make a directory /opt/probes in the zone and create a file called probe-apache.ksh with this content:

#!/usr/bin/ksh
if echo "GET; exit" | mconnect -p 80 > /dev/null 2>&1; then
    exit 0
else
    exit 100
fi

Then chmod 755 /opt/probes/probe-apache.ksh. All this does is a simple connect on port 80, it could be replaced with something more complex if needed. Finally disable Apache so that the cluster can start it:
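As an idea for the “something more complex”, here is a sketch of a check that looks at the HTTP status line rather than just whether the port accepts a connection. It only covers the parsing half: http_status_ok is a made-up function that reads a response on stdin, and how you fetch that response inside the zone is left open. It returns 100 on failure to match the exit code used by the probe above.

```shell
#!/bin/sh
# http_status_ok: read an HTTP response on stdin; return 0 when the status
# line carries a 2xx or 3xx code, and 100 otherwise (including empty input).
http_status_ok() {
    # Read only the first line; tolerate a missing trailing newline.
    IFS= read -r status_line || [ -n "$status_line" ] || return 100
    case $status_line in
        HTTP/[0-9].[0-9]\ [23][0-9][0-9]*) return 0 ;;
        *) return 100 ;;
    esac
}

# Example with a canned response line.
printf 'HTTP/1.1 200 OK\r\n' | http_status_ok && echo "probe ok"
```

A redirect counts as healthy here, which seems reasonable for a liveness check but is a choice you may want to tighten.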

apache-zone# svcadm disable apache22

Drop back to the global zone and copy /opt/SUNWsczone/sczsmf/util/sczsmf_config to /etc/sczsmf_config.apache and set the following settings


Now this can be registered with the cluster:

cxnode1# /opt/SUNWsczone/sczsmf/util/sczsmf_register -f /etc/sczsmf_config.apache
sourcing /etc/sczsmf_config.apache
Registration of resource apache-server succeeded.
Validation of resource apache-server succeeded.

Now enable Apache and check that it’s functioning correctly:

cxnode1# clresource enable apache-server

You can now browse to /phpmyadmin and check that everything is working!


When I started this work I wasn’t sure whether it was going to be possible or not, but despite a couple of bumps along the way I’m happy with the end result. Whilst it might not be a perfect match for a real cluster it certainly provides enough opportunity for testing and for use in training.



  1. […] There is an interesting walkthrough for testing Solaris Cluster with iSCSI and Virtualbox at Building a Solaris Cluster Express cluster in a VirtualBox on OpenSolaris. Whoever wants to play with a mature cluster framework should give this a try. Posted by Joerg […]

  2. […] Building a Solaris Cluster Express cluster in a VirtualBox on OpenSolaris (tags: zfs sun solaris cluster computer software) […]

  3. daniel.p says:

    great work !! thanks much.

    i have spent a lot of time playing with cluster in ESX vmware infrastructure server (on x86) to make it fully working .. break point was for me to configure shared device (because everything else had worked for me).. now, i am encouraged by this article. so, anyway my conditions changed, because our company has bought t5240 sparc server (and got as a gift from two additional fairly historic netra servers) .. but with no diskpool, so shared storage in cluster still pains me .. thanks much for this article .. tomorrow i’ll begin with *second service* with virtual box instead then ESX vmware ..

    Very Best Regards by daniel

  4. Gideon says:

    I’ve set up something similar, but I’ve found that only one of my two virtual nodes can access the iSCSI device. The other node sees it, but if I try to run, say, format on it, I get an error about a reservation conflict.

    Any ideas?

  5. Franky G says:

    This is Fantastic, Im sure people will agree that the shred storage has been a stickler for most,

    I will attempt this with the host OS on Linux Centos 5.1

    Once im done, ill post my results

    Franky G

  6. […] use the Christmas period to build a Solaris Cluster out of VirtualBox machines on scaleo (see here and here). But two unforeseen events struck […]

  7. […] takes a look at the interesting world of the Linux branded zone. I’ve posted about VirtualBox before and I hope to take a look at xVM Server (Xen) in a future post. Read on for my first steps with […]

  8. fii says:

    Thanks for this info . Tried it and it works perfect . I now have a sun cluster lab and i’m about to throw in oracle rac and stuff in there .

    Can I buy you a beer the next time I’m in Leeds ?

  9. Aero says:

    Hello, thanks for posting the procedure. Can you tell me what is this error:

    cldevice: (C507896) Inquiry on device “/dev/rdsk/c0d0s2″ failed.

    You got this output after cldevice populate.?

    does cldev status show FAIL for that disk ?

  10. Upendra says:

    svcadm enable iscsi/initiator

  11. Koko says:

    I found this page quite interesting. I want to try it. But I cannot find the Solaris Cluster Express 12/08, it says that the product is no longer available. Any idea where can i find another location?
    Thank You

  12. Chris says:

    Thanks so much for taking the time to do this! This is EXACTLY what I needed and couldn’t find anywhere!

  13. Baban says:

    Hi i am new to clustering. And i do find the ip addresses assigned to multipathing group, public interfaces & cluster interconnects to be quite confusing. Can any one clear which and how many ip addresses are assigned to ipmp, cluster interconnect?
