Sun Cluster Disaster Recovery two-node cluster with Root on ZFS. Node cloning.


In this howto, we are going to recover a two-node cluster using ZFS cloning techniques.

The scenario is:

  • Two-node cluster with Sun Cluster 3.2u2
  • Solaris 10 5/09
  • One node is dead (completely broken).
  • One node is alive, still running in the cluster.

1. Preparing the new node (the one replacing the dead node).

We must boot this node in single-user mode from a Solaris 10 DVD (release 10/08 or later) or from a JumpStart server:

ok boot net -s
[..]
Requesting System Maintenance Mode
SINGLE USER MODE
# format
Searching for disks...WARNING: /pci@1d,700000/SUNW,qlc@2,1/fp@0,0/ssd@w50020f2300004d48,1 (ssd2):
        offline or reservation conflict 
 
WARNING: /pci@1d,700000/SUNW,qlc@2,1/fp@0,0/ssd@w50020f2300004d48,0 (ssd0):
        offline or reservation conflict 
 
done 
 
AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@1d,700000/scsi@4/sd@0,0
       1. c0t1d0
          /pci@1d,700000/scsi@4/sd@1,0
       2. c1t50020F2300004D48d0
          /pci@1d,700000/SUNW,qlc@2,1/fp@0,0/ssd@w50020f2300004d48,0
       3. c1t50020F2300004D48d1
          /pci@1d,700000/SUNW,qlc@2,1/fp@0,0/ssd@w50020f2300004d48,1
Specify disk (enter its number):

We’ll use c0t0d0 for the new ZFS root file system.

We must ensure that there is no previous ZFS file system on the new disk. The zdb command is very useful for this:

# zdb -l /dev/rdsk/c0t0d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version=10
    name='rpool'
    state=0
    txg=602
    pool_guid=1997550466735601058
    hostid=2215638202
    hostname=''
    top_guid=8842605545621435740
    guid=8842605545621435740
    vdev_tree
        type='disk'
        id=0
        guid=8842605545621435740
        path='/dev/dsk/c0t0d0s0'
        devid='id1,sd@SFUJITSU_MAJ3364M_SUN36G_02M39160____/a'
        phys_path='/pci@1d,700000/scsi@4/sd@0,0:a'
        whole_disk=0
        metaslab_array=14
        metaslab_shift=28
        ashift=9
        asize=36412325888
        is_log=0
--------------------------------------------
LABEL 1
[...]

If the output looks like the one above, wipe the old ZFS labels by zeroing the first and last megabyte of the disk (2048 sectors of 512 bytes each). If zdb finds no label, you can skip to the partitioning step:

# dd if=/dev/zero seek=<total sectors - 2048> count=2048 of=/dev/rdsk/<disk>s2
# dd if=/dev/zero count=2048 of=/dev/rdsk/<disk>s2

In our example:

# prtvtoc /dev/rdsk/c0t0d0s2
[..]
# dd if=/dev/zero seek=71125132 count=2048 of=/dev/rdsk/c0t0d0s2
2048+0 records in
2048+0 records out
# dd if=/dev/zero count=2048 of=/dev/rdsk/c0t0d0s2
2048+0 records in
2048+0 records out 
 
# zdb -l /dev/rdsk/c0t0d0s0 
 
WARNING: /pci@1d,700000/scsi@4/sd@0,0 (sd0):
        Corrupt label; wrong magic number 
 
cannot open '/dev/rdsk/c0t0d0s0': I/O error
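
The seek value used above is the sector count of slice 2 minus 2048 (71127180 - 2048 = 71125132 for this disk). As a rough sketch, the two checks in this step could be wrapped in small helper functions. The function names are hypothetical and not part of any Solaris tool; expr is used so the arithmetic also works in the old Bourne shell:

```shell
#!/bin/sh
# Hypothetical helpers for the disk-cleaning step (names are illustrative).

# Succeeds if the text on stdin (the output of "zdb -l <slice>") still
# shows a ZFS label, detected here by the presence of a pool_guid line.
has_zfs_label() {
    grep '^[[:space:]]*pool_guid=' > /dev/null
}

# Prints the dd seek offset, in 512-byte sectors, that addresses the last
# 2048 sectors (1 MB) of a slice with the given total sector count.
last_mb_seek() {
    expr "$1" - 2048
}

# Example use on the new node (not executed here):
#   zdb -l /dev/rdsk/c0t0d0s0 2>&1 | has_zfs_label && echo "old label found"
#   dd if=/dev/zero seek=`last_mb_seek 71127180` count=2048 of=/dev/rdsk/c0t0d0s2
```

Running zdb -l again after the two dd commands, as shown above, remains the real confirmation that the labels are gone.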

Now the disk is clean. Next, partition the disk, assigning all the available space to slice 0. The disk must carry an SMI label, because a Solaris 10 ZFS root pool cannot boot from an EFI-labeled disk:

# format -e
[..]
 
partition> p
Current partition table (unnamed):
Total disk cylinders available: 24620 + 2 (reserved cylinders) 
 
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 24619       33.92GB    (24620/0/0) 71127180
  1 unassigned    wu       0                0         (0/0/0)            0
  2     backup    wu       0 - 24619       33.92GB    (24620/0/0) 71127180
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5 unassigned    wm       0                0         (0/0/0)            0
  6 unassigned    wm       0                0         (0/0/0)            0
  7 unassigned    wm       0                0         (0/0/0)            0 
 
partition> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 0
Ready to label disk, continue? y
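
Before creating the pool, it can be worth confirming that slice 0 really covers the whole disk. The sketch below assumes the usual prtvtoc layout, where lines starting with * are comments and the fifth column of each slice line is the sector count; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical check: reads "prtvtoc" output on stdin and succeeds when
# slice 0 has the same sector count as slice 2 (the whole-disk slice).
# Assumes the standard column order:
#   Partition Tag Flags FirstSector SectorCount LastSector [MountDir]
slice0_spans_disk() {
    awk '
        /^\*/ { next }                      # skip prtvtoc comment lines
        NF >= 5 && $1 == "0" { s0 = $5 }    # sector count of slice 0
        NF >= 5 && $1 == "2" { s2 = $5 }    # sector count of slice 2
        END { exit !(s0 != "" && s0 == s2) }
    '
}

# Example use on the new node (not executed here):
#   prtvtoc /dev/rdsk/c0t0d0s2 | slice0_spans_disk && echo "slice 0 uses the whole disk"
```

On Solaris 10 the default /usr/bin/awk is the old awk, so nawk or /usr/xpg4/bin/awk may be the safer choice for this script.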

2. Generating the clone from the surviving node.

We must execute the following commands from the node that is still part of the cluster:

1. Create a recursive snapshot:

# zfs snapshot -r rpool@clusterclon

2. Send the snapshot to a file:

# zfs send -R rpool@clusterclon > /zonas/clusterclon.zfs

You can destroy the recursive snapshot if you want:

# zfs destroy -r rpool@clusterclon

We must send the file to the new node somehow. In the example we’ll use an NFS server:

# share /zonas
# dfshares
RESOURCE                                  SERVER ACCESS    TRANSPORT
    node11:/zonas                         node11  -         -

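The following steps must be executed from the NEW NODE (a.k.a. the clone):

1. Create the root pool rpool (the mountpoint error shown below is harmless here, because the pool is re-imported under an alternate root in the next step):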

# zpool create -m /rpool rpool c0t0d0s0
cannot mount '/rpool': failed to create mountpoint
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
rpool                  33.8G     71K   33.7G     0%  ONLINE     -

2. Mount the new pool in an alternate root:

# mkdir /tmp/rpool
# zpool export rpool
# zpool import -R /tmp/rpool rpool

3. Mount the NFS server share and import the clone image:

# mount 10.164.17.170:/zonas /mnt
# zfs receive -F -d rpool < /mnt/clusterclon.zfs

You can destroy the imported snapshots if you want:

# zfs destroy -r rpool@clusterclon
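
Since zfs receive -F overwrites the target pool, a cheap precaution is to confirm that the stream file is actually visible and non-empty over NFS before receiving it. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical guard: succeeds only if the given stream file is readable
# and has a non-zero size. It does not validate the stream contents.
stream_ready() {
    [ -r "$1" ] && [ -s "$1" ]
}

# Example use on the new node (not executed here):
#   stream_ready /mnt/clusterclon.zfs && zfs receive -F -d rpool < /mnt/clusterclon.zfs
```

This only catches an empty or missing file, for example a failed NFS mount; a truncated stream would still be reported by zfs receive itself.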

4. Create the dump and swap devices:

# zfs create -V 1g rpool/dump
# zfs create -V 1500m rpool/swap

5. Modify the parameters for the new image (hostname, IP addresses, nodename, vfstab, and so on):

# zfs umount -a
# zpool export rpool
# zpool import rpool
# mkdir /tmp/raiz
# zfs set mountpoint=/tmp/raiz rpool/ROOT/zfsBE
# zfs mount rpool/ROOT/zfsBE
# vi /tmp/raiz/etc/hostname.bge0
# vi /tmp/raiz/etc/hosts
# vi /tmp/raiz/etc/nodename
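
If you prefer not to edit the three files by hand, the host name change can be sketched as a small function. This is only an illustration: the function name and the new host name node12 are hypothetical, and IP addresses still have to be corrected manually. The edited text is written back with cat so that a symlink such as /etc/hosts stays a symlink, and no sed -i is used because Solaris sed does not provide it:

```shell
#!/bin/sh
# Hypothetical helper: rename_host ROOT OLDNAME NEWNAME
# Replaces OLDNAME with NEWNAME in the identity files under ROOT.
# Assumes the names contain no "/" or other sed metacharacters, and that
# OLDNAME is not a prefix of another host name present in the files.
rename_host() {
    root=$1
    old=$2
    new=$3
    for f in etc/hostname.bge0 etc/hosts etc/nodename; do
        [ -f "$root/$f" ] || continue
        tmp="$root/$f.tmp.$$"
        # Write the edited copy, then copy it back in place.
        sed "s/$old/$new/g" "$root/$f" > "$tmp" &&
            cat "$tmp" > "$root/$f"
        rm -f "$tmp"
    done
}

# Example use on the new node (not executed here):
#   rename_host /tmp/raiz node11 node12
```

Review the resulting files afterwards, in particular /tmp/raiz/etc/hosts, where the address of the new node must differ from that of the surviving one.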

It’s also important to change the cluster node id (swap it from 1 to 2 or from 2 to 1):

# vi /tmp/raiz/etc/cluster/nodeid
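
The node id swap can also be written down as a tiny function, which avoids typing the wrong value. This sketch assumes that the nodeid file contains nothing but the id number; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: prints the other node id of a two-node cluster
# (1 becomes 2, 2 becomes 1) and fails on anything else.
swap_nodeid() {
    case "$1" in
        1) echo 2 ;;
        2) echo 1 ;;
        *) echo "unexpected node id: $1" >&2
           return 1 ;;
    esac
}

# Example use on the new node (not executed here). The old value is read
# first, so the redirection does not truncate the file before it is read:
#   old=`cat /tmp/raiz/etc/cluster/nodeid`
#   swap_nodeid "$old" > /tmp/raiz/etc/cluster/nodeid.new &&
#       cat /tmp/raiz/etc/cluster/nodeid.new > /tmp/raiz/etc/cluster/nodeid
#   rm -f /tmp/raiz/etc/cluster/nodeid.new
```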

Remove the /etc/path_to_inst file; it will be regenerated on the next reboot:

# rm /tmp/raiz/etc/path_to_inst*

You’ll probably have to regenerate and modify more things, like ssh server keys or the dump directory (dumpadm).

6. Set the final parameters:

# zfs umount rpool/ROOT/zfsBE
# zfs set mountpoint=/ rpool/ROOT/zfsBE
# zpool set bootfs=rpool/ROOT/zfsBE rpool
# zpool set failmode=continue rpool
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t0d0s0

Reboot the new node with a reconfiguration boot, in non-cluster mode (this recreates the DID devices):

# reboot -- -rx

Once that boot completes, reboot the new node again so that it joins the cluster:

# init 6

Finally, check the cluster status:

# cldev status -v
# cldev clear
# cldev refresh

Sergio.
