Solaris: UFS to ZFS, LiveUpgrade and Patching
This article gives a detailed overview of how we migrate our servers from UFS to ZFS boot 2-way mirrors, how they are upgraded to Solaris™ 10u6 aka 10/08 with /var on a separate ZFS, and finally how to accomplish "day-to-day" patching. The main stages are described in the sections below.
Make sure that at each stage all zones are running or at least bootable, and that the environment variables shown below are properly set. Also give your brain a chance and think before you blindly copy-and-paste any commands mentioned in this article! There is no guarantee that the commands shown here exactly match your system; they may damage it or cause data loss if you do not adjust them to your needs!
The shown procedures have been successfully tested on several Sun Fire V240, 280R, 420R, V440, V490, T1000, X4500, X4600 and Sun Ultra 40s with zero or more running sparse zones.
setenv CD /net/install/pool1/install/sparc/Solaris_10_u6-ga1
setenv JUMPDIR /net/install/pool1/install/jumpstart
mount /local/misc
set path = ( /usr/bin /usr/sbin /local/misc/sbin )
update to S10u6 aka 10/08 via recommended/feature patching
on pre-U4 systems SUNWlucfg is probably missing:
pkgadd -d $CD/Solaris_10/Product SUNWlucfg
make sure that all required patches are installed, e.g. for sparc:
137137-09 - U6 kernel Patch and its lu/zfs boot dependencies:
119252-26, 119254-59, 119313-23, 121430-36/121431-37, 121428-12/121429-12, 124628-08, 124630-19
see also: Solaris™ Live Upgrade Software: Minimum Patch Requirements
see also: checkpatches.sh
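To get a quick idea whether a required patch revision is already installed, one may compare revisions like this. This is a minimal, simulated sketch: the helper patch_ok and the sample output file are made up for illustration; on a real system one would feed it the output of `showrev -p` instead of the here-doc.

```shell
# Hypothetical helper: succeeds if the installed revision of a patch base ID
# is >= the required revision (argument form: 137137-09).
patch_ok() {   # usage: patch_ok <showrev-output-file> <patch-id>
    base=${2%-*}; need=${2#*-}
    inst=`grep "Patch: $base-" "$1" | \
        sed "s/.*Patch: $base-\([0-9]*\).*/\1/" | sort -n | tail -1`
    [ -n "$inst" ] && [ "$inst" -ge "$need" ]
}

# simulated 'showrev -p' output (made up for this demo)
cat >/tmp/showrev.out <<'EOF'
Patch: 119252-26 Obsoletes:  Requires:  Packages: SUNWlur
Patch: 137137-09 Obsoletes: 137111-01  Requires:  Packages: SUNWcsr
EOF

patch_ok /tmp/showrev.out 137137-09 && echo "137137-09 present"
patch_ok /tmp/showrev.out 124628-08 || echo "124628-08 MISSING"
```

checkpatches.sh does essentially this kind of comparison for a whole list of patch IDs.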
Also apply the following patches to avoid a lot of trouble (see LiveUpgrade Troubleshooting for more information):
# Solaris
gpatch -p0 -d / -b -z .orig < /local/misc/etc/lu-5.10.patch
# Nevada
gpatch -p0 -d / -b -z .orig < /local/misc/etc/lu-5.11.patch
If anything is unclear, one should consult the following docs:
determine the HDD for the new root pool aka rpool
echo | format
In this example we use: c0t1d0
format the disk so that the whole disk can be used by ZFS
# on x86 first
fdisk -B /dev/rdsk/c0t1d0p0
# on sparc and x86: delete all slices and assign all blocks to s0
format -d c0t1d0
If you want to use mirroring, make sure that s0 of HDD0 and s0 of HDD1 end up with exactly the same size (specify the size as a number of blocks).
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 29772       34.41GB    (29773/0/0) 72169752
  1 unassigned    wm       0                0         (0/0/0)            0
  2     backup    wu       0 - 29772       34.41GB    (29773/0/0) 72169752
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5 unassigned    wm       0                0         (0/0/0)            0
  6 unassigned    wm       0                0         (0/0/0)            0
  7 unassigned    wm       0                0         (0/0/0)            0
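The equal-size check can also be scripted. A minimal, simulated sketch: the two files below mimic the data line for slice 0 of a prtvtoc listing (fields: partition, tag, flags, first sector, sector count, last sector) and are made up for the demo; on a real system one would parse `prtvtoc /dev/rdsk/c0t?d0s2` instead.

```shell
# simulated prtvtoc data lines for slice 0 of both disks
cat >/tmp/vtoc.hdd0 <<'EOF'
       0      2    00          0  72169752  72169751
EOF
cat >/tmp/vtoc.hdd1 <<'EOF'
       0      2    00          0  72169752  72169751
EOF
# extract the sector count (5th field) of slice 0 and compare
c0=`awk '$1 == 0 { print $5 }' /tmp/vtoc.hdd0`
c1=`awk '$1 == 0 { print $5 }' /tmp/vtoc.hdd1`
[ "$c0" -eq "$c1" ] && echo "s0 sizes match ($c0 blocks)"
```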
move all UFS zones to ZFS mirror 'pool1' on HDD2 and HDD3
This allows LU to use zone snapshots instead of copying everything, which is orders of magnitude faster. In our example, UFS zones live in /export/scratch/zones/ ; pool1's mountpoint is /pool1 .
One may use the following ksh snippet (requires GNU sed!):
ksh
zfs create pool1/zones
# adjust this and the following variable
UFSZONES="zone1 zone2 ..."
UFSZPATH="/export/scratch/zones"
for ZNAME in $UFSZONES ; do
    zlogin $ZNAME 'init 5'
done
echo 'verify
commit
' >/tmp/zone.cmd
for ZNAME in $UFSZONES ; do
    # and wait, 'til $ZNAME is down
    while true; do
        zoneadm list | /usr/xpg4/bin/grep -q "^$ZNAME"'$'
        [ $? -ne 0 ] && break
    done
    zfs create pool1/zones/$ZNAME
    mv $UFSZPATH/$ZNAME/* /pool1/zones/$ZNAME/
    chmod 700 /pool1/zones/$ZNAME
    gsed -i \
        -e "/zonepath=/ s,$UFSZPATH/$ZNAME,/pool1/zones/$ZNAME," \
        /etc/zones/$ZNAME.xml
    zonecfg -z $ZNAME -f /tmp/zone.cmd
    zoneadm -z $ZNAME boot
done
exit
create the rpool
zpool create -f -o failmode=continue rpool c0t1d0s0
# some bugs? require us to do the following manually
zfs set mountpoint=/rpool rpool
zfs create -o mountpoint=legacy rpool/ROOT
zfs create -o canmount=noauto rpool/ROOT/zfs1008BE
zfs create rpool/ROOT/zfs1008BE/var
zpool set bootfs=rpool/ROOT/zfs1008BE rpool
zfs set mountpoint=/ rpool/ROOT/zfs1008BE
create the ZFS based Boot Environment (BE)
lucreate -c ufs1008BE -n zfs1008BE -p rpool
~25min on V240
At this point one probably asks oneself why we do not use pool1 for boot, then form a mirror of HDD0 and HDD1 and put another BE on that mirror. The answer is pretty simple: because some machines like the thumper aka X4500 can boot from 2 special disks only (c5t0d0 and c5t4d0).
move BE's /var to a separate ZFS within the BE
zfs set mountpoint=/mnt rpool/ROOT/zfs1008BE
zfs mount rpool/ROOT/zfs1008BE
zfs create rpool/ROOT/zfs1008BE/mnt
cd /mnt/var
find . -depth -print | cpio -puvmdP@ /mnt/mnt/
rm -rf /mnt/mnt/lost+found
cd /mnt; rm -rf /mnt/var
zfs rename rpool/ROOT/zfs1008BE/mnt rpool/ROOT/zfs1008BE/var
zfs umount rpool/ROOT/zfs1008BE
zfs set mountpoint=/ rpool/ROOT/zfs1008BE
zfs set canmount=noauto rpool/ROOT/zfs1008BE/var
~7 min on V240
activate the new ZFS based BE
luactivate zfs1008BE
copy the output of the command to a safe place, e.g. USB stick
restart the machine
init 6
after reboot, check that everything is ok
E.g.:
df -h
dmesg
# PATH should be /pool1/zones/$zname-zfs1008BE for non-global zones
zoneadm list -iv
lustatus
destroy old UFS BE
ludelete ufs1008BE
One will get warnings about not being able to delete ZFSs of the old boot environment like /.alt.tmp.b-LN.mnt/pool1/zones/$zname - that's OK. One can promote their clones (e.g. /pool1/zones/$zname-zfs1008BE) later and then remove the old ones including their snapshots, if desired.
make sure everything is still OK
init 6
move all remaining filesystems from HDD0 to the root pool
Depending on the mount hierarchy, the following recipe needs to be adapted!
# a) check
df -h | grep c0t0d0
# b) stop all zones and processes which use those UFS slices
#    (remember to unshare those slices, if exported via NFS)
zlogin $ZNAME 'init 5'
# c) create appropriate ZFSs
foreach USLICE in $UFS_SLICES_FROM_a
    zfs create rpool/mnt
    # just to be sure
    mount -o ro -o remount $USLICE.mntpoint
    cd $USLICE.mntpoint
    find . -depth -print | cpio -puvmdP@ /rpool/mnt/
    rm -rf /rpool/mnt/lost+found
    umount $USLICE.mntpoint
    # comment out the appropriate entry in /etc/vfstab
    gsed -i -e "/^$USLICE/ s,^,#," /etc/vfstab
    zfs rename -p rpool/mnt rpool/$USLICE.mntpoint
    # in case of NFS export, comment out entries in /etc/dfs/dfstab
    # and apply to the ZFS itself
    zfs set sharenfs='rw=sol:bsd:lnx,root=admhosts' rpool/mnt
    # if the parent mountpoint is not appropriate
    zfs set mountpoint=$USLICE.mntpoint rpool/$USLICE.mntpoint
done
adjust /etc/lu/ICF.$NUM
Deduce $NUM from /etc/lutab (e.g. with grep :`lucurr`: /etc/lutab | cut -f1 -d:), replace the c0t0d0* entries with their ZFS counterparts and add all parents not yet part of that file. The order of these entries is important! lumount tries to mount the filesystems in the same order they appear in the file and thus may hide required directories (mountpoints)! E.g. our diff would be:
 zfs1008BE:-:/dev/zvol/dsk/rpool/swap:swap:4196352
 zfs1008BE:/:rpool/ROOT/zfs1008BE:zfs:0
 zfs1008BE:/var:rpool/ROOT/zfs1008BE/var:zfs:0
 zfs1008BE:/pool1:pool1:zfs:0
 zfs1008BE:/rpool:rpool:zfs:0
 zfs1008BE:/pool1/zones:pool1/zones:zfs:0
-zfs1008BE:/var/log/web:/dev/dsk/c0t0d0s6:ufs:16780016
+zfs1008BE:/rpool/var:rpool/var:zfs:0
+zfs1008BE:/rpool/var/log:rpool/var/log:zfs:0
+zfs1008BE:/var/log/web:rpool/var/log/web:zfs:0
-zfs1008BE:/export/scratch:/dev/dsk/c0t0d0s7:ufs:16703368
+zfs1008BE:/export:rpool/export:zfs:0
+zfs1008BE:/export/scratch:rpool/export/scratch:zfs:0
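The UFS-to-ZFS line replacement shown in the diff can be scripted. A minimal, simulated sketch (the scratch file and the exact sed expression are illustrative, not the commands we actually used):

```shell
# scratch copy of two ICF-style lines
cat >/tmp/ICF.demo <<'EOF'
zfs1008BE:/var/log/web:/dev/dsk/c0t0d0s6:ufs:16780016
zfs1008BE:/:rpool/ROOT/zfs1008BE:zfs:0
EOF
# rewrite the UFS slice entry to its ZFS counterpart; ZFS lines stay untouched
sed -e 's,^\(zfs1008BE:/var/log/web\):/dev/dsk/c0t0d0s6:ufs:[0-9]*$,\1:rpool/var/log/web:zfs:0,' \
    /tmp/ICF.demo >/tmp/ICF.new
cat /tmp/ICF.new
```

Remember that unlike this demo, the real edit happens in place on /etc/lu/ICF.$NUM - keep a backup copy around.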
make sure that no slice of HDD0 is used anymore and everything works as expected
init 6
# after reboot
df -h | grep c0t0d0
zpool status
swap -l
Destroy any zpool/zfs/volume which is still assigned to c0t0d0 - if one is still in use, zfs/zpool will warn you.
repartition HDD0 as one slice as described above
attach HDD1 to HDD0 - form a ZFS 2-way mirror
zpool attach rpool c0t1d0s0 c0t0d0s0
Finally, to avoid an unbootable environment, check the ZFS Troubleshooting Guide to fix any known ZFS Boot Issues immediately.
See also:
remove the old Live Upgrade packages
pkgrm SUNWluu SUNWlur SUNWlucfg
add the Live Upgrade packages from the release/update to install
pkgadd -d $CD/Solaris_10/Product SUNWluu SUNWlur SUNWlucfg
check whether all required patches are installed
E.g. wrt. Solaris™ Live Upgrade Software: Minimum Patch Requirements:
checkpatches.sh -p 119081-25 124628-05 ...
# Solaris
gpatch -p0 -d / -b -z .orig < /local/misc/etc/lu-5.10.patch
# Nevada
gpatch -p0 -d / -b -z .orig < /local/misc/etc/lu-5.11.patch
create the new root pool on HDD0
This is usually not necessary if you already have a ZFS-mirrored boot environment (in this case just use rpool instead of rpool0 in the following examples/scripts and omit this step). However, if e.g. s0 of HDD1 is smaller than s0 of HDD0, the latter cannot be attached to the former. So we need to "swap" the situation.
zpool create -o failmode=continue rpool0 c0t0d0s0
zpool status
check and fix basic ownership/permissions
pkgchk -v SUNWscpr
speed up lu commands
Some servers have a lot of filesystems which are completely meaningless wrt. LiveUpgrade (e.g. the users' home directories). That's why they should be ignored by the lu* commands, preventing a lot of unnecessary work and saving a lot of time. E.g. excluding ~2200 ZFSs on an X4600M2 (4x DualCore Opteron 8222, 3GHz) saves about 40 min per lumount, luactivate, etc. command. So to exclude all user home directories in our case, we put an appropriate regular expression (see regexp(5)) into /etc/lu/fs2ignore.regex - for more information see LiveUpgrade Troubleshooting.
echo '/export/home' > /etc/lu/fs2ignore.regex
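Before relying on such a pattern, it is worth checking what it actually matches. A minimal, simulated sketch: the filesystem list below is made up, and the grep merely mimics the filtering the lu* commands do against /etc/lu/fs2ignore.regex.

```shell
echo '/export/home' >/tmp/fs2ignore.regex
# simulated 'dataset mountpoint' list (made up for this demo)
cat >/tmp/fslist <<'EOF'
rpool/ROOT/s10u6        /
rpool/ROOT/s10u6/var    /var
pool1/home/user1        /export/home/user1
pool1/home/user2        /export/home/user2
EOF
# filesystems the lu* commands would still have to consider:
grep -v -f /tmp/fs2ignore.regex /tmp/fslist
```

If the ignored set looks wrong, refine the regex before running the next lucreate.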
create a new boot environment for upgrade
rmdir /.alt.lucopy.*
lucreate -n s10u6 -p rpool0
~30 min on V240
mount the new bootenv on /mnt - fix any errors
Do not continue before it executes without any errors. If you got problems, have a look at LiveUpgrade Troubleshooting.
lumount s10u6 /mnt
determine patches, which would be removed by luupgrade
can be used to re-apply them after upgrade, if necessary
$CD/Solaris_10/Misc/analyze_patches -N $CD -R /mnt \
    >/mnt/var/tmp/s10u6-rm.txt
luumount s10u6
~4 min on V240
create the profile to use for the upgrade
We use:
$JUMPDIR/mkProfile.sh -u $CD/Solaris_10
# remove U6 zone poison SUNWdrr on sparc
echo 'cluster SUNWC4u1 delete' >> /tmp/profile.orig
echo 'cluster SUNWCcvc delete' >> /tmp/profile.orig
cp /tmp/profile.orig /var/tmp/profile
# just in case one wants to know the initial setup of the system
cp /var/tmp/profile /mnt/var/sadm/system/logs/profile.s10u6
verify the profile (simulate upgrade)
rm -f /var/tmp/upgrade.err /var/tmp/upgrade.out
luupgrade -u -n s10u6 \
    -l /var/tmp/upgrade.err -o /var/tmp/upgrade.out \
    -s $CD -j /var/tmp/profile -D
upgrade if /var/tmp/upgrade.err is empty
rm -f /var/tmp/upgrade.err /var/tmp/upgrade.out
luupgrade -u -n s10u6 \
    -l /var/tmp/upgrade.err -o /var/tmp/upgrade.out \
    -s $CD -j /var/tmp/profile
~95 min on V240
NOTE: The last step of luupgrade - copying the failsafe miniroot - may fail. See luupgrade: Installing failsafe fails for how to do this manually.
make sure that zone paths are not mounted
# umount all zone ZFSs created by lu*, e.g.:
zfs umount /pool1/zones/*-zfs1008BE-s10u6
If zonepaths are mounted, lumount/luactivate and friends will usually fail!
check infos and errors, fix them if necessary, and re-apply your changes
lumount s10u6 /mnt
gpatch -p0 -d /mnt -b -z .orig < /local/misc/etc/lu-`uname -r`.patch
cd /mnt/var/sadm/system/data/
less upgrade_failed_pkgadds upgrade_cleanup locales_installed vfstab.unselected
cleanup4humans.sh /mnt
BTW: We use the script cleanup4humans.sh to get this job done faster and more reliably. It prepares basic command lines one probably needs in order to decide whether to copy back replaced files or to replace files with the version suggested by the upgrade package.
apply new patches
We use PCA to find out, which patches are available/recommended for the new BE and its zones and finally apply them using pca -R ...
However, to get reproducible results on all systems, we modified the script to have the patch download directory preset, to always use /usr/bin/perl no matter how the PATH variable is set, and finally to invoke a postinstall script (if available on the system) which automatically fixes questionable changes made by patches. For your convenience, you can download this patch and edit/adapt it to your needs.
Of course you may try to accomplish the same with the smpatch command; however, IMHO this kind of bogus, poorly designed software is probably one reason why some people think Java is a bad thing. That's why we don't install the SUNWCpmgr, SUNWCswup, SUNWswupclr, SUNWswupcl junkware on any system.
OK, let's start with creating patch lists for each zone using the convenience script lupatch.sh (download and adjust it to your needs if you don't have it already). BTW: This command also works for the current BE if one adds the '-R /' option.
# download all potential patches and show available patch lists when finished
lupatch.sh -d
Now one should study the READMEs of all potential new patches for the zones by invoking the command shown below. BTW: This command also works for the current BE if one adds the '-R /' option.
lupatch.sh -r
After that one should customize the available patch lists (per default /var/tmp/patchList.*), either by removing a patch list entirely or by removing the lines of patches which should not be installed.
Finally one should apply the patches to the zones, either using 'luupgrade -t -r /mnt patch ...' or the following command, which uses 'pca -i -R /mnt/$zonepath patch ...' internally to patch each zone using the patch lists mentioned before (and makes sure that the global zone gets patched first).
lupatch.sh -i
Make sure to always patch the 'global' zone FIRST!!!
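The "global zone first" ordering can be sketched as follows. This is a simulated example of sorting patch-list files so the global zone's list comes first; the file names follow the /var/tmp/patchList.* convention from above, and lupatch.sh presumably does something equivalent internally.

```shell
# simulated patch list file names, one per zone (made up for this demo)
printf '%s\n' /var/tmp/patchList.zone1 /var/tmp/patchList.global \
    /var/tmp/patchList.zone2 >/tmp/lists
# process the global zone's list first, then all others
{ grep '\.global$' /tmp/lists; grep -v '\.global$' /tmp/lists; } >/tmp/ordered
cat /tmp/ordered
```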
unmount the BE
cd /
luumount s10u6
activate the new BE
luactivate s10u6
boot into the new BE
init 6
# after reboot, check whether you still have a swap device enabled
swap -l
# if not, add the one from the pool where the current BE lives and
# add an appropriate entry to /etc/vfstab:
swap -a /dev/zvol/dsk/rpool0/swap
echo '/dev/zvol/dsk/rpool0/swap - - swap - no -' >>/etc/vfstab
adjust your backup settings
javaws -viewer
This section shows how one may transfer the remaining ZFSs from HDD1 aka rpool with the old ZFS BE (zfs1008BE) to HDD0 aka rpool0, where the current BE (s10u6) lives. This is usually not necessary when one was able to attach s0 of HDD0 to s0 of HDD1 before upgrading to the "new" OS. However, for some reasons this might not have been possible at the time of the upgrade, and that's why this case, including pretty detailed troubleshooting hints, is covered here.
To get the picture, we have now the following situation on our server:
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
zfs1008BE                  yes      no     no        yes    -
s10u6                      yes      yes    yes       no     -

Filesystem                        Size  Used   Available Capacity Mounted on
rpool0/ROOT/s10u6                  33G  4.7G   23G       18%      /
rpool0/ROOT/s10u6/var              33G  2.2G   23G        9%      /var
rpool/export                       34G   20K   21G        1%      /export
rpool/export/scratch               34G  1021M  21G        5%      /export/scratch
pool1                              67G   22K   66G        1%      /pool1
pool1/home                         67G   39M   66G        1%      /pool1/home
pool1/web                          67G   27K   66G        1%      /pool1/web
pool1/web/iws2                     67G  284K   66G        1%      /pool1/web/iws2
pool1/web/iws2/sites               67G  850K   66G        1%      /pool1/web/iws2/sites
pool1/web/theo2                    67G   25K   66G        1%      /pool1/web/theo2
pool1/web/theo2/sites              67G   24K   66G        1%      /pool1/web/theo2/sites
pool1/zones                        67G   29K   66G        1%      /pool1/zones
rpool                              34G   21K   21G        1%      /rpool
rpool/var                          34G   19K   21G        1%      /rpool/var
rpool/var/log                      34G   18K   21G        1%      /rpool/var/log
rpool0                             33G   21K   23G        1%      /rpool0
rpool0/ROOT                        33G   18K   23G        1%      /rpool0/ROOT
rpool/var/log/web                  34G  1012K  21G        1%      /var/log/web
pool1/zones/sdev-zfs1008BE-s10u6   67G  732M   66G        2%      /pool1/zones/sdev-zfs1008BE-s10u6

NAME                               USED  AVAIL  REFER  MOUNTPOINT
pool1                             1.10G  65.8G    22K  /pool1
pool1/home                        39.3M  65.8G  39.1M  /pool1/home
pool1/home@home                    196K      -  39.1M  -
pool1/web                         1.18M  65.8G  27.5K  /pool1/web
pool1/web/iws2                    1.11M  65.8G   284K  /pool1/web/iws2
pool1/web/iws2/sites               850K  65.8G   850K  /pool1/web/iws2/sites
pool1/web/theo2                     50K  65.8G  25.5K  /pool1/web/theo2
pool1/web/theo2/sites             24.5K  65.8G  24.5K  /pool1/web/theo2/sites
pool1/zones                       1.06G  65.8G    29K  /pool1/zones
pool1/zones/sdev                   942M  65.8G   942M  /pool1/zones/sdev
pool1/zones/sdev@zfs1008BE         264K      -   942M  -
pool1/zones/sdev-zfs1008BE        35.5M  65.8G   943M  /pool1/zones/sdev-zfs1008BE
pool1/zones/sdev-zfs1008BE@s10u6  7.00M      -   942M  -
pool1/zones/sdev-zfs1008BE-s10u6   104M  65.8G   732M  /pool1/zones/sdev-zfs1008BE-s10u6
rpool                             12.5G  21.2G    21K  /rpool
rpool/ROOT                        7.52G  21.2G    18K  legacy
rpool/ROOT/zfs1008BE              7.52G  21.2G  4.05G  /.alt.zfs1008BE
rpool/ROOT/zfs1008BE/var          3.47G  21.2G  3.47G  /.alt.zfs1008BE/var
rpool/dump                        2.01G  21.2G  2.01G  -
rpool/export                      1021M  21.2G    20K  /export
rpool/export/scratch              1021M  21.2G  1021M  /export/scratch
rpool/swap                        2.00G  23.2G  15.7M  -
rpool/var                         1.02M  21.2G    19K  /rpool/var
rpool/var/log                     1.01M  21.2G    18K  /rpool/var/log
rpool/var/log/web                 1012K  21.2G  1012K  /var/log/web
rpool0                            10.9G  22.6G  21.5K  /rpool0
rpool0/ROOT                       6.89G  22.6G    18K  /rpool0/ROOT
rpool0/ROOT/s10u6                 6.89G  22.6G  4.72G  /
rpool0/ROOT/s10u6/var             2.17G  22.6G  2.17G  /var
rpool0/dump                       2.01G  22.6G  2.01G  -
rpool0/swap                       2.00G  24.6G    16K  -

      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool0/dump (dedicated)
Savecore directory: /var/crash/joker
  Savecore enabled: yes

swapfile                  dev    swaplo blocks  free
/dev/zvol/dsk/rpool0/swap 256,4  16     4196336 4196336
mount the new bootenv on /mnt - fix any errors
This step is recommended: if lumount fails, ludelete would fail as well, so fixing it early makes troubleshooting easier.
lumount zfs1008BE /mnt
And oops, we get the following error:
ERROR: cannot mount '/mnt/var': directory is not empty
ERROR: cannot mount mount point </mnt/var> device <rpool/ROOT/zfs1008BE/var>
ERROR: failed to mount file system <rpool/ROOT/zfs1008BE/var> on </mnt/var>
ERROR: unmounting partially mounted boot environment file systems
ERROR: No such file or directory: error unmounting <rpool/ROOT/zfs1008BE>
ERROR: umount: warning: rpool/ROOT/zfs1008BE not in mnttab
umount: rpool/ROOT/zfs1008BE no such file or directory
ERROR: cannot unmount <rpool/ROOT/zfs1008BE>
ERROR: cannot mount boot environment by name <zfs1008BE>
To check what's going wrong, one may use the procedure described in Debugging lucreate, lumount, luumount, luactivate, ludelete and have a look at lumount.trc.zfs1008BE.log:
CPU   PID   TIME            COMMAND
  1 48923   12341952594063  lumount zfs1008BE /mnt
  0 48923   12341971394229  /etc/lib/lu/plugins/lupi_zones plugin
  1 48923   12341972471729  /etc/lib/lu/plugins/lupi_svmio plugin
  0 48923   12341991109479  /etc/lib/lu/plugins/lupi_bebasic plugin
  1 48923   12342011131729  metadb
  0 48923   12342087965979  zfs set mountpoint=/mnt rpool/ROOT/zfs1008BE
  1 48923   12343091202978  zfs get -Ho value mounted rpool/ROOT/zfs1008BE
  1 48923   12343115644562  zfs mount rpool/ROOT/zfs1008BE
  1 48923   12343186977645  zfs get -Ho value mounted rpool/ROOT/zfs1008BE/var
  0 48923   12343212715811  zfs mount rpool/ROOT/zfs1008BE/var
  0 48923   12344023001311  lockfs -f /mnt/
  0 48923   12344027045227  umount -f /mnt/
  0 48923   12344031500477  umount -f /mnt
  1 48923   12344561966727  umount rpool/ROOT/zfs1008BE
  1 48923   12344568052643  umount -f rpool/ROOT/zfs1008BE
Here we can see that lumount does nothing "special" or "magic": mounting the old BE's / succeeds, but mounting its /var FS fails. Because the mountpoint of the BE's / has already been set to /mnt by lumount, we do:
zfs mount rpool/ROOT/zfs1008BE
ls -alR /mnt/var
/mnt/var:
total 14
drwxr-xr-x   4 root     root           4 Nov 27 05:09 .
drwxr-xr-x  33 root     root          35 Dec  2 03:41 ..
drwx------   3 root     root           3 Nov 27 05:09 log
drwx------   2 root     root           4 Nov 27 05:30 run

/mnt/var/log:
total 9
drwx------   3 root     root           3 Nov 27 05:09 .
drwxr-xr-x   4 root     root           4 Nov 27 05:09 ..
drwx------   2 root     root           2 Nov 27 05:09 web

/mnt/var/log/web:
total 6
drwx------   2 root     root           2 Nov 27 05:09 .
drwx------   3 root     root           3 Nov 27 05:09 ..

/mnt/var/run:
total 10
drwx------   2 root     root           4 Nov 27 05:30 .
drwxr-xr-x   4 root     root           4 Nov 27 05:09 ..
-rw-r--r--   1 root     root           4 Nov 29 06:30 bootadm.lock
-rw-r--r--   1 root     root           3 Nov 29 06:30 ipmon.pid
Because /var/run is usually mounted on swap and all other directories are empty, we can conclude that these are probably relics of the upgrade process which are not needed anymore. So we solve the problem and check again using the commands shown below.
rm -rf /mnt/var/*
zfs umount /mnt
lumount zfs1008BE /mnt
luumount zfs1008BE
Well, now all of this works and we are able to continue with the next step.
delete the old BE
Never delete a BE where an error occurred during lucreate while snapshotting relevant datasets. E.g.:
Creating snapshot for <pool1/zones/sdev-zfs1008BE-s10u6> on \
    <pool1/zones/sdev-zfs1008BE-s10u6@s10u6_20081203>
ERROR: cannot create snapshot \
    'pool1/zones/sdev-zfs1008BE-s10u6@s10u6_20081203': dataset is busy
ERROR: Unable to snapshot <pool1/zones/sdev-zfs1008BE-s10u6> on \
    <pool1/zones/sdev-zfs1008BE-s10u6@s10u6_20081203>
cannot open 'pool1/zones/sdev-zfs1008BE-s10u6_20081203': dataset does not exist
cannot open 'pool1/zones/sdev-zfs1008BE-s10u6_20081203': dataset does not exist
cannot open 'pool1/zones/sdev-zfs1008BE-s10u6_20081203': dataset does not exist
Population of boot environment <s10u6_20081203> successful.
First create another snapshot of the original ZFS (here pool1/zones/sdev-zfs1008BE-s10u6@stillused) to prevent ludelete from destroying it (here pool1/zones/sdev-zfs1008BE-s10u6). This is probably a bug and should not happen; however, it happened - the whole zone from the current BE was gone as well (even though not reproducible)!
ludelete zfs1008BE
lustatus
umount: warning: /.alt.tmp.b-c0.mnt/pool1/zones/sdev-zfs1008BE not in mnttab
umount: /.alt.tmp.b-c0.mnt/pool1/zones/sdev-zfs1008BE not mounted
Deleting ZFS dataset <pool1/zones/sdev-zfs1008BE>.
ERROR: cannot destroy 'pool1/zones/sdev-zfs1008BE': filesystem has dependent clones
use '-R' to destroy the following datasets:
pool1/zones/sdev-zfs1008BE-s10u6
ERROR: Unable to delete ZFS dataset <pool1/zones/sdev-zfs1008BE>.
Determining the devices to be marked free.
Updating boot environment configuration database.
Updating boot environment description database on all BEs.
Updating all boot environment configuration databases.
Boot environment <zfs1008BE> deleted.

Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10u6                      yes      yes    yes       no     -
As we can see, the old zfs1008BE with its ZFSs rpool/ROOT/zfs1008BE, rpool/ROOT/zfs1008BE/var, pool1/zones/sdev-zfs1008BE and pool1/zones/sdev@zfs1008BE could be deleted successfully, but not without "errors". Of course we do not want pool1/zones/sdev-zfs1008BE-s10u6 destroyed, since it is still used as the base of the zone sdev. So DON'T do what ludelete actually suggests!
What we do instead is promote this ZFS clone of the pool1/zones/sdev@zfs1008BE snapshot (see zfs(1M)). Note that this is not strictly required (because it lives on pool1), but since the original BE no longer exists and the old zone (at least if it is a sparse zone) would use ZFSs from the new BE, a rollback to the old zone would not make sense: it would be more or less inconsistent and cause trouble sooner or later. So free its resources aka disk space!
promote zone ZFSs, which depend on the old BE
# check potential candidates
zoneadm list -cp | grep -v :global: | cut -f2,4 -d:
# promote appropriate ZFSs
zfs promote pool1/zones/*-zfs1008BE-s10u6
# make sure the VALUE of the origin is '-'
zfs get origin pool1/zones/*-zfs1008BE-s10u6
destroy snapshots and old parents of promoted ZFSs
zfs destroy pool1/zones/sdev-zfs1008BE-s10u6@zfs1008BE
zfs destroy pool1/zones/sdev
transfer remaining ZFSs
So what is left?
zfs list | egrep '^rpool( |/)'
ksh
zoneadm list -cp | grep -v :global: | cut -f2 -d: | while read ZN ; do
    echo $ZN
    zonecfg -z $ZN info fs | /usr/xpg4/bin/egrep -E '[[:space:]](dir|special):'
    zonecfg -z $ZN info zonepath | grep ': rpool/'
done
exit
# not shown below
swap -l
dumpadm
ls -al /rpool /export /rpool/var /rpool/var/log
rpool                  5.01G  28.7G    21K  /rpool
rpool/ROOT               18K  28.7G    18K  legacy
rpool/dump             2.01G  28.7G  2.01G  -
rpool/export           1021M  28.7G    20K  /export
rpool/export/scratch   1021M  28.7G  1021M  /export/scratch
rpool/swap             2.00G  30.7G  15.7M  -
rpool/var              1.02M  28.7G    19K  /rpool/var
rpool/var/log          1.01M  28.7G    18K  /rpool/var/log
rpool/var/log/web      1012K  28.7G  1012K  /var/log/web
sdev
    dir: /var/log/httpd
    special: /var/log/web/iws2
    dir: /data/sites
    special: /pool1/web/iws2/sites
So we can see that the only ZFSs we have to consider for transfer are rpool/var/log/web and rpool/export/scratch, because all others are either not used anymore or contain no data (except a directory used as a mountpoint for another ZFS). For obvious reasons we do not transfer /export/scratch, but just destroy it and create a new one in rpool0. So /var/log/web is the only one left over.
However, since it is used by the zone sdev, we have to shut down the zone first, transfer the FS and finally restart the zone.
# get rid of the old /export/scratch and create a new one on rpool0
zfs destroy -rR rpool/export
zfs create -o mountpoint=/export rpool0/export
zfs create -o sharenfs='rw=sol:bsd:lnx,root=admhosts' \
    -o quota=8G rpool0/export/scratch
chmod 1777 /export/scratch
# transfer /var/log/web
zlogin sdev 'init 5'
sh -c 'while [ -n "`zoneadm list | grep sdev`" ]; do sleep 1; done'
zfs umount rpool/var/log/web
zfs snapshot rpool/var/log/web@xfer
zfs send -R rpool/var/log/web@xfer | zfs receive -d rpool0
# restart the zone
zoneadm -z sdev boot
# fix /etc/power.conf
gsed -i 's,/rpool/,/rpool0/,g' /etc/power.conf
pmconfig
# optional: destroy the source of the transferred ZFS
zfs destroy rpool0/var/log/web@xfer
zfs destroy -r rpool/var
destroy the old pool
Just to make sure that no valid data gets destroyed unintentionally, one should have a look at the pool first!
zfs list | egrep '^rpool( |/)'
zpool destroy rpool
rpool        4.01G  29.7G    20K  /rpool
rpool/ROOT     18K  29.7G    18K  legacy
rpool/dump   2.01G  29.7G  2.01G  -
rpool/swap   2.00G  31.7G  15.7M  -
make lufslist and friends happy
Since there is no lusync_icf, one needs to do this manually. The helper below should make and display the proper changes; however, one should of course verify that the changes made are correct.
ksh
BE=`lucurr`
ICF=`grep :${BE}: /etc/lutab | awk -F: '{ print "/etc/lu/ICF." $1 }'`
cp -p $ICF ${ICF}.bak
gsed -i -r -e '/:rpool:/ d' -e 's,:(/?rpool)([:/]),:\10\2,g' $ICF
diff -u ${ICF}.bak $ICF
print "\nVERIFY that $ICF is correct - if not\ncp -p ${ICF}.bak $ICF\n"
lufslist $BE
exit
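To see what the gsed expression in the helper actually does, one can run it against a few sample ICF lines. A simulated sketch (scratch file made up for the demo; GNU sed is assumed, i.e. gsed on Solaris): the first expression drops the old pool's own entry, the second renames rpool to rpool0 in dataset paths.

```shell
# sample ICF-style lines (made up); real file is /etc/lu/ICF.$NUM
cat >/tmp/ICF.sample <<'EOF'
s10u6:/rpool:rpool:zfs:0
s10u6:/:rpool/ROOT/s10u6:zfs:0
s10u6:/var:rpool/ROOT/s10u6/var:zfs:0
EOF
# drop the old pool's own entry, rename rpool -> rpool0 in the remaining ones
sed -r -e '/:rpool:/ d' -e 's,:(/?rpool)([:/]),:\10\2,g' \
    /tmp/ICF.sample >/tmp/ICF.fixed
cat /tmp/ICF.fixed
```

Note that '\10' is backreference \1 followed by a literal '0' - sed only knows backreferences \1 to \9.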
format HDD1 and zpool attach to HDD0
Since we want the boot disk mirrored by ZFS, we have to make sure that s0 of HDD1 has at least the same size in blocks as s0 of HDD0. Otherwise zfs will refuse to attach it.
# compare 'Sector Count' of 'Partition' 0
prtvtoc /dev/rdsk/c0t0d0s2 | /usr/xpg4/bin/grep -E 'Count|^[[:space:]]*0'
prtvtoc /dev/rdsk/c0t1d0s2 | /usr/xpg4/bin/grep -E 'Count|^[[:space:]]*0'
# if size is not OK, adjust it via 'format' - see above
# and finally attach the slice to form a 2-way mirror
zpool attach rpool0 c0t0d0s0 c0t1d0s0
zpool status
sleep 900
print "\n##### status #####\n"
zpool status | grep scrub:
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0  71763164  71763163
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0  72169752  72169751

  pool: pool1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool0
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 3.62% done, 0h23m to go
config:

        NAME          STATE     READ WRITE CKSUM
        rpool0        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors

##### status #####

 scrub: none requested
 scrub: resilver completed after 0h14m with 0 errors on Wed Dec  3 06:05:06 2008
Resolve ZFS Boot Issues
see ZFS Boot Issues and Solaris 10 10/08 Release and Installation Collection >> Solaris 10 10/08 Release Notes >> 2. Solaris Runtime Issues >> File Systems.
# ------------ on sparc ---------
dd if=/dev/rdsk/c0t1d0s0 of=/tmp/bb bs=1b iseek=1 count=15
dd if=/dev/rdsk/c0t1d0s0 of=/tmp/bb bs=1b iseek=1024 oseek=15 count=16
cmp /tmp/bb /usr/platform/`uname -i`/lib/fs/zfs/bootblk
# if they differ, fix it:
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk \
    /dev/rdsk/c0t1d0s0
ls -al /rpool0/platform/`uname -m`/bootlst
# if this failed
mkdir -p /rpool0/platform/`uname -m`/
cp -p /platform/`uname -m`/bootlst /rpool0/platform/`uname -m`/bootlst
# ------------ on x86 (not yet tested) -----------
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0
cleanup artifacts and have a beer
rm -rf /.alt.*
rmdir /rpool
When you have studied and understood all previous sections, you should have a pretty good understanding of how to do the day-to-day patching of a ZFS based Solaris system. However, for your convenience, here is the summary:
tcsh
mount /local/misc
set path = ( /usr/bin /usr/sbin /local/misc/sbin )
setenv RPOOL `df / | grep -v ^Filesystem | cut -f1 -d/`
setenv NBE s10u8_`date '+%Y%m%d'`
lupatch.sh -d -R /
# stop here - if no new patches are available
lupatch.sh -r -R /
# edit patch lists for each zone if not all uninstalled patches should be
# installed:
# delete all unwanted patch lines - for permanent ignorance put something
# like 'ignore=123456-07' into /etc/pca.conf
vi /var/tmp/patchList.*
lucreate -n $NBE -p $RPOOL
umount /mnt
lumount $NBE /mnt
# do not continue before lumount above works - see lumount trouble
lupatch.sh -i
# fix possible LU updates: assumes 121430-50 (121431-51) or greater is installed
gpatch -p0 -d /mnt -b -z .orig < /local/misc/etc/lu-`uname -r`.patch
gsed -i -e '/^\/var\/mail/ s,^,#,' /mnt/etc/lu/synclist
# S10u8 includes zfs entries into /etc/vfstab - prevents boot: check & correct
cat /mnt/etc/vfstab
gsed -i -e '$d' /mnt/etc/vfstab
gsed -i -e '$d' /mnt/etc/vfstab
cd /
luumount $NBE
luactivate $NBE
# when the next maintenance window starts
init 6
Copyright (C) 2008 Jens Elkner (jel+lu@cs.uni-magdeburg.de)