Replacing clustered storage for a SQL cluster (EMC CE / MS Clustering)

January 25, 2011

So we had a client running a SQL database . . and they ran out of space.
Normally, you’d just expand disks and we’re good . . but in this instance, the SQL server was clustered – and the cluster was Geo-spanned using EMC Cluster Enabler.

Disk access is handled as per the following diagram:
#################
#               #
#   EMC CE      # (Uses Lun / Device ID to identify disks)
#               # (Allows failover of GeoSpan clusters i.e. remote site failover)
#################
       ||
       \/ (Creates a Cluster Resource to handle GeoSpan replication)
##################
#                #
# MS Cluster     # (Uses Signature ID of disk to identify disk)
#   Service      # (Provides ability to fail over between 2 servers seamlessly)
##################
        ||
        \/
#################
#               #
# DiskMgr       # (Uses Disk ID to identify disk)
#               #
#################

So in order for us to expand the disk, we need to, at a DiskMgr level, simply replace the disk with a new one containing the same data . . which also has to carry the same Disk ID, Signature ID and Device / Lun ID. Of course . . the Lun ID will change when your storage guys present new storage, and the signature and disk ID can't change while the resource is in use. That is why both the EMC CE group and the cluster service have to be taken out of the picture before the swap.
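
To make the diagram a little more concrete, here is a minimal Python sketch of which layer keys on which identifier. The class, the field names and the sample values (`LUN-12`, `A1B2C3D4` and so on) are all made up for illustration; nothing here talks to EMC CE or the cluster service.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskIdentity:
    lun_id: str      # what EMC CE keys on (Lun / Device ID)
    signature: str   # what the MS Cluster Service keys on
    disk_id: int     # what DiskMgr keys on


# Which layer keys on which identifier, as per the diagram above.
LAYER_KEYS = {
    "EMC CE": "lun_id",
    "MS Cluster Service": "signature",
    "DiskMgr": "disk_id",
}


def layers_noticing_change(old: DiskIdentity, new: DiskIdentity) -> list[str]:
    """Return the layers whose identifier differs between the two disks."""
    return [layer for layer, key in LAYER_KEYS.items()
            if getattr(old, key) != getattr(new, key)]


# Hypothetical values: same signature and disk ID, but a new Lun from storage.
old = DiskIdentity(lun_id="LUN-12", signature="A1B2C3D4", disk_id=3)
new = DiskIdentity(lun_id="LUN-27", signature="A1B2C3D4", disk_id=3)
print(layers_noticing_change(old, new))  # ['EMC CE']
```

In this toy model, once the signature and disk ID have been carried over, only the EMC CE layer still sees a different disk, which lines up with having to deconfigure and reconfigure the CE group.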

In simple terms, what we need to do (once you've shut down your SQL services etc.) is:
1) Deconfigure Cluster resource (removes the cluster resource)
2) Stop Clustering
3) Present Disk to OS
4) Disk Offset for SQL (format / drive letter etc)
5) Copy data
6) Swap Signatures for Clustering
7) Swap Drive Letters
8) Enable Clustering
9) Configure EMC CE Group
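
On the disk offset step: the detailed steps further down use an offset of 1920 and a 64k allocation size. As a quick sanity check of why those two numbers sit well together, here is a small Python sketch. It assumes the offset is expressed in 512-byte sectors (the post doesn't state the unit, so treat that as my assumption).

```python
SECTOR_BYTES = 512            # assumption: the offset is given in 512-byte sectors
ALLOCATION_UNIT = 64 * 1024   # the 64k allocation size used when formatting


def offset_bytes(offset_sectors: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Convert a partition offset in sectors to bytes."""
    return offset_sectors * sector_bytes


def is_aligned(offset_sectors: int, boundary: int = ALLOCATION_UNIT) -> bool:
    """True if the partition starts on a multiple of the boundary."""
    return offset_bytes(offset_sectors) % boundary == 0


print(offset_bytes(1920))  # 983040 bytes, i.e. 15 x 64k
print(is_aligned(1920))    # True
print(is_aligned(63))      # False (63 sectors is the old Windows default)
```

Under that assumption, 1920 sectors puts the start of the partition on an exact 64k boundary, so each 64k allocation unit starts on a 64k boundary as well.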

But for those of you who need a little more info (I blog this so I too have a copy) – here is what we do:

Provision new disk(s) and replicate these at a storage level (CLARiiON / Symmetrix etc.)

Shut down SQL services (and any other apps that write to this disk)

* EMC CE
Launch EMC Cluster Enabler (screen grab config for reference)
Select Manage cluster
Under storage, make sure disk that has been presented is available to both ends (rescan if needed)
Select Groups -> <Name of group containing disk to be replaced> -> right click -> Deconfigure Group ** This removes the resource, stopping the EMC replication for all objects in that group. Your disk is still available in the cluster, and at this point your server could in theory continue to operate . . though without the ability to fail over to the other cluster node.

Check that both servers see new disks at compmgmt level
Shut down B-Node
In Device Manager, disable 'Cluster Disk Driver' (you'll first have to right-click and select 'Show hidden devices') * This driver is what allows the clustering service to 'own' a disk and manage failover etc. of whole disks.
Set cluster service to Manual (Services.msc)
Bounce A-Node
At cmd use Diskpar (a Windows Resource Kit tool, handy for aligning EMC disks)
Run a DumpCfg and capture the output (also a Resource Kit tool)
diskpar (this will list disks and provide Device IDs etc.)
* recheck disk IDs and write down disk ID and signature numbers!
diskpar -S<New Disk's Number> * follow the prompts and select an offset of 1920 (best practice for a SQL server)
Compmgmt: assign the disk a new drive letter and format it ** Allocation size must be 64k!! (SQL)
Robocopy old drive to new drive
View registry to capture disk signatures (hklm\system\currentcontrolset\services\clusdisk\parameters\signatures) – new disk will be in ‘available’, old one in ‘signatures’
Dumpcfg (will display signature mappings)
{Now swap signatures} as follows:
Dumpcfg -s<New random signature to allocate to old disk> <Old disk number from DeviceMgr> >> Gives new sig to old disk
Dumpcfg -s<Original disk's old signature ID> <New Disk's Device ID> >> Gives original sig to new disk
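
The order of the two Dumpcfg calls matters: the old disk has to give up its signature before the new disk can take it. Here is a rough Python dry-run planner for that swap. It only prints command lines in the same form as written above and never runs anything; the disk numbers and signatures are hypothetical, and the command format is copied from this post rather than checked against the tool's help.

```python
import secrets


def plan_signature_swap(old_disk, new_disk, original_signature, in_use):
    """Return (disk number, signature) pairs in the order they should be applied."""
    taken = {s.upper() for s in in_use} | {original_signature.upper(), "00000000"}
    # Pick a random 32-bit signature that nothing else is using.
    while True:
        spare = f"{secrets.randbits(32):08X}"
        if spare not in taken:
            break
    return [
        (old_disk, spare),                       # 1. free up the original signature
        (new_disk, original_signature.upper()),  # 2. hand it to the new disk
    ]


# Hypothetical example values; take the real ones from your captured output.
plan = plan_signature_swap(old_disk=3, new_disk=5,
                           original_signature="A1B2C3D4",
                           in_use=["0F0E0D0C"])
for disk, signature in plan:
    print(f"dumpcfg -s{signature} {disk}")
```

Writing the plan out first also gives you a record of the throwaway signature that ended up on the old disk, in case you need to back out.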

Swap drive letters in Compmgmt

Enable Cluster disk Driver in Device Manager

At this point, we have reset all identifiers according to above diagram and we need to bring everything back up (so we reverse the process)
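
Before bringing everything back up, it may be worth double checking that the Robocopy step really produced a complete copy. Below is a minimal Python sketch that walks the old drive and reports anything missing or different on the new one. The function name is my own, and it only compares file contents, not NTFS permissions or timestamps.

```python
import filecmp
import os


def tree_differences(src, dst):
    """Return (relative path, problem) pairs for files missing or different in dst."""
    problems = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        for name in files:
            src_file = os.path.join(root, name)
            dst_file = os.path.join(dst, rel, name)
            rel_file = os.path.normpath(os.path.join(rel, name))
            if not os.path.isfile(dst_file):
                problems.append((rel_file, "missing"))
            elif not filecmp.cmp(src_file, dst_file, shallow=False):
                # shallow=False forces a byte-by-byte comparison
                problems.append((rel_file, "differs"))
    return problems
```

You would call it with the old and new drive roots (whatever letters they have at the time) and expect an empty list back. On a big SQL data drive the byte-by-byte compare will take a while, so allow for that in your outage window.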

Bounce A-Node
Start clustering (If OK, Set Service to Auto)
Verify that your drive is accessible etc.
Start B-Node

EMC CE
At the node name level, right-click -> 'Configure CE cluster' -> follow the wizard
