ECC – Discover Brocade Switches using SMI Agent

ECC leverages the SNIA SMI standard for discovery of fabric switches.  To add Brocade switches to the ECC repository for monitoring and management, several specific steps must be taken.

Switch Discovery Flow: ECC Repository (SMI Agent Server Credentials) -> SMI Agent (Brocade Credentials) -> Brocade Switch

  1. Download the SMI Agent from the Brocade web site. Note there is a specific SMI agent for ECC 6.1.
  2. Install the SMI agent on a server which has IP connectivity to both the ECC environment and the Brocade switches.  Accept all defaults and configure the SMI Agent to start as a service.
  3. Discover the switches previously added to the SMI Agent by using the Discover -> Connectivity option within ECC. Within the connectivity options, use port 8000 to connect to the SMI Agent.  For the credentials fields, use an account, either local or AD, which has permission to access the server where the SMI Agent is installed.  The account should be persistent, with a password that is set to never expire.
Notes:
The SMI Agent is where you set up connectivity to each Brocade fabric you would like to discover.
After successfully importing the switches, remember to configure collection policies to ensure ECC is updated with new information.

SMI Agent Details
SMI Agent Folder: C:\SMIAgent
SMI Agent Manual Configuration: C:\SMIAgent\Server\bin\Configurationtool.bat
SMI Agent Configuration File: C:\SMIAgent\Server\bin\provider.xml

CLARiiON – RAID 5 Two Disk Fault Recovery

To those of us in the storage admin business, losing two disks in a RAID 5 disk group falls into a special category.  That category is what most like to call a resume generating event, or RGE.  I ran into this specific issue today and survived thanks to a couple of key pieces of information provided by the vendor, and by combing through logs to ensure I executed the recovery process in the correct order.

But there is no recovery, you say? Ah right, so you’ve lost two disks in a RAID group which has only one parity drive.  The LUNs which fall within the RAID group are all offline, and the disks in question show up as “Removed”.  At this point you’re SOL.  Someone is leaning over your shoulder asking a simple question… WHEN WILL MY APPLICATION SERVER BE BACK UP?

On to the recovery steps… when a disk fails, most times it is actively failed by the array itself and not by catastrophic hardware failure.  High CRC error rates found on the drive lead the array to kick the drive out.  With a two disk failure, the array takes a different approach to disks which are “Removed” due to high CRC errors.  The recovery process is quite simple.  Re-insert the second disk which failed.  The array will attempt to copy all of the usable data off it onto a hot spare.  The second disk to fail is used because it holds the most recently updated data, whereas the first disk to be marked failed would not have the updates that were written to the second disk afterwards.

You can check the status of the disk rebuild through Naviseccli.  Run:

naviseccli -h SP_IP_address -user username -scope 0 getdisk 3_4_3 -state -rb

This shows the rebuild status of each LUN found on the disk which is being rebuilt.  Here 3_4_3 identifies the disk as Bus_Enclosure_Disk.
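
A rebuild can run for hours, so re-running that command by hand gets tedious.  Below is a minimal sketch of a polling loop around the same getdisk call.  The function name watch_rebuild, the default interval and the default poll count are illustrative assumptions, not part of Naviseccli; the SP address, username and disk ID are placeholders.  It also assumes naviseccli can authenticate without interaction; otherwise it may prompt for a password on each poll.

```shell
# Poll the rebuild status of one disk a fixed number of times.
# watch_rebuild is a hypothetical helper name; only the naviseccli
# invocation itself comes from the text above.
# Usage: watch_rebuild <SP_IP> <username> <Bus_Enclosure_Disk> [interval_sec] [polls]
watch_rebuild() {
    wr_sp="$1"; wr_user="$2"; wr_disk="$3"
    wr_interval="${4:-300}"   # seconds between polls (assumed default)
    wr_polls="${5:-12}"       # number of polls before stopping (assumed default)
    wr_i=0
    while [ "$wr_i" -lt "$wr_polls" ]; do
        date                  # timestamp each poll so progress can be compared
        naviseccli -h "$wr_sp" -user "$wr_user" -scope 0 getdisk "$wr_disk" -state -rb
        wr_i=$((wr_i + 1))
        # Sleep between polls, but not after the last one.
        if [ "$wr_i" -lt "$wr_polls" ]; then
            sleep "$wr_interval"
        fi
    done
}
```

For example, watch_rebuild SP_IP_address username 3_4_3 600 6 would print the status every ten minutes for roughly an hour.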

Troubleshooting

Issue: In some instances, LUNs found within pools associated with the failed drives can show up in a “Faulted” state.
Resolution: A common way to resolve this issue is to reboot each of the CLARiiON storage processors (SP).  Reboot the first SP and wait 15 to 20 minutes before rebooting the second SP.

EMC SymCLI Proxy (Client to Server) Configuration

The EMC DMX/Symmetrix SymCLI software can be installed in two key ways: SymCLI Server standalone, or SymCLI Server + SymCLI Client.  The SymCLI client leverages the SymCLI Server to run commands against discovered Symmetrix arrays.  Use of the client allows for distributed SymCLI command execution, which can come in handy for scripting backups of key systems or enabling admins from different groups.  Below I’ve outlined the key steps for setting up SymCLI with a proxy host. I have left out the steps for setting up SSL and licensing since they are well documented.

Environment
Linux VM – SymCLI Client
Windows Physical – SymCLI Server & ECC Server (FC + Gatekeepers Mapped & Masked)

1. Map & Mask required Symmetrix Gatekeepers to physical host (FC connectivity required)
2. Install SymAPI on server and discover Symmetrix array (run symcfg discover)
3. Install SymAPI on Client machine
3.1 Update /var/symapi/config/netcnfg file with the following line

SYMAPI_SECURE - TCPIP xxx.xxx.xxx.xxx 2707 ANY

3.2 Run stordaemon start storsrvd on the SymCLI server so that it accepts client connections
3.3 Update the user profile to include the path to the SymAPI binaries and the SYMCLI_CONNECT environment variable, set to the SYMAPI_SECURE service name.

vi .bash_profile

PATH=$PATH:$HOME/bin:/opt/emc/SYMCLI/bin

export SYMCLI_CONNECT=SYMAPI_SECURE
export PATH

4. Run symcfg list -services to verify netcnfg file connection details
5. Run symcfg list to show Symmetrix arrays visible via proxy
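
Step 3.1 is easy to repeat by accident when a client is rebuilt or the setup is scripted.  The sketch below appends the service entry only if one is not already present.  The helper name add_netcnfg_entry is hypothetical, and the entry layout is copied from step 3.1 (written with a plain ASCII hyphen in the second field); it is worth checking that layout against the netcnfg template shipped with your SymAPI version before relying on it.

```shell
# Append the SYMAPI_SECURE service entry from step 3.1 to a netcnfg file,
# unless a line for that service name already exists.
# add_netcnfg_entry is a hypothetical helper; the entry layout is taken
# from the post, with the server address passed in as an argument.
# Usage: add_netcnfg_entry <netcnfg_path> <server_address>
add_netcnfg_entry() {
    nc_file="$1"; nc_addr="$2"
    if [ -z "$nc_file" ] || [ -z "$nc_addr" ]; then
        echo "usage: add_netcnfg_entry <netcnfg_path> <server_address>" >&2
        return 2
    fi
    # Skip if an uncommented line already defines SYMAPI_SECURE.
    if [ -f "$nc_file" ] && grep -q '^SYMAPI_SECURE[[:space:]]' "$nc_file"; then
        echo "SYMAPI_SECURE already defined in $nc_file"
        return 0
    fi
    printf 'SYMAPI_SECURE - TCPIP %s 2707 ANY\n' "$nc_addr" >> "$nc_file"
    echo "added SYMAPI_SECURE entry to $nc_file"
}
```

On the client this would be called as add_netcnfg_entry /var/symapi/config/netcnfg xxx.xxx.xxx.xxx, followed by symcfg list -services to confirm the entry is picked up.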

Key directories
SymAPI binaries /opt/emc/SYMCLI/bin
SymAPI configuration /var/symapi

Key files
SymAPI Client to server configuration /var/symapi/config/netcnfg
User path & environment variable configuration ~/.bash_profile

Troubleshooting
Error: “no devices found”
Fix: Double check that the user in question has all of the search paths and environment variables correctly set within their .bash_profile
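
A quick way to do that double check is to test the two settings from the .bash_profile example directly in the affected user's shell.  This is a minimal sketch: check_symcli_env is a hypothetical helper name, and the expected values are simply the ones used in this post (/opt/emc/SYMCLI/bin on PATH, SYMCLI_CONNECT set to SYMAPI_SECURE).

```shell
# Report whether the current shell has the settings from step 3.3.
# check_symcli_env is a hypothetical helper name; the expected values
# are the ones from the .bash_profile example in this post.
check_symcli_env() {
    ce_rc=0
    # Wrap PATH in colons so the directory matches only as a whole entry.
    case ":$PATH:" in
        *:/opt/emc/SYMCLI/bin:*) echo "PATH: ok" ;;
        *) echo "PATH: /opt/emc/SYMCLI/bin missing"; ce_rc=1 ;;
    esac
    if [ "${SYMCLI_CONNECT:-}" = "SYMAPI_SECURE" ]; then
        echo "SYMCLI_CONNECT: ok"
    else
        echo "SYMCLI_CONNECT: expected SYMAPI_SECURE, got '${SYMCLI_CONNECT:-unset}'"
        ce_rc=1
    fi
    return "$ce_rc"
}
```

Run check_symcli_env as the user who sees the error; a non-zero return means at least one of the two settings is missing from that login environment.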

Error: “The caller is not authorized to perform the requested operation”
Fix: Check whether Symmetrix authentication (symauth) is enabled on the array in question

Update Symmetrix Authentication List
file.txt -> assign user <host:username> to role Admin;
symauth preview -file <file.txt>
symauth commit -file <file.txt>
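
When several users need access, the command file can be generated rather than typed.  The sketch below writes one assign statement per user in the form shown above; write_symauth_file is a hypothetical helper name, the host:username values are placeholders, and it only creates the file (preview and commit are still run by hand).

```shell
# Write a symauth command file that assigns each given user to the Admin role,
# one "assign user ... to role Admin;" statement per line, as in the text above.
# write_symauth_file is a hypothetical helper; host:username values are placeholders.
# Usage: write_symauth_file <output_file> <host:username> [<host:username> ...]
write_symauth_file() {
    sa_out="$1"
    if [ -z "$sa_out" ] || [ "$#" -lt 2 ]; then
        echo "usage: write_symauth_file <output_file> <host:username>..." >&2
        return 2
    fi
    shift                     # remaining arguments are the users
    : > "$sa_out"             # start from an empty file
    for sa_user in "$@"; do
        printf 'assign user %s to role Admin;\n' "$sa_user" >> "$sa_out"
    done
}
```

For example, write_symauth_file file.txt host1:user1 host2:user2 produces a two-line file that can then be passed to the preview and commit steps.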