Many of you have likely hit the same problem I ran into: you have a server in a remote datacenter with only DRAC6 access, and no way to get at the RAID card without a reboot. Well, those days are gone! I have found a way to talk to the RAID card from the running system and do pretty much anything you could do from the BIOS GUI, and more.
First, you need to get the software installed. There’s an awesome article I found that covers that:
http://de.community.dell.com/techcenter/support-services/w/wiki/909.how-to-install-megacli-on-esxi-5-x
It includes the commands to view the log, which is important, but that alone doesn’t let you do much.
View system logs:
Change to the MegaCLI folder with cd /opt/lsi/MegaCLI/ and use this command to display the controller log:
./MegaCli -FwTermLog -Dsply -aALL
To save it to a .txt file, use this:
./MegaCli -FwTermLog -Dsply -aALL > lsi.txt
The log file is then located in /opt/lsi/MegaCLI/.
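If you pull this log regularly, a small wrapper saves retyping. The sketch below assumes MegaCli is installed in /opt/lsi/MegaCLI as described above; the function name save_fw_log and the timestamped file name are my own choices, not part of MegaCli.

```shell
#!/bin/sh
# Sketch: dump the controller firmware log into a timestamped file.
# MEGACLI defaults to the install path used above; override it if yours differs.
MEGACLI="${MEGACLI:-/opt/lsi/MegaCLI/MegaCli}"

# save_fw_log DIR -> writes DIR/lsi-YYYYmmdd-HHMMSS.txt and prints its path
save_fw_log() {
    out="$1/lsi-$(date +%Y%m%d-%H%M%S).txt"
    "$MEGACLI" -FwTermLog -Dsply -aALL > "$out" || return 1
    echo "$out"
}
```

Call it as save_fw_log /tmp and copy the printed file off the server.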
Here’s the list of other commands I have compiled from all over the net:
View information about the RAID adapter
For checking the firmware version, battery back-up unit presence, installed cache memory and the capabilities of the adapter:
# MegaCli -AdpAllInfo -aAll
View information about the battery backup unit state
# MegaCli -AdpBbuCmd -aAll
View information about virtual disks
Useful for checking RAID level, stripe size, cache policy and RAID state:
# MegaCli -LDInfo -Lall -aALL
View information about physical drives
# MegaCli -PDList -aALL
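The full -PDList output is long, so here is a minimal sketch that condenses it to one line per drive with awk. It assumes the output contains lines starting with "Enclosure Device ID:", "Slot Number:" and "Firmware state:"; check those labels against what your own controller prints, since they may differ between MegaCli versions. The function name pd_summary is made up for this example.

```shell
#!/bin/sh
# Sketch: condense "MegaCli -PDList -aALL" output to "enclosure:slot state".
# Assumes the field labels matched below; verify them against your own output.
pd_summary() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/      { print enc ":" slot " " $2 }
    '
}
```

Usage would be MegaCli -PDList -aALL | pd_summary.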
Patrol read
Patrol read is a feature which tries to discover disk errors before it is too late and data is lost. By default it runs automatically (with a delay of 168 hours between patrol reads) and will take up to 30% of IO resources.
To see information about the patrol read state and the delay between patrol read runs:
# MegaCli -AdpPR -Info -aALL
To find out the current patrol read rate:
# MegaCli -AdpGetProp PatrolReadRate -aALL
To reduce patrol read resource usage to 2% in order to minimize the performance impact:
# MegaCli -AdpSetProp PatrolReadRate 2 -aALL
To disable automatic patrol read:
# MegaCli -AdpPR -Dsbl -aALL
To start a manual patrol read scan:
# MegaCli -AdpPR -Start -aALL
To stop a patrol read scan:
# MegaCli -AdpPR -Stop -aALL
You could use the above commands to run patrol read in off-peak times.
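One way to do that is a small wrapper called from cron, assuming automatic patrol read has been disabled first. This is only a sketch: the script path, the function name patrol_read and the cron times in the comment are assumptions of mine; only the MegaCli options come from the commands above.

```shell
#!/bin/sh
# Sketch: start or stop patrol read, intended to be called from cron.
# Example crontab entries (path and times are illustrative):
#   0 1 * * 6  /usr/local/sbin/patrol-read.sh start
#   0 6 * * 6  /usr/local/sbin/patrol-read.sh stop
MEGACLI="${MEGACLI:-MegaCli}"

patrol_read() {
    case "$1" in
        start) "$MEGACLI" -AdpPR -Start -aALL ;;
        stop)  "$MEGACLI" -AdpPR -Stop -aALL ;;
        *)     echo "usage: patrol_read start|stop" >&2; return 2 ;;
    esac
}

# Only dispatch when the script is given an argument.
[ "$#" -eq 0 ] || patrol_read "$@"
```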
Migrate from one RAID level to another
In this example, I migrate the virtual disk 0 from RAID level 6 to RAID 5, so that the disk space of one additional disk becomes available. The second command is used to make Linux detect the new size of the RAID disk.
# /usr/local/sbin/MegaCli -LDRecon -Start -r5 -L0 -a0
# echo 1 > /sys/block/sda/device/rescan
Create a new RAID 5 virtual disk from a set of new hard drives
First we need to know the enclosure and slot number of the hard drives we want to use for the new RAID disk. The first command lists them. Then I add a virtual disk using RAID level 5, followed by the list of drives I want to use, specified in enclosure:slot syntax.
# MegaCli -PDList -aALL | egrep 'Adapter|Enclosure|Slot|Inquiry'
# MegaCli -CfgLdAdd -r5'[252:5,252:6,252:7]' -a0
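Typing the drive list by hand is error-prone, so here is a small sketch that builds the command from enclosure:slot pairs. It only prints the command for review and does not run anything. The function name raid5_create_cmd is hypothetical; the check for at least three drives reflects the RAID 5 minimum.

```shell
#!/bin/sh
# Sketch: build the -CfgLdAdd command for a list of enclosure:slot pairs.
# It only prints the command so you can review it before running it.
# raid5_create_cmd ADAPTER E:S E:S E:S ...
raid5_create_cmd() {
    adapter="$1"; shift
    if [ "$#" -lt 3 ]; then
        echo "RAID 5 needs at least three drives" >&2
        return 1
    fi
    # Join the remaining arguments with commas, e.g. 252:5,252:6,252:7
    drives=$(IFS=,; echo "$*")
    echo "MegaCli -CfgLdAdd -r5'[$drives]' -a$adapter"
}
```

For example, raid5_create_cmd 0 252:5 252:6 252:7 prints the same command as shown above.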
Extending an existing RAID array with a new disk
First check the enclosure device ID and the slot number of the newly added disk with the command above. Then we reconstruct the logical drive, adding the new drive. For a RAID 5 array this command is used:
# MegaCli -LDRecon -Start -r5 -Add -PhysDrv[32:3] -L0 -a0
View reconstruction progress
When reconstructing a RAID array, you can check its progress with this command.
# MegaCli -LDRecon -ShowProg -L0 -a0
(replace -L0 with -L1 for the second virtual disk, and so on)
Configure write-cache to be disabled when battery is broken
# MegaCli -LDSetProp NoCachedBadBBU -LALL -aALL
Change physical disk cache policy
If your system is not connected to a UPS, you should disable the physical disk cache in order to prevent data loss. To view the current setting:
# MegaCli -LDGetProp -DskCache -LAll -aALL
To disable it:
# MegaCli -LDSetProp -DisDskCache -LAll -aALL
To enable it (only do this if you have a UPS and redundant power supplies):
# MegaCli -LDSetProp -EnDskCache -LAll -aALL
General Parameters
The parameter -aN (where N is a number starting with zero or the string ALL) specifies the adapter ID. If you have only one controller it’s safe to use ALL instead of a specific ID, but you’re encouraged to use the ID for everything that makes changes to your RAID configuration.
- Physical drive parameter -PhysDrv [E:S]
For commands that operate on one or more physical drives, the -PhysDrv [E:S] parameter is used, where E is the enclosure device ID in which the drive resides and S is the slot number (starting with zero). You can get the enclosure device ID using MegaCli -EncInfo -aALL. The E:S syntax is also used for specifying the physical drives when creating a new RAID virtual drive (see the RAID 5 creation example above).
- Virtual drive parameter -Lx
The parameter -Lx is used for specifying the virtual drive (where x is a number starting with zero or the string all).
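To follow the advice about using an explicit adapter ID for anything that changes the configuration, a script can refuse ALL up front. A minimal sketch, with a hypothetical helper name adapter_arg:

```shell
#!/bin/sh
# Sketch: accept only a numeric adapter ID for commands that change the
# configuration, so that -aALL cannot be used by accident.
# adapter_arg ID -> prints "-aID", or fails for anything non-numeric
adapter_arg() {
    case "$1" in
        ''|*[!0-9]*) echo "adapter ID must be a number, got '$1'" >&2; return 1 ;;
        *)           echo "-a$1" ;;
    esac
}
```

In a script you might write a=$(adapter_arg "$1") && MegaCli -CfgLdDel -L1 "$a", so that the delete never runs when the ID is missing or is ALL.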
Running the executable can be accomplished by:
shell> /opt/MegaRAID/MegaCli/MegaCli <cmd>
or
shell> cd /opt/MegaRAID/MegaCli
shell> ./MegaCli <cmd>
Gather information
- Controller information
MegaCli -AdpAllInfo -aALL
MegaCli -CfgDsply -aALL
MegaCli -adpeventlog -getevents -f lsi-events.log -a0 -nolog
- Enclosure information
MegaCli -EncInfo -aALL
- Virtual drive information
MegaCli -LDInfo -Lall -aALL
- Physical drive information
MegaCli -PDList -aALL
MegaCli -PDInfo -PhysDrv [E:S] -aALL
- Battery backup information (Cisco MSPs do not have the battery backup unit installed, but in case yours has one)
MegaCli -AdpBbuCmd -aALL
- Check Battery backup warning on boot. If this is enabled on an MSP, it will require manual intervention every time the system boots
MegaCli -AdpGetProp BatWarnDsbl -a0
Controller management
- Silence the active alarm
MegaCli -AdpSetProp AlarmSilence -aALL
- Disable the alarm
MegaCli -AdpSetProp AlarmDsbl -aALL
- Enable the alarm
MegaCli -AdpSetProp AlarmEnbl -aALL
- Disable battery backup warning on system boot
MegaCli -AdpSetProp BatWarnDsbl -a0
- Change the adapter rebuild rate to 60%:
MegaCli -AdpSetProp {RebuildRate -60} -aALL
Virtual drive management
- Create RAID 0, 1, 5 drive
MegaCli -CfgLdAdd -r(0|1|5) [E:S, E:S, ...] -aN
- Create RAID 10 drive
MegaCli -CfgSpanAdd -r10 -Array0[E:S,E:S] -Array1[E:S,E:S] -aN
- Remove virtual drive
MegaCli -CfgLdDel -Lx -aN
Physical drive management
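- Set drive state to offline
MegaCli -PDOffline -PhysDrv [E:S] -aN
- Set drive state to online
MegaCli -PDOnline -PhysDrv [E:S] -aN
- Mark drive as missing
MegaCli -PDMarkMissing -PhysDrv [E:S] -aN
- Prepare drive for removal
MegaCli -PdPrpRmv -PhysDrv [E:S] -aN
- Replace a missing drive
MegaCli -PdReplaceMissing -PhysDrv [E:S] -ArrayN -rowN -aN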
The number N of the array parameter is the Span Reference you get using MegaCli -CfgDsply -aALL and the number N of the row parameter is the Physical Disk in that span or array starting with zero (it’s not the physical disk’s slot!).
- Rebuild drive (drive status should be “Firmware state: Rebuild”)
MegaCli -PDRbld -Start -PhysDrv [E:S] -aN
MegaCli -PDRbld -Stop -PhysDrv [E:S] -aN
MegaCli -PDRbld -ShowProg -PhysDrv [E:S] -aN
MegaCli -PDRbld -ProgDsply -PhysDrv [E:S] -aN
- Clear drive
MegaCli -PDClear -Start -PhysDrv [E:S] -aN
MegaCli -PDClear -Stop -PhysDrv [E:S] -aN
MegaCli -PDClear -ShowProg -PhysDrv [E:S] -aN
- Change a drive from state Unconfigured-Bad to Unconfigured-Good
MegaCli -PDMakeGood -PhysDrv [E:S] -aN
Hot spare management
- Set a global hot spare
MegaCli -PDHSP -Set -PhysDrv [E:S] -aN
- Remove a hot spare
MegaCli -PDHSP -Rmv -PhysDrv [E:S] -aN
- Set a dedicated hot spare
MegaCli -PDHSP -Set -Dedicated -ArrayN,M,... -PhysDrv [E:S] -aN
Walkthrough: Rebuild a drive that is marked ‘Foreign’ when inserted
- Mark the drive as good
MegaCli -PDMakeGood -PhysDrv [E:S] -aALL
- Clear the foreign setting
MegaCli -CfgForeign -Clear -aALL
- Set the drive as a hot spare
MegaCli -PDHSP -Set -PhysDrv [E:S] -aN
Walkthrough: Change/replace a drive
1. Set the drive offline, if it is not already offline due to an error
MegaCli -PDOffline -PhysDrv [E:S] -aN
2. Mark the drive as missing
MegaCli -PDMarkMissing -PhysDrv [E:S] -aN
3. Prepare drive for removal
MegaCli -PDPrpRmv -PhysDrv [E:S] -aN
4. Change/replace the drive
5. If you’re using hot spares then the replaced drive should become your new hot spare drive
MegaCli -PDHSP -Set -PhysDrv [E:S] -aN
6. In case you’re not working with hot spares, you must re-add the new drive to your RAID virtual drive and start the rebuilding
MegaCli -PdReplaceMissing -PhysDrv [E:S] -ArrayN -rowN -aN
MegaCli -PDRbld -Start -PhysDrv [E:S] -aN
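The no-hot-spare path of the walkthrough above can be sketched as a helper that prints the commands in order for one drive. Nothing is executed, which seemed the safer choice for steps that take a drive offline. The function name replace_drive_plan and its argument order are my own invention.

```shell
#!/bin/sh
# Sketch: print the MegaCli commands for replacing a drive that is NOT
# covered by a hot spare, in the order of the walkthrough above.
# Nothing is executed; review the output, then run the lines by hand.
# replace_drive_plan ENCLOSURE SLOT ARRAY ROW ADAPTER
replace_drive_plan() {
    if [ "$#" -ne 5 ]; then
        echo "usage: replace_drive_plan ENCLOSURE SLOT ARRAY ROW ADAPTER" >&2
        return 1
    fi
    pd="[$1:$2]"
    cat <<EOF
MegaCli -PDOffline -PhysDrv $pd -a$5
MegaCli -PDMarkMissing -PhysDrv $pd -a$5
MegaCli -PDPrpRmv -PhysDrv $pd -a$5
# (physically swap the drive here)
MegaCli -PdReplaceMissing -PhysDrv $pd -Array$3 -row$4 -a$5
MegaCli -PDRbld -Start -PhysDrv $pd -a$5
EOF
}
```

For instance, replace_drive_plan 252 4 0 2 0 prints the plan for the drive in enclosure 252, slot 4, which is row 2 of array 0 on adapter 0.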
Gathering Standard logs
On every instance of a hard drive problem with an MSP server, we need to run the following commands to have any information about the problem:
shell> rm -f MegaSAS.log
shell> /opt/MegaRAID/MegaCli/MegaCli -adpallinfo -a0
shell> /opt/MegaRAID/MegaCli/MegaCli -encinfo -a0
shell> /opt/MegaRAID/MegaCli/MegaCli -ldinfo -lall -a0
shell> /opt/MegaRAID/MegaCli/MegaCli -pdlist -a0
shell> /opt/MegaRAID/MegaCli/MegaCli -adpeventlog -getevents -f lsi-events.log -a0 -nolog
shell> /opt/MegaRAID/MegaCli/MegaCli -fwtermlog -dsply -a0 -nolog > lsi-fwterm.log
Collect the MegaSAS.log, lsi-events.log, and the lsi-fwterm.log files from the directory where the commands are run (they can be run from any directory on the MSP server) and attach the logs to the service request. You may use a program such as WinSCP (freeware) to pull the files off of the server.
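If you do this often, the commands above can be wrapped in one function. A rough sketch, assuming the same MegaCli path as above; the function name gather_logs is hypothetical, and the body runs in a subshell so your current directory is left alone.

```shell
#!/bin/sh
# Sketch: run the log-gathering commands above in one go.
# MEGACLI defaults to the path used above; override it if yours differs.
MEGACLI="${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli}"

# gather_logs DIR -> runs the commands inside DIR; with a real MegaCli the
# three log files named above should end up there.
gather_logs() (
    cd "$1" || exit 1
    rm -f MegaSAS.log
    "$MEGACLI" -adpallinfo -a0
    "$MEGACLI" -encinfo -a0
    "$MEGACLI" -ldinfo -lall -a0
    "$MEGACLI" -pdlist -a0
    "$MEGACLI" -adpeventlog -getevents -f lsi-events.log -a0 -nolog
    "$MEGACLI" -fwtermlog -dsply -a0 -nolog > lsi-fwterm.log
)
```

Run it as gather_logs /tmp/raidlogs (the directory must already exist) and pull the files from there.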
I hope having all this info in one place is useful to you. It took me a day to figure it all out.
Mike