LSI’s MegaRAID series of SAS RAID controllers will halt the system boot process while loading the card’s option ROM (what LSI calls the controller’s BIOS) if any “errors” are detected. Manual intervention is then needed to get the boot process past the option ROM. Occasionally this is a “good thing”: more than once I’ve had a card’s self test discover ECC memory problems. Much more often, this “feature” hangs the boot process because of a “foreign configuration”, i.e., a disk that carries MegaRAID configuration information but isn’t part of an active virtual device. For example, a hot spare moved from one system to another (without yet being reconfigured in the new system) can trigger this “feature”, halting the entire boot.
[Image: Example ECC error]
[Image: Example foreign configuration error]
The MegaRAID Storage Manager GUI shows a value for “Boot Error Handling” but doesn’t allow you to configure it. As with most things, I’m sure MegaCLI has the functionality, but its interface is so hideous that I was unable to find the correct setting. I was just about to break down and email LSI support (yet again) when I decided to give the brand new storcli utility a try. storcli is intended to be a user-friendly replacement for MegaCLI, with an interface like 3Ware’s tw_cli. It only took a few minutes of reading storcli’s command help to figure out how to change this parameter.
    # /opt/MegaRAID/storcli/storcli64 /c0 show bios
    Controller = 0
    Status = Success
    Description = None

    Controller Properties :
    =====================

    -----------------------------------------------
    Ctrl_Prop                        Value
    -----------------------------------------------
    Basic Input/Output System (BIOS) ON
    Auto Boot Select(ABS)            OFF
    BIOS Boot Mode                   Stop On Error
    -----------------------------------------------

    # /opt/MegaRAID/storcli/storcli64 /c0 set BIOSMode help
    Storage Command Line Tool Ver 1.03.11 Jan 30, 2013
    (c)Copyright 2012, LSI Corporation, All Rights Reserved.

    SYNTAX: storcli /cx set BIOSMode=
    DESCRIPTION: Sets the BIOS Boot mode.
    OPTIONS:
    SOE - Stop On Error
    BE - Bypass Error
    HCOE- Headless Continue on Error
    HSM - Headless Safe Mode
    CONVENTION:
    /cx - specifies the controller where X is the controller index

    # /opt/MegaRAID/storcli/storcli64 /c0 set BIOSMode=BE
    Controller = 0
    Status = Success
    Description = None

    Controller Properties :
    =====================

    ----------------
    Ctrl_Prop Value
    ----------------
    BIOS Mode BE
    ----------------

    # /opt/MegaRAID/storcli/storcli64 /c0 show bios
    Controller = 0
    Status = Success
    Description = None

    Controller Properties :
    =====================

    ----------------------------------------------
    Ctrl_Prop                        Value
    ----------------------------------------------
    Basic Input/Output System (BIOS) ON
    Auto Boot Select(ABS)            OFF
    BIOS Boot Mode                   Bypass Error
    ----------------------------------------------
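If you have more than a handful of machines, the check-then-set sequence above could be scripted. What follows is only a rough sketch, not anything LSI ships: the function names `bios_boot_mode` and `ensure_bypass_error` and the `STORCLI` variable are my own inventions, and the parsing assumes the `show bios` table looks exactly like the output above (a row starting with "BIOS Boot Mode"). Other storcli versions may print it differently.

```shell
#!/bin/sh
# Path taken from the session above; override STORCLI if yours differs.
STORCLI=${STORCLI:-/opt/MegaRAID/storcli/storcli64}

# Read `storcli64 /cX show bios` output on stdin and print only the value
# of the "BIOS Boot Mode" row (e.g. "Stop On Error").
# Assumption: the row label is exactly "BIOS Boot Mode", as shown above.
bios_boot_mode() {
    sed -n 's/^[[:space:]]*BIOS Boot Mode[[:space:]]*//p' |
        sed 's/[[:space:]]*$//'
}

# Hypothetical wrapper: switch controller $1 (default /c0) to Bypass Error,
# but only if it currently reports Stop On Error.
ensure_bypass_error() {
    ctrl=${1:-/c0}
    mode=$("$STORCLI" "$ctrl" show bios | bios_boot_mode)
    if [ "$mode" = "Stop On Error" ]; then
        "$STORCLI" "$ctrl" set BIOSMode=BE
    else
        echo "controller $ctrl: BIOS Boot Mode is '$mode', leaving unchanged"
    fi
}
```

Sourcing this file and running `ensure_bypass_error /c0` as root should be equivalent to the manual session, with the difference that it leaves a controller alone if it is already in some other mode.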
And now that I knew the flags were SOE and BE, I was able to figure out the equivalent MegaCLI command as well.
    # MegaCLI64 -h | grep -i bios
    MegaCLI64 -AdpBIOS -Enbl |-Dsbl | -SOE | -BE | EnblAutoSelectBootLd | DsblAutoSelectBootLd | -Dsply -aN|-a0,1,2|-aALL

    # MegaCLI64 -AdpBIOS -BE -aALL
    BIOS is set to Bypass Error on Adapter 0.
    Exit Code: 0x00
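Since the two tools spell the same setting so differently, a small cheat-sheet function can help keep them straight. This is a hypothetical helper of my own (`bios_mode_cmd` is not part of either tool); it only prints the command line rather than running it. It accepts just SOE and BE, because those are the only modes that appear in both help outputs above, and it uses the `/cX` and `-aN` controller selectors from those same help texts.

```shell
#!/bin/sh
# Print (do not run) the command that sets the BIOS boot mode.
#   $1 = tool: storcli | megacli
#   $2 = mode: SOE | BE  (the only modes listed by both tools above)
#   $3 = controller index, default 0
bios_mode_cmd() {
    tool=$1
    mode=$2
    idx=${3:-0}
    case "$mode" in
        SOE|BE) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    case "$tool" in
        storcli) echo "/opt/MegaRAID/storcli/storcli64 /c$idx set BIOSMode=$mode" ;;
        megacli) echo "MegaCLI64 -AdpBIOS -$mode -a$idx" ;;
        *) echo "unknown tool: $tool" >&2; return 1 ;;
    esac
}
```

For example, `bios_mode_cmd megacli BE` prints `MegaCLI64 -AdpBIOS -BE -a0`, which targets a single adapter instead of the `-aALL` used in the session above.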