AEN Error Message Details
001h Controller reset occurred
The 3ware RAID controller has detected a soft reset from the device driver. If the 3ware RAID controller fails to respond to the device driver within a reasonable amount of time, the device driver may issue a soft reset to the 3ware RAID controller and retry the command packet.
002h Degraded unit
An error has been encountered and the array is operating in degraded (non-redundant) mode. The user should replace the bad drive as soon as possible and initiate a rebuild.
Removing a drive from a redundant unit will cause this AEN immediately.
003h Controller error occurred
The 3ware RAID controller has encountered an internal error. Please contact AMCC Customer Support as a replacement board may be required.
004h Rebuild failed
The 3ware RAID controller was unable to complete a rebuild operation. This error can be caused by drive errors on either the source or the destination of the rebuild. However, because ATA drives can reallocate sectors on write errors, the rebuild failure is most likely caused by the source drive of the rebuild detecting a read error. By default, the 3ware RAID controller aborts a rebuild if an error is encountered. To continue on error, set the Continue on Source Error During Rebuild policy for the unit on the Controller Settings page in 3DM.
005h Rebuild completed
The 3ware RAID controller has successfully completed a rebuild. The completion of the rebuild changes the state of the array from rebuilding to OK. The data is now redundant.
006h Incomplete unit detected
At power-on initialization time, or during a rescan, the 3ware RAID controller performs a "rollcall" of all drives attached to the card. After detection of the drives, the 3ware RAID controller then uses an internal algorithm to logically connect drives that belong to the same array. If after rollcall a member of an array is not found, the INCOMPLETE UNIT AEN is sent. Examples of incomplete units are as follows:
Replacing the missing or dead drive and initiating a rebuild will change the state of the array from an incomplete unit to OK. No rebuild is required if you replace the missing drive before loading the driver.
007h Initialize completed
The 3ware RAID controller has completed the initialization sequence of RAID levels 1, 10, 5, or 50. For RAID 5 and RAID 50, the data on the array is read and the resultant parity is written to the parity area on the array. For RAID 1 and 10, one half of the mirror is copied to the other half (mirrors are synchronized).
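As a rough illustration of what initialization produces, the sketch below derives a RAID 5 parity block by XOR-ing the data blocks of one stripe, and synchronizes a mirror by copying one half to the other. The function names and the in-memory byte strings are assumptions made for this example; the controller firmware itself is not documented here.

```python
from functools import reduce

def raid5_parity(data_blocks):
    """Return the XOR parity of the equally sized data blocks of one stripe."""
    if not data_blocks:
        raise ValueError("at least one data block is required")
    size = len(data_blocks[0])
    if any(len(block) != size for block in data_blocks):
        raise ValueError("all blocks in a stripe must have the same size")
    # zip(*blocks) walks the blocks column by column (byte position by byte position).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks))

def sync_mirror(source_half):
    """Return the data to write to the other mirror half (a plain copy)."""
    return bytes(source_half)

stripe = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = raid5_parity(stripe)
```

Because XOR is its own inverse, the same function can reconstruct a missing data block from the parity and the remaining blocks, which is what makes the array redundant once initialization has completed.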
008h Unclean shutdown detected
The 3ware RAID controller can detect whether the system has been shut down via the standard shutdown mechanism of the operating system (clean shutdown). If the system loses power or is reset without going through the normal shutdown procedure, the data on a redundant array may be out of synchronization. The unclean shutdown detection will detect this case and force the array to enter the rebuilding state. This has the effect of synchronizing the array back to a fully redundant state.
To prevent unclean shutdowns, always go through the normal shutdown procedure for the operating system in use, and use an uninterruptible power supply (UPS) to guard against sudden power loss.
009h Drive timeout detected
The 3ware RAID controller has a sophisticated recovery mechanism to handle various types of disk drive failures. One such failure occurs when a command that the 3ware RAID controller has issued to a drive does not complete within a reasonable amount of time. If the 3ware RAID controller detects this condition, it notifies the user, prior to entering the recovery phase, by displaying this AEN.
Possible causes of APORT time-outs include a bad or intermittent disk drive, power cable or interface cable.
00Ah Drive error detected
As part of the recovery mechanism of the 3ware RAID controller, various drive failures can be detected and, if possible, corrected. One such drive failure is when the drive indicates back to the 3ware RAID controller that it was unable to complete a command. If the drive returns an error to the 3ware RAID controller, the user is notified by this AEN.
00Bh Rebuild started
The 3ware RAID controller notifies the user whenever it starts a rebuild. The rebuild start may be user-initiated (by selecting the rebuild button in the 3DM Disk Management Utility), may be auto-initiated by a hot spare failover, or may be started after drive removal or insertion (due to the Auto-Rebuild policy). In any of these cases, the user is notified of the event by this AEN.
00Ch Initialize started
The 3ware RAID controller notifies the user by this AEN whenever it starts an initialization. Initialization either occurs at array creation time for larger RAID 5 or 50 arrays or later during the initial verification of redundant arrays.
00Dh Unit deleted
The unit was deleted.
00Eh Initialize failed
The 3ware RAID controller was unable to complete the initialization. This error can be caused by unrecoverable drive errors. When this error occurs, the unit will go back to degraded mode if possible.
00Fh SMART threshold exceeded
The 3ware RAID controller supports SMART Monitoring, whereby the individual drives automatically monitor certain parametric information such as error rates and retry counts. By monitoring this data, SMART may be able to predict a drive failure before it happens, allowing a user to schedule service of the array before it becomes degraded. The SMART status of each drive attached to the 3ware RAID controller is monitored daily. If a failure of any drive is determined to be likely, the user is notified by this AEN.
3ware recommends that you replace any drive that has exceeded a SMART threshold.
019h Drive removed
Drive removed.
This AEN is posted whenever a drive is removed from the controller while the controller is powered on.
01Ah Drive inserted
Drive inserted.
This AEN is posted whenever a drive is connected to the controller while the controller is powered on.
01Eh Unit inoperable
Unit inoperable. Drive removal caused the unit to become inoperable. This AEN is sent after the offline unit timer (20 seconds) expires. If the unit becomes operational before the timer expires, no AEN is sent because no I/O errors occurred.
01Fh Unit Operational
Unit operational. Drive insertion caused the unit to become operational again. This AEN is sent only after the offline timer expires (20 seconds).
020h Prepare for shutdown (power-off)
[[need definition]]
021h Downgrade UDMA mode
The 3ware RAID controller communicates with the ATA disk drives through the Ultra DMA (UDMA) protocol. This protocol ensures data integrity across the ATA cable by appending a Cyclic Redundancy Check (CRC) to all ATA data that is transferred. If the data becomes corrupted between the drive and the 3ware RAID controller (e.g., because of an intermittent cable connection), the 3ware RAID controller detects this as a UDMA CRC or cable error. The 3ware RAID controller then retries the failed command three times at the current UDMA transfer rate. If the error persists, it lowers the UDMA transfer rate (e.g., from UDMA 100 to UDMA 66) and retries another three times. This AEN is sent to the user when the 3ware RAID controller lowers the UDMA transfer rate.
Possible causes of UDMA CRC errors are bad interface cables or cable routing problems through electrically noisy environments (e.g., cables are too close to the power supply).
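The retry-then-downgrade sequence described for this AEN can be sketched as follows. This is only a model of the behavior stated above: the list of transfer rates, the send_command callable, and the returned tuple are assumptions for illustration, and the sketch stops after one downgrade because the text does not say what happens next.

```python
UDMA_MODES = [133, 100, 66, 33]  # assumed ladder of transfer rates, fastest first
RETRIES_PER_MODE = 3             # from the text: three retries at each rate

def run_with_downgrade(send_command, start_mode=100):
    """Retry a failed command, lowering the UDMA rate once if CRC errors persist.

    send_command(mode) is a hypothetical callable that returns True on success
    and False on a UDMA CRC error. Returns (success, final_mode, aens).
    """
    aens = []
    mode = start_mode
    # Retry three times at the current transfer rate.
    for _ in range(RETRIES_PER_MODE):
        if send_command(mode):
            return True, mode, aens
    index = UDMA_MODES.index(mode)
    if index + 1 >= len(UDMA_MODES):
        return False, mode, aens  # no lower rate left in this sketch
    # Lower the transfer rate, post AEN 021h, and retry another three times.
    mode = UDMA_MODES[index + 1]
    aens.append("021h Downgrade UDMA mode")
    for _ in range(RETRIES_PER_MODE):
        if send_command(mode):
            return True, mode, aens
    return False, mode, aens

def flaky_cable(mode):
    """Simulated link that only transfers cleanly at UDMA 66 or slower."""
    return mode <= 66

result = run_with_downgrade(flaky_cable, start_mode=100)
```

In this simulated case the command succeeds only after the rate drops from UDMA 100 to UDMA 66, so exactly one 021h AEN is recorded.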
022h Upgrade UDMA mode
During the self-test, if a drive is found to not be in the optimal UDMA mode, the controller will upgrade its UDMA mode to be optimal.
023h Sector repair completed
The 3ware RAID controller supports a feature called dynamic sector repair to allow the unit to recover from certain drive errors that would normally result in a degraded array situation. For redundant arrays such as RAID 1, 10, 5, and 50, the 3ware RAID controller essentially has two copies of the user's data available. If a read command to a sector on a disk drive results in an error, the controller reverts to the redundant copy in order to satisfy the host's request. At this point, the 3ware RAID controller has a good copy of the requested data in its cache memory. It will then use this data to force the failing drive to reallocate the bad sector, which essentially repairs the sector. When a sector repair occurs, the user is notified by this AEN.
The fact that a sector repair AEN has been sent to the user is an indication of the presence of grown defects on a particular drive. While typical modern disk drives are designed to allow several hundred grown defects, special attention should be paid to any drive in an array that begins to indicate sector repair messages. This may be an indication of a drive that is beginning to fail. The user may wish to replace the drive, especially if the number of sector repair errors exceeds 3 per month.
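To apply the "more than 3 per month" guideline, sector repair AENs can be tallied per drive. The sketch below assumes the events have already been collected as (port, date) pairs and counts them per calendar month; the event format, the function name, and the calendar-month interpretation are assumptions, not part of 3DM.

```python
from collections import Counter
from datetime import date

REPAIRS_PER_MONTH_LIMIT = 3  # from the text: more than 3 sector repairs per month

def drives_to_review(repair_events):
    """Return the ports whose sector repair count exceeds the limit in any month.

    repair_events is an iterable of (port, date) pairs, one per 023h AEN.
    """
    counts = Counter((port, day.year, day.month) for port, day in repair_events)
    return sorted({port for (port, _, _), n in counts.items()
                   if n > REPAIRS_PER_MONTH_LIMIT})

# Port 2 logs four repairs in March 2006; port 0 logs one.
events = [(2, date(2006, 3, d)) for d in (1, 5, 9, 20)] + [(0, date(2006, 3, 7))]
flagged = drives_to_review(events)
```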
024h Sbuf memory test failed
The 3ware RAID controller, as part of its data integrity features, performs diagnostics on its internal RAM devices. Once a day, a non-destructive test is performed on the cache memory. Failure of the test indicates a failure of a hardware component on the 3ware RAID controller. This AEN is sent to notify the user of the problem. If the controller is still under warranty, contact 3ware Technical Support for a replacement controller.
025h Cache flush failed; some data lost
To improve performance, this 3ware RAID controller features caching layer firmware. For write commands this means that it acknowledges it has completed a write operation before the data is committed to disk. If the 3ware RAID controller cannot commit the data to the media after it has acknowledged the write to the host, this AEN is posted to the user.
Typically, the LOST CACHED WRITE notification would be an indication of a catastrophic failure of the drives in the array, such as loss of power to multiple drives in an array.
026h Drive ECC error reported
This AEN may be sent when a drive returns the ECC error response to a 3ware RAID controller command. The AEN may or may not be associated with a host command. Internal operations such as Background Media Scan post this AEN whenever drive ECC errors are detected.
Drive ECC errors are an indication of a problem with grown defects on a particular drive. For redundant arrays, this typically means that dynamic sector repair would be invoked (see AEN 023h). For non-redundant arrays (JBOD, RAID 0 and degraded arrays), drive ECC errors result in the 3ware RAID controller returning failed status to the associated host command.
027h DCB checksum error detected
The 3ware RAID controller stores certain configuration parameters on a reserved area of each disk drive called the Drive Configuration Block (DCB). As part of power-on initialization, the 3ware RAID controller performs a checksum of the DCB area to ensure consistency. If the checksum fails, the drive's DCB has been corrupted. Please contact 3ware technical support if this error occurs.
028h DCB version unsupported
During the evolution of the 3ware product line, the format of the DCB has been changed to accommodate new features. The DCB format expected by the 3ware RAID controller and the DCB that is written on the drive must be compatible. If an array that was created on a very old 3ware product is connected to a newer 3ware RAID controller, this AEN is posted and the 3ware RAID controller rejects the drive. Please contact 3ware technical support if this event occurs.
029h Verify started
The 3ware RAID controller allows the user to verify data integrity on an array. The verification functions for different RAID levels are as follows:
When the verification starts, this AEN is posted to the user.
02Ah Verify failed
This AEN indicates that the data integrity verification function (see AEN 029h) has terminated with an error. For each RAID level being verified, this may mean:
- Single, JBOD, and Spare. A single drive returned an error, possibly because of a media defect.
- RAID 0. A single drive returned an error, possibly because of a media defect.
- RAID 1. One side of the mirror does not equal the other side.
- RAID 10. One side of the mirror does not equal the other side.
- RAID 5 and 50. The parity data does not equal the user data.
For any RAID type, the most likely cause of the error is a grown defect in the drive. For out-of-synchronization mirrors or parity, the error could be caused by improper shutdown of the array. This possibility applies to RAID 1, 10, 5, and 50. A rebuild will re-synchronize the array.
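The mirror and parity conditions listed above can be expressed as two small checks. This is a minimal sketch over in-memory byte strings; the function names are hypothetical, and XOR parity is assumed for RAID 5 and 50.

```python
def verify_mirror(side_a, side_b):
    """RAID 1 and 10: both sides of the mirror must hold identical data."""
    return bytes(side_a) == bytes(side_b)

def verify_parity(data_blocks, parity_block):
    """RAID 5 and 50: the XOR of the data blocks must equal the stored parity."""
    computed = bytearray(len(parity_block))
    for block in data_blocks:
        if len(block) != len(parity_block):
            raise ValueError("data and parity blocks must have the same size")
        for i, byte in enumerate(block):
            computed[i] ^= byte
    return bytes(computed) == bytes(parity_block)

mirror_ok = verify_mirror(b"data", b"data")
parity_ok = verify_parity([b"\x01", b"\x03"], b"\x02")
parity_stale = verify_parity([b"\x01", b"\x03"], b"\x00")
```

A False result from either check corresponds to the out-of-synchronization case that a rebuild re-synchronizes.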
02Bh Verify completed
This AEN indicates the data integrity verification function (see AEN 029h) was completed successfully.
02Ch Source drive ECC error overwritten
If a read error is encountered during a rebuild and the user chooses to ignore the error, the sector in error is reallocated. The user is notified of the event by this AEN.
02Dh Source drive error occurred
If an error is encountered during a rebuild operation, this AEN is generated if the error was on a source drive of the rebuild. Knowing if the error occurred on the source or the destination of the rebuild is useful for troubleshooting.
02Eh Replacement drive capacity too small
The 3ware RAID controller notifies the user by this AEN when the replacement drive capacity is smaller than required. The replacement drive must have a capacity equal to or greater than that of the drive it is replacing.
02Fh Verify not started; unit never initialized
This AEN will be sent by the controller when a verify operation is attempted but the unit has never been initialized before. The unit will transition to initializing mode.
030h Drive not supported
3ware 8000 and 9000 series Serial ATA controllers support only UltraDMA-100/133 drives when using the parallel-to-serial ATA converter. This AEN indicates that an unsupported drive was detected during rollcall or a hot add. This AEN could also indicate that the converter was jumpered incorrectly. The converter must be jumpered to correspond to UDMA 100 or 133 drives.
032h Spare capacity too small for some units
This AEN is sent by the controller when it finds a valid hot spare but the capacity is not sufficient to use it for a drive replacement.
033h Migration started
This AEN is sent when migration of a unit is started.
034h Migration failed
This AEN is sent when migration of a unit fails. Look at the Alarms page for other entries that will give you an idea of why the migration failed (such as a drive error on a specific port).
035h Migration completed
This AEN is sent when migration of a unit is complete. The new capacity is now ready to be used. If the capacity of the array did not change, then you don't need to do anything else. If the capacity of the migrated array is larger, please refer to the part of this document on migration for information on how to change the file system to use the new capacity.
036h Verify fixed data/parity mismatch
This AEN is sent by the controller when a verify error is found (parity inconsistency for a RAID 5/50 configuration or data mismatch for a RAID 1/10 configuration) and recovered. If the error is not recovered, AEN_VERIFY_FAILED (see AEN 02Ah) is sent instead.
037h SO-DIMM not compatible
This AEN will be sent if an incompatible SO-DIMM memory module has been connected to the controller. In this case, the controller is inoperable.
038h SO-DIMM not detected
This AEN will be sent if there is no SO-DIMM memory module connected to the controller. In this case, the controller is inoperable.
039h Buffer ECC error corrected
This AEN will be sent when the controller has detected and corrected a memory ECC error.
03Ah Drive power on reset detected
If the controller detects that a drive has been power-cycled, it will send this AEN. The controller may degrade the unit (if possible).
03Bh Rebuild paused
This AEN will be sent when the rebuild operation is paused.
Rebuilds are normally paused for ten minutes after a system first boots up and during non-scheduled times when scheduling is enabled.
03Ch Initialize paused
This AEN will be sent when the initialization is paused.
Initializations are normally paused for ten minutes after a system first boots up and during non-scheduled times when scheduling is enabled. Initializations follow the rebuild schedule.
03Dh Verify paused
This AEN will be sent when the verify operation is paused.
Verifies are normally paused for ten minutes after a system first boots up and during non-scheduled times when scheduling is enabled.
03Eh Migration paused
This AEN is sent when migration is paused. Migration follows the rebuild schedule. For more information, see Scheduling Background Tasks.
03Fh Flash file system error detected
The 3ware RAID controller stores some configuration parameters as files in its flash memory. This AEN will be sent when a corrupted flash file system is found on the controller during boot-up. A further attempt will be made to repair the flash file system. These files usually get corrupted when a flash operation is interrupted by events such as power failures.
040h Flash file system repaired
This AEN will be sent if a corrupted flash file system is successfully repaired. Some of the flash files with insufficient data may be lost in the operation. The configuration parameters which are lost will then return to their default values.
041h Unit number assignments lost
The 3ware RAID controller tries to keep the unit numbers persistent across soft resets. This AEN will be sent if unit number assignments were lost for an unknown reason. (This event rarely happens. Please contact AMCC 3ware technical support if this event occurs.)
042h Primary DCB read error occurred
This AEN will be sent when the controller encounters an error reading the primary copy of the Disk Configuration Block (DCB). The backup copy of the DCB will be read if this error occurs. If a valid DCB is found, the primary DCB is re-written to rectify the errors found.
043h Backup DCB read error detected
This AEN will be sent when the controller detects a latent error in the backup Disk Configuration Block (DCB). The controller attempts to read the backup DCB even when the primary DCB is read successfully. An error found here is a latent error that needs to be addressed before any future errors occur, so the backup DCB is re-written from the primary copy to rectify it. A scrubbing activity is also started to repair any sector errors.
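The interplay between the primary and backup DCB copies (AENs 042h and 043h) can be sketched as follows. The reader and writer callables are hypothetical stand-ins for drive I/O, a read failure is modeled as an OSError, and the case where both copies fail is left undefined because the text does not describe it.

```python
def load_dcb(read_primary, read_backup, write_primary, write_backup):
    """Read both DCB copies and repair whichever one failed from the other.

    All four arguments are hypothetical callables standing in for drive I/O.
    Returns (dcb, aens); dcb is None if neither copy could be read.
    """
    aens = []
    primary = backup = None
    try:
        primary = read_primary()
    except OSError:
        aens.append("042h Primary DCB read error occurred")
    # The backup copy is read even when the primary read succeeded.
    try:
        backup = read_backup()
    except OSError:
        aens.append("043h Backup DCB read error detected")
    if primary is None and backup is not None:
        write_primary(backup)   # re-write the primary from the valid backup
        return backup, aens
    if primary is not None and backup is None:
        write_backup(primary)   # re-write the backup from the primary copy
    return primary, aens

def failing_read():
    raise OSError("simulated read error")

written = {}
dcb, aens = load_dcb(
    failing_read,
    lambda: b"cfg",
    lambda data: written.__setitem__("primary", data),
    lambda data: written.__setitem__("backup", data),
)
```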
044h Battery voltage is normal
The Battery Backup Unit measures and evaluates the battery pack voltage on a continuous basis. If the voltage falls outside the acceptable range and then returns to the acceptable range, this AEN will be posted to the host.
045h Battery voltage is low
The Battery Backup Unit measures and evaluates the battery pack voltage on a continuous basis. If the voltage is below the warning threshold, this AEN will be posted to the user. When this event happens, the Battery Backup Unit is still able to back up the 3ware RAID controller, but the user should replace the battery.
046h Battery voltage is high
The Battery Backup Unit measures and evaluates the battery pack voltage on a continuous basis. If the voltage is above the warning threshold, this AEN will be posted to the user. When this event happens, the Battery Backup Unit is still able to back up the 3ware RAID controller, but the user should replace the battery.
047h Battery voltage is too low
The Battery Backup Unit measures and evaluates the battery pack voltage on a continuous basis. If the voltage is too low to operate, this AEN will be posted to the user. This indicates that the battery pack must be replaced. The Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller.
048h Battery voltage is too high
The Battery Backup Unit measures and evaluates the battery pack voltage on a continuous basis. If the voltage is too high to operate, this AEN will be posted to the user. The Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller. This indicates that the Battery Backup Unit must be replaced.
049h Battery temperature is normal
The Battery Backup Unit measures and evaluates the battery pack temperature on a continuous basis. If the temperature falls outside the acceptable range and then returns to the acceptable range, this AEN will be posted to the host.
04Ah Battery temperature is low
The Battery Backup Unit measures and evaluates the battery pack temperature on a continuous basis. If the temperature is below a warning threshold, this AEN will be posted to the user. When this event happens, the Battery Backup Unit is still able to back up the 3ware RAID controller, but the user should replace the battery pack.
04Bh Battery temperature is high
The Battery Backup Unit measures and evaluates the battery pack temperature on a continuous basis. If the temperature is above a warning threshold, this AEN will be posted to the user. The user should check that there is enough airflow around the Battery Backup Unit. When this event happens, the Battery Backup Unit is still able to back up the 3ware RAID controller, but the user should replace the battery pack if the temperature warning persists.
04Ch Battery temperature is too low
The Battery Backup Unit measures and evaluates the battery pack temperature on a continuous basis. If the temperature is too low to operate, this AEN will be posted to the user. When this event happens, the Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller. The user must replace the battery pack.
04Dh Battery temperature is too high
The Battery Backup Unit measures and evaluates the battery pack temperature on a continuous basis. If the temperature is too high to operate, this AEN will be posted to the user. The user should check that there is enough airflow around the Battery Backup Unit. When this event happens, the Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller. The user must replace the battery pack if the temperature error persists. The use of a PCI card in the slot adjacent to the BBU is not recommended and may result in the battery temperature limit being exceeded.
04Eh Battery capacity test started
This AEN is posted when the Battery Backup Unit starts a battery test. The test estimates the battery capacity in hours, which is how long the Battery Backup Unit can back up the 3ware RAID controller. This test performs a full battery charge/discharge/re-charge cycle and may take up to 20 hours to complete. During this test the Battery Backup Unit cannot back up the 3ware RAID controller; all units have their write cache disabled until the test completes.
04Fh Cache synchronization skipped
The 3ware RAID controller performs cache synchronization when system power is restored following a power failure. This AEN is posted when the cache synchronization was skipped and write data is still being backed up in the controller cache. This can occur if a unit that was present before the power failure was physically removed or became inoperable before system power was restored.
050h Battery capacity test completed
This AEN is posted when the Battery Backup Unit completes a battery capacity test. All units will have their write cache settings restored to their original values since the Battery Backup Unit is now able to back up the 3ware RAID controller.
051h Battery health check started
The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This AEN is posted to the host when this health check is started.
052h Battery health check completed
The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This AEN is posted to the host when this health check has completed.
053h Battery capacity test is overdue
The recommended time interval for running the battery capacity test is once every 4 weeks. If a battery capacity test has not been completed in the last 4 weeks, this AEN will be sent to the host, and then once every week thereafter until a test is completed.
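The reminder cadence for this AEN (first after 4 weeks without a completed test, then weekly) can be sketched as a simple date calculation. The function name and the assumption that the first reminder falls exactly 28 days after the last completed test are illustrative only.

```python
from datetime import date, timedelta

TEST_INTERVAL = timedelta(weeks=4)      # from the text: test once every 4 weeks
REMINDER_INTERVAL = timedelta(weeks=1)  # from the text: weekly reminders thereafter

def overdue_aen_dates(last_test, until):
    """Return the dates on which AEN 053h would be posted, up to and including `until`."""
    dates = []
    due = last_test + TEST_INTERVAL
    while due <= until:
        dates.append(due)
        due += REMINDER_INTERVAL
    return dates

reminders = overdue_aen_dates(date(2006, 1, 1), date(2006, 2, 15))
```

With a last completed test on 1 January 2006, this yields reminders on 29 January, 5 February, and 12 February.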
055h Battery charging started
This AEN is posted when the Battery Backup Unit starts a battery charge cycle.
056h Battery charging completed
This AEN is posted when the Battery Backup Unit completes a battery charge cycle.
057h Battery charging fault
This AEN is posted when the charger of the Battery Backup Unit has detected a battery fault during a charge cycle. The Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller.
058h Battery capacity is below warning level
The measured capacity of the battery is below the warning level. When this occurs, the Battery Backup Unit is still able to back up the 3ware RAID controller, but it signals that the battery pack should be replaced soon.
059h Battery capacity is below error level
The measured capacity of the battery is below the error level. When this occurs, the Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller. The user must replace the battery pack.
05Ah Battery is present
This AEN is posted to the host when the Battery Backup Unit detects that a battery pack has been connected.
05Bh Battery is not present
This AEN is posted to the host when the Battery Backup Unit detects that the battery pack has been removed.
05Ch Battery is weak
The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This AEN is posted when the result of the health test is below the warning threshold. This indicates that the battery pack should be replaced soon because the battery is becoming weak.
05Dh Battery health check failed
The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This AEN is posted when the result of the health test is below the fault threshold. This indicates that the battery pack must be replaced. The Battery Backup Unit becomes not ready and is unable to back up the 3ware RAID controller.
05Eh Cache synchronization completed
This AEN is also sent when drive insertion causes a unit to become operational and the retained write cache data is flushed.
The 3ware RAID controller performs cache synchronization when system power is restored following a power failure. This AEN is posted for each unit when the cache synchronization completed successfully.
05Fh Cache synchronization failed; some data lost
The 3ware RAID controller performs cache synchronization when system power is restored following a power failure. This AEN is posted when cache synchronization was not successful for some reason.
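The codes in this list are written as three hexadecimal digits followed by "h". When processing logs, it can help to parse that notation and look up the title. The sketch below is a hypothetical helper with a small excerpt of the titles listed in this section; it is not part of any 3ware tool.

```python
# Excerpt of the AEN titles listed in this section (not the complete list).
AEN_TITLES = {
    0x001: "Controller reset occurred",
    0x002: "Degraded unit",
    0x004: "Rebuild failed",
    0x005: "Rebuild completed",
    0x021: "Downgrade UDMA mode",
    0x023: "Sector repair completed",
    0x02A: "Verify failed",
}

def parse_aen_code(text):
    """Convert a code written as in this list (for example '02Ah') to an integer."""
    text = text.strip()
    if not text.lower().endswith("h"):
        raise ValueError(f"expected a hex code ending in 'h', got {text!r}")
    return int(text[:-1], 16)

def describe_aen(text):
    """Return the code in this document's notation followed by its title."""
    code = parse_aen_code(text)
    return f"{code:03X}h {AEN_TITLES.get(code, 'Unknown AEN')}"

label = describe_aen("02ah")
```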
AMCC www.3ware.com Direct: (408) 542-8800 Toll Free: (800) 840-6055 email: 3waresupport@amcc.com
Copyright (c) 2004-2006, AMCC