Volume errors are typically transactions left unfinished during a system crash of some kind. This type of error is fixed automatically during volume mount by the NSS journaling feature. Journaling in NSS handles the same level of problems as Vrepair does on NetWare Traditional volumes.
If errors persist after you mount the volume, or if you cannot mount the volume, first rule out hardware causes for the problems. For information, see Section 17.2.2, Ruling Out Hardware Causes.
If a volume cannot be mounted or problems persist after journaling errors are resolved, check the hardware for faulty media or controller problems.
Make sure you have a good backup of the data.
Use the latest diagnostic software and utilities from the manufacturer of your hard drives and controllers to troubleshoot the hard drives without destroying the data.
For example, verify the media integrity and that devices are operating correctly.
If necessary, repair the media or controllers.
If errors persist after you have ruled out hardware causes, and you do not have a viable backup to restore to the last known good state, you should check the pool for metadata inconsistencies. For information, see Section 17.2.3, Verifying the Pool to Identify Metadata Inconsistencies.
The verify process is a read-only assessment of the pool. The Pool Verify option searches the pool for inconsistent data blocks or other errors in the file system’s metadata and reports data in the verification log. For information on where to find the verification log and how to interpret any reported errors, see Section 17.2.4, Reviewing Log Files for Metadata Consistency Errors.
Follow one of these procedures to verify the pool:
For a 32-bit machine, make sure you have enough space available in the Linux kernel cache memory to run a pool verify.
When running ravsui(8) for a pool verify or a pool rebuild on Linux, the utility needs contiguous space in kernel memory separate from the space allocated to the core NSS process. The larger the pool, the larger the space that is needed. To make space available, you might need to reduce the space used by other processes. You can optionally reduce the minimum number of buffers reserved for the core NSS process to as little as 10,000 4-KB buffers.
Open a terminal console as the root user.
At the console prompt, enter
nsscon
In nsscon, enter
nss MinBufferCacheSize=10000
Place the pool in maintenance mode.
At a terminal prompt, enter
nsscon
In nsscon, enter
nss /PoolMaintenance=poolname
Start the pool verify by entering the following at the terminal console prompt:
ravsui verify poolname
Use RAVVIEW to read the logs.
For information about using RAVVIEW, see Section B.17, RAVVIEW (Linux).
Do one of the following:
If the log reports no errors with the pool’s metadata, it is safe to activate the pool and mount the volumes.
If the log reports no errors with the pool’s metadata, but you still cannot create files or directories, run a Pool Rebuild with the ReZID option. For information, see Section 17.3, ReZIDing Volumes in an NSS Pool.
If the log reports errors with the pool’s metadata, the affected volumes remain in Maintenance mode. Decide whether to rebuild the pool based on the type of error and potential outcomes. For information about rebuilding the pool, see Section 17.2.5, Rebuilding NSS Pools to Repair Metadata Consistency.
For a 32-bit machine, if you modified the MinBufferCacheSize setting in Step 1, you can change it back to its original setting now, unless you are continuing with a pool rebuild.
Open a terminal console as the root user.
At the console prompt, enter
nsscon
In nsscon, enter
nss MinBufferCacheSize=value
Replace value with the desired minimum number of 4-KB buffers. The default value is 30000.
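To put the buffer counts in perspective, the following Python sketch converts a MinBufferCacheSize value into the amount of memory it reserves. It assumes that each "4-KB buffer" is exactly 4096 bytes. The function name buffer_cache_mib is hypothetical and is not part of NSS.

```python
BUFFER_SIZE_BYTES = 4 * 1024  # assumption: one 4-KB buffer is 4096 bytes


def buffer_cache_mib(buffer_count):
    """Return the memory held by buffer_count 4-KB buffers, in MiB."""
    if buffer_count < 0:
        raise ValueError("buffer_count must not be negative")
    return buffer_count * BUFFER_SIZE_BYTES / (1024 * 1024)


if __name__ == "__main__":
    # 10000 is the documented lower limit; 30000 is the documented default.
    for count in (10000, 30000):
        print(f"{count} buffers = {buffer_cache_mib(count):.1f} MiB")
```

Under this assumption, lowering the setting from the default of 30000 to 10000 releases roughly 78 MiB for the verify or rebuild process.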
Place the pool in maintenance mode by entering the following at the server console prompt:
nss /PoolMaintenance=poolname
Verify the pool by entering the following at the server console prompt:
nss /poolverify=poolname
Review any errors on-screen or in the volume_name.vlf file, located at the root of the DOS drive.
Do one of the following:
If the log reports no errors with the pool’s metadata, the pools and volumes are automatically activated. It is safe to mount the volumes.
If the log reports no errors with the pool’s metadata, but you still cannot create files or directories, run a Pool Rebuild with the ReZID option. For information, see Section 17.3, ReZIDing Volumes in an NSS Pool.
If the log reports errors with the pool’s metadata, the volumes affected remain in Maintenance mode. Decide whether to rebuild the pool based on the type of error and potential outcomes. For information about rebuilding the pool, see Section 17.2.5, Rebuilding NSS Pools to Repair Metadata Consistency.
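The three outcomes of a pool verify can be summarized as a small decision function. The following Python sketch is only an illustration of the choices described above; the function name and the returned strings are hypothetical, and the decision to rebuild still depends on the type of error reported.

```python
def next_step_after_verify(metadata_errors, can_create_files):
    """Suggest the next action after a pool verify.

    metadata_errors:  True if the verify log reports metadata errors.
    can_create_files: True if files and directories can be created.
    """
    if metadata_errors:
        # Affected volumes stay in Maintenance mode; weigh a rebuild.
        return "consider a pool rebuild (Section 17.2.5)"
    if not can_create_files:
        # Clean log, but file creation fails: a ReZID might be needed.
        return "run a pool rebuild with the ReZID option (Section 17.3)"
    return "activate the pool and mount the volumes"


if __name__ == "__main__":
    print(next_step_after_verify(metadata_errors=False, can_create_files=True))
```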
Make sure to check the error log whenever an NSS volume does not come up in active mode after a verify or rebuild.
Messages are written to the following logs:
Table 17-1 Location of Log Files for the NSS Pool Verify and Pool Rebuild Utilities
Platform | Log | Purpose |
---|---|---|
Linux | /var/opt/novell/log/nss/rav/filename.vbf (This is the default location, but you can specify the location and the filename.) | Log of the pool verify process using ravsui verify. If a volume has errors, the errors are displayed on the screen and written to this log file of errors and transactions. On Linux, use the RAVVIEW utility to read logs. For information, see Section B.17, RAVVIEW (Linux). |
Linux | /var/opt/novell/log/nss/rav/filename.rtf | Log of the pool rebuild process using ravsui rebuild. This log contains information about data that has been lost during a rebuild by the pruning of leaves in the data structure. |
NetWare | filename.vlf, located at the root of the server’s DOS drive. | Log of the pool verify process using poolverify. If a volume in the pool has errors, the errors are displayed on the screen and written to this log file of errors and transactions. |
NetWare | filename.rlf, located at the root of the server’s DOS drive. | Log of the pool rebuild process using poolrebuild. This log contains information about data that has been lost during a rebuild by the pruning of leaves in the data structure. |
Whenever you verify or rebuild a pool, the new information is appended at the end of the log file. If you want to keep old log files intact, rename the log file or move it to another location before you start the verify or rebuild process.
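Because new results are appended to an existing log, you might want to move the old log aside before each run. The following Python sketch shows one way to do that on a Linux server. The function name archive_log, the archive directory, and the timestamp suffix are assumptions made for illustration; they are not NSS conventions.

```python
import shutil
import time
from pathlib import Path


def archive_log(log_path, archive_dir):
    """Move an existing log into archive_dir under a timestamped name.

    Returns the new path, or None if there is no log to move.
    """
    log = Path(log_path)
    if not log.is_file():
        return None
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: original name plus date and time.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{log.name}.{stamp}"
    shutil.move(str(log), str(dest))
    return dest
```

For example, calling archive_log on the verify log in /var/opt/novell/log/nss/rav before running ravsui verify leaves the next run to start a fresh file. Two calls within the same second produce the same destination name, so a production script would need a more robust naming scheme.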
Warnings indicate that there are problems with the metadata, but that there is no threat of data corruption. Restoring data from a backup tape or rebuilding the pool’s metadata is optional. However, rebuilding a pool’s metadata typically results in some data loss.
Errors indicate that there are physical integrity problems with the pool’s metadata, and data corruption will definitely occur, or it will continue to occur, if you continue to use the pool as it is.
If you decide to rebuild the pool, use the Pool Rebuild utility. For information, see Section 17.2.5, Rebuilding NSS Pools to Repair Metadata Consistency.
If the verify log does not report errors, but you are still unable to create files or directories on volumes in the pool, it might be because the files’ ID numbers have exceeded the maximum size of the file numbering field. You might need to rebuild the pool with the ReZID option. For information about how to decide if a ReZID is needed, see Section 17.3, ReZIDing Volumes in an NSS Pool.
The purpose of a pool rebuild is to repair the metadata consistency of the file system. Rebuild uses the existing leaves of an object tree to rebuild all the other trees in the system to restore visibility of files and directories. It checks all blocks in the system. Afterwards, the NSS volume remains in maintenance mode if there are still errors in the data structure; otherwise, it reverts to the active state.
WARNING: Data will be lost during the rebuild.
A pool rebuild depends on many variables, so it is difficult to estimate how long it might take. The number of storage objects in a pool, such as volumes, directories, and files, is the primary consideration in determining the rebuild time, not the size of the pool. This is because a pool rebuild is reconstructing the metadata for the pool, not its data. For example, it would take longer to rebuild the metadata for a 200 GB pool with many files than for a 1 TB pool with only a few files. Other key variables are the number of processors, the speed of the processors, and the size of the memory available in the server.
You do not need to bring down the server to rebuild a pool. NSS allows you to temporarily place an individual storage pool in maintenance mode while you verify or rebuild it. While the pool is deactivated, users do not have access to any of the volumes in that pool.
If you do not place the pool in maintenance mode before issuing the rebuild or verify commands, you receive NSS Error 21726:
NSS error: PoolVerify results
Status: 21726
Name: zERR_RAV_STATE_MAINTENANCE_REQUIRED
Source: nXML.cpp[1289]
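If you capture console output in a script, the error report above can be split into its fields to detect this condition. The following Python sketch assumes the report always uses the "key: value" layout shown in this example; the function names are hypothetical and are not part of any NSS utility.

```python
SAMPLE = """NSS error: PoolVerify results
Status: 21726
Name: zERR_RAV_STATE_MAINTENANCE_REQUIRED
Source: nXML.cpp[1289]"""


def parse_nss_error(text):
    """Split 'key: value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def needs_maintenance_mode(text):
    """Return True if the report carries status 21726."""
    return parse_nss_error(text).get("Status") == "21726"


if __name__ == "__main__":
    print(parse_nss_error(SAMPLE)["Name"])
    print(needs_maintenance_mode(SAMPLE))
```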
Follow one of these procedures to rebuild the pool:
Depending on the nature of the reported errors, you might want to open a call with Novell Support before you begin the rebuild process.
For a 32-bit machine, make sure you have enough space available in the Linux kernel cache memory to run a pool rebuild.
When running ravsui(8) for a pool verify or a pool rebuild on Linux, the utility needs contiguous space in kernel memory separate from the space allocated to the core NSS process. The larger the pool, the larger the space that is needed. To make space available, you might need to reduce the space used by other processes. You can optionally reduce the minimum number of buffers reserved for the core NSS process to as little as 10,000 4-KB buffers.
Open a terminal console as the root user.
At the console prompt, enter
nsscon
In nsscon, enter
nss MinBufferCacheSize=10000
Place the pool in maintenance mode.
At a terminal prompt, enter
nsscon
In nsscon, enter
nss /PoolMaintenance=poolname
Start the pool rebuild. At the terminal console prompt, enter
ravsui rebuild poolname
For information about options to set the pruning parameters for the rebuild, see Section B.16, RAVSUI (Linux).
Rebuilding can take several minutes to several hours, depending on the number of storage objects in the pool.
Review the log on-screen or in the filename.rtf file to learn what data has been lost during the rebuild.
For information, see Section 17.2.4, Reviewing Log Files for Metadata Consistency Errors.
Do one of the following:
No Errors: If errors do not exist at the end of the rebuild, the pool’s volumes are available for mounting.
Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then call Novell Support for assistance.
For a 32-bit machine, if you modified the MinBufferCacheSize setting in Step 2, you can change it back to its original setting.
Open a terminal console as the root user.
At the console prompt, enter
nsscon
In nsscon, enter
nss MinBufferCacheSize=value
Replace value with the desired minimum number of 4-KB buffers. The default value is 30000.
Depending on the nature of the reported errors, you might want to open a call with Novell Support before you begin the rebuild process.
Place the pool in maintenance mode. At the server console prompt, enter
nss /PoolMaintenance=poolname
To run Rebuild, enter the following command at the server console:
nss /poolrebuild=poolname
Replace poolname with the name of the pool you want to rebuild.
Rebuilding can take several minutes to several hours, depending on the number of objects in the pool.
Read the filename.rlf file at the root of the DOS drive on your server for information about data that has been lost.
For information, see Section 17.2.4, Reviewing Log Files for Metadata Consistency Errors.
Do one of the following:
No Errors: If errors do not exist at the end of the rebuild, the pool is activated automatically. It is safe to mount the volumes.
Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then call Novell Support for assistance.
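The verify and rebuild procedures share the same command pattern on each platform. The following Python sketch only assembles the command strings given in this section so that the sequences can be compared side by side; it does not run anything. The function name rav_commands and the "where to enter it" labels are hypothetical.

```python
def rav_commands(platform, operation, pool):
    """Return (where, command) pairs for a pool verify or rebuild.

    platform:  "linux" or "netware"
    operation: "verify" or "rebuild"
    pool:      name of the pool
    """
    if operation not in ("verify", "rebuild"):
        raise ValueError("operation must be 'verify' or 'rebuild'")
    if platform == "linux":
        return [
            ("terminal", "nsscon"),
            ("nsscon", f"nss /PoolMaintenance={pool}"),
            ("terminal", f"ravsui {operation} {pool}"),
        ]
    if platform == "netware":
        return [
            ("server console", f"nss /PoolMaintenance={pool}"),
            ("server console", f"nss /pool{operation}={pool}"),
        ]
    raise ValueError("platform must be 'linux' or 'netware'")


if __name__ == "__main__":
    # POOL1 is a placeholder pool name.
    for where, command in rav_commands("linux", "rebuild", "POOL1"):
        print(f"{where}: {command}")
```

In every case, the pool is placed in maintenance mode before the verify or rebuild command is issued, which avoids NSS Error 21726.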