RDAC multipathing in AIX - LUNs only available through one dac path
RDAC stands for Redundant Disk Array Controller and is an I/O path failover driver installed on the host computers that access the storage subsystem.
In this scenario we have an IBM p-series server running AIX directly connected to a DS4700 disk array via two paths.
The host AIX server is using the RDAC multipathing driver for resilience.
However, this configuration is not working as it should. Half of the LUNs mapped from the DS4700 should be presented through each path, but all LUNs are accessible through one path only.
As can be seen from the output of "fget_config -Av" below, all paths to the LUNs on the DS4700 are via dac0.
It also shows that dacNONE is active rather than dac1.
fget_config -Av
---dar0---
User array name = 'DS4700'
dac0 ACTIVE dacNONE ACTIVE
Disk     DAC   LUN  Logical Drive
utm            31
hdisk16  dac0  0    oradata1
hdisk17  dac0  1    oradata2
hdisk18  dac0  2    oradata3
hdisk19  dac0  3    archive
hdisk20  dac0  4    orabackup
hdisk21  dac0  5    app1
hdisk22  dac0  6    oralogs
hdisk23  dac0  7    oratemp
The dar listed above is the disk array router and represents the entire array, including the current and the deferred paths to all LUNs (hdisks).
The dacs listed above are the disk array controller devices; each one represents a controller within the storage subsystem.
Searching for additional information using "lsdev -H -Cc disk" as shown below, we can see that the locations for the DS4700 disks are 09-08-01 and 05-08-01.
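The imbalance described above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the "fget_config -Av" output has the column layout shown in this article (disk name in the first field, dac in the second). The function names luns_per_dac and has_missing_dac are hypothetical helpers, not part of AIX or the RDAC driver.

```shell
#!/bin/sh
# Hypothetical helpers: both read "fget_config -Av" output on stdin.
# Assumes the layout shown in this article: hdiskN in field 1, dacN in field 2.

# Print one line per dac with the number of hdisks currently routed through it.
luns_per_dac() {
    awk '$1 ~ /^hdisk[0-9]+$/ && $2 ~ /^dac/ { n[$2]++ }
         END { for (d in n) print d, n[d] }' | sort
}

# Succeed (exit status 0) if the output reports a controller as dacNONE.
has_missing_dac() {
    grep -q 'dacNONE'
}

# Usage on an AIX host (not executed here):
#   fget_config -Av | luns_per_dac
#   fget_config -Av | has_missing_dac && echo "a controller path is missing"
```

With the output shown above, luns_per_dac would report all eight hdisks against dac0, and has_missing_dac would succeed because of the dacNONE entry.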
lsdev -H -Cc disk
name     status     location      description
hdisk0   Available  03-08-01-3,0  16 Bit LVD SCSI Disk Drive
hdisk1   Available  03-08-01-4,0  16 Bit LVD SCSI Disk Drive
hdisk2   Available  03-08-01-5,0  16 Bit LVD SCSI Disk Drive
hdisk3   Available  03-08-01-8,0  16 Bit LVD SCSI Disk Drive
hdisk4   Available  00-08-01-3,0  16 Bit LVD SCSI Disk Drive
hdisk5   Available  00-08-01-4,0  16 Bit LVD SCSI Disk Drive
hdisk6   Available  00-08-01-5,0  16 Bit LVD SCSI Disk Drive
hdisk7   Available  00-08-01-8,0  16 Bit LVD SCSI Disk Drive
hdisk8   Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk9   Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk10  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk11  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk12  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk13  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk14  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk15  Defined    09-08-01      1814 DS4700 Disk Array Device
hdisk16  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk17  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk18  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk19  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk20  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk21  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk22  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk23  Available  05-08-01      1814 DS4700 Disk Array Device
Now we need to list all devices with the same locations as the DS4700 disks found above.
The output from "lsdev -C | grep 05-08" is shown below. The only dac device we expect to see at this location is dac0.
lsdev -C | grep 05-08
dac0     Available  05-08-01  1814 DS4700 Disk Array Controller
dac2     Defined    05-08-01  DS3/4K PCM User Interface
fcnet0   Defined    05-08-02  Fibre Channel Network Protocol Device
fcs0     Available  05-08     FC Adapter
fscsi0   Available  05-08-01  FC SCSI I/O Controller Protocol Device
hdisk16  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk17  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk18  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk19  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk20  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk21  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk22  Available  05-08-01  1814 DS4700 Disk Array Device
hdisk23  Available  05-08-01  1814 DS4700 Disk Array Device
As can be seen from the output above, dac2 is in the defined state, but dac2 should not exist at all.
The output from "lsdev -C | grep 09-08" is shown below. The only dac device we expect to see at this location is dac1.
lsdev -C | grep 09-08
dac1     Defined    09-08-01  1814 DS4700 Disk Array Controller
dac3     Defined    09-08-01  DS3/4K PCM User Interface
dac4     Available  09-08-01  1814 DS4700 Disk Array Controller
fcnet2   Defined    09-08-02  Fibre Channel Network Protocol Device
fcs2     Available  09-08     FC Adapter
fscsi2   Available  09-08-01  FC SCSI I/O Controller Protocol Device
hdisk8   Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk9   Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk10  Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk11  Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk12  Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk13  Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk14  Defined    09-08-01  1814 DS4700 Disk Array Device
hdisk15  Defined    09-08-01  1814 DS4700 Disk Array Device
As can be seen from the output above, dac3 is defined and dac4 is available, but neither should exist.
To fix this, we need to remove dac2, dac3 and dac4.
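The two grep checks above can be combined into one pass over the full "lsdev -C" listing. This is a rough sketch, assuming the name, status and location columns appear in that order as in the output above. The function name dacs_by_location is a hypothetical helper. It only reports locations that carry more than one dac device; deciding which of them are stale is still left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: reads "lsdev -C" output on stdin.
# Assumes field 1 is the device name and field 3 is the location code.
# Prints every location that has more than one dac device, with their names.
dacs_by_location() {
    awk '$1 ~ /^dac[0-9]+$/ {
             names[$3] = names[$3] " " $1   # collect dac names per location
             n[$3]++
         }
         END {
             for (loc in n)
                 if (n[loc] > 1) print loc ":" names[loc]
         }' | sort
}

# Usage on an AIX host (not executed here):
#   lsdev -C | dacs_by_location
```

In the scenario described here, this would flag both 05-08-01 (dac0 and dac2) and 09-08-01 (dac1, dac3 and dac4).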
Use the rmdev command as follows.
rmdev -Rdl dac2
rmdev -Rdl dac3
rmdev -Rdl dac4
Then run config manager.
cfgmgr
Now run fget_config -Av again.
fget_config -Av
---dar0---
User array name = 'DS4700'
dac0 ACTIVE dac1 ACTIVE
Disk     DAC   LUN  Logical Drive
utm            31
hdisk16  dac0  0    oradata1
hdisk17  dac1  1    oradata2
hdisk18  dac0  2    oradata3
hdisk19  dac1  3    archive
hdisk20  dac0  4    orabackup
hdisk21  dac1  5    app1
hdisk22  dac0  6    oralogs
hdisk23  dac1  7    oratemp
Now we can see that multipathing is working as it should with the LUNs split across the two paths, dac0 and dac1. dac1 also states that it is active.
We can now move any LUNs on the DS4700 that are not on their preferred path back to it.
We still have an issue, though: when we run "lsdev -Cc disk", we can see that there are still 8 defined disks alongside the 8 available disks.
When specific information about a device is recorded, but it is unavailable to the system, the device is in a defined state. We can therefore state that the host has connectivity to the disks in the available state and that the defined disks are not required.
We again need to use rmdev as follows to remove these defined disks.
rmdev -Rdl hdisk8
Do this for all the defined disks, in this case hdisk8 to hdisk15.
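Repeating the command by hand for eight disks is error-prone, so the list can be derived from the "lsdev -Cc disk" output instead. The sketch below is a dry run under stated assumptions: it only prints the rmdev commands, it assumes the status is in the second column and that the description contains "DS4700" as in the listing above, and the function names defined_array_disks and print_rmdev_commands are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: both read "lsdev -Cc disk" output on stdin.

# Print the names of DS4700 hdisks that are in the Defined state.
defined_array_disks() {
    awk '$1 ~ /^hdisk[0-9]+$/ && $2 == "Defined" && /DS4700/ { print $1 }'
}

# Dry run: print one rmdev command per defined DS4700 disk, without running it.
print_rmdev_commands() {
    defined_array_disks | while read -r disk; do
        printf 'rmdev -Rdl %s\n' "$disk"
    done
}

# Usage on an AIX host (not executed here):
#   lsdev -Cc disk | print_rmdev_commands
```

Review the printed commands before running any of them, since rmdev with the -d flag deletes the device definition from the ODM.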
Then run config manager as follows.
cfgmgr
The defined disks should now have been removed, and "lsdev -H -Cc disk" should give the following output.
name     status     location      description
hdisk0   Available  03-08-01-3,0  16 Bit LVD SCSI Disk Drive
hdisk1   Available  03-08-01-4,0  16 Bit LVD SCSI Disk Drive
hdisk2   Available  03-08-01-5,0  16 Bit LVD SCSI Disk Drive
hdisk3   Available  03-08-01-8,0  16 Bit LVD SCSI Disk Drive
hdisk4   Available  00-08-01-3,0  16 Bit LVD SCSI Disk Drive
hdisk5   Available  00-08-01-4,0  16 Bit LVD SCSI Disk Drive
hdisk6   Available  00-08-01-5,0  16 Bit LVD SCSI Disk Drive
hdisk7   Available  00-08-01-8,0  16 Bit LVD SCSI Disk Drive
hdisk16  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk17  Available  09-08-01      1814 DS4700 Disk Array Device
hdisk18  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk19  Available  09-08-01      1814 DS4700 Disk Array Device
hdisk20  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk21  Available  09-08-01      1814 DS4700 Disk Array Device
hdisk22  Available  05-08-01      1814 DS4700 Disk Array Device
hdisk23  Available  09-08-01      1814 DS4700 Disk Array Device