After a Worker Node in the Cluster Breaks Down and Recovers, Pod Failover Is Complete but the Source Host Where the Pod Resides Has Residual Drive Letters
Symptom
A Pod is running on worker node A, and an external block device is mounted to the Pod through CSI. After worker node A is powered off abnormally, the Kubernetes platform detects that the node is faulty and switches the Pod to worker node B. After worker node A recovers, the drive letters on worker node A change from normal to faulty.
Environment Configuration
Kubernetes version: 1.18 or later
Storage type: block storage
Root Cause Analysis
After worker node A recovers, Kubernetes initiates an unmapping operation on the storage, but does not initiate a drive letter removal operation on the host. After Kubernetes completes the unmapping, residual drive letters exist on worker node A.
Solution or Workaround
Currently, the residual drive letters must be cleared manually on the host. Alternatively, restart the host again so that the disk scan performed during startup clears them. To clear them manually, perform the following steps:
1. Check the residual drive letters on the host.
1.1 Run the following command to check whether a DM multipathing device with abnormal multipathing status exists.
multipath -ll
The following is an example of the command output. The path status is failed faulty running, the corresponding DM multipathing device is dm-12, and the associated SCSI disks are sdi and sdj. If multiple paths are configured, multiple SCSI disks exist. Record these SCSI disks.
mpathb (3618cf24100f8f457014a764c000001f6) dm-12 HUAWEI ,XSG1
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=-1 status=active
  |- 39:0:0:1 sdi 8:48 failed faulty running
  `- 38:0:0:1 sdj 8:64 failed faulty running
- If such a device exists, go to step 1.2.
- If no such device exists, no further action is required.
1.2 Run the following command to check whether the residual DM multipathing device is readable.
dd if=/dev/dm-12 of=/dev/null count=1 bs=1M iflag=direct
The following is an example of the command output. If the command returns Input/output error and reports 0 bytes (0 B) copied, the device is unreadable. In the command, replace dm-12 with the device ID obtained in step 1.1.
dd: error reading '/dev/dm-12': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.0236862 s, 0.0 kB/s
If the dd command hangs and does not return, log in to the node again in another window.
Run the following command to find the suspended dd process.
ps -ef | grep dm-12 | grep -w dd
The following is an example of the command output.
root 21725 9748 0 10:33 pts/10 00:00:00 dd if=/dev/dm-12 of=/dev/null count=1 bs=1M iflag=direct
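Kill the suspended process. Replace pid with the process ID shown in the second column of the output (21725 in this example).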
kill -9 pid
1.3 Record the residual dm-xx device and associated disk IDs (for details, see step 1.1) and perform the clearing operation.
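The checks in this step can be sketched as two small shell helpers. This is a minimal sketch, not part of the product: the function names list_failed_maps and probe_read are hypothetical, the parser assumes the alias-style header shown in the example output (map name, WWID in parentheses, dm-xx), and the 10-second default limit is an assumption. The timeout command from GNU coreutils is used so that a stalled read is less likely to block the shell, although a process stuck in uninterruptible sleep may still not terminate.

```shell
#!/bin/sh
# Sketch only: list_failed_maps and probe_read are hypothetical helper names.

# Read `multipath -ll` text on stdin and print one line per map that has
# at least one path in the "failed faulty" state:
#   <map name> <dm device> <disk> [<disk> ...]
list_failed_maps() {
    awk '
        function emit() {
            if (map != "" && disks != "") print map, dm, disks
            disks = ""
        }
        # Map header (assumed form): mpathb (<wwid>) dm-12 VENDOR ,PRODUCT
        $2 ~ /^\(.*\)$/ && $3 ~ /^dm-[0-9]+$/ { emit(); map = $1; dm = $3; next }
        # Path line: the field after H:C:T:L is the SCSI disk name.
        /failed faulty/ {
            for (i = 1; i < NF; i++)
                if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                    disks = (disks == "" ? "" : disks " ") $(i + 1)
                    break
                }
        }
        END { emit() }
    '
}

# Attempt one direct read with a time limit and print "readable" or
# "unreadable". The second argument is the limit in seconds (assumed
# default: 10).
probe_read() {
    dev=$1
    limit=${2:-10}
    if timeout -s KILL "$limit" \
        dd if="$dev" of=/dev/null count=1 bs=1M iflag=direct 2>/dev/null; then
        echo readable
    else
        echo unreadable
    fi
}
```

For the example output in step 1.1, running multipath -ll | list_failed_maps would print mpathb dm-12 sdi sdj, which is the information to record. probe_read /dev/dm-12 could then replace the manual dd check.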
2. Clear the residual drive letters on the host.
2.1 Run the following command to delete residual multipathing aggregation device information according to the DM multipathing device obtained in step 1.
multipath -f /dev/dm-12
If an error is reported, contact technical support engineers.
2.2 Run the following command to clear the residual SCSI disks according to the drive letters of the residual disks obtained in step 1.
echo 1 > /sys/block/xxxx/device/delete
Replace xxxx with the drive letter of the residual disk. When multiple paths are configured, run the command once for each residual path. In this example, the residual paths are sdi and sdj.
echo 1 > /sys/block/sdi/device/delete
echo 1 > /sys/block/sdj/device/delete
If an error is reported, contact technical support engineers.
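The clearing commands in this step can be generated from the recorded device names. The following is a minimal sketch under stated assumptions: print_cleanup_commands is a hypothetical helper name, and the name patterns it accepts (dm-N and sdX) are an assumption based on the example above. It only prints the commands so that they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: print_cleanup_commands is a hypothetical helper name.
# Usage: print_cleanup_commands <dm device> <disk> [<disk> ...]
# Prints the clearing commands for review; it does not run them.
print_cleanup_commands() {
    dm=$1
    shift
    # Validate every name before printing anything, so that a typo
    # never produces a partial command list.
    case $dm in
        dm-[0-9]*) ;;
        *) echo "unexpected DM device name: $dm" >&2; return 1 ;;
    esac
    for disk in "$@"; do
        case $disk in
            sd[a-z]*) ;;
            *) echo "unexpected disk name: $disk" >&2; return 1 ;;
        esac
    done
    # Flush the multipath map first, then delete each SCSI path device.
    echo "multipath -f /dev/$dm"
    for disk in "$@"; do
        echo "echo 1 > /sys/block/$disk/device/delete"
    done
}
```

Calling print_cleanup_commands dm-12 sdi sdj prints the three commands used in this example, in the same order as the procedure.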
3. Check whether the DM multipathing device and SCSI disk information has been cleared.
Run the following commands in sequence to query the multipathing and disk information. If the residual dm-12 device and SCSI disks sdi and sdj are cleared, the clearing is complete.
View multipathing information.
multipath -ll
The following is an example of the command output. The residual dm-12 device is cleared.
mpathb (3618cf24100f8f457014a764c000001f6) dm-3 HUAWEI ,XSG1
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=-1 status=active
  |- 39:0:0:1 sdd 8:48 active ready running
  `- 38:0:0:1 sde 8:64 active ready running
mpathn (3618cf24100f8f457315a764c000001f6) dm-5 HUAWEI ,XSG1
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=-1 status=active
  |- 39:0:0:2 sdc 8:32 active ready running
  `- 38:0:0:2 sdb 8:16 active ready running
View device information.
ls -l /sys/block/
The following is an example of the command output. SCSI disks sdi and sdj are cleared.
total 0
lrwxrwxrwx 1 root root 0 Aug 11 19:56 dm-0 -> ../devices/virtual/block/dm-0
lrwxrwxrwx 1 root root 0 Aug 11 19:56 dm-1 -> ../devices/virtual/block/dm-1
lrwxrwxrwx 1 root root 0 Aug 11 19:56 dm-2 -> ../devices/virtual/block/dm-2
lrwxrwxrwx 1 root root 0 Aug 11 19:56 dm-3 -> ../devices/virtual/block/dm-3
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sdb -> ../devices/platform/host35/session2/target35:0:0/35:0:0:1/block/sdb
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sdc -> ../devices/platform/host34/target34:65535:5692/34:65535:5692:0/block/sdc
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sdd -> ../devices/platform/host39/session6/target39:0:0/39:0:0:1/block/sdd
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sde -> ../devices/platform/host38/session5/target38:0:0/38:0:0:1/block/sde
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sdh -> ../devices/platform/host39/session6/target39:0:0/39:0:0:3/block/sdh
lrwxrwxrwx 1 root root 0 Aug 11 19:56 sdi -> ../devices/platform/host38/session5/target38:0:0/38:0:0:3/block/sdi
View disk information.
ls -l /dev/disk/by-id/
The following is an example of the command output. SCSI disks sdi and sdj are cleared.
total 0
lrwxrwxrwx 1 root root 10 Aug 11 19:57 dm-name-mpathb -> ../../dm-3
lrwxrwxrwx 1 root root 10 Aug 11 19:58 dm-name-mpathn -> ../../dm-5
lrwxrwxrwx 1 root root 10 Aug 11 19:57 dm-uuid-mpath-3618cf24100f8f457014a764c000001f6 -> ../../dm-3
lrwxrwxrwx 1 root root 10 Aug 11 19:58 dm-uuid-mpath-3618cf24100f8f457315a764c000001f6 -> ../../dm-5
lrwxrwxrwx 1 root root 9 Aug 11 19:57 scsi-3618cf24100f8f457014a764c000001f6 -> ../../sdd
lrwxrwxrwx 1 root root 9 Aug 11 19:57 scsi-3618cf24100f8f45712345678000103e8 -> ../../sdi
lrwxrwxrwx 1 root root 9 Aug 3 15:17 scsi-3648435a10058805278654321ffffffff -> ../../sdb
lrwxrwxrwx 1 root root 9 Aug 2 14:49 scsi-368886030000020aff44cc0d060c987f1 -> ../../sdc
lrwxrwxrwx 1 root root 9 Aug 11 19:57 wwn-0x618cf24100f8f457014a764c000001f6 -> ../../sdd
lrwxrwxrwx 1 root root 9 Aug 11 19:57 wwn-0x618cf24100f8f45712345678000103e8 -> ../../sdi
lrwxrwxrwx 1 root root 9 Aug 3 15:17 wwn-0x648435a10058805278654321ffffffff -> ../../sdb
lrwxrwxrwx 1 root root 9 Aug 2 14:49 wwn-0x68886030000020aff44cc0d060c987f1 -> ../../sdc
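The verification can also be scripted by checking whether the recorded names still have entries under /sys/block. This is a minimal sketch: residual_remaining is a hypothetical helper name, and the directory is passed as an argument so the function can be tried against any directory. Because the kernel can reuse a name such as sdi for a newly discovered device, a name that is still present should be compared against the multipath -ll output before it is treated as residual.

```shell
#!/bin/sh
# Sketch only: residual_remaining is a hypothetical helper name.
# Usage: residual_remaining <block dir> <name> [<name> ...]
# Prints each given device name that still has an entry in <block dir>
# (normally /sys/block) and returns 1 if any name was found, 0 otherwise.
residual_remaining() {
    dir=$1
    shift
    found=0
    for name in "$@"; do
        if [ -e "$dir/$name" ]; then
            echo "$name"
            found=1
        fi
    done
    return "$found"
}
```

For the devices in this example, the call would be residual_remaining /sys/block dm-12 sdi sdj. No output and a return status of 0 indicate that none of the recorded names remain.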