Pod Issues on Huawei

After a Worker Node in the Cluster Breaks Down and Recovers, Pod Failover Is Complete but the Source Host Where the Pod Resides Has Residual Drive Letters

Mon, 01 Jan 0001 00:00:00 +0000

Symptom

A Pod is running on worker node A, and an external block device is mounted to the Pod through CSI. After worker node A is powered off abnormally, the Kubernetes platform detects that the node is faulty and switches the Pod to worker node B. After worker node A recovers, the drive letters on worker node A change from normal to faulty.

Environment Configuration

Kubernetes version: 1.18 or later

When a Pod Is Created, the Pod Is in the ContainerCreating State

Symptom

After a Pod is created, it remains in the ContainerCreating state for an extended period. The Huawei CSI logs (for details, see Viewing Huawei CSI Logs) contain the error message “Fibre Channel volume device not found”.

Root Cause Analysis

This problem occurs because residual disks exist on the host node. As a result, the disks cannot be found the next time a Pod is created.

Solution or Workaround

Use a remote access tool, such as PuTTY, to log in to any master node in the Kubernetes cluster through the management IP address.

A Pod Is in the ContainerCreating State for a Long Time When It Is Being Created

Symptom

When a Pod is being created, it stays in the ContainerCreating state for a long time. The huawei-csi-node log (for details, see Viewing Huawei CSI Logs) contains no record of the Pod creation. In the output of the kubectl get volumeattachment command, the name of the PV used by the Pod does not appear in the PV column. After a long time (more than ten minutes), the Pod is created normally and its status changes to Running.
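
The PV column check described above can be scripted. The following is a minimal sketch, assuming the default table output of kubectl get volumeattachment, in which PV is the third column. The function name pv_attached and the PV name pvc-xxx are placeholders introduced for illustration.

```shell
# pv_attached PV_NAME
# Reads `kubectl get volumeattachment` output on stdin and succeeds (exit 0)
# only if some row lists PV_NAME in the PV column (assumed to be column 3).
pv_attached() {
    awk -v pv="$1" 'NR > 1 && $3 == pv { found = 1 } END { exit !found }'
}

# Example (pvc-xxx is a placeholder PV name):
#   kubectl get volumeattachment | pv_attached pvc-xxx || echo "PV not attached yet"
```

The function only parses text, so it can also be run against saved command output.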

A Pod Fails to Be Created and the Log Shows That the Execution of the mount Command Times Out

Symptom

When a Pod is being created, it remains in the ContainerCreating state. The huawei-csi-node log (for details, see Viewing Huawei CSI Logs) shows that the execution of the mount command times out.

Root Cause Analysis

Cause 1: The configured service IP address is unreachable. As a result, the mount command times out and fails.

Cause 2: On some operating systems, such as Kylin V10 SP1 and SP2, running the mount command in a container using NFSv3 takes a long time. As a result, the mount command may time out and the error message “error: exit status 255” is displayed. A possible cause is that the LimitNOFILE value of the container runtime containerd is too large (over 1 billion).
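
To check Cause 2, the LimitNOFILE value can be compared against the threshold mentioned above. This is a sketch only. It assumes that containerd runs as a systemd unit named containerd and that the value is read with systemctl show. The function name nofile_too_large is hypothetical, and treating "infinity" as too large is an assumption rather than something stated in this document.

```shell
# nofile_too_large LINE
# LINE is the output of `systemctl show containerd -p LimitNOFILE`,
# for example "LimitNOFILE=1048576".
# Succeeds (exit 0) if the value exceeds 1 billion or is "infinity" (assumption).
# Returns 2 if the value is not a number and therefore cannot be judged.
nofile_too_large() {
    value=${1#LimitNOFILE=}
    case "$value" in
        infinity) return 0 ;;
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$value" -gt 1000000000 ]
}

# Example:
#   nofile_too_large "$(systemctl show containerd -p LimitNOFILE)" && echo "LimitNOFILE is very large"
```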

A Pod Fails to Be Created and the Log Shows That the mount Command Fails to Be Executed

Symptom

In NAS scenarios, when a Pod is being created, it remains in the ContainerCreating state. The huawei-csi-node log (for details, see Viewing Huawei CSI Logs) shows that the mount command fails to be executed.

Root Cause Analysis

A possible cause is that the NFS 4.0/4.1/4.2 protocol is not enabled on the storage side. After mounting with NFSv4 fails, the host does not fall back to negotiating NFSv3 for mounting.

A Pod Fails to Be Created and Message "publishInfo doesn't exist" Is Displayed in the Events Log

Symptom

When a Pod is being created, it remains in the ContainerCreating state. The following alarm event is printed for the Pod: rpc error: code = Internal desc = publishInfo doesn't exist

Root Cause Analysis

As required by CSI, when a workload needs to use a PV, the container orchestration system (CO system, which communicates with the CSI plug-in through RPC requests) first invokes the ControllerPublishVolume interface (provided by huawei-csi-controller) to map the PV, and then invokes the NodeStageVolume interface (provided by huawei-csi-node) to mount the PV. In this case, during the mounting operation only the huawei-csi-node service receives the NodeStageVolume request; the huawei-csi-controller service never receives the preceding ControllerPublishVolume request. As a result, the huawei-csi-controller service does not map the PV and does not send the mapping information to the huawei-csi-node service, so the error message publishInfo doesn't exist is reported.
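
The ordering requirement described above can be illustrated with a small check. This is a hypothetical sketch: it assumes the RPC names have already been collected, one per line, in the order in which the services received them. How to extract them from the huawei-csi logs is not covered here, and the function name check_publish_order is introduced for illustration only.

```shell
# check_publish_order
# Reads RPC names on stdin, one per line, in the order they were received.
# Fails (exit 1) if NodeStageVolume appears before any ControllerPublishVolume,
# which is the situation that leads to "publishInfo doesn't exist".
check_publish_order() {
    awk '
        $1 == "ControllerPublishVolume" { published = 1 }
        $1 == "NodeStageVolume" && !published { bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Example:
#   printf '%s\n' NodeStageVolume | check_publish_order || echo "volume was staged without being published"
```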

After a Pod Fails to Be Created or kubelet Is Restarted, Logs Show That the Mount Point Already Exists

Symptom

When a Pod is being created, it stays in the ContainerCreating state. Alternatively, after kubelet is restarted, logs show that the mount point already exists. The huawei-csi-node log (for details, see Viewing Huawei CSI Logs) contains the following error: The mount /var/lib/kubelet/pods/xxx/mount is already exist, but the source path is not /var/lib/kubelet/plugins/kubernetes.io/xxx/globalmount
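
One way to investigate such a mismatch is to compare what backs the two mount points on the node. The sketch below assumes the standard /proc/mounts layout (source in the first field, mount point in the second). It compares the backing source only, which is not necessarily the exact comparison that huawei-csi-node performs. The function names are hypothetical.

```shell
# mount_source MOUNT_POINT [MOUNTS_FILE]
# Prints the source (first field) of the last mount entry for MOUNT_POINT.
# Prints nothing if MOUNT_POINT is not mounted.
mount_source() {
    awk -v target="$1" '$2 == target { src = $1 } END { if (src != "") print src }' \
        "${2:-/proc/mounts}"
}

# same_mount_source PATH_A PATH_B [MOUNTS_FILE]
# Succeeds only if both paths are mounted and report the same source.
same_mount_source() {
    a=$(mount_source "$1" "${3:-/proc/mounts}")
    b=$(mount_source "$2" "${3:-/proc/mounts}")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (xxx stands for the elided path parts shown in the error message):
#   same_mount_source /var/lib/kubelet/pods/xxx/mount \
#       /var/lib/kubelet/plugins/kubernetes.io/xxx/globalmount || echo "sources differ"
```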

Root Cause Analysis

The root cause of this problem is that Kubernetes performs repeated mounting operations.

"I/O error" Is Displayed When a Volume Directory Is Mounted to a Pod

Symptom

When a Pod reads or writes a mounted volume, message “I/O error” is displayed.

Root Cause Analysis

When a protocol such as SCSI is used, if the storage device restarts while the Pod is continuously writing data to the mount directory, the link between the device on the host and the storage device is interrupted, triggering an I/O error. After the storage device recovers, the mount directory remains read-only.
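
Whether a mount directory is read-only can be confirmed from the mount options on the node. This is a minimal sketch, assuming the standard /proc/mounts layout (mount point in the second field, comma-separated options in the fourth). The function name mount_is_readonly is hypothetical.

```shell
# mount_is_readonly MOUNT_POINT [MOUNTS_FILE]
# Succeeds (exit 0) if the last mount entry for MOUNT_POINT carries the "ro" option.
# Fails if the mount is read-write or MOUNT_POINT is not mounted.
mount_is_readonly() {
    awk -v target="$1" '
        $2 == target {
            n = split($4, opts, ",")
            ro = 0
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END { exit (ro ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Example (the path is a placeholder):
#   mount_is_readonly /var/lib/kubelet/pods/xxx/mount && echo "mount is read-only"
```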

Failed to Create a Pod Because the iscsi_tcp Service Is Not Started Properly When the Kubernetes Platform Is Set Up for the First Time

Symptom

When you create a Pod, the following error is reported in the /var/log/huawei-csi-node log: Cannot connect ISCSI portal *.*.*.*: libkmod: kmod_module_insert_module: could not find module by name='iscsi_tcp'

Root Cause Analysis

The iscsi_tcp service may be stopped after the Kubernetes platform is set up and the iSCSI service is installed. You can run the following command to check whether the service is stopped.

lsmod | grep iscsi | grep iscsi_tcp
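
The same check can be scripted with an exact match on the module name. The sketch below assumes the usual lsmod table layout, with a header row and the module name in the first column. The function name module_loaded is hypothetical.

```shell
# module_loaded NAME
# Reads `lsmod` output on stdin and succeeds (exit 0) only if NAME appears
# in the Module column. The header row is skipped.
module_loaded() {
    awk -v mod="$1" 'NR > 1 && $1 == mod { found = 1 } END { exit !found }'
}

# Example:
#   lsmod | module_loaded iscsi_tcp || echo "iscsi_tcp is not loaded"
```

Unlike a plain grep, the exact comparison does not report a match when only a module with a longer name, such as libiscsi_tcp, is loaded.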

The following is an example of the command output.

A Pod Fails to Be Created and Logs Show That an Initiator Has Been Associated with Another Host

Symptom

When a Pod is created using SAN storage, it stays in the ContainerCreating state. The Pod logs report the alarm event “rpc error: code = Internal desc = initiator xxx is already associated to another host”.

Root Cause Analysis

Cause 1: CSI automatically creates hosts, host groups, and initiators on the storage side based on certain rules. If identical resources already exist on the storage side before CSI is used, conflicts occur. In this case, the likely cause is that the same initiator was added on the storage side before CSI was used.

A Pod Fails to Be Created and Logs Show "Get DMDevice by alias: dm-x failed"

Symptom

When a Pod is created, it stays in the ContainerCreating state for a long time. In addition, the following error message is reported in the huawei-csi-node log (for details, see Viewing Huawei CSI Logs):

check device: dm-1 is a partition device failed. error: Get DMDevice by alias:dm-1 failed. error: Can not get DMDevice by alias: dm-1

Root Cause Analysis

In the DM-Multipath configuration file, the user_friendly_names parameter is not set to yes.
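
The setting can be checked with a simple pattern match on the configuration file. This is a sketch under stated assumptions: the file is located at /etc/multipath.conf unless another path is passed, and the check does not distinguish which section (defaults, devices, or multipaths) the line belongs to. The function name friendly_names_enabled is hypothetical.

```shell
# friendly_names_enabled [CONF_FILE]
# Succeeds (exit 0) if CONF_FILE contains an uncommented line that sets
# user_friendly_names to yes (optionally quoted).
friendly_names_enabled() {
    grep -Eq '^[[:space:]]*user_friendly_names[[:space:]]+"?yes"?[[:space:]]*$' \
        "${1:-/etc/multipath.conf}"
}

# Example:
#   friendly_names_enabled || echo "user_friendly_names is not set to yes"
```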

After Pods on the Same Node Are Deleted in a Batch Using the NVMe Protocol, Residual NVMe Links Exist on the Node

Symptom

In the NVMe protocol scenario, when Pods on the same node are deleted in a batch, the Pods are deleted successfully, but the NVMe links on the node are not cleared.

# nvme list-subsys
nvme-subsys0 - NQN=nqn.xxx.nvme:nvm-subsystem-sn-xxxxxxx
\
 +- nvme0 tcp traddr=xxx.xxx.xxx.xxx,trsvcid=4420,src_addr=xxx.xxx.xxx.xxx live 
 +- nvme1 tcp traddr=xxx.xxx.xxx.xxx,trsvcid=4420,src_addr=xxx.xxx.xxx.xxx live

Root Cause Analysis

If the NVMe protocol is used, the device paths on the host are cleared only after the host is unmapped from the storage resources. When multiple Pods mount the same volume and the Pods are deleted in a batch, CSI in the NodeUnstageVolume phase (unmount phase) cannot detect the device path cleanup in the subsequent ControllerUnpublishVolume phase (unmap phase). As a result, the NVMe links cannot be cleared in a timely manner.
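
To see whether residual links remain after a batch deletion, the live paths in the nvme list-subsys output can be counted. The sketch below assumes the output layout shown in the symptom, where each controller line starts with "+-" and ends with its state. Other nvme-cli versions or invocations may print additional fields after the state. The function name count_live_nvme_paths is hypothetical.

```shell
# count_live_nvme_paths
# Reads `nvme list-subsys` output on stdin and prints the number of
# controller lines (" +- nvmeN ...") whose last field is "live".
count_live_nvme_paths() {
    awk '$1 == "+-" && $NF == "live" { n++ } END { print n + 0 }'
}

# Example:
#   nvme list-subsys | count_live_nvme_paths
```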

In the SAN HyperMetro Scenario, the Subpath of the Aggregated Disks Corresponding to the Mounted Volume Is Lost

Symptom

The subpath of the aggregated disks corresponding to the mounted resource is lost.

Root Cause Analysis

Figure 1 SAN HyperMetro subpath loss

As shown in Figure 1, if the link between the host and the storage device is disconnected due to factors such as HBA/NIC exceptions, switch/network jitter, or storage array service port faults, the host restarts and triggers disk scanning again. In this case, the link to the faulty storage device is disconnected on the host. After the fault is rectified, the link information remains lost even after the host scans the disks again. As a result, the lost link is not automatically restored.