-
Bug
-
Resolution: Done
-
Undefined
-
None
-
4.11.z
-
None
-
None
-
Storage Sprint 229, Storage Sprint 230
-
2
-
False
-
-
N/A
-
Release Note Not Required
This bug is a backport clone of [Bugzilla Bug 2091873](https://bugzilla.redhat.com/show_bug.cgi?id=2091873). The following is the description of the original bug:
—
Description of problem:
[Local Storage Operator] After a node is deleted and added back, a PV provisioned by a LocalVolume/LocalVolumeSet CR cannot become Available again once its PVC is deleted
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-05-25-193227 True False 37m Cluster version is 4.11.0-0.nightly-2022-05-25-193227
$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
local-storage-operator.4.11.0-202205242136 Local Storage 4.11.0-202205242136 Succeeded
How reproducible:
Always
Steps to Reproduce:
1. Install an AWS cluster with an OCP 4.11 nightly build.
2. Install the Local Storage Operator from OperatorHub using the "stable" channel.
3. Create a LocalVolumeSet CR "mylvs" with volumeMode Filesystem and fsType ext4 (manifest below).
4. Create an EBS volume in the AWS backend and attach it to one schedulable worker.
5. Wait for LocalVolumeSet "mylvs" to provision an Available PV from the volume.
6. Create a deployment and a PVC that consume the PV, and wait for the deployment to become ready.
7. Back up the node object the volume is attached to as YAML, then delete the node with "oc delete node/ip-10-0-141-247.us-east-2.compute.internal" (see the command sketch after this list).
8. Check that the PV's status does not become "Terminating" and wait for the deployment's pod to become "Pending".
9. Add the node back by applying the backed-up YAML.
10. Check that the deployment becomes ready again.
11. Delete the deployment and PVC, then check the PV's status.
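A minimal command sketch of steps 7-11, assuming the node name from the nodeSelector below, the openshift-local-storage namespace used in the outputs further down, and the mypvc/myapp names from the example manifests:
# Step 7: back up the node object, then delete the node
$ oc get node ip-10-0-141-247.us-east-2.compute.internal -o yaml > node-backup.yaml
$ oc delete node/ip-10-0-141-247.us-east-2.compute.internal
# Step 8: the PV should not go "Terminating"; the deployment's pod goes "Pending"
$ oc get pv
$ oc get pod -n openshift-local-storage
# Steps 9-10: re-create the node from the backup and wait for the deployment
$ oc apply -f node-backup.yaml
$ oc rollout status deployment/myapp -n openshift-local-storage
# Step 11: delete the consumers and watch the PV status
$ oc delete deployment/myapp pvc/mypvc -n openshift-local-storage
$ oc get pv -w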
- LVS
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeSet
metadata:
  name: mylvs
spec:
  deviceInclusionSpec:
    deviceTypes:
    - disk
    - part
    minSize: 1Gi
  fsType: ext4
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - ip-10-0-141-247.us-east-2.compute.internal
        - ip-10-0-163-213.us-east-2.compute.internal
        - ip-10-0-208-10.us-east-2.compute.internal
  storageClassName: mylvs
  volumeMode: Filesystem
—
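To watch for the PV that LocalVolumeSet "mylvs" provisions in step 5, a selector on the owner label LSO puts on its PVs (visible in the describe output further down) can be used, for example:
$ oc get pv -l storage.openshift.com/owner-name=mylvs -w
# wait until the PV's STATUS column shows Available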
- PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mypvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: mylvs
  volumeMode: Filesystem
—
- Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  replicas: 1
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: mycontainer
        image: quay.io/pewang/storagetest:base
        volumeMounts:
        - mountPath: /mnt/storage
          name: data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: mypvc
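For step 6, assuming the PVC and Deployment above are saved as pvc.yaml and deployment.yaml (file names chosen here for illustration), they can be applied and waited on roughly like this:
$ oc apply -n openshift-local-storage -f pvc.yaml -f deployment.yaml
$ oc rollout status deployment/myapp -n openshift-local-storage
$ oc get pvc/mypvc -n openshift-local-storage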
Actual results:
In step 11: The PV becomes Released but cannot become Available again.
$ oc get pvc
No resources found in openshift-local-storage namespace.
$ oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-127200e9 11Gi RWO Delete Released openshift-local-storage/mypvc mylvs 14m
$ oc describe pv/local-pv-127200e9
Name: local-pv-127200e9
Labels: kubernetes.io/hostname=ip-10-0-141-247.us-east-2.compute.internal
storage.openshift.com/owner-kind=LocalVolumeSet
storage.openshift.com/owner-name=mylvs
storage.openshift.com/owner-namespace=openshift-local-storage
Annotations: pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: local-volume-provisioner-ip-10-0-141-247.us-east-2.compute.internal-88d1b499-d1a8-4d57-bcbf-fbbc8fa5ffb0
storage.openshift.com/device-id: nvme-Amazon_Elastic_Block_Store_vol0b83b6ff8de3bfc60
storage.openshift.com/device-name: nvme1n1
Finalizers: [kubernetes.io/pv-protection]
StorageClass: mylvs
Status: Released
Claim: openshift-local-storage/mypvc
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 11Gi
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [ip-10-0-141-247.us-east-2.compute.internal]
Message:
Source:
Type: LocalVolume (a persistent volume backed by local storage on a node)
Path: /mnt/local-storage/mylvs/nvme-Amazon_Elastic_Block_Store_vol0b83b6ff8de3bfc60
Events: <none>
- 40 minutes later the PV is still "Released"
$ oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-127200e9 7Gi RWO Delete Released openshift-local-storage/mypvc mylvs 47m
Expected results:
In step 11: The PV should become Available again.
Additional info:
Both LocalVolume and LocalVolumeSet CRs have the same issue.
Possibly related to this fix: https://github.com/openshift/local-storage-operator/pull/334
That fix resolved PVs getting stuck in "Terminating" by removing the node ownerReferences from provisioned PVs.
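Assuming that change is present in the installed operator build, the provisioned PV should no longer carry a node ownerReference; one way to confirm this on the affected PV from the output above:
$ oc get pv/local-pv-127200e9 -o jsonpath='{.metadata.ownerReferences}'
# expected to print nothing if node ownerReferences are no longer set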