persistentvolume-controller, "fedora-ostree-content-volume-2" already bound to a different claim #12555

Closed
opened 2025-05-12 14:33:12 +00:00 by c4rt0 · 15 comments

I'm trying to migrate a few CoreOS-related projects from DeploymentConfig to Deployment. The one I will focus on in this example is fedora-ostree-pruner (https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/openshift-apps/fedora-ostree-pruner.yml).
Currently the number of replicas is set to 1; after the build completes, the replica being created remains in the Pending state.

adamsky@fedorapc  ~/Work/ansible  ↱ main ±  oc get pods
NAME                                   READY   STATUS      RESTARTS   AGE
fedora-ostree-pruner-build-1-build     0/1     Completed   0          2m11s
fedora-ostree-pruner-f64475887-gjwt5   0/1     Pending     0          96s

Some investigation showed that the volume is already bound to a different claim:

 adamsky@fedorapc  ~/Work/ansible  ↱ main ±  oc describe pods
...
Volumes:
  fedora-ostree-content-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  fedora-ostree-content-volume
    ReadOnly:   false
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  107s  default-scheduler  0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling.

This is true for staging... and as I already found out, also for production 👽

OpenShift Workloads Dashboard returns:

Conditions:
  Type:     PodScheduled
  Status:   False
  Updated:  May 12, 2025, 1:12 PM
  Reason:   Unschedulable

Message:
0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling.
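
If someone with cluster access can take a look, the PV's claimRef should show which claim is currently holding it; something like this should do it (assuming the PV name from the error message is right):

  oc describe pv fedora-ostree-content-volume-2
  oc get pv fedora-ostree-content-volume-2 -o jsonpath='{.spec.claimRef.namespace}/{.spec.claimRef.name}'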

Describe what you would like us to do: Due to permission issues I am unable to dig deeper into the matter; it would be nice to get some assistance in figuring this out.


When do you need this to be done by? (YYYY/MM/DD) : ASAP


Author

cc @cverna


On stg we tried deleting the PVC and re-running the playbook, but that didn't seem to help.

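For what it's worth: deleting the PVC alone does not necessarily free the PV. A statically created PV usually moves to Released and keeps the old claimRef once its claim is gone, so a new claim cannot bind to it. If that is what happened here (just an assumption, since we cannot inspect the PVs ourselves), an admin could clear the stale reference with something like:

  oc patch pv <pv-name> --type merge -p '{"spec":{"claimRef":null}}'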
Contributor

Metadata Update from @zlopez:

  • Issue priority set to: Waiting on Assignee (was: Needs Review)
  • Issue tagged with: Needs investigation, high-gain

You need to specify the storage class name in your PVC definition, otherwise it uses ocs-storagecluster-ceph-rbd by default.
If you want to use a NetApp volume (NFS), then use

spec:
  storageClassName: ""

(Note that there is no PV fedora-ostree-content-volume-2 on stg, so the PVC will stay in Pending until an admin creates that PV.)
If you want an RBD volume, then use storageClassName: ocs-storagecluster-ceph-rbd and remove the volumeName. ODF will provision a new volume for you automatically.
Note that you cannot use ReadWriteMany in Filesystem mode with RBD.
If you want to use that combination, use the ocs-storagecluster-cephfs storage class instead.
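
To make that concrete, the static NetApp/NFS variant of the claim would look roughly like this (a sketch only: the size is made up, and the claim/PV names are taken from this ticket, not from the actual template):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fedora-ostree-content-volume
spec:
  # empty storage class disables dynamic provisioning, so the claim
  # only binds to a PV an admin created in advance
  storageClassName: ""
  # pin the claim to that specific pre-created PV
  volumeName: fedora-ostree-content-volume-2
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

For the RBD variant you would instead drop volumeName, set storageClassName: ocs-storagecluster-ceph-rbd, and stick to ReadWriteOnce (or use ocs-storagecluster-cephfs if ReadWriteMany is needed).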


@c4rt0 It looks like in coreos-ostree-importer we have a pvc template --> https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/coreos-ostree-importer/templates/pvc.yml.j2

Maybe we should have the same for fedora-ostree-pruner

Author

> @c4rt0 It looks like in coreos-ostree-importer we have a pvc template --> https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/coreos-ostree-importer/templates/pvc.yml.j2
>
> Maybe we should have the same for fedora-ostree-pruner

I can see we already do, and from my understanding that is what @darknao is referring to. It's identical to the one in coreos-ostree-importer:
https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/fedora-ostree-pruner/templates/pvc.yml.j2

Thank you both for your answers. I think we can use ocs-storagecluster-cephfs, as it is probably a good idea to keep the existing ReadWriteMany. To be completely honest, I'm not sure we need the RBD volume.
I will post an update here as soon as I have tested the above.


If the volume is shared between apps in and out of OpenShift, then the NetApp (NFS) volume makes sense.
If it's shared with other pods in the same namespace (or more than one replica is running), then CephFS.
For any other use (single pod, one replica), RBD provides the best performance.
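
For the CephFS case the spec side of the claim would then just be (again only a sketch, with a made-up size):

spec:
  storageClassName: ocs-storagecluster-cephfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi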


The volumes for fedora-ostree-pruner and coreos-ostree-importer are special because they are essentially maps into a netapp volume where the main (compose & prod) ostree repos are stored.

I guess we need the volume created in staging for us?

Author

We spent some time with @cverna on it today. We have the volume specified, but we still have the pod hanging both in staging and in production. As you can see, our updated pvc file (https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/fedora-ostree-pruner/templates/pvc.yml.j2) contains both storageClassName and volumeName :/

Author
 ✘ adamsky@fedorapc  ~/Work/ansible   PR/pvc_patch  oc get pvc
NAME                           STATUS    VOLUME                           CAPACITY   ACCESS MODES   STORAGECLASS                VOLUMEATTRIBUTESCLASS   AGE
fedora-ostree-content-volume   Pending   fedora-ostree-content-volume-1   0                         ocs-storagecluster-cephfs   <unset>                 38m
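
With both storageClassName and volumeName set, the claim only binds if the named PV carries the same storage class and is not held by another claim; my guess is that fedora-ostree-content-volume-1 does not match ocs-storagecluster-cephfs, which would keep the claim Pending forever. Someone with enough permissions could confirm via the claim's events and the PV definition:

  oc describe pvc fedora-ostree-content-volume
  oc get pv fedora-ostree-content-volume-1 -o yaml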
Author

After reverting the changes in pvc.yml.j2 (https://pagure.io/fedora-infra/ansible/pull-request/2624) back to its original state, and after a manual binding of the volume by @kevin, the fedora-ostree-pruner in production is working once again as expected, using a Deployment:

 adamsky@fedora  ~/Work/ansible   PR/revert_to_DC  oc get deployment
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
fedora-ostree-pruner   1/1     1            1           28m

Staging is still an issue.


I created the PV in stg (fedora-ostree-content-volume-2),
but the PVC still seems to refer to fedora-ostree-content-volume-1?
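
(That would explain the Pending claim: a PVC that pins volumeName: fedora-ostree-content-volume-1 can never bind to a PV named fedora-ostree-content-volume-2. Either the volumeName in pvc.yml.j2 has to point at the new PV, or the PV can be pre-bound to the claim from the PV side, roughly like the fragment below; the namespace here is only a guess.)

spec:
  claimRef:
    name: fedora-ostree-content-volume
    namespace: fedora-ostree-pruner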


Metadata Update from @kevin:

  • Issue assigned to kevin

I think this is all sorted now?

If not, please re-open...


Metadata Update from @kevin:

  • Issue close_status updated to: Fixed with Explanation
  • Issue status updated to: Closed (was: Open)