infra-docs-fpo/modules/sysadmin_guide/pages/sop_upgrade.adoc

= Upgrade OCP4 Cluster
Please see the official documentation for more information [1][3]. This SOP can be used as a rough guide.

== Resources

- [1] https://docs.openshift.com/container-platform/4.8/updating/updating-cluster-between-minor.html[Upgrading OCP4 Cluster Between Minor Versions]
- [2] xref:sop_etcd_backup.adoc[SOP Create etcd backup]
- [3] https://docs.openshift.com/container-platform/4.8/operators/admin/olm-upgrading-operators.html
- [4] https://docs.openshift.com/container-platform/4.8/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html#dr-restoring-cluster-state[Restore etcd backup]
- [5] https://docs.openshift.com/container-platform/4.8/operators/admin/olm-upgrading-operators.html#olm-upgrading-operators[Upgrading Operators Prior to Cluster Update]
- [6] https://access.redhat.com/downloads/content/290/ver=4.8/rhel---8/4.8.18/x86_64/packages[Openshift Clients RPM Download]

== Prerequisites

- In case an upgrade fails, it is wise to first take an `etcd` backup. To do so, follow the SOP [2].
- Ensure that all installed Operators are at the latest versions for their channel [5].
- Ensure that the latest `oc` client rpm is available at `/srv/web/infra/bigfiles/openshiftboot/oc-client/` on the `batcave01` server. Retrieve the RPM from [6], choosing the `Openshift Clients Binary` rpm, and rename it to `oc-client.rpm`.
- Run `sudo rbac-playbook manual/ocp4-sysadmin-openshift.yml -t "upgrade-rpm"` to install this updated `oc` client rpm.
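
The rename-and-place step for the `oc` client rpm can be sketched in Python. This is only an illustration under stated assumptions: the helper name `stage_oc_client` and the example download file name are hypothetical, and the demo copies into a temporary directory rather than the real path on `batcave01`.

```python
import shutil
import tempfile
from pathlib import Path

# Directory and file name taken from the prerequisites above.
OC_CLIENT_DIR = Path("/srv/web/infra/bigfiles/openshiftboot/oc-client/")
OC_CLIENT_NAME = "oc-client.rpm"


def stage_oc_client(downloaded_rpm: Path, target_dir: Path = OC_CLIENT_DIR) -> Path:
    """Copy the downloaded RPM into target_dir under the name oc-client.rpm."""
    if downloaded_rpm.suffix != ".rpm":
        raise ValueError(f"{downloaded_rpm} does not look like an RPM")
    destination = target_dir / OC_CLIENT_NAME
    shutil.copy2(downloaded_rpm, destination)
    return destination


if __name__ == "__main__":
    # Demo against a temporary directory; the source file name is hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "openshift-clients-example.rpm"
        source.write_bytes(b"placeholder")
        target = Path(tmp) / "oc-client"
        target.mkdir()
        print(stage_oc_client(source, target).name)  # oc-client.rpm
```

On the real server the second argument would be omitted so that the default directory from this SOP is used.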

== Upgrade OCP
At the time of writing, the version installed on the cluster is `4.8.11` and the `upgrade channel` is set to `stable-4.8`. It is easiest to update the cluster via the web console. Go to:

- Administration
- Cluster Settings
- To upgrade between `z` or `patch` versions (x.y.z), click the update button when one is available.
- To move between `y` or `minor` versions, first switch the `upgrade channel` to the channel for the target version, for example `fast-4.9`. You should also be on the very latest `z`/`patch` version before upgrading.
- When the upgrade has finished, switch the `upgrade channel` back to the stable channel.
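
The difference between a `z`/patch upgrade and a `y`/minor upgrade, and the channel name to select for a minor upgrade, can be sketched in Python. This is a minimal illustration, not part of any official tooling. The function names are hypothetical, the target version `4.9.0` is only an example, and the sketch assumes plain `x.y.z` version strings and the `<tier>-<x>.<y>` channel naming seen above (`stable-4.8`, `fast-4.9`).

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split an 'x.y.z' version string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_upgrade(current: str, target: str) -> str:
    """Return 'patch' for a z-stream upgrade, 'minor' for a y-stream upgrade."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        raise ValueError(f"{target} is not newer than {current}")
    if tgt[0] != cur[0]:
        raise ValueError("major version changes are out of scope for this sketch")
    return "patch" if tgt[1] == cur[1] else "minor"


def channel_for(tier: str, version: str) -> str:
    """Build a channel name such as 'fast-4.9' from a tier and a version."""
    major, minor, _ = parse_version(version)
    return f"{tier}-{major}.{minor}"


if __name__ == "__main__":
    print(classify_upgrade("4.8.11", "4.8.18"))  # patch
    print(classify_upgrade("4.8.18", "4.9.0"))   # minor
    print(channel_for("fast", "4.9.0"))          # fast-4.9
```

A `minor` result corresponds to the case where the channel has to be switched before the update is offered.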


== Upgrade failures
In the worst-case scenario, we may have to restore etcd from the backups taken at the start [4] or reinstall a node entirely.

=== Troubleshooting
There are many possible ways an upgrade can fail midway through.

- Check the monitoring alerts currently firing. These can often hint at the problem.
- Often individual nodes fail to apply the new MachineConfig changes. These show up when examining the `MachineConfigPool` status.
- This might require a manual reboot of that particular node.
- This might require killing pods on that particular node.
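
Examining the `MachineConfigPool` status can be sketched in Python as a filter over the pool conditions. This is a hedged illustration only: the function name `pools_needing_attention` is hypothetical, and `SAMPLE` is made-up data shaped like the output of `oc get machineconfigpool -o json`, reduced to the `Updated` and `Degraded` conditions.

```python
# Hypothetical sample, reduced to the fields this sketch reads.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "master"},
            "status": {"conditions": [
                {"type": "Updated", "status": "True"},
                {"type": "Degraded", "status": "False"},
            ]},
        },
        {
            "metadata": {"name": "worker"},
            "status": {"conditions": [
                {"type": "Updated", "status": "False"},
                {"type": "Degraded", "status": "True"},
            ]},
        },
    ]
}


def pools_needing_attention(mcp_list: dict) -> list[str]:
    """Return names of pools that are Degraded or not yet Updated."""
    flagged = []
    for pool in mcp_list.get("items", []):
        conditions = {
            cond["type"]: cond["status"]
            for cond in pool.get("status", {}).get("conditions", [])
        }
        degraded = conditions.get("Degraded") == "True"
        updated = conditions.get("Updated") == "True"
        if degraded or not updated:
            flagged.append(pool["metadata"]["name"])
    return flagged


if __name__ == "__main__":
    print(pools_needing_attention(SAMPLE))  # ['worker']
```

Against a live cluster, the dictionary would instead come from parsing the JSON output of `oc get machineconfigpool -o json`. A pool that is still rolling out is also reported, so a flagged pool is a prompt to look closer, not proof of a failure.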