🌱 clusterctl: add flag to skip lagging provider check in ApplyCustomPlan #11196
/area clusterctl
@JoelSpeed @Jont828 Please take a look when you are available 🙏
@JoelSpeed @Jont828 Are you able to take a look? Let me know if you need more context on anything.
Question: Could adding this flag and using it in the cluster-api-operator lead to issues? Would it then be possible to have providers running on different contract versions, which could cause problems? Upgrading with clusterctl upgrades all providers at the same time rather than each one independently (where some could still be running the old version while others are already upgraded).
I don't think this should be an issue. We talked about it in the cluster-api-operator office hours and determined that adding a flag in clusterctl to skip this check was probably the best way forward. We have different CRs for each provider, and when users upgrade their providers they typically move all versions at the same time. There could potentially be a delay between reconciliations for each provider, but we haven't noticed any issues running this as a fork and upgrading the Azure CAPI/CABPK/KCP providers. Definitely open to better approaches though! I can stop by the CAPI office hours to discuss the issue we are having in more detail.
I personally have some concerns about disabling this check, considering that the added value of clusterctl is to ensure the health of the management cluster as a whole. TBH, I think that if someone asks the operator to upgrade a single provider, this operation must be put on hold if it can lead to an invalid cluster (leaning on "when users upgrade their providers they typically move all versions at the same time" seems weak). The upgrade operation for the providers involved should unblock itself once the user is upgrading enough providers to reach a valid state. The issue seems to be in "Each one of these controllers doesn't have knowledge of the other providers, and doesn't pass in enough information to clusterctl to be able to complete this check successfully", but I think there are ways to get around this, since AFAIK for each provider there is a CR with a desired state/target version.
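A minimal sketch of the gating idea described above, assuming each provider CR exposes the contract implied by its desired version. The `ProviderCR` type, its fields, and the `blockers` function are hypothetical names for illustration and are not taken from the cluster-api-operator code base.

```go
package main

import "fmt"

// ProviderCR is a hypothetical, trimmed-down view of a provider custom
// resource. DesiredContract is an assumed field holding the contract implied
// by the desired/target version in the CR spec.
type ProviderCR struct {
	Name            string
	DesiredContract string
}

// blockers returns the names of providers whose desired contract differs from
// the target contract. An empty result means every provider is headed to the
// same contract, so the upgrade would leave the management cluster valid.
func blockers(crs []ProviderCR, target string) []string {
	var out []string
	for _, cr := range crs {
		if cr.DesiredContract != target {
			out = append(out, cr.Name)
		}
	}
	return out
}

func main() {
	crs := []ProviderCR{
		{Name: "cluster-api", DesiredContract: "v1beta1"},
		{Name: "infrastructure-azure", DesiredContract: "v1alpha4"},
	}

	// The core provider upgrade is put on hold while another provider's
	// desired state still points at the old contract.
	fmt.Println("blocked by:", blockers(crs, "v1beta1"))

	// Once the user bumps the remaining provider, nothing blocks the upgrade.
	crs[1].DesiredContract = "v1beta1"
	fmt.Println("blocked by:", blockers(crs, "v1beta1"))
}
```

In this sketch a reconciler would requeue while `blockers` returns a non-empty list, rather than skipping the check outright.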
Hey @fabriziopandini, sorry for the delayed response. Thank you for providing more context on this check. We don't want users to be able to break their cluster if they have a misconfiguration, so I think a PR should be made in the CAPI operator instead of CAPI to get this to pass. I will go ahead and close this PR.
What this PR does / why we need it:
Clusterctl runs a pre-check to see if any other providers are lagging behind the target contract before creating an upgrade plan. In the current implementation of cluster-api-operator, there are multiple controllers, each reconciling a different provider type. Each one of these controllers doesn't have knowledge of the other providers, and doesn't pass in enough information to clusterctl to be able to complete this check successfully. This PR adds a flag and UpgradeOption to allow us to skip this pre-check and successfully upgrade the provider.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
This fixes issue 570 in the cluster-api-operator repo.
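The pre-check and the proposed opt-out can be sketched roughly as follows. This is a simplified illustration, not the clusterctl implementation: the `Provider` and `UpgradeOptions` types, the `SkipLaggingProviderCheck` field name, and both functions are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Provider is a hypothetical, simplified record of an installed provider and
// the API contract it currently supports.
type Provider struct {
	Name     string
	Contract string
}

// UpgradeOptions is a hypothetical options struct. The field name is an
// assumption and may not match the option proposed in the PR.
type UpgradeOptions struct {
	SkipLaggingProviderCheck bool
}

// laggingProviders returns the sorted names of installed providers that are
// not part of the upgrade and would stay on a contract other than target.
func laggingProviders(installed []Provider, upgrading map[string]bool, target string) []string {
	var lagging []string
	for _, p := range installed {
		if !upgrading[p.Name] && p.Contract != target {
			lagging = append(lagging, p.Name)
		}
	}
	sort.Strings(lagging)
	return lagging
}

// preCheck fails when any provider would be left behind the target contract,
// unless the caller explicitly opted out of the check.
func preCheck(installed []Provider, upgrading map[string]bool, target string, opts UpgradeOptions) error {
	if opts.SkipLaggingProviderCheck {
		return nil
	}
	if lagging := laggingProviders(installed, upgrading, target); len(lagging) > 0 {
		return fmt.Errorf("providers lagging behind contract %s: %v", target, lagging)
	}
	return nil
}

func main() {
	installed := []Provider{
		{Name: "cluster-api", Contract: "v1alpha4"},
		{Name: "infrastructure-azure", Contract: "v1alpha4"},
	}
	// A per-provider controller only knows about the provider it reconciles.
	upgrading := map[string]bool{"cluster-api": true}

	fmt.Println(preCheck(installed, upgrading, "v1beta1", UpgradeOptions{}))
	fmt.Println(preCheck(installed, upgrading, "v1beta1", UpgradeOptions{SkipLaggingProviderCheck: true}))
}
```

With the check enabled, a controller that submits only its own provider fails the pre-check because the other installed provider is still on the old contract. With the option set, the call succeeds and the caller takes responsibility for upgrading the remaining providers.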