Skip to content

Commit

Permalink
sync machine labels
Browse files Browse the repository at this point in the history
  • Loading branch information
ykakarap committed Jan 9, 2025
1 parent 2c7219e commit 804d45d
Show file tree
Hide file tree
Showing 8 changed files with 103 additions and 25 deletions.
3 changes: 3 additions & 0 deletions controllers/alias.go
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ type MachineReconciler struct {
WatchFilterValue string

RemoteConditionsGracePeriod time.Duration

SyncMachineLabels []string
}

func (r *MachineReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, options controller.Options) error {
Expand All @@ -81,6 +83,7 @@ func (r *MachineReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manag
ClusterCache: r.ClusterCache,
WatchFilterValue: r.WatchFilterValue,
RemoteConditionsGracePeriod: r.RemoteConditionsGracePeriod,
SyncMachineLabels: r.SyncMachineLabels,
}).SetupWithManager(ctx, mgr, options)
}

Expand Down
11 changes: 7 additions & 4 deletions docs/book/src/reference/api/metadata-propagation.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,10 @@ Top-level labels that meet a specific cretria are propagated to the Node labels
- `.labels.[label-meets-criteria]` => `Node.labels`
- `.annotations` => Not propagated.

Label should meet one of the following criteria to propagate to Node:
- Has `node-role.kubernetes.io` as prefix.
- Belongs to `node-restriction.kubernetes.io` domain.
- Belongs to `node.cluster.x-k8s.io` domain.
Labels on a Machine that match the list of target domains provided by the combined list of the default sync machine label domains and `--additional-sync-machine-labels` on the manager are synced down to the node.
The default sync machine label domains are:
- `node-role.kubernetes.io`
- `node-restriction.kubernetes.io`
- `*.node-restriction.kubernetes.io`
- `node.cluster.x-k8s.io`
- `*.node.cluster.x-k8s.io`
16 changes: 11 additions & 5 deletions docs/proposals/20220927-label-sync-between-machine-and-nodes.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,9 @@ With the "divide and conquer" principle in mind this proposal aims to address th

### Goals

- Support label sync from Machine to the linked Kubernetes node, limited to `node-role.kubernetes.io/` prefix and the `node-restriction.kubernetes.io` domain.
- Support label sync from Machine to the linked Kubernetes node.
- Support syncing labels from Machine to the linked Kubernetes node for the Cluster API owned `node.cluster.x-k8s.io` domain.
- Support to configure the domains of labels to be synced from the Machine to the Node.

### Non-Goals

Expand Down Expand Up @@ -98,14 +99,18 @@ While designing a solution for syncing labels between Machine and underlying Kub

### Label domains & prefixes

The idea of scoping synchronization to a well defined set of labels is a first answer to security/concurrency concerns; labels to be managed by Cluster API have been selected based on following criteria:
The list of labels that will be synced between the Machines and the Nodes is configured by a flag on the manager that list the domains or subdomains a label should belong to.

- The `node-role.kubernetes.io` label has been used widely in the past to identify the role of a Kubernetes Node (e.g. `node-role.kubernetes.io/worker=''`). For example, `kubectl get node` looks for this specific label when displaying the role to the user.
If no list is provided it defaults to:

- The `node-restriction.kubernetes.io/` domain is recommended in the [Kubernetes docs](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) for things such as workload placement; please note that in this case
- `node-role.kubernetes.io`: The `node-role.kubernetes.io` label has been used widely in the past to identify the role of a Kubernetes Node (e.g. `node-role.kubernetes.io/worker=''`). For example, `kubectl get node` looks for this specific label when displaying the role to the user.

- `node-restriction.kubernetes.io`, `*.node-restriction.kubernetes.io`: The `node-restriction.kubernetes.io/` domain is recommended in the [Kubernetes docs](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) for things such as workload placement; please note that in this case
we are considering the entire domain, thus also labels like `example.node-restriction.kubernetes.io/fips=true` fall in this category.

- Cluster API owns a specific domain: `node.cluster.x-k8s.io`.
- `node.cluster.x-k8s.io`, `*node.cluster.x-k8s.io`: Cluster API owns a specific domain: `node.cluster.x-k8s.io`.

**Note**: The list of domains and subdomains provided would be additive to the default list. The labels that meet the default list will always be synced to the Nodes.

#### Synchronization of CAPI Labels

Expand Down Expand Up @@ -163,3 +168,4 @@ Users could also implement their own label synchronizer in their tooling, but th

- [ ] 09/27/2022: First Draft of this document
- [ ] 09/28/2022: First Draft of this document presented in the Cluster API office hours meeting
- [ ] 01/09/2025: Update to support configurable label syncing Ref:[11657](https://github.com/kubernetes-sigs/cluster-api/issues/11657)
2 changes: 2 additions & 0 deletions internal/controllers/machine/machine_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ type Reconciler struct {

RemoteConditionsGracePeriod time.Duration

SyncMachineLabels []string

controller controller.Controller
recorder record.EventRecorder
externalTracker external.ObjectTracker
Expand Down
24 changes: 14 additions & 10 deletions internal/controllers/machine/machine_controller_noderef.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ package machine
import (
"context"
"fmt"
"path/filepath"
"strings"

"github.com/pkg/errors"
Expand Down Expand Up @@ -127,7 +128,7 @@ func (r *Reconciler) reconcileNode(ctx context.Context, s *scope) (ctrl.Result,
// Compute labels to be propagated from Machines to nodes.
// NOTE: CAPI should manage only a subset of node labels, everything else should be preserved.
// NOTE: Once we reconcile node labels for the first time, the NodeUninitializedTaint is removed from the node.
nodeLabels := getManagedLabels(machine.Labels)
nodeLabels := r.getManagedLabels(machine.Labels)

// Get interruptible instance status from the infrastructure provider and set the interruptible label on the node.
interruptible := false
Expand Down Expand Up @@ -178,24 +179,27 @@ func (r *Reconciler) reconcileNode(ctx context.Context, s *scope) (ctrl.Result,

// getManagedLabels gets a map[string]string and returns another map[string]string
// filtering out labels not managed by CAPI.
func getManagedLabels(labels map[string]string) map[string]string {
func (r *Reconciler) getManagedLabels(labels map[string]string) map[string]string {
managedLabels := make(map[string]string)
for key, value := range labels {
dnsSubdomainOrName := strings.Split(key, "/")[0]
if dnsSubdomainOrName == clusterv1.NodeRoleLabelPrefix {
managedLabels[key] = value
}
if dnsSubdomainOrName == clusterv1.NodeRestrictionLabelDomain || strings.HasSuffix(dnsSubdomainOrName, "."+clusterv1.NodeRestrictionLabelDomain) {
managedLabels[key] = value
}
if dnsSubdomainOrName == clusterv1.ManagedNodeLabelDomain || strings.HasSuffix(dnsSubdomainOrName, "."+clusterv1.ManagedNodeLabelDomain) {
if r.allowedDomainSync(dnsSubdomainOrName) {
managedLabels[key] = value
}
}

return managedLabels
}

func (r *Reconciler) allowedDomainSync(domain string) bool {
for _, v := range r.SyncMachineLabels {
matched, _ := filepath.Match(v, domain)
if matched {
return true
}
}
return false
}

// summarizeNodeConditions summarizes a Node's conditions and returns the summary of condition statuses and concatenate failed condition messages:
// if there is at least 1 semantically-negative condition, summarized status = False;
// if there is at least 1 semantically-positive condition when there is 0 semantically negative condition, summarized status = True;
Expand Down
51 changes: 45 additions & 6 deletions internal/controllers/machine/machine_controller_noderef_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -700,7 +700,6 @@ func TestSummarizeNodeConditions(t *testing.T) {
}

func TestGetManagedLabels(t *testing.T) {
// Create managedLabels map from known managed prefixes.
managedLabels := map[string]string{
clusterv1.NodeRoleLabelPrefix + "/anyRole": "",

Expand All @@ -713,24 +712,64 @@ func TestGetManagedLabels(t *testing.T) {
"custom-prefix." + clusterv1.NodeRestrictionLabelDomain: "",
clusterv1.NodeRestrictionLabelDomain + "/anything": "",
"custom-prefix." + clusterv1.NodeRestrictionLabelDomain + "/anything": "",
"test.foo.com": "",
}

// Append arbitrary labels.
allLabels := map[string]string{
additionalLabels := map[string]string{
"foo": "",
"bar": "",
"company.xyz/node.cluster.x-k8s.io": "not-managed",
"gpu-node.cluster.x-k8s.io": "not-managed",
"company.xyz/node-restriction.kubernetes.io": "not-managed",
"gpu-node-restriction.kubernetes.io": "not-managed",
"wrong.test.foo.com": "",
}

allLabels := map[string]string{}
for k, v := range managedLabels {
allLabels[k] = v
}
for k, v := range additionalLabels {
allLabels[k] = v
}

g := NewWithT(t)
got := getManagedLabels(allLabels)
g.Expect(got).To(BeEquivalentTo(managedLabels))
tests := []struct {
name string
syncMachineLabels []string
allLabels map[string]string
managedLabels map[string]string
}{
{
name: "sync defined labels",
syncMachineLabels: []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
"test.foo.com",
},
allLabels: allLabels,
managedLabels: managedLabels,
},
{
name: "sync all labels",
syncMachineLabels: []string{"*"},
allLabels: allLabels,
managedLabels: allLabels,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
g := NewWithT(t)
r := &Reconciler{
SyncMachineLabels: tt.syncMachineLabels,
}
got := r.getManagedLabels(tt.allLabels)
g.Expect(got).To(BeEquivalentTo(tt.managedLabels))
})
}
}

func TestPatchNode(t *testing.T) {
Expand Down
7 changes: 7 additions & 0 deletions internal/controllers/machine/suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,13 @@ func TestMain(m *testing.M) {
APIReader: mgr.GetAPIReader(),
ClusterCache: clusterCache,
RemoteConditionsGracePeriod: 5 * time.Minute,
SyncMachineLabels: []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
},
}).SetupWithManager(ctx, mgr, controller.Options{MaxConcurrentReconciles: 1}); err != nil {
panic(fmt.Sprintf("Failed to start MachineReconciler: %v", err))
}
Expand Down
14 changes: 14 additions & 0 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@ var (
clusterResourceSetConcurrency int
machineHealthCheckConcurrency int
useDeprecatedInfraMachineNaming bool
syncMachineLabels []string
)

func init() {
Expand Down Expand Up @@ -254,11 +255,23 @@ func InitFlags(fs *pflag.FlagSet) {
"Use deprecated infrastructure machine naming")
_ = fs.MarkDeprecated("use-deprecated-infra-machine-naming", "This flag will be removed in v1.9.")

fs.StringArrayVar(&syncMachineLabels, "--additional-sync-machine-labels", []string{}, "The list of additional domains used to sync labels from machines to nodes.")

flags.AddManagerOptions(fs, &managerOptions)

feature.MutableGates.AddFlag(fs)
}

var (
defaultSyncMachineLabels = []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
}
)

// Add RBAC for the authorized diagnostics endpoint.
// +kubebuilder:rbac:groups=authentication.k8s.io,resources=tokenreviews,verbs=create
// +kubebuilder:rbac:groups=authorization.k8s.io,resources=subjectaccessreviews,verbs=create
Expand Down Expand Up @@ -565,6 +578,7 @@ func setupReconcilers(ctx context.Context, mgr ctrl.Manager, watchNamespaces map
ClusterCache: clusterCache,
WatchFilterValue: watchFilterValue,
RemoteConditionsGracePeriod: remoteConditionsGracePeriod,
SyncMachineLabels: append(defaultSyncMachineLabels, syncMachineLabels...),
}).SetupWithManager(ctx, mgr, concurrency(machineConcurrency)); err != nil {
setupLog.Error(err, "Unable to create controller", "controller", "Machine")
os.Exit(1)
Expand Down

0 comments on commit 804d45d

Please sign in to comment.