Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"method" metric label has unbound granularity #10208

Open
jacekn opened this issue Jul 17, 2023 · 2 comments · May be fixed by #12638
Open

"method" metric label has unbound granularity #10208

jacekn opened this issue Jul 17, 2023 · 2 comments · May be fixed by #12638
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@jacekn
Copy link

jacekn commented Jul 17, 2023

What happened:

I noticed a problem with metrics that can lead to metric granularity explosion.
In default config ingress will accept any HTTP method, even nonsense one. HTTP methods are exposed as values of the method label in metrics. This means that requests with random, unique, http methods ("AAA", "AAB", "AAC" and so on) will cause metric numbers to explode causing monitoring issues or possibly even affecting the controller itself (DoS).

What you expected to happen:

Prometheus best practice is for label values to be bound so I expected non-standard HTTP request to be handled in a way that doesn't affect metrics.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

This was tested with version 1.8.0

Kubernetes version (use kubectl version):

Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.14", GitCommit:"0018fa8af8ffaff22f36d3bd86289753ca61da81", GitTreeState:"clean", BuildDate:"2023-05-17T16:21:16Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS EC2

  • OS (e.g. from /etc/os-release): Ubuntu 20.04

  • Kernel (e.g. uname -a): 5.11.0-1021-aws

  • Install tools: kubeadm

  • Kubectl info: Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-15T02:15:11Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}

  • How was the ingress-nginx-controller installed:

We installed using helm template that was stored in a git repo:

helm template --values=- --namespace=ingress-nginx --repo https://kubernetes.github.io/ingress-nginx ingress-nginx ingress-nginx
  • Current State of the controller:

Controller is full functional and it works without problems. The only visible problem is very high number of metrics being exported.

  • Current state of ingress object, if applicable:

All Ingress objects are fine and the controller handles requests fine.

How to reproduce this issue:

  1. Deploy controller and add an ingress object.
  2. Confirm what HTTP methods are exposed as metrics using the following prometheus query:
sum(nginx_ingress_controller_requests) by (method)
  1. Make a a few http reqeuests with random HTTP methods:
curl --request "AAA" https://example.com
curl --request "AAB" https://example.com
curl --request "AAC" https://example.com
curl --request "AAD" https://example.com
  1. Confirm that new methods now appear as label values:
sum(nginx_ingress_controller_requests{method=~"AA.*"}) by (method)
  1. Above label values persist until controller is restarted

Anything else we need to know:

Kubernetes security team's assessment is that this but doesn't require CVE.

With regards to the best solution - we could have valid codes listed in the controller code and report any methods not matching the allowlist as method="invalid" or similar.

@jacekn jacekn added the kind/bug Categorizes issue or PR as related to a bug. label Jul 17, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 17, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-actions
Copy link

This is stale, but we won't close it automatically, just bare in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Aug 17, 2023
jacekn added a commit to jacekn/ingress-nginx that referenced this issue Jan 7, 2025
## What this PR does / why we need it:

This PR fixes kubernetes#10208 by checking whether the request method is valid.
For invalid methods we set `method="invalid_method"` label so that operators
can stil see traffic in the metrics but without unbound label value.

## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] CVE Report (Scanner found CVE and adding report)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation only

## How Has This Been Tested?

I tested by building the image locally using `make build && make image`
and running the image in minikube.
The change has a fairly narrow scope so should be low risk.

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [x] I've read the [CONTRIBUTION](https://github.com/kubernetes/ingress-nginx/blob/main/CONTRIBUTING.md) guide
- [x] I have added unit and/or e2e tests to cover my changes.
- [x] All new and existing tests passed.
@jacekn jacekn linked a pull request Jan 7, 2025 that will close this issue
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
Development

Successfully merging a pull request may close this issue.

2 participants