[POC] exporter batcher - byte size based batching #12017

Draft
wants to merge 1 commit into
base: main


@sfc-gh-sili sfc-gh-sili commented Jan 5, 2025

Description

This is a POC of batching based on serialized byte size.

Configuration is supported via an additional field to MaxSizeConfig.

type MaxSizeConfig struct {
	MaxSizeItems int `mapstructure:"max_size_items"`
	MaxSizeBytes int `mapstructure:"max_size_bytes"`
}

We will validate that at most one of the above fields is specified (TODO) and switch between item-count-based and byte-size-based batching accordingly.
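Since the validation is still a TODO, here is a minimal sketch of what it might look like. The Validate and mode methods are hypothetical and not part of this PR; the sketch assumes a value of 0 means "not specified".

```go
package main

import (
	"errors"
	"fmt"
)

// MaxSizeConfig mirrors the struct shown in the description.
type MaxSizeConfig struct {
	MaxSizeItems int `mapstructure:"max_size_items"`
	MaxSizeBytes int `mapstructure:"max_size_bytes"`
}

// Validate is a hypothetical check (not in the PR): it rejects negative
// limits and configurations that set both limits at once.
// Assumption: 0 means the field was not specified.
func (c MaxSizeConfig) Validate() error {
	if c.MaxSizeItems < 0 {
		return errors.New("max_size_items must be non-negative")
	}
	if c.MaxSizeBytes < 0 {
		return errors.New("max_size_bytes must be non-negative")
	}
	if c.MaxSizeItems > 0 && c.MaxSizeBytes > 0 {
		return errors.New("at most one of max_size_items and max_size_bytes may be set")
	}
	return nil
}

// mode is a hypothetical helper that reports which batching strategy
// a valid configuration selects.
func (c MaxSizeConfig) mode() string {
	switch {
	case c.MaxSizeBytes > 0:
		return "bytes"
	case c.MaxSizeItems > 0:
		return "items"
	default:
		return "none"
	}
}

func main() {
	for _, c := range []MaxSizeConfig{
		{MaxSizeItems: 1000},
		{MaxSizeBytes: 1 << 20},
		{MaxSizeItems: 1000, MaxSizeBytes: 1 << 20},
	} {
		if err := c.Validate(); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("mode:", c.mode())
	}
}
```

Treating 0 as "unset" keeps the zero-value config valid, but it also means an explicit limit of 0 cannot be told apart from an omitted one; pointer fields would be an alternative if that distinction matters.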

To get the byte size of OTLP protos, this PR updates pdata/internal/cmd/pdatagen/internal/templates/message.go.tmpl to expose a Size() method. This change applies to all pdatagen-generated files.

Performance

→ go test -bench=. -cpuprofile=cpu.prof -run Benchmark
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/collector/exporter/exporterhelper
cpu: Apple M1 Max
BenchmarkSplittingBasedOnItemCountManyLogs-10    	     576	   1988099 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogs-10     	     386	   3095374 ns/op
BenchmarkSplittingBasedOnItemCountHugeLog-10     	    2758	    466363 ns/op
BenchmarkSplittingBasedOnByteSizeHugeLog-10      	     410	   2788484 ns/op
PASS
ok  	go.opentelemetry.io/collector/exporter/exporterhelper	6.301s

The benchmark above tests two cases:

Case 1: merge-split 1000 logs, where each incoming log involves one merge and one split. Byte-based batching takes about 56% more time in this case (3,095,374 vs. 1,988,099 ns/op).
Case 2: merge-split a log that splits into 100 logs. Byte-based batching takes about 500% more time in this case (2,788,484 vs. 466,363 ns/op).

The CPU profile (pprof) shows that the majority of the time is spent calculating the byte size.
I tried reducing the number of byte-size calculations by caching the results in integers, but that did not improve performance (it seems the compiler or the proto library is smart enough to reuse previously calculated results).

Optimization

  • Instead of splitting precisely, simply put the new request into a new batch if it goes beyond capacity.
→ go test -bench=. -cpuprofile=cpu.prof -run Benchmark
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/collector/exporter/exporterhelper
cpu: Apple M1 Max
BenchmarkSplittingBasedOnItemCountManyLogs-10    	     574	   2013375 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogs-10     	     475	   2578009 ns/op
BenchmarkSplittingBasedOnItemCountHugeLog-10     	    2653	    446605 ns/op
BenchmarkSplittingBasedOnByteSizeHugeLog-10      	     428	   2762497 ns/op
PASS
ok  	go.opentelemetry.io/collector/exporter/exporterhelper	6.193s
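The optimization above (start a new batch instead of splitting precisely) can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the sizer interface, fakeRequest, batch, and greedyBatch are all hypothetical names, and fakeRequest simply returns a fixed number in place of a real serialized size.

```go
package main

import "fmt"

// sizer is a hypothetical stand-in for any request exposing Size().
type sizer interface{ Size() int }

// fakeRequest is a toy request whose serialized size is its own value.
type fakeRequest int

func (r fakeRequest) Size() int { return int(r) }

// batch accumulates requests and tracks their running byte total.
type batch struct {
	reqs  []sizer
	bytes int
}

// greedyBatch appends each request to the current batch unless that
// would push the batch past maxBytes, in which case the current batch
// is closed and the request starts a new one. Requests are never split,
// so a request larger than maxBytes ends up alone in an oversized batch.
func greedyBatch(reqs []sizer, maxBytes int) []batch {
	var out []batch
	var cur batch
	for _, r := range reqs {
		n := r.Size() // size is computed once per request
		if len(cur.reqs) > 0 && cur.bytes+n > maxBytes {
			out = append(out, cur)
			cur = batch{}
		}
		cur.reqs = append(cur.reqs, r)
		cur.bytes += n
	}
	if len(cur.reqs) > 0 {
		out = append(out, cur)
	}
	return out
}

func main() {
	reqs := []sizer{fakeRequest(40), fakeRequest(50), fakeRequest(30), fakeRequest(200), fakeRequest(10)}
	for i, b := range greedyBatch(reqs, 100) {
		fmt.Printf("batch %d: %d requests, %d bytes\n", i, len(b.reqs), b.bytes)
	}
}
```

The trade-off in this sketch is that batches can be under-filled and a single oversized request still exceeds the limit, in exchange for one size computation per request and no split work.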

Link to tracking issue

Fixes #

Testing

Documentation


codecov bot commented Jan 5, 2025

Codecov Report

Attention: Patch coverage is 5.88235% with 192 lines in your changes missing coverage. Please review.

Project coverage is 90.98%. Comparing base (306c939) to head (43e811c).
Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| exporter/exporterhelper/logs_batch.go | 3.75% | 77 Missing ⚠️ |
| exporter/exporterbatcher/config.go | 0.00% | 8 Missing and 4 partials ⚠️ |
| exporter/internal/queue/default_batcher.go | 75.00% | 2 Missing and 1 partial ⚠️ |
| exporter/exporterhelper/internal/request.go | 0.00% | 2 Missing ⚠️ |
| exporter/exporterhelper/logs.go | 0.00% | 2 Missing ⚠️ |
| exporter/exporterhelper/metrics.go | 0.00% | 2 Missing ⚠️ |
| exporter/exporterhelper/traces.go | 0.00% | 2 Missing ⚠️ |
| ...xporter/exporterhelper/xexporterhelper/profiles.go | 0.00% | 2 Missing ⚠️ |
| pdata/pcommon/generated_instrumentationscope.go | 0.00% | 2 Missing ⚠️ |
| pdata/pcommon/generated_resource.go | 0.00% | 2 Missing ⚠️ |

... and 43 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12017      +/-   ##
==========================================
- Coverage   91.70%   90.98%   -0.73%     
==========================================
  Files         455      455              
  Lines       24053    24253     +200     
==========================================
+ Hits        22058    22066       +8     
- Misses       1625     1812     +187     
- Partials      370      375       +5     


@sfc-gh-sili sfc-gh-sili force-pushed the sili-byte-size branch 2 times, most recently from 43e811c to 812f050 Compare January 8, 2025 01:38