-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Registry manifest and Schema diff #400
base: main
Are you sure you want to change the base?
Conversation
# Conflicts: # crates/weaver_semconv/src/group.rs
# Conflicts: # .clippy.toml # Cargo.toml # crates/weaver_semconv_gen/src/lib.rs # src/registry/search.rs # src/registry/stats.rs # src/registry/update_markdown.rs
…ed on the message
…cated::Renamed::renamed_to
# Conflicts: # crates/weaver_codegen_test/build.rs
I have a question regarding the format of the new Old approach (still supported for compatibility reasons):
or
or
With this, we can handle simple attribute renaming scenarios, as well as merge scenarios (e.g., A and B are renamed to C; Weaver will detect this automatically). However, we currently have no way to represent a split (e.g., A is renamed to B and C). So with the current implementation, the semconv author will need to set We could make this explicit in the format of the deprecated field and in the diff output. This would allow for migration documentation that more accurately reflects the desired changes. However, it still wouldn’t enable automatic downgrades in the schema processor for the split scenario (at least without logic taking into account some additional context). Question: Adding such an advanced definition for the deprecated field isn’t particularly complicated, so I don’t mind including it. What do you think? Are there other types of deprecations you’d like to codify? |
Co-authored-by: Jeremy Blythe <[email protected]>
Co-authored-by: Jeremy Blythe <[email protected]>
…l-weaver into generate-otel-schema
…cies to be part of the public API.
…cies to be part of the public API.
semconv_version: v1.26.0 | ||
changes: | ||
attributes: | ||
- type: deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit and personal taste, so definitely not blocking
having
- name: http.server_name
type: deprecated
looks more readable to me, maybe because it's closer to how we write attribute definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I will make this change in the documentation.
and the deprecation note is stored in the note attribute. | ||
- `removed`: An item in the baseline registry was removed in the head | ||
registry. The name of the removed item is stored in the name attribute. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll eventually need a way to describe other (breaking) changes such as attribute type, metric unit or instrument change.
Maybe we can add other
category that would be used for change types that we don't formally support yet? This would allow someone to look at the diff and understand that there was some change and maybe some manual intervention is necessary.
Otherwise, the lack of any mention could be perceived as 'there was no change, all looks good and compatible'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a note and describe how this could be represented as a potential future extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still haven't finished some of the more mechanical parts, but did review the diff algorithm, deserialization core code.
I think we have an issue around attribute identity we may need to solve. PTAL at comments.
} | ||
|
||
/// Returns the number of non-fatal errors, or 1 if the result is a fatal error, 0 otherwise. | ||
pub fn error_count(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we use num_errors
?
@@ -22,7 +22,7 @@ Brief: Span attributes used by non-OTLP exporters to represent OpenTelemetry Sco | |||
- Examples: [ | |||
"io.opentelemetry.contrib.mongodb", | |||
] | |||
- Deprecated: use the `otel.scope.name` attribute. | |||
- Deprecated: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this test be updated?
@@ -39,7 +39,7 @@ Brief: {{ resource.brief }} | |||
- Sampling relevant: {{ attribute.sampling_relevant }} | |||
{%- endif %} | |||
{%- if attribute.deprecated %} | |||
- Deprecated: {{ attribute.deprecated }} | |||
- Deprecated: {{ attribute.deprecated.note }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to put this in the changelog or have some kind of migration guide for this then.
/// Creates a new string attribute. | ||
/// Note: This constructor is used for testing purposes. | ||
#[cfg(test)] | ||
pub(crate) fn string<S: AsRef<str>>(name: S, brief: S, note: S) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: name, brief and note would have to be the same concrete type for this helper function.
You should either have three different type parameters for them or use something like impl Into<String>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I like the proposal.
|
||
/// Sets the deprecated field of the attribute. | ||
/// Note: This method is used for testing purposes. | ||
#[cfg(test)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these helper methods. May replicate the pattern elsewhere.
|
||
/// Get the attributes of the resolved telemetry schema. | ||
#[must_use] | ||
pub fn attribute_map(&self) -> HashMap<&str, &Attribute> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's some faulty assumptions here.
- This assumes that all Attributes MUST show up in an attribute group. That is true for Semantic Conventions (due to custom policy), but not enforced by weaver.
- This assumes that all attribute groups are registries. This is not true for Semantic Conventions. We have some attribute groups that just "share" attributes for other non-attribute groups. These may be selected before the registry attributes.
Sadly, I think we likely need a more rigid identifier here than name. I haven't had a chance to look through the rest of the code for implications, but I don't think we can rely on this method to only give us unique attributes or attributes from the registry.
/// | ||
/// Note: At the moment (2024-12-30), I don't know a better way to identify | ||
/// the "registry" attributes other than by checking if the group ID starts | ||
/// with "registry.". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because "registry" is a semantic convention concept, not a weaver one. Should we elevate this to weaver itself?
|
||
/// Get the groups of a specific type from the resolved telemetry schema. | ||
#[must_use] | ||
pub fn groups(&self, group_type: GroupType) -> HashMap<String, &Group> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Any reason why this one uses String
instead of &str
like the other helpers?
semconv_specs: Vec<(String, SemConvSpec)>, | ||
) -> SemConvRegistry { | ||
) -> Result<SemConvRegistry, Error> { | ||
static VERSION_REGEX: LazyLock<Regex> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We should use: https://docs.rs/semver/latest/semver/ and URL parser that can give us the last element of the path to send to the parser.
Note: The scope of this PR has been reduced to focus only focus on the schema diff feature. Github issues have been created to track the features that have been postponed #482, #483.
This PR implements the command
registry diff
, see the following example:In this example, the diff is displayed in markdown format. The following formats are supported: json, yaml, markdown, ansi, ansi_stats
A detailed description of the schema diff data model and the diffing process is visible here.
Notes:
weaver_otel_schema
is not essential for this PR; it was initially included as part of the preparations for theregistry schema-update
command. We have decided to implement this command in a future PR. However, for simplicity, I prefer to keep the preparation code in place instead of removing it. Same thing forall_changes
inweaver_version
.List of modifications to apply to the semantic conventions repository after the release of the Weaver containing the current PR:
registry-manifest.yaml
file with the version of the next release.Closes: #186