Read the OpenTelemetry project contributing guide for general information about the project.
- Schema modeling rules: Standardize solutions to common schema modeling problems for a more consistent user experience and less bike shedding.
- Project tooling: Project tooling used to improve consistency, reduce bugs, and improve the maintainer experience.
- Pull request: Guidelines for submitting pull requests.
The following rules are enforced when modeling the configuration schema.
These rules must be enforced when making changes to the schema; however, changes to the rules are permitted. In other words, the versioning policy guarantees do not apply to the rules: so long as the versioning policy is not violated for existing properties, the modeling rules guiding new properties may change.
The schema is modeled using JSON schema draft 2020-12.
This is reflected in top level schema documents by setting "$schema": "https://json-schema.org/draft/2020-12/schema".
The schema semantics should follow a "what you see is what you get" (or WYSIWYG) philosophy. Another way to frame this is that implementations should minimize the amount of magic that occurs as a result of the absence of an optional property.
For example, in the following snippet .meter_provider is not set and the semantics indicate that a noop meter provider should be used, rather than some default meter provider definitions with a periodic metric reader and OTLP exporter. WYSIWYG: there is no .meter_provider and you get the closest equivalent to an empty / null / unset meter provider.
```yaml
resource:
tracer_provider: ...
# meter_provider: ...
logger_provider: ...
propagators: ...
```

It's not always possible to follow this philosophy. For example, when `.attribute_limits` is not set, the SDK defaults to using `.attribute_limits.attribute_count_limit: 128`, whereas a rigid interpretation of WYSIWYG would suggest the default should be no limit. In this case we have competing concerns: WYSIWYG is in tension with a safe default experience for users, and with the defaults as indicated in the specification.
If it seems difficult to define default semantics that satisfy WYSIWYG, consider making the property required, which prevents the need to define default semantics.
NOTE: Doing extra configuration work when properties are not explicitly configured is attractive because it reduces required configuration. However, it increases the cognitive load on users, who now have to reference potentially multiple external documents to understand what to expect in the absence of a property. WYSIWYG results in configuration files that are more verbose but are more self-documenting. A terse user experience can be achieved by leveraging a higher order templating tool like helm, where a simplified set of configuration parameters can be interpreted by a template engine to output the full configuration file. For example, the OpenTelemetry Collector Helm Chart accepts a number of presets like .presets.hostMetrics.enabled: true, which produce much more verbose collector configuration YAML.
Only properties that are described in opentelemetry-specification or semantic-conventions are modeled in the schema. However, it's acceptable to allow additional properties specific to a particular language or implementation, which are not covered by the schema. Model these by setting "additionalProperties": true (see JSON schema additionalProperties). Types should set "additionalProperties": false by default unless requested by an OpenTelemetry component maintainer to support additional options.
To remove redundant information from the configuration file, prefixes for data produced by each of the providers will be removed from configuration options. For example, under the meter_provider configuration, metric readers are identified by the word readers rather than by metric_readers. Similarly, the prefix span_ will be dropped for tracer_provider configuration, and logrecord for logger_provider.
Properties defined in the schema should be lower snake case.
enum values should be lower snake case.
When a property requires pattern matching, use wildcards `*` (match any number of any character, including none) and `?` (match any single character) instead of regex. If a single property with wildcards is likely to be insufficient to model the configuration requirements, accept `included` and `excluded` properties, each with an array of strings with wildcard entries. The wildcard entries should be joined with a logical OR. If `included` is not specified, assume that all entries are included. Apply `excluded` after applying `included`. Examples:

- Given `excluded: ["a*"]`: Match all except values starting with `a`.
- Given `included: ["a*", "b*"]`, `excluded: ["ab*"]`: Match any value starting with `a` or `b`, excluding values starting with `ab`.
- Given `included: ["a", "b"]`, `excluded: ["a"]`: Match values equal to `b`.
Properties should be modeled using the most appropriate data structures and types to represent the information. This may result in a schema that doesn't support env var substitution for the standard env vars where a type mismatch occurs. For example, the OTEL_RESOURCE_ATTRIBUTES env var is modeled as a string, consisting of a comma separated list of key-value pairs, which is not the natural way to model a mapping of key-value pairs in JSON schema.
In instances where there is a type mismatch between the JSON schema and equivalent standard env var, an alternative version of the property may be provided to resolve the mismatch. For example, resource attributes are configured at .resource.attributes, but .resource.attributes_list is available with a format matching that of OTEL_RESOURCE_ATTRIBUTES. Alternative properties are reserved for cases where there is a demonstrated need for platforms to be able to participate in configuration and there is no reasonable alternative.
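To illustrate the format mismatch, a hypothetical helper that parses the `OTEL_RESOURCE_ATTRIBUTES` comma-separated string format into a structured list might look like the following. The function name and the array-of-name/value-objects output shape are assumptions of this sketch:

```python
def parse_resource_attributes_list(value: str) -> list:
    """Parse an OTEL_RESOURCE_ATTRIBUTES-style string ("k1=v1,k2=v2")
    into a list of {"name": ..., "value": ...} objects."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty segments and trailing commas
        name, _, val = entry.partition("=")
        pairs.append({"name": name.strip(), "value": val.strip()})
    return pairs
```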
When a type requires a configurable list of name-value pairs (i.e. resource attributes, HTTP headers), model using an array of objects, each with `name` and `value` properties. While an array of name-value objects is slightly more verbose than an object where each key-value is an entry, the former is preferred because it:
- Avoids user input as keys, which ensures conformity with the snake_case properties rule.
- Allows both the names and the values to be targets for env var substitution. For example:

```yaml
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            headers:
              - name: ${AUTHORIZATION_HEADER_NAME:-api-key}
                value: ${AUTHORIZATION_HEADER_VALUE}
```
JSON schema has two related but subtly different concepts involved in indicating the requirement level of properties and values:
- `type` of `null`: When a property includes a type of `null` along with other allowed types (i.e. `"type": ["string", "null"]`), it indicates that even if the property key is present, the value may be omitted. This is useful in a variety of situations:
  - When modeling properties with primitive types that are candidates for env var substitution, since allowing `null` means that the configuration is valid even if the referenced env var is undefined.
  - When modeling objects that do not require any properties. In these cases, either no properties are required, or there are no properties and the presence of the property key expresses the desired state.
- `required`: When a property is `required`, the key must be included in the object or the configuration is invalid. Properties should be required when there is no well-known default semantic (i.e. it's not clear what the behavior is when the property is absent).
For example:
```yaml
tracer_provider:
  processors:
    - simple:
        exporter:
          console:
  limits:
    attribute_value_length_limit: ${OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT}
```

- `tracer_provider` is not required. When omitted, a noop tracer provider is used.
- `tracer_provider`'s type is `object`. There's no sensible tracer provider which does not minimally set one entry in `processors`.
- `exporter` is required. A simple processor without an exporter is invalid.
- `exporter`'s type is `object`. Setting `exporter` to `null` or any non-object value is invalid.
- `console`'s type is `["object", "null"]`. The console exporter has no properties, and we should not force the user to set an empty object (i.e. `console: {}`).
- `limits` is not required. When omitted, default span limits are used.
- `limits`'s type is `object`. If a user includes the `limits` property, they must set at least one property. Setting `limits` to `null` is invalid.
- `attribute_value_length_limit` is not required. If omitted, no attribute length limits are applied.
- `attribute_value_length_limit`'s type is `["integer", "null"]`. If null (i.e. because the `OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT` env var is unset), no attribute length limits are applied.
If a property is not required, it must include `defaultBehavior` describing the semantics when it is omitted. To differentiate between present but `null` and absent, non-required properties may optionally include `nullBehavior` describing the semantics when it is `null`.
If a property is required and nullable, it must include `nullBehavior` describing the semantics when it is `null`.
JSON schema's schema composition keywords (allOf, anyOf, oneOf) offer a tempting mechanism for object-oriented style inheritance and polymorphic patterns. However, JSON schema code generation tools may struggle or not support these keywords. Therefore, these keywords should be used judiciously, and should not be used to extend object types.
For example:
```json
{
  "Shape": {
    "title": "Shape",
    "type": "object",
    "properties": {
      "sides": { "type": "integer" }
    }
  },
  "Square": {
    "title": "Square",
    "type": "object",
    "allOf": [{ "$ref": "#/$defs/Shape" }],
    "properties": {
      "side_length": { "type": "integer" }
    }
  }
}
```

`allOf` is used in the `Square` type to extend the parent `Shape` type, such that `Square` has properties `sides` and `side_length`. Avoid this type of use.
Another example:
```json
{
  "AttributeNameValue": {
    "title": "AttributeNameValue",
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "value": {
        "oneOf": [
          { "type": "string" },
          { "type": "number" },
          { "type": "boolean" },
          { "type": "null" },
          { "type": "array", "items": { "type": "string" } },
          { "type": "array", "items": { "type": "boolean" } },
          { "type": "array", "items": { "type": "number" } }
        ]
      },
      "type": {
        "$ref": "#/$defs/AttributeType"
      }
    },
    "required": [
      "name", "value"
    ]
  },
  "AttributeType": {
    "type": ["string", "null"],
    "enum": [
      null,
      "string",
      "bool",
      "int",
      "double",
      "string_array",
      "bool_array",
      "int_array",
      "double_array"
    ]
  }
}
```

`oneOf` is used to specify that the `value` property matches the Attribute AnyValue definition, and is either a primitive or an array of primitives. This type of use is acceptable but should be used judiciously.
Because properties of type array are not candidates for env var substitution, it typically does not make sense to allow the array to be empty. In some cases, an empty array likely corresponds to an accidental misconfiguration which should be detected and reported as an error. In other cases, an empty array is meaningless and the user is better off omitting the property altogether.
For these reasons, minItems is typically set to 1.
NOTE: there are some valid cases where an empty array is semantically meaningful, such as when setting ExplicitBucketHistogram.boundaries.
The JSON schema title and description annotations are keywords that are not involved in validation. Instead, they act as a mechanism to help schemas be self-documenting, and may be used by code generation tools.
description must be included on all properties. Schema validation project tooling enforces this.
`title` should be omitted. Schema compilation project tooling ensures consistent type titles by including `title` for the root OpenTelemetryConfiguration type, and letting the `$defs` key be the title for all other types.
In JSON Schema, a schema is a document, and a subschema is contained in a surrounding parent schema. Subschemas can be nested in various ways:
A property can directly describe a complex set of requirements including nested structures:
```json
{
  "properties": {
    "shape": {
      "type": "object",
      "properties": {
        "color": { "type": "string" },
        "sides": { "type": "integer" }
      }
    }
  }
}
```

Or a property can reference a subschema residing in a schema document's `$defs`:
```json
{
  "properties": {
    "shape": {
      "$ref": "#/$defs/Shape"
    }
  },
  "$defs": {
    "Shape": {
      "type": "object",
      "properties": {
        "color": { "type": "string" },
        "sides": { "type": "integer" }
      }
    }
  }
}
```

In order to promote stylistic consistency and allow for reuse of concepts, object and enum types should be defined either as a top level schema document or as a subschema in a schema document's `$defs`.
SDK extension plugin interfaces should be modeled consistently for improved user experience and to facilitate support for custom implementations via the PluginComponentProvider mechanism.
The SpanExporter schema is typical:
```json
...
"SpanExporter": {
  "type": "object",
  "additionalProperties": {
    "type": ["object", "null"]
  },
  "minProperties": 1,
  "maxProperties": 1,
  "properties": {
    "otlp_http": { "$ref": "common.json#/$defs/OtlpHttpExporter" }
    // additional built-in exporters omitted for brevity
  }
},
...
```

Which results in YAML like:
```yaml
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp_http: # set the span exporter to be the built-in OTLP http exporter
            endpoint: http://example/v1/traces
---
tracer_provider:
  processors:
    - batch:
        exporter:
          my_custom_exporter: # set the span exporter to be a custom exporter with name my_custom_exporter
            property: value
```

- The `SpanExporter` type requires exactly one property to be set (`"minProperties": 1`, `"maxProperties": 1`), and requires that property to have a value of type `object` or `null` (`"additionalProperties": {"type": ["object", "null"]}`).
- The property key refers to the `name` used to register a PluginComponentProvider.
- The property value is passed as configuration as `properties` when PluginComponentProvider Create Component is called.
- `SpanExporter` has `properties` describing the names and schemas of built-in span exporters (i.e. those defined explicitly in the specification).
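The "exactly one named plugin" shape can also be checked by hand. The following hypothetical helper mirrors what the `minProperties` / `maxProperties` / `additionalProperties` constraints enforce; it is a sketch, not the project's tooling:

```python
def validate_plugin_selection(node: dict) -> None:
    """Check the SDK extension plugin shape: exactly one key,
    whose value is an object (mapping) or null."""
    if len(node) != 1:
        raise ValueError("exactly one plugin must be selected")
    (name, config), = node.items()
    if config is not None and not isinstance(config, dict):
        raise ValueError(f"{name}: configuration must be a mapping or null")
```

For example, `{"otlp_http": {"endpoint": "..."}}` and `{"my_custom_exporter": None}` pass, while selecting two exporters at once raises an error.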
Schema validation project tooling enforces that types labeled isSdkExtensionPlugin: true are modeled consistently as described above.
This repository has a variety of tooling assisting with the development of the JSON schema and associated artifacts.
Much of the project tooling is written in JavaScript and uses Node to run. Before making changes to the project:
- Install the latest LTS release of Node. For example, using nvm under Linux run:

  ```shell
  nvm install --lts
  ```

- Install tooling packages:

  ```shell
  npm install
  ```
To run all project tooling targets:

```shell
# `all` is the default target and can be optionally omitted
make all
```

The JSON schema source is maintained as a set of YAML files in the schema directory. All files except those starting with `meta_` are source schema files.
Maintaining the source in YAML instead of JSON makes it easier to read and write multi-line property descriptions, which we lean on frequently to document complex property semantics.
It also allows us to maintain metadata that does not fit neatly into the JSON schema:
- `isSdkExtensionPlugin` (boolean): Types labeled as SDK extension plugins are called out in documentation and have a consistent schema.
- `defaultBehavior` (string): Describes the behavior when a property is omitted. If `nullBehavior` is not set, `defaultBehavior` also describes the behavior when a property is null. `defaultBehavior` is required for all non-required properties.
- `nullBehavior` (string): Describes the behavior when a property is `null`. This can optionally be set on non-required properties to differentiate behavior when a property is present but `null`, vs. omitted entirely. `nullBehavior` is required for all required properties that are nullable.
- `enumDescriptions` (map<string, string>): Contains descriptions for each value of an `enum` type. `enumDescriptions` must be present on all `enum` types, and each enum value must have a corresponding entry.
JSON schema source files are compiled into a single JSON schema output file at opentelemetry-configuration.schema.json using:
```shell
make compile-schema
```

Having a single output file simplifies integration with tooling, as it eliminates the need to resolve external `$ref`s.
The output file's property `description` fields are enriched with additional information from the schema, which code generation tooling can leverage for improved documentation.
The compile-schema target performs schema validation, failing with descriptive error messages if violations are found.
It's important to run compile-schema before committing changes to the schema as uncommitted changes will cause the build to fail. The default make target will run compile-schema automatically.
Before compiling the schema, the compile-schema target performs a variety of validation checks to help ensure schema consistency and quality:
- Validate all properties have a `description`.
- Validate all enum types have an `enumDescriptions` entry (see above) and all enum values have a corresponding entry.
- Validate all types labeled `isSdkExtensionPlugin: true` are modeled consistently.
- Validate there are no subschemas (i.e. all types are defined at the top level of `$defs`).
- Validate `defaultBehavior` and `nullBehavior` are used correctly:
  - All non-required properties must have a `defaultBehavior`.
  - All required properties must have a `nullBehavior` if they are nullable.
The meta_schema_language_{language}.yaml files in schema track the language implementation status for a particular language.
meta_schema_language_{language}.yaml file content looks like:
```yaml
latestSupportedFileFormat: 1.0.0-rc.1
typeSupportStatuses:
  - type: Base2ExponentialBucketHistogramAggregation
    status: supported # the support status, see below for allowed enum values
    # notes: Uncomment to include optional additional notes on the implementation.
    propertyOverrides:
      - property: record_min_max
        status: ignored
  # other types omitted for brevity
```

Notes:
- `.latestSupportedFileFormat` (string): The latest version of opentelemetry-configuration supported by the `{language}`.
- `.typeSupportStatuses` ([]object): An array with entries for each type in the JSON schema.
- `.typeSupportStatuses[].type` (string): The name of the JSON schema type. Maintained automatically by build tooling.
- `.typeSupportStatuses[].status` (enum): Captures the support status of the type and all properties except overrides in `.typeSupportStatuses[].propertyOverrides`. See enum options below.
- `.typeSupportStatuses[].notes` (string): Contains optional additional notes on the implementation.
- `.typeSupportStatuses[].propertyOverrides` ([]object): An array of properties which have different support statuses than the overall type as recorded in `.typeSupportStatuses[].status`. Omitted for enum types.
- `.typeSupportStatuses[].propertyOverrides[].property` (string): The name of the property whose support status is overridden.
- `.typeSupportStatuses[].propertyOverrides[].status` (enum): The overridden support status. See enum options below.
- `.typeSupportStatuses[].enumOverrides` ([]object): An array of enum values which have different support statuses than the overall type as recorded in `.typeSupportStatuses[].status`. Omitted for non-enum types.
- `.typeSupportStatuses[].enumOverrides[].enumValue` (string): The name of the enum value whose support status is overridden.
- `.typeSupportStatuses[].enumOverrides[].status` (enum): The overridden support status. See enum options below.
- Status enum options, applicable to `.typeSupportStatuses[].status` and `.typeSupportStatuses[].propertyOverrides[].status`:
  - `unknown`: Language maintainer has not yet recorded a status.
  - `supported`: The type / property is supported by the language implementation.
  - `not_implemented`: The type / property is not parsed / recognized by the language implementation because the concept is not yet implemented but should be eventually.
  - `not_applicable`: The type / property is not parsed / recognized by the language implementation because the concept is not applicable. E.g. C++ specific instrumentation for Java.
  - `ignored`: The type / property is not parsed / recognized by the language implementation despite the concept being available in the language's programmatic configuration API.
Tooling ensures that the contents of these files are consistent with the contents of the JSON schema:
```shell
make fix-language-implementations
```

The fix-language-implementations target synchronizes the contents of these files as follows:
- If a language implementation is known (i.e. defined in the constant array `KNOWN_LANGUAGES` in language-implementations.js) but does not have a `meta_schema_language_{language}.yaml` file, add it.
- If a `meta_schema_language_{language}.yaml` exists for a language not in `KNOWN_LANGUAGES`, delete it.
- For each language implementation file:
  - If a type exists in the JSON schema and not in the language implementation file, add it.
  - If a type exists in the language implementation file and not in the JSON schema, delete it.
  - For each property in a type's `propertyOverrides`, if the property does not exist in the JSON schema, delete it.
  - For each value in a type's `enumOverrides`, if the value does not exist in the JSON schema, delete it.
When this target adds new entries to `meta_schema_language_{language}.yaml`, they are stubbed out with TODO placeholders. Contributors adding new schema types and properties should update these with sensible values.
It's important to run fix-language-implementations before committing changes to the schema as uncommitted changes will cause the build to fail. The default make target will run fix-language-implementations automatically.
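The reconciliation steps above could be sketched roughly as follows. The function name, input shapes, and the `"unknown"` status stub are assumptions of this sketch; the real tooling lives in language-implementations.js and stubs new entries with TODO placeholders:

```python
def sync_language_file(schema_types: dict, lang_entries: list) -> list:
    """schema_types maps type name -> set of property names in the JSON schema;
    lang_entries is the typeSupportStatuses array from a language file.
    Returns the reconciled typeSupportStatuses array."""
    by_type = {e["type"]: e for e in lang_entries}
    result = []
    for name, props in schema_types.items():
        # Stub types present in the schema but missing from the language file.
        entry = by_type.get(name, {"type": name, "status": "unknown"})
        # Drop property overrides that no longer exist in the schema.
        overrides = [
            o for o in entry.get("propertyOverrides", []) if o["property"] in props
        ]
        if overrides:
            entry["propertyOverrides"] = overrides
        else:
            entry.pop("propertyOverrides", None)
        result.append(entry)  # types absent from the schema are dropped entirely
    return result
```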
snippets contains small targeted configuration files illustrating different scenarios. These are used to supplement documentation and as a library of valid configuration files which implementations can use in testing.
Each snippet file name follows the pattern `<JsonSchemaType>_<snake_case_description>.yaml`:

- `<JsonSchemaType>`: Refers to the name of a JSON schema type as referenced in types.
- `<snake_case_description>`: A short description of what the snippet demonstrates, in lower snake case.
- File content must validate against the top level `OpenTelemetryConfiguration` type.
- File content must contain `# SNIPPET_START`, which marks the start of the relevant portion of the snippet. The content which follows is validated against the `<JsonSchemaType>` type in the file name. NOTE: The same number of whitespace characters preceding `# SNIPPET_START` are stripped from later lines before validation.
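The marker-and-dedent behavior described in the last bullet could be implemented roughly like this (a hypothetical helper, not the project's actual tooling):

```python
MARKER = "# SNIPPET_START"

def extract_snippet(yaml_text: str) -> str:
    """Return the validated portion of a snippet file: everything after the
    marker line, dedented by the marker's own indentation."""
    lines = yaml_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.lstrip(" ")
        if stripped.startswith(MARKER):
            indent = len(line) - len(stripped)
            # Strip the same number of leading spaces from each later line.
            return "\n".join(l[indent:] for l in lines[i + 1:])
    raise ValueError("snippet file missing # SNIPPET_START marker")
```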
For example, the content for snippet file OtlpHttpMetricExporter_use_base2_exponential_histogram looks like:
```yaml
file_format: "1.0-rc.2"
meter_provider:
  readers:
    - periodic:
        exporter:
          otlp_http:
            # SNIPPET_START
            endpoint: http://localhost:4317
            default_histogram_aggregation: base2_exponential_bucket_histogram
```

Notes:

- The snippet has `<JsonSchemaType>` of `OtlpHttpMetricExporter` and `<snake_case_description>` of `use_base2_exponential_histogram`, or "use base2 exponential histogram" in plain english.
- The portion after `# SNIPPET_START` is validated against the `OtlpHttpMetricExporter` type:

  ```yaml
  endpoint: http://localhost:4317
  default_histogram_aggregation: base2_exponential_bucket_histogram
  ```

To validate all snippets:
```shell
make validate-snippets
```

schema-docs.md contains generated markdown summarizing a variety of useful information about the JSON schema and language implementation status in an easy to consume format.
To generate:
```shell
make generate-markdown
```

It's important to run generate-markdown before committing changes to the schema as uncommitted changes will cause the build to fail. The default make target will run generate-markdown automatically.
To validate starter template examples against the JSON schema:
```shell
make validate-examples
```

Failures in the validate-examples target will cause the build to fail. The default make target will run validate-examples automatically.
A PR is ready to merge when:
- It has at least 1 approval from codeowners (TODO: bump to 2 when we have more codeowners)
- There is no `request changes` review from the codeowners
- If the PR changes the schema, at least one example is updated to illustrate the change
- All required status checks pass
- It has been tagged with any applicable labels:
  - `breaking`: applied to PRs that qualify as breaking changes according to the versioning policy, including breaking changes to experimental features which are allowed in minor versions.