Data item normalizer type validation is overly strict #3892

@rooperuu

dlt version

1.25.0

Describe the problem

Some operations on DltSource work only if the configured JSON normalizer is the default, dlt.common.normalizers.json.relational. For example, accessing max_table_nesting or root_key fails with any other normalizer, such as dlt.common.normalizers.json.relational_no_coercion. We observed this while trying to switch to a custom normalizer that extends the default relational normalizer by adding side effects, without changing the resulting tables or rows.

Expected behavior

We expected these operations to work with any normalizer that is compatible with the default relational normalizer, such as one that extends it.
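
To make the distinction concrete, here is a minimal sketch of a strict check versus a compatibility check. It uses stand-in classes only: RelationalNormalizer, SideEffectNormalizer, strict_check and compatible_check are hypothetical names for illustration, not dlt code, and we have not confirmed how dlt performs its validation internally.

```python
class RelationalNormalizer:
    """Stand-in for the default relational normalizer (hypothetical, not dlt code)."""

    def normalize(self, item: dict) -> dict:
        # Return a copy of the row unchanged.
        return dict(item)


class SideEffectNormalizer(RelationalNormalizer):
    """Hypothetical custom normalizer: same rows as the parent, plus a side effect."""

    def __init__(self) -> None:
        self.seen = 0

    def normalize(self, item: dict) -> dict:
        self.seen += 1  # side effect only; the output is unchanged
        return super().normalize(item)


def strict_check(normalizer: object) -> bool:
    # Accepts only the exact default class and rejects subclasses.
    return type(normalizer) is RelationalNormalizer


def compatible_check(normalizer: object) -> bool:
    # Accepts the default class and anything derived from it.
    return isinstance(normalizer, RelationalNormalizer)


custom = SideEffectNormalizer()
print(strict_check(custom))      # False
print(compatible_check(custom))  # True
```

Under this assumption, a check along the lines of compatible_check would accept our custom normalizer, while one along the lines of strict_check rejects it even though the produced rows are identical.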

Steps to reproduce

The error can be reproduced like this:

import os
import dlt

# Select a non-default JSON normalizer through the environment.
os.environ["SCHEMA__JSON_NORMALIZER"] = '{"module": "dlt.common.normalizers.json.relational_no_coercion"}'

@dlt.source
def data():
    yield from []

print(data().max_table_nesting)

This example raises InvalidJsonNormalizer. With the SCHEMA__JSON_NORMALIZER definition removed, this just prints None.

Operating system

macOS

Runtime environment

Local

Python version

3.12

dlt data source

Empty DltSource works, see example.

dlt destination

No response

Other deployment details

No response

Additional information

No response

Labels

bug (Something isn't working)

Status

In Progress
