DuckLake: duckdb:/// catalog URI incorrectly applies META_TYPE 'sqlite', breaking DuckDB-backed catalogs #3870

@Analect

dlt version

1.24.0

Describe the problem

When using a DuckDB-backed DuckLake catalog (catalog="duckdb:///catalog.duckdb"), pipeline.run() fails with:

dlt.destinations.exceptions.DestinationConnectionError:
Failed to load DuckLake table data
Failed to execute query "PRAGMA journal_mode=WAL": file is not a database

build_attach_statement in dlt/destinations/impl/ducklake/sql_client.py handles drivername='duckdb' identically to 'sqlite': it always appends META_TYPE 'sqlite', META_JOURNAL_MODE 'WAL', and META_BUSY_TIMEOUT 1000. With META_TYPE 'sqlite', DuckLake opens the catalog file through SQLite. On a DuckDB-format file, the SQLite pragma PRAGMA journal_mode=WAL then fails with "file is not a database".

The DuckLakeCredentials.__init__ docstring explicitly lists "duckdb:///catalog.duckdb" as a valid catalog URI, so this is a documented use case that currently fails.

Expected behavior

A DuckDB-backed catalog should produce an ATTACH statement without META_TYPE or the other SQLite-specific options:

ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb (DATA_PATH 's3://bucket/prefix/') 
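The expected behavior amounts to making the metadata options depend on the catalog driver. The following is a minimal, standalone sketch of that idea and not dlt's actual implementation: the function name sketch_attach_options is hypothetical, and only the 'sqlite' and 'duckdb' drivers discussed in this issue are covered.

```python
from typing import List


def sketch_attach_options(drivername: str, storage_url: str) -> str:
    """Hypothetical helper (not part of dlt): build the option list of the ATTACH statement."""
    options: List[str] = [f"DATA_PATH '{storage_url}'"]
    if drivername == "sqlite":
        # SQLite-specific metadata options, taken from the unpatched output in this issue
        options += [
            "META_TYPE 'sqlite'",
            "META_JOURNAL_MODE 'WAL'",
            "META_BUSY_TIMEOUT 1000",
        ]
    elif drivername != "duckdb":
        # Other catalog backends are outside the scope of this sketch
        raise ValueError(f"driver not covered by this sketch: {drivername}")
    # For 'duckdb' no META_* options are added, so DuckLake opens the file natively
    return ", ".join(options)


opts = sketch_attach_options("duckdb", "s3://bucket/prefix/")
print(f"ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb ({opts})")
```

With the 'duckdb' driver the sketch emits only DATA_PATH, which matches the statement shown above.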

Steps to reproduce

dlt version: 1.24.0

The attach statement logic can be tested in isolation without a live DuckLake instance:

from dlt.common.configuration.specs.connection_string_credentials import (
    ConnectionStringCredentials,
)
from dlt.destinations.impl.ducklake.sql_client import DuckLakeSqlClient


def _catalog(uri):
    # Parse the catalog URI into resolved connection-string credentials
    c = ConnectionStringCredentials(uri)
    c.resolve()
    return c


# This should NOT contain META_TYPE for a duckdb:/// catalog
stmt = DuckLakeSqlClient.build_attach_statement(
    ducklake_name="mydb",
    catalog=_catalog("duckdb:///catalog.duckdb"),
    storage_url="s3://bucket/prefix/",
)
print(stmt)
assert "META_TYPE" not in stmt  # fails on unpatched dlt

Expected output:
ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb (DATA_PATH 's3://bucket/prefix/')

Actual output (unpatched):
ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb (DATA_PATH 's3://bucket/prefix/', META_TYPE 'sqlite', META_JOURNAL_MODE 'WAL', META_BUSY_TIMEOUT 1000)
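
To make the difference between the two statements explicit, the META_* options can be pulled out with a small regular expression. This is only an illustrative sketch: the helper name meta_options is hypothetical and the two strings are copied from the outputs above.

```python
import re
from typing import List


def meta_options(stmt: str) -> List[str]:
    """Hypothetical helper: list the META_* option names found in an ATTACH statement."""
    return re.findall(r"\bMETA_[A-Z_]+", stmt)


expected = "ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb (DATA_PATH 's3://bucket/prefix/')"
actual = (
    "ATTACH IF NOT EXISTS 'ducklake:catalog.duckdb' AS mydb "
    "(DATA_PATH 's3://bucket/prefix/', META_TYPE 'sqlite', "
    "META_JOURNAL_MODE 'WAL', META_BUSY_TIMEOUT 1000)"
)

print(meta_options(expected))  # []
print(meta_options(actual))    # ['META_TYPE', 'META_JOURNAL_MODE', 'META_BUSY_TIMEOUT']
```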

Operating system

Linux

Runtime environment

Virtual Machine

Python version

3.13

dlt data source

No response

dlt destination

No response

Other deployment details

No response

Additional information

The reproducer is kept as a standalone script rather than the full pytest form. A formal test will be included in the accompanying PR.

Metadata

Assignees

Labels

bug (Something isn't working), destination (Issue with a specific destination)

Type

No type

Projects

Status

In Progress
