Commit 444ddf2

Add BatchAdapter to simplify using PhysicalExprAdapter / Projector to map RecordBatch between schemas (#19716)
I've now seen this pattern a couple of times: in our own codebase and while working on apache/datafusion-comet#3047. I was going to add an example, but I think adding an API that handles it for users is a better experience. This should also make it a bit easier to migrate from SchemaAdapter. In fact, I think it's possible to implement a SchemaAdapter using this as the foundation plus some shim code. This won't be available in DF 51 to ease migration, but it's easy enough to backport (just copy the code in this PR) for users who would find that helpful.
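To make the "map a batch from one schema to another" pattern concrete, here is a minimal, std-only sketch of the idea. It is not DataFusion's API: `ToyBatch`, `ToyBatchAdapter`, and its methods are hypothetical names invented for illustration, and the sketch assumes matching is done by field name with missing columns filled with nulls. It does no type casting and does not use `PhysicalExprAdapter` or `Projector`.

```rust
use std::collections::HashMap;

/// A toy column: nullable 64-bit integers (hypothetical, for illustration only).
type Column = Vec<Option<i64>>;

/// A toy stand-in for a record batch: field names plus one column per field.
#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    fields: Vec<String>,
    columns: Vec<Column>,
    num_rows: usize,
}

/// Hypothetical adapter: built once per (source, target) schema pair,
/// then reused for every batch read with that source schema.
struct ToyBatchAdapter {
    /// For each target field, the index of the matching source column, if any.
    projection: Vec<Option<usize>>,
    target_fields: Vec<String>,
}

impl ToyBatchAdapter {
    /// Resolve the name-based mapping up front so per-batch work is index lookups.
    fn new(source_fields: &[&str], target_fields: &[&str]) -> Self {
        let index: HashMap<&str, usize> = source_fields
            .iter()
            .enumerate()
            .map(|(i, name)| (*name, i))
            .collect();
        let projection = target_fields
            .iter()
            .map(|name| index.get(name).copied())
            .collect();
        Self {
            projection,
            target_fields: target_fields.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Produce a batch in the target schema: reorder matching columns,
    /// drop extra source columns, and fill missing ones with nulls.
    fn adapt_batch(&self, batch: &ToyBatch) -> ToyBatch {
        let columns = self
            .projection
            .iter()
            .map(|src| match src {
                Some(i) => batch.columns[*i].clone(),
                None => vec![None; batch.num_rows],
            })
            .collect();
        ToyBatch {
            fields: self.target_fields.clone(),
            columns,
            num_rows: batch.num_rows,
        }
    }
}
```

In this sketch, an adapter built from source fields `["a", "b"]` and target fields `["b", "c"]` keeps `b`, drops `a`, and emits an all-null `c`. Splitting construction from `adapt_batch` reflects the one-mapping, many-batches shape described above; whether the real `BatchAdapter` is structured the same way should be checked against the code in this PR.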
1 parent 7b0ed2d commit 444ddf2

3 files changed

Lines changed: 373 additions & 4 deletions

datafusion/datasource/src/schema_adapter.rs

Lines changed: 22 additions & 2 deletions
@@ -115,10 +115,20 @@ pub trait SchemaMapper: Debug + Send + Sync {
 
 /// Deprecated: Default [`SchemaAdapterFactory`] for mapping schemas.
 ///
-/// This struct has been removed. Use [`PhysicalExprAdapterFactory`] instead.
+/// This struct has been removed.
+///
+/// Use [`PhysicalExprAdapterFactory`] instead to customize scans via
+/// [`FileScanConfigBuilder`], i.e. if you had implemented a custom [`SchemaAdapter`]
+/// and passed that into [`FileScanConfigBuilder`] / [`ParquetSource`].
+/// Use [`BatchAdapter`] if you want to map a stream of [`RecordBatch`]es
+/// between one schema and another, i.e. if you were calling [`SchemaMapper::map_batch`] manually.
+///
 /// See `upgrading.md` for more details.
 ///
 /// [`PhysicalExprAdapterFactory`]: datafusion_physical_expr_adapter::PhysicalExprAdapterFactory
+/// [`FileScanConfigBuilder`]: crate::file_scan_config::FileScanConfigBuilder
+/// [`ParquetSource`]: https://docs.rs/datafusion-datasource-parquet/latest/datafusion_datasource_parquet/source/struct.ParquetSource.html
+/// [`BatchAdapter`]: datafusion_physical_expr_adapter::BatchAdapter
 #[deprecated(
     since = "52.0.0",
     note = "DefaultSchemaAdapterFactory has been removed. Use PhysicalExprAdapterFactory instead. See upgrading.md for more details."
@@ -178,10 +188,20 @@ impl SchemaAdapter for DeprecatedSchemaAdapter {
 
 /// Deprecated: The SchemaMapping struct held a mapping from the file schema to the table schema.
 ///
-/// This struct has been removed. Use [`PhysicalExprAdapterFactory`] instead.
+/// This struct has been removed.
+///
+/// Use [`PhysicalExprAdapterFactory`] instead to customize scans via
+/// [`FileScanConfigBuilder`], i.e. if you had implemented a custom [`SchemaAdapter`]
+/// and passed that into [`FileScanConfigBuilder`] / [`ParquetSource`].
+/// Use [`BatchAdapter`] if you want to map a stream of [`RecordBatch`]es
+/// between one schema and another, i.e. if you were calling [`SchemaMapper::map_batch`] manually.
+///
 /// See `upgrading.md` for more details.
 ///
 /// [`PhysicalExprAdapterFactory`]: datafusion_physical_expr_adapter::PhysicalExprAdapterFactory
+/// [`FileScanConfigBuilder`]: crate::file_scan_config::FileScanConfigBuilder
+/// [`ParquetSource`]: https://docs.rs/datafusion-datasource-parquet/latest/datafusion_datasource_parquet/source/struct.ParquetSource.html
+/// [`BatchAdapter`]: datafusion_physical_expr_adapter::BatchAdapter
 #[deprecated(
     since = "52.0.0",
     note = "SchemaMapping has been removed. Use PhysicalExprAdapterFactory instead. See upgrading.md for more details."

datafusion/physical-expr-adapter/src/lib.rs

Lines changed: 3 additions & 2 deletions
@@ -29,6 +29,7 @@
 pub mod schema_rewriter;
 
 pub use schema_rewriter::{
-    DefaultPhysicalExprAdapter, DefaultPhysicalExprAdapterFactory, PhysicalExprAdapter,
-    PhysicalExprAdapterFactory, replace_columns_with_literals,
+    BatchAdapter, BatchAdapterFactory, DefaultPhysicalExprAdapter,
+    DefaultPhysicalExprAdapterFactory, PhysicalExprAdapter, PhysicalExprAdapterFactory,
+    replace_columns_with_literals,
 };
