
feat: Add auto_explain mode #19316

Open
nuno-faria wants to merge 21 commits into apache:main from nuno-faria:auto_explain

Conversation

@nuno-faria
Contributor

Which issue does this PR close?

Rationale for this change

Allows users to check execution plans without needing to change the existing application.

The auto_explain mode can be enabled with the datafusion.explain.auto_explain config. In addition, there are two other configs:

  • datafusion.explain.auto_explain_output: sets the output location of the plans. Supports stdout, stderr, and a file path.
  • datafusion.explain.auto_explain_min_duration: only outputs plans whose duration is greater than this value (similar to Postgres' auto_explain.log_min_duration).

Example in datafusion-cli:

-- regular mode
> select 1;
+----------+
| Int64(1) |
+----------+
| 1        |
+----------+
1 row(s) fetched.

-- with auto_explain enabled (the plan is not actually part of the result, it is sent to stdout)
> select 1;
+-------------------+------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                         |
+-------------------+------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | ProjectionExec: expr=[1 as Int64(1)], metrics=[output_rows=1, elapsed_compute=21.50µs, output_bytes=8.0 B, output_batches=1] |
|                   |   PlaceholderRowExec, metrics=[]                                                                                             |
|                   |                                                                                                                              |
+-------------------+------------------------------------------------------------------------------------------------------------------------------+
+----------+
| Int64(1) |
+----------+
| 1        |
+----------+
1 row(s) fetched.

What changes are included in this PR?

  • Extended the existing AnalyzeExec operator to support the auto_explain mode.
  • Added new explain configs.
  • Wrapped plans in an AnalyzeExec operator when auto_explain is enabled.
  • Added tests.

Are these changes tested?

Yes.

Are there any user-facing changes?

New feature, but it's completely optional.

@github-actions github-actions Bot added documentation Improvements or additions to documentation core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate physical-plan Changes to the physical-plan crate labels Dec 14, 2025
Comment thread datafusion/physical-plan/src/analyze.rs Outdated
Comment on lines +127 to +128
self.cache =
Self::compute_properties(&self.input, Arc::clone(&self.input.schema()));
Contributor Author

Needs to be recomputed since the output changes.

Comment thread datafusion/physical-plan/src/analyze.rs Outdated
Comment on lines +256 to +263
if auto_explain {
    if duration.as_millis() >= auto_explain_min_duration as u128 {
        export_auto_explain(out, &auto_explain_output)?;
    }
    concat_batches(&inner_schema, &batches).map_err(DataFusionError::from)
} else {
    Ok(out)
}
Contributor Author

The auto_explain mode will return the input's batches instead of the analyze.

@nuno-faria
Contributor Author

let fd: &mut dyn Write = match output {
    "stdout" => &mut io::stdout(),
    "stderr" => &mut io::stderr(),
    _ => &mut OpenOptions::new().create(true).append(true).open(output)?,
Member

Does this need any kind of validation of the file location ?
Or it is left to the developer/admin to make sure it is a safe place ?

Member

Does this need some kind of synchronisation when a file path is used for the output ? Two or more DF sessions using the same config may try to write to the same file simultaneously.

Contributor Author

> Does this need any kind of validation of the file location ?
> Or it is left to the developer/admin to make sure it is a safe place ?

I think it's better to leave this to the user (either way, an error is returned).

Contributor Author

> Does this need some kind of synchronisation when a file path is used for the output ? Two or more DF sessions using the same config may try to write to the same file simultaneously.

I think the responsibility for this again falls on the user. Is it common to use multiple sessions over the same config?

Comment thread datafusion/physical-plan/src/analyze.rs
# test auto_explain

statement ok
set datafusion.explain.auto_explain_output = 'test_files/scratch/auto_explain.txt';
Member

Does something assert the contents of this output file ?
Does something remove this file at the end ?

Contributor Author

I originally tried to load the file into a table as CSV, since I think that is the only feasible way to check the contents, but because the file cannot be removed the result would always change. I mainly added these sqllogictests to check the "set ..." commands.

As for removing the file, I'm not sure it is possible. With that said, I don't think it is necessary since it's written to the sqllogictest temporary dir.

@github-actions github-actions Bot removed the documentation Improvements or additions to documentation label Feb 1, 2026
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Feb 1, 2026
@alamb alamb mentioned this pull request Mar 2, 2026
Contributor

@alamb alamb left a comment

Thanks @nuno-faria and @martin-g -- sorry for the delay in reviewing this PR

From my perspective, this is a valuable feature, but having the output go to stdout/stderr in the core library is not ideal.

Instead, it seems to me that auto explain belongs in the client applications themselves (e.g. datafusion-cli)

I tried to explain my thinking more here

@nuno-faria
Contributor Author

@alamb thanks for the review. I also share the concern about polluting stdout with plans; in my case I would point datafusion.explain.auto_explain_output at a file to avoid that when auto_explain is enabled. It was a way to simulate Postgres' auto_explain, but since DataFusion does not have a default log, the output had to be stdout/stderr (or fail if a file is not provided). As for exposing it only in datafusion-cli, that would be less useful in my case, as I'm using the library directly.

I will have to look at the Observer suggestion once I get the time.

@alamb
Contributor

alamb commented Mar 4, 2026

> pointed to some file to avoid this with the auto_explain enabled. It was a way to simulate the auto_explain of Postgres,

What about using log::info! or something (aka the existing logging hooks 🤔 )

@nuno-faria
Contributor Author

> What about using log::info! or something (aka the existing logging hooks 🤔 )

I added log::info (as well as log::error, log::warn, ...) instead of stdout/stderr.

I was also thinking of adding the query to the output, similar to Postgres' auto_explain. It would have to come from the unparsed logical plan since the original query is not available. However, some users might not need the query, while others might need info that is not output. So it might actually be better to leave this auto_explain feature for users to implement downstream as they wish.

@alamb
Contributor

alamb commented Mar 10, 2026

> What about using log::info! or something (aka the existing logging hooks 🤔 )

> I added log::info (as well as log::error, log::warn, ...) instead of stdout/stderr.

> I was also thinking of adding the query to the output, similar to Postgres' auto_explain. It would have to come from the unparsed logical plan since the original query is not available. However, maybe some users would not need the query, or maybe some would need other info that is not outputted. So maybe it might actually be better to leave this auto_explain feature for users to implement downstream as they wish.

I think if you wanted to make it easier to implement downstream maybe we would add some sort of API / callback thing (trait object?) that could get the info for auto_explains.

Then the default implementation could log with info and downstream users could override it if they wanted

@nuno-faria
Contributor Author

> I think if you wanted to make it easier to implement downstream maybe we would add some sort of API / callback thing (trait object?) that could get the info for auto_explains.

> Then the default implementation could log with info and downstream users could override it if they wanted

Sounds good. I'll try to tackle this soon.

@github-actions github-actions Bot added the execution Related to the execution crate label Mar 15, 2026
@github-actions github-actions Bot removed the execution Related to the execution crate label Mar 21, 2026
@nuno-faria
Contributor Author

@alamb I refactored the previous auto_explain mode to now use a new PlanObserver trait. It has a method to be called when the physical plan is built and another to be called when the plan completes, using the result of the analyze operator.

pub trait PlanObserver: Send + Sync + 'static + Debug {
    fn plan_created(
        &self,
        id: &str,
        logical_plan: &LogicalPlan,
        physical_plan: &Arc<dyn ExecutionPlan>,
    ) -> Result<()>;
    
    fn plan_executed(
        &self,
        id: &str,
        explain_result: RecordBatch,
        duration: Duration,
    ) -> Result<()>;
}

The AnalyzeExec operator can now also receive a callback to pass the result. I opted for this approach to avoid having the physical operators depend on the PlanObserver. This way we can reuse the code in the analyze operator to provide the already formatted output.

It can be used like this:

let plan_observer = DefaultPlanObserver::new("auto_explain.txt".to_owned(), 0);
let ctx = SessionContext::new().with_plan_observer(Arc::new(plan_observer));
ctx.sql("create table t (k int, v int)").await?.collect().await?;

// auto explain needs to be enabled
ctx.sql("set datafusion.explain.auto_explain = true").await?.collect().await?;

ctx.sql("select * from t where k = 1 or k = 2 order by v desc limit 5").await?.collect().await?;

The DefaultPlanObserver writes using the log crate or a file, and it looks like this:

QUERY: SELECT t.k, t.v FROM t WHERE ((t.k = 1) OR (t.k = 2)) ORDER BY t.v DESC NULLS FIRST LIMIT 5
DURATION: 0.689ms
EXPLAIN:
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                |
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortExec: TopK(fetch=5), expr=[v@1 DESC], preserve_partitioning=[false], metrics=[output_rows=0, elapsed_compute=13.40µs, output_bytes=0.0 B, output_batches=0, row_replacements=0] |
|                   |   FilterExec: k@0 = 1 OR k@0 = 2, metrics=[output_rows=0, elapsed_compute=1ns, output_bytes=0.0 B, output_batches=0, selectivity=N/A (0/0)]                                         |
|                   |     DataSourceExec: partitions=1, partition_sizes=[0], metrics=[]                                                                                                                   |
|                   |                                                                                                                                                                                     |
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

(If the sql feature is not enabled, the SQL query is not written.)

Let me know what you think about the API.
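To show how a downstream user might plug in their own observer, here is a self-contained sketch against a simplified stand-in for the trait: string arguments replace the real LogicalPlan, ExecutionPlan, and RecordBatch parameters so it compiles on its own, and CollectingObserver is an illustrative name, not part of the PR's API:

```rust
use std::fmt::Debug;
use std::sync::Mutex;
use std::time::Duration;

/// Simplified stand-in for the PR's PlanObserver trait; the real methods
/// take &LogicalPlan, &Arc<dyn ExecutionPlan>, and a RecordBatch.
trait PlanObserver: Send + Sync + Debug {
    fn plan_created(&self, id: &str, logical_plan: &str) -> Result<(), String>;
    fn plan_executed(
        &self,
        id: &str,
        annotated_plan: &str,
        duration: Duration,
    ) -> Result<(), String>;
}

/// Collects observations in memory instead of logging, e.g. for tests
/// or for shipping plans to a metrics backend.
#[derive(Debug, Default)]
struct CollectingObserver {
    events: Mutex<Vec<String>>,
}

impl PlanObserver for CollectingObserver {
    fn plan_created(&self, id: &str, logical_plan: &str) -> Result<(), String> {
        self.events
            .lock()
            .unwrap()
            .push(format!("created {id}: {logical_plan}"));
        Ok(())
    }

    fn plan_executed(
        &self,
        id: &str,
        annotated_plan: &str,
        duration: Duration,
    ) -> Result<(), String> {
        self.events
            .lock()
            .unwrap()
            .push(format!("executed {id} in {duration:?}: {annotated_plan}"));
        Ok(())
    }
}

fn main() {
    let obs = CollectingObserver::default();
    obs.plan_created("q1", "Projection: Int64(1)").unwrap();
    obs.plan_executed("q1", "ProjectionExec ...", Duration::from_millis(1))
        .unwrap();
    assert_eq!(obs.events.lock().unwrap().len(), 2);
    println!("observer recorded {} events", obs.events.lock().unwrap().len());
}
```

With the real trait, such an observer would be registered via SessionContext::with_plan_observer as shown in the usage snippet above.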

Comment thread datafusion/core/src/execution/plan_observer.rs
Contributor

@alamb alamb left a comment

Thanks @nuno-faria -- I went over this one again. I think it is getting close

Comment thread datafusion/core/src/execution/plan_observer.rs
pub struct DefaultPlanObserver {
    output: String,
    min_duration_ms: usize,
    /// stores a SQL representation of the logical plan, if the `sql` feature is enabled.
Contributor

Codex points out that on the error path this map never gets cleaned up -- so a bunch of errors will potentially cause this map to grow without bound. Maybe something we could clean up as a follow on PR (file a ticket to fix, etc)

Contributor Author

Good call, I completely missed that. I think we could use a queue that removes queries from that map once it reaches some limit.
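One way to cap that map along the lines of the queue idea is to pair it with a FIFO of ids and evict the oldest entry once a limit is reached. A minimal standalone sketch (BoundedPlanMap and its methods are hypothetical names, not the PR's code):

```rust
use std::collections::{HashMap, VecDeque};

/// Stores per-query SQL text keyed by plan id, capped at `limit` entries.
/// When full, the oldest id is evicted, so error paths that never call
/// `take` cannot grow the map without bound.
struct BoundedPlanMap {
    limit: usize,
    order: VecDeque<String>,
    entries: HashMap<String, String>,
}

impl BoundedPlanMap {
    fn new(limit: usize) -> Self {
        Self {
            limit,
            order: VecDeque::new(),
            entries: HashMap::new(),
        }
    }

    fn insert(&mut self, id: String, sql: String) {
        if self.entries.len() >= self.limit {
            // Evict the oldest entry to stay within the limit.
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.order.push_back(id.clone());
        self.entries.insert(id, sql);
    }

    /// Removing on the success path keeps the map small in the common case.
    fn take(&mut self, id: &str) -> Option<String> {
        self.order.retain(|x| x != id);
        self.entries.remove(id)
    }
}

fn main() {
    let mut map = BoundedPlanMap::new(2);
    map.insert("q1".into(), "select 1".into());
    map.insert("q2".into(), "select 2".into());
    map.insert("q3".into(), "select 3".into()); // evicts q1
    assert!(map.take("q1").is_none());
    assert_eq!(map.take("q3").as_deref(), Some("select 3"));
    println!("eviction ok");
}
```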

/// - `log::info`
/// - `log::debug`
/// - `log::trace`
/// - a file path: creates the file if it does not exist, or appends to it if it does.
Contributor

I think it would make more sense to have an explicit enum or something here especially as this can do file I/O

I worry that if someone accidentally passes in log:error (one :) that will write to a file named log:error

What do you think about using something like

enum Output {
  LogError, 
  LogWarn, 
...
  LogToFile(String),
}

?
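Sketching that suggestion as compilable Rust (the exact variant set is an assumption, and `emit` is a hypothetical dispatch helper, not the PR's code), the point is that a file destination must be constructed explicitly, so a typo like "log:error" can no longer silently become a file name:

```rust
/// Destinations for auto_explain output. The file case is an explicit
/// variant, unlike string parsing where a typo falls through to a path.
#[derive(Debug)]
enum Output {
    LogError,
    LogWarn,
    LogInfo,
    LogToFile(String),
}

impl Output {
    /// Hypothetical dispatch: route the text to the chosen destination.
    fn emit(&self, text: &str) -> std::io::Result<()> {
        use std::io::Write;
        match self {
            Output::LogError => eprintln!("ERROR: {text}"),
            Output::LogWarn => eprintln!("WARN: {text}"),
            Output::LogInfo => println!("INFO: {text}"),
            Output::LogToFile(path) => {
                let mut f = std::fs::OpenOptions::new()
                    .create(true)
                    .append(true)
                    .open(path)?;
                writeln!(f, "{text}")?;
            }
        }
        Ok(())
    }
}

fn main() -> std::io::Result<()> {
    // File output must be requested explicitly by the caller:
    let out = Output::LogToFile("auto_explain.txt".to_owned());
    assert!(matches!(out, Output::LogToFile(_)));
    Output::LogInfo.emit("plan ready")
}
```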

Contributor Author

Added.

function_factory,
cache_factory,
prepared_plans: HashMap::new(),
plan_observer: Some(Arc::new(DefaultPlanObserver::default())),
Contributor

This should only be set when auto_explain is enabled, right?

Contributor Author

Ideally yes, but then as far as I'm aware the user wouldn't be able to turn on the default auto_explain without also having to first set a plan_observer. But if it is better that way I can set it to None, let me know.

self.optimize_physical_plan(plan, session_state, |_, _| {})
let mut plan = self.optimize_physical_plan(plan, session_state, |_, _| {})?;

// setup the auto explain mode if necessary
Contributor

I wonder if this would be cleaner to add to handle_explain_or_analyze (or put it in its own method)?

Contributor Author

I moved this to the existing setup_auto_explain method so that everything is there. I'm not sure if it would fit in the handle_explain_or_analyze method since at that point the Analyze is still not in the plan.

Comment thread datafusion/physical-plan/src/analyze.rs Outdated
return_inner: bool,
}

/// Optionally used by the `AnalyzeExec` operator to callback it with the result.
Contributor

We won't be able to easily add new things to this callback

It might also be useful to point out somewhere that setting return_inner is effectively going to buffer the entire query output into RAM (which could be quite large)

I wonder if it would make sense to define a trait

trait AnalyzeObserver {
 ...
}

And then instead of

    /// If Some, passes the output of the analyze once it completes, as well as the duration.
    callback: Option<AnalyzeCallback>,

Do something like

    /// If Some, passes the output of the analyze once it completes, as well as the duration.
    callback: Option<Arc<dyn AnalyzeObserver>>,

That way if we want to add more methods (like adding something which sees the plan metrics) it would be easier to add without an API change.

Contributor Author

I replaced the callback with the following trait:

pub trait AnalyzeObserver: Debug + Send + Sync {
    /// Provides the EXPLAIN ANALYZE output (annotated plan) and the total duration.
    fn analyze_result_callback(
        &self,
        result: RecordBatch,
        duration: std::time::Duration,
    ) -> Result<()>;
}

fn plan_executed(
    &self,
    id: &str,
    explain_result: RecordBatch,
Contributor

is this result the actual output batches?

Or is it the annotated explain plan?

Contributor Author

The annotated plan. I changed the variable name to annotated_plan to make it clearer.

@alamb
Contributor

alamb commented Apr 8, 2026

Shoot -- I lost track of this one. I put it on my short list for review


Development

Successfully merging this pull request may close these issues.

Add auto_explain mode

3 participants