
Fix the potential bug of check_all_column_from_schema#5287

Merged
alamb merged 2 commits into apache:main from ygf11:index-of-column
Feb 17, 2023


@ygf11
Contributor

@ygf11 ygf11 commented Feb 15, 2023

Which issue does this PR close?

Closes #.

Rationale for this change

check_all_column_from_schema checks whether all of the given columns are in the schema.
It is based on index_of_column. Given a column, index_of_column has three possible results:

  1. No such field: returns a FieldNotFound error.
  2. Exactly one matching field: returns that field.
  3. More than one matching field: returns an Ambiguous reference error.

Cases 1 and 3 both return an error. check_all_column_from_schema needs to distinguish between these two cases, but currently it does not. This PR fixes that.
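
To illustrate the problem, here is a minimal sketch of how the two error cases can get conflated. The `Schema` and `SchemaError` types and the `naive_contains` helper are hypothetical, simplified stand-ins for illustration only; they are not the actual DataFusion definitions.

```rust
// Hypothetical, simplified stand-ins; not the real DataFusion types.
#[derive(Debug, PartialEq)]
enum SchemaError {
    FieldNotFound(String),
    AmbiguousReference(String),
}

struct Schema {
    // Unqualified field names; duplicates are possible (e.g. after a join).
    fields: Vec<String>,
}

impl Schema {
    /// Mirrors the three outcomes listed above.
    fn index_of_column(&self, name: &str) -> Result<usize, SchemaError> {
        let mut matches = self
            .fields
            .iter()
            .enumerate()
            .filter(|(_, f)| f.as_str() == name)
            .map(|(i, _)| i);
        match (matches.next(), matches.next()) {
            (None, _) => Err(SchemaError::FieldNotFound(name.to_string())),
            (Some(i), None) => Ok(i),
            (Some(_), Some(_)) => Err(SchemaError::AmbiguousReference(name.to_string())),
        }
    }
}

/// Naive check (assumed for illustration): any error is read as
/// "column is not in the schema", which hides the ambiguous case.
fn naive_contains(schema: &Schema, name: &str) -> bool {
    schema.index_of_column(name).is_ok()
}

fn main() {
    let schema = Schema {
        fields: vec!["a".into(), "b".into(), "b".into()],
    };
    assert_eq!(schema.index_of_column("a"), Ok(0));
    assert_eq!(
        schema.index_of_column("z"),
        Err(SchemaError::FieldNotFound("z".to_string()))
    );
    // "b" matches two fields, yet the naive check reports it as absent.
    assert!(!naive_contains(&schema, "b"));
}
```

In this sketch, a missing column and an ambiguous column both collapse to `false`, so the caller cannot tell them apart.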

What changes are included in this PR?

  • Refactor index_of_column_by_name to return Result<Option<usize>>.
  • Add is_column_from_schema, and check_all_column_from_schema calls it.
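
The shape of these two changes might look roughly like the sketch below. The function names come from this PR, but the signatures are simplified assumptions (plain `&str` names, a hypothetical `Schema` and `SchemaError`), and it assumes an ambiguous reference is propagated as an error rather than treated as "not found".

```rust
// Hypothetical, simplified stand-ins; not the real DataFusion types.
#[derive(Debug, PartialEq)]
enum SchemaError {
    AmbiguousReference(String),
}

struct Schema {
    fields: Vec<String>,
}

impl Schema {
    /// Ok(None): no such field. Ok(Some(i)): exactly one match.
    /// Err(..): more than one field matches.
    fn index_of_column_by_name(&self, name: &str) -> Result<Option<usize>, SchemaError> {
        let mut matches = self
            .fields
            .iter()
            .enumerate()
            .filter(|(_, f)| f.as_str() == name)
            .map(|(i, _)| i);
        match (matches.next(), matches.next()) {
            (first, None) => Ok(first),
            (_, Some(_)) => Err(SchemaError::AmbiguousReference(name.to_string())),
        }
    }

    /// "Absent" becomes Ok(false); "ambiguous" stays an error.
    fn is_column_from_schema(&self, name: &str) -> Result<bool, SchemaError> {
        Ok(self.index_of_column_by_name(name)?.is_some())
    }
}

/// Returns Ok(true) only if every column resolves to exactly one field.
fn check_all_column_from_schema(columns: &[&str], schema: &Schema) -> Result<bool, SchemaError> {
    for column in columns {
        if !schema.is_column_from_schema(column)? {
            return Ok(false);
        }
    }
    Ok(true)
}

fn main() {
    let schema = Schema {
        fields: vec!["a".into(), "b".into(), "b".into()],
    };
    assert_eq!(check_all_column_from_schema(&["a"], &schema), Ok(true));
    assert_eq!(check_all_column_from_schema(&["a", "z"], &schema), Ok(false));
    assert!(check_all_column_from_schema(&["b"], &schema).is_err());
}
```

With `Result<Option<usize>>`, "not found" is an ordinary `None` value, so only the genuinely exceptional case (ambiguity) travels through the error channel.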

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions github-actions bot added the logical-expr (Logical plan and expressions) and optimizer (Optimizer rules) labels Feb 15, 2023
@ygf11 ygf11 marked this pull request as ready for review February 15, 2023 11:27
@jackwener jackwener self-requested a review February 16, 2023 14:14
Member

@jackwener jackwener left a comment


A great improvement 👍. I noticed this problem in the past.

Perhaps more improvements can be made here in the future.
Some functions used for lookups return Result, which is a little strange; maybe we can use Result<Option<>> or Option<> instead.

@alamb
Contributor

alamb commented Feb 17, 2023

> Perhaps more improvements can be made here in the future.
> Some functions used for lookups return Result, which is a little strange; maybe we can use Result<Option<>> or Option<> instead.

I couldn't agree more 👍

As @ygf11 points out, a ticket I filed yesterday, I think, suggests related work: #5309

@alamb alamb merged commit f154a9a into apache:main Feb 17, 2023
@ursabot

ursabot commented Feb 17, 2023

Benchmark runs are scheduled for baseline = fed4019 and contender = f154a9a. f154a9a is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ygf11 ygf11 deleted the index-of-column branch February 21, 2023 10:50
jiangzhx pushed a commit to jiangzhx/arrow-datafusion that referenced this pull request Feb 24, 2023
* Fix the potential bug of check_all_column_from_schema

* rename contain_column to is_column_from_schema

Labels

logical-expr (Logical plan and expressions), optimizer (Optimizer rules)

4 participants