Commit 94cfa3d

gaogaotiantian authored and zhengruifeng committed
[SPARK-55132][INFRA] Upgrade numpy version on lint image
### What changes were proposed in this pull request?
Upgrade the numpy version on the lint image and fix some minor lint failures.

### Why are the changes needed?
When we run `pip install -r dev/requirements.txt` locally, we normally get the latest version of `numpy`. This creates a difference between the local dev environment and CI. The two should stay as close as possible so that local mypy results can be relied on.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
The mypy check passed locally.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #53913 from gaogaotiantian/upgrade-lint-numpy.

Authored-by: Tian Gao <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
1 parent c583ab2 commit 94cfa3d

3 files changed: 5 additions and 5 deletions

dev/spark-test-image/lint/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -91,7 +91,7 @@ RUN python3.11 -m pip install \
     'jinja2' \
     'matplotlib' \
     'mypy==1.8.0' \
-    'numpy==2.0.2' \
+    'numpy==2.4.1' \
     'numpydoc' \
     'pandas' \
     'pandas-stubs' \
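
The commit message motivates this pin by the drift between a local environment and CI. As a minimal sketch of how that drift could be checked, the snippet below extracts a `package==version` pin from Dockerfile text and looks up the locally installed version. The helper names `pinned_version` and `installed_version` and the `SAMPLE` excerpt are hypothetical and are not part of the Spark repository; only the standard library is used.

```python
import re
from importlib import metadata
from typing import Optional


def pinned_version(dockerfile_text: str, package: str) -> Optional[str]:
    """Return the version pinned as 'package==X.Y.Z' in the text, or None."""
    # Match a quoted requirement such as 'numpy==2.4.1'; the quote before the
    # name keeps 'numpydoc' from matching when we ask for 'numpy'.
    pattern = r"['\"]" + re.escape(package) + r"==([^'\"\s]+)['\"]"
    match = re.search(pattern, dockerfile_text)
    return match.group(1) if match else None


def installed_version(package: str) -> Optional[str]:
    """Return the locally installed version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical excerpt mirroring the pip install lines in the diff above.
SAMPLE = """RUN python3.11 -m pip install \\
    'mypy==1.8.0' \\
    'numpy==2.4.1' \\
    'numpydoc' \\
"""

if __name__ == "__main__":
    pinned = pinned_version(SAMPLE, "numpy")
    local = installed_version("numpy")
    status = "match" if pinned == local else "differ"
    print(f"numpy pinned={pinned} local={local} ({status})")
```

In practice the text would presumably be read from `dev/spark-test-image/lint/Dockerfile` rather than from an inline excerpt.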

python/pyspark/pandas/frame.py

Lines changed: 1 addition & 1 deletion
@@ -11293,7 +11293,7 @@ def _bool_column_labels(self, column_labels: List[Label]) -> List[Label]:
         """
         # Rely on dtype rather than spark type because columns that consist of bools and
         # Nones should be excluded if bool_only is True
-        return [label for label in column_labels if is_bool_dtype(self._psser_for(label))]  # type: ignore[arg-type]
+        return [label for label in column_labels if is_bool_dtype(self._psser_for(label))]

     def _result_aggregated(
         self, column_labels: List[Label], scols: Sequence[PySparkColumn]

python/pyspark/pandas/series.py

Lines changed: 3 additions & 3 deletions
@@ -1205,10 +1205,10 @@ def map(
                 else:
                     current = current.when(self.spark.column == F.lit(to_replace), value)

-            if hasattr(arg, "__missing__"):
-                tmp_val = arg[np._NoValue]  # type: ignore[attr-defined]
+            if isinstance(arg, dict) and hasattr(arg, "__missing__"):
+                tmp_val = arg[np._NoValue]
                 # Remove in case it's set in defaultdict.
-                del arg[np._NoValue]  # type: ignore[attr-defined]
+                del arg[np._NoValue]
                 current = current.otherwise(F.lit(tmp_val))
             else:
                 current = current.otherwise(F.lit(None).cast(self.spark.data_type))
