Commit 179cfa1

zhengruifeng authored and Yicong-Huang committed
[SPARK-55413][PYTHON][INFRA] Upgrade Python minimum dep test images to Ubuntu 24.04
### What changes were proposed in this pull request?
Upgrade Python minimum dep test images to Ubuntu 24.04

### Why are the changes needed?
to test against newer versions

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
PR builder with

```
default: '{"PYSPARK_IMAGE_TO_TEST": "python-minimum", "PYTHON_TO_TEST": "python3.10"}'
```

https://github.com/zhengruifeng/spark/actions/runs/21777887237/job/62837606695

```
default: '{"PYSPARK_IMAGE_TO_TEST": "python-ps-minimum", "PYTHON_TO_TEST": "python3.10"}'
```

https://github.com/zhengruifeng/spark/actions/runs/21791697723/job/62875766911

### Was this patch authored or co-authored using generative AI tooling?
no

Closes apache#54200 from zhengruifeng/u24_mini.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent b1bd1e5 commit 179cfa1

2 files changed: 34 additions and 19 deletions

dev/spark-test-image/python-minimum/Dockerfile

Lines changed: 17 additions & 9 deletions
@@ -15,16 +15,16 @@
 # limitations under the License.
 #
 
-# Image for building and testing Spark branches. Based on Ubuntu 22.04.
+# Image for building and testing Spark branches. Based on Ubuntu 24.04.
 # See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:jammy-20240911.1
+FROM ubuntu:noble
 LABEL org.opencontainers.image.authors="Apache Spark project <[email protected]>"
 LABEL org.opencontainers.image.licenses="Apache-2.0"
 LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For PySpark with old dependencies"
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""
 
-ENV FULL_REFRESH_DATE=20260127
+ENV FULL_REFRESH_DATE=20260206
 
 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -43,20 +43,28 @@ RUN apt-get update && apt-get install -y \
     libssl-dev \
     openjdk-17-jdk-headless \
     pkg-config \
-    python3.10 \
-    python3-psutil \
     tzdata \
     software-properties-common \
-    zlib1g-dev \
+    zlib1g-dev
+
+# Install Python 3.10
+RUN add-apt-repository ppa:deadsnakes/ppa
+RUN apt-get update && apt-get install -y \
+    python3.10 \
     && apt-get autoremove --purge -y \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
 
-ARG BASIC_PIP_PKGS="numpy==1.22.4 pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting"
-# Python deps for Spark Connect
-ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf==6.33.5"
+# Setup virtual environment
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.10 -m venv --without-pip $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
 
 # Install Python 3.10 packages
 RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
+
+ARG BASIC_PIP_PKGS="numpy==1.22.4 pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting psutil"
+ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf==6.33.5"
+
 RUN python3.10 -m pip install --force $BASIC_PIP_PKGS $CONNECT_PIP_PKGS && \
     python3.10 -m pip cache purge
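
The new Dockerfile creates a virtual environment at `/opt/spark-venv` and prepends its `bin` directory to `PATH`, so later `python3.10` invocations (including the `get-pip.py` bootstrap and `pip install`) should resolve to the venv interpreter rather than the system one. The following is a minimal sketch, not part of this commit, of how one might confirm that from inside the image; the helper names `in_virtualenv` and `describe_interpreter` are hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix still points at the base installation.
    """
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize which interpreter is running (hypothetical helper for illustration)."""
    kind = "venv" if in_virtualenv() else "system"
    return f"{kind}: {sys.executable} (prefix={sys.prefix})"


if __name__ == "__main__":
    print(describe_interpreter())
```

Running this with `python3.10` in a container built from the image would be expected to report a prefix under `/opt/spark-venv`, assuming the `PATH` ordering from the diff is in effect.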

dev/spark-test-image/python-ps-minimum/Dockerfile

Lines changed: 17 additions & 10 deletions
@@ -15,16 +15,16 @@
 # limitations under the License.
 #
 
-# Image for building and testing Spark branches. Based on Ubuntu 22.04.
+# Image for building and testing Spark branches. Based on Ubuntu 24.04.
 # See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:jammy-20240911.1
+FROM ubuntu:noble
 LABEL org.opencontainers.image.authors="Apache Spark project <[email protected]>"
 LABEL org.opencontainers.image.licenses="Apache-2.0"
 LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For Pandas API on Spark with old dependencies"
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""
 
-ENV FULL_REFRESH_DATE=20260127
+ENV FULL_REFRESH_DATE=20260206
 
 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -43,21 +43,28 @@ RUN apt-get update && apt-get install -y \
     libssl-dev \
     openjdk-17-jdk-headless \
     pkg-config \
-    python3.10 \
-    python3-psutil \
     tzdata \
     software-properties-common \
-    zlib1g-dev \
+    zlib1g-dev
+
+# Install Python 3.10
+RUN add-apt-repository ppa:deadsnakes/ppa
+RUN apt-get update && apt-get install -y \
+    python3.10 \
     && apt-get autoremove --purge -y \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
 
-
-ARG BASIC_PIP_PKGS="pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 numpy scipy coverage unittest-xml-reporting"
-# Python deps for Spark Connect
-ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf==6.33.5"
+# Setup virtual environment
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.10 -m venv --without-pip $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
 
 # Install Python 3.10 packages
 RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
+
+ARG BASIC_PIP_PKGS="pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 numpy scipy coverage unittest-xml-reporting psutil"
+ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf==6.33.5"
+
 RUN python3.10 -m pip install --force $BASIC_PIP_PKGS $CONNECT_PIP_PKGS && \
     python3.10 -m pip cache purge
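
Both images mix exact `==` pins with unpinned package names in `BASIC_PIP_PKGS`, and `psutil` moves from the apt package `python3-psutil` into that pip list. As a rough illustration only, the sketch below separates pinned from unpinned entries in such a string; the function name `split_pins` is hypothetical, and the sample value is copied from the `python-ps-minimum` Dockerfile in this diff.

```python
from typing import Dict, List, Tuple


def split_pins(spec_string: str) -> Tuple[Dict[str, str], List[str]]:
    """Split a space-separated pip spec string into pinned and unpinned packages.

    Only exact `name==version` pins are recognized; any other entry is
    treated as unpinned. This is a simplification, not a full requirement parser.
    """
    pinned: Dict[str, str] = {}
    unpinned: List[str] = []
    for spec in spec_string.split():
        name, sep, version = spec.partition("==")
        if sep:
            pinned[name] = version
        else:
            unpinned.append(name)
    return pinned, unpinned


# Value copied from the python-ps-minimum Dockerfile after this change.
BASIC_PIP_PKGS = (
    "pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 "
    "numpy scipy coverage unittest-xml-reporting psutil"
)

if __name__ == "__main__":
    pinned, unpinned = split_pins(BASIC_PIP_PKGS)
    print("pinned:", pinned)
    print("unpinned:", unpinned)
```

A listing like this makes it easier to see which dependencies the image actually holds at a fixed version and which ones float to whatever pip resolves at build time.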
