Commit 3405255

[SPARK-55394][PYTHON][INFRA] Upgrade Python 3.10 test image to Ubuntu 24.04
### What changes were proposed in this pull request?
Upgrade Python 3.10 test image to Ubuntu 24.04.

### Why are the changes needed?
To test with a newer OS version.

### Does this PR introduce _any_ user-facing change?
No, infra-only.

### How was this patch tested?
PR builder with

```
default: '{"PYSPARK_IMAGE_TO_TEST": "python-310", "PYTHON_TO_TEST": "python3.10"}'
```

https://github.com/zhengruifeng/spark/actions/runs/21751601990/job/62751479813

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #54177 from zhengruifeng/ubuntu_24_py_10.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>

1 parent 9268812 commit 3405255

1 file changed (+17, -13)

dev/spark-test-image/python-310/Dockerfile

Lines changed: 17 additions & 13 deletions
@@ -15,16 +15,16 @@
 # limitations under the License.
 #
 
-# Image for building and testing Spark branches. Based on Ubuntu 22.04.
+# Image for building and testing Spark branches. Based on Ubuntu 24.04.
 # See also in https://hub.docker.com/_/ubuntu
-FROM ubuntu:jammy-20240911.1
+FROM ubuntu:noble
 LABEL org.opencontainers.image.authors="Apache Spark project <dev@spark.apache.org>"
 LABEL org.opencontainers.image.licenses="Apache-2.0"
 LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For PySpark with Python 3.10"
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""
 
-ENV FULL_REFRESH_DATE=20260203
+ENV FULL_REFRESH_DATE=20260206
 
 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -45,25 +45,29 @@ RUN apt-get update && apt-get install -y \
 libxml2-dev \
 openjdk-17-jdk-headless \
 pkg-config \
-python3.10 \
-python3-psutil \
-qpdf \
 tzdata \
-wget \
-zlib1g-dev \
+software-properties-common \
+zlib1g-dev
+
+# Install Python 3.10
+RUN add-apt-repository ppa:deadsnakes/ppa
+RUN apt-get update && apt-get install -y \
+python3.10 \
 && apt-get autoremove --purge -y \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
 
+# Setup virtual environment
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.10 -m venv --without-pip $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
+
+# Install Python 3.10 packages
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
 
 ARG BASIC_PIP_PKGS="numpy pyarrow>=22.0.0 six==1.16.0 pandas==2.3.3 scipy plotly<6.0.0 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2 pystack>=1.6.0 psutil"
-# Python deps for Spark Connect
 ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 protobuf==6.33.5 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20.3"
 
-# Install Python 3.10 packages
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
-RUN python3.10 -m pip install --ignore-installed 'blinker>=1.6.2' # mlflow needs this
-RUN python3.10 -m pip install --ignore-installed 'six==1.16.0' # Avoid `python3-six` installation
 RUN python3.10 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS && \
 python3.10 -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu && \
 python3.10 -m pip install deepspeed torcheval && \
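
The virtual environment steps added in this diff (create the environment with `--without-pip`, then put its `bin` directory first on `PATH`) can be tried outside Docker. The following is a minimal shell sketch and not part of the commit. It assumes a `python3` interpreter with the standard-library `venv` module is available, uses a throwaway temporary directory in place of `/opt/spark-venv`, and leaves out the `get-pip.py` bootstrap because that step needs network access.

```shell
#!/bin/sh
# Hypothetical local check of the venv pattern used in the Dockerfile.
# Assumption: `python3` (any recent version) stands in for `python3.10`.
set -eu

# Throwaway location standing in for /opt/spark-venv (assumption).
VIRTUAL_ENV="$(mktemp -d)/spark-venv"

# Same flag as the Dockerfile: create the environment without installing pip.
python3 -m venv --without-pip "$VIRTUAL_ENV"

# Prepending the venv's bin directory makes its interpreter resolve first.
PATH="$VIRTUAL_ENV/bin:$PATH"
export VIRTUAL_ENV PATH

# Report which interpreter now resolves and which prefix it reports.
RESOLVED_PYTHON="$(command -v python3)"
VENV_PREFIX="$(python3 -c 'import sys; print(sys.prefix)')"
echo "python3 resolves to: $RESOLVED_PYTHON"
echo "sys.prefix: $VENV_PREFIX"
```

With `PATH` set this way, later `python3 -m pip install ...` calls would target the environment instead of the system interpreter, once pip has been bootstrapped into it.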
