
Commit 3851cb5

ivoson authored and LuciferYang committed
[SPARK-56302][CORE] Free task result memory eagerly during serialization on executor
### What changes were proposed in this pull request?

Eagerly null intermediate objects during task result serialization in `Executor` to reduce peak heap memory usage.

During result serialization in `TaskRunner.run()`, three representations of the result coexist on the heap simultaneously:

1. `value` — the raw task result object from `task.run()`
2. `valueByteBuffer` — first serialization of the result
3. `serializedDirectResult` — second serialization wrapping the above into a `DirectTaskResult`

Each becomes dead as soon as the next is produced, but none were released. This PR nulls each reference as soon as it is no longer needed:

- `value = null` after serializing into `valueByteBuffer`
- `valueByteBuffer = null` and `directResult = null` after re-serializing into `serializedDirectResult`

All changes are confined to the executor side within `TaskRunner.run()`, where the variables are local and not exposed to other components.

### Why are the changes needed?

For tasks returning large results (e.g. `collect()` on large datasets), the redundant copies can roughly triple peak memory during serialization, increasing GC pressure or causing executor OOM. Eagerly freeing dead references lets the GC reclaim memory sooner.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code v2.1.88

Closes apache#55110 from ivoson/free-result-memory-asap.

Lead-authored-by: Tengfei Huang <[email protected]>
Co-authored-by: Tengfei Huang <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
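The eager-release pattern the PR describes can be sketched outside Spark. This is an illustrative stand-alone sketch, not Spark's actual code: `EagerReleaseSketch` and the byte-array "serializations" are hypothetical stand-ins for the task result and its two serialized forms. The key point is that the intermediates must be declared as `var` so each reference can be dropped as soon as the next representation exists, shrinking the window in which all three copies of a large result are live at once.

```scala
// Minimal sketch (illustrative, not Spark's code) of the eager-release pattern:
// null each intermediate reference the moment the next representation exists.
object EagerReleaseSketch {
  def run(resultSize: Int): Long = {
    var value: Array[Byte] = Array.fill[Byte](resultSize)(1) // 1. raw task result
    var firstBuffer: Array[Byte] = value.clone()             // 2. first serialization
    value = null        // raw result is now dead; allow GC to reclaim it
    val finalBuffer: Array[Byte] = firstBuffer.clone()       // 3. second serialization
    firstBuffer = null  // first buffer is dead once re-serialized
    finalBuffer.length.toLong                                // only the final copy survives
  }
}
```

Without the two `null` assignments, all three arrays remain reachable from the stack frame until the method returns, which is exactly the peak-memory problem the PR targets.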
1 parent: 4018cc7 · commit: 3851cb5

File tree

1 file changed (+11 −3 lines changed)

1 file changed

+11
-3
lines changed

core/src/main/scala/org/apache/spark/executor/Executor.scala

Lines changed: 11 additions & 3 deletions
```diff
@@ -57,6 +57,7 @@ import org.apache.spark.status.api.v1.ThreadStackTrace
 import org.apache.spark.storage.{StorageLevel, TaskResultBlockId}
 import org.apache.spark.util._
 import org.apache.spark.util.ArrayImplicits._
+import org.apache.spark.util.io.ChunkedByteBuffer
 
 private[spark] object IsolatedSessionState {
   // Authoritative store for all isolated sessions. Sessions are put here when created
@@ -883,7 +884,7 @@ private[spark] class Executor(
       val resources = taskDescription.resources.map { case (rName, addressesAmounts) =>
         rName -> new ResourceInformation(rName, addressesAmounts.keys.toSeq.sorted.toArray)
       }
-      val value = Utils.tryWithSafeFinally {
+      var value: Any = Utils.tryWithSafeFinally {
        val res = task.run(
          taskAttemptId = taskId,
          attemptNumber = taskDescription.attemptNumber,
@@ -938,7 +939,9 @@
 
       val resultSer = env.serializer.newInstance()
       val beforeSerializationNs = System.nanoTime()
-      val valueByteBuffer = SerializerHelper.serializeToChunkedBuffer(resultSer, value)
+      var valueByteBuffer: ChunkedByteBuffer = SerializerHelper.serializeToChunkedBuffer(
+        resultSer, value)
+      value = null // Allow GC to reclaim the raw task result
       val afterSerializationNs = System.nanoTime()
 
       // Deserialization happens in two parts: first, we deserialize a Task object, which
@@ -982,10 +985,15 @@
       val accumUpdates = task.collectAccumulatorUpdates()
       val metricPeaks = metricsPoller.getTaskMetricPeaks(taskId)
       // TODO: do not serialize value twice
-      val directResult = new DirectTaskResult(valueByteBuffer, accumUpdates, metricPeaks)
+      var directResult: DirectTaskResult[Any] = new DirectTaskResult(
+        valueByteBuffer, accumUpdates, metricPeaks)
       // try to estimate a reasonable upper bound of DirectTaskResult serialization
       val serializedDirectResult = SerializerHelper.serializeToChunkedBuffer(ser, directResult,
         valueByteBuffer.size + accumUpdates.size * 32 + metricPeaks.length * 8)
+      // Allow GC to reclaim the first serialization buffer. Both references must be
+      // nulled: the local var and the field inside directResult point to the same object.
+      valueByteBuffer = null
+      directResult = null
       val resultSize = serializedDirectResult.size
       executorSource.METRIC_RESULT_SIZE.inc(resultSize)
```
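The last hunk's comment notes that both `valueByteBuffer` and `directResult` must be dropped, because the `DirectTaskResult` holds the same buffer internally. That reachability point can be demonstrated with a small sketch; `Wrapper` and `ReachabilitySketch` are illustrative names, not Spark's classes:

```scala
// Sketch (illustrative, not Spark's code): an object is only collectible once
// *every* live reference to it is gone. Nulling just the local buffer variable
// is not enough while a wrapper object still points at the same array.
final class Wrapper(val buffer: Array[Byte])

object ReachabilitySketch {
  def run(): Int = {
    var buffer: Array[Byte] = Array.fill[Byte](16)(7)
    var wrapper: Wrapper = new Wrapper(buffer)  // second reference to the same array
    val serialized = wrapper.buffer.clone()     // stand-in for the second serialization
    buffer = null   // array is still reachable through wrapper.buffer
    wrapper = null  // now the array itself finally becomes unreachable
    serialized.length
  }
}
```

In the actual patch the same reasoning applies: after `serializedDirectResult` is produced, nulling only `valueByteBuffer` would leave the buffer pinned by `directResult`, so both local references are cleared.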