Description
Backend
VL (Velox)
Bug description
HashBuild OOM:
org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::isBlocked failed for [operator: HashBuild, plan node ID: 6]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 3.0 MiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled).
Current config settings:
spark.gluten.memory.offHeap.size.in.bytes=3.5 GiB
spark.gluten.memory.task.offHeap.size.in.bytes=3.5 GiB
spark.gluten.memory.conservative.task.offHeap.size.in.bytes=1792.0 MiB
spark.memory.offHeap.enabled=true
spark.gluten.memory.dynamic.offHeap.sizing.enabled=false
Memory consumer stats:
Task.27337: Current used bytes: 3.5 GiB, peak bytes: N/A
\- Gluten.Tree.31: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
\- Capacity[8.0 EiB].31: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
+- NativePlanEvaluator-33.0: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| \- single: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| +- root: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| | +- task.Gluten_Stage_43_TID_27337_VTID_33: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| | | +- node.6: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| | | | +- op.6.2.0.HashBuild: Current used bytes: 3.5 GiB, peak bytes: 3.5 GiB
| | | | \- op.6.0.0.HashProbe: Current used bytes: 48.6 KiB, peak bytes: 49.6 KiB
Vanilla Spark uses a SortMergeJoin whose right side is much smaller, but Gluten chooses the left side as the build side based on the logical stats.
+- * SortMergeJoin LeftOuter (113)
: :- * Sort (91)
: : +- AQEShuffleRead (90)
: : +- ShuffleQueryStage (89)
: : +- Exchange (88)
: +- * Sort (112)
: +- * Project (111)
: +- * BroadcastHashJoin Inner BuildRight (110)
: :- AQEShuffleRead (98)
: : +- ShuffleQueryStage (97)
: : +- Exchange (96)
: : +- * Project (95)
: : +- * Filter (94)
: : +- * ColumnarToRow (93)
: : +- Scan parquet fp_datamining.rt_rec_vn_spl_feature_item (92)
: +- BroadcastQueryStage (109)
: +- BroadcastExchange (108)
: +- * HashAggregate (107)
: +- AQEShuffleRead (106)
: +- ShuffleQueryStage (105)
: +- Exchange (104)
left side:
right side:
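To make the build-side problem in this report concrete, here is a minimal Python sketch of a "pick the smaller side" heuristic. It is not Gluten's or Spark's actual planner code. The function name `choose_build_side` and all byte sizes are hypothetical and chosen only for illustration. The sketch shows how the same rule picks a different build side depending on whether it is fed plan-time logical estimates or the sizes actually seen at runtime.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def choose_build_side(left_bytes: int, right_bytes: int) -> str:
    """Return the join side with the smaller size estimate (ties go to the right)."""
    return "left" if left_bytes < right_bytes else "right"


# Hypothetical plan-time (logical) estimates: the left side looks smaller.
logical_stats = {"left": 1 * GIB, "right": 4 * GIB}

# Hypothetical sizes observed at runtime: the right side is much smaller.
runtime_stats = {"left": 4 * GIB, "right": 200 * MIB}

planned_side = choose_build_side(logical_stats["left"], logical_stats["right"])
ideal_side = choose_build_side(runtime_stats["left"], runtime_stats["right"])

print(f"build side from logical stats: {planned_side}")
print(f"build side from runtime sizes: {ideal_side}")
```

With numbers like these, the logical estimates select the left side while the runtime sizes select the right side. That mismatch is the situation described above: the hash table is built from the larger input, and HashBuild exhausts the task's off-heap memory.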
Gluten version
No response
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs