[VL] AWS S3 read performance is very bad when executor.cores are big #11765

@FelixYBW

Description

| appid | executors x cores | IOThreads | loadQuantum | maxCoalescedBytes | maxCoalescedDistanceBytes | prefetchRowGroups | SplitPreloadPerDriver | elapsed |
|---|---|---|---|---|---|---|---|---|
| app-20260314210515-0006 | 16x1 | 512 | 16777216 | 131072 | 0 | 8 | 8 | 64.83501200299997 |
| app-20260314210322-0005 | 8x2 | 512 | 16777216 | 131072 | 0 | 8 | 8 | 62.53538827000011 |
| app-20260314210124-0004 | 4x4 | 512 | 16777216 | 131072 | 0 | 8 | 8 | 69.0349243720002 |
| app-20260314205823-0003 | 2x8 | 512 | 16777216 | 131072 | 0 | 8 | 8 | 131.8899209910005 |
| app-20260314200243-0000 | 1x16 | 256 | 16777216 | 131072 | 0 | 8 | 8 | 286.929762965 |
| app-20260314231115-0030 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 32 | 224.50527761900048 |
| app-20260314231548-0031 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 16 | 214.15762610500133 |
| app-20260314232010-0032 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 8 | 212.18881194200003 |
| app-20260314232430-0033 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 4 | 206.15889941000023 |
| app-20260314232845-0034 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 2 | 210.9045796419996 |
| app-20260314233303-0035 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 1 | 247.84443900300175 |
| app-20260314233759-0036 | 1 | 1024 | 16777216 | 131072 | 0 | 8 | 0 | 245.9798966769995 |
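
To make the trend in the first five runs easier to read, here is a small sketch that normalizes each elapsed time against the 16x1 run. The numbers are copied from the measurements above. The names `elapsed`, `baseline`, and `slowdown` are only illustrative. Note that the 1x16 run also used 256 IOThreads instead of 512, so it is not a perfectly controlled comparison.

```python
# Elapsed times copied from the first five runs (all use 16 total cores).
elapsed = {
    "16x1": 64.83501200299997,
    "8x2": 62.53538827000011,
    "4x4": 69.0349243720002,
    "2x8": 131.8899209910005,
    "1x16": 286.929762965,
}

# Use the 16x1 layout as the reference point.
baseline = elapsed["16x1"]
slowdown = {layout: t / baseline for layout, t in elapsed.items()}

for layout, ratio in slowdown.items():
    print(f"{layout:>5}: {ratio:.2f}x of the 16x1 run")
```

The ratio stays close to 1 up to 4 cores per executor, roughly doubles at 2x8, and is above 4 at 1x16.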

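Looks like the root cause is in the AWS C++ SDK (aws_cpp_sdk): there appears to be a limit on the number of outstanding requests each process can have. With the same 16 total cores, elapsed time grows from about 65 at 16x1 to about 287 at 1x16, which is consistent with a per-process limit rather than a per-core one.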
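
A toy model of why a per-process cap on outstanding requests would hurt layouts with many cores per executor. This is only a sketch: `PER_CORE_DEMAND` and `PER_PROCESS_CAP` are hypothetical values, chosen so that the cap starts to bind above 4 cores per executor. They are not measured, and they are not taken from the SDK.

```python
def max_in_flight(executors: int, cores: int,
                  per_core_demand: int, per_process_cap: int) -> int:
    """Upper bound on concurrent S3 requests across all executor processes."""
    # Each process wants cores * per_core_demand requests, but is capped.
    per_process = min(cores * per_core_demand, per_process_cap)
    return executors * per_process


PER_CORE_DEMAND = 8    # hypothetical: requests each core tries to keep in flight
PER_PROCESS_CAP = 32   # hypothetical: limit per process, not an SDK value

layouts = [(16, 1), (8, 2), (4, 4), (2, 8), (1, 16)]
in_flight = {
    f"{e}x{c}": max_in_flight(e, c, PER_CORE_DEMAND, PER_PROCESS_CAP)
    for e, c in layouts
}

for layout, n in in_flight.items():
    print(f"{layout:>5}: at most {n} requests in flight")
```

Under these assumed numbers the cluster-wide concurrency is flat for 16x1, 8x2, and 4x4, then halves at 2x8 and again at 1x16. That has the same shape as the measured slowdown, but it does not prove the cap is the cause.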

Gluten version

None

Labels: enhancement (New feature or request)
