I have a model that samples fine with low values of tune, but when I bump tune to high values, Python crashes just as sampling completes, and my OS (Ubuntu) logs show an OOM event. I don't need the warmup samples, so I tried setting save_warmup = False, but no dice.
Is it possible that the warmup data is being dropped unnecessarily late, e.g. after conversion from arrow to InferenceData? That's the only thing I can think of to explain why higher values of tune cause crashes even though draws is unchanged.
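
To make the hypothesis concrete, here is a toy sketch of the two behaviours I have in mind. It is not the sampler's actual code: `late_drop`, `early_drop`, and `BYTES_PER_DRAW` are hypothetical stand-ins, and `tracemalloc` only sees Python-level allocations, so a real check would need to watch the process RSS around the actual sampling call. If warmup is discarded only at the end, peak memory should grow with tune; if it is never stored, peak memory should depend on draws alone.

```python
import tracemalloc

# Hypothetical size of one stored draw: 1000 float64 parameters.
BYTES_PER_DRAW = 8 * 1000


def peak_bytes(fn, *args):
    """Run fn(*args) and return the peak Python-level allocation in bytes."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def late_drop(tune, draws):
    # Stand-in for a sampler that stores warmup and posterior draws together
    # and only slices the warmup off at the very end.
    buf = bytearray((tune + draws) * BYTES_PER_DRAW)
    return buf[tune * BYTES_PER_DRAW:]  # slicing copies the post-warmup part


def early_drop(tune, draws):
    # Stand-in for a sampler that never stores warmup draws at all.
    return bytearray(draws * BYTES_PER_DRAW)


if __name__ == "__main__":
    for tune in (200, 2000):
        late = peak_bytes(late_drop, tune, 100)
        early = peak_bytes(early_drop, tune, 100)
        print(f"tune={tune}: late_drop peak={late}, early_drop peak={early}")
```

In the toy version, the `late_drop` peak scales with tune while the `early_drop` peak stays flat, which matches what I'm seeing with the real model when save_warmup = False.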