Description
Is your idea related to a problem? Please describe.
When users open Environment, Dataset, Notebook, Pipeline, or ML Studio detail pages, the app requests the stack field (e.g. via getEnvironment, getDataset). That triggers a full stack describe task (CloudFormation: DescribeStacks, DescribeStackResources, DescribeStackEvents), which runs in the awsworker Lambda. Each of those API calls can return a lot of data, and the worker then writes resources, events, and outputs to the DB. With many users or many stacks, this leads to:
- Worker Lambda timeouts (e.g. 15 minutes) after "Stack describe events response", because the full describe is too heavy.
- High worker and DB load from every detail-page view, even though the UI on those pages only needs stack status and stack URI, not resources or events.
Describe the solution you'd like
Introduce a light stack describe that only updates status and stack ID, and use it for the "view" path. Keep the full describe for the Stack tab.
- View path (resolving the stack field on Environment/Dataset/Notebook/Pipeline/ML Studio detail pages): queue a light describe task that fetches and updates only stack.status and stackid in the DB. No resources, events, or outputs. This keeps status fresh without heavy CloudFormation calls or large DB writes.
- Stack tab path (when the user opens the Stack tab): keep the current behavior. Run the full describe (DescribeStacks, DescribeStackResources, DescribeStackEvents) and update status, outputs, resources, and events in the DB.
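As a rough illustration of the two paths, the choice of task could look like the sketch below. Only the action name `cloudformation.stack.describe_status` comes from this proposal. `StackView`, `Task`, `build_describe_task`, and the `FULL_DESCRIBE_ACTION` value are hypothetical placeholders, not names from the codebase.

```python
from dataclasses import dataclass
from enum import Enum


class StackView(Enum):
    """Where the stack information is being requested from (hypothetical)."""
    DETAIL_PAGE = "detail_page"  # Environment/Dataset/Notebook/Pipeline/ML Studio page
    STACK_TAB = "stack_tab"      # the dedicated Stack tab


# Action name proposed in this issue for the light describe.
LIGHT_DESCRIBE_ACTION = "cloudformation.stack.describe_status"
# Placeholder for whatever action name the existing full describe uses.
FULL_DESCRIBE_ACTION = "cloudformation.stack.describe_full"


@dataclass
class Task:
    """Minimal stand-in for a worker task record (hypothetical)."""
    action: str
    target_uri: str


def build_describe_task(view: StackView, stack_uri: str) -> Task:
    """Pick the full describe only for the Stack tab, the light one otherwise."""
    action = FULL_DESCRIBE_ACTION if view is StackView.STACK_TAB else LIGHT_DESCRIBE_ACTION
    return Task(action=action, target_uri=stack_uri)
```

Defaulting to the light action means any new caller gets the cheap path unless it explicitly asks for the Stack tab behavior.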
Implementation approach:
- Add a new worker task type (e.g., cloudformation.stack.describe_status) that only runs DescribeStacks and updates stack.status and stackid.
- For detail-page stack resolution (resolve_parent_obj_stack), queue the light task instead of the full describe.
- Stack tab continues using the existing full describe.
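A minimal sketch of what the light task handler might do, assuming a boto3-style CloudFormation client is passed in. `StackRecord` and `describe_stack_status` are hypothetical names standing in for the real stack model and handler, and persisting the record is left to the caller.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StackRecord:
    """Hypothetical stand-in for the stack row stored in the DB."""
    name: str
    stackid: Optional[str] = None
    status: Optional[str] = None


def describe_stack_status(cfn_client: Any, stack: StackRecord) -> StackRecord:
    """Refresh only status and stack ID using a single DescribeStacks call.

    cfn_client is assumed to expose describe_stacks(StackName=...), as the
    boto3 CloudFormation client does. Resources, events, and outputs are
    deliberately not fetched or stored here.
    """
    response = cfn_client.describe_stacks(StackName=stack.stackid or stack.name)
    stacks = response.get("Stacks", [])
    if not stacks:
        # Nothing returned: leave the record unchanged.
        return stack
    summary = stacks[0]
    stack.stackid = summary["StackId"]
    stack.status = summary["StackStatus"]
    return stack
```

In the worker this would presumably be called with `boto3.client("cloudformation")` for the target account and region, followed by a commit of the two updated columns.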
Result: detail-page views only trigger a light DescribeStacks call and a small DB update, while the Stack tab still gets the full describe with resources and events. This reduces worker load and avoids timeouts, without changing what users see on the Stack tab.