Support S3 storage for results #345

@twelch

Need

Currently, geoprocessing function results are limited to text-based output, typically JSON. This output is growing larger and is increasingly difficult to fit into DynamoDB, which caps each item at 400 KB, so large results have to be split across items. For example, the kelpPersist output for the entire MLPA network is about 4.5 MB and is split over 12 DynamoDB items.

This is also an opportunity to support multiple outputs per function, and other types of output, including images.

Solution

Continue to store the task state in DynamoDB, but store one or more results in S3 and reference those S3 items from the DynamoDB item.

Architectural Design

  • A new private S3 results bucket is created on project deploy. Lambdas are given permission to read from and write to the bucket.
  • Inside a geoprocessing function, the developer calls a helper function to store a result in S3 and gets its metadata back.
  • The GP function returns an array of bucket result metadata instead of the actual metrics.
  • task.complete stores the bucket item metadata in the DynamoDB record instead of the data.
  • task.get inspects the bucket items. If an item is JSON, it is returned directly. If it is a file type that can't be returned directly, such as an image, task.get returns a pre-signed URL that the browser client can use to fetch the item.
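
The flow in the list above could look roughly like the following sketch. It uses in-memory maps as stand-ins for the S3 bucket and the DynamoDB table so it runs without AWS access. The names `ResultMetadata`, `storeResult`, `completeTask`, `getTask`, `fakePresignedUrl`, and the bucket name `gp-results` are all hypothetical, not part of the existing framework, and a real pre-signed URL would come from the AWS SDK rather than string formatting.

```typescript
import { randomUUID } from "crypto";

// Hypothetical metadata shape pointing at one stored result.
interface ResultMetadata {
  bucket: string;
  key: string;
  contentType: string;
}

// In-memory stand-ins for the S3 results bucket and the DynamoDB task table.
const RESULTS_BUCKET = "gp-results"; // assumed bucket name
const bucket = new Map<string, { body: string; contentType: string }>();
const tasks = new Map<string, { status: string; results: ResultMetadata[] }>();

// Helper a geoprocessing function would call to store one result.
function storeResult(body: string, contentType: string): ResultMetadata {
  const key = randomUUID();
  bucket.set(key, { body, contentType });
  return { bucket: RESULTS_BUCKET, key, contentType };
}

// task.complete: persist only the metadata, not the result data itself.
function completeTask(taskId: string, results: ResultMetadata[]): void {
  tasks.set(taskId, { status: "completed", results });
}

// Placeholder for a pre-signed URL; not a real signature.
function fakePresignedUrl(meta: ResultMetadata): string {
  return `https://${meta.bucket}.example.invalid/${meta.key}?signature=placeholder`;
}

type TaskResult =
  | { kind: "json"; data: unknown }
  | { kind: "url"; url: string };

// task.get: return JSON results inline and a URL for everything else.
function getTask(taskId: string): TaskResult[] {
  const task = tasks.get(taskId);
  if (!task) throw new Error(`Unknown task ${taskId}`);
  return task.results.map((meta): TaskResult => {
    if (meta.contentType === "application/json") {
      const item = bucket.get(meta.key);
      if (!item) throw new Error(`Missing result ${meta.key}`);
      return { kind: "json", data: JSON.parse(item.body) };
    }
    return { kind: "url", url: fakePresignedUrl(meta) };
  });
}

// Usage: one JSON result and one image result for the same task.
const metrics = storeResult(JSON.stringify({ area: 42 }), "application/json");
const image = storeResult("<png bytes>", "image/png");
completeTask("task-1", [metrics, image]);
const fetched = getTask("task-1");
```

Keeping the content type in the metadata lets task.get decide between inlining and a URL without reading non-JSON objects from the bucket.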

Requirements

  • S3 items will be uniquely keyed (UUID).
  • If an item with the same key already exists in S3, it will be overwritten.
  • The clear-all-results command will also clear the S3 results bucket.
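
As a rough illustration of these three requirements, the sketch below models the bucket as an in-memory map. `ResultsBucket`, `put`, `read`, and `clearAllResults` are hypothetical names chosen for the example, and the task table is likewise a plain map rather than DynamoDB.

```typescript
import { randomUUID } from "crypto";

// In-memory stand-in for the S3 results bucket.
class ResultsBucket {
  private objects = new Map<string, string>();

  // Store under the given key, or under a fresh UUID when none is given.
  // Writing to an existing key silently overwrites the old item.
  put(body: string, key: string = randomUUID()): string {
    this.objects.set(key, body);
    return key;
  }

  read(key: string): string | undefined {
    return this.objects.get(key);
  }

  count(): number {
    return this.objects.size;
  }

  clear(): void {
    this.objects.clear();
  }
}

// clear-all-results: empty both the task table and the results bucket.
function clearAllResults(
  taskTable: Map<string, unknown>,
  bucket: ResultsBucket
): void {
  taskTable.clear();
  bucket.clear();
}

const resultsBucket = new ResultsBucket();
const taskTable = new Map<string, unknown>();

const keyA = resultsBucket.put("first");
const keyB = resultsBucket.put("second");
resultsBucket.put("replaced", keyA); // same key, so the item is overwritten
taskTable.set("task-1", { results: [keyA, keyB] });

const countBeforeClear = resultsBucket.count();
const valueA = resultsBucket.read(keyA);

clearAllResults(taskTable, resultsBucket);
```

Clearing the table and the bucket together avoids leaving task records that point at deleted items, or orphaned items with no record.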
