Hi all,
I'm loving the addition of external S3 storage in 2024.2.0 (see "ODK Central v2024.2: Submission deletes via API and S3 media storage").
This post is mostly to provide info for anyone searching for something similar, rather than to ask a question. Hope this is ok.
(Also I added this to the development section as it's probably more for devs)
Preamble
Originally I wanted to post a question requesting that the S3 keys for submission attachments be included somewhere in the submission JSON/CSV.
This was primarily because I am using a public access S3 bucket, so it made sense to simply construct the S3 URL from the key and access the data directly (e.g. to embed multiple submission photos in a web page from their S3 URLs).
However, I realise the typical use case would involve a private access bucket, so I took another route instead.
Getting the S3 URLs for submission attachments
This is quite simple in hindsight.
The main process is: list attachments --> request attachment --> get pre-signed S3 URL --> do what you want with the URL! (download, display the img, etc).
- List the submissions for the project (i.e. get the submission UUID you are interested in):
/v1/projects/{PROJECT_ID}/forms/{FORM_ID}/submissions
Returns:
{ instanceId: "uuid:e83db2b4-5e82-4e61-bc32-04750e511aff" ... }
It's also possible to get submission UUIDs via the OData endpoint.
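As a rough sketch of this first step, the snippet below builds the submissions URL and pulls the instanceId values out of the JSON list that comes back. The base URL, project ID and form ID are placeholders, and the helper names (submissions_url, instance_ids) are made up for illustration; the actual HTTP request and authentication are left out.

```python
from urllib.parse import quote


def submissions_url(base_url, project_id, form_id):
    """Build the submissions listing URL for one form (hypothetical helper)."""
    return (
        f"{base_url.rstrip('/')}/v1/projects/{project_id}"
        f"/forms/{quote(form_id, safe='')}/submissions"
    )


def instance_ids(submissions):
    """Collect the instanceId of every submission in the parsed JSON list."""
    return [s["instanceId"] for s in submissions if "instanceId" in s]


# Sample payload shaped like the response shown above (other fields omitted).
sample = [{"instanceId": "uuid:e83db2b4-5e82-4e61-bc32-04750e511aff"}]
url = submissions_url("https://central.example.org/", 5, "my form")
ids = instance_ids(sample)
```

The form ID is percent-encoded because form IDs can contain characters that are not safe in a URL path.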
- List the attachments for a given submission UUID:
/v1/projects/{PROJECT_ID}/forms/{FORM_ID}/submissions/{SUBMISSION_UUID}/attachments
Returns:
[{"name":"1731676401897.jpg","exists":true}, ...]
The 'name' field here is stored in the Central database table submission_attachments as fieldname, and is generated to be unique. This is the field that is used to download the attachment below.
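To show how the attachment list feeds the next step, here is a small sketch that turns the list into one download URL per attachment, skipping entries where exists is false (expected by the form but not uploaded). The helper name attachment_urls, the base URL and the IDs are assumptions for illustration only.

```python
from urllib.parse import quote


def attachment_urls(base_url, project_id, form_id, instance_id, attachments):
    """Map each uploaded attachment name to its download URL (hypothetical helper)."""
    prefix = (
        f"{base_url.rstrip('/')}/v1/projects/{project_id}"
        f"/forms/{quote(form_id, safe='')}"
        # Keep the ':' in "uuid:..." readable; it is valid inside a path segment.
        f"/submissions/{quote(instance_id, safe=':')}/attachments"
    )
    return {
        a["name"]: f"{prefix}/{quote(a['name'], safe='')}"
        for a in attachments
        if a.get("exists")  # skip attachments that were never uploaded
    }


listing = [
    {"name": "1731676401897.jpg", "exists": True},
    {"name": "missing.jpg", "exists": False},
]
urls = attachment_urls(
    "https://central.example.org",
    5,
    "survey",
    "uuid:e83db2b4-5e82-4e61-bc32-04750e511aff",
    listing,
)
```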
- Request a pre-signed URL for each attachment:
/v1/projects/{PROJECT_ID}/forms/{FORM_ID}/submissions/{SUBMISSION_UUID}/attachments/{ATTACHMENT_NAME}
Returns (example):
https://YOUR_S3_PROVIDER/BUCKET_NAME/blob-5-35bd9c1c5cbb5fb549b5f2bfa9d1f8a7fad45fc2?response-content-disposition=attachment%3B%20filename%3D%221731676401897.jpg%22%3B%20filename%2A%3DUTF-8%27%271731676401897.jpg&response-content-type=image%2Fjpeg&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=fmtm%2F20241115%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241115T160531Z&X-Amz-Expires=60&X-Amz-SignedHeaders=host&X-Amz-Signature=3cd31e4303c5be4a2e649500322679952f6940b95432f4abb1bbd4b83916c5e5
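One thing worth noticing in the example is X-Amz-Expires=60: the URL is only valid for 60 seconds after the X-Amz-Date timestamp, so it needs to be used straight away rather than stored. The sketch below works out the expiry time from those two query parameters. The function name presigned_expiry is made up, and the URL in the snippet is a cut-down stand-in (no signature), not a working link.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url):
    """Return the UTC time at which a SigV4 pre-signed URL stops working."""
    params = parse_qs(urlsplit(url).query)
    # X-Amz-Date is the signing time in ISO 8601 basic format, always UTC.
    signed_at = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the validity window in seconds.
    return signed_at + timedelta(seconds=int(params["X-Amz-Expires"][0]))


example = (
    "https://s3.example.org/bucket/blob-5-abc"
    "?X-Amz-Date=20241115T160531Z&X-Amz-Expires=60&X-Amz-SignedHeaders=host"
)
expires_at = presigned_expiry(example)
```

If the URL has expired by the time it is needed, requesting the attachment from Central again gives a fresh one.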
Note that Central will seamlessly handle either sending the blob directly from the database, or providing a pre-signed URL for download from the S3 bucket.
As I said above, this seems obvious with hindsight, but I didn't realise Central was capable of providing pre-signed URLs to access the images.
(Originally I thought the only way to access the S3 data was from a submission .zip dump.)
Hope this helps someone!