1. What is the issue? Please be detailed.
Handling photos (or other attachments) at scale in ODK seems challenging because they are stored as blobs inside the Postgres DB. This can rapidly increase the Postgres/Docker disk space requirement when many media files are uploaded with form submissions, and the images are also not directly accessible, for example via URLs.
These issues seem resolvable if the images were stored in an external location such as AWS or Cloud Storage. I'm aware of discussion on this going back about 2 years, but it doesn't seem to have gained much steam.
I'm wondering whether the workaround idea below could help. It is not ideal, but it might be simpler to implement.
The idea is to add a new type of Photo (or Video/File) capture field to the ODK Collect app that behaves as follows:
- When a user captures/uploads a photo (or video or other file), ODK Collect assigns a unique filename and records this value as a string in the form submission.
- The file is saved locally on the device, with the matching unique filename.
- No attachment/media files are uploaded with the form submissions.
- The user will still need to find a way to upload the media files (e.g., using Google Photos, Google Drive, or even by manually copying them off the device to cloud storage).
- The point is there will be a unique ID that can reliably map the form entry in ODK Central to the file/filename.
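To make the mapping concrete, here is a minimal Python sketch of the idea, not anything ODK Collect or Central does today. It assumes the unique filename is a random UUID plus the file extension, and that submissions have been exported as rows with a hypothetical `instance_id` key and a hypothetical `photo` field holding the recorded filename. All function and field names are made up for illustration.

```python
import uuid
from pathlib import Path


def new_media_filename(extension: str) -> str:
    """Return a collision-resistant filename, e.g. '<32 hex chars>.jpg'."""
    return f"{uuid.uuid4().hex}.{extension.lstrip('.').lower()}"


def match_media(submissions, media_dir, field="photo"):
    """Pair each submission with its file in media_dir via the recorded filename.

    `submissions` is a list of dicts; the 'instance_id' key and the `field`
    name are assumptions for this sketch, not ODK Central's export schema.
    Returns (matched, missing): matched maps instance ID -> Path, and
    missing lists instance IDs whose recorded file was not found.
    """
    media_dir = Path(media_dir)
    matched, missing = {}, []
    for row in submissions:
        name = row.get(field)
        if not name:
            continue  # no media was captured for this submission
        path = media_dir / name
        if path.is_file():
            matched[row["instance_id"]] = path
        else:
            missing.append(row["instance_id"])
    return matched, missing
```

The `missing` list matters in practice: since the files travel separately from the submissions, there needs to be some way to spot entries whose media never made it off the device.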
This is far from ideal, but I'm wondering whether it may be a reasonable workaround, since it's not clear whether work on getting ODK Central to store media files (e.g., images) in external storage is progressing.
Thanks very much in advance for any insights/thoughts. If you have a working solution or experience tackling this, I would love to hear about it.
Alternative idea (manual process):
The submission data and attachments can be manually downloaded as a zip file, so in principle it is possible to upload the attachments into cloud storage (though this requires a manual process). Given this, I'm wondering whether it is possible to delete the disk-space-consuming blobs in the Postgres DB after the zip file has been downloaded, either via the ODK Central API or via some URL argument/parameter when downloading the zip file. The goal is to keep the Postgres DB disk usage reasonably stable so it doesn't explode when many attachment files are uploaded.
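For the download half of this manual process, a rough Python sketch could pull the attachments out of an already-downloaded export zip so they can be pushed to cloud storage with whatever tool is preferred. It assumes the attachments sit under a `media/` folder inside the zip (please check this against your own export), and the function name is hypothetical. It does not touch the database; whether the blobs can be removed afterwards is exactly the open question above.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_media(zip_path, dest_dir, media_prefix="media/"):
    """Copy attachment files out of an export zip into dest_dir.

    Assumes attachments are stored under `media_prefix` inside the zip;
    this layout is an assumption of the sketch. Returns the written paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.startswith(media_prefix):
                continue  # skip folders and non-attachment files such as CSVs
            # Use only the base name so nothing is written outside dest_dir.
            # Files with the same base name would overwrite each other.
            target = dest / PurePosixPath(info.filename).name
            target.write_bytes(zf.read(info))
            written.append(target)
    return written
```

For example, `extract_media("export.zip", "attachments/")` would leave the media files in `attachments/`, ready to be synced to a bucket or shared drive.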
Thank you again.