Store media files uploaded to Google Drive in separate directories

Currently, all media files uploaded to Google Drive are stored in Open Data Kit > Submissions. That means files from different forms, and from different submissions of those forms, are mixed together there.

It would probably be better to have a separate directory for each form and for each submission. It could be:
Open Data Kit > Submissions > Form name > Submission name > File name
Open Data Kit > Submissions > All widgets > All widgets_2019-11-20_01-00-04 > 1574204528724.jpg

In my opinion it would reduce the mess there, but I want to hear what other users think about it.

It would also allow us to simplify media file names. Currently every file is named like:
/storage/emulated/0/odk/instances/All widgets_2019-11-20_01-00-04/1574204528724.jpg

but it could be just 1574204528724.jpg
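To make the proposal concrete, here is a rough sketch of how the Drive folder chain could be derived from the local file path. It is only an illustration, not how Collect actually uploads: the function name `drive_folder_parts` is made up, and the form name is assumed to be known by the caller.

```python
from pathlib import PurePosixPath

# Fixed root of the proposed layout, taken from the example above.
DRIVE_ROOT = ["Open Data Kit", "Submissions"]


def drive_folder_parts(form_name: str, local_path: str) -> list[str]:
    """Return the proposed Drive location as a list of folder/file names.

    Hypothetical helper: the instance (submission) name is assumed to be
    the name of the directory that contains the media file on the device.
    """
    path = PurePosixPath(local_path)
    instance_name = path.parent.name  # e.g. "All widgets_2019-11-20_01-00-04"
    file_name = path.name             # e.g. "1574204528724.jpg"
    return DRIVE_ROOT + [form_name, instance_name, file_name]


parts = drive_folder_parts(
    "All widgets",
    "/storage/emulated/0/odk/instances/All widgets_2019-11-20_01-00-04/1574204528724.jpg",
)
print(" > ".join(parts))
```

With this layout the file itself can keep its short name, because the form and the submission are already encoded in the folders above it.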

Oddly, I have two subdirectories in my Google Drive under Submissions that are named after the form ID (from the XLSForm), where the media files for the respective submissions are stored (the filenames are the full path). I didn't knowingly create or specify these, so I wonder how and why they are there. For other forms, the files are in the main Submissions directory, as you describe.

Personally, I think your suggestion is good: it retains a logical structure and is consistent with other ODK tools. Of course, with your suggested folder structure and naming convention, it could open the way to extending Briefcase to pull from Google Drive, or to downloading and pointing to a 'Collect Directory', if you were to insert the instance XML in that folder as well... Just thinking ahead!

One thing I hadn't appreciated until about 2 minutes ago, which might cause me some problems if I want to download the media files related to form submissions, is that I think the media files are stored on the Drive of the person who uploads them, not on the same Drive account as the upload spreadsheet. I didn't see this documented anywhere, but that appears to be the case. I'm wondering whether it is feasible to manage things in a way that allows the survey data to be 'hosted' in a single account, which is also relevant to my previous point. What happens if an enumerator has a clear-out of their Drive, finds a folder full of images and thinks that they can be deleted...

So I was thinking that, in parallel to specifying the submission target spreadsheet, another setting could be the target folder for media? Getting a bit more complicated...

But it potentially causes trouble with permissions for different folders across Drive, which might be a whole new can of worms. There are obvious privacy problems if the data is openly available, but it could also be an administrative headache to manage the spreadsheet and submission folder(s) permissions separately.

Thanks for dedicating time to Google Sheets issues. Much appreciated.

That's true, I forgot about it. It's another thing we should try to improve. There is a topic that describes that problem: Set folder location for photos in Google Drive

Personally, I like this kind of filename:
All widgets_2019-11-20_01-00-04/1574204528724.jpg
as it makes it easier to see and find photos manually when necessary.
Feel free to get rid of "/storage/emulated/0/odk/instances/", I already have scripts stripping all that off.
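For anyone without such a script, a minimal sketch of the stripping step might look like the following. The function name `short_media_name` is hypothetical, and it simply assumes that the last two path components are the instance folder and the file name, as in the example above.

```python
from pathlib import PurePosixPath


def short_media_name(full_name: str) -> str:
    """Keep only '<instance folder>/<file name>' from a full device path.

    Hypothetical helper: assumes the last two components of the path are
    the instance folder and the media file, as in the example above.
    """
    # Drop the leading "/" root so it is never counted as a component.
    parts = [p for p in PurePosixPath(full_name).parts if p != "/"]
    return "/".join(parts[-2:])


print(short_media_name(
    "/storage/emulated/0/odk/instances/All widgets_2019-11-20_01-00-04/1574204528724.jpg"
))
```

A name that is already short passes through unchanged, so the same function would keep working if the prefix were removed at upload time.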

I'm guessing the folder name should be based on something stable, like an identifier. Do form names ever change partway through a data collection campaign? Are form names localized for some enumerators? The folder needs to stay consistent despite these possible variations in naming.

It is stable. Using the sample I provided:
Open Data Kit > Submissions > All widgets > All widgets_2019-11-20_01-00-04 > 1574204528724.jpg

Open Data Kit > Submissions > - always the same
All widgets - the part shared by all instance names of a form, which is the form name
All widgets_2019-11-20_01-00-04 - an instance name, which is based on the form name + date
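As an illustration of that breakdown, the instance name can be split back into its form name and date. This is only a sketch based on the single example above: the pattern `<form name>_YYYY-MM-DD_HH-MM-SS` is an assumption inferred from it, and `split_instance_name` is a made-up helper.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from the one example in this thread:
# "<form name>_YYYY-MM-DD_HH-MM-SS"
INSTANCE_RE = re.compile(
    r"^(?P<form>.+)_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})$"
)


def split_instance_name(instance_name: str) -> tuple[str, datetime]:
    """Split an instance name into (form name, timestamp)."""
    match = INSTANCE_RE.match(instance_name)
    if match is None:
        raise ValueError(f"unexpected instance name: {instance_name!r}")
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return match.group("form"), stamp


form_name, created = split_instance_name("All widgets_2019-11-20_01-00-04")
print(form_name, created.isoformat())
```

If a form were renamed partway through a campaign, the form-name part would change with it, which is the concern raised above about using a more stable identifier.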