Recording 1-2h interviews in Collect and exporting them from Central

Thalie · September 29, 2020, 10:36am

Hello,

I am currently trying to assess the feasibility of audio recording with ODK Collect with a requirement of 1-2h recording (...) for a crazy amount of in-depth interviews / focus group discussions (ideally using an external microphone).

So far, I have focused on 3 aspects:

max recording duration when using different third-party apps launched by ODK Collect / any crashes or risks of losing the recording at the point of collection?
data size / duration of data transfer when sending form
data retrieval from the ODK server

So far, I have performed the following tests (using RecForge II as a third-party)

2 audio tracks: 25 min / 45 min - transfer: less than 1 min (with good 4G connection)
1 audio track: 1 hour (on a note, there seems to be an error in the displayed recording duration when above 1h / the time falls back to 00:14 while it should instead indicate 01:14) - transfer: less than 1 min
download data from ODK Central (draft form not published) => the download fails

Do other users have similar indications when using other third-party audio recorders? This would be super helpful, while I continue to investigate.

seadowg · October 5, 2020, 8:10am

Out of interest are you planning to have a form that records other pieces of data or are you just planning to recording the interviews?

Oh wow nice find. I'll take a look at this in Collect.

What are you using to download and what are you seeing when it fails?

Thalie · October 5, 2020, 4:56pm

Yes, there are also short sociodemographic questionnaires associated with in-depth interviews, but it possibly could be conducted on another device. We also have a smaller study in which photos will be taken and GPS coordinates will be regularly recorded during a "walking" interview.

I indeed did not describe this properly. I tried to download a CSV/zip of my test database (including 2 submissions with the 25 / 45 min and 1h audio records I previously mentioned) using the "Download all records" button on ODK Central.

The download failed while trying to estimate the size of the *zip file.

I tested this again this time on a database with 1 single submission including a very short audio record and can retrieve the *.zip without any problems with the audio file in the media subdirectory, so I suspect that the problem with my initial test was that the audio records were too heavy and I should investigate more compressed formats, such as the package indicated by @aurdipas, but then it is likely I will run into the same issue if handling a database with a large amount of audio submissions... tricky! Should audio / media rather be accessed through the REST API?

seadowg · October 6, 2020, 8:45am

Just to get some more insight are you recording the interviews for transcribing later, archiving, auditing or some other reason?

Hmmm something is definitely not right there. I think you should post another support question and make sure to include the Central version you're running, how it's deployed and include screenshots of the errors you're seeing.

LN · October 9, 2020, 9:13pm

It's really hard to make guarantees of good behavior because there's so much going on with a mobile device. If you have a modern device and keep the screen on, you more than likely won't have any issues though this will have more to do with the audio capture app than with Collect. I would recommend recording multiple shorter tracks if you can to reduce your risk of losing an entire interview.

How big were those files? How much RAM does your server have? An export with large media files can make the server run out of memory so you may need to increase your server memory. I believe this is more sensitive to the size of your largest media file rather than the aggregate size of many media files but I'd like confirmation from @issa on that. When the server runs out of memory, it just kills Central and unfortunately Central doesn't have any way of gracefully letting you know why your download is failing. You may want to set up memory monitoring (instructions for Digital Ocean) to help you confirm that this is the problem and help you decide how much RAM you may need.

You may have better luck with this but if an individual file is very large you may still run into a similar problem. Regardless, you may find it beneficial to be able to just download the latest submissions instead of getting a zip with all of them each time. You can write your own script to do this, or use Briefcase.

You can track @seadowg's progress at https://github.com/getodk/collect/issues/4143.