Adding native audio recording for `audio` type questions

This is a continuation of the recent discussion here. It's also closely related to creating background recordings, since both require native recording capabilities.

1. What is the general goal of the feature?

  • Collect users should be able to record audio responses to interview questions without the need for external apps
  • Other applications often are not free, have many confusing settings, and generally add to the training requirements for enumerators

2. What are some example use cases for this feature?

  • Recording interview responses
  • Recording focus group discussions
  • Recording environmental samples (e.g. birds, traffic, etc.)

3. What can you contribute to making this feature a reality?

  • UX/UI design, funding

Hi @Tino_Kreutzer, great that you created this thread. Today I spoke with my friend, a co-financer and developer, and he shared some concerns about licensing and patent issues with AMR that we should take into account:

"In the category of personal computer products, e.g., media players, the AMR decoder is licensed for free. The license fee for a sold encoder falls from $0.40 to $0.30 with volume, up to a maximum of $300,000 annually. The minimum annual royalty is not applied to licensed products that fall under the category of personal computer products and use only the free decoder.[7][8]"

Other royalty-free formats of interest that he shared with me are:

FLAC - lossless compression, compresses WAV to ~40%:

Opus - lossy compression focused on voice (an alternative to MP3):

I guess a good starting point for the conversation is to define what we can do with the resources my friend and I have available; from there, we can look for other funding or for developers to contribute. As you mentioned, defining the specs is the other side of this.

Over the next few days I will describe the initial scope, and we can continue the conversation from there. What do you think?


Hi all,
Here is a description of the app that we have already arranged funding to develop:

The key points of this app are:
• Available on Google Play; runs on Android
• Audio can be recorded in MP3, WAV and other formats such as FLAC and Opus
• Open source code
• We plan to use it for short interviews or for answers to open questions
• Responds to: android.provider.MediaStore.Audio.Media.RECORD_SOUND_ACTION
• Compatible with ODK Collect v1.27.3
• Developed in Java

Our plan is to develop the external app first and use it in the field. Later, we could look for funds or time to adapt it for integration into ODK Collect. The reason is that we need to have it working in September. We will be sending the data to ODK Central, and I don't know whether there is anything else we should consider or whether this is enough to start.

Thank you for your comments. I will let you know the name of the app and where the code is once it is ready to share.

Thanks for the update @Inti_Luna! Looking forward to seeing what you've worked on. It would be interesting to have the team working on the app connect with the ODK community on here or on Slack around how any future integrations could work and around potential blockers to merging like licensing, code quality and test coverage.

Hello Seadowg and all,
The code is almost ready for this first version. You can see the code in

Some points to mention:

  • formats available at the moment: wav, flac and opus.
  • the app is 1.5 MB and takes less than 5 MB once installed
  • 10 seconds of audio (16 kHz, mono, 16-bit) takes:
    191 KB in wav
    112 KB in flac (58% of wav size)
    12 KB in opus (6% of wav size)
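
For anyone who wants to double-check the percentages, here is a minimal Java sketch that recomputes them from the three sizes listed above. The class and method names are only illustrative and are not taken from the app's code.

```java
// Illustrative helper, not part of the recorder app.
final class FormatSizeRatios {

    /** Returns a file size as a percentage of the WAV reference size. */
    static double percentOfWav(double sizeKb, double wavKb) {
        if (wavKb <= 0) {
            throw new IllegalArgumentException("wavKb must be positive");
        }
        return 100.0 * sizeKb / wavKb;
    }

    public static void main(String[] args) {
        // Sizes reported in this post for the same 10-second recording.
        double wavKb = 191;
        double flacKb = 112;
        double opusKb = 12;

        System.out.println("flac % of wav: " + percentOfWav(flacKb, wavKb));
        System.out.println("opus % of wav: " + percentOfWav(opusKb, wavKb));
    }
}
```

This gives roughly 58.6% for FLAC and 6.3% for Opus, in line with the rounded figures above.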

I wanted to ask people who speak French, German and Portuguese whether you can help with the translation, so that it is ready for this first version. There will be instructions on how to add more languages, but I wanted to start with at least Spanish, English, French, German and Portuguese. Other, less widely spoken languages are welcome too.

You can fill in these Excel sheets with the words in each language and send them back to me as soon as possible, please: frances.xlsx (8.1 KB), german.xlsx (8.1 KB), portuguese.xlsx (8.1 KB)

We are waiting for the app to be reviewed by Google Play, and then we will share it with you.

Hi @Inti_Luna, great to see this project advance! I haven't tested your first version yet. Do you have a compiled APK you can share?

Great to see the support for Opus! I was wondering whether you've tested Opus against AMR for speech recording (in terms of file size relative to quality)? Was there a licensing issue that prevented including MP3?

The value of this in contexts where data security is an issue cannot be overstated.

The big problems we see are:

  1. The audio recorder saves a copy of the recording to disk, which has to be manually removed. At present our teams collect a lot of sensitive health data interviews and use RecForge Pro II. It is a very messy process to clean up the files after saving.

  2. Audio files get big very quickly, which strains our ability to upload media files over limited bandwidth. The Opus format sounds promising if 1 hour of interview comes in at less than 5 MB, but I haven't worked with Opus before and I would love to test an APK to see how this works in the field.

It would be good if the native audio system could be flexible in how it handles files:

  1. Audio file encryption using ODK form level encryption. Should work out of the box I guess.
  2. Option to save copy of file to Android storage (in encrypted format)
  3. Option not to send the media file with the data submission, i.e. leave the media file on the device but send the data without attachments
  4. Options for format, quality, etc. would be set from within the XLSForm specification.

This should be taken care of by "Remove previously taken images from gallery (via setting)", which should be implemented soon. (It will cover not just images but all attachments.)

RecForge II allows recording in Opus, but strangely only at 48 kHz. I've had very good results recording voice with AMR/3gp, though Opus claims to be even smaller. In my work AMR required 0.1 MB per minute at phone quality, which would be small enough for field contexts. Higher quality would be good for multi-speaker or bad recording environments, though. This is why allowing users to specify the recording quality is quite important.
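
To put the numbers in this thread side by side, here is a rough Java sketch that scales the 12 KB per 10 seconds reported above for Opus up to one hour and compares it with my 0.1 MB per minute for AMR. It assumes file size grows linearly with duration and that 1 MB = 1024 KB; both are simplifications, and the class and method names are made up for illustration.

```java
// Illustrative back-of-the-envelope estimate, not a measurement.
final class RecordingSizeEstimate {

    /** Scales a measured file size linearly to one hour of audio (assumption: constant bitrate). */
    static double kbPerHour(double sizeKb, double durationSeconds) {
        if (durationSeconds <= 0) {
            throw new IllegalArgumentException("durationSeconds must be positive");
        }
        return sizeKb * 3600.0 / durationSeconds;
    }

    /** Converts kilobytes to megabytes, assuming 1 MB = 1024 KB. */
    static double kbToMb(double kb) {
        return kb / 1024.0;
    }

    public static void main(String[] args) {
        // Opus: 12 KB for 10 seconds at 16 kHz mono, as reported earlier in the thread.
        double opusMbPerHour = kbToMb(kbPerHour(12, 10));
        // AMR: 0.1 MB per minute at phone quality, from my own recordings.
        double amrMbPerHour = 0.1 * 60;

        System.out.println("Opus MB per hour (estimate): " + opusMbPerHour);
        System.out.println("AMR MB per hour (estimate): " + amrMbPerHour);
    }
}
```

Under these assumptions Opus comes to about 4.2 MB per hour and AMR to about 6 MB per hour. The two figures come from different recordings and quality settings, so this is only an order-of-magnitude comparison and not a proper test.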
