This spec proposal relates to the new in-app recording feature discussed here.
We propose making the audio recording quality configurable in form design. This would let the form designer choose a quality based on the analysis to be done, without adding to the complexity of configuring clients.
XLSForm
Introduce a key named quality to the parameters column. This matches the pattern established by max-pixels or audit parameters. This key would only be applicable to questions of type audio.
type | name | label | parameters |
---|---|---|---|
audio | my_recording | Label | quality=voice-only |
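
As a rough illustration of how a converter might read the row above, here is a minimal Python sketch. The function name parse_parameters and the row dictionary are hypothetical, and it assumes the parameters cell holds whitespace-separated key=value pairs; it is not how any existing converter is confirmed to work.

```python
def parse_parameters(cell):
    """Split an XLSForm parameters cell into a key/value mapping.

    Assumption: pairs are separated by whitespace and written as key=value.
    """
    params = {}
    for token in cell.split():
        key, sep, value = token.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed parameter: {token!r}")
        params[key] = value
    return params


# Hypothetical in-memory form of the example row from the table above.
row = {
    "type": "audio",
    "name": "my_recording",
    "label": "Label",
    "parameters": "quality=voice-only",
}

params = parse_parameters(row["parameters"])
# The quality key is only meaningful for questions of type audio.
quality = params.get("quality") if row["type"] == "audio" else None
print(quality)
```

For the example row this yields voice-only; a non-audio row would yield None because the key does not apply there.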
XForm
Introduce a bind attribute in the odk namespace with name quality applicable to fields with bind type binary. It would be ignored for binary questions with mediatype other than audio/*.
<bind nodeset="/data/my_audio" type="binary" odk:quality="voice-only"/>
...
<upload mediatype="audio/*" ref="/data/my_audio">
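To make the "ignored for non-audio mediatypes" rule concrete, here is a small Python sketch of how a client might read the attribute. It works on a simplified fragment (a hypothetical root element, no XForms or XHTML default namespaces), the second bind and upload are invented for illustration, and the namespace URI bound to the odk prefix is an assumption.

```python
import xml.etree.ElementTree as ET

# Assumed URI for the odk prefix; check it against the real form definition.
ODK_NS = "http://www.opendatakit.org/xforms"

# Simplified, hypothetical fragment: real forms wrap these elements in
# namespaced head/model/body elements.
FORM = f"""<root xmlns:odk="{ODK_NS}">
  <bind nodeset="/data/my_audio" type="binary" odk:quality="voice-only"/>
  <bind nodeset="/data/my_photo" type="binary" odk:quality="low"/>
  <upload mediatype="audio/*" ref="/data/my_audio"/>
  <upload mediatype="image/*" ref="/data/my_photo"/>
</root>"""


def audio_qualities(xml_text):
    """Return {nodeset: quality} for binary binds backed by audio uploads."""
    root = ET.fromstring(xml_text)
    # Map each upload's ref to its mediatype so binds can be matched to controls.
    mediatypes = {u.get("ref"): u.get("mediatype", "") for u in root.iter("upload")}
    result = {}
    for bind in root.iter("bind"):
        quality = bind.get(f"{{{ODK_NS}}}quality")
        nodeset = bind.get("nodeset")
        if quality is None or bind.get("type") != "binary":
            continue
        # The attribute is ignored unless the control's mediatype is audio/*.
        if not mediatypes.get(nodeset, "").startswith("audio/"):
            continue
        result[nodeset] = quality
    return result


print(audio_qualities(FORM))
```

Only /data/my_audio is reported; the quality on the image question is silently dropped, which mirrors the "ignored" wording rather than treating it as an error.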
Values
Only the following string literals would be allowed as values:
- voice-only: minimizes file size by optimizing for voice. Only appropriate for one speaker/participant at a time with minimal background noise.
- low: allows voice recording in noisier backgrounds but not great for detailed sounds.
- normal: high enough quality for most applications while keeping file size low.
- external: recording will be delegated to an external app (same as current behaviour in Collect 1.28)
value | extension | codec | channels | sample rate | bitrate | file size |
---|---|---|---|---|---|---|
voice-only | .amr | AMR | mono | 8kHz | 12.2kbps | ~5MB/hour |
low | .m4a | AAC | mono | 32kHz | 24kbps | ~11MB/hour |
normal | .m4a | AAC | mono | 32kHz | 64kbps | ~30MB/hour |
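
The file size column follows from the bitrate. The Python sketch below redoes that arithmetic, assuming a constant bitrate, 1 kbps = 1000 bits per second, 1 MB = 1,000,000 bytes, and no container overhead; the names BITRATES_KBPS and megabytes_per_hour are illustrative only.

```python
# Bitrates in kbps, copied from the table above.
BITRATES_KBPS = {"voice-only": 12.2, "low": 24, "normal": 64}


def megabytes_per_hour(bitrate_kbps):
    """Audio payload size for one hour of recording, ignoring container overhead."""
    bits_per_hour = bitrate_kbps * 1000 * 3600
    return bits_per_hour / 8 / 1_000_000


for name, kbps in BITRATES_KBPS.items():
    print(f"{name}: {megabytes_per_hour(kbps):.1f} MB/hour")
```

This gives roughly 5.5, 10.8 and 28.8 MB/hour, in line with the rounded figures in the table; the small gap for normal is presumably rounding plus container overhead.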
Decisions around the details of the different quality settings (codec, container, bitrate, etc.) were based on the defaults we’ve seen in Sony’s (now deprecated but very popular) and Google’s Android recorder apps. From conversations on the forum and with potential users, AMR was identified as a good choice for low-storage, voice-optimized recordings that still work with transcription services. For the moment we’ve chosen not to offer a PCM or losslessly compressed option, as we’ve not seen many use cases that require it and it would be more work to implement. People who need this could continue to use external, but we also want to make sure that a high-quality recording option could be added later as a contribution.
By default, the recording quality will be normal. We propose making the use of an external app configurable in Collect settings but not the quality. This will give the form designer control over the file type and size when using the internal recorder.
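The allowed values and the default could be combined in a client roughly as follows. This is a Python sketch only: resolve_quality is a hypothetical helper, and raising an error on an unknown value is an assumption, since the proposal does not say how invalid values should be handled.

```python
# The four string literals allowed by the proposal.
ALLOWED_QUALITIES = ("voice-only", "low", "normal", "external")
DEFAULT_QUALITY = "normal"


def resolve_quality(raw=None):
    """Return the effective quality for an audio question.

    A missing or blank value falls back to the default. Rejecting unknown
    values is an assumption made for this sketch.
    """
    if raw is None or raw.strip() == "":
        return DEFAULT_QUALITY
    value = raw.strip()
    if value not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {value!r}")
    return value


print(resolve_quality())              # no quality set in the form
print(resolve_quality("voice-only"))  # quality set by the form designer
```

The first call falls back to normal and the second returns the designer's choice unchanged.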