Export of offline submissions from Collect

What high-level problem are you trying to solve?

At the ODK summit, José described a problem he has during data collection in DRC (where they collect literally millions of submissions via ODK):

  • Enumerators in a community use their own phones during a data collection activity, entirely offline.
  • The supervisor must then collect the phones from all enumerators and travel to a location with internet (possibly 2-3 days of travel, which is dangerous when carrying 20+ phones!).
  • During the transit time, the data collection activity has to stop (the enumerators no longer have their phones).
  • Ideally the submissions would all be on one device, so the supervisor can travel with it safely to the area with internet connectivity.

Any ideas on how ODK could help you solve it?

I labelled this post 'export offline submissions', but to avoid an XY problem, what is actually needed is a way to sync data between devices, ideally onto a single centralised device, so a single device can do the final submission for multiple enumerators.

  1. Using Starlink was discussed to mitigate this problem, as coverage is improving. However, electricity is also intermittent in the region, meaning the satellite receiver may not work reliably.
  2. I suggested running Central on a laptop with a WiFi antenna connected, to sync all data onto the laptop via WiFi. However, there are 40+ projects ongoing simultaneously, which would mean distributing 40 laptops, so this won't work.
  3. I also suggested running Central on a Raspberry Pi or similar SBC, but this isn't ideal either. A battery pack would be required to keep the connection reliable given the electricity issues.
  4. A really nice option could be device-to-device sync via Collect. E.g. a 'send to nearby device' button to export the submissions and send them to another user, who then imports the submissions into Collect. All users could do this and send the data to a single supervisor's version of Collect. However, this would be technically difficult to implement, so I don't know how realistic it is. There are many interesting sync protocols for peer-to-peer networking that could be explored if going this route.
  5. The solution I settled on as the most feasible is data export from Collect. If an enumerator works offline and collects submissions that have not been synced to a server yet, could we have an export button to simply dump the submission XML into a single zip containing the XML files? These zips could then be:
    • Copied to a single device via NFC, Bluetooth, etc.
    • Transported on a single device to internet connectivity.
    • Uploaded by the supervisor to HQ.
    • Ingested into Central, likely via the OpenRosa endpoint, or via pyodk?
    • Risk: if the user exports submissions, do we delete them from their device? If not, how do we prevent conflicts if their device later gets internet? If yes, the risk is that losing the file means the submissions are lost. I imagine this should probably be a slightly hidden option that isn't easy for users to trigger accidentally.
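
To make option 5 a bit more concrete, here is a minimal Python sketch of the export step. It is only an illustration, not how Collect works today: `export_submissions` is a hypothetical name, and I am assuming the unsent submissions sit as `.xml` files somewhere under a single instances folder. It only bundles the XML, so media attachments would need handling separately.

```python
import zipfile
from pathlib import Path
from typing import List


def export_submissions(instances_dir: str, zip_path: str) -> List[str]:
    """Bundle every submission XML under instances_dir into one zip.

    Hypothetical helper: the folder layout is an assumption, not Collect's API.
    Returns the archive member names that were written.
    """
    root = Path(instances_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # sorted() keeps the archive order deterministic across runs
        for xml_file in sorted(root.rglob("*.xml")):
            # Store paths relative to the instances folder so the zip is portable
            member = xml_file.relative_to(root).as_posix()
            archive.write(xml_file, arcname=member)
            written.append(member)
    return written
```

The returned list could be shown to the enumerator as a confirmation of what was exported, before any decision about deleting the originals.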

Additional context

  • A natural follow up to this could be import of the exported zip into Collect, making the whole process of submitting the final data a lot easier (directly from the device, instead of a bespoke workflow).
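
A rough Python sketch of what the import side might do, which also touches on the conflict risk above: read each XML from the zip, pull out its `instanceID`, and skip anything already known. `read_instance_id` and `new_submissions` are hypothetical names, and I am assuming each submission carries an `instanceID` element in its metadata block.

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Set


def read_instance_id(xml_bytes: bytes) -> Optional[str]:
    """Return the text of the first instanceID element, or None if absent."""
    root = ET.fromstring(xml_bytes)
    for element in root.iter():
        # Strip any "{namespace}" prefix before comparing the tag name
        if element.tag.rsplit("}", 1)[-1] == "instanceID":
            return (element.text or "").strip() or None
    return None


def new_submissions(zip_path: str, already_seen: Set[str]) -> Dict[str, bytes]:
    """Map instanceID -> XML bytes for submissions not seen before.

    Hypothetical helper: entries without an instanceID, or with one that is
    in already_seen or repeated within the zip, are skipped.
    """
    fresh: Dict[str, bytes] = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            xml_bytes = archive.read(name)
            instance_id = read_instance_id(xml_bytes)
            if instance_id is None or instance_id in already_seen or instance_id in fresh:
                continue
            fresh[instance_id] = xml_bytes
    return fresh
```

Keying on `instanceID` means the same submission arriving twice (once in an exported zip, once from the original phone) can be recognised rather than duplicated.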

Before ODK Briefcase was deprecated, this was a very useful piece of functionality for transferring the data (instances) from ODK Collect to a laptop.

It was useful not only for José's use case, but also for very small projects that had no need of a server.

  • Data collected in Collect
  • Instances exported to a laptop (via Briefcase in the past)
  • Instances exported to CSV (directly on the same laptop via Briefcase) and ready for analysis

It would be extremely useful if this functionality were available in ODK.
A button to export instances to a device connected via USB?
Or to export to CSV?


Oh cool, thanks for the input Aurelio!

I have never used Briefcase, so I didn't know this functionality previously existed in the ODK ecosystem!

I imagine there was a good reason for the team to deprecate Briefcase & the approach used there (I don't know the context). But I wonder if Collect to Collect data transfer is something that would be considered :smiley:
