Enable forms to access data from an external source

1. What is the general goal of the feature?
The pulldata feature works great for static information, but I'd like a way to retrieve information from the web and display it in the form at survey time. The source could be an API, Google Sheets, etc.

2. What are some example use cases for this feature?
It could be used to check whether a particular house has already been visited by another surveyor, or to verify some parameter against data that is already available.

3. What can you contribute to making this feature a reality?
I have no coding experience, but I can help with testing.

It feels like this is a pretty common area that people would like features for! Would you be able to flesh out the use case you're trying to support? How do you currently deal with this scenario? What's the biggest problem you're running into at the moment?

I'd be happy to help! Our projects sometimes involve interviewing the residents of a city about health, mosquitoes, etc.

Here's a typical scenario:

A surveyor has already visited a particular area on Day1. The first question is the House number which is unique throughout the city. The process is to open the form, enter the House number first, and then attempt to interview the resident. The second question in the form is if the residents are available for interview. If yes, it is followed by the other survey questions. If not, the form exits after recording that interview was not possible. The data is submitted to the server on a live basis and is available for download through an API endpoint.

On Day2, the process continues with the same surveyor or a different one. Ideally, as soon as a House number is entered into Question1 of the form, it should pull all the data submitted so far using the API endpoint (as a CSV file) and display the columns that show the status of this House number: whether it has been visited before, when and by whom, whether the visit was successful, etc.
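As a rough illustration of the lookup described above, here is a minimal Python sketch. The column names (`house_number`, `visit_date`, `surveyor`, `interviewed`) and the sample rows are made up for the example; a real CSV export would have its own columns. It only shows the filtering step, not how the form would fetch or display the data.

```python
import csv
import io


def visit_history(csv_text, house_number):
    """Return all previous visit rows for one house, oldest first.

    Assumes visit_date is an ISO date (YYYY-MM-DD), so that sorting
    the strings also sorts the visits chronologically.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for row in reader if row["house_number"] == house_number]
    return sorted(rows, key=lambda row: row["visit_date"])


# Made-up sample data standing in for the CSV returned by the API endpoint.
sample = """house_number,visit_date,surveyor,interviewed
H-101,2024-03-01,asha,no
H-102,2024-03-01,asha,yes
H-101,2024-03-02,ravi,yes
"""

history = visit_history(sample, "H-101")
for row in history:
    print(row["visit_date"], row["surveyor"], row["interviewed"])
```

An empty list means the house has no recorded visits yet, which the form logic could treat as "first visit".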

This feature will provide the following benefits:

  • Surveyors will not waste time trying to interview residents who have already been interviewed previously
  • We can programmatically prevent someone from visiting a house a second time if the visit was a success
  • We can encourage the surveyor to try harder for the interview if the previous visits were unsuccessful.

Other (ideal) requirements and notes:

  • It should be possible to pull data from a variety of sources, even Google Sheets, which someone can keep updating at the back end with any other information required.
  • If the data pull doesn't work for any reason (e.g. bad data connection, API is down, etc.), it should still be possible to proceed with the form. It would be best if the data pull status could be captured in a variable, so that we can build the form logic around it.
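The "proceed even if the pull fails" requirement could look something like the Python sketch below. This is only an illustration of the idea, not how ODK works today: `pull_with_status`, the `status` values and the example URL are all hypothetical names chosen for the example.

```python
import urllib.request


def fetch_csv(url, timeout=5):
    """Download the CSV as text; raises on network errors or timeouts."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def pull_with_status(url, fetch=fetch_csv):
    """Try to pull the data, but never block the caller.

    Returns a dict whose "status" field plays the role of the flag
    variable: "ok" when data arrived, "failed" otherwise.
    """
    try:
        return {"status": "ok", "data": fetch(url)}
    except (OSError, ValueError) as exc:
        # OSError covers urllib's URLError and socket timeouts;
        # ValueError covers a malformed URL.
        return {"status": "failed", "data": None, "error": str(exc)}


def offline(url):
    """Stand-in fetcher that simulates a bad data connection."""
    raise OSError("no network")


result = pull_with_status("https://example.org/visits.csv", fetch=offline)
print(result["status"])  # prints: failed
```

The form logic would then branch on the status: show the visit history when it is "ok", and fall back to the normal interview flow when it is "failed".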

How we are managing currently:

  • We provide a link to a Google Sheet containing updated visit information, which each surveyor can check for the status of each house. In practice it doesn't happen, either because the back-end person forgot to update the file or because the surveyor is not savvy enough to search within the sheet.
  • As a result, many of our surveyors spend a lot of time ringing doorbells, waiting for responses, etc., only to find that the interview has already happened.

Hope this is clear. If any other info is required, please let me know.


This is great info @amalm! Thank you. It's especially useful to see the "How we are managing currently" details. Understanding the problems you're running into is super helpful to getting to the WHY of changes that are being suggested.

One follow-up question: given that you're currently using a Google Sheet and you're suggesting that the form be able to pull data from a Google Sheet, I'm assuming your surveyors' devices are almost always connected to the internet when visiting households. Is that correct?

Yes, they're almost always connected to the internet. There are rare instances when the connection is bad, and it is for those cases that I suggested the form be able to continue with a flag variable or something.

Oh, and what would make this feature ideal is if the external data pull fetched only the new or updated records and not the entire database from the start. This would reduce both the data consumption and the time taken to download.
I have no idea, though, whether this is possible with the data structures in use.
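For what it's worth, the "only pull new records" idea is usually done by remembering a sync marker on the device. Below is a minimal Python sketch under some loud assumptions: each record carries a `submitted_at` timestamp in a single ISO 8601 format (so string comparison matches time order), the server can be asked for records newer than a marker, and `merge_new_records` is a hypothetical helper, not an existing ODK function.

```python
def merge_new_records(cache, last_sync, incoming):
    """Merge records newer than last_sync into cache, keyed by house number.

    Returns the new sync marker (the latest submitted_at seen so far),
    which would be sent with the next request to get only newer rows.
    """
    newest = last_sync
    for row in incoming:
        if row["submitted_at"] <= last_sync:
            continue  # already seen in an earlier pull
        existing = cache.get(row["house_number"])
        # Keep only the most recent visit per house.
        if existing is None or row["submitted_at"] >= existing["submitted_at"]:
            cache[row["house_number"]] = row
        newest = max(newest, row["submitted_at"])
    return newest


# Made-up records standing in for two successive pulls.
cache = {}
first_batch = [
    {"house_number": "H-101", "submitted_at": "2024-03-01T09:00:00Z", "interviewed": "no"},
    {"house_number": "H-102", "submitted_at": "2024-03-01T09:30:00Z", "interviewed": "yes"},
]
marker = merge_new_records(cache, "", first_batch)

second_batch = [
    {"house_number": "H-101", "submitted_at": "2024-03-02T10:00:00Z", "interviewed": "yes"},
]
marker = merge_new_records(cache, marker, second_batch)
print(marker, cache["H-101"]["interviewed"])
```

The actual saving in data consumption depends on the server filtering by the marker; the check inside the loop is only a guard against rows that get sent twice.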

This would definitely all need new feature development but it's something that has come up quite a few times. A bunch of discussions around this are listed here (hopefully we can add this to the list as well).

One other piece of info that might provide some additional context is what organization you're working for/with? No worries if you don't feel comfortable posting that!

I am looking for the same thing, and we would like to learn from your experience. Please help me understand how to use pulldata with live data over the internet. I use Google Docs to store the forms submitted by surveyors. Could you share what has worked for you?

Thanks so much!

I'm also interested in this feature but not sure if it has been made available.

We don't have immediate plans to connect directly from clients to arbitrary data sources though it's something that we will likely consider in the future.

If the data you want to feed back into follow-up forms is from ODK submissions, Entities should be a good match.

If the data exists in some external source of truth, you can consider using an external process to keep a Central entity list synchronized with it. For example, see the discussion here.
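To give a feel for what such an external sync process has to decide, here is a small Python sketch of the comparison step only. It deliberately makes no calls to Central: reading the external source and writing to the entity list are left out, and `plan_sync`, the `house_number` key and the sample rows are hypothetical.

```python
def plan_sync(source_rows, entities, key="house_number"):
    """Compare the external source with the current entity list.

    Returns (to_create, to_update): rows missing from the entity list,
    and rows whose properties differ from the stored entity.
    """
    current = {entity[key]: entity for entity in entities}
    to_create, to_update = [], []
    for row in source_rows:
        existing = current.get(row[key])
        if existing is None:
            to_create.append(row)
        elif existing != row:
            to_update.append(row)
    return to_create, to_update


# Made-up data: the external source of truth and the current entity list.
source = [
    {"house_number": "H-101", "status": "done"},
    {"house_number": "H-103", "status": "pending"},
]
entities = [
    {"house_number": "H-101", "status": "pending"},
]

to_create, to_update = plan_sync(source, entities)
print("create:", to_create)
print("update:", to_update)
```

A scheduled job could run this comparison every few minutes and push only the differences, so devices pick up the changes the next time they refresh the entity list.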

I think a basic requirement should still be that new features can be used without a permanent internet connection. This is one of the highlights of ODK (Collect), and it is necessary for many humanitarian projects and users.