Collect: Extend audit log to include GPS coordinates

What is the general goal of the feature?
As a supervisor, I would like to occasionally collect GPS coordinates in the background to provide evidence that data was collected in a particular place.

Currently, users can use required GPS questions, but it's possible to get bad data. For example, if you have a long survey, you can collect GPS in the morning, then go home and fill in the survey data whenever.

Users could also use third-party GPS apps to track location, but that's a fragile process. Also, the location doesn't come attached to the submission, which isn't ideal.

What are some example use cases for this feature?
There have been related feature requests over time that suggest that this is a good idea. This feature is a little different because:

  1. Unlike "Auto-gps implementation on ODK Collect" (which we still want to do), this will not require the form designer to decide when in the form they want the location captured; it'll be done opportunistically. The data also will not be in the form; it'll be in an audit file. As a side effect, we won't have to touch too much of the spec or backing libraries, so it should be easier to implement and ship.
  2. The goal isn't to provide a high-fidelity trace of the location like "Record a geoline in the background" and "Parallel collection of Geotrace while collecting other information", but rather to be opportunistic and battery-saving while providing some audit capabilities.

How will the feature work?

See https://github.com/opendatakit/roadmap/issues/28 for the final spec.

~~Update: We have a rough specification at https://docs.google.com/document/d/1Xw4WOctVgg11ZfYwTnrL6qlTKSYwhn_pTc2GQ1OFaFM~~

We will extend the form audit log to add location to the log if an accuracy or frequency parameter is set in the audit row of the XLSForm.

  • accuracy will be the minimum acceptable accuracy in meters for a location to be recorded.
  • frequency will be the minimum frequency in seconds that the location will be fetched.
  • age will be the time in seconds that a location will be valid.
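
As a rough illustration of how these three parameters might be read from the audit row, here is a minimal Python sketch. It assumes the parameters arrive as a space-separated `key=value` string, which is how XLSForm writes other parameters cells; the names `AuditLocationConfig` and `parse_audit_parameters` are hypothetical and not part of pyxform, JavaRosa, or Collect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditLocationConfig:
    """Hypothetical holder for the three proposed audit parameters."""
    accuracy_m: Optional[float] = None   # minimum acceptable accuracy, in meters
    frequency_s: Optional[float] = None  # how often a location is requested, in seconds
    age_s: Optional[float] = None        # how long a cached location stays valid, in seconds

    @property
    def location_enabled(self) -> bool:
        # Per the proposal, location is logged if accuracy or frequency is set.
        return self.accuracy_m is not None or self.frequency_s is not None


def parse_audit_parameters(parameters: str) -> AuditLocationConfig:
    """Parse an assumed 'key=value key=value' parameters cell."""
    config = AuditLocationConfig()
    field_for_key = {
        "accuracy": "accuracy_m",
        "frequency": "frequency_s",
        "age": "age_s",
    }
    for token in parameters.split():
        key, sep, value = token.partition("=")
        # Ignore tokens that are not key=value or that use unknown keys.
        if sep and key in field_for_key:
            setattr(config, field_for_key[key], float(value))
    return config


config = parse_audit_parameters("accuracy=10 frequency=60 age=120")
print(config.location_enabled, config.accuracy_m, config.frequency_s, config.age_s)
```

An audit row with no parameters would leave `location_enabled` false, so existing forms would behave as they do today.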

When Collect opens the form, we will request location updates from the GPS provider every frequency seconds (with minDistance=0, for those familiar with the Android API). If the location meets the accuracy threshold it will be cached.

When an audit event (e.g., view question, form save) occurs, Collect will check the cache for the location. If one is found and age has not been exceeded, it will write to the log.

Location information will be recorded in three columns (latitude, longitude, accuracy) in the audit log that is attached to Collect's submissions.
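
The cache-then-log behaviour described above can be sketched roughly as follows. This is not Collect's code (Collect is an Android app); it is a minimal Python model, and the class and function names are hypothetical. It assumes that "meets the accuracy threshold" means the reported accuracy radius is at or below the accuracy parameter, and that an event with no usable location leaves the three columns empty, which the proposal does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Location:
    latitude: float
    longitude: float
    accuracy_m: float   # reported accuracy radius in meters; smaller is better
    timestamp_s: float  # time the fix was received, in seconds


class LocationCache:
    """Keeps the most recent fix that met the accuracy threshold."""

    def __init__(self, max_accuracy_m: float, max_age_s: float):
        self.max_accuracy_m = max_accuracy_m
        self.max_age_s = max_age_s
        self._latest: Optional[Location] = None

    def on_location_update(self, location: Location) -> None:
        # Called for every update from the provider; cache only good-enough fixes.
        if location.accuracy_m <= self.max_accuracy_m:
            self._latest = location

    def get_fresh(self, now_s: float) -> Optional[Location]:
        # Return the cached fix unless nothing is cached or it is older than age.
        if self._latest is None:
            return None
        if now_s - self._latest.timestamp_s > self.max_age_s:
            return None
        return self._latest


def audit_location_columns(cache: LocationCache, now_s: float) -> List[str]:
    """Return the latitude, longitude, accuracy cells for one audit row."""
    location = cache.get_fresh(now_s)
    if location is None:
        return ["", "", ""]  # assumption: empty cells when no valid location exists
    return [str(location.latitude), str(location.longitude), str(location.accuracy_m)]


cache = LocationCache(max_accuracy_m=10, max_age_s=60)
cache.on_location_update(Location(1.5, 2.5, 50.0, timestamp_s=0))  # too inaccurate, ignored
cache.on_location_update(Location(1.5, 2.5, 5.0, timestamp_s=10))  # cached
print(audit_location_columns(cache, now_s=20))
```

Keeping the cache separate from the audit writer means audit events never wait on the GPS; they only read whatever was last cached.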

To ensure enumerators are aware that this sensitive data is being gathered, Collect will show a dialog saying "This form collects your location in the background" (or something similar) on first load, with an OK and a "Do not show again" option.

Risks
This feature will touch a number of tools across the ecosystem. It'll require changes to the XForm spec documentation, XLSForm documentation, pyxform to support the new parameters, JavaRosa to pass through those parameters to Collect, and Aggregate to display the new rows.

What can you contribute to making this feature a reality?
Spec and implementation guidance. Also docs.

Oh and if this isn't clear, I'd love some critical feedback on this! Does the approach make sense? What haven't I thought of?

1 Like

This approach makes sense to me. I like the concept of an auto-GPS implementation in ODK Collect. Like start time and end time, it would start collecting automatically as soon as the interview starts, and stop when it ends. Isn't this the same sort of thing?

2 Likes

Are there any data privacy issues involved here, since it might appear - in effect - to directly/indirectly enable GPS tracking of individual enumerators? For example, the new GDPR laws that recently came into effect in the EU seem to have something to say about such things: "Will logistics companies still be able to track drivers?"

Would it be sufficient to have an explicit 'opt-out' option in any associated form that enables this feature (aka informed consent)?

My concern is (a) that this may be considered an 'invasion of privacy' in some countries, and (b) laws around this may well be quite different between countries, consequently we'd have to worry about controlling what countries a specific form may be deployed to. Ugh.

If this is a can of worms, a possible alternative might be to timestamp GPS acquisitions (or does the existing audit log already do this?). At least this can give a rough indication that the GPS fix was acquired at a 'reasonable' time relative to the start/end time of the survey.

@yanokwa
I very much like this feature. It was needed, like, yesterday.

On privacy concerns, enumerators are study staff and need to (1) account for their time use and (2) collect research data authentically and credibly.

Please let me know when this can be tested.

Paul

1 Like

When the frequency interval is met and the device tries to record a location, how long would the device try to reach the desired accuracy?

From one of our M&E (monitoring and evaluation) staff:

So this is collecting GPS coordinates as metadata? Certainly great for data quality and managing the process! With such a feature available (along with several other types of metadata), it'd be helpful to include an "enumerator consent" step even before the beneficiary consent, to inform the data collector that certain metadata (incl. their GPS location) are being collected. (I guess such consent can be obtained at onboarding of the staff/volunteers, but integrating it with the data collection itself would be a more efficient way of ensuring it happens, and may serve as a reminder to drive the desirable behavior.) I think this can just be simple boilerplate language added as an 'acknowledge' question at the beginning.

Could be something good to note in the documentation.


Also:

One concern is - I know sometimes people have a tough time collecting GPS coordinates as a question depending on location (hence the tip to not make it mandatory), so I wonder if collecting this in the background would cause the Collect app to be clunky or freeze if the device can't get the signal.

If the user sets the frequency to a very short interval, there might be some impact on battery usage but otherwise this concern shouldn't materialize?

Regarding the 'invasion of privacy', I think this issue is more about having good practices and procedures in place than about not having this feature. If you are collecting a GPS coordinate, maybe you can add a note at some point in the survey specifying it, or say so when explaining the data that has to be collected. If the country does not allow gathering GPS coordinates, you should not include this option.

Just for clarification, I am not an expert on GDPR, so this is only my point of view. Maybe an expert has another opinion.

1 Like

Agreed with the responses from @berpita and @paul_macharia about @Xiphware's question. Collect isn't a consumer app, it's a workforce app where privacy requirements are different. It's up to the form designer to design an appropriate form.

@danbjoseph Getting consent is a great idea, but I don't think we should bake it into the implementation. It does raise the question about where we put these survey best practices (and maybe sample form snippets) in our documentation. Perhaps you can try to think through where we could put these things?

When the frequency interval is met and the device tries to record a location, how long would the device try to reach the desired accuracy?

The device will not stop trying because I can't think of a good way to determine when to remove updates and add them back.

When Collect opens the form, we will request location updates from the GPS provider every frequency seconds (with minDistance=0, for those familiar with the Android API). If the location meets the accuracy threshold it's cached.

When an audit event (e.g., view question, form save) occurs, Collect will check the cache for the location. If one is found and age has not been exceeded, it will write to the log.

So with this setup, you could potentially get a log with the same locations for a few events. That is, as long as the location isn't old, it will be written to the log. And yes, if a user sets a frequency to a very short interval, they could drain the battery.
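
To make the repeated-location effect concrete, here is a small Python simulation. It is only a sketch: the function name, the fix labels, and the timings are made up for illustration, and the fixes are assumed to have already passed the accuracy check.

```python
def locations_for_events(fixes, event_times, age_s):
    """For each audit event, return the label of the most recent cached fix,
    or None if there is no fix yet or the cached fix is older than age_s.

    fixes: list of (timestamp_s, label) tuples sorted by timestamp.
    event_times: list of audit event timestamps in seconds.
    """
    results = []
    for event_time in event_times:
        latest = None
        for fix_time, label in fixes:
            if fix_time <= event_time:
                latest = (fix_time, label)  # later fixes overwrite earlier ones
        if latest is not None and event_time - latest[0] <= age_s:
            results.append(latest[1])
        else:
            results.append(None)
    return results


# Two accepted fixes 60 seconds apart, five audit events, age of 50 seconds.
fixes = [(0, "A"), (60, "B")]
events = [5, 20, 45, 70, 200]
print(locations_for_events(fixes, events, age_s=50))
# prints ['A', 'A', 'A', 'B', None]
```

The first three events all log fix A because it is still within age, and the last event logs nothing because fix B has expired by then.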

I've added the above detail to the top post of this topic.

1 Like

Would it be adequate to have something that (explicitly) indicates when a particular form is going to actively track the user? It's not quite 'informed consent', but at least they are aware they are (now) being tracked. Also, it might be prudent to make sure form designers are aware that when enabling this particular feature in their form, there may be local regulations they need to follow depending on where the form is being deployed.

Sorry if I'm being excessively pedantic/worrywort here, but I'm not convinced that - in a strict legal sense - workforce vs consumer necessarily trumps data privacy laws. And anything that tracks users without their knowledge (or consent) raises warning flags for me :worried:

We can certainly include a note in the documentation that form designers should be aware of privacy regulations.

As for the user, they would have had to previously accept that Collect will have access to their location. Additionally, the GPS icon at the top of the phone's UI will be active and blinking while their location is being collected (and I'm pretty sure the whole time the form is open). What other UI elements would you add to Collect to provide greater visibility into what is happening? Sketches or mockups are welcome!

1 Like

I think the issue is less that Collect is acquiring or using the GPS, and more that the user's location is actively being tracked and that this data will be reported back. Perhaps something as simple as a short read-only note at the beginning of the form saying "This form actively tracks and reports your location." This could just be a recommended best practice for form writers when using this feature. At the end of the day, Collect is just a general-purpose tool, so the burden lies on the form creator to ensure it is used in accordance with local regulations.

What I don't want to see happen is for Collect to be inadvertently labelled 'spyware' because someone deploys a GPS audit tracking form but never informs their users. This is the difference from, say, Google Earth, which is constantly using the GPS but not actually reporting your location back anywhere to be processed for ulterior purposes.

1 Like

@Dr. Gareth,

I think I see your concern.

Maybe at enumerator recruitment/hiring, a statement like "we trust you will collect data authentically and credibly. Inbuilt device features will be used to verify this" could be included in their terms of engagement. Metadata, including GPS coordinates, would then be covered.

I think there's a thin line between working to make data collection as credible as possible (researchers and data managers will understand what I mean) and "spying" on field staff.

Please let me know when the feature is ready for use. Hoping to try it ASAP.

Asante,
Paul Macharia

1 Like

It is a difficult question, but I think it's manageable if what is proposed is to record the location when a question is answered, i.e., using the audit log. When they are answering questions, then presumably the enumerator is working, and if they answer that question in a bar then it's not unreasonable for the employer to know. If they spend their time in the bar playing Candy Crush, then no location will be recorded?

It would be good to have an icon at the top of the phone's UI displayed to indicate that the GPS location is being recorded while the form is completed. I take it the GPS icon that Yaw refers to just indicates that GPS is enabled? Could an ODK-specific icon be added to show that it is also being recorded?

[After cogitating on this for far too many days now, I think I've changed my mind... sigh].

As loath as I am to drag anything remotely resembling public policy into a technical discussion, I do feel strongly that ODK is a force for social good, through its facilitation of collecting (hopefully) accurate data about the world we live in. I would not want to see ODK exploited for ulterior purposes, but in this day and age there are many examples of cutting-edge technology being exploited without consideration of individuals' rights of privacy. At the end of the day I think we all have the right to know whether the devices we are carrying are actively recording and reporting our daily activities (including our precise whereabouts) so that we may at least respond accordingly. ie "informed consent".

Knowing how easy it is to subjugate, I also don't automatically trust those who provide me with technology to hold my privacy paramount. Providing a mobile device, along with some suitable forms, makes it too easy and attractive to covertly track a person's location, without their knowledge, at the flick of a switch ('justified' or not). My inclination might therefore be to have Collect always pop up an alert whenever starting any form that activates GPS audit tracking, of the ilk "This form actively tracks and reports your location. [OK] [Get me out of here]". This is much as Firefox does when encountering untrusted webservers, and seems more in keeping with the predominantly socially conscious nature of open source: the user comes first.

Or to put it a different way, what is the compelling argument against explicitly alerting users when their location is actively being tracked and reported, and giving them a way out?

[OK. I'm going to shut up now, put my tinfoil hat back on, and retire to my fallout shelter... :slight_smile: ]

2 Likes

Apps request permissions upon install. Would it make sense to have an automatic screen at the start of filling a blank form that alerts the user as to what metadata is being collected? (Should users know that a form is recording start time and end time and logging the time spent on each question?) Do we make this optional? It's certainly good practice, but there are circumstances where it might not be desirable: enumerators might already be briefed on what's being collected and be tasked with completing a large number of surveys in a limited amount of time, and requiring them to swipe past another screen each time they start adds an unnecessary and ultimately burdensome amount of time to the data collection process. Could you make the screen show up only the first time a given survey form is opened? But what if a device is shared among multiple enumerators, or the supervisor checks the device/form once before handing it off to an enumerator and they never see it? Would a warning upon just opening the ODK Collect app be enough? It could be generic, such as "forms filled out with this device may be configured to collect X, Y, Z in the background", or somehow a custom message based on the survey forms that have been downloaded to the device.

1 Like

@danbjoseph Perhaps you have some insight here: to what extent do field surveys typically/atypically involve notions of anonymity of the subject(s)? For most practical purposes, taking a GPS fix in a survey eliminates any guarantee of anonymity (unless all your subjects travel to you). With an explicit geopoint question in the survey it is obvious to both the enumerator and subject that their exact whereabouts is going to be reported. But in a survey without such question, but with GPS audit tracking, there may be a mis-perception on the part of the subject that their responses are still anonymous. Thoughts?

1 Like

excellent point. however, i would also argue that geopoints may be frequently collected without the subject knowing as well. for example, after exiting a house the enumerator then records the location. pressing the button to record a geopoint can be done without saying anything to, or asking anything of, a respondent. i'm not sure how much of this can be solved with software design (short of just not implementing functionalities) and how much needs to be just following best/ethical practices when implementing surveys? other random side thought is that some forms may be only observational and there is no "subject".

1 Like

To add to what @danbjoseph said, I would guess most of the folks that collect GPS location of a participant (not the enumerator) don't tell the participant. It's not ideal, but there's not much we can do about that.

@Xiphware Form designers (who are also our users) do not want to give enumerators a way out. That is, if a form says you need an audit, then it is required and your options are to accept to have the audit log or the form doesn't load.

I'm not opposed to notifying on form load, but I want to make sure we aren't adding a burden to the common case: enumerators who are being hired and have been briefed on the terms for their hiring.

I propose we add a dialog on form load that says "This form collects your location in the background" (or something similar) on first load, with an OK and a "Do not show again" option. This adds more state that we have to track in the app, but so be it.
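
The extra state could be quite small. Here is a minimal Python sketch of one way to track it, assuming the "Do not show again" choice is remembered per form; the class name and form ID are hypothetical, and a real app would persist this rather than keep it in memory.

```python
class LocationNoticeState:
    """Tracks which forms have had the background-location notice silenced."""

    def __init__(self):
        self._suppressed_form_ids = set()

    def should_show(self, form_id: str) -> bool:
        # Show the dialog on every load until "Do not show again" is chosen.
        return form_id not in self._suppressed_form_ids

    def dismiss(self, form_id: str, do_not_show_again: bool) -> None:
        # A plain OK leaves the state unchanged, so the dialog appears next time.
        if do_not_show_again:
            self._suppressed_form_ids.add(form_id)


state = LocationNoticeState()
print(state.should_show("household_survey"))   # True: never dismissed
state.dismiss("household_survey", do_not_show_again=False)
print(state.should_show("household_survey"))   # True: plain OK does not silence it
state.dismiss("household_survey", do_not_show_again=True)
print(state.should_show("household_survey"))   # False: silenced for this form
```

Keying the state by form means a different form that also collects location would still show its own notice.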

The caveat here is most enumerators will not see this message because they often aren't the ones that open the form first, but I think it's the best we can do.

2 Likes

I think this is reasonable. The 'user' is, in effect, acknowledging that they are installing an optional component to an app which will collect and report their location information. And the fact that this acknowledgement itself is being audited means we have a paper trail to prove they agreed, should they subsequently dispute their consent.

2 Likes