Collect: Extend audit log to include GPS coordinates

Hi everyone. I have started investigating this issue. Here is the doc I'm working on: https://docs.google.com/document/d/1Xw4WOctVgg11ZfYwTnrL6qlTKSYwhn_pTc2GQ1OFaFM/edit?usp=sharing

Please review and comment if you have any thoughts.
Thanks!

Hi All,

Under GDPR we would need to obtain specific a priori consent to collect GPS data on the enumerators [opt-out is a key thing that GDPR precludes; all consent must be 'opt-in']. In a research study we would additionally require ethics board approval (both from the lead institution and from in-country boards) to do the same. You would only need to collect this once per enumerator per study, so a consent form signed by your team members at the start of the study would allow you to 'track' the enumerators throughout the study.

Possibly more important is that this form behaviour implicitly collects GPS data on study participants whenever the study involves collecting data about living individuals. This also requires both a priori consent from the participant and ethics board approval, which I am confident you would never get from any ethics board. They are very unlikely to allow any study to collect GPS in the background for 'project monitoring' reasons, because those data make no specific and direct contribution to the goals of the work but would make it possible to identify the location of individuals, their homes etc.

A standard requirement for any GPS data would also be encryption on the server. GDPR sees no legal difference between data that are fully identifying and data that are pseudonymous (or actually encrypted), and GPS data are not considered to be anonymous whether or not names/ids are attached. Encryption is therefore the bare minimum for making it 'reasonably unlikely' [a GDPR term] that a malicious actor could identify individuals and their locations from any information in the data set, including 'background' metadata and audit information.

@Jefferson_Francisco this feature is being tracked as a card on the project roadmap and is currently in the "Under Consideration" category.

@chrissyhroberts thank you for your valuable and detailed input. This feature would be something that has to be enabled by the form administrator, and not a default part of the audit log. There are a wide variety of users of ODK and I think there are scenarios for this feature that are outside the use-case you outline (paid enumerators, who have signed consents about the tracking, surveying utility assets like light posts or agricultural features?).

It is important that any user of the technology understands the applicable legal issues around their activities. We would hope they would also consider what is generally good ethics, regardless of whether a legal framework regulates it. Even with just the geopoint question, it would be possible for an enumerator to collect the location of a surveyed person without explicitly asking their consent. And even a basic text question collecting someone's name means you should encrypt and protect your data.

While including the option to add location to the things tracked in the audit log is something people could use unethically and/or illegally, I'm not sure it increases the potential for unethical action that much when you consider the full range of functionality of any widely adaptable mobile data collection tool. But I would be interested in hearing arguments to the contrary. There's a huge breadth of experiences within the ODK community and the many different perspectives help make it a stronger, better project.

@Jefferson_Francisco && @danbjoseph : The point is not only to comply with GDPR but to consider as paramount that this is a data tool and that the way the developer community implements the tool's features is both a guide to and a functional limiter on best practice for data collection. I think it is very much the responsibility of the developers to consider how the feature implementation supports all users to make good decisions about how to collect their data in a responsible way.

You make a strong point that ODK is an open tool, but open is about more than just the source code; it is also about behaviours. If you essentially want a tool that can surreptitiously check that your submitters were where they said they were (implicitly saying that you neither trust them nor the data they submit) then this is far from open and in the realms of the kind of intrusive tracking that Google, Facebook and co are rightly criticised for.

In my opinion, if you want to know where your data was collected, then an open, front-facing GPS question that the operators are aware of is the simplest, most open tool for this. It does seem, though, that the current GPS widget might need to be altered so that data entry does not include the ability to change the GPS data manually.

Finally, I think it is important to remember that many GPS modules will keep reporting the last good fix, because the assigned variables may not be cleared when the GPS stops feeding new NMEA sentences. This means that if the GPS signal drops and the secret GPS tagging records an old fix, it might look like your operator faked all their data and sat at home submitting forms, when in fact they were working with a dodgy signal. Are you going to fire them on the basis of this data? We've seen this GPS behaviour in lots of mapping tests. At the very least it would be important to read the data from the correct NMEA sentence and to capture the fix date and time, which would help you avoid such mistakes.
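To make the stale-fix concern concrete, here is a minimal sketch of the kind of check that becomes possible once the fix time is stored alongside the coordinates. The `LoggedFix` record, its field names and the 30-second threshold are all hypothetical and chosen for illustration; this is not how Collect or any particular GPS library represents a fix.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class LoggedFix(NamedTuple):
    """Hypothetical log record; field names are assumptions for this sketch."""
    latitude: float
    longitude: float
    fix_time: datetime     # timestamp the receiver attached to the fix (UTC)
    logged_time: datetime  # device clock when the value was written to the log (UTC)


def is_stale(fix: LoggedFix, max_age: timedelta = timedelta(seconds=30)) -> bool:
    """Return True when the fix is older than max_age at the moment it was logged."""
    return fix.logged_time - fix.fix_time > max_age


now = datetime(2019, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
fresh = LoggedFix(51.5, -0.12, now - timedelta(seconds=5), now)
old = LoggedFix(51.5, -0.12, now - timedelta(minutes=20), now)
print(is_stale(fresh), is_stale(old))
```

Without the fix time, the two records above would be indistinguishable in the log, which is exactly the situation in which a dropped signal could be misread as fabricated data.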

Hi All. Data protection/privacy concerns are very real and an important thing to discuss. Please refrain from ad hominem attacks or comments that might appear to cast negative intentions on anyone here. This forum is an important place for the community to come together and discuss things like this, but please do so in a civil manner that takes into account that people from a wide variety of cultures/backgrounds/etc may be on here and what one person might interpret as simply "direct" might be interpreted differently by someone else. I've seen plenty of shouting matches erupt on forums and listservs and I don't want this to devolve into that, as nothing productive usually comes out of those.

Thank you, @danbjoseph, for bringing us back to respectful discourse.

I'm not sure whether this was made explicit, but this is very much not a theoretical requirement: several large organizations already do this kind of background position capture with the full cooperation of their enumerators and in compliance with local regulation. Currently, they need to build custom applications or rely on existing third-party applications. This complicates the analysis and makes it quite difficult to link positions to events in the form. Alternatively, some other organizations have the enumerators manually request locations at different points in the form. This is tedious for enumerators and can slow them down significantly.

There is plenty of room for logs and checks even in a context where everyone is fully trusted because mistakes or misunderstandings can happen! @yanokwa, perhaps the goal statement could be reframed slightly because it really does place a lot of emphasis on "cheating" when that's not my primary understanding of why this feature is important.

First, please note that the collection is not intended to be secret. The current specification includes a UI-blocking dialog on the first load of a form to ensure someone who has installed Collect themselves will always be aware of the position logging. Additionally, every subsequent open of the form would result in a snack bar appearing at the bottom of the form to remind the user that their position will be logged by the current form. GPS access is NOT intended to be enforced. That is, an enumerator would always have the ability to turn off access to the GPS if for some reason their location should not be tracked at that time.

The current proposal includes the form developer specifying a window of time in which points are accepted. If no point is available in that time window, no point will be logged.
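As a rough illustration of that time-window rule, the sketch below picks the most recent point recorded within the window before an event and returns nothing if no point qualifies. The function name, the tuple layout and the choice to look only backwards from the event are assumptions made for this example; the actual specification may define the window differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence, Tuple

# Hypothetical point layout for this sketch: (fix time, latitude, longitude)
Point = Tuple[datetime, float, float]


def point_for_event(
    event_time: datetime, points: Sequence[Point], window: timedelta
) -> Optional[Point]:
    """Return the newest point taken within `window` before the event, or None."""
    in_window = [
        p for p in points
        if timedelta(0) <= event_time - p[0] <= window
    ]
    # With no candidate in the window, nothing is logged for this event.
    return max(in_window, key=lambda p: p[0], default=None)


event = datetime(2019, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
points = [
    (event - timedelta(seconds=90), 1.0, 2.0),  # too old for a 30 s window
    (event - timedelta(seconds=10), 1.1, 2.1),  # inside the window
]
print(point_for_event(event, points, timedelta(seconds=30)))
print(point_for_event(event, points[:1], timedelta(seconds=30)))
```

Under these assumptions an old point is never attached to a later event, so a lost signal shows up as a missing position rather than a misleading one.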

I have not heard anyone suggesting making personnel decisions purely on the basis of audit log information. Generally the location data is used as part of the data analysis in some way. If it is considered for monitoring purposes, I am sure that it would only be one piece of information considered. That said, as others have noted, if organizations want to fire their enumerators for something trivial like using the letter "q" too many times in their answers, there's unfortunately not much that can be done about that.

I think this will go a long way towards satisfying informed consent and opt-out requirements. However we should also probably anticipate additional requirements - around checks-and-balances and privacy guarantees - coming in once deployers begin using this new feature in the field and actually have to look at whatever national/regional/local privacy laws they must operate under (eg GDPR).

It is worth noting that regulations around this do vary greatly, not just between countries but even between states, and there are quite different rules around GPS tracking of company assets (eg vehicles), company-owned mobile devices (eg phones), employee-owned phones, and single-purpose GPS trackers. Again, we should probably anticipate that what is proposed currently may well not suffice as a one-size-fits-all, and many deployers may want additional knobs in order to comply with local privacy regulations...

Please accept my sincere apologies @Jefferson_Francisco @LN and community members. Clearly I helped to push this discussion thread towards an outright argument, which was not my intention at all. Also apologies for glib terminology such as 'secret' which should probably have read 'background'.

@LN I absolutely get the point that having stuff happen in the background can be preferable for many reasons, including some that actively benefit the enumerator. I suppose that at the heart of this is the problem that whilst geopoint information is usually collected as data, it can also sometimes be metadata. I agree with you and @Xiphware that the use of the snack bar would basically solve the argument by making the process transparent. It's a half-way house between a foreground and background process.

I also think @Xiphware is right that data law complexity and diversity is massive and currently in a state of rapid change. This sort of thing may need to be continuously re-scoped. Flexibility in the functions of the software is clearly needed here as everywhere in the ODK ecosystem, but I would be concerned if that flexibility extended to options to do things like turn off the snack bar message.

Best wishes to all

@chrissyhroberts, you clearly have experience, knowledge, and opinions on the matter. Thank you for sharing and being willing to discuss. Important cues like tone of voice are lost when discussions are carried out via written word or missing when discussions are with people you haven't built a relationship with. Was just preemptively reminding everyone of that earlier. We're all in this together.

This feature has shipped in ODK Collect v1.20 Beta. Please try it and offer your feedback in the beta topic.
