Personal data, privacy or paranoia

What is the problem? Please be detailed.
Permissions for ODK-Collect and personal data:

I'm putting together a project that will need a few people to install ODK Collect on their mobiles. I want to reassure them that this isn't F@cebook and I'm not C@mbridge An@lytica...

Looking at app permissions (list taken from Google Play today) we're pretty much opening our trusting souls:

Identity
find accounts on the device
add or remove accounts

Contacts
find accounts on the device

Location
approximate location (network-based)
precise location (GPS and network-based)

Phone
read phone status and identity

Photos/Media/Files
read the contents of your USB storage
modify or delete the contents of your USB storage

Storage
read the contents of your USB storage
modify or delete the contents of your USB storage

Camera
take pictures and videos

Microphone
record audio

Wi-Fi connection information
view Wi-Fi connections

Device ID & call information
read phone status and identity

Other
receive data from Internet
access SurfaceFlinger
view network connections
full network access
use accounts on the device
prevent device from sleeping
read Google service configuration

So, I'm wondering whether the app gathers any information and sends it somewhere other than Aggregate (and perhaps subsequently from Aggregate to another server). This would include personal data from the mobile. I confess I didn't know what SurfaceFlinger was, let alone that I was giving permission to access it!

It's possible that some of this could come under the new General Data Protection Regulation (GDPR) due to take effect in Europe in May 2018. I'm also guessing that if I put a metadata question in my form asking to identify the phone number, then I've got personal data and I need to handle it appropriately, especially if it's linked to location and time. That means Aggregate needs to be secure enough too... I'm now going off the idea of the 'auto GPS' feature being developed by @Raghu_Mittal, which could potentially record a location without the user's knowledge (there's probably a positive way round that though!). No disrespect to the objective of that.

Is anything being 'scraped' from a user's phone by using ODK Collect? Am I inadvertently collecting or sending other personal data?

Sorry to sound paranoid. It's one of the issues with the 'business models' of the internet: I recognise there is a real cost to "free software", but who is paying, and with what? I want to be honest and up front with the people who might be using my form(s), and I don't have the skills to look far enough 'under the hood' to satisfy myself.

Can't remember the original source of the quote, but it goes something like "if the service is free, you are the product" - if I'm inadvertently 'selling' someone's data by getting them to use ODK it would be good to know.

I think lots of people give their time and energy to ODK for free, and I certainly appreciate the help I've received, so please don't take it as an insult for me to ask the questions.

Thanks.

These are valid concerns, and it's good to approach any situation with critical thinking. Here are a few quick notes; I'm sure other people have plenty of comments to add.

  • ODK as a free and open-source project is free as in "liberty" and not as in "free lunch".
  • Unlike with some software and services, you are free to inspect the full source code of ODK and examine exactly what the software is doing.
  • Your data will be gathered on a server somewhere (e.g. ODK Aggregate). It could be on a laptop or local network. If you use Google's App Engine you are agreeing to their service agreement. Installing on Amazon Web Services or Microsoft Azure also has implications in terms of where the physical servers are located, etc. You should be aware of any third-party assistance you use for hosting or other services, and the security of the server you choose.
  • ODK as a tool won't violate GDPR or other privacy regulations/best practices, but it's certainly possible to do so with your form design. The data and metadata collected during a survey will all be things you set during form design. You're right that if you collect PII or SPI, you'll need to handle it appropriately!
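
To make the form-design point concrete, here is a minimal sketch of how one might check a form for metadata that identifies a device or person. It assumes the form is a standard ODK XForm in which such metadata is declared on `bind` elements via `jr:preload="property"` and `jr:preloadParams`. The function name, the sample form, and the list of parameters treated as identifying are illustrative assumptions, not part of ODK itself.

```python
import xml.etree.ElementTree as ET

XFORMS_NS = "http://www.w3.org/2002/xforms"
JR_NS = "http://openrosa.org/javarosa"

# Assumed list of preload parameters that can identify a device or person.
IDENTIFYING_PARAMS = {
    "deviceid", "phonenumber", "subscriberid", "simserial", "username", "email",
}


def find_identifying_binds(xform_xml: str) -> list[tuple[str, str]]:
    """Return (nodeset, parameter) pairs for binds that preload identifying properties."""
    root = ET.fromstring(xform_xml)
    found = []
    for bind in root.iter(f"{{{XFORMS_NS}}}bind"):
        # Only "property" preloads read values from the device itself.
        if bind.get(f"{{{JR_NS}}}preload") != "property":
            continue
        param = bind.get(f"{{{JR_NS}}}preloadParams", "")
        if param in IDENTIFYING_PARAMS:
            found.append((bind.get("nodeset", ""), param))
    return found


# Hypothetical form, trimmed to the parts that matter for this check.
SAMPLE_FORM = """<h:html xmlns="http://www.w3.org/2002/xforms"
        xmlns:h="http://www.w3.org/1999/xhtml"
        xmlns:jr="http://openrosa.org/javarosa">
  <h:head>
    <model>
      <instance><data id="demo"><meta><deviceid/></meta><phone/><name/></data></instance>
      <bind nodeset="/data/meta/deviceid" type="string"
            jr:preload="property" jr:preloadParams="deviceid"/>
      <bind nodeset="/data/phone" type="string"
            jr:preload="property" jr:preloadParams="phonenumber"/>
      <bind nodeset="/data/name" type="string"/>
    </model>
  </h:head>
  <h:body/>
</h:html>"""

if __name__ == "__main__":
    for nodeset, param in find_identifying_binds(SAMPLE_FORM):
        print(f"{nodeset} collects {param}")
```

A check like this only covers automatically captured metadata; ordinary questions that ask for a name or phone number, and any location questions, still need reviewing by hand.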

To build on @danbjoseph's great answer...

Free is an imprecise term. ODK is libre (use the source code with no restrictions) and for a lot of people it's also gratis (use the apps at no cost).

Regardless of which kind of free you mean, we (ODK's leadership, contributors, etc.) don't have any access to your data. That is, when you use Collect, you are sending data to servers that we don't control (e.g., Aggregate, Google Sheets). And when you install Aggregate on App Engine or Tomcat, you are installing it on hardware that we don't even know exists.

If for some reason you wanted to grant us access to your data (e.g., you want us to log into a server to fix something), you'll need to explicitly grant that access and you'll know you are doing it.

We do gather some usage analytics and crash logs to prioritize features and fixes. That data is generally anonymized and opt-out. We try to collect and aggregate (see what I did there?) just what we need and only core contributors have access to that data.

We have a detailed write up on how we handle security and privacy at https://docs.opendatakit.org/security-privacy. We also have a good write up on how to encrypt your form data at https://docs.opendatakit.org/encrypted-forms to further secure your data.

Note that our security and privacy policy does not apply to hosted services like Ona or Kobo because they don't fall under ODK's governance. You should read their security and privacy policies, but in general, if you don't control the hardware or encrypt the data, someone could read that data.

As for Collect, we started building it before Android was released (we were interns at Google), and from that time until Android 6, the OS required every permission an app might use to be requested at install time. And since our forms are pretty powerful, the list of things we might want to do is long.

We are working on changes to Collect so we only ask for permissions when we need them. That's what Android prefers, but it's a complicated change because we need to find a way to do it that doesn't make life harder for enumerators. You don't want to pop up the permission for the camera the first time the enumerator branches to a camera question they might not have seen in a training.

Also, Collect is generally deployed by an organization to a group of workers, and those workers shouldn't always be the ones who decide what gets collected. You don't really want the enumerator to be able to stop the phone from collecting the device location when the organization needs to know that the form was filled in at a particular location.

Long story short, Collect's permissions are complicated, but I'm hoping we can ship those improvements in September. And SurfaceFlinger? That's how we do the swipe to move to the next question.

Thanks @danbjoseph and @yanokwa for your thoughtful responses regarding use of data. It's helpful to get the human response as well as the 'corporate' one of the Privacy Statement :)

I recognise the difficulties of permissions. For me it is more a case of reassuring folk that their data is not going anywhere else, and of taking appropriate responsibility for their data in form design.

Another topic just arising (ODK Privacy and Security) highlights an issue with privacy between different projects, which is especially difficult if working with more than one client... I will be interested to see if there is a way of introducing 'walls' within one instance of Aggregate.

Thanks for taking time to respond.
