Collect will need to stop using IMEI as deviceID and making simSerial and subscriberID available

We can add the new deviceId to the User and device identity section of the general prefs to make it easier to track down. My expectation is that you'll be able to use it in form logic.

The caveat is that the new deviceId is not stable. If a user uninstalls and reinstalls Collect, they'll get a new one.

The way the Play Store is heading suggests that if you want some stable ID for enumerators, enumerator-specific server side login is the best way forward.

What you have described would work for us!

Thanks, @jpringle! We will generate and display this identifier starting in the next Collect release so that folks can start preparing for the change.

Since we have agreed to generate our own IDs, we need to decide what those will look like. I see two options:

  • a random UUID encoded as base64. That would make it a 22-character string that would look like CSG+GQxwSGGxxQAAdyLbtA or ieAP5DzHSXihTgAAC7f17g. Possible concerns are that they could very well contain bad words in any language and that they are long to type in.
  • a 16-character random alphanumeric string to match Enketo's behavior. Collisions will be more likely but still improbable. Similarly, they could contain bad words.
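
As a rough illustration of the two options above, here is a minimal Python sketch. The function names are hypothetical, and the 62-character alphabet (upper case, lower case, digits) is an assumption; this is not the actual code of Collect or Enketo.

```python
import base64
import secrets
import string
import uuid

# Assumed alphabet for the alphanumeric option: a-z, A-Z, 0-9.
ALPHANUMERIC = string.ascii_letters + string.digits


def uuid_base64_id() -> str:
    """Option 1: a random UUID (16 bytes) encoded as base64, padding removed."""
    raw = uuid.uuid4().bytes
    # 16 bytes encode to 24 base64 characters ending in "==";
    # stripping the padding leaves 22 characters.
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def random_alphanumeric_id(length: int = 16) -> str:
    """Option 2: a random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


print(uuid_base64_id())
print(random_alphanumeric_id())
```

Standard base64 can emit + and /, which is why one of the example values above contains a + character.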

For reference, IMEIs are 15 digits long (and the way they are assigned guarantees uniqueness).
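
To put "more likely but still improbable" into numbers, here is a small sketch that uses the birthday approximation. The figure of one million installs and the 62-character alphabet are assumptions chosen only for illustration.

```python
import math


def collision_probability(num_ids: int, space_size: int) -> float:
    """Birthday-bound estimate of at least one collision among num_ids random IDs."""
    pairs = num_ids * (num_ids - 1) / 2
    # 1 - exp(-pairs / space_size), computed accurately for tiny values.
    return -math.expm1(-pairs / space_size)


ALPHANUMERIC_SPACE = 62 ** 16  # assumed alphabet: a-z, A-Z, 0-9
UUID4_SPACE = 2 ** 122         # a version-4 UUID carries 122 random bits

installs = 1_000_000           # hypothetical number of installs
p_alnum = collision_probability(installs, ALPHANUMERIC_SPACE)
p_uuid = collision_probability(installs, UUID4_SPACE)

print(f"16-char alphanumeric: {p_alnum:.2e}")
print(f"random UUID:          {p_uuid:.2e}")
```

Under these assumptions both estimates are vanishingly small (on the order of 1e-17 for the 16-character string and 1e-25 for the UUID), so the shorter string gives up some margin without making collisions a practical concern.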

@martijnr, what is the purpose of the domain prefix? Is it for preventing collisions? Is it common for submissions to the same form to come in from different Enketo installs?

@jpringle does 15 vs 22 characters make a big difference to you? Would being able to copy the value to the clipboard of the device be helpful?

It's to make a distinction between apps on the same device (Collect, Enketo). I think this may have been written in one of the older spec documents that was used as a basis for ours.

Interesting! It seems that to create a roster of device ids someone would need to look them up separately for each app anyway so there wouldn't be a chance of confusion or conflict. That said, should Collect continue the pattern in some way? I suggested a uuid: prefix but I suppose we could do a collect: prefix to make it clear which app is generating the id.

The length doesn't matter. We would prefer that the identifier be unique, and we would even be fine using base 16 encoding. The ability to auto-copy the number to the clipboard would be useful. But now that I think of it, we would probably have our enumerators submit a form with this ID as a field to help us assemble the dataset I described earlier.

Hi @LN. Sorry for such a late response. Our systems do capture the device id during form submission so we can track down any funny business by being able to identify the source of the form submissions. However, we likely will not need this moving forward, since we are employing device-level authentication (which matched the typical usage pattern used up until now). We can get better information without these, so this likely will be phased out of our systems.

That sounds great, thanks for sharing your plans!

After talking about it a bit more, @LN and I have decided to go with a 16-character random alphanumeric string prefixed with collect:. The prefix will be helpful if we ever need to change the way the ids are generated (as we could change the prefix to collectx.x.x or even just collect1:).
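
A minimal Python sketch of that format, under stated assumptions: the ID is generated once and stored, so it only changes when the stored value is lost (for example on reinstall). The dict stands in for the app's preferences, and the function names, the "install_id" key and the alphabet are all hypothetical rather than Collect's actual implementation.

```python
import secrets
import string

PREFIX = "collect"
ALPHABET = string.ascii_letters + string.digits  # assumed alphabet


def get_or_create_install_id(prefs: dict) -> str:
    """Return the stored ID, generating and saving one on first use."""
    if "install_id" not in prefs:
        random_part = "".join(secrets.choice(ALPHABET) for _ in range(16))
        prefs["install_id"] = f"{PREFIX}:{random_part}"
    return prefs["install_id"]


def split_install_id(install_id: str) -> tuple[str, str]:
    """Split an ID into (prefix, value) so a server can tell generation schemes apart."""
    prefix, _, value = install_id.partition(":")
    return prefix, value


prefs = {}  # stand-in for persistent app preferences
first = get_or_create_install_id(prefs)
second = get_or_create_install_id(prefs)
print(first, first == second)
print(split_install_id(first))
```

A server that keeps the prefix separate from the value could later accept a different prefix such as collect1 without changing how it stores the random part.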

Hi all - thanks for sharing this.
For background - I've been working with my own custom fork of Collect v1 for about 7 years now, updating it from the main trunk and doing a complete rebuild once every 2-3 years. The APKs are distributed privately to my users to interact with a number of specific web services, which are in-house bespoke projects - I don't distribute via the Playstore. My web service backend requires a device ID, in order to give a customised form list for each user, so I fetch and append the deviceID to the form list url.
Having a device ID at submission time is less of an issue, as I include a UID in every form download.
So I'd appreciate some clarification about your proposal, as it will be relevant to me - but I'm also confused by the current behaviour of Collect. In the initial post on this topic, LN suggests that the device ID is currently "sent with every form submission and form list request". I've just tested v1.25.1 as downloaded from the Playstore, and I can't see any sign of a deviceID being appended to the querystring when requesting a form list. Am I missing something? Does this only happen when the server type is set to Aggregate or Google Sheets? (I use protocol_other.)
Grateful for any enlightenment!
Thanks
Nik

Well, that's very interesting! I based my message on what the OpenRosa spec states but I didn't actually verify what functionality Collect has. It looks like for some reason it has never sent deviceId with the formList request.

The good news is that this makes the change even less risky since it couldn't have been used for logic to filter form lists.

The bad news is that we have to make an explicit decision about whether to maintain this deviation from the intended specification or add the deviceId query parameter. I think we should not add support for it now. It was always easy to falsify and won't be guaranteed stable anymore moving forward. Instead, we should encourage use of different accounts to filter form lists, as Aggregate, Central and many different servers do.

You'll need to either target API < 29 and use devices < Android 11 to maintain usage of the IMEI, switch to the new install-based identifier we've described in this thread, or use accounts to do the filtering.

Except in my fork! Ah well ... :slight_smile:

I'd be happy to switch to using user accounts, but I'm not clear how (or indeed if!) the credentials are being included in the formList request, when the Server type is set to "other"..? (I've done some quick tests for standard HTTP authentication, but that doesn't seem to be happening.)
If you're going with a 16-char string to ID the device/installation - is the idea that this will be generated automatically on installation, never change unless the app is reinstalled, and be visible in the User and device identity > Form Metadata screen?
Thanks
Nik

If your server is configured for basic or digest auth, the server will issue a challenge and Collect will respond with credentials.
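
To make the challenge and response concrete, here is a small sketch of the basic auth case only (digest auth involves a nonce and hashing and is not shown). It builds the Authorization header a client would send after receiving a 401 with a Basic challenge. The function names and the example credentials are illustrative and not taken from Collect.

```python
import base64
from typing import Optional


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value for HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def respond_to_challenge(status: int, www_authenticate: str,
                         username: str, password: str) -> Optional[str]:
    """Return an Authorization value if the server issued a Basic challenge."""
    if status == 401 and www_authenticate.lower().startswith("basic"):
        return basic_auth_header(username, password)
    return None  # no challenge, or a scheme this sketch does not handle


print(respond_to_challenge(401, 'Basic realm="forms"', "Aladdin", "open sesame"))
```

Because the credentials are only sent in response to a challenge, an unprotected form list URL will never see them, which is consistent with the test results described above.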

Yes, this is exactly right!

Ah - of course I'd need to password-protect the target folder, which would make a gradual transition and back-compatibility with older versions a bit difficult! :slight_smile:
But it sounds like the installation ID will be viable, as long as I can maintain back-compatibility with my own parameter. Currently I use:

https://my.server.com/formlist/?deviceID=123456789012345

...where 123456789012345 is the IMEI. Am I right that the new form list request would automatically be set to:

https://my.server.com/formlist/?deviceId=Collect:ABCDEFGHIJKLMNOP

...where ABCDEFGHIJKLMNOP is the planned installation ID? (NB we need to be clear whether it's deviceID or deviceId! )

I think this is a positive move for another reason - not all Android devices have an IMEI, so I already use the MAC address for wifi-only devices. Furthermore, devices with multiple interfaces sometimes return different device IDs depending on the connection method - one phone I have has 2 SIM slots, but the device ID obtained by Collect isn't the IMEI for either slot - instead it's one of three different MAC addresses depending on whether it's connected by wifi or one or other of the 4G SIM cards. So a single installation ID will be a Good Thing in the long run.

One other question though - while I favour identifying Collect as the source of the deviceID in the querystring, is it wise to use a colon (":") as the delimiter? The colon normally defines a port number in a URI, and while that shouldn't matter if it's in the querystring rather than the URI, some webservers might not interpret it correctly as part of a querystring parameter...

Thanks
Nik

We now have a final timeline and know the Collect versions in which changes will be made. ODK Collect v1.28 will be going out shortly (see the beta). It will show in settings that deviceID will change and that simSerial and subscriber ID will be removed in v1.29.

ODK Collect v1.29 will be out in Nov/Dec 2020 and will remove simSerial and subscriber ID. It will also switch deviceID to what we have been displaying as installID.

Sorry I missed your follow-up message, @Blitheringeejit.

Submissions will continue to be sent with the deviceID (ID in uppercase, I wrote it wrong above) query parameter. The value of that query parameter will become what is currently shown in the settings as install ID. It will look like collect:<16-char random string>.

I think you're exactly right. Additionally, users can currently revoke or never give Collect permissions to read their unique identifiers which can lead to blank values. Now the deviceID value will never be blank.

Thanks for bringing that up. I believe colons are safe to use in query parameters (see here, here), but I do see a number of questions about it. I'll do a little more reading around whether they should be URI-encoded. I've seen colons used as namespaced query parameters like we're doing and also as part of timestamps.
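
For anyone who wants to check how their own server stack treats the colon, here is a short Python sketch using the standard library. The submission URL and the example ID are hypothetical. It shows that a percent-encoded colon (%3A) and a literal colon both decode to the same parameter value with a standards-following parser.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_device_id(base_url: str, device_id: str) -> str:
    """Append a deviceID query parameter, percent-encoding reserved characters."""
    return f"{base_url}?{urlencode({'deviceID': device_id})}"


def read_device_id(url: str) -> str:
    """Extract and decode the deviceID query parameter from a URL."""
    return parse_qs(urlsplit(url).query)["deviceID"][0]


device_id = "collect:ABCDEFGHIJKLMNOP"            # hypothetical install ID
encoded = with_device_id("https://my.server.com/submission", device_id)
literal = "https://my.server.com/submission?deviceID=" + device_id

print(encoded)
print(read_device_id(encoded) == read_device_id(literal))
```

Python's urlencode escapes the colon by default, so a server that decodes query parameters normally receives the same value either way.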

Currently, our reporting system which uses IMEI as equipment identifier is not working with the latest version of ODK. Is it due to the issues mentioned above?

That is correct: applications, including Collect, can no longer access the IMEI. If you now have a dataset that has a mix of IMEIs and new deviceIDs, you could ask data collectors to fill in and submit a form that asks only for their IMEI, which they can find in settings or on the back of their physical device. Also include the deviceid metadata question; that will let you build a connection between old and new deviceids.
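
As a rough sketch of that linking step, assuming the submissions of such a form have been exported as rows with an "imei" field (typed in by the data collector) and a "deviceid" field (the metadata question). Both field names and the sample values are made up for illustration.

```python
def build_id_map(link_submissions: list[dict]) -> dict[str, str]:
    """Map each old IMEI to the new deviceid reported by the same device."""
    mapping = {}
    for row in link_submissions:
        imei = row.get("imei", "").strip()
        device_id = row.get("deviceid", "").strip()
        if imei and device_id:
            mapping[imei] = device_id
    return mapping


def normalize_device(value: str, id_map: dict[str, str]) -> str:
    """Translate an old IMEI to its new deviceid; leave other values unchanged."""
    return id_map.get(value, value)


# Made-up rows from the linking form.
links = [
    {"imei": "123456789012345", "deviceid": "collect:ABCDEFGHIJKLMNOP"},
    {"imei": " ", "deviceid": "collect:QRSTUVWXYZ012345"},  # IMEI left blank
]
id_map = build_id_map(links)

print(normalize_device("123456789012345", id_map))
print(normalize_device("collect:QRSTUVWXYZ012345", id_map))
```

Rows with a blank IMEI are skipped, so records that already carry a new deviceid pass through unchanged.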

Is this only for the latest version of ODK, or will old versions also stop working with IMEI? Is it possible to identify reports based on the information captured in the ODK form management settings, including phone number?

Older versions will still work like before.

You can still use deviceid in your form. It will just be built in a different way, without using the IMEI. Is that a problem?

That is true as long as the devices are running Android versions 9 or below. With Android 10 or higher, I believe the app will crash. Please note that versions that access unique device identifiers can't be distributed through the Play Store and must be installed via APK.