Attempt to name each survey with a UID

Hello All,

I have been spending the past little while trying to create an 'instance name' for each survey that would also serve as a UID.

After reading through the old posts in the thread about creating UIDs, I cobbled together the following command in the 'instance name' column of the 'settings' tab in my Excel file:

concat($string(today(),' - ',${sur},' - ',${loc_001},' - ',${loc_002},' - ',${loc_003},' - ',${loc_004},' - ',${uuid()})

The goal is for the name of each individual survey to be made up of the date the survey was conducted ($string(today())), the surveyor name (sur), four variables from the questionnaire indicating location (loc_001 to loc_004), and finally a random number (uuid).

Unfortunately, this command does not seem to work. I have tweaked what I could think of, but I would be very grateful if a more seasoned programmer could let me know where I might be going wrong, or suggest an easier way to do this.

Cheers,

Ebony

Ebony,

Start simple. Pick one of those variables and see if it works (e.g.,
concat("My Form- ",${sur}) ).

As an aside, I wouldn't use anything user-entered as a true UID. Use
the $instanceID that is already in the form data. If you need
something that humans can read, that's fine, but for analysis, it may
not be unique enough.

Yaw

--
Need ODK consultants? https://nafundi.com provides form design, server setup, in-field training, and software development for ODK.

--

Post: opendatakit@googlegroups.com
Unsubscribe: opendatakit+unsubscribe@googlegroups.com
Options: http://groups.google.com/group/opendatakit?hl=en


You received this message because you are subscribed to the Google Groups "ODK Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opendatakit+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Also, be aware that the random value can change each time you open, review
or edit the form. You should wrap uuid() or random() with once() to ensure
that you only generate that value once. Search the list for once( for more
info.
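
The behaviour you want from once() can be pictured as "keep the stored value if there is one, otherwise generate it". Here is a minimal Python sketch of that idea. It is not ODK code: the `once` helper and the dict standing in for a saved form instance are assumptions made for the illustration.

```python
import uuid

def once(instance, field, generate):
    """Return the stored value for `field`, generating it only if it is empty.

    `instance` is a plain dict standing in for a saved form instance
    (an assumption for this sketch, not an ODK data structure).
    """
    if not instance.get(field):
        instance[field] = generate()
    return instance[field]

# Simulate opening the same form instance three times.
saved = {}
first = once(saved, "uid", lambda: str(uuid.uuid4()))
second = once(saved, "uid", lambda: str(uuid.uuid4()))
third = once(saved, "uid", lambda: str(uuid.uuid4()))
```

Without the guard, each of the three calls would produce a different value, which is the problem described above.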

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com

Thanks Yaw and Mitch.

A couple of things, just so we are on the same page. For each survey's instance name (file name), it's important that it includes information that is not unique, so that I can link household surveys together which may have been conducted at different times or on different days. That's why I need to pull the three location variables into the instance name of each survey. Further, knowing the surveyor for each survey will make it easier to work with a database of survey files and pull up the surveys conducted by one person. Lastly, the point of the UID part of the instance name is to include one variable which is unique, in case all other markers of the instance name are the same across two surveys. The ID produced by this string of code does not necessarily need to be the file's instance name (although that is desirable); it can also be a variable that is created within the survey itself and shows as a field.

I have read examples where this type of concatenated instance name is possible. When you tell me to start simple, is it because adding too many variables will not work? From looking at my code, which I have checked against as many online resources as I could find, does it look like there is anything wrong with it (brackets in the wrong places, misuse of a command, etc.)?

Thank you for letting me know the UID will change each time the survey is opened. Is there a benefit to using uuid() over random()?

I apologize if I am misunderstanding anything. I am teaching myself this from scratch and have read as many resources as possible, but I clearly still have gaps in understanding the way examples and materials are written.

Cheers,

Ebony

Hi Ebony,

No apologies needed. Seems like you've figured out a lot of things on
your own and that's great!

My advice still remains the same. Start simple. Pick one of those
variables and see if it works (e.g., concat("My Form- ",${sur})). Then
add a few of the others (e.g., ${loc_001}), then date, then uuid. That
process will help you find the source of the problem, because I'm not
sure why it's failing. If I had to guess it's either the string
command, the uuid or today.
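
For what it's worth, two spots in the original expression look suspect. `$string(today()` never gets its closing parenthesis, and `$string` is neither a field reference nor a function. `${uuid()}` wraps a function call in the `${...}` syntax that XLSForm uses for field names. An untested guess at a cleaner version would be concat(string(today()), ' - ', ${sur}, ' - ', ${loc_001}, ' - ', ${loc_002}, ' - ', ${loc_003}, ' - ', ${loc_004}, ' - ', uuid()). The step-by-step build-up itself can be sketched in Python. This is not ODK code, and the answer values and the `instance_name` helper are made up for the illustration.

```python
from datetime import date

SEPARATOR = " - "

def instance_name(parts):
    """Join the non-empty parts with the ' - ' separator used in the form."""
    return SEPARATOR.join(str(p) for p in parts if p not in (None, ""))

# Hypothetical answers standing in for ${sur} and ${loc_001}..${loc_004}.
answers = {"sur": "Ebony", "loc_001": "North", "loc_002": "D4",
           "loc_003": "Z2", "loc_004": "V7"}

# Step 1: a constant prefix plus one variable, mirroring the one-variable test.
step1 = instance_name(["My Form", answers["sur"]])

# Step 2: add the location variables.
step2 = instance_name([answers["sur"]] + [answers[f"loc_00{i}"] for i in range(1, 5)])

# Step 3: add the date (fixed here so the result is reproducible).
step3 = instance_name([date(2015, 10, 24).isoformat(), step2])
```

If a step stops producing the expected name, the piece added in that step is the one to look at.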

random() returns a number between 0 and 1. uuid() probably returns a
long random string, something like
de305d54-75b4-431b-adb2-eb6b9e546014. Test to confirm.
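
To get a feel for the difference, here is a small sketch using Python's own `random` and `uuid` modules as stand-ins. These are not the ODK implementations, just the closest everyday analogues.

```python
import random
import uuid

# random.random() returns a float in the half-open range [0.0, 1.0).
r = random.random()

# uuid.uuid4() returns a random 128-bit identifier; as text it is
# 36 characters long, with hyphens splitting it into five groups.
u = str(uuid.uuid4())
```

A value like `r` collides far more easily than `u` once it is rounded or truncated, which is one reason to prefer the uuid for uniqueness.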

Thanks,

Yaw

Hello,

I am trying to use 4 tablets to conduct surveys of 500 students. I am looking for a way to generate a unique study ID in an ODK XLSForm that could then be linked back to a consent form with participants' phone numbers, so that we may contact them for a follow-up portion of our study. We will keep the phone numbers on consent forms separate from the information collected on our survey. Is there a way to generate a unique study ID number that is not duplicated across the 4 tablets we will use for this project? Would the approach written above work? I am very new to this and appreciate the help. Sorry if this is obvious or covered elsewhere on the forum. Thank you for your help.

Best Regards,
Sarah

Hi @Sarah_Pfeil
I think you can use preloaded values (e.g., deviceid, as it's unique). Look here: https://opendatakit.org/help/form-design/external-apps/
Here is an example form:
preload.xml (899 Bytes)

The only way to ensure IDs are not duplicated across tablets in an offline manner is to generate a really long ID (e.g., use meta/instance_id which is 30 characters). But that's going to be painful to re-enter.

Grzegorz is on the right track that you want to include device ID, but it might be simpler to assign a letter to each tablet (e.g., A-D) and then add date and time.

How much of the date/time you include depends on when you are collecting the data. For example, if you survey all the students on the same day, then you really only need hours, minutes, and seconds. But if you survey one student a month, then you really only need the month to uniquely identify students.
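
A rough Python sketch of the tablet-letter-plus-timestamp idea, just to show the shape of the resulting IDs. The `study_id` function and its output format are assumptions for the illustration, not something ODK produces by itself.

```python
from datetime import datetime

def study_id(tablet_letter, when, include_date=True):
    """Build an ID such as 'A-20151024-093015' from a tablet letter and a timestamp."""
    fmt = "%Y%m%d-%H%M%S" if include_date else "%H%M%S"
    return f"{tablet_letter}-{when.strftime(fmt)}"

# Two tablets saving a form at exactly the same second still get different IDs.
same_moment = datetime(2015, 10, 24, 9, 30, 15)
id_a = study_id("A", same_moment)
id_b = study_id("B", same_moment)

# If everything is collected on one day, the time alone may be enough.
short = study_id("A", same_moment, include_date=False)
```

The letter keeps tablets apart and the timestamp keeps submissions on the same tablet apart, as long as no tablet saves two forms within the same second.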

If I generate Unique Study ID numbers in ODK is there a protocol or process other people have come up with to link this to informed consent forms that does not require dedicated full-time research staff?

I currently have paper consent forms with a participant's phone number on them and link the consent form number to a predetermined Unique Study ID via a "study id crosswalk" that is password protected on a study computer. The de-identified study ID is texted to a research assistant and then is manually entered into the research tablet by research assistants. This creates problems because it means we need dedicated staff who are always available and at a computer during the study hours to manage texting study ID numbers, as well as the problem of study ID numbers being kept in SMS data on research assistant phones.

The long term goal is to be able to keep participant survey answers separate from their names and phone numbers, which are on paper consent forms, and to be able to later contact survey participants, who answer survey questions a certain way, by calling them to come back for a qualitative interview.

I am struggling with the logistics of doing this and how to best use ODK to do it while minimizing dependence on one person to send de-identified study id numbers via sms. Do you have any ideas on best practices for maintaining de-identified study ID numbers in a longitudinal study that uses ODK? Any suggestions would be greatly appreciated as I have found limited information on this subject. Please forgive me if I have missed something in the forum. Thank you for your time and help.

Cheers,
Sarah

We do longitudinal household surveys where we create a unique identifier for each household, typically prior to fielding (so like your predetermined Unique Study ID). How we're managing this is not going to be very helpful for you if you're using ODK Collect. We're using ODK Survey, and we just load the unique household IDs onto the tablet and into Tables for a particular enumerator. I think this is like the earlier suggestion of preloading (but much easier to do in ODK Survey, since it's just an Excel file you push onto the tablet).

We did, however, run through a few other ideas first before settling on this that might work for you:

  1. Having the question for the unique ID be a select_one, with a choices list of the IDs
  2. You could also do this with a CSV as the choices list, if that works in ODK Collect
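
If you go the choices-list route, the rows can be generated rather than typed. Below is a minimal Python sketch that produces rows with the usual XLSForm choices columns (list_name, name, label). The list name `study_ids` and the sample IDs are made up, and whether Collect can read the list from an external CSV in your setup is something to test rather than assume.

```python
import csv
import io

def choices_rows(study_ids, list_name="study_ids"):
    """Return XLSForm-style choices rows (list_name, name, label) for each ID."""
    return [{"list_name": list_name, "name": sid, "label": sid} for sid in study_ids]

def to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["list_name", "name", "label"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical predetermined study IDs.
rows = choices_rows(["S001", "S002", "S003"])
csv_text = to_csv(rows)
```

The resulting text can be pasted into the choices sheet or saved as a file.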

One question I have about your workflow is that you say you text ID numbers to research assistants--based on what? How does the research assistant get assigned/ask for a particular ID or individual? For your long term goal, how do you call them back?

So I have a crosswalk document that is a list linking consent form number to unique study ID number. The research assistants text a research coordinator with the consent form number and are sent back the unique study ID number. The problem with this is that there has to be a research coordinator in front of the crosswalk document and available to text research assistants back in real time. We have discussed giving research assistants a list with the study ID numbers that correspond to the consent forms they have for that day, but then we create the problem of having lots of little lists that would need to be shredded.

Rather than paper lists, what about encrypted digital ones? Assuming your RAs have smartphones, there might be a non-ODK tech solution.

I have not used it for exactly this purpose, but I have used the combination of Dropbox and Boxcryptor for securing data in general. You could do something like create a Boxcryptor-encrypted folder for each research assistant and put in a Word document for each consent-form-to-UID mapping that they need that day. Ideally, RAs would only log in right when they need the UID, delete the Word document immediately after entering the UID, and log out each time. But even if they don't do that, your research coordinator could delete all the previous day's UID mappings when putting in the next set the following morning. Digitally shredding the lists, as it were.

I may be 'under' thinking this, but I lean towards simple solutions. Linking instances to signed consent forms appears straightforward in practice, but data entry errors throw lots of wrenches at this process.

TL;DR: 1) Create Unique IDs using survey data 2) Couple with consistent remote data quality checks (remote and not in real time) 3) Require supervisor to update Unique IDs on paper consent.

I'd start with a unique ID that is relatively easy to run quality checks on*. Typically ours is a concat of unique enumerator ID, Daily HH#**, Month#, Day#, and geographic location tags (i.e. concat(District#, Zone#, Village#)). You can do a calculate in ODK which spits this out via a 'note' for the enumerator to then place on the paper consent form. This HHID can then be linked to the form name.
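
As a rough illustration of the concatenation, here is a Python sketch that zero-pads each component so every HHID has the same length. The field widths (two digits each, three for village) and the `household_id` name are assumptions for the example, not our actual scheme.

```python
def household_id(enumerator_id, daily_hh, month, day, district, zone, village):
    """Concatenate zero-padded components into a fixed-width household ID."""
    return (f"{enumerator_id:02d}{daily_hh:02d}{month:02d}{day:02d}"
            f"{district:02d}{zone:02d}{village:03d}")

# Enumerator 7, third household of the day, on 24 October,
# in district 4, zone 2, village 15 (all hypothetical values).
hhid = household_id(enumerator_id=7, daily_hh=3, month=10, day=24,
                    district=4, zone=2, village=15)
```

Fixed widths make it easy to split the ID back into its parts later and to spot entries with a missing or extra digit.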

*During the first week of fieldwork I typically run analysis on the fully uploaded data once a day, flagging duplicate HHIDs, identifying where the problem lies, and providing reports to team leaders for follow-up with enumerators. This can be cumbersome, but it's not a full-time job. After the .do or .sps is written, it typically takes less than an hour (or less if the teams are good). Once enumerators realize their errors (and that they are being tracked), the rate of data entry errors typically goes down quickly (by day 3 or 4). This does assume comprehensive up-front training. (Note that the last few days of fieldwork see a spike in these types of errors, in my experience.)

**Daily HH# is the 'Nth household THAT enumerator has interviewed THAT day'
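
The daily duplicate check mentioned in the first footnote is something I do in Stata or SPSS, but the logic is simple enough to sketch in Python. The record layout and the `duplicate_ids` helper below are assumptions for the illustration.

```python
from collections import Counter

def duplicate_ids(records, key="hhid"):
    """Return the IDs that occur more than once, with their counts."""
    counts = Counter(r[key] for r in records)
    return {value: n for value, n in counts.items() if n > 1}

# Hypothetical uploaded submissions; the first and third share an HHID.
submissions = [
    {"hhid": "070310240402015", "enumerator": 7},
    {"hhid": "070410240402015", "enumerator": 7},
    {"hhid": "070310240402015", "enumerator": 7},
]
dupes = duplicate_ids(submissions)
```

The flagged IDs are what goes into the report for team leaders.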

In principle the geographic location is not needed for a unique #, but sometimes I add them to quality check and trim them off during later stages of analysis/programming.

This isn't perfect (data entry errors that are caught can be fixed for the Unique ID, but the fix won't be reflected in the Form Name or on the consent form). However, I could envision a straightforward workflow where the responsibility to update the paper form names is pushed down to the supervisors, for them to update while in the field. This does rely on someone at a computer doing data quality work, but not in real time or full time.

Just my thoughts...
