I have just started a project and we have multiple active forms on ODK Central. Occasionally we need to update the versions of the forms. Is it still possible to upload finalised questionnaires that were completed on an older version of a form (one that has since been updated)?
We try to minimise this by making sure everyone regularly updates their phones whilst they have data, but want to know what happens if someone forgets and continues to collect data using older versions of the forms.
If you create a form with version 1 and publish it, then with version 2 and publish it, then with version 3 and publish it, Central will accept submissions built from versions 1, 2, 3.
When you do any kind of data export, Central and Briefcase will use the version 3 form definition. That means that if you had fieldA in version 1 and then removed it by version 3, fieldA would not appear in your export, even if some submissions have it. That data still exists but you'd have to do some manual processing to access it.
Central will accept submissions for any combination of form_id and version that it has published and that has not been deleted. There are a couple of special cases which likely don't affect most people but that I will illustrate below for completeness.
If you create a form with version 1 and publish it, then create a draft with version 2 without publishing it, and then create another draft with version 3 that you do publish, Central will accept submissions built from versions 1 and 3.
If you create a form with version 1 and publish it, then delete the whole form, Central will not accept submissions created for version 1, even if you re-create the form with version 2.
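The acceptance rule and the two special cases above can be sketched as a toy model. This is only an illustration of the rule as described in this thread, not how Central is implemented; the FormRegistry class and its method names are hypothetical.

```python
class FormRegistry:
    """Toy model of the rule: a submission is accepted only for a
    (form_id, version) pair that was published and not deleted.
    Hypothetical names; this is not Central's code."""

    def __init__(self):
        self._published = {}  # form_id -> set of published versions

    def publish(self, form_id, version):
        # Drafts that are never published simply never reach this call.
        self._published.setdefault(form_id, set()).add(version)

    def delete_form(self, form_id):
        # Deleting the whole form forgets every version published so far.
        self._published.pop(form_id, None)

    def accepts(self, form_id, version):
        return version in self._published.get(form_id, set())


registry = FormRegistry()
registry.publish("my_form", "1")
# Version 2 stays a draft, so there is no publish call for it.
registry.publish("my_form", "3")
accepted_before_delete = [registry.accepts("my_form", v) for v in ("1", "2", "3")]

registry.delete_form("my_form")
registry.publish("my_form", "2")  # form re-created with version 2
accepted_after_delete = [registry.accepts("my_form", v) for v in ("1", "2")]
```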
This capability to manage evolving forms is really great in Central!
@LN could you tell us more about such manual processing?
As Central exposes all data, I would expect it to expose all fields used since the first version of the form. Is that something planned for future Central releases?
Typically when I've removed fields it's been because they turned out not to be actionable and there's no value in keeping them around. One case I can think of where I have wanted to retain the data for a column is when I used an "Is this practice or real data?" question. I wanted to remove it once data collectors were no longer practicing and all data was real, but I still wanted that column in my data so I could filter out practice submissions made before I removed the question. To achieve that, I set the question's "relevant" column to false(). That way, the field is still part of the schema, meaning it is included in the export. Maybe what we can do as a first step is to document this strategy.
Thanks as always @ln...
I never had the idea of hiding a field this way!
And it is the solution to that problem.
Right now, in our main form, I realize that I first wanted to ask colleagues whether the data is sensitive and named the field "sensitive", but I ended up using a label that means the opposite: "Is the data sharable". So I will delete that field and create a new, logically named one.
So, just as a matter of curiosity, what do you mean?
I do curl calls over submissions and sub-tables.
Do you mean I get all fields used since the beginning of the form? Does the "limitation" only occur on CSV export?
Ah, yes, I think this kind of rewording is pretty common as well. You can use the relevance concept to hide one question and show the other. You can also change the original question to a calculate based on the value of the new question. That way, your sensitive field is what you'll use in analysis no matter the form version, and it will always have the same meaning. The only change will be in the user experience.
Exactly. If you deal with XML directly, then you're getting the raw submissions. You can use your knowledge of the desired end schema to read the fields you want and include logic across form versions (for example, pull fieldX from submissions for version 1 and fieldY from submissions for version 2).
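As a minimal sketch of that kind of cross-version logic, here is one way it might look in Python using the standard library XML parser. It assumes each raw submission is an XML document whose root element carries a version attribute and whose fields are direct children without namespaces. The sample XML, the my_form id, the FIELD_BY_VERSION mapping, and the extract_value helper are all made up for illustration, so adapt them to your own form.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: which element holds the value in each form version.
FIELD_BY_VERSION = {"1": "fieldX", "2": "fieldY"}


def extract_value(submission_xml):
    """Return (version, value) for one raw submission XML string."""
    root = ET.fromstring(submission_xml)
    version = root.get("version")  # version attribute on the root element
    field = FIELD_BY_VERSION.get(version)
    if field is None:
        # Unknown version: no rule for where the value lives.
        return version, None
    # findtext looks at direct children of the root only.
    return version, root.findtext(field)


# Made-up sample submissions, one per form version.
samples = [
    '<data id="my_form" version="1"><fieldX>yes</fieldX></data>',
    '<data id="my_form" version="2"><fieldY>no</fieldY></data>',
]
rows = [extract_value(s) for s in samples]
```

Each entry in rows pairs the form version with the value read from the field that applies to that version, so the result has one consistent column regardless of which version a submission came from.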