Approaches for linking form instance changes to individuals

LN · September 3, 2019, 7:37pm

Thanks so much to all who have contributed to improving this spec. @yanokwa has invited me to be on the TSC call tomorrow. I can be there about 15 minutes in.

I took a moment to listen to the last call. I should have done that before responding. Luckily what I wrote in my last comment is still relevant but I realize it may need more context. First, thanks to @aurdipas for bringing up the question about the non-blank to blank to non-blank case and to @adam.butler for the explanation of the intended behavior in the current spec draft.

Why not ask for change reasons only on form re-open

An earlier draft of the spec proposed asking for change reasons on form re-open but as mentioned by @tomsmyth in the call, it’s common to partially fill a form and to complete it on re-open. It doesn’t make sense to ask for change reasons then.

Also, asking for change reasons any time a non-blank value is changed better matches paper. For example, if I write down a child’s age as 12 and then I need to cross it out to write 13, I would need to initial and explain this because it was written in ink. This is the case even if I immediately catch the mistake and make the fix.

Possible strategies to overcome issues with asking for change reasons only on form re-open

The big downside of letting the form designer choose when to ask for change reasons (always, after save, after finalize…) as @tomsmyth suggested or leaving it up to the enumerator as @Xiphware described is that both require more training and leave more room for enumerator error. For example, if the form designer can choose when change reasons are requested, enumerators have to be trained on when to save vs. when to finalize and have to be given instructions on what to do if they do the wrong one.

In the case of the enumerator deciding, another downside is that it leaves the door open for both accidental and deliberate opt-out.

Why tracking pristine status doesn’t seem worth it

On the call, @martijnr described requesting change reasons when non-blank values are changed as a crude stand-in for requesting change reasons when a previously-set value is changed. That’s correct, though it’s not clear to me that requesting reasons for changes to previously-set value is much better.

Tracking whether each field has ever been set (“pristine status”) is possible. Then explanations for changes could be requested when changes are made to non-pristine values regardless of whether the new value is blank or not.

I believe this would lead to the following two differences when compared to what is currently written in the spec:

Changes to default values would not require explanations because they are pristine
In a case where the user blanks a value and then sets it again, explanations would be requested both when blanking the value and when it is set again (even if immediately after).

I’m not entirely sure that those are improvements. In the default case, I can see it either way. In the blanked to non-blank case, requesting two reasons (one for clearing the value and one for re-entering a value) seems like it would generally be redundant.

Tracking pristine status would add complexity and possibly have performance implications. I think it would need to provide clear user benefit for it to be worthwhile.

Am I underestimating the importance of the blanked to non-blank case?

See my previous post for more on why I think the blanked to non-blank case is uncommon and above for why I think suggested ways to address it are not much better or have other downsides. But am I underestimating its importance and/or overblowing the challenges with the alternatives? @aurdipas, is it possible you reacted really strongly to this because the wording in the limitation section made it sound like no change reason was asked for at all in that case?

When to log events when there are multiple questions on a screen

As @Xiphware has pointed out in his recent post and @martijnr highlighted in the doc, it’s not as clear when question events should be logged when there are several questions on a page and especially when some question types can be updated continuously.

This affects all audit features and not this one specifically. Currently, Collect defines the start and end times of question events from within field lists as the times when the field list is entered and exited, respectively. This definitely needs to be improved, but I don’t see a reason for it to block this spec.

I propose we make separate decisions on this, possibly for each question type separately, as other clients get ready to implement the audit log features.

Study design standards

On the call, @martijnr also asked about study design standards. See bullet 2 in the original feature description for an example. One thing I’ve understood from @dr_michaelmarks (correct me if I’m getting this wrong) is that protocols and training are more important than tech.

That is, the technology has to make it possible to collect things like user identifiers and reasons for change but beyond that, a robust protocol with things like independent oversight go a long way in determining whether an approach is standards compliant. It’s possible to design a bad protocol with great tech or to design a great protocol with limited tech (e.g. paper).