Validate against duplicate entries within nested repeats

Hi everyone,

I have the following scenario:

I'm collecting data for a household, and each household can have a maximum of, say, 3 plots of land. On each of these plots, they can grow a maximum of 2 crop types. I want to collect data on crop yields for each plot, and it's important to prevent a user from entering a given crop more than once for a given plot, i.e., this is valid:

Plot A
Crop Maize

Plot A
Crop Beans

Plot B
Crop Maize

Plot B
Crop Beans

but not

Plot A
Crop Maize

Plot A
Crop Maize <--- duplication

To achieve this kind of data collection I am using nested repeats of questions...can anyone advise on what kind of calculation/constraint would help me achieve the above validation? It seems as though I'd need to use loop logic to check through the values of the repeats and halt/flag on encountering a repeat but I can't imagine how I'd do that with the XLSX form?

Thanks in advance!

Hussein

This question is more appropriate for the opendatakit@ list. Adding that
list.

opendatakit-developers@ is for people modifying the software, not those
writing forms and using the tools.

Here's an XLS file and an XML file.

The XML file has been generated from the XLS file then manually edited to
use the names of the chosen crop and plot in the yield question.

The plot and crop are asked for on the same screen. I would generally
recommend this since the constraint is applied on forward-swipe off of a
screen. If you ask these questions on different screens, you might get odd
behaviors.

The technique is to use the selected() predicate to detect whether an
already-entered value matches the current crop_type field's answer. If it
does, the constraint is violated:

not(selected(/* accumulation of already-entered values */, .))

Multiple-select responses are just space-separated lists of values. We can
construct such a list using the join() function:

join(' ', /* already-entered-values */)

giving us this constraint:

not(selected(join(' ', /* already-entered-values */ ), .))
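As a hand-worked sketch of how that evaluates, suppose (hypothetically) the already-entered crop values are maize and beans and the current answer is maize. The values are made up for illustration; yours come from your choices sheet.

```
join(' ', /* nodes with values maize, beans */)
== 'maize beans'

selected('maize beans', 'maize')
== true, since maize appears in the space-separated list

not(selected('maize beans', 'maize'))
== false, so the constraint fails and the duplicate is rejected
```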

To get the already-entered values, you need a complicated XPath expression.
See

In this case, we are referencing the values within the existing filled-in
form, so we want to refer to the crop_type values in our form. Those have
this path (the first element in the path is the filename of your .xls file):

/OnlyOneOfSet/plot/plot_info/crop_type

But if we just used this, we would get the current answer PLUS the answers
for all choices of plot code (Plot A, Plot B, etc.).

We need to filter which repeat groups we include to construct the set of
all crop_type values that we care about.

To do that, we apply a filtering constraint on the repeat group:

/OnlyOneOfSet/plot[ /* filtering constraint to select applicable repeats goes here */ ]/plot_info/crop_type

And the filtering constraint is evaluated where '.' refers to the
currently-under-consideration plot repeat, and we need to use current()/ to
refer to the current crop_type value.

With that syntax:

current()/
== crop_type field currently being verified

current()/../
== plot_info field-list group containing that field

current()/../plot_code
== the plot code (Plot A, Plot B, etc.) corresponding to the crop_type
field currently being verified.

current()/../..
== plot repeat instance of the crop_type field currently being verified.

So the filtering constraint has two parts, both of which must hold (joined with 'and'):

./plot_info/plot_code=current()/../plot_code
== select repeats with plot_code choices matching the plot_code of the
current repeat.

position(.) != position(current()/../..)
== omit the current repeat from consideration.
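
Putting the pieces from this post together, the full constraint on crop_type would look roughly like the sketch below. It is assembled by hand from the fragments above rather than copied from the attachment, so treat OnlyOneOfSet.xml as the authoritative version and adjust the paths to your own form.

```
not(selected(join(' ', /OnlyOneOfSet/plot[./plot_info/plot_code = current()/../plot_code and position(.) != position(current()/../..)]/plot_info/crop_type), .))
```

In an XLSForm this would go in the constraint column of the crop_type row, entered as a single line.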

OnlyOneOfSet.xml (3.4 KB)

OnlyOneOfSet.xls (30 KB)

(Sent in reply to the original post above, dated Tue, Aug 16, 2016 at 6:57 AM.)

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com


Hello, I am somewhat stuck on this formula. I hope you can help me; I would be very grateful.
Help modifying formula