(Post)processing multi-select values

Hi,

we ran some tests on forms with multi-select fields and are now
wondering how the stored information should be used, e.g. with
the "Visualize Tools" of ODK Aggregate 1.0.
The problem is that the values selected in a multi-select are stored
as one combined value and thus are not counted towards the individual
choices.

Example Code:
Defoliation

Select

Widespread
Widespread


None
None

When we select "Widespread" and "None", the diagrams and spreadsheets
process the data as the single value "Widespread None" and not as the
two values "Widespread" and "None".
(I would like to attach a screenshot to make the problem clearer.
I hope it is understandable as described.)

Or are we using the multi-select in the wrong way?

Regards!

(Our setup: ODK Aggregate 1.0 (Production) on Tomcat with PostgreSQL)

This is a tricky data representation question. We punted it in 0.9.x and
1.0 because a standard solution won't make sense in all situations.

If you have a multi-select with 30 choices, would you want a spreadsheet to
be generated with 30 column headings for this one multi-select, and an "X"
under the columns with matching values? What if there are 50 choices?
What if the number of choices is unlimited (this is possible with itemsets
drawing their choices from an external database)? At some point, it makes
more sense to produce a spreadsheet with just "N" columns for a given
multi-select (1st choice, 2nd choice, ...), and show up to N choices for
that multi-select (but which N do you show?).
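
To make the two layouts concrete, here is a minimal Python sketch. It assumes the stored value is a single space-separated string (as in the "Widespread None" example from the original post); the function names and the extra "Patchy" choice are made up for illustration and are not part of Aggregate.

```python
from typing import Dict, List


def indicator_columns(selected: str, choices: List[str]) -> Dict[str, str]:
    """One column per possible choice, with "X" where the choice was selected."""
    picked = set(selected.split())
    return {choice: ("X" if choice in picked else "") for choice in choices}


def first_n_columns(selected: str, n: int) -> List[str]:
    """Exactly n columns (1st choice, 2nd choice, ...).

    Selections beyond n are dropped; missing ones are padded with "".
    """
    picked = selected.split()[:n]
    return picked + [""] * (n - len(picked))


# Hypothetical choice list; "Patchy" is an invented extra choice.
choices = ["Widespread", "Patchy", "None"]
stored = "Widespread None"

print(indicator_columns(stored, choices))
print(first_n_columns(stored, 3))
print(first_n_columns(stored, 1))  # with N = 1, "None" is silently lost
```

The first layout grows with the number of choices; the second stays at N columns but discards anything past the Nth selection, which is the trade-off described above.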

This would all need additional layers of configuration over the simpler
version available in 0.9.x and 1.0.

If you have your data in a MySQL or PostgreSQL database, you may want to
look at Tableau or another data visualization package that can directly
work off the native database tables holding your data.

Mitch

--
Post: opendatakit@googlegroups.com
Unsubscribe: opendatakit+unsubscribe@googlegroups.com
Options: http://groups.google.com/group/opendatakit?hl=en

--
Mitch Sundt
Software Engineer
http://www.OpenDataKit.org
University of Washington
mitchellsundt@gmail.com

i agree with mitch that this is a pretty messy problem, but we can
at least find ways to explain this limitation a bit better (in docs or
ui) and also sketch out what it'd take to fix it.

i've filed the issue at
http://code.google.com/p/opendatakit/issues/detail?id=409. could you
attach a screenshot to the issue?


I attached a screenshot of a pie chart to the issue.

I can well imagine that this is a complicated question and that there
cannot be a standard solution.
We hope we have been able to present our thoughts on the issue and
give you some ideas for discussing it.

We will look for some way to use and process every selected entry of a
multi-select.
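
A minimal sketch of what such processing could look like in Python, assuming the exported column holds the selections of each submission as one space-separated string (as in "Widespread None"). The function name and the sample data are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def count_choices(values: Iterable[str]) -> Counter:
    """Count every selected entry on its own instead of the combined string."""
    counts: Counter = Counter()
    for value in values:
        # "Widespread None" contributes one count to each of its two entries;
        # an empty string (nothing selected) contributes nothing.
        counts.update(value.split())
    return counts


# Invented sample column: four submissions of the same multi-select field.
submissions = ["Widespread None", "Widespread", "None", ""]
counts = count_choices(submissions)

print(counts["Widespread"], counts["None"])
```

Note that this only works when the choice values themselves contain no spaces.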


In our aggregator, we store everything in a relational format. A
Response has many Answers, and an Answer to a multiselect has many
Choices (one for each selected value).

Then we create a single table for export using an SQL View. In that
table, each row represents one Choice (for multiselects) or one Answer
(for non-multiselects). So a multiselect will have multiple rows in
the view, and answers to all other question types have one row in the
view.
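
As a rough illustration of that structure, here is a small sketch using Python's built-in `sqlite3` module. The table, column, and view names are invented for this example and are not our actual schema; the sample data reuses the "Defoliation" question from earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE responses (id INTEGER PRIMARY KEY);
CREATE TABLE answers (
    id INTEGER PRIMARY KEY,
    response_id INTEGER REFERENCES responses(id),
    question TEXT,
    value TEXT              -- left NULL for multiselect answers
);
CREATE TABLE choices (
    id INTEGER PRIMARY KEY,
    answer_id INTEGER REFERENCES answers(id),
    value TEXT
);

-- One row per Choice for multiselects, one row per Answer otherwise.
-- (A multiselect with nothing selected yields one row with a NULL value.)
CREATE VIEW export AS
SELECT a.response_id, a.question, COALESCE(c.value, a.value) AS value
FROM answers a
LEFT JOIN choices c ON c.answer_id = a.id;

INSERT INTO responses (id) VALUES (1);
INSERT INTO answers (id, response_id, question, value) VALUES
    (1, 1, 'site', 'Plot 7'),
    (2, 1, 'defoliation', NULL);
INSERT INTO choices (answer_id, value) VALUES
    (2, 'Widespread'),
    (2, 'None');
""")

rows = conn.execute(
    "SELECT question, value FROM export ORDER BY question, value"
).fetchall()
for row in rows:
    print(row)
```

Because each selected value ends up in its own row, a plain `GROUP BY value` over the view counts "Widespread" and "None" separately.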

We send this view out to Tableau, which handles it quite nicely,
though at a hefty price!

Pretty soon we're going to start adding some basic visualizations and
analyses into the tool itself, using Tableau only for the more
nitpicky questions.


The issue has been filed at
http://code.google.com/p/opendatakit/issues/detail?id=409 and will be
fixed in our next release.

Tom, Aggregate also stores the data relationally; however,
with the many different forms of export it's easy to use the wrong
representation. We simply used the wrong data form when calculating the
chart.

Waylon
