Auto-generation and submission of test data

Andrew · May 5, 2016, 11:21am

Dear ODK team/community,

I have a data system which makes use of ODK for the data gathering portion.
As part of testing the system it would be very useful to automatically
(programmatically) generate large amounts of submission data and submit it
to an Aggregate instance (e.g. using Python scripts).

I would appreciate some guidance as to the easiest/quickest way to do this.

It is easy enough to generate a csv file containing test submission info,
or to generate multiple xml submission files from a template.

What are the options for pushing this data to an Aggregate instance?
Some older posts suggest that using Briefcase is the way to go with this -
is this still the best option?

Do I only need to copy the xml files into the Briefcase storage folder, or
does it need other info/files?
Do I need to generate the unique uuids, or are they generated as part of
the submission process?

Is there any way to import csv data into Briefcase for pushing to
Aggregate, or must it be in the xml files?

Thanks and regards,
Andrew

Mitch_S · May 5, 2016, 6:58pm

There is no import-csv functionality. You would need to generate XML files.

The easiest import is to copy the directory structure of ODK Collect from a
device, then direct ODK Briefcase to pull from that directory structure
(after placing the form definition in the proper directory and creating the
instance folders and instance files).

Once you have loaded an ODK Briefcase with data, it can then be pushed to
ODK Aggregate. The push mechanism uses the same filled-in-form-submission
process as ODK Collect.

It is best if you generate a unique instance ID for each record. If you
don't, ODK Aggregate will synthesize one. During the push from ODK
Briefcase, ODK Briefcase will update the submission XML file with additional
root element attributes to record the instance id that ODK Aggregate
assigned to it. This prevents multiple uploads of that record by multiple
runs of this ODK Briefcase (so you gain nothing by not specifying an
instance id yourself).
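
A minimal sketch of generating the instance IDs yourself in Python. It assumes the "uuid:" prefix convention that ODK clients commonly use for instance IDs; the function name new_instance_id is made up for illustration and is not part of any ODK tool.

```python
import uuid


def new_instance_id():
    """Return a fresh instance ID in the 'uuid:<random UUID>' form (assumed convention)."""
    return "uuid:" + str(uuid.uuid4())


# Generate one ID per test record; random UUIDs are unique for practical purposes.
ids = [new_instance_id() for _ in range(1000)]
```

Each generated value would then be written into the instance ID field of the corresponding submission XML before pulling it into Briefcase.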

--

Post: opendatakit@googlegroups.com
Unsubscribe: opendatakit+unsubscribe@googlegroups.com
Options: http://groups.google.com/group/opendatakit?hl=en

You received this message because you are subscribed to the Google Groups
"ODK Community" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to opendatakit+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com

Andrew · May 14, 2016, 9:41pm

Thanks Mitch, this did the job.

Regards,
Andrew

E_Piqo · May 18, 2016, 11:48am

Hi Andrew,

I'm curious to know, how do you generate the XML programmatically? (I mean
valid XML that follows the form's logic: skips, etc.)

Any library that can help me with that?

Best,
piqo

Andrew · May 19, 2016, 9:06am

Hi Piqo,

The approach was to use an example submission for that form as a template
for submissions.
To generate new submissions I just edited (programmatically) the fields I
wanted to set specifically.
This was done using a Python script:
o Read the copied template xml file into memory using the lxml library (from
lxml import etree)
o Search the xml tree for the specific fields that need to be changed and
set these values
o Write the xml tree out to file
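
The three steps above can be sketched roughly as follows. This is not the attached script: it swaps lxml for the standard library's xml.etree.ElementTree so it runs without extra installs, the template and field names (site_code, first_name) are hypothetical, and it assumes the submission elements carry no XML namespaces. It also does not evaluate any form logic such as skips; it only reuses whatever structure the template already has.

```python
import xml.etree.ElementTree as ET

# Hypothetical template; a real one would be copied from a device submission.
TEMPLATE = """<data id="MyFormID">
  <site_code>TEMPLATE</site_code>
  <grp_personal_1>
    <first_name>TEMPLATE</first_name>
  </grp_personal_1>
  <meta>
    <instanceID>uuid:00000000-0000-0000-0000-000000000000</instanceID>
  </meta>
</data>"""


def make_submission(template_xml, values, instance_id):
    """Return a copy of the template with the given fields overwritten.

    Keys of `values` are paths relative to the root element, e.g.
    'site_code', or 'grp_personal_1/first_name' for a field in a group.
    """
    root = ET.fromstring(template_xml)
    for path, text in values.items():
        node = root.find(path)
        if node is None:
            raise KeyError("field not found in template: " + path)
        node.text = text
    # Give every generated submission its own instance ID.
    root.find("meta/instanceID").text = instance_id
    return ET.tostring(root, encoding="unicode")


xml_out = make_submission(
    TEMPLATE,
    {"site_code": "A17", "grp_personal_1/first_name": "Test"},
    "uuid:00000000-0000-0000-0000-000000000001",
)
```

The returned string can then be written to a file, one per generated submission.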

I'll post the script here for you a bit later today.

Regards,
Andrew

Andrew · May 20, 2016, 2:57pm

Hi Piqo,

Find attached the python script and config files as promised.

Full steps for use are:

o Create your form and load it onto a device
o Ensure ODK Collect is not connected to the internet
o Fill Blank Form and complete the form you loaded. At the end Save and
Finalize
o Use OI File Manager (app from Play Store) to zip the /odk/ folder on your
device
o Copy the zip file to your computer
o Unzip the folder. Navigate to the /instances/ folder.
o There should be a subfolder generated by your submission, something like
_date_time
o Inside this folder you will find the xml file from your submission, as
well as any media files (e.g. photos)
o This is the structure you need to replicate using the scripts

Details of the scripts:
o settings_test.json: this is just a config file (saves me using a bunch
of command-line parameters)
    o numsubmissions - how many submissions to generate. Max value of 999
    in this case
    o submdate - simulated date of submission
    o outputpath - folder to create for files output by the script
    o all other values were specific to my application - I have left them
    as an example. I needed a unique sequential number which incremented with
    each submission for my testing.

o last_uuid.json: keeps track of the last_uuid generated for the
submissions, for repeated running of the script. Is updated with each run.
I just started the uuid at 0000.....00 and incremented it with each
submission. Only requirement is for them to be unique so I didn't bother
making it any more complex than that.
Similarly, I set the submission time as 1 second apart for each submission,
just to get some spread.
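
The incrementing uuid and the one-second spacing can be sketched like this. It is an illustration only, not the attached script: the helper names are invented, and since the layout of last_uuid.json is not shown here the counter is simply passed in as an integer rather than read from that file.

```python
import uuid
from datetime import datetime, timedelta


def sequential_ids(last_value, count):
    """Return `count` UUID strings continuing after the integer `last_value`."""
    return [str(uuid.UUID(int=last_value + i)) for i in range(1, count + 1)]


def spaced_times(start, count):
    """Return `count` timestamps, each one second after the previous one."""
    return [start + timedelta(seconds=i) for i in range(count)]


ids = sequential_ids(last_value=0, count=3)
times = spaced_times(datetime(2016, 5, 20, 9, 0, 0), 3)
```

After a run, the last integer used (here 3) is what would need to be saved so that the next run continues from there without reusing an ID.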

Using the scripts:
o Modify the create_submission_file() function in the script to set values
as you need.
Note that if values are stored at root level then root.find() will find
them directly.
If they are within a group (like 'grp_personal_1' in my script) then you
must first find the group, and then search within that node for the fields
stored in the group.

o Make a copy of your example xml file and save it as
template_submission.xml

o If you have a media file that you want submitted, create a suitable
subfolder and example file (see FOLDER_PHOTO_TEMPLATE and PIC_ID in the
script)

o edit settings_test.json as desired

o run the script using:

python genSubmissions.py settings_test.json

o Cut and paste the folders from outputpath into the /instances/ folder
mentioned above (or modify the script to write there directly)

o Pull these submissions into ODK Briefcase, either via the GUI or from cmd
line:
java -jar "ODK Briefcase v1.4.5 Production.jar" -id MyFormID -od
"C:\ODKtesting\test_files\test_odk" -sd "C:\ODKtesting\Briefcase"

od = unzipped odk directory
sd = Briefcase storage directory to create/use

o Now use the Briefcase GUI to push the forms to your instance
(unfortunately no cmd line way to do this). In my case I was using a VM for
my testing.
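
If you want to drive the pull step from Python as well, a minimal sketch is to build the same argument list as the command shown above and hand it to subprocess. The function name briefcase_pull_command is hypothetical; the flags and paths are just the ones from my example, so adjust them to your own setup and Briefcase version.

```python
import subprocess


def briefcase_pull_command(jar_path, form_id, odk_dir, storage_dir):
    """Build the argument list for the pull command quoted above."""
    return [
        "java", "-jar", jar_path,
        "-id", form_id,       # form ID to pull
        "-od", odk_dir,       # unzipped odk directory
        "-sd", storage_dir,   # Briefcase storage directory to create/use
    ]


cmd = briefcase_pull_command(
    "ODK Briefcase v1.4.5 Production.jar",
    "MyFormID",
    r"C:\ODKtesting\test_files\test_odk",
    r"C:\ODKtesting\Briefcase",
)
# Uncomment to actually run it (requires Java and the Briefcase jar):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids having to quote the jar name and paths that contain spaces or backslashes.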

Hope that's of some use!

Regards,
Andrew

last_uuid.json (16 Bytes)

settings_test.json (214 Bytes)

genSubmissions.py (4.55 KB)
