Starting DataCollect project - questions on handling data exports

Gunter_Lorek · April 13, 2014, 9:55am

Hi all,

I'm starting a project to collect data on machines' peripheral devices with 6-10 mobiles at about 650 locations in Germany.
ODK Collect and the Aggregate cloud server on AppEngine are working fine, data and pictures are delivered to Aggregate.
So well...
I'm planning to import all the records and linked pictures (JPEG, 3-5 MP) into a local database on my Synology DiskStation (MariaDB/MySQL) to check the data and manage the periodic maintenance of all kinds of collected machine parts. Up to 10 repeats are possible in each form. Because of the limitations of the CSV export, using Briefcase is recommended.

Should Briefcase pull the data from Aggregate, or can the mobiles deliver the data directly to Briefcase?

Thanks a lot for replying,

Gunter

Mitch_S · April 14, 2014, 7:26pm

Briefcase is always the active party, pulling the data off of the phone,
pulling data off of the server, or pushing the data into the server.

If you have software skills, you might consider:

(1) set up an ODK Aggregate server running on Tomcat / MySQL to publish the
data directly into a MySQL database. If you are comfortable with working
directly with data tables, this may be the easiest direct route to the
data. The data table structure is described here:
http://code.google.com/p/opendatakit/wiki/AggregateDatabaseStructure

This can closely tie your app into the data tables of ODK Aggregate --
which may or may not be good -- we have no plans and see no need to alter
the data representation of the 1.x tables in ODK Aggregate. To add a layer
of isolation, you could either create a VIEW that you then use in your app
(so that you can join to external data, fold in newer submission data
tables, etc. as your forms/needs change), or you can set up a stored
procedure to copy the data from the Aggregate tables into your own data
tables.
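
As a rough illustration of the VIEW idea above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can run anywhere, and every table and column name in it (form_core, form_parts, maintenance_parts, site_id, and so on) is a made-up placeholder, not the actual Aggregate schema; the wiki page linked above is the reference for the real table layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with placeholder tables and one view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical top-level submission table (one row per form instance).
        CREATE TABLE form_core (
            uri TEXT PRIMARY KEY,
            site_id TEXT,
            submitted TEXT
        );
        -- Hypothetical repeat-group table (one row per repeat entry).
        CREATE TABLE form_parts (
            parent_uri TEXT,
            ordinal INTEGER,
            part_name TEXT,
            serial_no TEXT
        );
        -- The isolation layer: the application reads only from this view.
        CREATE VIEW maintenance_parts AS
            SELECT c.site_id, c.submitted, p.ordinal, p.part_name, p.serial_no
            FROM form_core AS c
            JOIN form_parts AS p ON p.parent_uri = c.uri;
        """
    )
    conn.execute(
        "INSERT INTO form_core VALUES ('uuid:1', 'SITE-001', '2014-04-14')"
    )
    conn.executemany(
        "INSERT INTO form_parts VALUES (?, ?, ?, ?)",
        [("uuid:1", 1, "pump", "P-100"), ("uuid:1", 2, "valve", "V-200")],
    )
    return conn


def parts_for_site(conn, site_id):
    """Query the view, never the underlying tables."""
    cur = conn.execute(
        "SELECT part_name, serial_no FROM maintenance_parts "
        "WHERE site_id = ? ORDER BY ordinal",
        (site_id,),
    )
    return cur.fetchall()
```

If the form changes later, only the view definition would need to be updated; queries such as parts_for_site could stay as they are.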

(2) set up a web service of your own that receives published JSON
serializations of the form data from ODK Aggregate, and transforms these
directly into your data table representation. The JSON (and XML)
publishers are described here:
http://code.google.com/p/opendatakit/wiki/AggregateToJSonXmlREDCapPublishers
Publishers can be configured to stream data as it comes into the ODK
Aggregate server, so once you set it up, it will trickle completed
submissions into your server.
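
To give an idea of the transformation step in such a web service, here is a minimal sketch of splitting one received JSON document into a parent row and its repeat rows. The payload shape and the key names (instance_id, parts, parent_id, ordinal) are assumptions made up for this example; the real publisher output is described on the wiki page above and should be checked before relying on any of this.

```python
import json


def flatten_submission(raw_body, repeat_key="parts"):
    """Split one JSON submission into (parent_row, child_rows).

    Assumes a hypothetical payload: a JSON object with an "instance_id"
    key and a list of repeat entries stored under `repeat_key`.
    """
    doc = json.loads(raw_body)

    # Everything except the repeat group belongs to the parent row.
    parent = {k: v for k, v in doc.items() if k != repeat_key}

    # Each repeat entry becomes its own row, linked back to the parent.
    children = []
    for ordinal, item in enumerate(doc.get(repeat_key, []), start=1):
        row = dict(item)
        row["parent_id"] = doc["instance_id"]
        row["ordinal"] = ordinal
        children.append(row)

    return parent, children
```

The POST handler of the web service would call a function like this on each request body and then insert the parent row and child rows into its own tables.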

We also have a less-technical description of the data transfer options
here: http://opendatakit.org/use/aggregate/data-transfer/

Mitch

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com

Gunter_Lorek · April 17, 2014, 2:07pm

Mitch, thank you very much for your detailed answer. Because of other work on the project, I couldn't answer until today.

We have to check all the data coming in from our (estimated 8) data collectors in the field. There will be data from about 20 facilities per day, with around 80 fields and 20 pictures per facility. There are 4 groups with looped groups inside. No one knows how good the data quality will be. There are safety components that have to be checked against mis-entry.
This will be an exciting project, I suppose.

The direct way via Tomcat that you described would surely be the nicest, but the manual way will be the safest for checking all the data before it goes into the database.
Because our internet connection via LTE is unreliable at the moment, I'm looking for a more fault-tolerant way to pull the data.

After the Easter holidays I will work on this, with plenty of time to think about it between the coloured eggs. I'll report which way we go.

Happy Easter!

Gunter