[ODK Developers] Re: Sending stuck in a loop

Do you think it will be a long time before OpenRosa changes to the way Mitch suggested?

How about renaming all files back to normal after completion of the send to keep from messing with Briefcase? Or just have Collect keep track of the ones it has sent and write that list to a file that can be referred to when the send is re-initiated?
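A minimal sketch of that second idea, in Python for brevity (Collect itself is Java). The file name `sent_files.json` and all helper names are hypothetical and not anything Collect actually has; the sketch only shows a list of sent files being persisted and then consulted on a restart:

```python
import json
from pathlib import Path

SENT_LIST = "sent_files.json"  # hypothetical file name, not used by Collect


def load_sent(instance_dir: Path) -> set[str]:
    """Return the names of media files recorded as already sent."""
    path = instance_dir / SENT_LIST
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def mark_sent(instance_dir: Path, filename: str) -> None:
    """Record one successfully uploaded file in the sent list."""
    sent = load_sent(instance_dir)
    sent.add(filename)
    # Write to a temporary file first so an interrupted write
    # does not leave a half-written list behind.
    tmp = instance_dir / (SENT_LIST + ".tmp")
    tmp.write_text(json.dumps(sorted(sent)))
    tmp.replace(instance_dir / SENT_LIST)


def pending(instance_dir: Path, media: list[str]) -> list[str]:
    """Media files still to upload when the send is re-initiated."""
    sent = load_sent(instance_dir)
    return [name for name in media if name not in sent]
```

Because the media files keep their original names in this variant, it would presumably avoid the Briefcase concern that comes with renaming.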

I agree that the entire community should and would benefit from an efficient restart of the send. As I talk with NGOs, their desire is to have better insight into what is happening in the field across the ocean. They ask for photos to be sent, but that is like pulling teeth. I would hazard a guess that once M&E coordinators realize that they can use cell phones in the hands of every national field worker to snap photos and videos of every project, the field folks will be happy to shoot photos and videos from their phones instead of writing so much text to describe what's going on (a picture is worth a thousand words), and the home office will be thrilled to have better insight into the projects. For example: How is each of the test crops doing each day? What is the condition of the water well? Is the condition of the home improving as women are being paid directly with aid dollars?

I have some good news to report from my attempts to upload forms with a large number of questions and videos...
The large upload that kept restarting from the beginning while I was in a youth hostel in Beijing worked fine the first time when I got home to Texas and re-initiated the upload.

E.g. 1742 questions containing 1002 videos, each between 2 and 10 MB, for a total form size of 4.4 GB over a 2.0 Mb upload home line

I feel sure the China upload would have done well if restarting from where it left off had been in place, since it did upload 600 of the media files on the first shot. But there must have been something about the internet line quality that would lapse about once every 6 hours, which would cause a restart from the beginning. :frowning:
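As a rough check on those numbers, here is a back-of-the-envelope calculation. It assumes the 2.0 Mb home line means 2.0 megabits per second, that 4.4 GB means decimal gigabytes, and it ignores protocol overhead and retries, so real transfers would take longer:

```python
def transfer_hours(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time in hours, ignoring overhead and retries."""
    return size_bytes * 8 / link_bits_per_second / 3600


form_bytes = 4.4e9   # 4.4 GB form, decimal gigabytes assumed
uplink_bps = 2.0e6   # assumption: 2.0 megabits per second upload

hours = transfer_hours(form_bytes, uplink_bps)
print(f"{hours:.1f} hours")  # prints "4.9 hours"
```

Even at the full line rate the send needs close to five hours, so a connection that drops roughly every 6 hours leaves little margin on any slower link.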

So good job ODK on being able to handle forms with a large number of questions and 1000 videos of 2-11 seconds each! And good job to the guys at SurveyCTO who tuned the Amazon servers running Aggregate! If the restart capability is fixed, I think ODK can reliably be used in development scenarios that require large numbers of short media files. It would even be well suited for travel blogging. For example: capture photos and video with associated blog text, then when you get back to a wifi location, upload the form to your personal travel blog site, which puts all of it online for your friends to see.

Bob

Original message

From: "Yaw Anokwa"
To: BobAchgill@hotmail.com;
Dated: 7/11/2014 7:28:49 PM
Subject: Re: [ODK Developers] Re: Sending stuck in a loop

Bob,

If you are looking for something quick that would work for your app,
this sounds like it'd work. You probably couldn't use Briefcase (or
other downstream tools) to pull from Collect and you'd probably have
to change form deletion so it removes the file as well.

I wouldn't call it a fix though. The fix is what Mitch described
earlier. And said fix also has the huge benefit of being useful to the
entire community.

Yaw

Need ODK services? http://nafundi.com provides form design, server
setup, professional support, and software development for ODK.

On Fri, Jul 11, 2014 at 2:42 PM, Bob Achgill BobAchgill@hotmail.com wrote:

Since the Android device is the origination point of files being sent up to
the server and the desire is to avoid unnecessarily resending the same large
media files to the server... would it work to simply rename files as they
are successfully sent to the server? e.g.
########LargeMediaAlreadySent.mp4 This way the renamed media file(s) won't
be found to be sent upon re-initiation of the send. To my knowledge, files
already sent are never used again in the life cycle of a submitted form.
Aren't already submitted forms just hanging around till they are deleted?
Renaming is better than just programmatically deleting the sent files in
case you later need to debug why a form is having trouble being sent.
"Ahhah, it appears that all the files got sent!" "Or, all but this one got
renamed so ... let's focus on why it did not get sent."

For example: if you are driving from LA to DC and the GPS has a hiccup in a
rain storm... you would not need to go all the way back to LA and start
over... just recalculate from the latest known position. In this case,
renaming files as you successfully upload each one ensures that they will
not be found to be sent again as the send recalculates what is left to send.
This might drive your error handling crazy if there is an error message for
File not found... but that is easy to address by adding a condition to check
if the file was already renamed because it was already sent.

Would this proposed fix mess with the OpenRosa standard? At simplest it is
just a rename line of code and a few more lines to add the case to the
file-not-found error message processing.
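To make the rename idea concrete, here is a small sketch in Python (Collect itself is Java). The `SENT_` prefix, the list of media suffixes, and the function names are all hypothetical choices for illustration, not Collect behavior:

```python
from pathlib import Path

SENT_PREFIX = "SENT_"  # hypothetical marker for files already uploaded


def mark_uploaded(media_file: Path) -> Path:
    """Rename a file in place once the server has accepted it."""
    return media_file.rename(media_file.with_name(SENT_PREFIX + media_file.name))


def files_left_to_send(instance_dir: Path, suffixes=(".mp4", ".jpg")) -> list[Path]:
    """Media files that do not yet carry the sent marker."""
    return sorted(
        p for p in instance_dir.iterdir()
        if p.suffix in suffixes and not p.name.startswith(SENT_PREFIX)
    )


def restore_names(instance_dir: Path) -> None:
    """Strip the marker again, e.g. after the whole send has completed."""
    for p in list(instance_dir.iterdir()):
        if p.name.startswith(SENT_PREFIX):
            p.rename(p.with_name(p.name[len(SENT_PREFIX):]))
```

On a restart, only the output of `files_left_to_send` would be uploaded, so a "file not found" check would need to look for the prefixed name as well.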

What do you think?


Original message
From: "Mitch Sundt" mitchellsundt@gmail.com
To: yanokwa@nafundi.com; crobert@surveycto.com;
opendatakit-developers@googlegroups.com;
Dated: 7/3/2014 11:11:27 AM
Subject: Re: [ODK Developers] Re: Sending stuck in a loop

Long term, this would be a useful discussion to open up with all the
OpenRosa server implementers -- adding a mechanism to the protocol to obtain
the list of media attachments already uploaded to the server, along with
their md5 hashes.
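A sketch of how a client might use such a list, in Python for brevity. No such protocol mechanism exists in this thread; the `server_hashes` mapping (attachment name to hex MD5) stands in for whatever the server would return and is purely an assumption:

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 of a file, read in chunks so large videos are not loaded whole."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(local_files: list[Path], server_hashes: dict[str, str]) -> list[Path]:
    """Keep files the server does not list, or lists with a different hash."""
    return [p for p in local_files if server_hashes.get(p.name) != md5_of(p)]
```

Comparing hashes rather than just names would also catch an attachment that was only partially or incorrectly stored on the server.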

Short term, if you are communicating with an ODK Aggregate 1.x-derived
server, the code change to ODK Collect would be:

1. Obtain the list of media attachment files for the submission from the
   device SD card / external storage location.

2. Issue a HEAD request to establish authentication / authorization for
   the session.

3. If attempting to re-send a submission that previously failed, issue an
   ODK Briefcase
   http://code.google.com/p/opendatakit/wiki/BriefcaseAggregateAPI
   GET view/downloadSubmission request to the server.

   - If this is successful, filter the list of media attachments to
     exclude those that the server reports as having been successfully
     uploaded.
   - If this request fails, the server is not ODK Aggregate-based, or the
     submission is not present -- ignore the error and proceed.

4. Proceed with the usual POST workflow, using the filtered list of
   attachments to POST.
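The control flow of the steps above can be sketched as follows, in Python for brevity (the real change would be in Collect's Java code). The three callables are hypothetical stand-ins for the actual HTTP calls, and the shape of the downloadSubmission response is not modeled here; `fetch_uploaded` is simply assumed to reduce it to a set of attachment names:

```python
from typing import Callable, Iterable, Optional


def resend_submission(
    local_media: Iterable[str],
    is_retry: bool,
    head: Callable[[], None],
    fetch_uploaded: Callable[[], Optional[set]],
    post: Callable[[list], None],
) -> list:
    """Send a submission, skipping attachments the server already has.

    head()           -- HEAD request that establishes the session
    fetch_uploaded() -- GET view/downloadSubmission, reduced to the set of
                        attachment names the server reports as uploaded;
                        returns None or raises if the request fails
    post(names)      -- the usual POST workflow for the given attachments
    """
    to_send = list(local_media)
    head()
    if is_retry:
        try:
            uploaded = fetch_uploaded()
        except Exception:
            # Not Aggregate-based, or submission absent: ignore and proceed.
            uploaded = None
        if uploaded:
            to_send = [name for name in to_send if name not in uploaded]
    post(to_send)
    return to_send
```

Passing the network calls in as parameters keeps the filtering logic testable without a server; a failed or unsupported lookup simply falls back to sending everything, as in the existing behavior.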


Mitch

On Wed, Jul 2, 2014 at 8:12 PM, bobachgill@gmail.com wrote:

Just for grins...
From what I see here in China... when the export flood gate is opened...
smart phones in the world will all be stamped "made in China". They are
using their own population to test them... the $100 to $200 models looking
really good!

--
You received this message because you are subscribed to the Google Groups
"ODK Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to opendatakit-developers+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com

Just to loop the community in, FYI, on the private side of this
conversation...

We've not been enthusiastic about "fixing" this particular problem because
any fix would need to be coupled with newly-enforced limits in our service.
As I've explained to Bob, we offer unlimited hosting for surveys under the
expectation that there would be certain practical limits that would keep
surveys from costing hundreds, thousands, or even tens of thousands of
dollars per month for physical hosting. Bob's particular usage goes
considerably beyond any expectations we had for what our servers would be
asked to bear, and we are more than a little bit ambivalent about catering
to that particular use-case (collecting a potentially very large number of
4GB submissions). Our mirroring, back-up, and general hosting strategies
are designed for what would more traditionally be thought of as survey data.

That's not to say that fixing the Collect send process to resume where it
left off is a bad idea. And normally if a customer was having a problem
like this we would dive right in and work on a fix. In this case, though,
there's a bigger issue with our low-price hosting model: it may need hard
limits that we enforce before we start optimizing for multi-GB use cases.
Otherwise, one user can set up the next YouTube and we'll be on the hook
for all of the hosting costs.

Best,

Chris

Bob: if you hire someone to make the changes as I described and contribute
them back to the code tree, then they will be accepted.

The core team doesn't have the resources to make this change, as your use
case is very extreme.

The expected use case would utilize a series of shorter surveys, with each
survey containing some contextual information about what is being captured.

i.e., in the expected usage, you might fill out 1000 short surveys, each
with one documenting video. If a lot of contextual data needs to be
repeated across those videos, you would use the data preloading features to
supply that data (or a reference to it), so that the creation of a new
survey would not be too burdensome.

In this expected use case, the likelihood of failures during transmission
is lower, and the quantity of data needing to be re-sent is also lower.

Having one large survey, with an embedded repeat group that has 1000
repetitions is very atypical. The repeat group functionality is intended
for smaller group sizes -- e.g., the number of people in a dwelling.

And in the ODK 2.0 tools, we do away with repeat groups entirely; we
instead provide mechanisms to access and link multiple forms together so
that each repeat group is itself a form with a link back to the enclosing
form. I.e., we just have flat, fixed-width forms.
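As an illustration of that flat layout only, here is a sketch in Python using the dwelling example. The class and field names are hypothetical and do not reflect the actual ODK 2.0 schema:

```python
from dataclasses import dataclass


@dataclass
class HouseholdRow:
    """One row of the enclosing form."""
    row_id: str
    village: str


@dataclass
class MemberRow:
    """One row of the form that replaces a repeat-group repetition."""
    row_id: str
    household_id: str  # link back to the enclosing form
    name: str


def members_of(household: HouseholdRow, members: list[MemberRow]) -> list[MemberRow]:
    """Follow the links to collect the rows that belong to one household."""
    return [m for m in members if m.household_id == household.row_id]
```

Each row is small and fixed-width, so it can be sent, retried, and stored independently of the others.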

-----------
Chris: Very interesting challenge. I look forward to seeing how your
business / business model evolves as digital data collection moves farther
from the paper-based form data we're all familiar with and more toward
Bob's integrated video/audio/image document style.

Mitch
