Regex problem?

The help at http://opendatakit.org/help/form-design/binding warns that complex regex patterns may cause stack overflow crashes, but I think I'm encountering a different problem with similar symptoms. I have a form (XLSForm attached) with a repeat that collects any number of 10-digit ID numbers (mwid). I want to check that the ID numbers are not duplicated. I am doing this with a calculated field outside the repeat (mwids), for which the calculation is:
join(' ',${mwid})
and a required question inside the loop with relevant condition:
regex(${mwids},concat(${mwid},'.*?',${mwid}))
and constraint false().
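
For anyone who wants to reproduce the logic outside ODK, here is a rough Python equivalent of the two expressions above. It is only a sketch: it assumes regex() succeeds when the pattern matches anywhere in the string, which is an assumption about JavaRosa rather than something confirmed in this thread, and has_duplicate is an illustrative name, not an ODK function.

```python
import re


def has_duplicate(mwids: str, mwid: str) -> bool:
    """Mimic regex(${mwids}, concat(${mwid}, '.*?', ${mwid})).

    Assumption: the pattern may match anywhere in the joined string.
    """
    # Same pattern the form builds: the ID, a lazy gap, then the ID again.
    pattern = mwid + ".*?" + mwid
    return re.search(pattern, mwids) is not None


# Same idea as join(' ', ${mwid}): all IDs entered so far, space separated.
ids = ["1000000001", "1000000002", "1000000001"]
mwids = " ".join(ids)

for mwid in sorted(set(ids)):
    print(mwid, "duplicated" if has_duplicate(mwids, mwid) else "unique")
```

An ID that appears twice in the joined string makes the hidden question relevant, and because that question is required with constraint false(), the user cannot move on until the duplicate is fixed.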

This all works as long as I don't enter more than 5 ID numbers. If I enter 8 or more ID numbers, it all seems to work until the form is saved, at which point ODK Collect 1.4 (1038) crashes with the message "The application ODK Collect (process org.odk.collect.android) has stopped unexpectedly. Please try again." Before it crashes, though, it writes an .xml.save file (in the .cache folder) containing all the data from the current instance. If I enter 6 or 7 numbers, ODK Collect sometimes crashes and sometimes doesn't. If I make the ID numbers 5 digits instead of 10 digits, I can enter more of them before the crash occurs on saving.

So it looks to me as if it is the length of the string to be searched, rather than the complexity of the regex pattern, that is causing the crash, and that for some reason, the crash is only induced when ODK Collect re-checks the current instance data before saving.

Any suggestions for getting round this?

Thanks.

TestDup.xls (32 KB)

Hi James,

My guess is that ODK Collect is running out of RAM because of
something funny the JavaRosa core is doing. The core team has some
upcoming changes to the regex library and some memory optimizations
from SurveyCTO that might automagically fix this issue.

Either way, it's a bug, so please file a bug report at
https://code.google.com/p/opendatakit/issues/list with this sample
form. What would also help the core team is a stack trace showing the
bug. To get a stack trace, follow the instructions at
https://code.google.com/p/opendatakit/wiki/CollectTroubleshooting.

I can't think of a good workaround. The real solution to this problem
is to build a widget (or standalone app) that hands out IDs that can't
be re-used. Perhaps if you file a feature request someone in the
community can build one.
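
As a very rough illustration of that idea, the Python sketch below hands out sequential 10-digit IDs and never issues the same one twice. It is purely hypothetical: IdIssuer and issue are invented names, nothing like this is known to exist in ODK from this thread, and a real widget would also need to persist its counter across devices and sessions.

```python
class IdIssuer:
    """Hypothetical issuer: hands out sequential fixed-width IDs, never twice."""

    def __init__(self, start: int = 1, digits: int = 10) -> None:
        self._next = start
        self._digits = digits
        self._limit = 10 ** digits  # first value that no longer fits the width

    def issue(self) -> str:
        """Return the next unused ID, zero-padded to the configured width."""
        if self._next >= self._limit:
            raise RuntimeError("ID space exhausted")
        value = self._next
        self._next += 1
        return str(value).zfill(self._digits)


issuer = IdIssuer()
print(issuer.issue())
print(issuer.issue())
```

Because every ID comes from a single counter, duplicates cannot occur by construction, so no regex check would be needed in the form.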

Thanks,

Yaw

-- Need ODK services? http://nafundi.com provides form design, server setup, professional support, and software development for ODK.

On Thu, Mar 6, 2014 at 7:21 PM, james.beard.tz@gmail.com wrote:


Thanks. I've submitted a bug report now.

On Tuesday, March 11, 2014 11:59:57 PM UTC, Yaw Anokwa wrote: