As far as "real" forms go, here is a pair that I think would be good to have benchmarks for:
- nigeria_wards_external.xml (15.9 KB) with attachments wards.xml (2.1 MB) and lgas.xml (112.3 KB) (XLSForm in Google Sheets -- be sure to see my note if you want to convert with pyxform)
- nigeria_wards_internal.xml.zip (510.3 KB) (XLSForm in Google Sheets)
They both have cascading selects with many elements. They also both have a question with a simple XPath query to get a single value. My quick qualitative assessment is that the form with external data loads much more quickly than the one with internal instances, and that this should be explored further. Once the form is loaded, evaluation time feels the same, which is what I would expect.
@dcbriccetti has already done some work profiling these (see Collect large form performance - #26 by dcbriccetti) and having benchmarks that any further performance improvements can be verified against will be very valuable.
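To make "benchmark" a bit more concrete, here is a rough Python sketch of the kind of measurement I have in mind. It only times generic XML parsing and a single-value lookup on a synthetic instance. It does not exercise JavaRosa or Collect at all, so treat it as an illustration of the shape of a benchmark rather than a real one. The element names (`item`, `name`, `label`, `lga`) and the helper functions are made up for this example and are not taken from the Nigeria forms.

```python
import statistics
import time
import xml.etree.ElementTree as ET


def build_instance_xml(n_items):
    """Build a synthetic secondary instance with n_items <item> children.

    The structure is an assumption for illustration only.
    """
    root = ET.Element("root")
    for i in range(n_items):
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "name").text = f"ward_{i}"
        ET.SubElement(item, "label").text = f"Ward {i}"
        ET.SubElement(item, "lga").text = f"lga_{i % 100}"
    return ET.tostring(root)


def time_parse(xml_bytes, repeats=5):
    """Return the median wall-clock seconds needed to parse xml_bytes."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        ET.fromstring(xml_bytes)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def lookup_label(xml_bytes, ward_name):
    """Return the label of the first item whose name matches, or None."""
    root = ET.fromstring(xml_bytes)
    for item in root.iter("item"):
        if item.findtext("name") == ward_name:
            return item.findtext("label")
    return None


if __name__ == "__main__":
    for size in (1_000, 10_000):
        data = build_instance_xml(size)
        print(f"{size} items: median parse {time_parse(data):.4f} s")
```

Taking the median over several runs keeps one slow outlier from skewing the result. A real benchmark would run the actual form files through the form loading code rather than a stand-in parser.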
So we don't lose track of related work, these are past conversations that might be worth revisiting:
- https://github.com/opendatakit/javarosa/issues/170
- Profiling and performance tuning
- External data - current state of affairs - #5 by cooperka
If this all looks rather disjointed, it is. We haven't had a specific performance mandate yet, so this has been pushed forward bit by bit as people have time/curiosity/insomnia.
Another quick note -- someone opening up the XML for the Nigeria ward form with internal secondary instances might notice that there are unnecessary translations. This is a pyxform issue and I've filed it at https://github.com/XLSForm/pyxform/issues/285. I don't think that makes a big performance difference.