I was talking about changes we made to the JavaRosa/Collect implementation. We did not specifically look for changes to make in the form design but I think if there had been anything majorly problematic we would have noticed it.
I agree with all of this. The images are not kept in RAM after being taken, so they should have minimal impact on performance, if any at all. I do want to emphasize what @danbjoseph said -- storage can become an issue and bandwidth can as well. The XLSForm converter should warn you to use max-pixels if you don't already, and that's important.
If all of your data collectors are sending images at the same time, you should expect that the submission throughput will be low and that there may be errors sending. You will likely have to stagger submissions and you probably want to run some tests with the number of submissions with images you expect to send simultaneously on the connection you will be using.
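To get a rough feel for the numbers before you run real tests, a back-of-the-envelope estimate can help. This is only a sketch: the function names, the image size, the collector counts and the uplink speed are all made-up assumptions, and it ignores protocol overhead, retries and server limits, so treat the result as a best case.

```python
def upload_seconds(total_megabytes: float, uplink_mbps: float) -> float:
    """Best-case transfer time: megabytes * 8 bits per byte / megabits per second."""
    return total_megabytes * 8 / uplink_mbps


def stagger_batches(collectors: list[str], batch_size: int) -> list[list[str]]:
    """Split collectors into groups that submit one group at a time."""
    return [
        collectors[i:i + batch_size]
        for i in range(0, len(collectors), batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical numbers: 20 collectors, 30 submissions each,
    # 2 images per submission at about 0.5 MB per image.
    total_mb = 20 * 30 * 2 * 0.5  # 600 MB in total
    # Hypothetical shared uplink of 2 Mbps.
    print(upload_seconds(total_mb, 2.0) / 60, "minutes at best")

    # Let 5 collectors submit at a time instead of all 20 at once.
    names = [f"collector-{n}" for n in range(1, 21)]
    for group in stagger_batches(names, 5):
        print(group)
```

If the best case already looks too slow, that is a good sign you need smaller images, staggered submissions, or both.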
If you have an example of a system that is arbitrarily programmable and tries to make assertions about likely performance of a user-produced artifact, that would be helpful to look at. Given a form definition, it would be possible (though not simple) to quantify the amount of memory used per repeat instance. That's only part of the battle, though, because as you've seen first hand, Android and other applications on the device can be doing all kinds of things and the decisions about what Android chooses to keep in memory and not are hard to predict.
I'm still not convinced that Collect's RAM usage is the primary problem. Android 7.0 specifically has a lot of bugs and so it might just be that you were hitting some before some additional reboots. You could try searching for articles about the most common Android 7.0 issues to see if any of those match what you've experienced and whether there are suggestions for approaching them.
Collect does save a snapshot of collected data on each screen change. One thing you could do is train data collectors to stay calm if things freeze up, reboot their device and then, once they open up Collect and tap on the form they want to fill out, use the hierarchy view to navigate to the question they were on before the reboot.
Memory is reclaimed once a form is closed so that does not capture the point of highest memory usage, unfortunately.
I'm not sure that you're going to get actionable information this way. What I might suggest is focusing on identifying some repeatable processes that allow data collectors to proceed with minimal interruption. That might be things like clearing any other running processes, killing Collect and restarting it, or rebooting as I described above.
I think another major question you need to answer is whether there really is an issue after devices have been updated, kept online for a little while and then rebooted. Your previous experience suggests that there might not be.
You could have some questions at the beginning of a flat household form that ask for building information and that use a default to pull the last saved value for each. That way a data collector can simply swipe through if all the information is the same as for the last record they saved. You could get fancy and calculate a hash of those building values to use as a building identifier (or even just concatenate them all together) when you analyze your data. I want to emphasize again that I am not convinced that there really is a problem with form design, though. I think it's worth doing some trials with the new form updates and the latest version of Collect and see what you find.
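For the building identifier, something along these lines could work at analysis time. It is a minimal sketch, not anything Collect does for you: the function name, the example values and the choice of SHA-256 truncated to 12 characters are my assumptions, and you would pass in whichever building fields your form actually has.

```python
import hashlib


def building_id(*values: str) -> str:
    """Derive a short, stable identifier from a record's building values."""
    # Normalize so that stray spaces or capitalization differences
    # do not produce a different identifier for the same building.
    normalized = [value.strip().lower() for value in values]
    # Join with a separator that will not appear in typed text, so that
    # ("ab", "c") and ("a", "bc") do not collapse into the same string.
    joined = "\x1f".join(normalized)
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    # Truncated for readability; keep more characters if you have many buildings.
    return digest[:12]


if __name__ == "__main__":
    # Hypothetical building fields: street address and block name.
    print(building_id("12 Main St", "Block A"))
    print(building_id(" 12 main st ", "block a"))  # same identifier as above
```

Plain concatenation gives you the same grouping. The hash just makes for a shorter, fixed-length value to group on.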