Collect download fails and upload times out

1. What is the problem? Be very detailed.
We have recently begun experiencing an issue with uploading forms from Collect to Aggregate. The submission process has been running well for years, and this problem only began in the last couple of days (with no changes to Collect, the server, Aggregate, etc).

When we try to upload completed forms, we see the following error message:

Upload Results
[form name] - Error: Generic Exception: failed to connect to [our aggregate url]/[our ip] (port 8080) from [some device ip] (port 49282) after 30000 ms.
OR
[form name] - Error: Generic Exception: Connection reset

If I try to Get Blank Form, it takes ~2 minutes to load the list of ~20 forms. However, it does eventually load, which indicates to me that this is not an issue with credentials, target URL, etc. Then, when I select a form to download to the device, I receive the error message:

Download Results
[form name] (ID: [form id]) - Failure

If I try to load Aggregate through a browser, it all loads quickly, we can view all of the data, downloads of xml versions of forms work fine, etc.

My sense is that there is a memory issue here, but I'm not sure where to look.
This study has been running for about 3 years, so a lot of data has been generated and stored in the DB...

2. What app or server are you using and on what device and operating system? Include version numbers.
ODK Collect 2021.2.3
ODK Aggregate 1.7.1
Red Hat Enterprise Linux 7.9
Tomcat 8
JDK 1.8.0_302
mysql 15.1

3. What have you tried to fix the problem?
I have tried:

  • rebooting everything
  • switching to (good) wifi only on the device, to confirm it was not a sim/reception issue
  • switching to using the IP instead of the domain for the target Aggregate URL
  • increasing the max_allowed_packet in /etc/my.cnf

4. What steps can we take to reproduce the problem?

5. Anything else we should know or have? If you have a test form or screenshots or logs, attach below.
catalina.out
Aug 13, 2021 12:48:22 PM org.opendatakit.common.security.spring.RoleHierarchyImpl refreshReachableGrantedAuthorities
INFO: Executing: refreshReachableGrantedAuthorities
Aug 13, 2021 12:48:22 PM org.opendatakit.common.security.spring.UserServiceImpl reloadPermissions
INFO: Executing: reloadPermissions
Aug 13, 2021 12:48:22 PM org.opendatakit.aggregate.form.FormFactory internalGetForms
INFO: FormCache: fetching new list of Forms
Aug 13, 2021 12:48:22 PM org.opendatakit.aggregate.util.BackendActionsTable logValues
INFO: incoming- last Fetch: -564672 [S: -392821 Eq: -1057793 Fs: -157793] futureMillis: 0
Aug 13, 2021 12:48:22 PM org.opendatakit.aggregate.util.BackendActionsTable logValues
INFO: -fetched- last Fetch: 0 [S: -393793 Eq: -1057793 Fs: -157793] futureMillis: -1
Aug 13, 2021 12:48:22 PM org.opendatakit.aggregate.util.BackendActionsTable logValues
INFO: Eq-update last Fetch: 0 [S: -393793 Eq: 0 Fs: -157793] futureMillis: 0 requested: 0
Aug 13, 2021 12:50:27 PM org.opendatakit.common.security.spring.RoleHierarchyImpl refreshReachableGrantedAuthorities
INFO: Executing: refreshReachableGrantedAuthorities
Aug 13, 2021 12:50:27 PM org.opendatakit.common.security.spring.RoleHierarchyImpl refreshReachableGrantedAuthorities
INFO: Executing: refreshReachableGrantedAuthorities
Aug 13, 2021 12:50:27 PM org.opendatakit.common.security.spring.UserServiceImpl reloadPermissions
INFO: Executing: reloadPermissions
Aug 13, 2021 12:50:27 PM org.opendatakit.common.security.spring.UserServiceImpl reloadPermissions
INFO: Executing: reloadPermissions

localhost_access_log.2021-08-13.txt
127.0.0.1 - - [13/Aug/2021:12:38:58 -0500] "GET /jamiibora/formList HTTP/1.1" 401 1087
127.0.0.1 - - [13/Aug/2021:12:38:58 -0500] "GET /jamiibora/formList HTTP/1.1" 200 10873
127.0.0.1 - - [13/Aug/2021:12:48:22 -0500] "GET /jamiibora/formList HTTP/1.1" 401 1111
127.0.0.1 - - [13/Aug/2021:12:48:22 -0500] "GET /jamiibora/formList HTTP/1.1" 200 10873
(this seems to correspond with when I try to load all of the blank forms)
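
To see at a glance which requests actually reach Tomcat, the access log can be tallied by method, path, and status. This is a minimal Python sketch, assuming the log uses the common log format shown above; `tally_requests` and `LOG_LINE` are hypothetical names, not part of ODK or Tomcat. A 401 immediately followed by a 200 on the same path is consistent with a normal authentication challenge and retry. Every client address reads 127.0.0.1 because the requests arrive through the local Apache proxy.

```python
import re
from collections import Counter

# Common log format: client, identity, user, [timestamp], "request line", status, size
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)

def tally_requests(lines):
    """Count requests per (method, path, status); lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            key = (match["method"], match["path"], int(match["status"]))
            counts[key] += 1
    return counts

# The four access-log lines quoted in this post.
sample = [
    '127.0.0.1 - - [13/Aug/2021:12:38:58 -0500] "GET /jamiibora/formList HTTP/1.1" 401 1087',
    '127.0.0.1 - - [13/Aug/2021:12:38:58 -0500] "GET /jamiibora/formList HTTP/1.1" 200 10873',
    '127.0.0.1 - - [13/Aug/2021:12:48:22 -0500] "GET /jamiibora/formList HTTP/1.1" 401 1111',
    '127.0.0.1 - - [13/Aug/2021:12:48:22 -0500] "GET /jamiibora/formList HTTP/1.1" 200 10873',
]

for (method, path, status), n in sorted(tally_requests(sample).items()):
    print(f"{n:3d}  {method} {path} -> {status}")
```

If no POST entries show up in the tally around the time of a failed upload, the submission most likely never reached Tomcat at all, which points at the network path rather than at Aggregate or the database.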

/var/logs/httpd/error_log
[Thu Aug 12 01:46:52.864196 2021] [proxy:warn] [pid 30636] [client 27.215.50.147:45930] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-agregate.soph.uab.edu for uri /boaform/admin/formLogin
[Thu Aug 12 08:50:41.030445 2021] [proxy:warn] [pid 30628] [client 172.104.138.223:27364] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-agregate.soph.uab.edu for uri /fuN3
[Thu Aug 12 08:52:47.225660 2021] [proxy_http:error] [pid 22860] (70008)Partial results are valid but processing is incomplete: [client 27.215.82.167:53026] AH01095: prefetch request body failed to 127.0.0.1:8080 (127.0.0.1) from 27.215.82.167 ()
[Thu Aug 12 09:19:57.912066 2021] [proxy_http:error] [pid 30639] (70008)Partial results are valid but processing is incomplete: [client 49.143.32.6:2646] AH01095: prefetch request body failed to 127.0.0.1:8080 (127.0.0.1) from 49.143.32.6 ()
[Thu Aug 12 14:12:00.996998 2021] [proxy:warn] [pid 22860] [client 104.152.52.26:54558] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-agregate.soph.uab.edu for uri /
[Thu Aug 12 15:16:58.261056 2021] [proxy:warn] [pid 30628] [client 115.55.155.0:53872] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-agregate.soph.uab.edu for uri /boaform/admin/formLogin

/var/logs/httpd/ssl_error_log
[Sun Aug 08 16:18:00.063480 2021] [proxy_http:error] [pid 17905] (-102)Unknown error -102: [client 157.245.176.143:39228] AH01095: prefetch request body failed to 127.0.0.1:8080 (127.0.0.1) from 157.245.176.143 ()
[Wed Aug 11 06:14:57.287456 2021] [proxy_http:error] [pid 22929] (-102)Unknown error -102: [client 157.245.176.143:40314] AH01095: prefetch request body failed to 127.0.0.1:8080 (127.0.0.1) from 157.245.176.143 ()
[Thu Aug 12 19:21:56.555905 2021] [proxy:warn] [pid 29970] [client 109.248.6.120:57737] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-aggregate.soph.uab.edu for uri /
[Fri Aug 13 04:19:02.153198 2021] [proxy:warn] [pid 29941] [client 178.73.215.171:32472] AH01092: no HTTP 0.9 request (with no host line) on incoming request and preserve host set forcing hostname to be odk-aggregate.soph.uab.edu for uri /
[Fri Aug 13 05:28:55.782319 2021] [ssl:error] [pid 2855] [client 54.90.247.217:44926] AH02042: rejecting client initiated renegotiation

Please note that Aggregate is no longer being supported. Central is now the ODK server. The post below has more details about the end-of-life for Aggregate.

However, someone on the forum may still be able to help you.

For those who encounter this issue in the future: we resolved it by opening port 8080 on the server.
It looks like Aggregate was set up to load over https, except for user updates, data pulls, and data pushes, which go over http.
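
A closed port like this can be confirmed from outside the server's network with a plain TCP connection test, which separates "port blocked" from "server slow". The following is a minimal Python sketch, not part of ODK; `port_is_open` and the host name in the usage comment are hypothetical placeholders.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    A refused connection, a timeout, or a DNS failure all count as "not open".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and name-resolution errors.
        return False

# Example usage (replace the placeholder host with the real Aggregate host):
#   for port in (443, 8080):
#       print(port, port_is_open("aggregate.example.org", port))
```

Run from a network outside the server's firewall, a result of True for 443 and False for 8080 would match the symptoms in this thread: the browser works while Collect cannot connect on 8080.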

It is still unclear to me why this error occurred suddenly. Most likely, the http traffic on 8080 flew under the radar for a while, then IT closed the port as part of some security update. However, IT claims that they have never had an inbound request on 8080...