ODK Central with nonstandard port number and custom SSL: Cannot retrieve forms with Collect

1. What is the problem? Be very detailed.
I set up ODK Central and uploaded a form.
Configured Collect via QR code.
In Collect, the Get Blank Form page successfully displays the list of forms present on Central.
The actual download ('Get Selected') fails with the message:

Untitled Form (ID: testform) - Unable to resolve host "local": No address associated with hostname

This happens both for drafts and finalized forms. When two forms are selected for download, two error messages appear.

Curiously, while the successful Get Blank Form request is recorded in both the nginx and service Docker logs, the failed Get Selected request isn't logged anywhere. It seems no request is sent at all. When I make an invalid request from the browser (for example, one that triggers an HTTP 404 response), it is logged as usual.
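
One way to narrow this down is to look at the download URLs the server advertises in the form list, since Collect fetches each form from the downloadUrl given there rather than from the server URL it was configured with. The following is only a sketch under assumptions: download_hosts is a hypothetical helper, and the URL in the usage comment (token and project number) is a placeholder. If it prints a host the phone cannot resolve, or one without the custom port, that would be consistent with the error above and with the request never reaching nginx.

```shell
# Hypothetical helper: print the distinct host[:port] values found in the
# <downloadUrl> elements of an OpenRosa form list read from stdin.
download_hosts() {
  grep -o '<downloadUrl>[^<]*</downloadUrl>' \
    | sed -E 's#^<downloadUrl>[a-zA-Z]+://([^/<]+).*$#\1#' \
    | sort -u
}

# Example usage (placeholder URL; replace the token and project number):
#   curl -s -H 'X-OpenRosa-Version: 1.0' \
#     'https://my.example.com:4443/v1/key/APP_USER_TOKEN/projects/1/formList' \
#     | download_hosts
```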

2. What app or server are you using and on what device and operating system? Include version numbers.
ODK Collect 1.26.2.

ODK Central 0.8.0 was installed with Docker on a custom Debian 10 machine (not DigitalOcean). Another web server was already running there and occupying ports 80 and 443, and the domain already has a Let's Encrypt certificate in use. I used ports 4480 and 4443 for ODK. During installation I followed the official instructions, with the following deviations:

.env, modified first two lines:

SSL_TYPE=customssl
DOMAIN=local
SYSADMIN_EMAIL=mail@example.com

docker-compose.yml, modified the port numbers in the nginx section:

nginx:
 container_name: nginx
 build:
   context: .
   dockerfile: nginx.dockerfile
 depends_on:
   - service
 environment:
   - SSL_TYPE=${SSL_TYPE}
   - DOMAIN=${DOMAIN}
   - CERTBOT_EMAIL=${SYSADMIN_EMAIL}
 ports:
   - "4480:80"         # modified
   - "4443:443"        # modified
 healthcheck:
   test: [ "CMD-SHELL", "nc -z localhost 443 || exit 1" ]

Copied fullchain.pem and privkey.pem from /etc/letsencrypt/live/directory/my.example.com/ as specified in the instructions.

A separate question: I am concerned that this manual copying of certificates will come back to bite me when the certificate is renewed. Would creating a symbolic link to the original files work? Ideally the build would skip certbot altogether and nginx would use the existing Let's Encrypt certificate directly, but I don't know how to tell the installer to do that. It would also rule out one possible source of the problem.
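
On the renewal question: a symbolic link would probably not resolve inside the nginx container, since the link target lives outside the directory that is made available to it. A possible alternative is a certbot deploy hook that re-copies the files after each renewal. The sketch below assumes that Central reads custom certificates from files/local/customssl inside its checkout, and the checkout path in the usage comment is a guess. copy_renewed_cert is a hypothetical helper name.

```shell
# Hypothetical helper: copy a renewed certificate pair from a certbot
# lineage directory ($1) into the directory Central reads custom certs from ($2).
copy_renewed_cert() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  # Files under /etc/letsencrypt/live are symlinks; -L copies the real contents.
  cp -L "$src/fullchain.pem" "$src/privkey.pem" "$dest/" || return 1
}

# Example deploy hook body (paths are assumptions about this setup).
# certbot sets RENEWED_LINEAGE when it runs deploy hooks:
#   copy_renewed_cert "$RENEWED_LINEAGE" /root/central/files/local/customssl \
#     && systemctl restart docker-compose@central
```

Such a script could be placed in /etc/letsencrypt/renewal-hooks/deploy/ so that certbot runs it after every successful renewal.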

Only with these modifications to the standard procedure did the installation finish properly and Central start successfully. I encountered no problems until I started testing the uploaded form.

3. What have you tried to fix the problem?
My problem is similar to this question: https://forum.getodk.org/t/25084. However, changing the domain in the .env file from DOMAIN=local to the real domain name DOMAIN=my.example.com resulted in an unsuccessful installation, and nginx failed to start with:

nginx: [emerg] BIO_new_file("/etc/customssl/live/my.example.com/fullchain.pem") failed (SSL: error:02001002:system library:fopen:No such file or directory:fopen('/etc/customssl/live/my.example.com/fullchain.pem','r') error:2006D080:BIO routines:BIO_new_file:no such file)

Building and running with SSL_TYPE=letsencrypt in .env terminates with certbot error AuthorizationError: Some challenges have failed.

4. What steps can we take to reproduce the problem?
Build Central on a non-standard port with an SSL certificate already present, using the configuration mentioned above. Then try to retrieve an uploaded form.

I didn't have another machine on which to try to reproduce the error.

Welcome to the community, @jary. When you get a chance, please introduce yourself here!

There is a bug in Central. We've fixed it at https://github.com/getodk/central/pull/134 (and documented it at https://github.com/getodk/docs/pull/1216), but it hasn't been merged yet.

If you'd like to try the fix, first make sure .env has the non-local domain (e.g., my.example.com). Then run the following:

cd ~/central;
git fetch;
git checkout issa/fix-custom-ssl;
docker-compose build nginx;
systemctl restart docker-compose@central;

Just be sure to git checkout master and build nginx and restart central after we ship the fix.

The manual copying of certs should not bite you. The above change tells nginx to not use certbot when there is a custom SSL cert.

The port change should be fine, but @issa would know more.

Hello @yanokwa and thanks for your help.

I've changed .env

SSL_TYPE=customssl
# of course I used the real address
DOMAIN=my.example.com
SYSADMIN_EMAIL=mail@example.com

and rebuilt nginx with your patch. It worked, but now Collect throws a different error when downloading forms:
Untitled form (ID: testform) - Attempt to invoke virtual method 'int java.io.InputStream.read(byte[])' on a null object reference

I have destroyed my setup and started from scratch just to be sure, but to no avail.

I looked up a similar question which is for Aggregate, maybe there's a lead?

Weird. Quick things to try. Does going back to the default ports work? Can you confirm using https://www.sslshopper.com/ that the certs are valid?

I can't use :443 as it's occupied by a running Apache server and taking it down is unfortunately not an option.

SSL Checker shows all green, says everything is fine.

When inspecting the logs with docker logs nginx -f, requests are recorded when retrieving the list of forms. When actually downloading them, nothing is logged. It looks like no request is sent at all. It's interesting that one type of request works fine and the other does not.

Will try to build it again tomorrow, this time without reusing my previous configuration.

I tried installing once more (I noticed the fix was merged into the master branch), but there is no difference; the error is still there.

I have been able to reproduce the problem and I can confirm the issue is the external mapping of the port.

It's unlikely that support for non 80/443 ports will be added soon, but I've filed an issue at https://github.com/getodk/central/issues/138 so the Central team can discuss it and perhaps add it to the roadmap. Would you be open to sending in a PR?

As an alternative, you could also try setting up a reverse proxy on your machine and routing port 80/443 to the existing server and routing 4480/4443 to Central.


Thank you for identifying the problem. I will try to apply the workaround next week and report whether it works.

As for the pull request, I'm not sure if I can help, my programming skills are basic.

After a bit of experimenting, I've successfully managed to get it to work. Thanks to @yanokwa for pointing me in the right direction.

What I actually ended up with:
One machine, one IP address, two services:

  • an Apache server that was already running on the machine and could not be altered
  • ODK represented by an nginx server in Docker

A reverse proxy server was needed to separate requests for Apache and ODK. The separation is ensured by setting ODK to listen to custom ports. Apache keeps listening to standard ports :80 and :443.

Assumptions:

  • ODK was already installed, configured to listen to custom ports :4480 and :4443. This configuration does not work perfectly, which is why we're here. It's probably better if you apply these steps appropriately during installation.
  • The current domain my.example.com already has a Let's Encrypt certificate.
  • Root access.

There were three necessary things to do:

  1. Register a new subdomain: survey.example.com
  2. Set up a reverse proxy
  3. Rebuild nginx

Disclaimer: the order of these steps is what it probably should be, but I didn't try it this way. Due to the experimenting involved the actual order of successful steps is shrouded in mystery.

Setting a new subdomain

Add a new DNS record. Then add the new subdomain to your Let's Encrypt certificate:

sudo certbot --expand -d my.example.com -d survey.example.com

The reverse proxy

Add the following content to httpd.conf. The three marked lines were probably added by certbot. I'm pretty sure I didn't put them there. (I ran certbot after modifying httpd.conf.) If certbot does not add those three lines, add them manually. Be sure the addresses are correct and the files exist.

<VirtualHost *:80>
    ServerName survey.example.com
    ProxyPreserveHost On
    # 4480 is the alternative for port 80
    ProxyPass / http://localhost:4480/
    ProxyPassReverse / http://localhost:4480/
</VirtualHost>

<VirtualHost _default_:443>
    SSLProxyEngine On
    ServerName survey.example.com
    ProxyPreserveHost On
    # 4443 is the alternative for port 443
    ProxyPass / https://localhost:4443/
    ProxyPassReverse / https://localhost:4443/

    # The three lines below are the marked lines (added by certbot)
    SSLCertificateFile /etc/letsencrypt/live/my.example.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/my.example.com/privkey.pem
    Include /etc/letsencrypt/options-ssl-apache.conf
</VirtualHost>

After saving the config file don't forget to reload it.
sudo apachectl -k graceful
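
If the reload fails or the proxied site returns an error, one thing worth checking is whether Apache's proxy modules are loaded, since ProxyPass and SSLProxyEngine depend on them. A minimal sketch, assuming a Debian-style Apache where apachectl -M lists the loaded modules; missing_proxy_modules is a hypothetical helper name.

```shell
# Hypothetical helper: read the output of `apachectl -M` on stdin and print
# the modules that the reverse proxy configuration needs but that are not loaded.
missing_proxy_modules() {
  loaded=$(cat)
  for mod in proxy_module proxy_http_module ssl_module; do
    printf '%s\n' "$loaded" \
      | grep -q "^[[:space:]]*${mod}[[:space:]]" || echo "$mod"
  done
}

# Example usage:
#   apachectl -M 2>/dev/null | missing_proxy_modules
# On Debian, missing modules can be enabled and the config checked with:
#   sudo a2enmod proxy proxy_http ssl
#   sudo apachectl configtest
```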

Rebuild nginx

.env is now this:

SSL_TYPE=letsencrypt
DOMAIN=survey.example.com
SYSADMIN_EMAIL=my.mail@example.com

I didn't have to change docker-compose.yml but I'm adding it here to be complete.
This is just the nginx section, and only the two commented lines differ from the default.

nginx:
    container_name: nginx
    build:
      context: .
      dockerfile: nginx.dockerfile
    depends_on:
      - service
    environment:
      - SSL_TYPE=${SSL_TYPE}
      - DOMAIN=${DOMAIN}
      - CERTBOT_EMAIL=${SYSADMIN_EMAIL}
    ports:
      - "4480:80"   # on the left goes the alternative port number, on right the real port 
      - "4443:443"  # and the same applies for https  
    healthcheck:
      test: [ "CMD-SHELL", "nc -z localhost 443 || exit 1" ]

Then run docker-compose build in your working folder and run the usual steps:
docker-compose up --no-start
sudo systemctl start docker-compose@central
Check systemctl status docker-compose@central and docker ps -a. Everything green, up and healthy? Good. Now it should work. If not, try the steps in a different order.
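
The container check can be condensed into a small filter. This is only a sketch: unhealthy_containers is a hypothetical helper, and it assumes docker ps -a is run with a --format template that prints each container's name and status separated by a pipe character.

```shell
# Hypothetical helper: read "name|status" lines on stdin and print the names
# of containers that are not up, or that are up but not (yet) healthy.
unhealthy_containers() {
  awk -F '|' '$2 !~ /^Up/ || $2 ~ /unhealthy|health: starting/ { print $1 }'
}

# Example usage:
#   docker ps -a --format '{{.Names}}|{{.Status}}' | unhealthy_containers
```

No output means every container reports as up; any name printed is a place to start looking with docker logs.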
