ConnectionError: timeout exceeded when trying to connect, at Object.createConnection

Hi,
I would be so grateful if someone can help me into below concern:

We started getting frequently below timeout error after upgrading ODK central version from 1.2.1 to 1.4.2.

{"message":"Completely unhandled exception: timeout exceeded when trying to connect","details":{"stack":["ConnectionError: timeout exceeded when trying to connect","at Object.createConnection (/usr/odk/node_modules/slonik/dist/src/factories/createConnection.js:54:23)"]}}

We are using custom postgres database server which is hosted on same linux machine where central application is hosted. Below is my finding that I did to troubleshoot the issue :

  • The request to the application server is high so, I was in concern may be RAM usage causing issue but we verifed RAM usage looks like ok.
  • we also verified active connection to database and we noticed whenever the count of below query return 16 application stop working and start throwing below timeout error. I am not sure is there any limit of 16 of any parameter in central application. I have not seen any limit in postgress configuration.

select count(*) from pg_stat_activity; = 16

Regards,
Abhishek

Is this happening immediately after an upgrade or a day or two after? It sounds like the latter, but when traffic gets high then you get the timeouts?

When you say "request to the application server is high" what exactly do you mean? What is making those requests (data collectors, API integrations, etc) and how how large or frequent are these requests?

Look at the resource usage or errors on the PostgreSQL server. Anything interesting there?

Hi,
Thanks for your reply.
Yes, I guess it start after two or three days after upgrade. We are using api call to sync form through the mobile app.

I have not seen any error log in postgres sql server and resources also seems ok.

Regards,
Abhishek

We're thinking that this may be due to a Central issue with the database connection pool. I think it's possible that the behavior you're seeing is related to this post:

We're currently investigating the issue further. I have a couple of questions about your query of pg_stat_activity:

  • When is the count if you specify where state <> 'idle' ?
  • When you first start or restart the server, what is the count?

Hi,
Thanks for your reply. Please find answer of your questions:

  1. What is the count if you specify where state <> 'idle' ? - it is 1. I checked when application is working fine, I will check and reply back when problem occured.
  2. When you first start or restart the server, what is the count? - 7.

Regards,
Abhishek

1 Like

Hi,
Today we again get same problem and I checked the count of below query
select count() from pg_stat_activity where state <> 'idle' - It was 11.
select count(
) from pg_stat_activity - It was 16.

Regards,
Abhishek

1 Like

Thank you, @abhishek! It's been helpful to see these counts. We're still investigating this issue, but we've posted some updates in the other topic. In particular, take a look at @yanokwa's suggestions there:

We'll be continuing to update that topic as we get more information.

1 Like