Aggregate instances and resource consumption

Dear ODK Developers,

I am trying to run lots of Aggregate instances on a single Tomcat server
and I am finding that each instance takes a monstrous amount of resources
-- even without any data or usage. Since these instances will be idle 98%
of the time, it seems that it should at least be possible, in principle, to
run a bunch of instances on a single server without requiring that server
to have infinite resources. Do any of you have quick advice for me, given
your own hard-earned experience? Is it just foolhardy to try to run a lot
of Aggregate instances on a single server?

I have not yet determined how much of a constraint simultaneous database
connections may or may not be, but the ODK DB config is a concrete case in
point:

(connection-pool configuration snippet not preserved in the archive)
Each instance starts out with 30 connections right out of the gate and
may grow to 120. I have increased MySQL's limit to handle many simultaneous
connections, but I am still tempted to trim this config. That temptation is
tempered, however, by the worry that ODK may sometimes require "bursts" of
connections in certain rare cases -- and, if I don't allow each instance up
to 120, certain types of complex form operations will fail (perhaps
something like deleting a form that has lots of media files or is split
across database tables?).
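
For reference, the numbers quoted above are consistent with BoneCP's partitioned-pool arithmetic. The actual config snippet was lost from the archive, so the key names below follow BoneCP's `BoneCPConfig` properties and the 3-partition split is an assumption; only the 30/120 totals come from the post:

```properties
# Sketch of pool settings consistent with "starts at 30, grows to 120"
# (property names follow BoneCP's BoneCPConfig; ODK's actual keys may differ)
partitionCount=3
minConnectionsPerPartition=10   # 3 x 10 = 30 connections at startup
maxConnectionsPerPartition=40   # 3 x 40 = 120 connections at peak
```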

Again, any quick advice would be greatly appreciated.

Thanks very much,

Chris

First, on Linux servers it is common to need to raise the OS limits on
open sockets and file handles well beyond the defaults (often by more than
10x) when running database servers or web servers on that system. This
generally does not impact performance. The primary limitation will always
be main memory (RAM), as that limits the number and size of simultaneous
JVMs that you can run, and the size of the JVM will limit the size of any
File Export you attempt in Aggregate (the constructed file must fit
entirely in memory in the JVM in which it is running). I would fully
expect to need to specify 10,000 open sockets and file handles for a
heavily used system -- about 10x the default number of sockets and files
in the stock install.

And, for database servers, allocating a generous share of main memory for
scratch space will vastly improve database server performance.

Next, none of the settings have been investigated or tuned for performance,
so you can definitely play around with them.

For each server instance, there are settings within Tomcat that specify
maxThreads -- the number of concurrent requests that will be processed by
the JVM at any one time. Other requests are queued for execution, up to the
limit of acceptCount requests (see
http://tomcat.apache.org/tomcat-6.0-doc/config/http.html). The default is
to allow 200 simultaneous threads and 100 queued requests. If users are
doing a lot of file exports, you would need to trim this down. If each of
the 200 active requests were also to open its own database connection,
there could be 200 x 2 + 100 sockets just for the active requests, their
database connections, and the queued requests. The connection pool then
manages cycling connections into an 'idle' pool, maintaining extra free
connections, etc., bumping the total number of sockets needed up to 500 or
so.
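
These knobs live on the HTTP Connector element in Tomcat's conf/server.xml. A minimal sketch -- the port and timeout values are illustrative, and the trimmed-down thread counts are an example, not a recommendation:

```xml
<!-- conf/server.xml: defaults are maxThreads="200", acceptCount="100";
     trim them when each request may hold a large in-memory export -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="50"
           acceptCount="25"
           connectionTimeout="20000"
           redirectPort="8443" />
```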

Note also that if you deploy multiple Aggregate servers within the same
Tomcat instance (which is definitely doable), you can share ONE database
connection pool across ALL those instances, rather than maintaining one
pool per Aggregate server. This requires moving the BoneCP jar into the lib
directory of the Tomcat server, removing it from the individual
applications' lib directories, and likely adjusting some server
configuration so that a common database connection pool is created when the
Tomcat server starts (before it starts the individual applications).
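
The "server configuration" piece isn't spelled out above; one conventional approach (an assumption on my part, not something the post confirms) is a global JNDI resource in conf/server.xml backed by BoneCPDataSource via Tomcat's BeanFactory, with the BoneCP and JDBC-driver jars in Tomcat's lib/:

```xml
<!-- conf/server.xml: one shared pool for all webapps (sketch; attribute
     names mirror BoneCPDataSource bean properties) -->
<GlobalNamingResources>
  <Resource name="jdbc/odk_shared"
            auth="Container"
            type="com.jolbox.bonecp.BoneCPDataSource"
            factory="org.apache.naming.factory.BeanFactory"
            driverClass="com.mysql.jdbc.Driver"
            jdbcUrl="jdbc:mysql://localhost:3306/odk"
            username="odk_user" password="changeme"
            partitionCount="1"
            minConnectionsPerPartition="5"
            maxConnectionsPerPartition="30" />
</GlobalNamingResources>
```

Each application would then map the global resource into its own JNDI namespace with a `<ResourceLink>` in its context definition.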

There are also configuration settings within BoneCP to control the
management of the database connection pool. Unfortunately, that looks to
be very poorly documented, which means wading through the configuration
options in the code:
http://jolbox.com/bonecp/downloads/site/apidocs/com/jolbox/bonecp/BoneCPConfig.html

To keep things simple, I would first just bump up the maximum number of
sockets and file handles in the OS until I start to see performance issues,
and consolidate the BoneCP connection pools only as a last resort.

Mitch

On Fri, Oct 12, 2012 at 6:26 AM, Christopher Robert wrote:

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com

Mitch,

Thank you so much for the very helpful reply. You've given me a bunch of
good ideas to pursue. As usual, I'll be keeping detailed notes and I can
always append to the existing ODK-on-AWS wiki doc once I'm through.

Thanks again,

Chris

Just to circle back and report on this, the following are the settings I've
gone with for the moment. They seem to be working on Amazon Web Services
Linux server instances.

Bumping up the number of open file handles in /etc/security/limits.conf
(then confirm with "ulimit -a" after restart):

* soft nofile 16384
* hard nofile 65536

Tomcat settings in server.xml: maxThreads="2000".

MySQL settings in my.cnf:

[mysqld]
max_allowed_packet = 100M
max_connections = 12000

For Tomcat options, it turns out that the JVM needs roughly 164MB per
Aggregate instance (though this may be bloated a bit by libraries in my
customizations). Thus, if X is the total memory on my server instance in
GB, I set Y=(X-1)*1024 (holding back 1GB for MySQL, the base OS, etc., and
converting to MB) and W=Y/4, and configure Tomcat as follows:

CATALINA_OPTS="-Xmx[Y]m -XX:PermSize=[W]m -XX:MaxPermSize=[W]m
-XX:-HeapDumpOnOutOfMemoryError"

This allows me to safely run N=Y/164 separate Aggregate hosts.
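
The sizing rule above can be written out as a quick calculation. The 164MB-per-instance figure is the measurement from the post; the 8GB server size is just an illustrative input:

```python
def aggregate_capacity(total_ram_gb, mb_per_instance=164, reserved_gb=1):
    """Sizing rule from the post: reserve 1GB for MySQL/OS, size the JVM
    heap as Y=(X-1)*1024 MB with permgen W=Y/4, and estimate how many
    Aggregate instances (N=Y/164) fit in that heap."""
    y_mb = (total_ram_gb - reserved_gb) * 1024   # heap: Y = (X-1)*1024
    w_mb = y_mb // 4                             # permgen: W = Y/4
    n = y_mb // mb_per_instance                  # instances: N = Y/164
    return y_mb, w_mb, n

y, w, n = aggregate_capacity(8)
print(f'CATALINA_OPTS="-Xmx{y}m -XX:PermSize={w}m -XX:MaxPermSize={w}m"')
# e.g. an 8 GB server gives Y=7168 MB, W=1792 MB, and room for 43 instances
```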

I may well run into other limits, but overall memory seems to be the
binding constraint -- and the most problematic one. Tomcat crashes in a
variety of ways when memory is low, and it is often extremely unclear what
the source of the problem is. Nearly every cryptic, difficult problem we
have experienced has ultimately been related to insufficient memory.

This is, of course, without taking any steps to share connections or
otherwise economize. I have a strong preference for keeping all Aggregate
hosts fully separate (so that their versions can differ, etc.), but the
cost of larger-memory instances on the AWS-EC2 infrastructure has me
re-thinking that stance. As things stand, I can run dramatically fewer
hosts per server instance than I had originally expected (and than I had
originally budgeted for).

Best,

Chris
