On the MMO I’m working on, we do quite a bit of service monitoring via Jetty: a URL that reports back version information for a component, a page full of statistics showing recent activity, or a way to trigger a self-check that determines whether a component is sane and healthy.
For certain components we run multiple instances of the service on the same box. In development environments, it’s a pretty small number of instances. In production it’s a much bigger number. Typical stuff. However, Operations noticed that startup times for a server full of instances jumped from seconds to over 20 minutes when the number of instances was increased from 3 to 12. Ouch.
So, something is blocking on startup. After some digging, I came across this post explaining how Jetty uses a secure random number generator for session IDs, which draws on the system’s pool of entropy. When that pool runs low, the generator can block until more entropy is available. Sure enough…
$ cat /proc/sys/kernel/random/entropy_avail
128
(On development cluster machines, that value is upwards of 3000.)
Since we don’t care whether the Jetty session IDs are securely generated, switching the generator from secure to non-secure brought the startup time for 20 instances back down to a few seconds.
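To make the secure versus non-secure distinction concrete, here is a minimal sketch in plain Java. It is not Jetty's code: the class SessionIdSketch and the method newId are hypothetical names invented for illustration. The point is only that an ID generator written against java.util.Random can be handed either a SecureRandom (which, depending on the JVM and its configuration, may draw on the kernel entropy pool) or a plain Random (which never does).

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper for illustration only; this is not part of Jetty.
final class SessionIdSketch {

    // Builds an ID from two 64-bit draws, rendered in base 36.
    // The sign bit is masked off so the output contains only [0-9a-z].
    static String newId(Random random) {
        long hi = random.nextLong() & Long.MAX_VALUE;
        long lo = random.nextLong() & Long.MAX_VALUE;
        return Long.toString(hi, 36) + Long.toString(lo, 36);
    }

    public static void main(String[] args) {
        // Non-secure: seeded without touching the OS entropy pool.
        // Output is predictable to anyone who learns the seed.
        Random fast = new Random();

        // Secure: may consult OS entropy sources when seeding,
        // which is where a starved pool can cause stalls.
        Random strong = new SecureRandom();

        System.out.println("non-secure: " + newId(fast));
        System.out.println("secure:     " + newId(strong));
    }
}
```

Because SecureRandom extends Random, the swap is a one-line change wherever the generator is constructed. In the Jetty versions I have seen, the session ID manager exposes a setter that accepts a java.util.Random, but treat that as an assumption and check the API of the version you run. Only make this trade if, as in our case, predictable session IDs are genuinely harmless.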