Too Many Open Files Error


Diffusion 5.9

Diffusion 6.0

Diffusion 6.1

Diffusion 6.2

Diffusion 6.3

Diffusion Cloud


This tutorial guides you through the workaround and resolution for the java.io.IOException: Too many open files error.

Error

The following PUSH-000191 error will appear in the server log.

|WARN|Client Connector Thread|PUSH-000191|Connector 'com.pushtechnology.diffusion.io.nio.
[email protected]:'
- Unable to accept connection.|com.pushtechnology.diffusion.io.nio.AcceptSelector 
java.io.IOException: Too many open files

Cause

Operating systems limit the number of files that any one process can have open concurrently. The default on most Linux distributions is only 1024 files. Each open file has an associated file descriptor. Socket connections are treated like files: they use file descriptors and are therefore subject to the same resource limits. Exhaustion of this resource is likely to have one of the following causes:

  • Part of your solution code is leaking file-descriptors (if you are using publishers).
  • Your load-balancer is keeping TCP sockets open after use.
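
Because sockets and files draw on the same descriptor pool, it can help to see how a process's descriptors break down by kind. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/fd is readable by the current user; the function name fd_summary is hypothetical and not part of Diffusion.

```shell
# Sketch (Linux only): summarise the descriptors a process holds by kind.
# Each entry in /proc/<pid>/fd is a symlink whose target names the resource,
# for example "socket:[12345]", "pipe:[678]" or a file path.
# The body runs in a subshell, so its variables do not leak into the caller.
fd_summary() (
    pid="$1"
    sockets=0
    pipes=0
    other=0
    for fd in /proc/"$pid"/fd/*; do
        # Descriptors can close while we iterate, so skip unreadable links.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            socket:*) sockets=$((sockets + 1)) ;;
            pipe:*)   pipes=$((pipes + 1)) ;;
            *)        other=$((other + 1)) ;;
        esac
    done
    echo "sockets=$sockets pipes=$pipes other=$other"
)
```

Calling fd_summary with the process ID of the Diffusion server prints a single line of counts. A socket count that keeps growing while client numbers stay flat would be consistent with connections being held open, although the sketch cannot by itself say which side is responsible.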

Workaround

Determine the current maximum file-descriptor count imposed by your operating system. This can be achieved by running JConsole and inspecting the attributes of the following MBean:

  • MBeans > java.lang > OperatingSystem > Attributes

The attributes of interest at this stage are MaxFileDescriptorCount and OpenFileDescriptorCount. The error occurs when OpenFileDescriptorCount reaches the limit imposed by MaxFileDescriptorCount.

Increase the current MaxFileDescriptorCount value using the ulimit command. This can be done as a global setting by your network administrator. If this is not possible, the start script for Diffusion must be modified. Edit the startdiffusion.sh file and add the following line at the start.

ulimit -n 8192

The value 8192 can be changed to any suitable value. This command can be run only by a user with sufficient permissions.

Warning: This may be only a temporary resolution. It is recommended that you investigate why the limit was reached.
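
A start script can also check the hard limit before raising the soft limit, so that a request that is too large is clamped instead of failing. The following is a minimal sketch under that assumption; the function name raise_nofile_limit is hypothetical. Unlike a plain ulimit -n, it changes only the soft limit.

```shell
# Sketch for the top of a start script: raise the soft open-file limit to a
# desired value, but never beyond the hard limit this shell is allowed.
# Not a subshell function: ulimit must change the calling shell's limit so
# that processes started afterwards inherit it.
raise_nofile_limit() {
    nofile_desired="$1"
    nofile_hard=$(ulimit -Hn)
    if [ "$nofile_hard" != "unlimited" ] && [ "$nofile_desired" -gt "$nofile_hard" ]; then
        echo "Requested $nofile_desired exceeds hard limit $nofile_hard; using $nofile_hard" >&2
        nofile_desired="$nofile_hard"
    fi
    ulimit -Sn "$nofile_desired" || return 1
    echo "Open-file soft limit is now $(ulimit -Sn)"
}
```

For example, raise_nofile_limit 8192 placed before the server is launched would request the same value as the line above. Raising the hard limit itself still requires the appropriate privileges.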

Resolution

After the event has occurred, it is not possible to determine why the limit was reached, other than that your Diffusion server experienced file-descriptor exhaustion. However, it is likely that either:

  • Part of your solution code is leaking file-descriptors (if you are using publishers).
  • Your load-balancer is keeping TCP sockets open after use.

In either case, as this problem is likely to reoccur, you are advised to monitor the values of OpenFileDescriptorCount and MaxFileDescriptorCount over time. JConsole provides a useful graphing feature for such circumstances, from which you can save data as CSV for ease of use with tools like Excel.

When this problem reoccurs, we recommend using standard tools such as lsof to list the files/sockets/pipes/fifos/devices that your solution has open.
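
If JConsole is not convenient, the same trend can be approximated from the command line. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/fd and /proc/<pid>/limits are readable by the current user; the function name fd_sample and the variable DIFFUSION_PID mentioned afterwards are hypothetical.

```shell
# Sketch (Linux only): print one CSV row with a timestamp, the process ID,
# the number of descriptors the process has open, and its soft limit.
# The body runs in a subshell, so its variables do not leak into the caller.
fd_sample() (
    pid="$1"
    # One directory entry per open descriptor.
    open=$(ls /proc/"$pid"/fd 2>/dev/null | wc -l | tr -d ' ')
    # The limits line reads "Max open files <soft> <hard> files";
    # the fourth whitespace-separated field is the soft limit.
    max=$(awk '/^Max open files/ {print $4}' /proc/"$pid"/limits 2>/dev/null)
    echo "$(date +%Y-%m-%dT%H:%M:%S),$pid,$open,${max:-unknown}"
)
```

Run periodically, for example with while sleep 60; do fd_sample "$DIFFUSION_PID" >> fd_usage.csv; done, this builds a CSV file that can be opened in a spreadsheet. When the open count climbs towards the limit, lsof -p "$DIFFUSION_PID" lists the individual descriptors behind the number.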