Here is the capture in my browser:
I have deliberately reproduced the message from the captured image as text, so that search engines can find this post when people search for the error message.
java.net.SocketException: No buffer space available (maximum connections reached?): JVM_Bind
    at java.net.PlainSocketImpl.socketBind(Native method)
    at java.net.PlainSocketImpl.bind(Unknown Source)
    at java.net.ServerSocket.<init>(Unknown Source)
    at hudson.TcpSlaveAgentListener.<init>(TcpSlaveAgentListener.java:91)
    at hudson.model.Hudson.<init>(Hudson.java:598)
    at hudson.WebAppMain$2.run(WebAppMain.java:224)
I first tried to find a solution with Google, without much luck. There are a few references to this kind of error, but they are old and seemed unrelated to the problem we encountered.
My first guess was some kind of user limit on Windows, something like ulimit on UNIX/Linux systems.
After two cups of coffee, enlightenment came.
It's hard to believe, but the problem was caused by this: http://support.microsoft.com/kb/196271. The Hudson CI application is currently running on a Windows Server 2003 box, and this version of Windows Server has something called the "maximum ephemeral port number".
Quoting from this Wikipedia entry:
"Ephemeral port is a transport protocol port for Internet Protocol (IP) communications allocated automatically from a predefined range by the TCP/IP Stack Software. It is typically used by the TCP, UDP or SCTP as port for the client end of a client-server communication when the application doesn't bind the socket to a specific port number, or by a server application to free up service's well known listening port and establish a service connection to the client host. The allocations are temporary and only valid for the duration of the connection. After completion of the communication session the ports become available for reuse, although most implementations simply increment the last used port number until the ephemeral port range is exhausted."
According to the Microsoft support website:
Windows Vista and Windows Server 2008 use the IANA-suggested ephemeral port range (49152 to 65535), while Windows Server 2003 still uses the port range 1025 to 5000.
Ephemeral ports are short-lived ports, chosen ad hoc to serve a single connection. Most implementations simply increment the port number by one until the range is exhausted.
Maybe this is just a silly thought of mine, but it seems as if they implemented it like this pseudo-code:
[sourcecode lang="python"]
FIRST_PORT = 1025      # start of the default Windows Server 2003 range
MAX_USER_PORT = 5000   # default upper bound of that range

next_port_num = FIRST_PORT

def get_next_ephemeral_port_num():
    """Hand out ports one by one and fail once the range is used up."""
    global next_port_num
    if next_port_num > MAX_USER_PORT:
        raise ValueError('no buffer space available')
    result = next_port_num
    next_port_num += 1
    return result
[/sourcecode]
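For comparison with my guess above, here is a small sketch of how to ask the real TCP/IP stack for an ephemeral port: binding a socket to port 0 lets the operating system pick one from its configured range. The helper name `grab_ephemeral_port` is my own, purely for illustration, and the numbers printed will differ between machines and runs.

```python
import socket


def grab_ephemeral_port(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free ephemeral port, then return it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 means "let the TCP/IP stack choose a port for me".
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    # Each call opens and closes its own socket, so the ports may repeat.
    print([grab_ephemeral_port() for _ in range(5)])
```

The ports returned should fall inside whatever ephemeral range the operating system is configured with, which makes this a quick way to see that range in action.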
I wasn't really sure that port exhaustion was the cause of the problem, but I figured no kangaroo would be harmed in the making of this fix, so I tried it anyway.
The remedy was simple: run the Registry Editor with Administrator privileges, open the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, and create a new DWORD entry named "MaxUserPort" with the decimal value 65534.
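For anyone who would rather not click through the Registry Editor, the same change can be captured as a .reg file. Below is a minimal sketch that only generates the file text; the helper name `build_reg_file` is hypothetical, and the 5000 to 65534 bounds are the valid range Microsoft documents for MaxUserPort. As far as I know, the new value is only picked up after a restart.

```python
REG_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def build_reg_file(max_user_port=65534):
    """Return .reg file text that sets MaxUserPort to the given decimal value."""
    # Reject values outside the range documented for MaxUserPort.
    if not 5000 <= max_user_port <= 65534:
        raise ValueError("MaxUserPort must be between 5000 and 65534")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{REG_KEY}]\r\n"
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"MaxUserPort"=dword:{max_user_port:08x}\r\n'
    )


if __name__ == "__main__":
    print(build_reg_file())
```

Save the output with a .reg extension and import it as an Administrator. Note that 65534 shows up as 0000fffe, because the file format uses hexadecimal.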
The result: after two days of monitoring, Hudson is still running well.
NOTE:
- The problem seems to be specific to this version of Windows (Windows Server 2003)
- I haven't been able to reproduce this problem on other platforms