2
votes

For the last few days, our Debian server (on Google Compute Engine) has been hitting a "socket not available" error every day, and Tomcat has to be restarted each time.

Dec 30, 2015 1:16:53 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
    at java.net.ServerSocket.implAccept(ServerSocket.java:530)
    at java.net.ServerSocket.accept(ServerSocket.java:498)
    at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
    at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:220)
    at java.lang.Thread.run(Thread.java:745)

While the issue is occurring, checking with netstat -ap shows that connections to Google :( are flooding the server.

tcp        1      0 backend-tomcat:53976 wk-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:42251 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45929 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:42159 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:55348 wo-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:44973 wa-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:42148 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45729 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45721 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:42146 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:36557 wq-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45723 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45915 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45295 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:53819 wk-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        0      0 backend-tomcat:45968 wn-in-f95.1e100.n:https ESTABLISHED 5081/java       
tcp        1      0 backend-tomcat:41737 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:55132 wm-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:53969 wk-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:40883 wj-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:45734 wn-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:40889 wj-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
tcp        1      0 backend-tomcat:41834 wl-in-f95.1e100.n:https CLOSE_WAIT  5081/java       
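
To see at a glance which remote hosts the leaked sockets point at, the netstat output can be summarized with a small awk pipeline. The sketch below is self-contained for illustration: it feeds a few sample lines through a here-doc, and the `summarize` helper name is made up here. On the live server you would pipe `netstat -atnp` into the same pipeline instead.

```shell
#!/bin/sh
# Group CLOSE_WAIT sockets by remote endpoint (field 5 of netstat output)
# and print the busiest offenders first.
summarize() {
  awk '$6 == "CLOSE_WAIT" { print $5 }' | sort | uniq -c | sort -rn
}

# Sample lines stand in for `netstat -atnp | summarize` on the live server.
summarize <<'EOF'
tcp        1      0 backend-tomcat:53976 wk-in-f95.1e100.n:https CLOSE_WAIT  5081/java
tcp        1      0 backend-tomcat:42251 wk-in-f95.1e100.n:https CLOSE_WAIT  5081/java
tcp        0      0 backend-tomcat:45968 wn-in-f95.1e100.n:https ESTABLISHED 5081/java
EOF
```

A count dominated by a single remote host (here the 1e100.net Google frontends) is a strong hint that one client inside the JVM is leaking connections to one service.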

Checking the process with lsof -p 5081 shows a huge list of lines like these:

java    5081 tomcat7 1181u  IPv4            1959257      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48570->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1182u  IPv4            1959420      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48575->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1183u  IPv4            1959526      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48577->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1184u  IPv4            1959676      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48583->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1185u  IPv4            1960099      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48591->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1186u  IPv4            1959857      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48589->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1187u  IPv4            1960209      0t0     TCP backend-tomcat.c.stutzenappointments.internal:48596->wb-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1188u  IPv4            1960352      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49524->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1189u  IPv4            1960555      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49526->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1190u  IPv4            1960721      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49531->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1191u  IPv4            1960962      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49537->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1192u  IPv4            1961151      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49539->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1193u  IPv4            1961318      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49544->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1194u  IPv4            1961557      0t0     TCP backend-tomcat.c.stutzenappointments.internal:49550->wa-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1195u  IPv4            1961828      0t0     TCP backend-tomcat.c.stutzenappointments.internal:46380->wl-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1196u  IPv4            1962155      0t0     TCP backend-tomcat.c.stutzenappointments.internal:46386->wl-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1197u  IPv4            1962356      0t0     TCP backend-tomcat.c.stutzenappointments.internal:46392->wl-in-f95.1e100.net:https (CLOSE_WAIT)
java    5081 tomcat7 1198u  IPv4            1962461      0t0     TCP backend-tomcat.c.stutzenappointments.internal:46393->wl-in-f95.1e100.net:https (CLOSE_WAIT)
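
Rather than eyeballing the lsof list, the number of descriptors the process holds can be tracked over time; a count that climbs steadily and never drops back confirms a leak. A minimal sketch using /proc, with the shell's own PID standing in for the Tomcat PID (5081 in this case):

```shell
#!/bin/sh
# Count open file descriptors for a PID by listing /proc/<pid>/fd.
# On the server you would run `fd_count 5081` (or `lsof -p 5081 | wc -l`)
# every few minutes and watch the trend.
fd_count() {
  ls "/proc/$1/fd" | wc -l
}

fd_count $$   # the shell's own PID as a stand-in for the Tomcat PID
```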

From ulimit, the open files limit is already raised to the maximum:

open files                      (-n) 65536
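
Note that `ulimit -n` reports the limit of the current shell, which is not necessarily the limit the Tomcat process is actually running under. One way to check both the usage and the effective limit of a running PID is /proc; the sketch below uses the shell's own PID as a stand-in for 5081:

```shell
#!/bin/sh
# Compare fd usage against the per-process "Max open files" soft limit
# reported in /proc/<pid>/limits (field 4 of that line).
pid=$$   # stand-in; on the server: pid=5081
used=$(ls "/proc/$pid/fd" | wc -l)
limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
echo "$used of $limit file descriptors in use"
```

With the limit already at 65536, usage approaching that number points at a descriptor leak rather than an undersized limit.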

Is this due to the Cloud Debugger feature of GCE? Is there any way to ask Google not to make the server bleed daily?

1
The netstat output appears to show outgoing connections to Google. Does your application make use of any Google APIs? Consider installing/using lsof to see other files used by your application. Is there a chance that Tomcat or your app is leaking file descriptors? If it's not leaking but this is normal operation, consider increasing the file descriptors (ulimit) available to Tomcat. – Angus Davis
If these are outgoing connections, that's even worse. Since we're on Google Compute Engine, are they tracking every activity? Is there any way to stop it? No Google APIs are used so far. – arvindwill
You mention Cloud Debugger in your post; have you enabled that? Cloud Debugger does need to make API calls to Google in order to function. If you've enabled Cloud Debugger, you may want to try disabling it and seeing if the problem goes away. – Angus Davis
Yes, it was enabled. I have now removed it by editing /etc/default/tomcat7 with sudo emacs and commenting out the line below (marked #STUTZEN) to check whether Cloud Debugger is causing the socket leak: #JAVA_OPTS="${JAVA_OPTS} $( /opt/cdbg/format_env_gce.sh --app_class_path=/var/lib/tomcat7/webapps/ROOT --app_class=/usr/share/tomcat7-examples/examples --agent_logs_dir=/var/log/tomcat7 )". Let's wait a day and check; so far no connections to that domain. – arvindwill
Three days gone, and Tomcat has been stable since removing Cloud Debugger. (: – arvindwill

1 Answer

2
votes

Thank you for reporting it. This bug is now fixed: https://github.com/GoogleCloudPlatform/cloud-debug-java/commit/057d2f181ebfb5b670e31ecc9c155b38cc495150.

Please bump the application version in the format_env_gce.sh script to make sure the VM picks up the latest version of the debugger agent. For example:

...format_env_gce.sh --version=2 ...

If you are using a Managed VM, it will automatically be updated to the latest version of the debugger agent the next time you redeploy.

The latest binary is available at the project's GitHub page.