Friday, February 15, 2008

Killing a Jserv in a Load Balanced Environment

Some time back, we came across an interesting situation in which we needed to kill a single Jserv. Here is the background: in our environment (11.5.8 + JDK 1.3.1_19), we have multiple Jservs configured to run under Apache. These Jservs run in "Automatic" mode, that is, they are started by Apache rather than manually. One of the Jservs was throwing a NoClassDefFoundError, and we wanted to kill that Jserv to get rid of the issue. The problem with this approach is that all the Jservs write to a single jserv.log file, so the log alone does not tell you which Jserv is hitting the error.

To work around this, I collected the following figures for each Jserv process:

1. Time spent by each Jserv in user mode and kernel mode
2. I/O performed by each Jserv
3. Virtual memory consumed by each Jserv

Apache distributes the load across the automatically started Jservs in a round-robin fashion, so it is fair to assume that each Jserv sees roughly the same amount of load in terms of user requests. Based on this assumption and the figures collected above, I was able to zero in on the culprit: the one Jserv that had used the least CPU time, done the least I/O, and consumed the least virtual memory of the four, presumably because the requests routed to it were failing early.
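The comparison can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the commands I actually ran: the `JservStats` class, the `find_idle_outlier` function, the PIDs and all the numbers are made up, and it assumes the per-process figures have already been gathered with whatever OS tools are available on the node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JservStats:
    """Hypothetical per-process figures, collected beforehand with OS tools."""
    pid: int
    cpu_seconds: float  # user mode + kernel mode time
    io_bytes: int       # total I/O performed
    vm_kb: int          # virtual memory size


def find_idle_outlier(stats: List[JservStats]) -> Optional[JservStats]:
    """Return the process that is lowest on all three metrics.

    Returns None when no single process is lowest on every metric,
    in which case this heuristic does not single out a suspect.
    """
    if not stats:
        return None
    lowest_cpu = min(stats, key=lambda s: s.cpu_seconds)
    lowest_io = min(stats, key=lambda s: s.io_bytes)
    lowest_vm = min(stats, key=lambda s: s.vm_kb)
    if lowest_cpu.pid == lowest_io.pid == lowest_vm.pid:
        return lowest_cpu
    return None


# Made-up sample figures for four Jservs under round-robin load.
samples = [
    JservStats(pid=4101, cpu_seconds=812.4, io_bytes=96_000_000, vm_kb=410_000),
    JservStats(pid=4102, cpu_seconds=798.9, io_bytes=94_500_000, vm_kb=405_000),
    JservStats(pid=4103, cpu_seconds=143.2, io_bytes=11_200_000, vm_kb=180_000),
    JservStats(pid=4104, cpu_seconds=805.7, io_bytes=95_100_000, vm_kb=408_000),
]

suspect = find_idle_outlier(samples)
if suspect is not None:
    print("Suspect Jserv PID:", suspect.pid)
else:
    print("No single Jserv is lowest on every metric")
```

Requiring the same process to be lowest on all three metrics is a deliberately strict rule; if the metrics disagree, the function returns nothing rather than guessing.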

I killed that jserv and the issue was fixed.

- Aravind Kamath

1 comment:

Sandeep Singh said...

Hi Aravind,
I also faced the same issue. In this situation, I checked the following things.

1. Run the URL with OA_HTML/xxwhere.jsp

You will get the name of the node where the issue is occurring.

2. Check the value of the wrapper.bin.parameters=-DLONG_RUNNING_JVM parameter. Set it to false or comment out this parameter.

This parameter disables distributed caching.

No need to bounce; rotatelogs will take care of it.