Problem
The Oracle resource segfaults because of a memory leak in OracleAgent.
Error Message
kernel: OracleAgent[967]: segfault at 0000000000000000 rip 00000000f7c557b3 rsp 00000000f65d78c8 error 4
Cause
In the OracleAgent monitor ep (entry point), the agent runs the command "ps -eo pid,args | grep
<sid>" and captures the output of this command in a buffer of 4096 bytes.
On this system, the 'ps' command run by the OracleAgent monitor ep returns more
than 300 Oracle processes, and the output totals 7291 characters, as shown:
------------------------------------------------------------------------
# cat var/VRTSvcs/log/tmp/Oracle-0 | wc
317 634 7291
------------------------------------------------------------------------
This is greater than the buffer size provided, so the monitor ep receives only
the first 4096 characters of the output, and the OracleAgent dumps core.
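The truncation can be illustrated with a minimal C sketch. This is not the OracleAgent source: capture_output, last_line and CAPTURE_SIZE are hypothetical names, and the sketch assumes that one byte of the buffer is reserved for the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAPTURE_SIZE 4096   /* buffer size quoted in the text */

/* Hypothetical helper: copy at most cap-1 bytes of src into dst and
 * NUL-terminate. Returns the number of bytes of src that did not fit. */
size_t capture_output(const char *src, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);

    size_t len  = strlen(src);
    size_t kept = (len < cap - 1) ? len : cap - 1;

    memcpy(dst, src, kept);
    dst[kept] = '\0';
    return len - kept;
}

/* Hypothetical helper: return a pointer to the text after the last
 * newline in buf. A non-empty result means the buffer ends in the
 * middle of a record instead of at a line boundary. */
const char *last_line(const char *buf)
{
    const char *nl = strrchr(buf, '\n');
    return (nl != NULL) ? nl + 1 : buf;
}
```

With a 16-byte buffer and the input "101 ora_pmon\n202 ora_smon\n", the captured text ends in the fragment "20": a pid cut short, with no space and no argument string after it. The same happens at the 4096-byte boundary when the real output is 7291 characters long.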
From gdb we can see the value of pidargs (which is "2"):
------------------------------------------------------------------------
# fg 1
gdb (wd: /apgshared/Support/yamada/281-756-406/0903)
info locals
i = 5
pos = 0xf719194f ""
pidargs = 0xf719194e "2" <<-------------------------------
pidbuf = 0x0
arg_list = (char **) 0x0
_buffer = {__routine = 0x804af9e <del_proc_array>, __arg = 0xf71908c0,
__canceltype = -149354280,
__prev = 0xf71908e0}
------------------------------------------------------------------------
The code therefore calls strchr("2",' '). Because the string contains no space,
strchr returns NULL, and the agent enters an infinite loop in which each
iteration allocates memory that is never freed.
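As an illustration of how such a loop can guard against a truncated record, here is a minimal C sketch. It is not the actual fix shipped in the product: parse_pid_lines is a hypothetical function, and the "pid args" one-record-per-line layout is assumed from the ps command described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for "pid args" records, one per line.
 * Stores up to max pids and returns the number of complete records.
 * A record with no space after the pid is treated as truncated and
 * ends parsing, so the loop always terminates. */
size_t parse_pid_lines(const char *buf, long *pids, size_t max)
{
    assert(buf != NULL && pids != NULL);

    size_t count = 0;
    const char *line = buf;

    while (*line != '\0' && count < max) {
        size_t line_len = strcspn(line, "\n");
        const char *p = line;

        /* ps right-aligns the pid column, so skip leading blanks */
        while (*p == ' ')
            p++;

        /* look for the separator inside this line only */
        const char *sep = memchr(p, ' ', line_len - (size_t)(p - line));
        if (sep == NULL)
            break;              /* truncated record: stop, do not loop */

        char *end;
        long pid = strtol(p, &end, 10);
        if (end != sep)
            break;              /* pid field is not purely numeric */

        pids[count++] = pid;

        line += line_len;       /* advance past this record */
        if (*line == '\n')
            line++;
    }
    return count;
}
```

Given the string "2" seen in the gdb output, this function returns 0 instead of looping: the missing separator is detected, and no memory is allocated inside the loop at all.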
Solution
The fix is available in Storage Foundation 5.0MP4.
Applies To
Storage Foundation for Oracle 5.0MP1 on Linux