ablog

Notes from a clumsy, restless engineer

Starting the EM Agent on RHEL 4 Update 6 or 7 apparently crashes Linux

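From the oracle-l mailing list.
This is a kernel bug, and patching the kernel reportedly resolves it.
Apparently the kernel had a defect where many concurrent calls to the sys_times() system call triggered a kernel panic.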
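
To make "many concurrent calls to sys_times()" a bit more concrete, here is a minimal sketch of that call pattern. It assumes Python's os.times(), which wraps times(2) on Unix; the helper names hammer_times and run_concurrently are made up for this illustration. It is not a verified reproducer of the bug, and since CPython threads mostly interleave rather than run in parallel, it only shows the shape of the workload.

```python
import os
import threading


def hammer_times(iterations: int) -> int:
    """Call os.times() repeatedly and return how many calls were made."""
    calls = 0
    for _ in range(iterations):
        os.times()  # on Unix this ends up in the times() system call
        calls += 1
    return calls


def run_concurrently(workers: int = 4, iterations: int = 1000) -> int:
    """Run hammer_times() in several threads and return the total call count."""
    totals = []
    lock = threading.Lock()

    def worker() -> None:
        n = hammer_times(iterations)
        with lock:
            totals.append(n)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(totals)


if __name__ == "__main__":
    print("total os.times() calls:", run_concurrently())
```

A monitoring agent that samples CPU usage for many targets at once would produce a similar pattern, which is presumably why the EM Agent ran into it.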

  • Symptom

Hey all,

I hit the following bug:

Note 729543.1 Linux Crashes when Enterprise Manager Agent Starts on
RHEL 4 Update 6 and 7

Apparently any application which makes calls to sys_times() is
affected... I use Grid Control to monitor my servers, so because of
this bug I can't leave them running. Has anyone hit this bug before?
If so, did you try using database control to monitor your servers?
Any idea if that makes calls to sys_times()?

Kernel bug and Grid Control agent. - oracle-l - FreeLists
  • Solution

Hi,

Actually it's an OS kernel bug, and it has been fixed. Running up2date resolves the issue.

Here are further details from the note:
====================================
Bug Introduced in RHEL 4.6:

Broken in RHEL kernel : 2.6.9-67.0.20.EL
Fixed in RHEL kernel : 2.6.9-67.0.22.EL

Broken in OEL kernel : 2.6.9-67.0.20.0.1.EL
Fixed in OEL kernel : 2.6.9-67.0.20.0.2.EL
Fixed in OEL kernel : 2.6.9-67.0.22.0.1.EL

4.7:

Broken in RHEL kernel: 2.6.9-78.EL
Fixed in RHEL kernel : 2.6.9-78.0.1.EL

Broken in OEL kernel : N/A - OEL 4.7 base (GA) kernel includes the fix for
this crash (2.6.9-78.0.0.0.1.EL)
Fixed in OEL kernel : 2.6.9-78.0.0.0.1.EL
Fixed in OEL kernel : 2.6.9-78.0.1.0.1.EL
====================================
Kurt

The opinions expressed in this email are my own and are not the opinions of any
company.

Re: Kernel bug and Grid Control agent. - oracle-l - FreeLists
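
The version list in the note can be turned into a quick check of the running kernel. The sketch below is only that: it assumes the three "Broken in" kernels quoted above are the complete set, and that uname -r may append a flavor suffix such as "smp" to the release string. The name is_listed_as_broken is made up for this example.

```python
import platform

# Kernel releases that the quoted note lists as "Broken".
# Assumption: the note's list is complete; anything else is treated as not listed.
BROKEN_KERNELS = (
    "2.6.9-67.0.20.EL",      # RHEL 4.6
    "2.6.9-67.0.20.0.1.EL",  # OEL 4.6
    "2.6.9-78.EL",           # RHEL 4.7
)


def is_listed_as_broken(release: str) -> bool:
    """Return True if the release string matches a kernel the note lists as broken.

    Prefix matching is used so that flavor suffixes (for example "smp")
    appended after "EL" still match; this suffix handling is an assumption.
    """
    return any(release.startswith(kernel) for kernel in BROKEN_KERNELS)


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    if is_listed_as_broken(release):
        print(f"{release}: listed as broken in the note, update the kernel")
    else:
        print(f"{release}: not in the note's list of broken kernels")
```

Prefix matching is safe here because every fixed kernel in the note continues with more digits (for example 2.6.9-78.0.1.EL) rather than extending a broken release string.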

Googling around, I found that someone inside Oracle had written about it on their blog.

The kernel errata that was made available by Red Hat on August 6 2008 fixes the RHEL4.7 kernel problem mentioned in my last entry.
We have issued a corresponding OE4.7 kernel errata, even though we already had the fix in the stock OEL4.7 kernel. You do not need to install this latest errata to fix the problem. You can get the details here:
http://oss.oracle.com/pipermail/el-errata/2008-August/000703.html

The Corresponding Red Hat bugzillas are:
RH bug 455525
RH bug 455074
RH bug 453507

Explanation of the changes (quoting from the Red Hat errata text):

"A set of patches detailed as "sys_times: Fix system unresponsiveness
during many concurrent invocation of sys_times()" and "Minor code cleanup
to sys_times() call" introduced regression which caused a kernel panic
under high load. These patches were reverted in the current release. "

http://blogs.oracle.com/ezannoni/2008/08/rhel47_kernel_bug.html