ablog

Notes from a clumsy, restless engineer

History of Oracle Database wait events

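Oracle Database wait events were first implemented in version 7.0 (around 1991-1992). They were originally built by the development team to pin down a bottleneck in a benchmark. According to Juan Loaiza, Mark Porter was also involved, and Keshevan Srinivasan did most of the implementation. Mark Porter was, until a little while ago, on the AWS RDS service (development) team and has spoken about Aurora PostgreSQL compatibility at AWS Summit Tokyo. In 1999 Anjo Kolk published YAPP (Yet Another Performance Profiling Method), which was featured in Oracle Magazine in 2000 and apparently came into wide use from there, making performance analysis based on wait events common practice.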

Oracle Insights: Tales of the Oak Table (Oaktable)

  • Author: Ensor, Dave
  • Release date: 2004/07/28
  • Format: Paperback
P.76

Correct Instrumentation Is Key
In the mid 1980s [the mainframe engineers] realized that no matter how many counters and ratios they looked at, it was still pure guesswork (hence luck or lack thereof) whether a person managed to identify and remove the correct (in other words, the biggest) bottleneck of a given application or business unit.
So they instrumented the whole mainframe environment, including DB2, MVS (later OS/390, at present z/OS), and other components. The instrumentation aimed at providing time-based measurements on the session level, and proved so powerful that today, many years later, it's possible to predict within a very small margin what, say, a CPU-upgrade will mean in terms of response time for each application.


FUN FACT The DB2 database code for AIX and Windows were written by different teams with little or no contact to the mainframe coders. Consequently, DB2 on AIX and Windows are not instrumented. Amazing, sad and true.


Around 1991 or 1992 Juan Loaiza and others from Oracle development were forced to instrument the Oracle kernel in the same way. Here's the story, as told to tribute to one of the truly great minds inside Oracle Development.


I think what you are referring to are the wait statistics that were implemented in 7.0. This stuff was developed because we were running a benchmark that we could not get to perform. We had spent several weeks trying to figure out what was happening with no success. The symptoms were clear -- the system was mostly idle -- we just couldn't figure out why.


We looked at the statistics and ratios and kept coming up with theories; the problem was that none of them were right. So we wasted weeks tuning and fixing things that were not the problem. Finally we ran out of ideas and were forced to go back and instrument the code to figure out what the problem was.


Once the waits were instrumented the problem was diagnosed in minutes. We were having "free buffer" waits because the DBWR was not writing blocks fast enough. It's amazing how hard that was to figure out with statistics, and how easy it was to figure out once the waits were instrumented.

The "credit" for this should go to a number of people. I remember that Mark Porter was involved, and Keshevan Srinivasan did most of the actual instrumentation of the code. There were probably others involved but it has been so many years that I don't remember it clearly anymore.


(omitted)

It is worth noticing that if any of the numerous suggestions from "Guess & Grimacing" sessions held in Oracle Development had helped, the instrumentation of the kernel may never have taken place.
In the mid 1990s Anjo Kolk invented the YAPP method (as he describes in Chapter 4). In the process, he became the first human on Earth to take full advantage of the instrumentation.
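
To make the contrast between counters and time-based waits a little more concrete, here is a minimal Python sketch. It is not how the Oracle kernel is instrumented; the `WaitProfile` class, its methods, the "short wait (made up)" event, and the sleep durations are all hypothetical and exist only for illustration. The idea is simply to charge elapsed time to a named event every time a session blocks, then rank events by total time waited rather than by how often they occurred.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class WaitProfile:
    """Accumulates wait counts and time waited per named event (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.counts = defaultdict(int)
        self.seconds = defaultdict(float)

    @contextmanager
    def wait(self, event):
        # Time the blocking section and charge it to the named event.
        start = self._clock()
        try:
            yield
        finally:
            self.seconds[event] += self._clock() - start
            self.counts[event] += 1

    def top_events(self):
        # Rank events by total time waited, largest first.
        return sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    profile = WaitProfile()
    # Simulated session: several short waits and a few long ones.
    for _ in range(5):
        with profile.wait("short wait (made up)"):
            time.sleep(0.001)
    for _ in range(3):
        with profile.wait("free buffer"):
            time.sleep(0.1)
    for event, secs in profile.top_events():
        print(f"{event:25s} waits={profile.counts[event]:3d} time={secs:.3f}s")
```

Ranked by count alone, the frequent short waits look like the busiest event; ranked by time, the few long waits come out on top, which is the kind of answer the excerpt says ratios and counters could not give.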


Overview of OWI (Oracle Wait Interface) (PDF) | 日本エクセム // Greater efficiency through visibility into databases and application servers; supporting infrastructure engineers