ablog

Notes from a clumsy, restless engineer

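Two Kinds of "Asynchronous" I/O

The "asynchronous" in 非同期I/O (the Japanese term for asynchronous I/O) can mean two different things, so I tried writing up how it got so confusing.*1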

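  • Unix uses the terms (non)synchronized and (a)synchronous, but they are easy to confuse in English (a natural language)*2
  • (a)synchronous describes whether the call waits for the I/O operation; (non)synchronized describes whether the call returns only after the data has been written to disk
  • For a write, for example:
    • synchronized: returns after the data has been written to disk
    • "non"synchronized: returns even if the data has not yet been written to disk
    • synchronous: returns after the data has been written to the OS kernel's buffer
    • "a"synchronous: returns once the I/O request has been submitted, even if the data is still in user space
  • The Japanese term 非同期I/O is a translation of "a"synchronous I/O, so in most usage it means firing off I/O requests concurrently without waiting for each one, not whether the data is synchronized to disk
    • The origins of names such as "a"sync and lib"a"io probably have something to do with this too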
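To make the synchronized/nonsynchronized axis concrete, here is a minimal sketch in Python (my choice for illustration; the calls map directly onto open(2), write(2), and fsync(2)). All three writes are synchronous; they differ only in what has happened by the time the call returns. The function names are made up for this example.

```python
import os
import tempfile


def write_buffered(path, data):
    """Synchronous, nonsynchronized: returns once the kernel has buffered the data."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


def write_synchronized(path, data):
    """Synchronous, synchronized: with O_SYNC, write() returns only after the
    data has been handed to the storage device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o644)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


def write_then_fsync(path, data):
    """Nonsynchronized write, then synchronize explicitly at a chosen point."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, data)
        os.fsync(fd)  # blocks until the kernel has flushed this file's data
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for fn in (write_buffered, write_synchronized, write_then_fsync):
            print(fn.__name__, fn(os.path.join(tmp, fn.__name__), b"hello"))
```

The result on disk is the same in all three cases; what changes is how long the call blocks and what you are allowed to assume when it returns.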

Kernel Asynchronous I/O (AIO) Support for Linux
...
AIO enables even a single application thread to overlap I/O operations with other processing, by providing an interface for submitting one or more I/O requests in one system call (io_submit()) without waiting for completion, and a separate interface (io_getevents()) to reap completed I/O operations associated with a given completion group.

Kernel Asynchronous I/O (AIO) Support for Linux

DISK_ASYNCH_IO controls whether I/O to datafiles, control files, and logfiles is asynchronous

DISK_ASYNCH_IO
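  • For writes by Oracle Database on Linux, it looks roughly like this:

|              | Synchronized                           | Nonsynchronized                               |
|--------------|----------------------------------------|-----------------------------------------------|
| Synchronous  | Synchronous I/O ((p)write with O_SYNC) | None                                          |
| Asynchronous | None                                   | Asynchronous I/O (io_submit and io_getevents) |

    • Oracle Database always sets the O_SYNC flag, so it never performs I/O that is not synchronized to disk
    • With asynchronous I/O, it submits I/O requests with io_submit and then checks for completion with io_getevents, so it never moves on without knowing whether the write succeeded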
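The submit-then-reap shape described above can be sketched as follows. This is only an analogy, not Linux kernel AIO: Python's standard library does not expose io_submit or io_getevents, so a thread pool stands in for the kernel's request queue. The function name and the BLOCK size are made up for this example, and every block is assumed to be exactly BLOCK bytes long.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed

BLOCK = 4096  # hypothetical block size; each block must be exactly this long


def write_blocks_concurrently(path, blocks):
    """Queue one positioned write per block, then collect every completion.

    Returns the total number of bytes written.
    """
    # O_SYNC: each individual write completes only once its data is on storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
    try:
        with ThreadPoolExecutor(max_workers=4) as pool:
            # "Submit" phase: hand over all requests without waiting on any.
            pending = [
                pool.submit(os.pwrite, fd, block, i * BLOCK)
                for i, block in enumerate(blocks)
            ]
            # "Reap" phase: result() re-raises the OSError of a failed write,
            # so we cannot proceed without learning the outcome of each one.
            return sum(f.result() for f in as_completed(pending))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = [bytes([65 + i]) * BLOCK for i in range(3)]
        print(write_blocks_concurrently(os.path.join(tmp, "blocks.dat"), data))
```

The point of the sketch is that "asynchronous" only describes the gap between submitting and reaping. Because the file is opened with O_SYNC and every result is checked, nothing is treated as written until it actually is.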

Sources

Linux System Programming - ablog

Linux System Programming


  • Chapter 4. Advanced File I/O
    • Synchronized, Synchronous, and Asynchronous Operations

Unix systems use the terms synchronized, nonsynchronized, synchronous, and asynchronous freely, without much regard to the fact that they are confusing—in English, the differences between “synchronous” and “synchronized” do not amount to much!
A synchronous write operation does not return until the written data is—at least—stored in the kernel’s buffer cache. A synchronous read operation does not return until the read data is stored in the user-space buffer provided by the application. On the other side of the coin, an asynchronous write operation may return before the data even leaves user space; an asynchronous read operation may return before the read data is available. That is, the operations may not actually take place when requested, but only be queued for later. Of course, in this case, some mechanism must exist for determining when the operation has actually completed and with what level of success.
A synchronized operation is more restrictive and safer than a merely synchronous operation. A synchronized write operation flushes the data to disk, ensuring that the on-disk data is always synchronized vis-à-vis the corresponding kernel buffers. A synchronized read operation always returns the most up-to-date copy of the data, presumably from the disk.
In sum, the terms synchronous and asynchronous refer to whether I/O operations wait for some event (e.g., storage of the data) before returning. The terms synchronized and nonsynchronized, meanwhile, specify exactly what event must occur (e.g., writing the data to disk).
Normally, Unix write operations are synchronous and nonsynchronized; read operations are synchronous and synchronized.[17] For write operations, every combination of these characteristics is possible, as Table 4-1 illustrates.

  • Table 4-1. Synchronicity of write operations
|              | Synchronized | Nonsynchronized |
|--------------|--------------|-----------------|
| Synchronous  | Write operations do not return until the data is flushed to disk. This is the behavior if O_SYNC is specified during file open. | Write operations do not return until the data is stored in kernel buffers. This is the usual behavior. |
| Asynchronous | Write operations return as soon as the request is queued. Once the write operation ultimately executes, the data is guaranteed to be on disk. | Write operations return as soon as the request is queued. Once the write operation ultimately executes, the data is guaranteed to at least be stored in kernel buffers. |

Read operations are always synchronized, as reading stale data makes little sense. Such operations can be either synchronous or asynchronous, however, as illustrated in Table 4-2.

  • Table 4-2. Synchronicity of read operations
|              | Synchronized |
|--------------|--------------|
| Synchronous  | Read operations do not return until the data, which is up-to-date, is stored in the provided buffer (this is the usual behavior). |
| Asynchronous | Read operations return as soon as the request is queued, but when the read operation ultimately executes, the data returned is up-to-date. |

In Chapter 2, we discussed how to make writes synchronized (via the O_SYNC flag) and how to ensure that all I/O is synchronized as of a given point (via fsync() and friends). Now, let’s look at what it takes to make reads and writes asynchronous.

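Notes

  • This post focuses on writes.
  • It is about disk I/O; network I/O is not covered.

*1: Being fully accurate would make this long-winded, so I have written it roughly.

*2: Even more so in Japanese...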