The Linux page cache is the portion of RAM that the kernel uses to cache file content from disk. On the first read of a file, the kernel loads the data into the page cache; subsequent reads of the same data are served from RAM without touching the disk, as long as those pages have not been evicted. Write-back: a write to a file only updates the page cache, marking the affected pages dirty; the kernel flushes dirty pages to disk periodically, when the amount of dirty memory crosses its thresholds, or when fsync() is called.
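The page cache and its dirty pages can be observed from user space. The sketch below is a minimal illustration, assuming a Linux system that exposes /proc/meminfo with the usual Cached, Buffers, Dirty and Writeback fields; the function names parse_meminfo and page_cache_summary are hypothetical helpers, not part of any library.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_summary(meminfo_path: str = "/proc/meminfo") -> dict[str, int]:
    """Return the page-cache-related counters (in kB) from the running kernel."""
    info = parse_meminfo(Path(meminfo_path).read_text())
    wanted = ("Cached", "Buffers", "Dirty", "Writeback")
    return {key: info.get(key, 0) for key in wanted}


if __name__ == "__main__":
    # Only meaningful on Linux; skipped elsewhere.
    if Path("/proc/meminfo").exists():
        for key, kb in page_cache_summary().items():
            print(f"{key:>10}: {kb} kB")
```

Running it before and after a large write should show Dirty rising and then falling again as the kernel writes the pages back, although the exact numbers depend on the system.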
This means writes are typically fast (they go to RAM), but the data is not safely on disk until it is flushed. Why I/O is fast: the page cache absorbs most I/O, so a database with a small shared_buffers can still perform well because the OS page cache compensates. free -h shows the page cache under buff/cache (that column also counts kernel buffers and reclaimable slab memory). vmstat reports bi/bo, blocks in and blocks out, which is the I/O that actually reaches the disk. To benchmark real disk performance, drop the cache first (as root):
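sync; echo 3 > /proc/sys/vm/drop_caches
sync runs first because drop_caches only discards clean pages; the value 3 drops the page cache together with dentries and inodes.
Write-back vs write-through: write-back (the default) is fast but can lose data on power failure; write-through (O_SYNC, or an explicit fsync after each write) is slower but durable.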
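To make the write-back versus write-through trade-off concrete, here is a rough sketch that times the same writes with and without O_SYNC. It assumes a Unix-like system where os.O_SYNC is available; timed_writes and write_all are hypothetical helper names, and the timings are only meaningful on a real disk-backed filesystem (a tmpfs temp directory will show little difference).

```python
import os
import tempfile
import time


def write_all(fd: int, data: bytes) -> None:
    """Write every byte of `data`, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        view = view[os.write(fd, view):]


def timed_writes(path: str, extra_flags: int, count: int = 200, size: int = 4096) -> float:
    """Write `count` blocks of `size` bytes to `path`; return elapsed seconds.

    extra_flags=0 leaves the default write-back behaviour (data lands in the
    page cache); extra_flags=os.O_SYNC makes each write wait for the device.
    """
    block = b"x" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            write_all(fd, block)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        buffered = timed_writes(os.path.join(tmp, "write_back.dat"), 0)
        synced = timed_writes(os.path.join(tmp, "write_through.dat"), os.O_SYNC)
        print(f"write-back:    {buffered:.4f} s")
        print(f"write-through: {synced:.4f} s")
```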
Databases (PostgreSQL, MySQL) call fsync on their write-ahead logs to guarantee durability.
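The append-then-fsync pattern behind a write-ahead log can be sketched as follows. This is a simplified illustration of the idea, not how PostgreSQL or MySQL implement their logs; open_log and append_durably are hypothetical names, and a real implementation would typically also fsync the containing directory after creating the file so that the directory entry itself is durable.

```python
import os
import tempfile


def open_log(path: str) -> int:
    """Open (or create) an append-only log file and return its descriptor."""
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)


def append_durably(fd: int, record: bytes) -> None:
    """Append one record and return only after fsync() has completed.

    Until fsync() returns, the record may exist only as dirty pages in the
    page cache and could be lost on power failure.
    """
    view = memoryview(record)
    while len(view) > 0:
        view = view[os.write(fd, view):]  # retry after short writes
    os.fsync(fd)  # ask the kernel to flush this file's dirty pages to the device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fd = open_log(os.path.join(tmp, "wal.log"))
        try:
            append_durably(fd, b"BEGIN\n")
            append_durably(fd, b"COMMIT\n")
        finally:
            os.close(fd)
```

Only after append_durably returns would a caller acknowledge the transaction, which is why commit latency is tied to fsync latency rather than to the speed of a write into RAM.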