The Rio File Cache: Surviving Operating System Crashes
Reading notes
Overview
- The goal of the Rio file cache is to enable memory to survive operating system crashes without writing data to disk.
- To achieve reliability, Rio protects file cache memory during a crash and restores it during the reboot that follows (a "warm" reboot).
- Put differently, Rio aims for the performance of main memory with the reliability of disk: write-back performance with write-through reliability.
- Method: eliminating all reliability-induced writes to disk
- Trade-off: performance versus reliability
- Applications requiring high reliability, such as transaction processing, write data synchronously through to disk, but this limits throughput to that of disk.
- Optimizations such as logging and group commit can increase effective disk throughput, but disk throughput is still far slower than memory throughput.
- Asynchronous writes to disk allow a greater degree of overlap between CPU time and I/O time. Unfortunately, they make no firm guarantees about when the data is safe on disk.
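The difference between the synchronous and asynchronous paths above can be sketched with ordinary file I/O. This is a minimal illustration, not code from the paper; the function names `write_sync` and `write_async` are made up for these notes. `os.fsync` is the standard call that blocks until the operating system reports the data is on stable storage.

```python
import os


def write_sync(path: str, data: bytes) -> None:
    """Write data and block until the OS reports it has reached the disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer into the OS file cache
        os.fsync(f.fileno())   # reliability-induced write: wait for the disk


def write_async(path: str, data: bytes) -> None:
    """Write data into the OS file cache and return without waiting for the disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # data is in the file cache; no guarantee it is on disk yet
```

Both functions leave the same bytes visible to later readers. They differ only in when the data is guaranteed to survive a crash, which is the gap Rio tries to close without paying for the `fsync`.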
2. Design and Implementation of a Reliable File Cache
The first step in enabling the file cache to survive a crash is to protect it from unauthorized stores. Two methods:
- virtual memory protection
- code patching (slow)
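The protection idea can be sketched as a toy model: file cache pages are read-only by default, and only the file cache's own write routine briefly opens a write window. The class and method names below are hypothetical, and a Python flag stands in for the page-table protection bits that real virtual memory protection would use.

```python
class ProtectedCache:
    """Toy model: pages reject stores unless a file-cache routine has unprotected them."""

    def __init__(self, num_pages: int, page_size: int = 8192) -> None:
        self._page_size = page_size
        self._pages = [bytearray(page_size) for _ in range(num_pages)]
        self._writable = [False] * num_pages  # stand-in for per-page protection bits

    def store(self, page: int, offset: int, data: bytes) -> None:
        """Any store issued by kernel code; fails if the page is protected."""
        if not self._writable[page]:
            raise PermissionError(f"unauthorized store to page {page}")
        if offset < 0 or offset + len(data) > self._page_size:
            raise ValueError("store crosses the page boundary")
        self._pages[page][offset:offset + len(data)] = data

    def cache_write(self, page: int, offset: int, data: bytes) -> None:
        """Authorized path: unprotect the page, write, then protect it again."""
        self._writable[page] = True
        try:
            self.store(page, offset, data)
        finally:
            self._writable[page] = False

    def read(self, page: int, offset: int, length: int) -> bytes:
        """Reads are always allowed."""
        return bytes(self._pages[page][offset:offset + length])
```

A call such as `cache.cache_write(0, 0, b"hello")` succeeds, while a stray `cache.store(0, 0, b"oops")` from buggy code raises `PermissionError` and leaves the page untouched. Code patching would reach the same effect by inserting an equivalent check before every store, which is why the notes mark it as slow.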
The second step in enabling the file cache to survive a crash is to do a warm reboot.
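A warm reboot has to find the file data inside the preserved memory image and put it back. The sketch below assumes a registry that records, for each cached buffer, which file it belongs to and where it sits in memory. That registry layout, the `RegistryEntry` fields, and the `restore_from_dump` function are assumptions made for illustration; these notes do not describe the actual mechanism.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical record describing one file cache buffer."""
    file_id: str      # which file the buffer belongs to
    file_offset: int  # where the buffer's data lives within that file
    mem_addr: int     # where the buffer sits in the memory dump
    size: int         # number of valid bytes in the buffer


def restore_from_dump(dump: bytes, registry: List[RegistryEntry]) -> Dict[str, bytearray]:
    """Rebuild file contents from a memory dump taken during a warm reboot."""
    files: Dict[str, bytearray] = {}
    for entry in registry:
        end = entry.mem_addr + entry.size
        if entry.mem_addr < 0 or end > len(dump):
            continue  # skip entries that point outside the dump
        buf = files.setdefault(entry.file_id, bytearray())
        needed = entry.file_offset + entry.size
        if len(buf) < needed:
            buf.extend(bytes(needed - len(buf)))  # zero-fill up to the required length
        buf[entry.file_offset:needed] = dump[entry.mem_addr:end]
    return files
```

The design point the sketch tries to capture is that the restore step needs metadata which itself survives the crash, since the memory image alone does not say which bytes belong to which file.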
4. Performance
- The main benefit of Rio discussed so far is reliability: all writes to the file cache are immediately as permanent and safe as files on disk.
Class notes
Overview
Goal:
- the performance of main memory with the reliability of disk
- Write-back performance with write-through reliability
Write-through: each write is done synchronously to both the cache and the backing store.
Write-back (or write-behind): a write initially goes only to the cache. The write to the backing store is postponed until the cache blocks containing the data are about to be modified or replaced by new content.
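The two policies can be compared by counting writes to the backing store. This is a minimal sketch with made-up class names. It does not model cache capacity or eviction, so an explicit `flush()` stands in for the moment when dirty blocks are replaced.

```python
class BackingStore:
    """Stand-in for the disk; counts how many writes reach it."""

    def __init__(self) -> None:
        self.blocks = {}
        self.writes = 0

    def write(self, block: int, data: bytes) -> None:
        self.blocks[block] = data
        self.writes += 1


class WriteThroughCache:
    """Every write goes to the cache and, synchronously, to the backing store."""

    def __init__(self, store: BackingStore) -> None:
        self.store = store
        self.cache = {}

    def write(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        self.store.write(block, data)


class WriteBackCache:
    """Writes go to the cache only; dirty blocks reach the store on flush."""

    def __init__(self, store: BackingStore) -> None:
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        self.dirty.add(block)

    def flush(self) -> None:
        for block in sorted(self.dirty):
            self.store.write(block, self.cache[block])
        self.dirty.clear()
```

Writing the same block three times costs three backing-store writes under write-through. Under write-back it costs none until `flush()`, and then only one, but everything still marked dirty is lost if the cache does not survive a crash.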
How Rio achieves performance
- Eliminates reliability-induced writes to disk (sync writes)
- Recall Ousterhout (author) and the LFS papers
How Rio achieves reliability
- Rio protects file cache memory during normal operation
- Restores file cache contents on ‘warm’ reboot
Reboots can be either cold or warm. In a cold (or hard) reboot, the power to the system is physically turned off and back on again, causing an initial boot of the machine. In a warm (or soft) reboot, the system restarts without interrupting the power.