I am writing an application that really cares about durability of created files (a Certificate Transparency log), and... oof.
I fsync the file. I fsync the directory. Ok.
But... how do I test it? Even targeting a specific filesystem, I have to make VMs and try to race killing them?
Comments
https://jepsen.io/
It involved drives acquired from a market somewhere in Asia that would always say "OK", but the data was never written. Or they managed to fake the storage?
But I think (hope) that's not the norm. 😂
https://github.com/FiloSottile/sunlight/pull/30/files#diff-3ebed9953cad6795070bcec5f4141e48a1e77f2d4b979411664d8a4e43c41331
Got lots of good testing recs. I think the strategy is going to be LazyFS or ALICE or Gosim or dm-log-writes in CI to test the application, and manual power cuts in production to test hw and fs.
Ship a list of things you think you've committed off-machine, cut the power, see if anything the OS claims is persisted fails to be there, repeat until satisfied.
I think typically it’s used on a slightly higher level (killing processes), but I imagine not super hard to make it work with e.g. Xen and kernel panics. Or maybe even electric relays to shut off power.
The Tigerbeetle team also often talks about FS durability and simulation testing (https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/internals/ARCHITECTURE.md#direct-io).
"A FUSE file system with an internal dedicated page cache that only flushes data if explicitly requested by the application. This is useful for simulating power failures and losing unsynced data."
https://github.com/dsrhaslab/lazyfs
Apparently used by Jepsen:
https://github.com/jellevandenhooff/gosim
I tried an example of simulating crashing a server after writing data to disk with os.File.Sync vs. "yolo" (without os.File.Sync):
https://go.dev/play/p/1L1pXCLh5_k
Just a quick test.
It doesn't work in the playground (you need to run it locally -- I put instructions in the playground link).
I'd love to see it continue to progress (part of which might be more people trying it out 😅).
I will probably still wire a relay to the production machine once, but I might do CI based on one or both of these.
I *think* it doesn't even try to keep the dir inode in cache like Postgres to avoid losing a write-through error.
The challenge will be designing correct semantics on top of fsync :/.