For a while there, mutt would fsync() after every message it wrote to a folder. Imagine moving 10,000 messages to another folder. Yes, mutt should probably have just fsync()d after it was done writing everything, but even that is annoying. I don't even want to wait for that. I want an interface that responds as fast as damn well possible (see threading below).
LILO back in 1998 used to install in a split second. Boom, done, reboot. Recently (before I started using grub), it changed into some beast that would fsync() so many times during an install that it would take about 5 seconds to run. The result is a higher chance of a power failure or crash while it is fsync()ing, meaning that the time the system is unbootable has been increased, meaning that the use of fsync() has actually made it LESS reliable. Oops! But I really cared about that data!
mythtv (and most syslogs by default) opens O_SYNC or uses fsync() on all log files, and mythtv fsync()s its video files so often that I can't sit in the same room as the hard drive, which grinds every second or more. This is incredibly frustrating. When pressing the up arrow in the listings, it decides to log some silly error which grinds away at the hard drive. Argh!
This doesn't go away as we move to SSDs! Instead of making noise, it just wears out the flash! Really, we don't actually want to write this often.
There were just so many cases where the speed or responsiveness impact from the use of fsync() annoyed me that I ended up writing an LD_PRELOAD library that no-ops fsync(), fdatasync(), and O_SYNC. I have been running this on every desktop I have since around January, 2000, and I have yet to lose any data as a result. This is as a DESKTOP user. Sure, I'm not recording bank transactions.
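Such a shim is short to write. Here is a minimal sketch of the idea (my actual library's source isn't shown here, so take this as an illustration, not the real thing): define fsync() and fdatasync() as no-ops that report success, build it as a shared object, and LD_PRELOAD it so the dynamic linker resolves those symbols to the no-ops instead of libc's. Swallowing O_SYNC would additionally require wrapping open(), which this sketch omits.

```c
/* noop_fsync.c -- a sketch of an LD_PRELOAD fsync no-op shim.
 *
 * Build:  gcc -shared -fPIC -o noop_fsync.so noop_fsync.c
 * Use:    LD_PRELOAD=./noop_fsync.so mutt
 */
#include <unistd.h>

/* Because the preloaded library is searched before libc, these
 * definitions shadow the real fsync()/fdatasync().  We never call
 * the real ones at all; the kernel's normal write-back still flushes
 * the data eventually, just on its own schedule. */
int fsync(int fd)
{
    (void)fd;       /* ignore the descriptor entirely */
    return 0;       /* lie: "yes, it's on disk" */
}

int fdatasync(int fd)
{
    (void)fd;
    return 0;
}
```

The danger is exactly what you'd expect: the application believes data is durable when it isn't, which is why I only recommend this for desktop workloads where nothing irreplaceable hinges on a single flush.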
YES, there needs to be a way for an application to know when something is written. YES, there needs to be a way to ask for it to happen soon, please. But blocking is the worst possible implementation for an application developer.
What about some sort of notification event when the data is actually on disk? To me, the needs of an application always seemed to be more along the lines of "let me know when that's on disk so I can forget about it", like the way an ACK back to a TCP stack lets it forget those bytes in its window, or grouping some changes into an atomic transaction or ordered commits (eg: mutt not wanting to write the new msgs, delete the old, and lose both because of write ordering).
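Something close to this notification model actually exists in POSIX AIO: aio_fsync() queues a flush and returns immediately, and the aiocb's sigevent can deliver a signal or thread callback on completion. A sketch (polling for simplicity; a real event-driven program would use SIGEV_SIGNAL or SIGEV_THREAD instead of the poll loop):

```c
/* Sketch: "queue the flush, keep working, get told when it's durable"
 * using POSIX AIO.  Link with -lrt on older glibc. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Submit an asynchronous flush of fd.  Returns 0 on submit; does NOT
 * block waiting for the disk. */
static int queue_flush(int fd, struct aiocb *cb)
{
    memset(cb, 0, sizeof *cb);
    cb->aio_fildes = fd;
    cb->aio_sigevent.sigev_notify = SIGEV_NONE; /* we poll below; SIGEV_THREAD
                                                   would give a true callback */
    return aio_fsync(O_SYNC, cb);
}

/* Wait for the queued flush to finish.  Returns 0 once the data is on
 * disk -- the moment the application can "forget about it", like a TCP
 * stack dropping ACKed bytes from its window. */
static int wait_flushed(struct aiocb *cb)
{
    while (aio_error(cb) == EINPROGRESS)
        usleep(1000);   /* the app would do real work here instead */
    return (int)aio_return(cb);
}
```

It gives you the "let me know" half; what it still doesn't give you is the ordering half (write B, then delete A, in that order), which is the part mutt actually needs.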
Does a Firefox user care that they might lose a site in their history if their machine crashes a few seconds after loading the page? No. What they care about is if their entire history disappears (or gets corrupted due to lack of ordering and has the same effect).
You say "the same thread to draw the UI and do IO. That's plain stupid". Consider the alternative! Mutt would have to become a multithreaded application ONLY so that it could wait on the disk in the background and still be responsive. As you well know, threaded programming is hard. Most people get it wrong.
All OSes already have an asynchronous write-back queue (dirty pages and all these write-back timers and VM heuristics). These exist because UNIX is not DOS. Blocking on the creation of data is just not feasible for performance. The distinction is between writing streams and writing records. It sure as hell doesn't make sense to spawn a thread every time one wants to write anything to disk. So where do we draw the line? Bash doesn't know that your writing to a file is uber-important. Do we add --but-please-fsync-it-because-i-like-it to every shell utility? Do we run "sync" between every step? No, because the power supply could explode at any time anyway.
Imagine an SMTP server where it can pile up a bunch of writes to be checkpointed, ask it to happen sometime soon (not in that order), and be notified when it's on disk so that it can write back "OK" to the sender. This would let any mail server be completely single-threaded (or at least have no need for multiple threads except for more worthy, CPU-bound tasks). The same applies to nearly any daemon or application that needs to write to disk and still be responsive. (Reading, eg., Apache, is another problem, hence my surprise that there is no wakeup support for O_NONBLOCK on files, but that's another story.)
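To make the batching concrete, here is a hypothetical single-threaded sketch (all names are mine, not any real server's): each accepted message is appended to a spool and its sender remembered; one flush then covers the whole batch, and only after that flush succeeds do the "250 OK" replies go out. The fsync() here stands in for the async notification described above.

```c
/* Hypothetical sketch: batch spooled messages, checkpoint once,
 * then acknowledge every sender in the batch. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BATCH_MAX 16

struct pending {
    int   sender_fd;    /* where the "250 OK" reply goes */
    off_t spooled_at;   /* offset of this message in the spool */
};

static struct pending batch[BATCH_MAX];
static int nbatch = 0;

/* Called per accepted message: append to the spool, remember the ACK.
 * No flush here -- that's the whole point. */
static void spool_message(int spool_fd, int sender_fd, const char *msg)
{
    batch[nbatch].sender_fd  = sender_fd;
    batch[nbatch].spooled_at = lseek(spool_fd, 0, SEEK_END);
    write(spool_fd, msg, strlen(msg));
    nbatch++;
}

/* One flush covers the whole batch; only after it succeeds do the
 * replies go out, so "OK" never outruns durability.  Returns the
 * number of messages acknowledged, or -1 on flush failure. */
static int checkpoint_and_ack(int spool_fd)
{
    if (fsync(spool_fd) != 0)   /* in the ideal API this would be async */
        return -1;
    int acked = nbatch;
    for (int i = 0; i < nbatch; i++)
        dprintf(batch[i].sender_fd, "250 OK\r\n");
    nbatch = 0;
    return acked;
}
```

Moving 10,000 messages then costs one flush instead of 10,000, and the event loop never blocks between accepts.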
Anyway, those are my feelings on the topic. "fsync the data you care about", but also realize how annoying, wasteful, and counter-productive it can be in some cases to do so. Most people don't have data that they care about that much anyway. I just want "cp a b; rm a" to at least leave a or b around. I know POSIX doesn't guarantee it, but don't you agree that "cp a b ; sync ; rm a" seems like overkill?
(No, I don't know how to implement a callback API for shell programs. Hmm, I wonder why developers would like implicit ordering...)