mutex: force serialization on mutex_exit() to fix races
Linux mutexes are known to be unsafe when used to synchronize the
freeing of the object in which the mutex is embedded:
http://lwn.net/Articles/575477/
The known places in ZFS which are suspected to suffer from the race
condition are zio->io_lock and dbuf->db_mtx.
* zio uses zio->io_lock and zio->io_cv to synchronize freeing
between zio_wait() and zio_done().
* dbuf uses dbuf->db_mtx to protect reference counting.
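The zio pattern above can be sketched in userspace C. This is only an
illustration with pthreads, not ZFS code: struct obj, obj_done(),
obj_wait_and_free() and run_once() are hypothetical names, and POSIX
mutexes are used purely as stand-ins for the kernel mutex. The sketch
shows the shape of the usage (lock embedded in the object, waiter frees
the object right after the lock is released), not a reproduction of the
kernel race.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a zio: the lock lives inside the object. */
struct obj {
	pthread_mutex_t lock;	/* plays the role of zio->io_lock */
	pthread_cond_t  cv;	/* plays the role of zio->io_cv */
	int             done;
};

/* Completion side, analogous to zio_done(). */
static void *obj_done(void *arg)
{
	struct obj *o = arg;

	pthread_mutex_lock(&o->lock);
	o->done = 1;
	pthread_cond_broadcast(&o->cv);
	/*
	 * With a Linux kernel mutex, the unlock path may still touch the
	 * lock after the waiter has acquired it, so the waiter could free
	 * the object while this call is still in flight.
	 */
	pthread_mutex_unlock(&o->lock);
	return NULL;
}

/* Waiting side, analogous to zio_wait(): wait, then free the object. */
static int obj_wait_and_free(struct obj *o)
{
	pthread_mutex_lock(&o->lock);
	while (!o->done)
		pthread_cond_wait(&o->cv, &o->lock);
	pthread_mutex_unlock(&o->lock);

	pthread_mutex_destroy(&o->lock);
	pthread_cond_destroy(&o->cv);
	free(o);	/* the embedded lock disappears with the object */
	return 0;
}

/* Runs one wait/done cycle; returns 0 on success, -1 on setup failure. */
static int run_once(void)
{
	pthread_t t;
	int rc;
	struct obj *o = calloc(1, sizeof(*o));

	if (o == NULL)
		return -1;
	pthread_mutex_init(&o->lock, NULL);
	pthread_cond_init(&o->cv, NULL);
	if (pthread_create(&t, NULL, obj_done, o) != 0) {
		pthread_mutex_destroy(&o->lock);
		pthread_cond_destroy(&o->cv);
		free(o);
		return -1;
	}
	rc = obj_wait_and_free(o);
	pthread_join(t, NULL);
	return rc;
}
```

The window of interest is between the completion side releasing the
lock and its unlock call returning; the waiter's free() can land inside
that window.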
This patch fixes this kind of race by forcing serialization on
mutex_exit() with a spin lock, making the mutex safe at the cost of
a little performance and memory.
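The idea of serializing the exit path can be sketched as follows. This
is a userspace approximation with pthreads under stated assumptions, not
the patch itself: sketch_mutex_t, its field names, and the demo helpers
are all hypothetical, and the actual kernel implementation may differ.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pthread_spin_* under strict -std */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: a mutex plus a spin lock that serializes exits. */
typedef struct sketch_mutex {
	pthread_mutex_t    m_mutex;
	pthread_spinlock_t m_exit_lock;
} sketch_mutex_t;

static int sketch_mutex_init(sketch_mutex_t *mp)
{
	if (pthread_mutex_init(&mp->m_mutex, NULL) != 0)
		return -1;
	if (pthread_spin_init(&mp->m_exit_lock, PTHREAD_PROCESS_PRIVATE) != 0) {
		pthread_mutex_destroy(&mp->m_mutex);
		return -1;
	}
	return 0;
}

static void sketch_mutex_enter(sketch_mutex_t *mp)
{
	pthread_mutex_lock(&mp->m_mutex);
}

static void sketch_mutex_exit(sketch_mutex_t *mp)
{
	int rc;

	/*
	 * The whole unlock runs under the spin lock, so a later exit by
	 * another thread cannot start until this unlock has returned.
	 */
	pthread_spin_lock(&mp->m_exit_lock);
	rc = pthread_mutex_unlock(&mp->m_mutex);
	pthread_spin_unlock(&mp->m_exit_lock);
	assert(rc == 0);
	(void)rc;
}

#define SKETCH_THREADS 4
#define SKETCH_ITERS   10000

static sketch_mutex_t demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < SKETCH_ITERS; i++) {
		sketch_mutex_enter(&demo_lock);
		demo_counter++;
		sketch_mutex_exit(&demo_lock);
	}
	return NULL;
}

/* Returns the final counter value, or -1 if setup failed. */
static long run_counter_demo(void)
{
	pthread_t t[SKETCH_THREADS];
	int started = 0;

	demo_counter = 0;
	if (sketch_mutex_init(&demo_lock) != 0)
		return -1;
	while (started < SKETCH_THREADS &&
	    pthread_create(&t[started], NULL, demo_worker, NULL) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(t[i], NULL);
	return (started == SKETCH_THREADS) ? demo_counter : -1;
}
```

Under this scheme a thread that is about to free the object has to pass
through its own exit first, and that exit cannot take the spin lock
until the previous holder's unlock has completely finished. The enter
path is unchanged, so the extra cost is one spin lock acquisition per
exit plus the storage for the spin lock.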
This issue most commonly manifests itself as a deadlock in the zio
pipeline caused by a process spinning on the damaged mutex. Similar
deadlocks have been reported for the dbuf->db_mtx mutex. It can also
cause a NULL dereference or bad paging request under the right
circumstances.
This issue and many like it are linked from the zfsonlinux/zfs#2523
issue. Specifically this fix resolves at least the following
outstanding issues:
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #421