During zfs_rmnode on an xattr dir, if the system crashes just after
dmu_free_long_range, an empty xattr dir is left in the delete queue. When
zfs_purgedir processes that entry during the next mount, blkid=0 is passed
into zap_get_leaf_byblk, which then does rw_enter on the wrong structure and
locks up the system.
Fix this by returning ENOENT from zap_get_leaf_byblk when blkid is zero.
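The guard can be illustrated outside the kernel with a small user-space
sketch. Everything named mock_* below is hypothetical and only models the
control flow; it is not the real ZAP code, and the EINVAL bounds check is an
addition of the sketch. It assumes that block 0 of the object never holds a
leaf, so a lookup for blkid 0 must fail before anything is dereferenced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define	MOCK_NLEAVES	4

/* Hypothetical stand-in for a ZAP leaf; not the real zap_leaf_t. */
typedef struct mock_leaf {
	uint64_t ml_blkid;
} mock_leaf_t;

/*
 * Hypothetical stand-in for a ZAP object. Slot 0 is never handed out,
 * modelling the assumption that block 0 is not a leaf block.
 */
typedef struct mock_zap {
	mock_leaf_t mz_leaves[MOCK_NLEAVES];
} mock_zap_t;

/*
 * Simplified model of the guarded lookup: reject blkid 0 with ENOENT
 * before touching any leaf state, as the patch does.
 */
int
mock_get_leaf_byblk(mock_zap_t *zap, uint64_t blkid, mock_leaf_t **lp)
{
	assert(zap != NULL && lp != NULL);
	*lp = NULL;

	/* The guard from this change: nothing to look up for block 0. */
	if (blkid == 0)
		return (ENOENT);

	/* Bounds check that exists only in this mock. */
	if (blkid >= MOCK_NLEAVES)
		return (EINVAL);

	zap->mz_leaves[blkid].ml_blkid = blkid;
	*lp = &zap->mz_leaves[blkid];
	return (0);
}
```

A caller that sees ENOENT here can treat the directory as already empty,
while any other nonzero return is still a real error.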
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4114
Closes #4052
Closes #4006
Closes #3018
Closes #2861
ASSERT(RW_LOCK_HELD(&zap->zap_rwlock));
+ /*
+  * If system crashed just after dmu_free_long_range in zfs_rmnode, we
+  * would be left with an empty xattr dir in delete queue. blkid=0
+  * would be passed in when doing zfs_purgedir. If that's the case we
+  * should just return immediately. The underlying objects should
+  * already be freed, so this should be perfectly fine.
+  */
+ if (blkid == 0)
+ 	return (ENOENT);
+
err = dmu_buf_hold(zap->zap_objset, zap->zap_object,
    blkid << bs, NULL, &db, DMU_READ_NO_PREFETCH);
if (err)