Fix lockdep between ds_lock and dd_lock in dsl_dataset_namelen()
author mzhivich <33133421+mzhivich@users.noreply.github.com>
Mon, 11 Mar 2019 16:11:04 +0000 (12:11 -0400)
committer Brian Behlendorf <behlendorf1@llnl.gov>
Mon, 11 Mar 2019 16:11:04 +0000 (09:11 -0700)
Booting a debug kernel revealed an inconsistent lock dependency between
a dataset's ds_lock and its directory's dd_lock.

[ 32.215336] ======================================================
[ 32.221859] WARNING: possible circular locking dependency detected
[ 32.221861] 4.14.90+ #8 Tainted: G           O
[ 32.221862] ------------------------------------------------------
[ 32.221863] dynamic_kernel_/4667 is trying to acquire lock:
[ 32.221864]  (&ds->ds_lock){+.+.}, at: [<ffffffffc10a4bde>] dsl_dataset_check_quota+0x9e/0x8a0 [zfs]
[ 32.221941] but task is already holding lock:
[ 32.221941]  (&dd->dd_lock){+.+.}, at: [<ffffffffc10cd8e9>] dsl_dir_tempreserve_space+0x3b9/0x1290 [zfs]
[ 32.221983] which lock already depends on the new lock.
[ 32.221983] the existing dependency chain (in reverse order) is:
[ 32.221984] -> #1 (&dd->dd_lock){+.+.}:
[ 32.221992]  __mutex_lock+0xef/0x14c0
[ 32.222049]  dsl_dir_namelen+0xd4/0x2d0 [zfs]
[ 32.222093]  dsl_dataset_namelen+0x2f1/0x430 [zfs]
[ 32.222142]  verify_dataset_name_len+0xd/0x40 [zfs]
[ 32.222184]  dmu_objset_find_dp_impl+0x5f5/0xef0 [zfs]
[ 32.222226]  dmu_objset_find_dp_cb+0x40/0x60 [zfs]
[ 32.222235]  taskq_thread+0x969/0x1460 [spl]
[ 32.222238]  kthread+0x2fb/0x400
[ 32.222241]  ret_from_fork+0x3a/0x50

[ 32.222241] -> #0 (&ds->ds_lock){+.+.}:
[ 32.222246]  lock_acquire+0x14f/0x390
[ 32.222248]  __mutex_lock+0xef/0x14c0
[ 32.222291]  dsl_dataset_check_quota+0x9e/0x8a0 [zfs]
[ 32.222355]  dsl_dir_tempreserve_space+0x5d2/0x1290 [zfs]
[ 32.222392]  dmu_tx_assign+0xa61/0xdb0 [zfs]
[ 32.222436]  zfs_create+0x4e6/0x11d0 [zfs]
[ 32.222481]  zpl_create+0x194/0x340 [zfs]
[ 32.222484]  lookup_open+0xa86/0x16f0
[ 32.222486]  path_openat+0xe56/0x2490
[ 32.222488]  do_filp_open+0x17f/0x260
[ 32.222490]  do_sys_open+0x195/0x310
[ 32.222491]  SyS_open+0xbf/0xf0
[ 32.222494]  do_syscall_64+0x191/0x4f0
[ 32.222496]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 32.222497] other info that might help us debug this:

[ 32.222497] Possible unsafe locking scenario:
[ 32.222498]        CPU0                    CPU1
[ 32.222498]        ----                    ----
[ 32.222499]   lock(&dd->dd_lock);
[ 32.222500]                                lock(&ds->ds_lock);
[ 32.222502]                                lock(&dd->dd_lock);
[ 32.222503]   lock(&ds->ds_lock);
[ 32.222504] *** DEADLOCK ***
[ 32.222505] 3 locks held by dynamic_kernel_/4667:
[ 32.222506] #0: (sb_writers#9){.+.+}, at: [<ffffffffaf68933c>] mnt_want_write+0x3c/0xa0
[ 32.222511] #1: (&type->i_mutex_dir_key#8){++++}, at: [<ffffffffaf652cde>] path_openat+0xe2e/0x2490
[ 32.222515] #2: (&dd->dd_lock){+.+.}, at: [<ffffffffc10cd8e9>] dsl_dir_tempreserve_space+0x3b9/0x1290 [zfs]

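The issue is caused by dsl_dataset_namelen() holding ds_lock while
dsl_dir_namelen() acquires dd_lock on ds->ds_dir. That establishes the order
ds_lock -> dd_lock, the reverse of the dd_lock -> ds_lock order taken when
dsl_dir_tempreserve_space() calls dsl_dataset_check_quota().

However, ds->ds_dir should not be protected by ds_lock, so releasing ds_lock
before the call to dsl_dir_namelen() prevents the lockdep issue.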
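
The locking pattern described above can be sketched in user space. This is
only an illustration, not the ZFS code: demo_dir_t, demo_dataset_t,
demo_dir_namelen() and demo_dataset_namelen() are hypothetical stand-ins, and
pthread mutexes are used in place of the kernel mutex_enter()/mutex_exit()
calls. The sketch assumes, as the commit message states for ds->ds_dir, that
the directory pointer does not need ds_lock protection.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for a dataset directory (not the ZFS dsl_dir_t). */
typedef struct demo_dir {
	pthread_mutex_t dd_lock;	/* protects dd_name */
	char dd_name[64];
} demo_dir_t;

/* Hypothetical stand-in for a dataset (not the ZFS dsl_dataset_t). */
typedef struct demo_dataset {
	pthread_mutex_t ds_lock;	/* protects ds_snapname only */
	char ds_snapname[64];
	demo_dir_t *ds_dir;		/* assumed not to need ds_lock */
} demo_dataset_t;

/* Takes dd_lock internally, like dsl_dir_namelen() in the trace above. */
static size_t
demo_dir_namelen(demo_dir_t *dd)
{
	pthread_mutex_lock(&dd->dd_lock);
	size_t len = strlen(dd->dd_name);
	pthread_mutex_unlock(&dd->dd_lock);
	return (len);
}

/*
 * Reads the snapshot name under ds_lock, then drops ds_lock before
 * calling into code that takes dd_lock. The two locks are never held
 * at the same time here, so no ds_lock -> dd_lock ordering is created.
 */
static size_t
demo_dataset_namelen(demo_dataset_t *ds)
{
	pthread_mutex_lock(&ds->ds_lock);
	size_t len = strlen(ds->ds_snapname);
	pthread_mutex_unlock(&ds->ds_lock);

	/* add '@' if ds is a snap */
	if (len > 0)
		len++;
	len += demo_dir_namelen(ds->ds_dir);
	return (len);
}
```

Because ds_lock is released before dd_lock is requested, a second thread that
takes dd_lock first and then ds_lock cannot form a cycle with this function.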

Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Michael Zhivich <mzhivich@akamai.com>
Closes #8413

module/zfs/dsl_dataset.c

index 168aea861637d7c72800989385455ac0ccb62bed..ad944e5b8ea21c10093584a1fa32cbc9f7603cd9 100644
@@ -867,11 +867,11 @@ dsl_dataset_namelen(dsl_dataset_t *ds)
        VERIFY0(dsl_dataset_get_snapname(ds));
        mutex_enter(&ds->ds_lock);
        int len = strlen(ds->ds_snapname);
+       mutex_exit(&ds->ds_lock);
        /* add '@' if ds is a snap */
        if (len > 0)
                len++;
        len += dsl_dir_namelen(ds->ds_dir);
-       mutex_exit(&ds->ds_lock);
        return (len);
 }