OpenZFS 9425 - channel programs can be interrupted
author Don Brady <don.brady@delphix.com>
Sat, 22 Jun 2019 23:51:46 +0000 (16:51 -0700)
committer Brian Behlendorf <behlendorf1@llnl.gov>
Sat, 22 Jun 2019 23:51:46 +0000 (16:51 -0700)
commit 186898bbb580a830c02d994e961d717f7cf5dcca
tree 3af5af5af4d7bed1bafb671c86f3876f01e0dc57
parent cb9e5b7e84654a8c7dba0f9a0d1227f3c8fa1012

Problem Statement
=================
ZFS channel program scripts currently require a timeout, so that hung or
long-running scripts return a timeout error instead of wedging ZFS. This
limit can currently be set as high as 100 million Lua instructions. Even with
a limit in place, a system administrator (support engineer) should be able
to cancel a script that is taking a long time.

Proposed Solution
=================
Make it possible to abort a channel program by sending an interrupt signal. In
the underlying txg_wait_sync function, switch the cv_wait to a cv_wait_sig to
catch the signal. Once a signal is encountered, the dsl_sync_task function can
install a Lua hook that is called before the Lua interpreter executes a
new line of code. The dsl_sync_task function can then resume with a standard
txg_wait_sync call and wait for the txg to complete. Meanwhile, the hook aborts
the script and indicates that the channel program was canceled. The kernel
returns EINTR to indicate that the channel program run was canceled.
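
The control flow described above can be illustrated with a small userland
analogy. This is only a sketch under stated assumptions, not the code in this
commit: it uses no ZFS or Lua APIs, and toy_run_script, toy_abort_hook,
toy_on_sigint, and toy_cancel_requested are hypothetical names. A signal
handler records the interrupt, the "waiter" step installs a hook once it
notices the signal, and the hook aborts the toy script before its next line,
so the caller sees EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag: set when an interrupt signal has been delivered. */
static volatile sig_atomic_t toy_cancel_requested = 0;

static void toy_on_sigint(int signo)
{
    (void)signo;
    toy_cancel_requested = 1;
}

/* Hypothetical per-line hook type: nonzero return means "abort the script". */
typedef int (*toy_line_hook_t)(void);

/* Once installed, this hook always asks the interpreter to stop. */
static int toy_abort_hook(void)
{
    return 1;
}

/*
 * Toy "interpreter": runs nlines lines. If signal_at_line >= 0, SIGINT is
 * raised while that line runs, standing in for an administrator's interrupt.
 * Returns 0 on normal completion, EINTR if canceled, -1 on setup failure.
 * *lines_run receives the number of lines that actually executed.
 */
static int toy_run_script(int nlines, int signal_at_line, int *lines_run)
{
    toy_line_hook_t hook = NULL;
    int result = 0;
    void (*prev)(int);

    assert(lines_run != NULL);
    *lines_run = 0;
    toy_cancel_requested = 0;

    prev = signal(SIGINT, toy_on_sigint);
    if (prev == SIG_ERR)
        return -1;

    for (int line = 0; line < nlines; line++) {
        /* The hook runs before each new line, as in the description. */
        if (hook != NULL && hook()) {
            result = EINTR;
            break;
        }

        if (line == signal_at_line)
            raise(SIGINT);      /* the interrupt arrives mid-line */
        (*lines_run)++;

        /* Stand-in for the waiter noticing the signal: install the hook. */
        if (toy_cancel_requested)
            hook = toy_abort_hook;
    }

    signal(SIGINT, prev);       /* restore the previous disposition */
    return result;
}
```

Calling toy_run_script(5, 2, &n) lets the line that was running when the
signal arrived finish, then stops before the next one and returns EINTR;
passing -1 for signal_at_line runs all five lines and returns 0. The point of
the design is that the script is never torn down mid-line: cancellation only
takes effect at the next hook invocation.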

Porting notes: Added missing return value from cv_wait_sig()

Authored by: Don Brady <don.brady@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>
OpenZFS-issue: https://www.illumos.org/issues/9425
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/d0cb1fb926
Closes #8904
13 files changed:
include/spl/sys/condvar.h
include/sys/dsl_synctask.h
include/sys/txg.h
include/sys/zcp.h
include/sys/zfs_context.h
lib/libzpool/kernel.c
module/spl/spl-condvar.c
module/zfs/dsl_synctask.c
module/zfs/txg.c
module/zfs/zcp.c
tests/runfiles/linux.run
tests/zfs-tests/tests/functional/channel_program/synctask_core/Makefile.am
tests/zfs-tests/tests/functional/channel_program/synctask_core/tst.terminate_by_signal.ksh [new file with mode: 0755]