]> granicus.if.org Git - zfs/log
zfs
6 years agoHonor --with-mounthelperdir where applicable
LOLi [Sun, 17 Dec 2017 22:14:07 +0000 (23:14 +0100)]
Honor --with-mounthelperdir where applicable

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6962

6 years agoFix --with-systemd on Debian-based distributions (#6963)
LOLi [Sun, 17 Dec 2017 22:08:48 +0000 (23:08 +0100)]
Fix --with-systemd on Debian-based distributions (#6963)

These changes propagate the "--with-systemd" configure option to the
RPM spec file, allowing Debian-based distributions to package
systemd-related files.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6591
Closes #6963

6 years agoRemove lib/libspl/include/sys/frame.h
Brian Behlendorf [Sun, 17 Dec 2017 22:02:29 +0000 (14:02 -0800)]
Remove lib/libspl/include/sys/frame.h

The functionality provided by this header is not required by any
of the ZFS user space code.  Minimal functionality was provided
in commit c28a677 which added include/sys/frame.h.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6960
Closes #6972

6 years agoEnable QAT support in zfs-dkms RPM
David Qian [Thu, 7 Dec 2017 08:43:13 +0000 (16:43 +0800)]
Enable QAT support in zfs-dkms RPM

Enable QAT accelerated gzip compression in zfs-dkms RPM package when
environment variant ICP_ROOT is set to QAT drive source code folder
and QAT hardware presence.  Otherwise, use default gzip compression.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: David Qian <david.qian@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6932

6 years agoAdd zfs-import.target services in spec file
Lalufu [Thu, 14 Dec 2017 01:09:22 +0000 (02:09 +0100)]
Add zfs-import.target services in spec file

Add missing zfs-import.target to list of systemd services in zfs
RPM spec file.

Reviewed-by: Niklas Wagner <Skaro@Skaronator.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Issue #6953
Closes #6955

6 years agoVarious ZED fixes
LOLi [Sat, 9 Dec 2017 00:58:41 +0000 (01:58 +0100)]
Various ZED fixes

* Teach ZED to handle spares usingi the configured ashift: if the zpool
   'ashift' property is set then ZED should use its value when kicking
   in a hotspare; with this change 512e disks can be used as spares
   for VDEVs that were created with ashift=9, even if ZFS natively
   detects them as 4K block devices.

 * Introduce an additional auto_spare test case which verifies that in
   the face of multiple device failures an appropiate number of spares
   are kicked in.

 * Fix zed_stop() in "libtest.shlib" which did not correctly wait the
   target pid.

 * Fix ZED crashing on startup caused by a race condition in libzfs
   when used in multi-threaded context.

 * Convert ZED over to using the tpool library which is already present
   in the Illumos FMA code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #2562
Closes #6858

6 years agoDisable vdev_zaps_004_pos
Brian Behlendorf [Fri, 8 Dec 2017 00:43:59 +0000 (16:43 -0800)]
Disable vdev_zaps_004_pos

Occasionally observed failure of vdev_zaps_004_pos due to the test
case not being 100% reliable.  In order to prevent false positives
disable this test case until it can be made reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #6935
Closes #6936

6 years agoSuppress incorrect objtool warnings
Brian Behlendorf [Thu, 7 Dec 2017 18:28:50 +0000 (10:28 -0800)]
Suppress incorrect objtool warnings

Suppress incorrect warnings from versions of objtool which are not
aware of x86 EVEX prefix instructions used for AVX512.

  module/zfs/vdev_raidz_math_avx512bw.o: warning:
  objtool: <func+offset>: can't find jump dest instruction at .text

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6928

6 years agoFix segfault in zpool iostat when adding VDEVs
Tony Hutter [Wed, 6 Dec 2017 19:43:07 +0000 (11:43 -0800)]
Fix segfault in zpool iostat when adding VDEVs

Fix a segfault when running 'zpool iostat -v 1' while adding
a VDEV.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #6748
Closes #6872

6 years agoOpenZFS 8603 - rename zilog's "zl_writer_lock" to "zl_issuer_lock"
Prakash Surya [Fri, 1 Sep 2017 18:04:26 +0000 (11:04 -0700)]
OpenZFS 8603 - rename zilog's "zl_writer_lock" to "zl_issuer_lock"

This is a purely cosmetic change. The zilog's "zl_writer_lock" field is
being renamed to "zl_issuer_lock" to try and make the code easier to
understand; no other changes are made.

Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: C Fraire <cfraire@me.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8603
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2daf06546b
Closes #6927

6 years agoDisable create-o_ashift
Brian Behlendorf [Wed, 6 Dec 2017 18:13:54 +0000 (10:13 -0800)]
Disable create-o_ashift

Occasionally observed failure of create-o_ashift due to the test
case not being 100% reliable.  In order to prevent false positives
disable this test case until it can be made reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #6924
Closes #6925

6 years agoOpenZFS 8585 - improve batching done in zil_commit()
Prakash Surya [Tue, 5 Dec 2017 17:39:16 +0000 (09:39 -0800)]
OpenZFS 8585 - improve batching done in zil_commit()

Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Prakash Surya <prakash.surya@delphix.com>
Problem
=======

The current implementation of zil_commit() can introduce significant
latency, beyond what is inherent due to the latency of the underlying
storage. The additional latency comes from two main problems:

 1. When there's outstanding ZIL blocks being written (i.e. there's
    already a "writer thread" in progress), then any new calls to
    zil_commit() will block waiting for the currently oustanding ZIL
    blocks to complete. The blocks written for each "writer thread" is
    coined a "batch", and there can only ever be a single "batch" being
    written at a time. When a batch is being written, any new ZIL
    transactions will have to wait for the next batch to be written,
    which won't occur until the current batch finishes.

    As a result, the underlying storage may not be used as efficiently
    as possible. While "new" threads enter zil_commit() and are blocked
    waiting for the next batch, it's possible that the underlying
    storage isn't fully utilized by the current batch of ZIL blocks. In
    that case, it'd be better to allow these new threads to generate
    (and issue) a new ZIL block, such that it could be serviced by the
    underlying storage concurrently with the other ZIL blocks that are
    being serviced.

 2. Any call to zil_commit() must wait for all ZIL blocks in its "batch"
    to complete, prior to zil_commit() returning. The size of any given
    batch is proportional to the number of ZIL transaction in the queue
    at the time that the batch starts processing the queue; which
    doesn't occur until the previous batch completes. Thus, if there's a
    lot of transactions in the queue, the batch could be composed of
    many ZIL blocks, and each call to zil_commit() will have to wait for
    all of these writes to complete (even if the thread calling
    zil_commit() only cared about one of the transactions in the batch).

To further complicate the situation, these two issues result in the
following side effect:

 3. If a given batch takes longer to complete than normal, this results
    in larger batch sizes, which then take longer to complete and
    further drive up the latency of zil_commit(). This can occur for a
    number of reasons, including (but not limited to): transient changes
    in the workload, and storage latency irregularites.

Solution
========

The solution attempted by this change has the following goals:

 1. no on-disk changes; maintain current on-disk format.
 2. modify the "batch size" to be equal to the "ZIL block size".
 3. allow new batches to be generated and issued to disk, while there's
    already batches being serviced by the disk.
 4. allow zil_commit() to wait for as few ZIL blocks as possible.
 5. use as few ZIL blocks as possible, for the same amount of ZIL
    transactions, without introducing significant latency to any
    individual ZIL transaction. i.e. use fewer, but larger, ZIL blocks.

In theory, with these goals met, the new allgorithm will allow the
following improvements:

 1. new ZIL blocks can be generated and issued, while there's already
    oustanding ZIL blocks being serviced by the storage.
 2. the latency of zil_commit() should be proportional to the underlying
    storage latency, rather than the incoming synchronous workload.

Porting Notes
=============

Due to the changes made in commit 119a394ab0, the lifetime of an itx
structure differs than in OpenZFS. Specifically, the itx structure is
kept around until the data associated with the itx is considered to be
safe on disk; this is so that the itx's callback can be called after the
data is committed to stable storage. Since OpenZFS doesn't have this itx
callback mechanism, it's able to destroy the itx structure immediately
after the itx is committed to an lwb (before the lwb is written to
disk).

To support this difference, and to ensure the itx's callbacks can still
be called after the itx's data is on disk, a few changes had to be made:

  * A list of itxs was added to the lwb structure. This list contains
    all of the itxs that have been committed to the lwb, such that the
    callbacks for these itxs can be called from zil_lwb_flush_vdevs_done(),
    after the data for the itxs is committed to disk.

  * A list of itxs was added on the stack of the zil_process_commit_list()
    function; the "nolwb_itxs" list. In some circumstances, an itx may
    not be committed to an lwb (e.g. if allocating the "next" ZIL block
    on disk fails), so this list is used to keep track of which itxs
    fall into this state, such that their callbacks can be called after
    the ZIL's writer pipeline is "stalled".

  * The logic to actually call the itx's callback was moved into the
    zil_itx_destroy() function. Since all consumers of zil_itx_destroy()
    were effectively performing the same logic (i.e. if callback is
    non-null, call the callback), it seemed like useful code cleanup to
    consolidate this logic into a single function.

Additionally, the existing Linux tracepoint infrastructure dealing with
the ZIL's probes and structures had to be updated to reflect these code
changes. Specifically:

  * The "zil__cw1" and "zil__cw2" probes were removed, so they had to be
    removed from "trace_zil.h" as well.

  * Some of the zilog structure's fields were removed, which affected
    the tracepoint definitions of the structure.

  * New tracepoints had to be added for the following 3 new probes:
      * zil__process__commit__itx
      * zil__process__normal__itx
      * zil__commit__io__error

OpenZFS-issue: https://www.illumos.org/issues/8585
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5d95a3a
Closes #6566

6 years agoFix NFS sticky bit permission denied error
Brian Behlendorf [Mon, 4 Dec 2017 19:55:57 +0000 (11:55 -0800)]
Fix NFS sticky bit permission denied error

When zfs_sticky_remove_access() was originally adapted for Linux
a typo was made which altered the intended behavior.  As described
in the block comment, the intended behavior is that permission
should be granted when the entry is a regular file and you have
write access.  That is, S_ISREG should have been used instead of
S_ISDIR.

Restricting permission to regular files made good sense for older
systems where setting the bit on executable files would instruct
the system to save the program's text segment on the swap device.

On modern systems this behavior has been replaced by the sticky
bit acting as a restricted deletion flag and the plain file
restriction has been relaxed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6889
Closes #6910

6 years agoAdd /usr/bin/env to COPY_EXEC_LIST initramfs hook
JKDingwall [Mon, 4 Dec 2017 19:53:57 +0000 (19:53 +0000)]
Add /usr/bin/env to COPY_EXEC_LIST initramfs hook

5dc1ff29 changed the user space program to mount a zfs snapshot
from /bin/sh to /usr/bin/env.  If the executable is not present
in the initramfs then snapshots cannot be automounted.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: James Dingwall <james.dingwall@zynstra.com>
Closes #5360
Closes #6913

6 years agoFix 'zpool create|add' replication level check
Brian Behlendorf [Mon, 4 Dec 2017 19:50:35 +0000 (11:50 -0800)]
Fix 'zpool create|add' replication level check

When the pool configuration contains a hole due to a previous device
removal ignore this top level vdev.  Failure to do so will result in
the current configuration being assessed to have a non-uniform
replication level and the expected warning will be disabled.

The zpool_add_010_pos test case was extended to cover this scenario.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6907
Closes #6911

6 years agoPreserve itx alloc size for zio_data_buf_free()
Brian Behlendorf [Mon, 4 Dec 2017 19:44:39 +0000 (11:44 -0800)]
Preserve itx alloc size for zio_data_buf_free()

Using zio_data_buf_alloc() to allocate the itx's may be unsafe
because the itx->itx_lr.lrc_reclen field is not constant from
allocation to free.  Using a different itx->itx_lr.lrc_reclen
size in zio_data_buf_free() can result in the allocation being
returned to the wrong kmem cache.

This issue can be avoided entirely by storing the allocation size
in itx->itx_size and using that for zio_data_buf_free().

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6912

6 years agoUnbreak the scan status ABI
Tom Caputi [Thu, 30 Nov 2017 17:40:13 +0000 (12:40 -0500)]
Unbreak the scan status ABI

When d4a72f23 was merged, pss_pass_issued was incorrectly
added to the middle of the pool_scan_stat_t structure
instead of the end. This patch simply corrects this issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #6909

6 years agoFix 'zfs get {user|group}objused@' functionality
LOLi [Wed, 29 Nov 2017 19:59:22 +0000 (20:59 +0100)]
Fix 'zfs get {user|group}objused@' functionality

Fix a regression accidentally introduced in 1b81ab4 that prevents
'zfs get {user|group}objused@' from correctly reporting the requested
value.

Update "userspace_003_pos.ksh" and "groupspace_003_pos.ksh" to verify
this functionality.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6908

6 years agoLinux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT
Mark Wright [Tue, 28 Nov 2017 23:33:48 +0000 (10:33 +1100)]
Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT

Fix build errors with gcc 7.2.0 on Gentoo with kernel 4.14
built with CONFIG_GCC_PLUGIN_RANDSTRUCT=y such as:

module/nvpair/nvpair.c:2810:2:error:
positional initialization of field in ?struct? declared with
'designated_init' attribute [-Werror=designated-init]
  nvs_native_nvlist,
  ^~~~~~~~~~~~~~~~~

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Wright <gienah@gentoo.org>
Closes #5390
Closes #6903

6 years agoinitramfs: Honor canmount=off
Richard Laager [Fri, 24 Nov 2017 05:08:02 +0000 (23:08 -0600)]
initramfs: Honor canmount=off

The initramfs script was not honoring canmount=off.  With this change,
it does.  If the administrator has asked that a filesystem not be
mounted, that should be honored.

As an exception, the initramfs script ignores canmount=off on the
rootfs.  The rootfs should not have canmount=off set either.  However,
mounting it anyway seems harmless because it is being asked for
explicitly.  The point of this exception is to avoid the risk of
breaking existing systems, just in case someone has canmount=off set on
their rootfs.

The initramfs still mounts filesystems with canmount=noauto.  This is
necessary because it is typical to set that on the rootfs so that it can
be cloned.  Without canmount=noauto, the clones' duplicate mountpoints
would conflict.

This is the remainder of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897

6 years agoinitramfs: Honor mountpoint=none/legacy
Richard Laager [Fri, 24 Nov 2017 03:54:48 +0000 (21:54 -0600)]
initramfs: Honor mountpoint=none/legacy

For filesystems that are children of the rootfs, when mountpoint=none or
mountpoint=legacy, the initrafms script would assume a mountpoint based
on the dataset path.  Given that the rootfs should have mountpoint=/ and
mountpoint inheritance is is the default behavior of ZFS, this behavior
seems unnecessary.  In any event, it turns mountpoint=none into a no-op.
That removes this option from the administrator, and if someone uses it,
it does not work as expected.  Worse yet, if the mountpoint directory
does not exist (which is the typical case for mountpoint=none), the
mounting and thus the boot process will fail.  For the case of
mountpoint=legacy, the assumed mountpoint may not be the correct value
set in /etc/fstab.

This change makes the initramfs script not mount the filesystem in
either case.  For mountpoint=none, this means we are correctly honoring
the setting.  For mountpoint=legacy, there are two scenarios:  If
canmount=on, the filesystem will be mounted by the normal mechanisms
later in the boot process.  If canmount=noauto, the filesystem will not
be mounted at all, unless the administrator has done something special.
If they're not doing something special and they want it mounted by the
initramfs, they can simply not set mountpoint=legacy.

This is part of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897

6 years agozpool(8): Fix "zpool import -t"
DeHackEd [Tue, 28 Nov 2017 17:10:52 +0000 (12:10 -0500)]
zpool(8): Fix "zpool import -t"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: DHE <git@dehacked.net>
Closes #6894

6 years agoUpdate for cppcheck v1.80
Brian Behlendorf [Sat, 18 Nov 2017 22:08:00 +0000 (14:08 -0800)]
Update for cppcheck v1.80

Resolve new warnings and errors from cppcheck v1.80.

* [lib/libshare/libshare.c:543]: (warning)
  Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
  Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
  Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
   Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p

Convert the following suppression to inline.

* [module/zfs/zfs_vnops.c:840]: (error)
  Possible null pointer dereference: aiov

Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macro's will never be defined until this functionality
is implemented.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6879

6 years agoFix data on evict_skips in arc_summary.py
Scot W. Stevenson [Sat, 18 Nov 2017 22:07:04 +0000 (23:07 +0100)]
Fix data on evict_skips in arc_summary.py

Display correct data from kstat arcstats for evict_skips,
which is currently repeating the data from mutex_misses.
Fixes #6882

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6882
Closes #6883

6 years agoFix ARC pointer overrun
DeHackEd [Fri, 17 Nov 2017 23:11:39 +0000 (18:11 -0500)]
Fix ARC pointer overrun

Only access the `b_crypt_hdr` field of an ARC header if the content
is encrypted.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #6877

6 years agoSequential scrub and resilvers
Tom Caputi [Thu, 16 Nov 2017 01:27:01 +0000 (20:27 -0500)]
Sequential scrub and resilvers

Currently, scrubs and resilvers can take an extremely
long time to complete. This is largely due to the fact
that zfs scans process pools in logical order, as
determined by each block's bookmark. This makes sense
from a simplicity perspective, but blocks in zfs are
often scattered randomly across disks, particularly
due to zfs's copy-on-write mechanisms.

This patch improves performance by splitting scrubs
and resilvers into a metadata scanning phase and an IO
issuing phase. The metadata scan reads through the
structure of the pool and gathers an in-memory queue
of I/Os, sorted by size and offset on disk. The issuing
phase will then issue the scrub I/Os as sequentially as
possible, greatly improving performance.

This patch also updates and cleans up some of the scan
code which has not been updated in several years.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Authored-by: Alek Pinchuk <apinchuk@datto.com>
Authored-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #3625
Closes #6256

6 years agoMinor code cleanups in arc_python.py
Scot W. Stevenson [Wed, 15 Nov 2017 18:28:11 +0000 (19:28 +0100)]
Minor code cleanups in arc_python.py

Remove unused library re and associated variable kstat_pobj. Add note
to documentation at start of program about required support for old
versions of Python. Change variable "format" (which is a built-in
function) to "fmt".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6869

6 years agoFix dirty check in dmu_offset_next()
Brian Behlendorf [Wed, 15 Nov 2017 18:19:32 +0000 (10:19 -0800)]
Fix dirty check in dmu_offset_next()

The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3125
Closes #6867

6 years agoDisable automatic dependencies in zfs-test package
Brian Behlendorf [Wed, 15 Nov 2017 17:12:52 +0000 (09:12 -0800)]
Disable automatic dependencies in zfs-test package

All of the ZTS test scripts specify /bin/ksh as the interpreter.
Unfortunately, as of Fedora 27 only /usr/bin/ksh is provided by
the package manager.  Rather than change all the scripts to
accommodate the latest Fedora disable automatic dependencies
for the zfs-test package.  Functionally this will not cause
any problems since /bin is a symlink to /usr/bin.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6868

6 years agoDisable zvol_ENOSPC_001_pos on 32-bit systems
Brian Behlendorf [Tue, 14 Nov 2017 00:26:15 +0000 (16:26 -0800)]
Disable zvol_ENOSPC_001_pos on 32-bit systems

Occasionally observed failure of zvol_ENOSPC_001_pos due to the
test case taking too long to complete.  Disable the test case until
it can be improved.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5848
Closes #6862

6 years agoFix truncate(2) mtime and ctime handling
LOLi [Mon, 13 Nov 2017 17:24:26 +0000 (18:24 +0100)]
Fix truncate(2) mtime and ctime handling

On Linux, ftruncate(2) always changes the file timestamps, even if the
file size is not changed. However, in case of a successfull
truncate(2), the timestamps are updated only if the file size changes.
This translates to the VFS calling the ZFS Posix Layer "setattr"
function (zpl_setattr) with ATTR_MTIME and ATTR_CTIME unconditionally
set on the iattr mask only when doing a ftruncate(2), while the
truncate(2) is left to the filesystem implementation to be dealt with.

This behaviour is consistent with POSIX:2004/SUSv3 specifications
where there's no explicit requirement for file size changes to update
the timestamps only for ftruncate(2):

http://pubs.opengroup.org/onlinepubs/009695399/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/ftruncate.html

This has been later updated in POSIX:2008/SUSv4 where, for both
truncate(2)/ftruncate(2), there's no mention of this size change
requirement:

http://austingroupbugs.net/view.php?id=489
http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html

Unfortunately the Linux VFS is still calling into the ZPL without
ATTR_MTIME/ATTR_CTIME set in the truncate(2) case: we fix this by
explicitly updating the timestamps when detecting the ATTR_SIZE bit,
which is always set in do_truncate(), on the iattr mask.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6811
Closes #6819

6 years agoAdd .travis.yml
George Melikov [Mon, 13 Nov 2017 17:18:18 +0000 (20:18 +0300)]
Add .travis.yml

Travis builders have maximum work time ~49 minutes,
so we have to use 5 builders and spread the ZTS over
them using test group tags.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #6829

6 years agoFix arc_summary.py -d crash with Python3
Scot W. Stevenson [Sun, 12 Nov 2017 04:27:43 +0000 (05:27 +0100)]
Fix arc_summary.py -d crash with Python3

Prevents arc_summary.py crashing when called with parameter -d or
long form --description with Python3.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6849
Closes #6850

6 years agoOpenZFS 7531 - Assign correct flags to prefetched buffers
benrubson [Thu, 2 Nov 2017 15:01:56 +0000 (16:01 +0100)]
OpenZFS 7531 - Assign correct flags to prefetched buffers

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Authored by: abraunegg <alex.braunegg@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7531
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/468008cb

6 years agoLong hold the dataset during upgrade
Arkadiusz Bubała [Fri, 10 Nov 2017 21:37:10 +0000 (22:37 +0100)]
Long hold the dataset during upgrade

If the receive or rollback is performed while filesystem is upgrading
the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This
will lead to NULL pointer dereference when upgrade tries to access
evicted objset.

This commit adds long hold of dataset during whole upgrade process.
The receive and rollback will return an EBUSY error until the
upgrade is not finished.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #5295
Closes #6837

6 years agoFix encryption root hierarchy issue
Tom Caputi [Wed, 8 Nov 2017 23:25:30 +0000 (18:25 -0500)]
Fix encryption root hierarchy issue

After doing a recursive raw receive, zfs userspace performs
a final pass to adjust the encryption root hierarchy as
needed. Unfortunately, the FORCE_INHERIT ioctl had a bug
which caused the encryption root to always be assigned to
the direct parent instead of the inheriting parent. This
patch simply fixes this issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #6847
Closes #6848

6 years agoHandle compressed buffers in __dbuf_hold_impl()
Tim Chase [Wed, 8 Nov 2017 21:32:15 +0000 (15:32 -0600)]
Handle compressed buffers in __dbuf_hold_impl()

In __dbuf_hold_impl(), if a buffer is currently syncing and is still
referenced from db_data, a copy is made in case it is dirtied again in
the txg.  Previously, the buffer for the copy was simply allocated with
arc_alloc_buf() which doesn't handle compressed or encrypted buffers
(which are a special case of a compressed buffer).  The result was
typically an invalid memory access because the newly-allocated buffer
was of the uncompressed size.

This commit fixes the problem by handling the 2 compressed cases,
encrypted and unencrypted, respectively, with arc_alloc_raw_buf() and
arc_alloc_compressed_buf().

Although using the proper allocation functions fixes the invalid memory
access by allocating a buffer of the compressed size, another unrelated
issue made it impossible to properly detect compressed buffers in the
first place.  The header's compression flag was set to ZIO_COMPRESS_OFF
in arc_write() when it was possible that an attached buffer was actually
compressed.  This commit adds logic to only set ZIO_COMPRESS_OFF in
the non-ZIO_RAW case which wil handle both cases of compressed buffers
(encrypted or unencrypted).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5742
Closes #6797

6 years agoRefresh TEST file to include new variables
Giuseppe Di Natale [Wed, 8 Nov 2017 19:09:30 +0000 (11:09 -0800)]
Refresh TEST file to include new variables

Add any missing variables for CI control to TEST.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6846

6 years agoCleanup systemd dependencies
Antonio Russo [Wed, 8 Nov 2017 17:39:15 +0000 (12:39 -0500)]
Cleanup systemd dependencies

Some redundancy is present in the systemd dependencies, as
noticed in PR#6764. Existing setups might rely on these quirks,
so these cleanups have been moved to the development branch.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6822

6 years agoFix undefined %{systemd_svcs} in RPM scriptlets
LOLi [Wed, 8 Nov 2017 17:16:37 +0000 (18:16 +0100)]
Fix undefined %{systemd_svcs} in RPM scriptlets

This allows RPM-based systems to properly control package installation
and removal when using systemd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6838
Closes #6841

6 years agoOpenZFS 8607 - variable set but not used
Brian Behlendorf [Wed, 8 Nov 2017 17:09:45 +0000 (09:09 -0800)]
OpenZFS 8607 - variable set but not used

Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Authored by: Toomas Soome <tsoome@me.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8607
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b852c2f5
Closes #6842

6 years agoProvide tags in perf-regression.run
Giuseppe Di Natale [Tue, 7 Nov 2017 23:06:27 +0000 (15:06 -0800)]
Provide tags in perf-regression.run

A prior commit changed test-runner to enable tagging
of testgroups within a test suite runfile. They must
be specified in each runfile. Update the runfile for
performance regressions so it is properly tagged.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6830

6 years agoFix zfs-tests.sh single test functionality
LOLi [Tue, 7 Nov 2017 22:55:31 +0000 (23:55 +0100)]
Fix zfs-tests.sh single test functionality

Without any tag specified into the runtime-generated runfile the
test-runner will not execute the test provided from the command line:
fix this by adding tag information to the custom runfile.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6826

6 years agocontrib/initramfs: switch to automake
LOLi [Tue, 7 Nov 2017 22:53:57 +0000 (23:53 +0100)]
contrib/initramfs: switch to automake

Use automake to build initramfs scripts and hooks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6761

6 years agoBug fix in qat_compress.c when compressed size is < 4KB
wli5 [Tue, 7 Nov 2017 22:51:30 +0000 (06:51 +0800)]
Bug fix in qat_compress.c when compressed size is < 4KB

When the 128KB block is compressed to less than 4KB, the pointer
to the Footer is not in the end of the compressed buffer, that's
because the Header offset was added twice for this case. So there
is a gap between the Footer and the compressed buffer.
1. Always compute the Footer pointer address from the start of the
last page.
2. Remove the un-used workaroud code which has been verified fixed
with the latest driver and this fix.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6827

6 years agoSort output of tunables in arc_summary.py
Scot W. Stevenson [Tue, 7 Nov 2017 22:50:15 +0000 (23:50 +0100)]
Sort output of tunables in arc_summary.py

Sort list of tunables printed by _tunable_summary()
alphabetically

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6828

6 years agoDisable automatic dependencies in DKMS package
Brian Behlendorf [Tue, 7 Nov 2017 18:59:27 +0000 (10:59 -0800)]
Disable automatic dependencies in DKMS package

By default additional dependencies are generated automatically for
packages.  This is normally a good thing because it helps ensure
things just work.  It doesn't make sense for the DKMS package which
requires minimal dependencies that can be easily listed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6467
Closes #6835

6 years agoBuild regression in c89 cleanups
Don Brady [Tue, 7 Nov 2017 18:42:15 +0000 (11:42 -0700)]
Build regression in c89 cleanups

Fixed build regression in non-debug builds from recent cleanups of
c89 workarounds.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #6832

6 years agoDisable zpool_import_missing_003_pos
George Melikov [Tue, 7 Nov 2017 18:32:04 +0000 (21:32 +0300)]
Disable zpool_import_missing_003_pos

Rarely observed failure of zpool_import_missing_003_pos during
automated testing due to timeout.  Disable the test case until
it can be improved.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Issue #6839
Closes #6840

6 years agoAdd documentation strings to arc_summary.py
Scot W. Stevenson [Sun, 5 Nov 2017 21:11:37 +0000 (22:11 +0100)]
Add documentation strings to arc_summary.py

Include docstrings (PEP8, PEP257) for module and all functions.
Separately, remove outdated section in comment at start of
module. Separately, remove unused global constant "usetunable".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6818

6 years agoFix column alignment with long zpool names
George G [Sun, 5 Nov 2017 21:09:56 +0000 (21:09 +0000)]
Fix column alignment with long zpool names

`zpool status` normally aligns NAME/STATE/etc columns:

    NAME                       STATE     READ WRITE CKSUM
    dummy                      ONLINE       0     0     0
      mirror-0                 ONLINE       0     0     0
        /tmp/dummy-long-1.bin  ONLINE       0     0     0
        /tmp/dummy-long-2.bin  ONLINE       0     0     0
      mirror-1                 ONLINE       0     0     0
        /tmp/dummy-long-3.bin  ONLINE       0     0     0
        /tmp/dummy-long-4.bin  ONLINE       0     0     0

However, if the zpool name is longer than the zvol names, alignment
issues arise:

    NAME                  STATE     READ WRITE CKSUM
    dummy-very-very-long-zpool-name  ONLINE       0     0     0
      mirror-0            ONLINE       0     0     0
        /tmp/dummy-1.bin  ONLINE       0     0     0
        /tmp/dummy-2.bin  ONLINE       0     0     0
      mirror-1            ONLINE       0     0     0
        /tmp/dummy-3.bin  ONLINE       0     0     0
        /tmp/dummy-4.bin  ONLINE       0     0     0

`zpool iostat` and `zpool import` are also affected:

                  capacity     operations     bandwidth
    pool        alloc   free   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    dummy        104K  1.97G      0      0    152  9.84K
    dummy-very-very-long-zpool-name   152K  1.97G      0      1    144  13.1K
    ----------  -----  -----  -----  -----  -----  -----

    dummy-very-very-long-zpool-name  ONLINE
      mirror-0            ONLINE
        /tmp/dummy-1.bin  ONLINE
        /tmp/dummy-2.bin  ONLINE
      mirror-1            ONLINE
        /tmp/dummy-3.bin  ONLINE
        /tmp/dummy-4.bin  ONLINE

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Gaydarov <git@gg7.io>
Closes #6786

7 years agoRewrite fHits() in arc_summary.py with SI units
Scot W. Stevenson [Sat, 4 Nov 2017 20:33:28 +0000 (21:33 +0100)]
Rewrite fHits() in arc_summary.py with SI units

Complete rewrite of fHits(). Move units from non-standard English
abbreviations to SI units, thereby avoiding confusion because of
"long scale" and "short scale" numbers. Remove unused parameter
"Decimal". Add function string. Aim to confirm to PEP8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6815

7 years agoUndo c89 workarounds to match with upstream
Don Brady [Sat, 4 Nov 2017 20:25:13 +0000 (14:25 -0600)]
Undo c89 workarounds to match with upstream

With PR 5756 the zfs module now supports c99 and the
remaining past c89 workarounds can be undone.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #6816

7 years agoMinor code cleanup in arc_summary.py
Scot W. Stevenson [Fri, 3 Nov 2017 22:43:53 +0000 (23:43 +0100)]
Minor code cleanup in arc_summary.py

Simplify and inline single-use function div1(); inline twice-used
function div2(); add function comment to zfs_header(); replace
variable "unused" in get_Kstat() with "_" following convention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6802

7 years agoInitramfs fixes
Brian Behlendorf [Fri, 3 Nov 2017 22:38:52 +0000 (15:38 -0700)]
Initramfs fixes

* initramfs: Fix inconsistent whitespace
* initramfs: Fix a spelling error
* initramfs: Set elevator=noop on the rpool's disks

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6807

7 years agoAllow test-runner to filter test groups by tag
Giuseppe Di Natale [Fri, 3 Nov 2017 16:53:32 +0000 (09:53 -0700)]
Allow test-runner to filter test groups by tag

Enable test-runner to accept a list of tags to identify
which test groups the user wishes to run.

Also allow test-runner to perform multiple iterations
of a test run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6788

7 years agoinitramfs: Set elevator=noop on the rpool's disks
Richard Laager [Thu, 2 Nov 2017 02:54:56 +0000 (21:54 -0500)]
initramfs: Set elevator=noop on the rpool's disks

ZFS already sets elevator=noop for wholedisk vdevs (for all pools), but
typical root-on-ZFS installations use partitions.  This sets
elevator=noop on the disks in the root pool.

Ubuntu 16.04 and 16.10 had this.  It was lost in 17.04 due to Debian
switching to this upstream initramfs script.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
7 years agoinitramfs: Fix a spelling error
Richard Laager [Thu, 2 Nov 2017 02:54:28 +0000 (21:54 -0500)]
initramfs: Fix a spelling error

This fixes a typo in a comment.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
7 years agoinitramfs: Fix inconsistent whitespace
Richard Laager [Thu, 2 Nov 2017 02:53:22 +0000 (21:53 -0500)]
initramfs: Fix inconsistent whitespace

This fixes one instance of inconsistent whitespace.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
7 years agoAdd scan.coverity.com badge to README
Brian Behlendorf [Mon, 30 Oct 2017 23:21:24 +0000 (16:21 -0700)]
Add scan.coverity.com badge to README

Include the scan.coverity.com status badge in the top level README.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6801

7 years agoOpenZFS 640 - number_to_scaled_string is duplicated in several commands
Jason King [Tue, 13 Jun 2017 09:16:45 +0000 (04:16 -0500)]
OpenZFS 640 - number_to_scaled_string is duplicated in several commands

Porting Notes:
- The OpenZFS patch added nicenum_scale() and nicenum() to a
  library not used by ZFS.  Rather than pull in a new dependency
  the version of nicenum in lib/libzpool/util.c was simply
  replaced with the new one.

Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Authored by: Jason King <jason.brian.king@gmail.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/640
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0a055120
Closes #6796

7 years agoRewrite of function fBytes() in arc_summary.py
Scot W. Stevenson [Wed, 25 Oct 2017 06:29:02 +0000 (08:29 +0200)]
Rewrite of function fBytes() in arc_summary.py

Replace if-elif-else construction with shorter loop;
remove unused parameter "Decimal"; centralize format
string; add function documentation string; conform to
PEP8.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6784

7 years agosystemd zfs-import.target and documentation
Antonio Russo [Mon, 30 Oct 2017 20:18:26 +0000 (16:18 -0400)]
systemd zfs-import.target and documentation

zfs-import-{cache,scan}.service must complete before any mounting of
filesystems can occur. To simplify this dependency, create a target
that is reached After (in the systemd sense) the pool is imported.

Additionally, recommend that legacy zfs mounts use the option

x-systemd.requires=zfs-import.target

to codify this requirement.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6764

7 years agoUpdate zfs module parameters man5
abraunegg [Mon, 30 Oct 2017 20:15:10 +0000 (07:15 +1100)]
Update zfs module parameters man5

Update zfs module parameters man5 with missing parameter details
for multiple tunings.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Braunegg <alex.braunegg@gmail.com>
Closes #6785

7 years agoFix status command options in zpool(8)
Brian Behlendorf [Fri, 27 Oct 2017 22:52:03 +0000 (15:52 -0700)]
Fix status command options in zpool(8)

The 'zpool status' command supports the -P option for printing full
path names.  It does not support the -p parsable option for printing
exact values.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6792
Closes #6794

7 years agoOpenZFS 8081 - Compiler warnings in zdb
Brian Behlendorf [Fri, 27 Oct 2017 19:46:35 +0000 (12:46 -0700)]
OpenZFS 8081 - Compiler warnings in zdb

Fix compiler warnings in zdb.  With these changes, FreeBSD can compile
zdb with all compiler warnings enabled save -Wunused-parameter.

usr/src/cmd/zdb/zdb.c
usr/src/cmd/zdb/zdb_il.c
usr/src/uts/common/fs/zfs/sys/sa.h
usr/src/uts/common/fs/zfs/sys/spa.h
Fix numerous warnings, including:
* const-correctness
* shadowing global definitions
* signed vs unsigned comparisons
* missing prototypes, or missing static declarations
* unused variables and functions
* Unreadable array initializations
* Missing struct initializers

usr/src/cmd/zdb/zdb.h
Add a header file to declare common symbols

usr/src/lib/libzpool/common/sys/zfs_context.h
usr/src/uts/common/fs/zfs/arc.c
usr/src/uts/common/fs/zfs/dbuf.c
usr/src/uts/common/fs/zfs/spa.c
usr/src/uts/common/fs/zfs/txg.c
Add a function prototype for zk_thread_create, and ensure that every
callback supplied to this function actually matches the prototype.

usr/src/cmd/ztest/ztest.c
usr/src/uts/common/fs/zfs/sys/zil.h
usr/src/uts/common/fs/zfs/zfs_replay.c
usr/src/uts/common/fs/zfs/zvol.c
Add a function prototype for zil_replay_func_t, and ensure that
every function of this type actually matches the prototype.

usr/src/uts/common/fs/zfs/sys/refcount.h
Change FTAG so it discards any constness of __func__, necessary
since existing APIs expect it passed as void *.

Porting Notes:
- Many of these fixes have already been applied to Linux.  For
  consistency the OpenZFS version of a change was applied if the
  warning was addressed in an equivalent but different fashion.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Authored by: Alan Somers <asomers@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8081
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/843abe1b8a
Closes #6787

7 years agoCorrect make mancheck recipe
Giuseppe Di Natale [Fri, 27 Oct 2017 16:52:18 +0000 (09:52 -0700)]
Correct make mancheck recipe

The current make recipe for mancheck silently ignores errors. Correct
the recipe so errors cause the mancheck recipe fail.

The zpool reopen command in the zpool.8 manpage had a bullet list
without an .El.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6790

7 years agoZFS send fails to dump objects larger than 128PiB
LOLi [Thu, 26 Oct 2017 23:58:38 +0000 (01:58 +0200)]
ZFS send fails to dump objects larger than 128PiB

When dumping objects larger than 128PiB it's possible for do_dump() to
miscalculate the FREE_RECORD offset due to an integer overflow
condition: this prevents the receiving end from correctly restoring
the dumped object.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6760

7 years agoAllow 'zpool events' filtering by pool name
LOLi [Thu, 26 Oct 2017 23:49:33 +0000 (01:49 +0200)]
Allow 'zpool events' filtering by pool name

Additionally add four new tests:

 * zpool_events_clear: verify 'zpool events -c' functionality
 * zpool_events_cliargs: verify command line options and arguments
 * zpool_events_follow: verify 'zpool events -f'
 * zpool_events_poolname: verify events filtering by pool name

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #3285
Closes #6762

7 years agoOpenZFS 8558, 8602 - lwp_create() returns EAGAIN
Brian Behlendorf [Thu, 26 Oct 2017 19:57:53 +0000 (12:57 -0700)]
OpenZFS 8558, 8602 - lwp_create() returns EAGAIN

8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems

On a system with more than 80K ZFS filesystems, we've seen cases
where lwp_create() will start to fail by returning EAGAIN. The
problem being, for each of those 80K ZFS filesystems, a taskq will
be created for each dataset as part of the ZIL for each dataset.

Porting Notes:
- The new nomem taskq kstat was dropped.
- Added module options and documentation for new tunings
  zfs_zil_clean_taskq_nthr_pct, zfs_zil_clean_taskq_minalloc,
  zfs_zil_clean_taskq_maxalloc, and zfs_sync_taskq_batch_pct.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8558
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/216d772

8602 remove unused "dp_early_sync_tasks" field from "dsl_pool" structure

Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Authored by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8602
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2bcb545
Closes #6779

7 years agoAdded no_scrub_restart flag to zpool reopen
Arkadiusz Bubała [Thu, 26 Oct 2017 19:26:09 +0000 (21:26 +0200)]
Added no_scrub_restart flag to zpool reopen

Added -n flag to zpool reopen that allows a running scrub
operation to continue if there is a device with Dirty Time Log.

By default if a component device has a DTL and zpool reopen
is executed all running scan operations will be restarted.

Added functional tests for `zpool reopen`

Tests covers following scenarios:
* `zpool reopen` without arguments,
* `zpool reopen` with pool name as argument,
* `zpool reopen` while scrubbing,
* `zpool reopen -n` while scrubbing,
* `zpool reopen -n` while resilvering,
* `zpool reopen` with bad arguments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #6076
Closes #6746

7 years agoarcstat: flush stdout / outfile after each line
Fabian-Gruenbichler [Thu, 26 Oct 2017 19:18:49 +0000 (21:18 +0200)]
arcstat: flush stdout / outfile after each line

Otherwise, if arcstat gets interrupted before the desired number of
iterations is reached, the output file will be empty (both if set via
'-o' or via shell redirection).

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #6775

7 years agocommitcheck: Multiple OpenZFS ports in commit
Giuseppe Di Natale [Thu, 26 Oct 2017 17:23:58 +0000 (10:23 -0700)]
commitcheck: Multiple OpenZFS ports in commit

Allow commitcheck.sh to handle multiple OpenZFS ports in
a single commit. This is useful in the cases when a change
upstream has bug fixes and it makes sense to port them with
the original patch.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6780

7 years agoAdd Coverity defect fix commit checker support
Giuseppe Di Natale [Thu, 26 Oct 2017 17:17:00 +0000 (10:17 -0700)]
Add Coverity defect fix commit checker support

Enable commitcheck.sh to test if a commit message is
in the expected format for a coverity defect fix.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6777

7 years agoEnsure arc_size_break is filled in arc_summary.py
Giuseppe Di Natale [Mon, 23 Oct 2017 21:18:12 +0000 (14:18 -0700)]
Ensure arc_size_break is filled in arc_summary.py

Use mfu_size and mru_size pulled from the arcstats
kstat file to calculate the mfu and mru percentages
for arc size breakdown.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: AndCycle <andcycle@andcycle.idv.tw>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5526
Closes #6770

7 years agoCorrect flake8 errors after STYLE builder update
Giuseppe Di Natale [Mon, 23 Oct 2017 21:01:43 +0000 (14:01 -0700)]
Correct flake8 errors after STYLE builder update

Fix new flake8 errors related to bare excepts and ambiguous
variable names due to a STYLE builder update.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6776

7 years agoZTS: Add auto-spare tests
David Quigley [Mon, 23 Oct 2017 18:42:37 +0000 (12:42 -0600)]
ZTS: Add auto-spare tests

The ZED is expected to automatically kick in a hot spare device
when there's one available in the pool and a sufficient number of
read errors have been encountered.  Use zinject to simulate the
failure condition and verify the hot spare is used.

auto_spare_001_pos.ksh: read IO errors, the vdev is FAULTED
auto_spare_002_pos.ksh: read CHECKSUM errors, the vdev is DEGRADE

Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes #6280

7 years agoUse ashift=12 by default on SSDSC2BW48 disks
adisbladis [Mon, 23 Oct 2017 18:00:45 +0000 (02:00 +0800)]
Use ashift=12 by default on SSDSC2BW48 disks

Currently the 480GB models of this disk do not use ashift=12 by
default.  SSDSC2BW48 is also optimized for 4k blocks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: adisbladis <adis@blad.is>
Closes #6774

7 years agoProvide commit message format for Coverity defects
Giuseppe Di Natale [Mon, 23 Oct 2017 16:47:16 +0000 (09:47 -0700)]
Provide commit message format for Coverity defects

Provide details about the commit message format for Coverity defect
fixes submitted.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6771

7 years agoEmit history events for 'zpool create'
Brian Behlendorf [Mon, 23 Oct 2017 16:45:59 +0000 (09:45 -0700)]
Emit history events for 'zpool create'

History commands and events were being suppressed for the
'zpool create' command since the history object did not
yet exist.  Create the object earlier so this history
doesn't get lost.

Split the pool_destroy event in to pool_destroy and
pool_export so they may be distinguished.

Updated events_001_pos and events_002_pos test cases.  They
now check for the expected history events and were reworked
to be more reliable.

Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6712
Closes #6486

7 years agoSupport integration with new QAT products
wli5 [Fri, 20 Oct 2017 18:11:25 +0000 (02:11 +0800)]
Support integration with new QAT products

Support integration with new QAT products: Intel(R) C62x Chipset,
or Atom(R) C3000 Processor Product Family SoC:
1. Detect new file name in auto-conf.
2. Change MAX_INSTANCES to 48.
3. Change "num_inst" to U16 to clean a build warning.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6767

7 years agoAdd convenience 'zfs_get' functions
John [Thu, 19 Oct 2017 18:18:42 +0000 (11:18 -0700)]
Add convenience 'zfs_get' functions

Add get functions to match existing ones.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Ramsden <johnramsden@riseup.net>
Closes #6308

7 years agoRemove vn_rename and vn_remove dependency
Brian Behlendorf [Thu, 19 Oct 2017 17:06:55 +0000 (10:06 -0700)]
Remove vn_rename and vn_remove dependency

The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file.  By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely.  Functionally an empty cache file is
treated the same as a missing cache file.  This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.

The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.

The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists.  This
means that after exporting all pools and rebooting new pools
will not the scanned for on the next boot.  This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.

Documentation was updated accordingly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#648
Closes #6753

7 years agoFix ASSERT in dmu_free_long_object_raw()
Tom Caputi [Wed, 18 Oct 2017 17:08:36 +0000 (13:08 -0400)]
Fix ASSERT in dmu_free_long_object_raw()

This small patch fixes an issue where dmu_free_long_object_raw()
calls dnode_hold() after freeing the dnode a line above.

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #6766

7 years agoUpdate codecov.io behavior
Brian Behlendorf [Wed, 18 Oct 2017 17:07:02 +0000 (10:07 -0700)]
Update codecov.io behavior

Update the codecov.yml included in the repository to behave as
originally intended.  This can be refined as needed.

* Always post coverage results to the GitHub PR after two builds
  have been uploaded.  This is the normal case since there will
  be a build uploaded for both kernel and user coverage results.

* Adjust red -> yellow -> green coloring in the web interface.
  Due to the number of unlikely error conditions which are hard
  to force consider 90% coverage an excellent level of coverage.

* Allow a 1% variance in coverage between test runs.  This is
  approximately 10x larger than the typical variance observed
  which leaves us a reasonable margin to prevent false positives.

* Always post a new smaller comment to PRs which does not include
  a file list.  Old coverage reports are removed.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6765

7 years agoFix coverity defects: CID 161388
Tobin Harding [Tue, 17 Oct 2017 16:37:50 +0000 (03:37 +1100)]
Fix coverity defects: CID 161388

CID 161388: Resource Leak (REASOURCE_LEAK)

Jump to errout so that file descriptor gets closed before returning
from function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6755

7 years agoFix coverity defects: 147480, 147584
Tobin Harding [Mon, 16 Oct 2017 22:32:48 +0000 (09:32 +1100)]
Fix coverity defects: 147480, 147584

CID 147480: Logically dead code (DEADCODE)

Remove non-null check and subsequent function call. Add ASSERT to future
proof the code.

usage label is only jumped to before `zhp` is initialized.

CID 147584: Out-of-bounds access (OVERRUN)

Subtract length of current string from buffer length for `size` argument
to `snprintf`.

Starting address for the write is the start of the buffer + the current
string length. We need to subtract this string length else risk a buffer
overflow.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6745

7 years agoAdd DKMS package on Debian-based distributions
Neal Gompa (ニール・ゴンパ) [Sun, 15 Oct 2017 20:00:44 +0000 (16:00 -0400)]
Add DKMS package on Debian-based distributions

* config/deb.am: Enable building DKMS packages for Debian
* rpm/generic/zfs-dkms.spec.in: Adjust spec to be Debian-compatible
  * Condition kernel-devel Req to RPM distros
  * Adjust the DKMS Req to have a minimum of a version only
  * Ensure that --rpm_safe_upgrade isn't used on non-RPM distros
* config/deb.am: Drop CONFIG_KERNEL and CONFIG_USER guards
* Makefile.am: Add pkg-dkms target

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #6044
Closes #6731

7 years agoFix function documentation to correctly mirror code
Tobin Harding [Fri, 13 Oct 2017 19:42:04 +0000 (06:42 +1100)]
Fix function documentation to correctly mirror code

Currently the function documentation states that two strings are
allocated, this is outdated. Only one char ** parameter is passed
into the function now, clearly only a pointer to a single string
is returned and needs to be free'd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6754

7 years agoIncrease default zloop.sh vdev size
Brian Behlendorf [Fri, 13 Oct 2017 19:39:39 +0000 (12:39 -0700)]
Increase default zloop.sh vdev size

The default 128M vdev size used by zloop.sh isn't always large
enough and can result in ENOSPC failures which suspend the pool.
Increase the default size to 512M and provide a -s option which
can be used to specify an alternate size.

This does increase the free space requirements to run zloop.sh.
However, since the vdevs are sparse 4x the space is not required.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6758

7 years agoPost-Encryption Followup
Brian Behlendorf [Fri, 13 Oct 2017 17:02:39 +0000 (10:02 -0700)]
Post-Encryption Followup

This PR includes fixes for bugs and documentation issues found
after the encryption patch was merged and general code improvements
for long-term maintainability.

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #6526
Closes #6639
Closes #6703
Cloese #6706
Closes #6714
Closes #6595

7 years agoTypo in dsl_dataset.h
Damian Wojsław [Fri, 13 Oct 2017 00:10:38 +0000 (02:10 +0200)]
Typo in dsl_dataset.h

The parameters dsl_dataset_t *os in function prototype should be
renamed to dsl_dataset_t *ds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Damian Wojsław <damian@wojslaw.pl>
Closes #6756
Closes #6273

7 years agoFixes for SPARC support
Brian Behlendorf [Thu, 12 Oct 2017 16:51:56 +0000 (09:51 -0700)]
Fixes for SPARC support

The current code base almost compiles on SPARC, but a few fixes are
required for the code to compile (and work efficiently). Code in this
PR comes from OpenZFS project which was initially dropped when porting
the crypto framework.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pengcheng Xu <i@jsteward.moe>
Closes #6733
Closes #6738
Closes #6750

7 years agoExplicitly depend on icp module in initramfs hook
Antonio Russo [Thu, 12 Oct 2017 16:39:45 +0000 (12:39 -0400)]
Explicitly depend on icp module in initramfs hook

Automatic dependency resolution is unreliable on many systems.
Follow suit with existing code, and explicitly include icp
in module dependencies.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6751

7 years agoFix for #6714
Tom Caputi [Thu, 5 Oct 2017 17:43:34 +0000 (13:43 -0400)]
Fix for #6714

This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
7 years agoFix for #6706
Tom Caputi [Tue, 3 Oct 2017 17:18:45 +0000 (13:18 -0400)]
Fix for #6706

This patch resolves an issue where raw sends would fail to send
encryption parameters if the wrapping key was unloaded and reloaded
before the data was sent and the dataset wass not an encryption root.
The code attempted to lookup the values from the wrapping key which
was not being initialized upon reload. This change forces the code to
lookup the correct value from the encryption root's DSL Crypto Key.
Unfortunately, this issue led to the on-disk DSL Crypto Key for some
non-encryption root datasets being left with zeroed out encryption
parameters. However, this should not present a problem since these
values are never looked at and are overrwritten upon changing keys.

This patch also fixes an issue where raw, resumable sends were not
being cleaned up appropriately if an invalid DSL Crypto Key was
received.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
7 years agoFix for #6703
Tom Caputi [Tue, 3 Oct 2017 01:55:39 +0000 (21:55 -0400)]
Fix for #6703

This patch resolves an issue where spa_keystore_change_key_sync_impl()
incorrectly recursed into clone DSL Directories while recursively
rewrapping encryption keys. Clones share keys with their origins, so
this logic was incorrect.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
7 years agoFixes for #6639
Tom Caputi [Thu, 28 Sep 2017 15:49:13 +0000 (11:49 -0400)]
Fixes for #6639

Several issues were uncovered by running stress tests with zfs
encryption and raw sends in particular. The issues and their
associated fixes are as follows:

* arc_read_done() has the ability to chain several requests for
  the same block of data via the arc_callback_t struct. In these
  cases, the ARC would only use the first request's dsobj from
  the bookmark to decrypt the data. This is problematic because
  the first request might be a prefetch zio which is able to
  handle the key not being loaded, while the second might use a
  different key that it is sure will work. The fix here is to
  pass the dsobj with each individual arc_callback_t so that each
  request can attempt to decrypt the data separately.

* DRR_FREE and DRR_FREEOBJECT records in a send file were not
  having their transactions properly tagged as raw during raw
  sends, which caused a panic when the dbuf code attempted to
  decrypt these blocks.

* traverse_prefetch_metadata() did not properly set
  ZIO_FLAG_SPECULATIVE when issuing prefetch IOs.

* Added a few asserts and code cleanups to ensure these issues
  are more detectable in the future.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
7 years agoEncryption patch follow-up
Tom Caputi [Tue, 12 Sep 2017 20:15:11 +0000 (16:15 -0400)]
Encryption patch follow-up

* PBKDF2 implementation changed to OpenSSL implementation.

* HKDF implementation moved to its own file and tests
  added to ensure correctness.

* Removed libzfs's now unnecessary dependency on libzpool
  and libicp.

* Ztest can now create and test encrypted datasets. This is
  currently disabled until issue #6526 is resolved, but
  otherwise functions as advertised.

* Several small bug fixes discovered after enabling ztest
  to run on encrypted datasets.

* Fixed coverity defects added by the encryption patch.

* Updated man pages for encrypted send / receive behavior.

* Fixed a bug where encrypted datasets could receive
  DRR_WRITE_EMBEDDED records.

* Minor code cleanups / consolidation.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
7 years agoRelax ASSERT for #6526
Tom Caputi [Wed, 11 Oct 2017 16:12:48 +0000 (12:12 -0400)]
Relax ASSERT for #6526

This patch resolves a minor issue where an ASSERT in
metaslab_passivate() that only applies to non weight-based
metaslabs was erroneously applied to all metaslabs.

Signed-off-by: Tom Caputi <tcaputi@datto.com>