granicus.if.org Git - zfs/log

Chunwei Chen [Thu, 19 Jan 2017 21:56:36 +0000 (13:56 -0800)]
Suspend/resume zvol for recv and rollback

When doing recv and rollback, dsl_dataset_clone_swap_sync_impl will be
called to swap out the ds_objset and do dmu_objset_evict on the old one.
However, currently zv->zv_objset will not be swapped out accordingly, so
if anyone currently holds a fd on the zvol, we risk hitting a use-after-free.

We fix this by introducing the suspend and resume mechanism of zsb to
zv.  Before recv or rollback, we use zvol_suspend to block all access to
zv_objset and shut it down. After the recv or rollback, we use zvol_resume
to swap in zv_objset with the new ds_objset and unblock the access.
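
The suspend/resume protocol described above can be modeled roughly as follows. This is a toy Python sketch rather than the kernel code: the class ToyZvol, its fields, and the use of a condition variable are assumptions made for illustration, and the real implementation relies on kernel locking primitives.

```python
import threading


class ToyZvol:
    """Toy model of a zvol whose backing objset can be swapped safely."""

    def __init__(self, objset):
        self._objset = objset
        self._cond = threading.Condition()
        self._suspended = False
        self._active = 0  # number of in-flight accesses

    def read(self):
        # Normal I/O path: wait while suspended, then use the current objset.
        with self._cond:
            while self._suspended:
                self._cond.wait()
            self._active += 1
            objset = self._objset
        try:
            return objset["data"]
        finally:
            with self._cond:
                self._active -= 1
                self._cond.notify_all()

    def suspend(self):
        # Block new accesses and wait for in-flight ones to drain.
        with self._cond:
            self._suspended = True
            while self._active:
                self._cond.wait()
            self._objset = None  # the old objset may now be evicted

    def resume(self, new_objset):
        # Swap in the new objset and unblock any waiters.
        with self._cond:
            self._objset = new_objset
            self._suspended = False
            self._cond.notify_all()


zv = ToyZvol({"data": "old"})
before = zv.read()
zv.suspend()                 # e.g. before recv or rollback
zv.resume({"data": "new"})   # e.g. after recv or rollback
after = zv.read()
```

The point of the model is that no reader can hold a reference to the old objset once suspend() returns, which is the window in which the use-after-free could otherwise occur.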

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #4866
Closes #5609

George Melikov [Thu, 19 Jan 2017 21:50:22 +0000 (00:50 +0300)]
OpenZFS 6529 - Properly handle updates of variably-sized SA entries

Porting notes:
- This issue was first fixed in ZoL by commit d862cb0d.  That fix was
then modified and an equivalent version of the patch landed in the
upstream code base.  For additional details see the discussion in
https://github.com/openzfs/openzfs/pull/24 .

This commit aligns ZoL with OpenZFS codebase.

Authored by: Andriy Gapon <avg@icyb.net.ua>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Ned Bass <bass6@llnl.gov>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/6529
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e7e978b
Closes #5606

Brian Behlendorf [Thu, 19 Jan 2017 18:24:27 +0000 (10:24 -0800)]
Disable racy test cases

The following test cases may currently fail for benign reasons.
Disable them until they can be updated to run reliably.

- ro_props_001_pos - only recently enabled in ce43e88
- nopwrite_volume

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5614

George Melikov [Wed, 18 Jan 2017 23:10:35 +0000 (02:10 +0300)]
OpenZFS 7659 - Missing thread_exit() in dmu_send.c

Two threads send_traverse_thread() and receive_writer_thread() should
end with thread_exit();

Mostly a cosmetic issue under IllumOS.

Authored by: Jorgen Lundman <lundman@lundman.net>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7659
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a569268
Closes #5603

George Melikov [Tue, 17 Jan 2017 23:30:01 +0000 (02:30 +0300)]
OpenZFS 7257 - zfs manpage user property length needs to be updated

Since zpool version 16, this limit is actually 8192.  More precisely,
the limit is 8192 bytes rather than characters, as values may be UTF-8.
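
The difference between counting characters and counting bytes can be illustrated with a short Python sketch. The helper name fits_user_property is hypothetical, and treating the 8192 bound as inclusive is an assumption; only the figure 8192 comes from the commit message.

```python
MAX_USER_PROP_BYTES = 8192  # figure quoted in the commit message


def fits_user_property(value: str) -> bool:
    """Return True if value fits the limit when measured in UTF-8 bytes.

    Whether the real bound is inclusive is an assumption of this sketch.
    """
    return len(value.encode("utf-8")) <= MAX_USER_PROP_BYTES


ascii_value = "a" * 8192          # 8192 characters, 8192 bytes
accented_value = "\u00e9" * 8192  # 8192 characters, 16384 bytes in UTF-8
```

Both strings have the same character count, but only the ASCII one stays within a byte-based limit.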

Authored by: Eli Rosenthal <eli.rosenthal@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7257
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3bc7169
Closes #5608

George Melikov [Tue, 17 Jan 2017 23:22:56 +0000 (02:22 +0300)]
OpenZFS 7235 - remove unused func dsl_dataset_set_blkptr

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7235
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/bd56f80
Closes #5604

George Melikov [Tue, 17 Jan 2017 23:18:59 +0000 (02:18 +0300)]
OpenZFS 7256 - low probability race in zfs_get_data

Authored by: Andriy Gapon <andriy.gapon@clusterhq.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7256
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/6ed18a8
Closes #5601

George Melikov [Tue, 17 Jan 2017 22:52:17 +0000 (01:52 +0300)]
OpenZFS 7071 - lzc_snapshot does not fill in errlist on ENOENT

Authored by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7071
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/25f7d99
Closes #5597

George Melikov [Tue, 17 Jan 2017 22:49:24 +0000 (01:49 +0300)]
OpenZFS 7082 - bptree_iterate() passes wrong args to zfs_dbgmsg()

Authored by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7082
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/10e67aa
Closes #5596

Brian Behlendorf [Tue, 17 Jan 2017 22:46:28 +0000 (14:46 -0800)]
OpenZFS 6586 - Whitespace inconsistencies in the spa feature dependency arrays in zfeature_common.c

Porting Notes:
- Preserved 'static const spa_feature_t hole_birth_deps[]'.

Authored by: ilovezfs <ilovezfs@icloud.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6586
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/22b6687
Closes #5592

Brian Behlendorf [Tue, 17 Jan 2017 22:45:02 +0000 (14:45 -0800)]
OpenZFS 6550 - cmd/zfs: cleanup gcc warnings

Porting Notes:
- Many of the fixes proposed by this patch were already applied.
In the cases where a different but equivalent fix was made the
code was updated with the OpenZFS version to minimize differences.

Authored by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6550
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c16bcc4
Closes #5591

Brian Behlendorf [Tue, 17 Jan 2017 22:42:56 +0000 (14:42 -0800)]
OpenZFS 6551 - cmd/zpool: cleanup gcc warnings

Porting Notes:
- Many of the fixes proposed by this patch were already applied.
In the cases where a different but equivalent fix was made the
code was updated with the OpenZFS version to minimize differences.
- The zpool_get_vdev_by_name() function was previously removed
by commit 235db0a.

Authored by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Haakan T Johansson <f96hajo@chalmers.se>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6551
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b327cd3
Closes #5590

clefru [Fri, 13 Jan 2017 23:57:34 +0000 (00:57 +0100)]
Don't hardcode perl path but use env instead

Also replace the deprecated "-w" argument with "use warnings;", as
otherwise env would invoke a command called "perl -w".
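
The reason env would look for a command called "perl -w" is that, on Linux, everything after the interpreter path on a "#!" line is passed as a single argument. The following Python sketch models that behaviour; the function name shebang_argv and the script name script.pl are hypothetical, and other operating systems may split the line differently.

```python
def shebang_argv(shebang_line: str, script_path: str) -> list:
    """Model how Linux builds argv from a '#!' line.

    The kernel splits off the interpreter path and passes the rest of
    the line, if any, as ONE argument; it does not split on spaces.
    """
    rest = shebang_line[2:].strip()
    interpreter, _, arg = rest.partition(" ")
    argv = [interpreter]
    if arg.strip():
        argv.append(arg.strip())
    argv.append(script_path)
    return argv


# With "-w" on the shebang line, env receives "perl -w" as one word.
old_style = shebang_argv("#!/usr/bin/env perl -w", "script.pl")
# Without it, env receives just "perl"; warnings come from "use warnings;".
new_style = shebang_argv("#!/usr/bin/env perl", "script.pl")
```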

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Clemens Fruhwirth <clemens@endorphin.org>
Closes #5552

LOLi [Fri, 13 Jan 2017 23:47:34 +0000 (00:47 +0100)]
Fix unallocated object detection for large_dnode datasets

Fix dmu_object_next() to correctly handle unallocated objects on
large_dnode datasets.

We implement this by scanning the dnode block until we find the correct
offset to be used in dnode_next_offset(). This is necessary because we
can't assume *objectp is a hole even if dmu_object_info() returns
ENOENT.

This fixes a couple of issues with zfs receive on large_dnode datasets.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #5027
Closes #5532

Brian Behlendorf [Fri, 13 Jan 2017 23:33:14 +0000 (15:33 -0800)]
OpenZFS 7603 - xuio_stat_wbuf_* should be declared (void)

Porting Notes:
- include/sys/dmu.h prototypes were already updated in 0bc8fd7

Authored by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7603
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/99aa8b5
Closes #5586

Brian Behlendorf [Fri, 13 Jan 2017 23:29:32 +0000 (15:29 -0800)]
OpenZFS 7181 - race between zfs_mount and zfs_ioc_rollback

Authored by: Andriy Gapon <andriy.gapon@clusterhq.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7181
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/90f2c09
Closes #5585

Jörg Thalheim [Fri, 13 Jan 2017 23:18:34 +0000 (00:18 +0100)]
module/Makefile.in: use relative cp

Assuming /bin/cp causes problems on systems where cp is
not in /bin such as NixOS.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Joerg Thalheim <joerg@higgsboson.tk>
Closes #5548

bzzz77 [Fri, 13 Jan 2017 22:58:41 +0000 (01:58 +0300)]
Add *_by-dnode routines

Add *_by_dnode() routines for accessing objects given their
dnode_t *.  This is more efficient than accessing the object by
(objset_t *, uint64_t object).  This change converts some but
not all of the existing consumers.  As performance-sensitive
code paths are discovered they should be converted to use
these routines.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Closes #5534
Issue #4802

Don Brady [Fri, 13 Jan 2017 21:50:22 +0000 (14:50 -0700)]
OpenZFS 7743 - per-vdev-zaps init path for upgrade

Authored by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Joe Stein <jas14@cs.brown.edu>
Ported-by: Don Brady <don.brady@intel.com>
When loading a pool that had been created before the existence of
per-vdev zaps, on a system that knows about per-vdev zaps, the
per-vdev zaps will not be allocated and initialized.

This appears to be because the logic that would have done so, in
spa_sync_config_object(), is not reached under normal operation. It is
only reached if spa_config_dirty_list is non-empty.

The fix is to add another `AVZ_ACTION_` enum that will allow this code
to be reached when we detect that we're loading an old pool, even when
there are no dirty configs.

OpenZFS-issue: https://www.illumos.org/issues/7743
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e2d29d0
Closes #5582

George Melikov [Fri, 13 Jan 2017 21:31:29 +0000 (00:31 +0300)]
OpenZFS 7276 - zfs(1m) manpage could better describe space properties

Authored by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7276
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d750135
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/29c6739
Closes #5549

LOLi [Fri, 13 Jan 2017 21:24:17 +0000 (22:24 +0100)]
Fix zfs-share systemd unit file

Use the system /bin directory rather than the package install
@bindir@.  This allows --prefix=/usr/local to work as intended.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #5559

George Melikov [Thu, 12 Jan 2017 19:58:04 +0000 (22:58 +0300)]
OpenZFS 6603 - zfeature_register() should verify ZFEATURE_FLAG_PER_DATASET implies SPA_FEATURE_EXTENSIBLE_DATASET

Authored by: ilovezfs <ilovezfs@icloud.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/6603
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0803e91
Closes #5573

Don Brady [Thu, 12 Jan 2017 19:52:56 +0000 (12:52 -0700)]
OpenZFS 7303 - dynamic metaslab selection

This change introduces a new weighting algorithm to improve
metaslab selection. The new weighting algorithm relies on the
SPACEMAP_HISTOGRAM feature. As a result, the metaslab weight
now encodes the type of weighting algorithm used (size-based
vs segment-based).

Porting Notes: The metaslab allocation tracing code is conditionally
removed on linux (dependent on mdb debugger).

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Don Brady <don.brady@intel.com>
OpenZFS-issue: https://www.illumos.org/issues/7303
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d5190931bd
Closes #5404

George Melikov [Thu, 12 Jan 2017 19:25:27 +0000 (22:25 +0300)]
OpenZFS 6637 - replacing "dontclose" with "should_close"

Authored by: David Schwartz <dschwartz783@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
I find that this is a lot easier to read. "not don't close" is somewhat tough on the eyes.

OpenZFS-issue: https://www.illumos.org/issues/6637
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d189620
Closes #5572

George Melikov [Thu, 12 Jan 2017 17:42:11 +0000 (20:42 +0300)]
OpenZFS 6328 - Fix cstyle errors in zfs codebase

Authored by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/6328
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/9a686fb
Closes #5579

George Melikov [Tue, 3 Jan 2017 21:01:48 +0000 (00:01 +0300)]
Further work on Github usability (issue templates)

Make issue template more obvious about importance of
searching the issue tracker first, and wrap logs appropriately.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5542

Johnny Stenback [Tue, 3 Jan 2017 18:29:23 +0000 (10:29 -0800)]
Fix TypeError: unorderable types: str() > int() in arc_summary.py

Running arc_summary.py with an L2ARC cache device present produces
the following error:

  Traceback (most recent call last):
    File "/usr/bin/arc_summary.py", line 1148, in <module>
      main()
    File "/usr/bin/arc_summary.py", line 1144, in main
      page(Kstat)
    File "/usr/bin/arc_summary.py", line 724, in _l2arc_summary
      arc["l2_arc_evicts"]["reading"] > 0:
  TypeError: unorderable types: str() > int()

This is due to arc["l2_arc_evicts"]['lock_retries'] and
arc["l2_arc_evicts"]["reading"] both being strings, returned
from fHits() earlier. Rather than adding them up and checking
if the result is > 0, this checks if either string is != '0'.
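
The failure and the fix can be reproduced in isolation with a small Python 3 sketch. The function names and the sample dictionary are hypothetical; only the key names and the '0' comparison come from the commit message, and the assumption is that fHits() formats a zero count as the string '0'.

```python
def evicts_present_buggy(evicts: dict) -> bool:
    # Mirrors the failing comparison: the values are strings, so in
    # Python 3 comparing their sum against an int raises TypeError.
    return evicts["lock_retries"] + evicts["reading"] > 0


def evicts_present_fixed(evicts: dict) -> bool:
    # Compare each formatted string against '0' instead.
    return evicts["lock_retries"] != "0" or evicts["reading"] != "0"


sample = {"lock_retries": "0", "reading": "3"}

try:
    evicts_present_buggy(sample)
    raised = False
except TypeError:
    raised = True
```

Python 2 allowed ordering comparisons between str and int, which is why the original code only fails under Python 3.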

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5538

George Melikov [Tue, 3 Jan 2017 18:03:05 +0000 (21:03 +0300)]
OpenZFS 7259 - DS_FIELD_LARGE_BLOCKS is unused

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of
this patch: 241b541 Illumos 5959 - clean up per-dataset feature count code.

This patch simply removes this macro from dsl_dataset.h.

OpenZFS-issue: https://www.illumos.org/issues/7259
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/faa8036
Closes #5544

ka7 [Tue, 3 Jan 2017 17:31:18 +0000 (18:31 +0100)]
Fix spelling

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Haakan T Johansson <f96hajo@chalmers.se>
Closes #5547
Closes #5543

Tim Chase [Fri, 30 Dec 2016 22:03:59 +0000 (16:03 -0600)]
4.10 compat - BIO flag changes and others

[bio] The req_op enum was changed to req_opf.  Update the "Linux 4.8 API"
autotools checks to use an int to determine whether the various REQ_OP
values are defined.  This should work properly on kernels >= 4.8.

[bio] bio_set_op_attrs() is now an inline function and can't be detected
with #ifdef.  Add a configure check to determine whether bio_set_op_attrs()
is defined.  Move the local definition of it from vdev_disk.c to
blkdev_compat.h for consistency with other related compatibility shims.

[bio] The read/write flags and their modifiers, including WRITE_FLUSH,
WRITE_FUA and WRITE_FLUSH_FUA have been removed from fs.h.  Add the new
bio_set_flush() compatibility wrapper to replace VDEV_WRITE_FLUSH_FUA
and set the flags appropriately for each supported kernel version.

[vfs] The generic_readlink() function has been made static.  If .readlink
in inode_operations is NULL, generic_readlink() is used.

[zol typo] Completely unrelated to 4.10 compat, fix a typo in the check
for REQ_OP_SECURE_ERASE so that the proper macro is defined:

    s/HAVE_REQ_OP_SECURE_DISCARD/HAVE_REQ_OP_SECURE_ERASE/

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5499

LOLi [Thu, 22 Dec 2016 18:39:00 +0000 (19:39 +0100)]
Don't persist temporary pool name on devices

Fix a regression accidentally introduced by e0ab3ab.

Additionally, add a new script zpool_import_014_pos.ksh to
the ZFS test suite to exercise 'zpool import -t' functionality.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #5466
Closes #5515

GeLiXin [Wed, 21 Dec 2016 19:27:24 +0000 (03:27 +0800)]
Fix coverity defects: CID 147587

CID 147587: Out-of-bounds read

Future changes may cause an array overrun of 4096 bytes at byte
offset 4096 by dereferencing pointer dstp.  Adding this additional
check ensures correctness.

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5297

bunder2015 [Wed, 21 Dec 2016 19:06:02 +0000 (14:06 -0500)]
Remove extra + from zfs man page

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #5508

Chunwei Chen [Wed, 21 Dec 2016 18:47:15 +0000 (10:47 -0800)]
Use a dedicated taskq for vdev_file

The introduction of parallel zvol prefetch causes a deadlock when using
vdev_file.

spa_async->(spa_namespace_lock)->txg_wait_synced->(wait for txg_sync)
txg_sync->zio_wait->(wait for vdev_file_io_fsync on system_taskq)
zvol_prefetch_minors_impl (on system_taskq)->spa_open_common->(wait for spa_namespace_lock)

We fix this by using a dedicated taskq for vdev_file.  This same change
was originally made in commit bc25c93 but reverted in commit aa9af22
when dynamic taskqs were added.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes #5506
Closes #5495

LOLi [Wed, 21 Dec 2016 02:46:59 +0000 (03:46 +0100)]
Fix dsl_props_set_sync_impl to work with nested nvlist

When iterating over the input nvlist in dsl_props_set_sync_impl(), we don't
preserve the nvpair name before looking up ZPROP_VALUE, so when we later go to
process it, nvpair_name() is always "value" and not the actual property name.
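
The mistake can be modeled in Python with nested dictionaries standing in for nested nvlists. This is an illustrative sketch, not the kernel code: the function names and the sample property list are hypothetical, and only the "value" key (ZPROP_VALUE) is taken from the description above.

```python
ZPROP_VALUE = "value"


def applied_names_buggy(props: dict) -> list:
    names = []
    for name, value in props.items():
        pair = (name, value)
        if isinstance(value, dict):
            # Descend to the inner pair; its name is "value", so the
            # outer property name is lost from this point on.
            pair = (ZPROP_VALUE, value[ZPROP_VALUE])
        names.append(pair[0])
    return names


def applied_names_fixed(props: dict) -> list:
    names = []
    for name, value in props.items():
        if isinstance(value, dict):
            value = value[ZPROP_VALUE]  # look up the value, keep `name`
        names.append(name)
    return names


received = {
    "compression": {"value": "lz4", "source": "received"},
    "quota": 1024,
}
buggy = applied_names_buggy(received)
fixed = applied_names_fixed(received)
```

In the buggy variant every nested property is processed under the name "value", which matches the symptom described in the commit message.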

This fixes a couple of bugs in zfs_ioc_recv():
* Received properties were not restored correctly when failing to receive an
incremental send stream
* Received properties were not completely replaced by the new ones when
successfully receiving an incremental send stream

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #5497

Brian Behlendorf [Mon, 19 Dec 2016 21:01:10 +0000 (13:01 -0800)]
Fix file attributes

This branch contains the following fixes/improvements.

* Fix setting i_flags
* Fix wrong operator in xvattr.h
* Fix fchange macro in zpl_ioctl_setflags()
* Added configure check to use inode_set_flags()
* Added a test case for chattr for better test coverage

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5486
Closes #5470
Closes #5469

cao [Mon, 19 Dec 2016 18:26:15 +0000 (02:26 +0800)]
Fix coverity defects: CID 155008

CID 155008: Resource leaks (RESOURCE_LEAK)

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5500

Chunwei Chen [Mon, 19 Dec 2016 17:46:29 +0000 (09:46 -0800)]
Fix zmo leak when zfs_sb_create fails

zfs_sb_create normally takes ownership of zmo, and it will be freed in
zfs_sb_free. However, when zfs_sb_create fails we need to explicitly free it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5490
Closes #5496

Tony Hutter [Sat, 17 Dec 2016 00:10:45 +0000 (16:10 -0800)]
Don't run 'zpool iostat -c CMD' command on all vdevs, if vdevs specified

zpool iostat allows you to specify only certain vdevs to display.
Currently, if you run 'zpool iostat -c CMD vdev1 vdev2 ...'
on specific vdevs, it will actually run the command on *all* vdevs,
and just display the results for the vdevs you specify.  This patch
corrects the behavior to only run the command on the specified vdevs,
and also enables the zpool_iostat_005_pos.ksh tests.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #5443

Chunwei Chen [Fri, 16 Dec 2016 23:15:48 +0000 (15:15 -0800)]
Add test for chattr

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

Chunwei Chen [Fri, 16 Dec 2016 21:54:51 +0000 (13:54 -0800)]
Use inode_set_flags when available

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

Chunwei Chen [Fri, 16 Dec 2016 20:41:56 +0000 (12:41 -0800)]
Fix fchange in zpl_ioctl_setflags

The fchange macro in zpl_ioctl_setflags is meant to detect a flag change.
However, it was incorrect and would always fail to detect a change from set
to unset, allowing users without CAP_LINUX_IMMUTABLE to unset flags.
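
A change check that is symmetric in both directions can be sketched in Python with an XOR. This is not the actual fchange macro: the function names are hypothetical and the bit value used for IMMUTABLE is an illustrative assumption.

```python
IMMUTABLE = 0x10  # illustrative bit value, an assumption of this sketch


def flag_changed(old_flags: int, new_flags: int, bit: int) -> bool:
    """Return True if `bit` differs between old_flags and new_flags."""
    return bool((old_flags ^ new_flags) & bit)


def may_set_flags(old_flags, new_flags, protected_bit, has_capability):
    # Any change to a protected bit, in either direction, needs the
    # capability; unrelated changes are allowed.
    if flag_changed(old_flags, new_flags, protected_bit):
        return has_capability
    return True
```

Because XOR yields a set bit for both set-to-unset and unset-to-set transitions, an unprivileged attempt to clear the protected flag is rejected just like an attempt to set it.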

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

cao [Fri, 16 Dec 2016 17:11:17 +0000 (01:11 +0800)]
Fix coverity defects: CID 147534

CID 147534: Negative array index read

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5467

Gvozden Neskovic [Fri, 16 Dec 2016 01:31:33 +0000 (02:31 +0100)]
ABD: Adapt avx512bw raidz assembly

Adapt avx512bw implementation for use with abd buffers. Mul2 implementation
is rewritten to take advantage of the BW instruction set.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Romain Dolbeau <romain.dolbeau@atos.net>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5477

Chunwei Chen [Wed, 14 Dec 2016 22:53:56 +0000 (14:53 -0800)]
Fix wrong operator in xvattr.h

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

Chunwei Chen [Wed, 14 Dec 2016 22:18:53 +0000 (14:18 -0800)]
Fix i_flags issue caused by 64c688d

Fix zfs_xvattr_set to set S_IMMUTABLE and S_APPEND flags correctly.

Reinstate zfs_set_inode_flags and use it in zfs_xvattr_set and also when
setting up the inode in zfs_znode_alloc and zfs_rezget.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

Chunwei Chen [Wed, 14 Dec 2016 17:41:39 +0000 (09:41 -0800)]
Add ida_destroy in zvol_fini to fix memleak

Users of ida need to call ida_destroy after using it. Otherwise
ida->free_bitmap and/or other internal allocations may leak.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5484

Brian Behlendorf [Wed, 14 Dec 2016 17:36:14 +0000 (09:36 -0800)]
Skip xfstests on Ubuntu 16.04 and CentOS 7

The ZFS-enabled version of xfstests fails to build cleanly on
Ubuntu 16.04 and CentOS 7.  This issue should be resolved by
rebasing the ZFS patches against the latest xfstests and pushing
those patches upstream.  This would allow us to use an unmodified
xfstests.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5481
Closes #5482

Brian Behlendorf [Wed, 14 Dec 2016 17:33:07 +0000 (09:33 -0800)]
Skip slow tests when kmemleak is enabled

When running the ZFS Test Suite with a kmemleak enabled kernel
the following test cases run far slower than usual and may hit
their timeout threshold.  Skip the following test cases.

Test: cli_root/zfs_get/zfs_get_009_pos (run as root) [55:43]
Test: cli_root/zpool_clear/zpool_clear_001_pos (run as root) [11:32]
Test: cli_root/zpool_create/zpool_create_024_pos (run as root) [11:01]
Test: features/async_destroy/async_destroy_001_pos (run as root) [41:15]
Test: inheritance/inherit_001_pos (run as root) [09:08]

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5479
Closes #5480

bunder2015 [Tue, 13 Dec 2016 22:21:02 +0000 (17:21 -0500)]
Fix typos in dbuf.c

This removes two large whitespaces in "modinfo zfs" as well as correcting
a couple typos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #5475

Brian Behlendorf [Mon, 12 Dec 2016 18:46:26 +0000 (10:46 -0800)]
Use cstyle -cpP in `make cstyle` check

Enable picky cstyle checks and resolve the new warnings.  The vast
majority of the changes needed were to handle minor issues with
whitespace formatting.  This patch contains no functional changes.

Non-whitespace changes are as follows:

* 8 times ; to { } in for/while loop
* fix missing ; in cmd/zed/agents/zfs_diagnosis.c
* comment (confim -> confirm)
* change endline , to ; in cmd/zpool/zpool_main.c
* a number of /* BEGIN CSTYLED */ /* END CSTYLED */ blocks
* /* CSTYLED */ markers
* change == 0 to !
* ulong to unsigned long in module/zfs/dsl_scan.c
* rearrangement of module_param lines in module/zfs/metaslab.c
* add { } block around statement after for_each_online_node

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Håkan Johansson <f96hajo@chalmers.se>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5465

8 years agoAdd CONTRIBUTING information and templates
George Melikov [Fri, 9 Dec 2016 19:48:12 +0000 (20:48 +0100)]
Add CONTRIBUTING information and templates

Guidelines for developers and users describing how they can
participate in the project.

Reviewed-by: Manuel Mendez <mmendez534@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #672
Closes #4776
Closes #5361

8 years agoFix coverity defects: CID 147475
liaoyuxiangqin [Fri, 9 Dec 2016 18:59:36 +0000 (02:59 +0800)]
Fix coverity defects: CID 147475

CID 147475: Logically dead code (DEADCODE)

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes #5421

8 years agoDon't count '@' for dataset namelen if not a snapshot
Chunwei Chen [Fri, 9 Dec 2016 18:52:08 +0000 (10:52 -0800)]
Don't count '@' for dataset namelen if not a snapshot

Don't count '@' for dataset namelen if not a snapshot.  This
fixes a case where a pool becomes unimportable when the dataset
namelen is 255.

Add test file for zfs create name length 255.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5432
Closes #5456

8 years agoFix coverity defects: CID 154617
luozhengzheng [Thu, 8 Dec 2016 21:48:09 +0000 (05:48 +0800)]
Fix coverity defects: CID 154617

CID 154617: Memory - illegal accesses (UNINIT)

The value here just needs to be initialized to make Coverity happy.
When dsize == 0, the value of daiter.iter_mapaddr is irrelevant. That
address won't be accessed, it's only used for some arithmetic. dsize
can be zero either if dabd is null, or if code column is longer than the
current data column.

Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5437

8 years agoSpeed up zvol import and export speed
Brian Behlendorf [Thu, 8 Dec 2016 21:05:02 +0000 (14:05 -0700)]
Speed up zvol import and export speed

Speed up import and export by:

* Add system delay taskq
* Parallel prefetch zvol dnodes during zvol_create_minors
* Parallel zvol_free during zvol_remove_minors
* Reduce list linear search using ida and hash

Reviewed-by: Boris Protopopov <boris.protopopov@actifio.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5433

8 years agoRevert "Disable zio_dva_throttle_enabled by default"
Brian Behlendorf [Thu, 8 Dec 2016 20:57:42 +0000 (13:57 -0700)]
Revert "Disable zio_dva_throttle_enabled by default"

Enable zio_dva_throttle_enabled=1 by default. Subsequent
testing has been unable to reproduce the suspected regression.

Tested-by: kernelOfTruth <kerneloftruth@gmail.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reverts #5335
Closes #5289
Closes #5457

8 years agoCache ddt_get_dedup_dspace() value if there was no ddt changes
Gvozden Neskovic [Fri, 2 Dec 2016 23:59:35 +0000 (00:59 +0100)]
Cache ddt_get_dedup_dspace() value if there was no ddt changes

Save and reuse ddt dspace calculation when there have been no ddt changes.
This avoids unnecessary traversal of 168KiB of ddt histograms.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5425

8 years agoRefactor txg history kstat
Brian Behlendorf [Fri, 2 Dec 2016 23:57:49 +0000 (16:57 -0700)]
Refactor txg history kstat

It was observed that even when the txg history is disabled by
setting `zfs_txg_history=0` the txg_sync thread still fetches
the vdev stats unnecessarily.

This patch refactors the code such that vdev_get_stats() is no
longer called when `zfs_txg_history=0`.  And it further reduces
the differences between upstream and the ZoL txg_sync_thread()
function.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5412

8 years agozvol_remove_minors do parallel zvol_free
Chunwei Chen [Wed, 30 Nov 2016 21:56:50 +0000 (13:56 -0800)]
zvol_remove_minors do parallel zvol_free

On some kernel versions, blk_cleanup_queue and put_disk will wait for more than
10ms. So a pool with a lot of zvols will easily wait for more than 1 min if we
do zvol_free sequentially.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Requires-spl: refs/pull/588/head

8 years agozpool_create_minors parallel prefetch
Chunwei Chen [Wed, 30 Nov 2016 21:56:50 +0000 (13:56 -0800)]
zpool_create_minors parallel prefetch

Prefetch all zvol dnodes in parallel before actually creating each individual
minor.  This will greatly reduce the import time when there are a lot of zvols
and the disk is slow.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

8 years agoEnable mountpoint_003_pos
ChaoyuZhang [Fri, 2 Dec 2016 18:20:57 +0000 (02:20 +0800)]
Enable mountpoint_003_pos

Update the test case to correctly interpret how Linux reports
the mount options.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Closes #5410

8 years agoSkip zpool_scrub_004_pos on 32-bit systems
Brian Behlendorf [Fri, 2 Dec 2016 17:10:23 +0000 (10:10 -0700)]
Skip zpool_scrub_004_pos on 32-bit systems

The zpool_scrub_004_pos test case currently fails when testing on
a 32-bit system.  Conditionally skip this test case on 32-bit
systems until the root cause is identified and resolved.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5444
Closes #5445

8 years agoOpenZFS 7143 - dbuf_read() creates unnecessary zio_root() for bonus buf
Brian Behlendorf [Thu, 1 Dec 2016 23:50:11 +0000 (16:50 -0700)]
OpenZFS 7143 - dbuf_read() creates unnecessary zio_root() for bonus buf

dbuf_read() creates a zio_root() to track and wait for all the
zio's that may happen as part of this call. However, if the blkptr_t
for this buffer is NULL or a hole, we will not create any more zio's,
so this zio_root() is unnecessary. This is always the case when calling
dbuf_read() on a bonus buffer, because it has no blkptr (it's part of
the containing dnode). For workloads that read a lot of bonus buffers
(e.g. file creation and removal), creating and destroying these
unnecessary zio's can decrease performance by around 3%.

The fix is to only create/destroy the zio_root() in dbuf_read() if
the blkptr is not NULL and not a hole.

Changes sponsored by Intel Corp.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs/openzfs#137
Closes #4803
Closes #5382

8 years agoFix incorrect operator in abd_alloc_sametype()
luozhengzheng [Thu, 1 Dec 2016 23:45:16 +0000 (07:45 +0800)]
Fix incorrect operator in abd_alloc_sametype()

This should be & and not | so is_metadata is set correctly.
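
To illustrate why the operator matters, here is a minimal userspace sketch.
The flag names and values are hypothetical and are not taken from the ABD
headers; only the bitwise logic is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define	EX_FLAG_LINEAR	(1U << 0)
#define	EX_FLAG_META	(1U << 1)

/* Correct: AND masks out every bit except EX_FLAG_META. */
static bool
ex_is_metadata(uint32_t flags)
{
	return ((flags & EX_FLAG_META) != 0);
}

/* Buggy: OR with a non-zero constant is never zero, so this is always true. */
static bool
ex_is_metadata_buggy(uint32_t flags)
{
	return ((flags | EX_FLAG_META) != 0);
}
```

With the OR form every buffer would be reported as metadata, regardless of
how it was allocated.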

Reviewed-by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5438

8 years agoRemove unused sa_update_from_cb()
cao [Thu, 1 Dec 2016 23:39:06 +0000 (07:39 +0800)]
Remove unused sa_update_from_cb()

It looks like this was functionality which was added in the
original SA implementation and then never needed.  It can
be safely removed now and easily added back if we find a
use for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5440

8 years agoCompile zio.h and zio_impl.h mutual include
cao [Thu, 1 Dec 2016 23:36:25 +0000 (07:36 +0800)]
Compile zio.h and zio_impl.h mutual include

zio.h includes zio_impl.h but zio_impl.h also includes zio.h, so the
header files include each other.  Get rid of the zio_impl.h include
in zio.h and update zio_inject.c to include zio.h instead of zio_impl.h.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5439

8 years agozvol: reduce linear list search
Chunwei Chen [Wed, 30 Nov 2016 21:56:50 +0000 (13:56 -0800)]
zvol: reduce linear list search

Use the kernel ida to generate minor numbers, and use a hash table to find a
zvol by name.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

8 years agoUse system_delay_taskq for long delay tasks
Chunwei Chen [Wed, 30 Nov 2016 21:56:50 +0000 (13:56 -0800)]
Use system_delay_taskq for long delay tasks

Use it for spa_deadman, zpl_posix_acl_free, snapentry_expire.
This frees system_taskq from the above long delay tasks, and allows us to do
taskq_wait_outstanding on system_taskq without being blocked forever, making
system_taskq more generic and useful.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

8 years agoDo not force VDEV_NAME_TYPE_ID in max_width()
Håkan Johansson [Thu, 1 Dec 2016 00:46:16 +0000 (01:46 +0100)]
Do not force VDEV_NAME_TYPE_ID in max_width()

Do not force VDEV_NAME_TYPE_ID in max_width(), instead add it
in the relevant calls to max_width().

The first location of max_width() where VDEV_NAME_TYPE_ID is
now added in show_import() is followed by print_import_config() and
print_logs().  Both of these print child vdev names that have been
retrieved with VDEV_NAME_TYPE_ID explicitly added.

The second location is in status_callback().  This is followed by
print_status_config(), print_logs(), print_l2cache(), and
print_spares(). For l2cache and spares it should not matter as there
are no mirror-X or raidz-X involved.  print_status_config() as above
retrieves the name using explicit VDEV_NAME_TYPE_ID before
calling itself to print children.

The call of max_width() in get_namewidth() is not changed, as this is
used by zpool_do_iostat(), followed by print_iostat(), which does not
add VDEV_NAME_TYPE_ID.

Overall, we should consider adding VDEV_NAME_TYPE_ID to the
relevant name_flags / cb_name_flags fields, and remove the explicit
adding in called routines.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Haakan T Johansson <f96hajo@chalmers.se>
Closes #5401

8 years agoConvert zio_buf_alloc() consumers
Brian Behlendorf [Wed, 30 Nov 2016 23:18:20 +0000 (16:18 -0700)]
Convert zio_buf_alloc() consumers

In multiple cases zio_buf_alloc() was used instead of kmem_alloc()
or vmem_alloc().  This was often done because the allocations
could be large and it was easy to use zio_buf_alloc() for them.

But this isn't ideal for allocations which are small or short
lived.  In these cases it is better to use kmem_alloc() or
vmem_alloc().  If possible we want to avoid the case where
we have slabs allocated for kmem caches which are rarely used.

Note for small allocations vmem_alloc() will be internally
converted to kmem_alloc().  Therefore as long as large
allocations are infrequent and short lived the penalty for
using vmem_alloc() is small.

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5409

8 years agoIntroduce ARC Buffer Data (ABD)
Brian Behlendorf [Wed, 30 Nov 2016 21:48:16 +0000 (14:48 -0700)]
Introduce ARC Buffer Data (ABD)

ZFS currently uses ARC buffers which are backed by virtual memory.
While functional, there are some major problems with this approach
which can be observed on all OpenZFS platforms.  ABD was designed
to address these issues and includes contributions from OpenZFS
developers from multiple platforms.

While all OpenZFS platforms will benefit from ABD this functionality
is critical for Linux.  Unlike the other OpenZFS platforms the Linux
kernel discourages extensive use of virtual memory.  The provided
interfaces are not optimized for frequent allocations from the virtual
address space.  To maintain good performance a kmem cache is
used which contains relatively long lived slabs backed by virtual
memory.  The downside to the approach is that those slabs can
become highly fragmented resulting in an inefficient use of memory.

Another issue is that on 32-bit systems the available virtual
address space in the kernel is only a small fraction of total
system memory.  This means the ARC size is highly constrained
which hurts performance and makes allocating memory difficult
and OOMs more likely.

ABD is designed to address these issues by using scatter lists
of pages for data buffers.  This removes the need for slabs
which resolves the fragmentation issue.  It also allows high
memory pages to be allocated which alleviates the virtual
address space pressure on 32-bit systems.

For metadata buffers, which are small, linear ABDs are allocated
from the slab.  This is preferable because there are many places
in the code which expect to be able to read from a given offset
in the buffer.  Using linear ABDs means none of that code needs
to be modified.  The majority of these buffers are allocated with
kmalloc so there's minimal impact on the virtual address space.

Tested-by: Kash Pande <kash@tripleback.net>
Tested-by: kernelOfTruth <kerneloftruth@gmail.com>
Tested-by: RageLtMan <rageltman@sempervictus>
Tested-by: DHE <git@dehacked.net>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: David Quigley <david.quigley@intel.com>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3441
Closes #5135

8 years agoEnable ro_props_001_pos
ChaoyuZhang [Wed, 30 Nov 2016 18:27:04 +0000 (02:27 +0800)]
Enable ro_props_001_pos

This script was disabled as the avail/used space changed slightly.
Add sync_pool() and a short delay after snapshots are created to
ensure everything in flight has been written.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Closes #5201
Closes #5419

8 years agoFix coverity defects: CID 154591
luozhengzheng [Wed, 30 Nov 2016 17:48:01 +0000 (01:48 +0800)]
Fix coverity defects: CID 154591

CID 154591: Incorrect expression (SIZEOF_MISMATCH)

Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5435

8 years agoABD optimized page allocation code
Chunwei Chen [Wed, 26 Oct 2016 04:32:23 +0000 (00:32 -0400)]
ABD optimized page allocation code

* Convert ABD to use the Linux Kernel scatterlist implementation
  instead of the hand rolled one from illumos.

* Scatter ABDs are preferentially populated with higher order
  compound pages from a single zone.  Allocation size is
  progressively decreased until it can be satisfied without
  performing reclaim or compaction.

* An alternate page allocator is provided for kernels older
  than 3.6 and for CONFIG_HIGHMEM systems.  This allocator
  is designed as a fallback for maximum compatibility.

* Extended abdstats to provide visibility into the allocator.

* Add cached value for PAGESIZE in userspace.
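
The "progressively decreased" allocation strategy in the second item above
can be sketched as follows.  This is a userspace illustration only:
ex_try_alloc() is a hypothetical stand-in for a page allocator call and none
of these names come from the ABD code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a page allocator: succeeds only for
 * orders at or below max_ok_order and returns the size in bytes.
 */
static size_t
ex_try_alloc(unsigned int order, unsigned int max_ok_order, size_t page_size)
{
	if (order > max_ok_order)
		return (0);
	return (page_size << order);
}

/*
 * Start at the largest order and step down until a request succeeds.
 * Returns the order that was satisfied, or -1 if every order failed.
 */
static int
ex_alloc_best_order(unsigned int start_order, unsigned int max_ok_order,
    size_t page_size)
{
	for (int order = (int)start_order; order >= 0; order--) {
		if (ex_try_alloc((unsigned int)order, max_ok_order,
		    page_size) != 0)
			return (order);
	}
	return (-1);
}
```

In a real allocator each attempt would be made with flags that forbid reclaim
and compaction, so a failed high-order attempt stays cheap.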

Contributions-by:
Chunwei Chen <david.chen@osnexus.com>
Gvozden Neskovic <neskovic@gmail.com>
Jinshan Xiong <jinshan.xiong@intel.com>
Isaac Huang <he.huang@intel.com>
David Quigley <david.quigley@intel.com>
Brian Behlendorf <behlendorf1@llnl.gov>

8 years agoABD kmap to kmap_atomic
Chunwei Chen [Tue, 27 Sep 2016 21:30:02 +0000 (17:30 -0400)]
ABD kmap to kmap_atomic

Convert usage of kmap to kmap_atomic while correctly saving off
irq state.

8 years agoABD raidz NEON support
Romain Dolbeau [Tue, 22 Nov 2016 07:38:34 +0000 (08:38 +0100)]
ABD raidz NEON support

Port NEON implementation of RAID-Z functions to ABD.

Signed-off-by: Romain Dolbeau <romain.dolbeau@atos.net>

8 years agoABD raidz avx512f support
Gvozden Neskovic [Sun, 20 Nov 2016 05:01:31 +0000 (06:01 +0100)]
ABD raidz avx512f support

Implement shift-based multiplication for avx512f. Higher IPC over lookup-based
methods yields up to 40% better performance on the current hardware.
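
As a scalar illustration of what "shift based" means here, the sketch below
multiplies in GF(2^8) using shifts and XORs instead of lookup tables.  It
assumes the 0x11d reduction polynomial commonly used for this kind of parity
math; the function names are hypothetical and this is not the vectorized
code from the patch.

```c
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing with polynomial 0x11d (assumed). */
static uint8_t
ex_gf_mul2(uint8_t a)
{
	uint8_t carry = (a & 0x80) ? 0x1d : 0x00;

	return ((uint8_t)((a << 1) ^ carry));
}

/* General multiply built from repeated mul2 steps (shift-and-add). */
static uint8_t
ex_gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b != 0) {
		if (b & 1)
			r ^= a;
		a = ex_gf_mul2(a);
		b >>= 1;
	}
	return (r);
}
```

A SIMD version applies the same shift, mask and XOR steps to many bytes per
instruction, which is where the higher IPC comes from.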

Results on Xeon Phi(TM) CPU 7210:
implementation   gen_p           gen_pq          gen_pqr         rec_p           rec_q           rec_r           rec_pq          rec_pr          rec_qr          rec_pqr
original         142232671       24411492        12948205        283053705       22348167        4215911         9171609         2265548         2378370         1648495
scalar           295711162       49851491        33253815        293198109       88179448        61866752        27941684        25764416        17384442        12138153
sse2             410055998       199642658       117973654       406240463       152688682       121092250       84968180        79291076        47473657        20779719
ssse3            411641595       199669571       117937647       406211024       137638508       117050346       81263322        76120405        46281559        32696722
avx2             616485806       311515332       188595628       605455115       260602390       230554476       148198817       138800254       92273356        62937819
avx512f          832191523       408509425       253599522       810094481       404325734       317590971       218235687       197204920       133101937       94001219
fastest          avx512f         avx512f         avx512f         avx512f         avx512f         avx512f         avx512f         avx512f         avx512f         avx512f

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>

8 years agoABD Vectorized raidz
Gvozden Neskovic [Wed, 24 Aug 2016 13:51:33 +0000 (15:51 +0200)]
ABD Vectorized raidz

Enable vectorized raidz code on ABD buffers.  The avx512f, avx512bw,
neon and aarch64_neonx2 implementations are disabled in this commit.
With the exception of avx512bw these implementations are
updated for ABD in the subsequent commits.

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>

8 years agoABD changes for vectorized RAIDZ
Gvozden Neskovic [Wed, 24 Aug 2016 13:42:51 +0000 (15:42 +0200)]
ABD changes for vectorized RAIDZ

* userspace: aligned buffers. Minimum of 32B alignment is
  needed for AVX2. Kernel buffers are aligned 512B or more.
* add abd_get_offset_size() interface
* abd_iter_map(): fix calculation of iter_mapsize
* add abd_raidz_gen_iterate() and abd_raidz_rec_iterate()

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>

8 years agoABD page support to vdev_disk.c
Isaac Huang [Wed, 31 Aug 2016 06:26:43 +0000 (00:26 -0600)]
ABD page support to vdev_disk.c

Signed-off-by: Isaac Huang <he.huang@intel.com>

8 years agoDLPX-44812 integrate EP-220 large memory scalability
David Quigley [Fri, 22 Jul 2016 15:52:49 +0000 (11:52 -0400)]
DLPX-44812 integrate EP-220 large memory scalability

8 years agozstreamdump needs to initialize fletcher 4 support
Tim Chase [Tue, 29 Nov 2016 21:47:05 +0000 (15:47 -0600)]
zstreamdump needs to initialize fletcher 4 support

Otherwise, the checksum function pointer isn't initialized.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5411

8 years agoAdd -c to zpool iostat & status to run command
Tony Hutter [Tue, 29 Nov 2016 21:45:38 +0000 (13:45 -0800)]
Add -c to zpool iostat & status to run command

This patch adds a command (-c) option to zpool status and zpool iostat.  The
-c option allows you to run an arbitrary command on each vdev and display
the first line of output in zpool status/iostat.  The environment vars
VDEV_PATH and VDEV_UPATH are set to the vdev's path and "underlying path"
before running the command.  For device mapper, multipath, or partitioned
vdevs, VDEV_UPATH is the actual underlying /dev/sd* disk.  This can be useful
if the command you're running requires a /dev/sd* device.

The patch also uses /sys/block/<dev>/slaves/ to look up the underlying device
instead of using libdevmapper.  This not only removes the libdevmapper
requirement at build time, but also allows you to resolve device mapper
devices without being root.  This means that VDEV_UPATH gets set correctly
when running zpool status/iostat as an unprivileged user.
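
A minimal sketch of this kind of sysfs lookup is shown below.  It is not the
code from the patch: ex_first_slave() is a hypothetical helper that simply
returns the first entry found under /sys/block/<dev>/slaves.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the name of the first entry under /sys/block/<dev>/slaves into buf.
 * Returns 0 on success, -1 if the directory is missing or empty.
 */
static int
ex_first_slave(const char *dev, char *buf, size_t buflen)
{
	char path[256];
	struct dirent *de;
	DIR *dp;
	int rc = -1;

	if (snprintf(path, sizeof (path), "/sys/block/%s/slaves", dev) >=
	    (int)sizeof (path))
		return (-1);
	if ((dp = opendir(path)) == NULL)
		return (-1);
	while ((de = readdir(dp)) != NULL) {
		/* Skip the "." and ".." directory entries. */
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		(void) snprintf(buf, buflen, "%s", de->d_name);
		rc = 0;
		break;
	}
	(void) closedir(dp);
	return (rc);
}
```

Because sysfs is world-readable, a lookup like this needs no special
privileges, which is the property the paragraph above relies on.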

Example:

$ zpool status -c 'echo I am $VDEV_PATH, $VDEV_UPATH'

NAME        STATE     READ WRITE CKSUM
mypool      ONLINE       0     0     0
  mirror-0  ONLINE       0     0     0
    mpatha  ONLINE       0     0     0  I am /dev/mapper/mpatha, /dev/sdc
    sdb     ONLINE       0     0     0  I am /dev/sdb1, /dev/sdb

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #5368

8 years agoAllow zfs unshare <protocol> -a
LOLi [Tue, 29 Nov 2016 19:22:38 +0000 (20:22 +0100)]
Allow zfs unshare <protocol> -a

Allow `zfs unshare <protocol> -a` command to share or unshare all datasets
of a given protocol, nfs or smb.

Additionally, enable most of ZFS Test Suite zfs_share/zfs_unshare test cases.
To work around some Illumos-specific functionalities ($SHARE/$UNSHARE) some
function wrappers were added around them.

Finally, fix an issue in smb_is_share_active() that would leave SMB shares
exported when invoking 'zfs unshare -a'.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #3238
Closes #5367

8 years agoEnsure that perf regression tests cleanup properly
Giuseppe Di Natale [Tue, 29 Nov 2016 00:24:47 +0000 (16:24 -0800)]
Ensure that perf regression tests cleanup properly

Each test in the performance regression test suite
creates a pool and a dataset for use. Unfortunately,
these tests do not clean up the pool and dataset
correctly once they complete. Each test now kills
fio and iostat, destroys the dataset, and finally
destroys the pool. Each test also now traps the
SIGTERM signal to handle cases where test-runner
kills a test.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Requires-builders: all
Closes #5407

8 years agoEnable user_property_002_pos
ChaoyuZhang [Sat, 19 Nov 2016 00:25:06 +0000 (08:25 +0800)]
Enable user_property_002_pos

The user_property_002_pos passes as expected.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Closes #5406

8 years agoKernel 4.9 compat: file_operations->aio_fsync removal
DeHackEd [Tue, 15 Nov 2016 17:20:46 +0000 (12:20 -0500)]
Kernel 4.9 compat: file_operations->aio_fsync removal

Linux kernel commit 723c038475b78 removed this field.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #5393

8 years agoFix man page formatting in zfs-module-parameters
DeHackEd [Tue, 15 Nov 2016 01:03:57 +0000 (20:03 -0500)]
Fix man page formatting in zfs-module-parameters

Bold and Normal codes were mixed up in a few places resulting in
bad highlighting.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #5397

8 years agoRepair indent of zpool.8 man page
Håkan Johansson [Mon, 14 Nov 2016 17:47:49 +0000 (18:47 +0100)]
Repair indent of zpool.8 man page

Repair indent of zpool.8 man page, just before zpool labelclear
details.  Accidentally introduced by 193a37cb2 (git bisect).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Haakan T Johansson <f96hajo@chalmers.se>
Closes #5394

8 years agoFix 'zpool import' detection issue
Brian Behlendorf [Mon, 14 Nov 2016 17:40:18 +0000 (09:40 -0800)]
Fix 'zpool import' detection issue

Before adding the entry to the configuration verify that the
device can be opened exclusively.  This ensures that as long
as multipathd is running the underlying multipath devices, which
otherwise appear identical to their /dev/mapper counterpart,
are pruned from the configuration.

Failure to do so can result in the vdev appearing
as UNAVAIL when the vdev path provided to the kernel can't be
opened exclusively.

This check would normally be performed in zpool_open_func()
but placing it there would result in false positives because
it is called concurrently for many devices.
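
The exclusive-open check can be sketched as below.  This is an illustration
under stated assumptions, not the code from the patch: on Linux, opening a
block device with O_EXCL (without O_CREAT) fails with EBUSY while the device
is in use by the system, and ex_can_open_exclusive() is a hypothetical helper
name.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Return true if path can be opened with O_EXCL.  For a Linux block
 * device this fails with EBUSY when the device is already claimed,
 * for example by device mapper.
 */
static bool
ex_can_open_exclusive(const char *path)
{
	int fd = open(path, O_RDONLY | O_EXCL);

	if (fd < 0)
		return (false);
	(void) close(fd);
	return (true);
}
```

A path that fails this check would be left out of the configuration, so only
the /dev/mapper device remains as a candidate.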

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5387

8 years agoAdd a statechange notify zedlet
Don Brady [Thu, 10 Nov 2016 21:52:59 +0000 (14:52 -0700)]
Add a statechange notify zedlet

Now that ZED has internal fault diagnosis and the statechange event
is generated for faulted states, we can replace the io-notify and
checksum-notify zedlets with one based on statechange.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@intel.com>
Closes #5383

8 years agoFix coverity defects: CID 147503
luozhengzheng [Thu, 10 Nov 2016 16:50:32 +0000 (00:50 +0800)]
Fix coverity defects: CID 147503

CID 147503: Dereference after null check (FORWARD_NULL)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5326

8 years agoFix coverity defects: CID 147540, 147542
cao [Thu, 10 Nov 2016 01:35:26 +0000 (09:35 +0800)]
Fix coverity defects: CID 147540, 147542

CID 147540: unsigned_compare
- Cast nsec to an int32_t to properly detect the expected overflow.
CID 147542: unsigned_compare
- intval can never be less than ZIO_FAILURE_MODE_WAIT which is
  defined to be zero.  Remove this useless check.
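
The first item can be illustrated with a small sketch.  The helper name is
hypothetical; the point is that an unsigned value can never compare below
zero, so a wrapped subtraction is only visible after a cast to a signed type.
Converting an out-of-range value to int32_t is implementation-defined in C,
and this sketch assumes the usual two's complement behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Detect a wrapped (logically negative) unsigned value by viewing it
 * as a signed 32-bit integer.  Assumes two's complement conversion.
 */
static bool
ex_went_negative(uint32_t nsec)
{
	return ((int32_t)nsec < 0);
}
```

For example, subtracting 7 from 5 in uint32_t arithmetic wraps to a very
large value, which this check reports as negative.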

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5379

8 years agoFix ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE check
Gvozden Neskovic [Wed, 9 Nov 2016 21:53:13 +0000 (22:53 +0100)]
Fix ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE check

Pass `ACL_TYPE_ACCESS` for the type parameter of `set_cached_acl()` and
`forget_cached_acl()` to avoid removal of dead code after BUG() at
compile time. Tested on a 3.2.0 kernel.

Introduced in 3779913

Reviewed-by: Massimo Maggi <me@massimo-maggi.eu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5378

8 years agoExport symbol dmu_objset_userobjspace_upgradable
jxiong [Wed, 9 Nov 2016 21:51:12 +0000 (13:51 -0800)]
Export symbol dmu_objset_userobjspace_upgradable

It's used by Lustre to determine if the objset can be upgraded.
The inline version doesn't work because dmu_objset_is_snapshot()
is not exported.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Closes #5385

8 years agoLinux 3.14 compat: assign inode->set_acl
tuxoko [Wed, 9 Nov 2016 18:37:17 +0000 (10:37 -0800)]
Linux 3.14 compat: assign inode->set_acl

Linux 3.14 introduces inode->set_acl(). Normally, acl modification will come
from setxattr, which is handled by the acl xattr_handler, and we already
handle that well. However, nfsd will directly call inode->set_acl or
return an error if it doesn't exist.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Massimo Maggi <me@massimo-maggi.eu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5371
Closes #5375

8 years agoFix symlinks for {vdev_clear,statechange}-led.sh
Olaf Faaland [Wed, 9 Nov 2016 18:19:43 +0000 (10:19 -0800)]
Fix symlinks for {vdev_clear,statechange}-led.sh

These were named in the zed/Makefile.am as vdev_clear-blinkled.sh
and statechange-blinkled.sh causing bad symlinks to be created.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5384

8 years agoFix coverity defects: CID 147586
cao [Wed, 9 Nov 2016 01:33:23 +0000 (09:33 +0800)]
Fix coverity defects: CID 147586

CID 147586: function:allow_usage Type:out-of-bounds read

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5364

8 years agoFix coverity defects: CID 147629
cao [Wed, 9 Nov 2016 00:41:31 +0000 (08:41 +0800)]
Fix coverity defects: CID 147629

CID 147629: Type:Dereference before null check

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5376