granicus.if.org Git - zfs/log
8 years ago  Illumos 1644 add ZFS "clones" property
Matthew Ahrens [Wed, 11 May 2016 18:46:14 +0000 (13:46 -0500)]
Illumos 1644 add ZFS "clones" property

Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Approved by: Gordon Ross <gwr@nexenta.com>

References:
 https://www.illumos.org/issues/1644

Ported-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  Illumos 1502 Remove conversion cruft from manpages
Richard Laager [Wed, 11 May 2016 19:16:21 +0000 (14:16 -0500)]
Illumos 1502 Remove conversion cruft from manpages

Reviewed by: Alexander Eremin <alexander.eremin@nexenta.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed by: Garrett D'Amore <garrett.damore@gmail.com>

References:
 https://www.illumos.org/issues/1502

Ported-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Conflicts:
man/man8/zpool.8

8 years ago  zfs.8 & mount.zfs.8: fix a few typos
Ruben Kerkhof [Mon, 16 May 2016 12:14:32 +0000 (14:14 +0200)]
zfs.8 & mount.zfs.8: fix a few typos

filesytem -> filesystem
defntext -> defcontext

Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
8 years ago  zfs.8 & zpool.8: Standardize property value order
Richard Laager [Wed, 11 May 2016 18:21:06 +0000 (13:21 -0500)]
zfs.8 & zpool.8: Standardize property value order

The default value is now always listed first.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8 & zpool.8: Various documentation edits
Richard Laager [Wed, 11 May 2016 18:19:31 +0000 (13:19 -0500)]
zfs.8 & zpool.8: Various documentation edits

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Improve zfs upgrade documentation
Richard Laager [Wed, 11 May 2016 18:05:33 +0000 (13:05 -0500)]
zfs.8: Improve zfs upgrade documentation

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Cleanup stray code
Richard Laager [Wed, 11 May 2016 18:04:02 +0000 (13:04 -0500)]
zfs.8: Cleanup stray code

Bad copy-and-paste?

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8 & zpool.8: Drop legal/illegal
Richard Laager [Wed, 11 May 2016 16:38:51 +0000 (11:38 -0500)]
zfs.8 & zpool.8: Drop legal/illegal

There's a convention in documentation that these words not be used to
mean "invalid".

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Fix minor typos and the like
Richard Laager [Wed, 11 May 2016 16:27:00 +0000 (11:27 -0500)]
zfs.8: Fix minor typos and the like

This commit only contains the most trivial of changes.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Rework native vs user properties
Richard Laager [Wed, 11 May 2016 16:20:14 +0000 (11:20 -0500)]
zfs.8: Rework native vs user properties

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8 & zpool.8: Linux/Solaris differences
Richard Laager [Wed, 11 May 2016 16:11:02 +0000 (11:11 -0500)]
zfs.8 & zpool.8: Linux/Solaris differences

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Improve mount option documentation
Richard Laager [Wed, 11 May 2016 16:02:17 +0000 (11:02 -0500)]
zfs.8: Improve mount option documentation

This change is primarily about adding inline references in the
properties section to the traditional mount option names.

There are some other editorial changes too.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Improve consistency in size documentation
Richard Laager [Wed, 11 May 2016 15:54:27 +0000 (10:54 -0500)]
zfs.8: Improve consistency in size documentation

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: Drop references to Oracle documentation
Richard Laager [Wed, 11 May 2016 15:40:42 +0000 (10:40 -0500)]
zfs.8: Drop references to Oracle documentation

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  zfs.8: zfs get and zfs list accept mountpoints
Richard Laager [Wed, 11 May 2016 13:15:20 +0000 (08:15 -0500)]
zfs.8: zfs get and zfs list accept mountpoints

Signed-off-by: Richard Laager <rlaager@wiktel.com>
8 years ago  OpenZFS 6739 - assumption in cv_timedwait_hires
Denys Rtveliashvili [Sun, 15 May 2016 22:18:25 +0000 (22:18 +0000)]
OpenZFS 6739 - assumption in cv_timedwait_hires

Userland version of cv_timedwait_hires() always assumes absolute time.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported by: Denys Rtveliashvili <denys@rtveliashvili.name>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6739
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/41c6413

Porting Notes:
The ported change has revealed a number of problems in the Linux-specific code,
as it was expecting incorrect return codes from pthread_* functions.
Reviewed and improved the usage of pthread_* functions in lib/libzpool/kernel.c.
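
The absolute-versus-relative distinction in this commit can be modelled outside the kernel. The following is a minimal Python sketch, not the libzpool code: `deadline_ns` and its `absolute` parameter are hypothetical names standing in for the timeout and flag arguments of cv_timedwait_hires().

```python
def deadline_ns(now_ns: int, tim_ns: int, absolute: bool) -> int:
    """Return the absolute wake-up time for a timed wait.

    If `absolute` is set, `tim_ns` is already a deadline. Otherwise it is
    a delay relative to `now_ns`. Treating every value as absolute would
    turn a short relative delay into a deadline far in the past.
    """
    return tim_ns if absolute else now_ns + tim_ns


now = 1_000_000_000                              # pretend clock reading, in ns
print(deadline_ns(now, 5_000_000, False))        # relative: 5 ms after now
print(deadline_ns(now, now + 5_000_000, True))   # absolute: the same deadline
```

Both calls describe the same wake-up time; only the interpretation of the second argument differs.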

8 years ago  Fix the test to use the variable
jyxent [Sat, 14 May 2016 03:44:03 +0000 (21:44 -0600)]
Fix the test to use the variable

Signed-off-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4645

8 years ago  Use cv_timedwait_sig_hires in arc_reclaim_thread
Chunwei Chen [Wed, 11 May 2016 23:55:48 +0000 (16:55 -0700)]
Use cv_timedwait_sig_hires in arc_reclaim_thread

The thread was originally using interruptible cv_timedwait_sig, but was changed
to uninterruptible cv_timedwait_hires in ae6d0c6. Use _sig_hires instead
to allow interruptible sleep.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4633
Closes #4634

8 years ago  A collection of dracut fixes
Manuel Amador (Rudd-O) [Sun, 24 Apr 2016 11:35:44 +0000 (11:35 +0000)]
A collection of dracut fixes

- In older systems without sysroot.mount, import before dracut-mount,
  and re-enable old dracut mount hook
- rootflags MUST be present even if the administrator neglected to
  specify it explicitly
- Check that mount.zfs exists in sbindir
- Remove awk and head as (now unused) requirements, add grep, and
  install the right mount.zfs
- Eliminate one use of grep in Dracut
- Use a more accurate grepping statement to identify zfsutil in rootflags
- Ensure that pooldev is nonempty
- Properly handle /dev/sd* devices and more
- Use new -P to get list of zpool devices
- Bail out of the generator when zfs:AUTO is on the root command line
- Ignore errors from systemctl trying to load sysroot.mount, we only
  care about the output
- Determine which one is the correct initqueuedir at run time.
- Add a compatibility getargbool for our detection / setup script.
- Update dracut .gitignore files

Signed-off-by: Matthew Thode <mthode@mthode.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4558
Closes #4562
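
Two of the items above concern rootflags: the option must always be present, and zfsutil must be identified accurately. The sketch below illustrates one way to express both in Python, assuming rootflags is a comma-separated mount option string. The function names are hypothetical and the actual dracut scripts use shell and grep.

```python
def has_zfsutil(rootflags: str) -> bool:
    """True if 'zfsutil' appears as a whole comma-separated mount option."""
    return "zfsutil" in [opt.strip() for opt in rootflags.split(",")]


def ensure_zfsutil(rootflags: str) -> str:
    """Return rootflags with 'zfsutil' present, adding it if it is missing."""
    if has_zfsutil(rootflags):
        return rootflags
    return "zfsutil" if not rootflags else rootflags + ",zfsutil"
```

Matching whole options rather than substrings avoids a false positive on a made-up option such as "nozfsutil".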

8 years ago  OpenZFS 6093 - zfsctl_shares_lookup
Dan McDonald [Wed, 11 May 2016 19:03:51 +0000 (12:03 -0700)]
OpenZFS 6093 - zfsctl_shares_lookup

6093 zfsctl_shares_lookup should only VN_RELE() on zfs_zget() success

Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6093
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0f92170
Closes #4630

This function was always implemented slightly differently under Linux
and therefore never suffered from this issue.  The patch has been
updated and applied as cleanup in order to minimize differences with
the upstream OpenZFS code.

8 years ago  Revert "Kill znode->z_gen field"
Brian Behlendorf [Thu, 12 May 2016 20:31:55 +0000 (13:31 -0700)]
Revert "Kill znode->z_gen field"

This reverts commit 4cd77889b684fd0dd1a0a995b692dda3db76a9ac.  The
i_generation field in the inode is 32-bit and the SA code expects
64-bit fixed values.  Revert this optimization for now until
this is cleanly addressed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4538
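
The width mismatch behind this revert is easy to reproduce with plain integers. This is an illustrative Python sketch only: `store_generation_u32` is a hypothetical helper that models storing a generation number in a 32-bit field.

```python
import struct


def store_generation_u32(gen: int) -> int:
    """Model storing a generation number in a 32-bit field (high bits lost)."""
    return gen & 0xFFFFFFFF


gen64 = 0x1_0000_0001                      # needs more than 32 bits
print(hex(store_generation_u32(gen64)))    # the value a 32-bit field keeps
print(struct.calcsize("<I"), struct.calcsize("<Q"))  # sizes of 32- and 64-bit fields
```

Any consumer that expects a fixed 64-bit value cannot be handed the 32-bit field directly without an explicit, well-defined widening step.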

8 years ago  Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queues
Tony Hutter [Mon, 29 Feb 2016 18:05:23 +0000 (10:05 -0800)]
Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queues

Update the zfs module to collect statistics on average latencies, queue sizes,
and keep an internal histogram of all IO latencies.  Along with this, update
"zpool iostat" with some new options to print out the stats:

-l: Include average IO latencies stats:

 total_wait     disk_wait    syncq_wait    asyncq_wait  scrub
 read  write   read  write   read  write   read  write   wait
-----  -----  -----  -----  -----  -----  -----  -----  -----
    -   41ms      -    2ms      -   46ms      -    4ms      -
    -    5ms      -    1ms      -    1us      -    4ms      -
    -    5ms      -    1ms      -    1us      -    4ms      -
    -      -      -      -      -      -      -      -      -
    -   49ms      -    2ms      -   47ms      -      -      -
    -      -      -      -      -      -      -      -      -
    -    2ms      -    1ms      -      -      -    1ms      -
-----  -----  -----  -----  -----  -----  -----  -----  -----
  1ms    1ms    1ms  413us   16us   25us      -    5ms      -
  1ms    1ms    1ms  413us   16us   25us      -    5ms      -
  2ms    1ms    2ms  412us   26us   25us      -    5ms      -
    -    1ms      -  413us      -   25us      -    5ms      -
    -    1ms      -  460us      -   29us      -    5ms      -
196us    1ms  196us  370us    7us   23us      -    5ms      -
-----  -----  -----  -----  -----  -----  -----  -----  -----

-w: Print out latency histograms:

sdb           total           disk         sync_queue      async_queue
latency    read   write    read   write    read   write    read   write   scrub
-------  ------  ------  ------  ------  ------  ------  ------  ------  ------
1ns           0       0       0       0       0       0       0       0       0
...
33us          0       0       0       0       0       0       0       0       0
66us          0       0     107    2486       2     788      12      12       0
131us         2     797     359    4499      10     558     184     184       6
262us        22     801     264    1563      10     286     287     287      24
524us        87     575      71   52086      15    1063     136     136      92
1ms         152    1190       5   41292       4    1693     252     252     141
2ms         245    2018       0   50007       0    2322     371     371     220
4ms         189    7455      22  162957       0    3912    6726    6726     199
8ms         108    9461       0  102320       0    5775    2526    2526      86
17ms         23   11287       0   37142       0    8043    1813    1813      19
34ms          0   14725       0   24015       0   11732    3071    3071       0
67ms          0   23597       0    7914       0   18113    5025    5025       0
134ms         0   33798       0     254       0   25755    7326    7326       0
268ms         0   51780       0      12       0   41593   10002   10002       0
537ms         0   77808       0       0       0   64255   13120   13120       0
1s            0  105281       0       0       0   83805   20841   20841       0
2s            0   88248       0       0       0   73772   14006   14006       0
4s            0   47266       0       0       0   29783   17176   17176       0
9s            0   10460       0       0       0    4130    6295    6295       0
17s           0       0       0       0       0       0       0       0       0
34s           0       0       0       0       0       0       0       0       0
69s           0       0       0       0       0       0       0       0       0
137s          0       0       0       0       0       0       0       0       0
-------------------------------------------------------------------------------
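
The row labels in the histogram above (33us, 66us, 131us, ... 537ms, 1s) are consistent with power-of-two buckets measured in nanoseconds. A minimal Python sketch of that bucketing follows. The function names are hypothetical, and whether a row counts values starting at or ending at its label is not stated here, so the sketch assumes the label is the lower bound.

```python
def bucket_index(latency_ns: int) -> int:
    """Index of the power-of-two bucket holding latency_ns, i.e. floor(log2)."""
    return max(latency_ns, 1).bit_length() - 1


def bucket_label(index: int) -> str:
    """Rounded lower bound of a bucket, for example 2**15 ns gives '33us'."""
    ns = 1 << index
    for limit, unit in ((10**3, "ns"), (10**6, "us"), (10**9, "ms")):
        if ns < limit:
            return f"{round(ns * 1000 / limit)}{unit}"
    return f"{round(ns / 10**9)}s"


for i in (0, 15, 16, 20, 29, 30, 37):
    print(i, bucket_label(i))
```

With this scheme a histogram spanning 1ns to 137s needs only 38 counters per column.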

-h: Help

-H: Scripted mode. Do not display headers, and separate fields by a single
    tab instead of arbitrary space.
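
Scripted mode exists so that other programs can split fields reliably. A small Python sketch of a consumer is shown below. The sample line is made up for illustration and does not reproduce real zpool output.

```python
from typing import List


def parse_scripted_line(line: str) -> List[str]:
    """Split one line of tab-separated scripted output into its fields."""
    return line.rstrip("\n").split("\t")


sample = "examplepool\t1.20G\t5.80G\t0\t12\n"   # hypothetical values
print(parse_scripted_line(sample))
```

Splitting on a single tab, rather than on runs of whitespace, keeps a field intact even if it contains spaces.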

-q: Include current number of entries in sync & async read/write queues,
    and scrub queue:

 syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read
 pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----
    0      0      0      0     78     29      0      0      0      0
    0      0      0      0     78     29      0      0      0      0
    0      0      0      0      0      0      0      0      0      0
    -      -      -      -      -      -      -      -      -      -
    0      0      0      0      0      0      0      0      0      0
    -      -      -      -      -      -      -      -      -      -
    0      0      0      0      0      0      0      0      0      0
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----
    0      0    227    394      0     19      0      0      0      0
    0      0    227    394      0     19      0      0      0      0
    0      0    108     98      0     19      0      0      0      0
    0      0     19     98      0      0      0      0      0      0
    0      0     78     98      0      0      0      0      0      0
    0      0     19     88      0      0      0      0      0      0
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----

-p: Display numbers in parseable (exact) values.

Also, update iostat syntax to allow the user to specify specific vdevs
to show statistics for.  The three options for choosing pools/vdevs are:

Display a list of pools:
    zpool iostat ... [pool ...]

Display a list of vdevs from a specific pool:
    zpool iostat ... [pool vdev ...]

Display a list of vdevs from any pools:
    zpool iostat ... [vdev ...]

Lastly, allow zpool command "interval" value to be floating point:
    zpool iostat -v 0.5
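
Accepting a fractional interval mostly comes down to parsing the argument as a floating-point number instead of an integer. A hedged Python sketch follows: `parse_interval` is a hypothetical helper, and rejecting zero, negative and non-finite values is an assumption of this sketch rather than documented zpool behaviour.

```python
import math


def parse_interval(arg: str) -> float:
    """Parse an interval argument such as '2' or '0.5' into seconds."""
    value = float(arg)                      # raises ValueError on garbage
    if not (math.isfinite(value) and value > 0):
        raise ValueError(f"interval must be a positive number: {arg!r}")
    return value


print(parse_interval("0.5"))
print(parse_interval("2"))
```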

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4433

8 years ago  Fixes bug in fix_paths()
Marcel Huber [Wed, 11 May 2016 19:28:33 +0000 (21:28 +0200)]
Fixes bug in fix_paths()

Fixes bug introduced in commit 7d90f569a.  Hinted by gcc:

libzfs_import.c: In function ‘fix_paths’:
libzfs_import.c:602:28: warning: self-comparison always evaluates to true [-Wtautological-compare]
    if (best->ne_num_labels == best->ne_num_labels &&

Signed-off-by: Marcel Huber <marcelhuberfoo@gmail.com>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4632
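
The effect of a self-comparison like the one gcc flagged can be shown with a small model. This Python sketch is not the libzfs code: `NameEntry`, its `order` tie-breaker and both selection functions are hypothetical, and only `num_labels` mirrors the `ne_num_labels` field named in the warning.

```python
from dataclasses import dataclass


@dataclass
class NameEntry:
    name: str
    num_labels: int   # mirrors ne_num_labels from the warning
    order: int        # hypothetical tie-breaker, lower is preferred


def pick_best_buggy(entries):
    best = entries[0]
    for ne in entries[1:]:
        if ne.num_labels > best.num_labels:
            best = ne
        # Bug: best is compared with itself, so this test is always true
        elif best.num_labels == best.num_labels and ne.order < best.order:
            best = ne
    return best


def pick_best_fixed(entries):
    best = entries[0]
    for ne in entries[1:]:
        if ne.num_labels > best.num_labels:
            best = ne
        # Fixed: the tie-breaker applies only when label counts are equal
        elif ne.num_labels == best.num_labels and ne.order < best.order:
            best = ne
    return best


entries = [NameEntry("a", 4, 2), NameEntry("b", 2, 1)]
print(pick_best_buggy(entries).name, pick_best_fixed(entries).name)
```

In the buggy variant the tie-breaker fires even when the candidate has fewer labels, so a worse entry can win.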

8 years ago  Reduce stack usage of dmu_recv_stream function
Nikolay Borisov [Mon, 9 May 2016 19:15:30 +0000 (22:15 +0300)]
Reduce stack usage of dmu_recv_stream function

The receive_writer_arg and receive_arg structures become large
when ZFS is compiled with debugging enabled. This results in
gcc throwing an error about excessive stack usage:

  module/zfs/dmu_send.c: In function ‘dmu_recv_stream’:
  module/zfs/dmu_send.c:2502:1: error: the frame size of 1256 bytes is
  larger than 1024 bytes [-Werror=frame-larger-than=]

Fix this by allocating those structures on the heap, rather than
on the stack.

With patch:    dmu_send.c:2350:1:dmu_recv_stream 240 static
Without patch: dmu_send.c:2350:1:dmu_recv_stream 1336 static

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4620

8 years ago  OpenZFS 3993, 4700
Adam Stevko [Mon, 9 May 2016 21:03:18 +0000 (14:03 -0700)]
OpenZFS 3993, 4700

3993 zpool(1M) and zfs(1M) should support -p for "list" and "get"
4700 "zpool get" doesn't support -H or -o options

Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/3993
OpenZFS-issue: https://www.illumos.org/issues/4700
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c58b352

Porting notes:
I removed ZoL's zpool_get_prop_literal() in favor of
zpool_get_prop(..., boolean_t literal) since that's what OpenZFS
uses.  The functionality is the same.
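
The porting note describes folding a separate "literal" getter into one function with a boolean flag. The Python sketch below shows that shape only. `format_size` and its unit formatting are hypothetical and are not meant to match the exact strings libzfs prints.

```python
def format_size(value: int, literal: bool) -> str:
    """Return an exact number if literal is set, else a human-readable size."""
    if literal:
        return str(value)                 # parseable, exact value
    units = ["", "K", "M", "G", "T", "P", "E"]
    size = float(value)
    idx = 0
    while size >= 1024 and idx < len(units) - 1:
        size /= 1024
        idx += 1
    return f"{size:.2f}{units[idx]}" if idx else str(value)


print(format_size(1536, literal=True))
print(format_size(1536, literal=False))
```

A single entry point with a flag keeps the two output modes from drifting apart, which is the same trade-off the note makes.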

8 years ago  Add zfs-helpers.sh script
Brian Behlendorf [Fri, 6 May 2016 17:24:06 +0000 (10:24 -0700)]
Add zfs-helpers.sh script

Add a script designed to facilitate in-tree development and testing
by installing symlinks on your system which refer to in-tree helper
utilities.  These helper utilities must be installed in order to
exercise all ZFS functionality.  By using symbolic links and keeping
the scripts in-tree during development they can be easily modified
and those changes tracked.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #4607

8 years ago  OpenZFS 6842 - Fix empty xattr dir causing lockup
Chunwei Chen [Fri, 25 Mar 2016 19:21:56 +0000 (15:21 -0400)]
OpenZFS 6842 - Fix empty xattr dir causing lockup

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Denys Rtveliashvili <denys@rtveliashvili.name>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
An initial version of this patch was applied in commit 29572cc and
subsequently refined upstream.  Since the implementations do not
conflict with each other both are left applied for now.

OpenZFS-issue: https://www.illumos.org/issues/6842
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/02525cd
Closes #4615

8 years ago  OpenZFS 6873 - zfs_destroy_snaps_nvl leaks errlist
Chris Williamson [Wed, 20 Apr 2016 03:45:04 +0000 (20:45 -0700)]
OpenZFS 6873 - zfs_destroy_snaps_nvl leaks errlist

Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Ported-by: Denys Rtveliashvili <denys@rtveliashvili.name>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
lzc_destroy_snaps() returns an nvlist in errlist.
zfs_destroy_snaps_nvl() should nvlist_free() it before returning.

OpenZFS-issue: https://www.illumos.org/issues/6873
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ee06391
Closes #4614

8 years ago  OpenZFS 6879 - Incorrect endianness swap
Denys Rtveliashvili [Mon, 9 May 2016 18:22:00 +0000 (19:22 +0100)]
OpenZFS 6879 - Incorrect endianness swap

Authored by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Ported-by: Denys Rtveliashvili <denys@rtveliashvili.name>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Incorrect endianness swap for drr_spill.drr_length in libzfs_sendrecv.c
Instead of drr_write.drr_length, we should be assigning the result of
the byteswap to drr_spill.drr_length.

OpenZFS-issue: https://www.illumos.org/issues/6879
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/74c8720
Closes #4613
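
The bug described above is a correct byteswap stored in the wrong field. The following Python sketch models it with a simplified record. The `Record` class and its two independent fields are hypothetical, and the layout of the real stream record is not modelled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    write_length: int   # stands in for drr_write.drr_length
    spill_length: int   # stands in for drr_spill.drr_length


def bswap64(x: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(x.to_bytes(8, "little"), "big")


def swap_buggy(rec: Record) -> Record:
    # Bug: the swapped spill length lands in the write length field
    return replace(rec, write_length=bswap64(rec.spill_length))


def swap_fixed(rec: Record) -> Record:
    # Fixed: the swapped value is stored back into the spill length
    return replace(rec, spill_length=bswap64(rec.spill_length))


rec = Record(write_length=0, spill_length=0x0100000000000000)
print(swap_buggy(rec))
print(swap_fixed(rec))
```

After the buggy swap the spill length is still in the foreign byte order, so a receiver on the other endianness would read a wildly wrong size.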

8 years ago  Wrap vdev_count_verify_zaps() with ZFS_DEBUG
Brian Behlendorf [Sat, 7 May 2016 01:14:03 +0000 (18:14 -0700)]
Wrap vdev_count_verify_zaps() with ZFS_DEBUG

Commit e0ab3ab introduced two blocks of code which are only needed
when debugging is enabled.  These blocks should be wrapped with
ZFS_DEBUG for clarity and to prevent unused variable warnings in
a production build.

Signed-off-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4515

8 years ago  Per-vdev ZAP tests must use $ZPOOL and $ZDB
Brian Behlendorf [Sat, 7 May 2016 01:13:17 +0000 (18:13 -0700)]
Per-vdev ZAP tests must use $ZPOOL and $ZDB

Commit e0ab3ab introduced new per-vdev ZAP tests which should have
used the $ZPOOL and $ZDB variables.  The tests passed the automated
testing since both utilities were installed on the test systems, but
when running in-tree all of the new tests fail.

Signed-off-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4515

8 years ago  OpenZFS 6672 - arc_reclaim_thread() should use gethrtime()
David Quigley [Fri, 6 May 2016 16:35:52 +0000 (12:35 -0400)]
OpenZFS 6672 - arc_reclaim_thread() should use gethrtime()

6672 arc_reclaim_thread() should use gethrtime() instead of ddi_get_lbolt()

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: David Quigley <dpquigl@davequigley.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6672
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/571be5c
Closes #4600

8 years ago  OpenZFS 6286 - ZFS internal error when set large block on bootfs
Brian Behlendorf [Thu, 5 May 2016 23:19:12 +0000 (16:19 -0700)]
OpenZFS 6286 - ZFS internal error when set large block on bootfs

6286 ZFS internal error when set large block on bootfs
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6286
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/6de9bb5
Closes #4585

8 years ago  OpenZFS 6544 - incorrect comment in libzfs.h about offline status
Tony Hutter [Thu, 5 May 2016 16:30:05 +0000 (09:30 -0700)]
OpenZFS 6544 - incorrect comment in libzfs.h about offline status

6544 incorrect comment in libzfs.h about offline status
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/6544
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/cb605c4
Closes #4595

8 years ago  OpenZFS 5669 - altroot not set in zpool create
Brian Behlendorf [Thu, 5 May 2016 16:27:55 +0000 (09:27 -0700)]
OpenZFS 5669 - altroot not set in zpool create

5669 altroot not set in zpool create when specified with -o
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/5669
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c423721
Closes #4594

8 years ago  taskq_create() calls thread_create() with wrong arguments
Denys Rtveliashvili [Thu, 5 May 2016 16:24:12 +0000 (17:24 +0100)]
taskq_create() calls thread_create() with wrong arguments

Correct the arguments passed to `thread_create()`.

Signed-off-by: Isaac Huang <he.huang@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4593

8 years ago  OpenZFS 6736 - ZFS per-vdev ZAPs
Joe Stein [Mon, 11 Apr 2016 20:16:57 +0000 (16:16 -0400)]
OpenZFS 6736 - ZFS per-vdev ZAPs

6736 ZFS per-vdev ZAPs
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>

References:
  https://www.illumos.org/issues/6736
  https://github.com/openzfs/openzfs/commit/215198a

Ported-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4515

8 years ago  Kill znode->z_gen field
Nikolay Borisov [Mon, 18 Apr 2016 19:08:53 +0000 (22:08 +0300)]
Kill znode->z_gen field

This field is a duplicate of the inode->i_generation, so just kill it

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4538

8 years ago  Enable PF_FSTRANS for ioctl secpolicy callbacks (#4571)
Tim Chase [Mon, 2 May 2016 17:00:50 +0000 (12:00 -0500)]
Enable PF_FSTRANS for ioctl secpolicy callbacks (#4571)

At the very least, the zfs_secpolicy_write_perms ioctl security policy
callback, which calls dsl_dataset_hold(), can require freeing memory and,
therefore, re-enter ZFS.  This patch enables PF_FSTRANS for all of the
security policy callbacks similarly to the manner in which it's enabled
for the actual ioctl callback.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4554

8 years ago  module/.gitignore: Add *.dwo (#4580)
Vitaut Bajaryn [Mon, 2 May 2016 16:07:04 +0000 (18:07 +0200)]
module/.gitignore: Add *.dwo (#4580)

These files get generated when CONFIG_DEBUG_INFO_DWARF4 is enabled in
Linux .config.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4580

8 years ago  Fix user namespaces uid/gid mapping
Brian Behlendorf [Sat, 30 Apr 2016 19:21:51 +0000 (12:21 -0700)]
Fix user namespaces uid/gid mapping

As described in torvalds/linux@5f3a4a2 the &init_user_ns, and
not the current user_ns, should be passed to posix_acl_from_xattr()
and posix_acl_to_xattr().  Conveniently the init_user_ns is
available through the init credential (kcred).

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Massimo Maggi <me@massimo-maggi.eu>
Closes #4177

8 years ago  Add support for libtirpc
Brian Behlendorf [Wed, 27 Apr 2016 00:24:41 +0000 (17:24 -0700)]
Add support for libtirpc

While OpenSolaris libc and glibc both include XDR support, the musl libc
does not, in favor of depending on the BSD-licensed libtirpc library.

Adding support is a simple matter of detecting the library, including
the headers and linking against it.  By default libtirpc will be checked
for and if available used.  Otherwise, configure will fall back to using
the xdr implementation provided by libc if available.  The options
--with-tirpc/--without-tirpc can be used to disable this checking.

In addition, the xdr_control() function has been simplified to only
handle ZFS's specific use case.

Original-patch-by: stf <s@ctrlc.hu>
Original-patch-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Closes #2254
Closes #4559

8 years ago  Illumos 6844 - dnode_next_offset can detect fictional holes
Alex Reece [Thu, 21 Apr 2016 18:23:37 +0000 (11:23 -0700)]
Illumos 6844 - dnode_next_offset can detect fictional holes

6844 dnode_next_offset can detect fictional holes
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>

dnode_next_offset is used in a variety of places to iterate over the
holes or allocated blocks in a dnode. It operates under the premise that
it can iterate over the blockpointers of a dnode in open context while
holding only the dn_struct_rwlock as reader. Unfortunately, this premise
does not hold.

When we create the zio for a dbuf, we pass in the actual block pointer
in the indirect block above that dbuf. When we later zero the bp in
zio_write_compress, we are directly modifying the bp. The state of the
bp is now inconsistent from the perspective of dnode_next_offset: the bp
will appear to be a hole until zio_dva_allocate finally finishes filling
it in. In the meantime, dnode_next_offset can detect a hole in the dnode
when none exists.

I was able to experimentally demonstrate this behavior with the
following setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the
first L0 block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the
middle of a file.
4. Do dnode_next_offset in a loop until we skip over such a non-existent
hole.

The fix is to ensure that it is valid to iterate over the indirect
blocks in a dnode while holding the dn_struct_rwlock by passing the zio
a copy of the BP and updating the actual BP in dbuf_write_ready while
holding the lock.

References:
  https://www.illumos.org/issues/6844
  https://github.com/openzfs/openzfs/pull/82
  DLPX-35372

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4548
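
The race and its fix described in this commit can be illustrated with a deterministic toy model. This Python sketch is not the ZFS implementation: `Indirect`, both writer functions and the `observe` callback are hypothetical, and a block pointer is reduced to a dictionary whose zero address means "hole".

```python
import threading


class Indirect:
    """Toy indirect block: a list of block pointers guarded by one lock."""

    def __init__(self, addrs):
        self.bps = [{"addr": a} for a in addrs]
        self.lock = threading.Lock()      # stands in for dn_struct_rwlock

    def looks_like_hole(self, i):
        with self.lock:
            return self.bps[i]["addr"] == 0


def write_in_place(ind, i, new_addr, observe):
    bp = ind.bps[i]           # the writer holds the real block pointer
    bp["addr"] = 0            # zeroed early in the write pipeline
    seen_hole = observe()     # a reader iterating the dnode runs here
    bp["addr"] = new_addr     # allocation fills it in later
    return seen_hole


def write_with_copy(ind, i, new_addr, observe):
    bp = dict(ind.bps[i])     # the writer gets a private copy
    bp["addr"] = 0
    seen_hole = observe()     # the reader still sees the old, valid pointer
    bp["addr"] = new_addr
    with ind.lock:            # publish the finished copy under the lock
        ind.bps[i] = bp
    return seen_hole


a = Indirect([7])
print(write_in_place(a, 0, 9, lambda: a.looks_like_hole(0)))
b = Indirect([7])
print(write_with_copy(b, 0, 9, lambda: b.looks_like_hole(0)))
```

The in-place writer lets the reader observe a hole that never existed, while the copy-based writer does not, and both end with the same final pointer.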

8 years ago  Illumos 6659 - nvlist_free(NULL) is a no-op
Josef 'Jeff' Sipek [Fri, 1 Apr 2016 03:54:07 +0000 (23:54 -0400)]
Illumos 6659 - nvlist_free(NULL) is a no-op

6659 nvlist_free(NULL) is a no-op
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Marcel Telka <marcel@telka.sk>
Approved by: Robert Mustacchi <rm@joyent.com>

References:
  https://www.illumos.org/issues/6659
  https://github.com/illumos/illumos-gate/commit/aab83bb

Ported-by: David Quigley <dpquigl@davequigley.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4566

8 years ago  Fix zfs_copies_001_pos/zfs_copies_004_neg
Brian Behlendorf [Mon, 25 Apr 2016 18:50:39 +0000 (11:50 -0700)]
Fix zfs_copies_001_pos/zfs_copies_004_neg

Call block_device_wait when creating/destroying volumes in order
to make the operations synchronous as expected by the test cases.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4560

8 years ago  Fix 'zpool import' blkid device names
Brian Behlendorf [Wed, 20 Apr 2016 17:17:01 +0000 (10:17 -0700)]
Fix 'zpool import' blkid device names

When importing a pool using the blkid cache only the device
node path was added to the list of known paths for a device.
This results in 'zpool import' always using the sdX names
in preference to the 'path' name stored in the label.

To fix the issue the blkid import path has been updated to
add the 'path', 'devid', and 'devname' names from the
label to the known paths.  A sanity check is done to ensure
these paths do refer to the same device identified by blkid.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4523
Closes #3043

8 years ago  Disable efi_debug in --enable-debug builds
Brian Behlendorf [Wed, 20 Apr 2016 18:39:15 +0000 (11:39 -0700)]
Disable efi_debug in --enable-debug builds

Disable the additional EFI debugging in all builds.  Some users
run debug builds in production and the extra log messages can
cause confusion.  Beyond that the log messages are rarely useful.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4523

8 years ago  Use udev for partition detection
Brian Behlendorf [Tue, 19 Apr 2016 18:19:12 +0000 (11:19 -0700)]
Use udev for partition detection

When ZFS partitions a block device it must wait for udev to create
both a device node and all the device symlinks.  This process takes
a variable length of time and depends on factors such as how many links
must be created, the complexity of the rules, etc.  Complicating
the situation further it is not uncommon for udev to create and
then remove a link multiple times while processing the udev rules.

Given the above, the existing scheme of waiting for an expected
partition to appear by name isn't 100% reliable.  At this point
udev may still remove and recreate the link resulting in the
kernel modules being unable to open the device.

In order to address this the zpool_label_disk_wait() function
has been updated to use libudev.  Until the registered system
device acknowledges that it is fully initialized the function
will wait.  Once fully initialized all device links are checked
and allowed to settle for 50ms.  This makes it far more likely
that all the device nodes will exist when the kernel modules
need to open them.

For systems without libudev an alternate zpool_label_disk_wait()
was updated to include a settle time.  In addition, the kernel
modules were updated to include retry logic for this ENOENT case.
Due to the improved checks in the utilities it is unlikely this
logic will be invoked.  However, in the rare event it is needed
it will prevent a failure.
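
The settle-time idea can be illustrated with a minimal Python sketch.  This
is not the zpool_label_disk_wait() code: the function name, polling interval
and timeout are hypothetical, and reusing the 50ms settle value from the
libudev path for a plain polling loop is an assumption.

```python
import os
import time


def wait_for_device(path, timeout_s=3.0, settle_s=0.05, poll_s=0.01):
    """Return True once `path` has existed continuously for `settle_s` seconds.

    Hypothetical helper for illustration only; not part of ZFS.
    """
    deadline = time.monotonic() + timeout_s
    first_seen = None  # when the path was first seen in the current streak
    while time.monotonic() < deadline:
        if os.path.exists(path):
            now = time.monotonic()
            if first_seen is None:
                first_seen = now
            elif now - first_seen >= settle_s:
                return True
        else:
            # The link vanished (udev may remove and recreate it); start over.
            first_seen = None
        time.sleep(poll_s)
    return False
```

Requiring the path to stay present for the whole settle window is what guards
against a link that appears briefly and is then removed again.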

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #4523
Closes #3708
Closes #4077
Closes #4144
Closes #4214
Closes #4517

8 years agoCreate unique partition labels
Brian Behlendorf [Wed, 13 Apr 2016 21:50:16 +0000 (14:50 -0700)]
Create unique partition labels

When partitioning a device a name may be specified for each partition.
Internally zfs doesn't use this partition name for anything so it
has always just been set to "zfs".

However this isn't optimal because udev will create symlinks using
this name in /dev/disk/by-partlabel/.  If the name isn't unique
then all the links cannot be created.

Therefore a random 64-bit value has been added to the partition
label, i.e. "zfs-1234567890abcdef".  Additional information could
be encoded here but since partitions may be reused that might
result in confusion and it was decided against.
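
As a rough illustration of the label format, a Python sketch (the helper name
is hypothetical, and the real change is in the C partitioning code, which may
obtain its random value differently):

```python
import secrets


def make_partition_label(prefix="zfs"):
    """Build a label such as 'zfs-1234567890abcdef' from a random 64-bit value."""
    value = secrets.randbits(64)
    # 16 zero-padded hex digits cover the full 64-bit range.
    return f"{prefix}-{value:016x}"
```

With 64 random bits, two partitions on the same system are very unlikely to
end up with the same /dev/disk/by-partlabel/ name.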

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #4517

8 years agofix booting via dracut generated initramfs
Matthew Thode [Wed, 30 Mar 2016 23:59:15 +0000 (18:59 -0500)]
fix booting via dracut generated initramfs

Dracut and systemd changed how they integrate with each other, and as a
result our current integration stopped working (around the time
4.1.13 came out).  This patch addresses that issue and gets us booting
again.

Thanks to @Rudd-O for doing the work to get dracut working again and
letting me submit this on his behalf.

Signed-off-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #3605
Closes #4478

8 years agoLinux 4.5 compat: Use xattr_handler->name for acl
Chunwei Chen [Fri, 22 Apr 2016 00:19:07 +0000 (17:19 -0700)]
Linux 4.5 compat: Use xattr_handler->name for acl

Linux 4.5 added a "name" member to xattr_handler. A handler that matches a
whole attribute name rather than a prefix should set "name" instead of "prefix".
Otherwise, the kernel will return EINVAL when it tries to resolve handlers.

Also, we remove the strcmp checks when xattr_handler has name, because
xattr_resolve_name will do the check for us.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4549
Closes #4537

8 years agoAdd pn_alloc()/pn_free() functions
Brian Behlendorf [Wed, 13 Apr 2016 15:55:35 +0000 (08:55 -0700)]
Add pn_alloc()/pn_free() functions

In order to remove the HAVE_PN_UTILS wrappers the pn_alloc() and
pn_free() functions must be implemented.  The existing illumos
implementations were used for this purpose.

The `flags` argument which was used in places wrapped by the
HAVE_PN_UTILS condition has been added back to the zfs_remove() and
zfs_link() functions.  This removes a small point of divergence
between the ZoL code and upstream.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4522

8 years agoRework zpool import excluded devices check
Nikolay Borisov [Wed, 13 Apr 2016 06:40:42 +0000 (09:40 +0300)]
Rework zpool import excluded devices check

The current zpool import code skips directory entries whose names share a
prefix with certain system files on Linux, such as "fd", "core", etc. However,
this means zpools cannot be hosted inside files which are named
e.g. core-1 or lp. Furthermore, apart from the string checks there is already
a check which makes zpool_open_func work only with regular files and block
devices.

To fix this problem, remove most of the checks since they are redundant, but
leave the checks for the 'hpet' and 'watchdog' names. Furthermore, change
the checks to strcmp which, albeit less safe than strncmp, allows
devices whose names are prefixed by 'hpet' or 'watchdog'.
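
The difference between prefix and exact matching can be sketched in Python.
The function names are hypothetical, and the old prefix list below is only
partial: it contains just the names mentioned in this message.

```python
# Partial list, assumed from the names mentioned in the commit message.
OLD_PREFIXES = ("fd", "core", "lp", "hpet", "watchdog")
# Names still checked after the change.
EXCLUDED_NAMES = ("hpet", "watchdog")


def skip_by_prefix(name, prefixes=OLD_PREFIXES):
    """strncmp-style check: skip anything that starts with an excluded name."""
    return any(name.startswith(prefix) for prefix in prefixes)


def skip_by_exact_name(name, excluded=EXCLUDED_NAMES):
    """strcmp-style check: skip only an exact match."""
    return name in excluded
```

Under the prefix check a file named core-1 or watchdog0 is skipped; under the
exact check only hpet and watchdog themselves are.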

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4438

8 years agoFix ZPL miswrite of default POSIX ACL
Ned Bass [Fri, 15 Apr 2016 18:55:03 +0000 (18:55 +0000)]
Fix ZPL miswrite of default POSIX ACL

Commit 4967a3e introduced a typo that caused the ZPL to store the
intended default ACL as an access ACL. Due to caching this problem
may not become visible until the filesystem is remounted or the inode
is evicted from the cache. Fix the typo and add a regression test.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes #4520

8 years agoFix inverted logic on none elevator comparison
Colin Ian King [Thu, 14 Apr 2016 07:58:09 +0000 (08:58 +0100)]
Fix inverted logic on none elevator comparison

Commit d1d7e2689db9e03f1 ("cstyle: Resolve C style issues") inverted
the logic on the none elevator comparison.  Fix this and make it
cstyle warning clean.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4507

8 years agoremove sanity check in replacement test
Jinshan Xiong [Tue, 12 Apr 2016 22:27:47 +0000 (15:27 -0700)]
remove sanity check in replacement test

The replacement test spawns a background process to truncate a file
and makes sure that the process still exists 1 second later. However, the
process may already have finished its work and exited, so the check
can report a false alarm.

This patch simply removes that sanity check.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4516

8 years agoMake zfs test easier to run in local install
Jinshan Xiong [Wed, 6 Apr 2016 16:48:10 +0000 (09:48 -0700)]
Make zfs test easier to run in local install

When ZFS is installed by 'make install', programs are installed
into '/usr/local'. The ZFS test scripts can't locate programs such as
'zpool' there, which causes test failures.

Fix typo in help message.

Add a sanity check for ksh and generate a useful error message.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4495

8 years agoAdd zfs-tests for relatime
Chunwei Chen [Wed, 6 Apr 2016 00:32:23 +0000 (17:32 -0700)]
Add zfs-tests for relatime

Add atime_003_pos to test relatime=on. We do check_atime_updated twice; the
first time should succeed and the second time should fail. We also modify
atime_001_pos to do check_atime_updated twice, and both times should succeed.
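
Why the second check fails under relatime can be shown with a small Python
sketch of the relatime rule as it is commonly described for Linux (update
atime only if it is not newer than mtime or ctime, or if it is at least a day
old).  The function name and the integer timestamps are illustrative; this is
not the test suite's check_atime_updated.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, ctime, now):
    """Return True if a relatime mount would update atime on this access."""
    if mtime >= atime or ctime >= atime:
        return True
    return now - atime >= DAY_SECONDS
```

For a file created at t=100 (atime, mtime and ctime all 100), the first read
updates atime because mtime >= atime.  A second read shortly afterwards finds
atime newer than both mtime and ctime and less than a day old, so it does not.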

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4482

8 years agoMake zfs mount according to relatime config in dataset
Chunwei Chen [Fri, 1 Apr 2016 20:12:06 +0000 (13:12 -0700)]
Make zfs mount according to relatime config in dataset

Also enable lazytime in mount.zfs

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482

8 years agoEnable lazytime semantic for atime
Chunwei Chen [Thu, 31 Mar 2016 23:52:03 +0000 (16:52 -0700)]
Enable lazytime semantic for atime

Linux 4.0 introduces lazytime. The idea is that when we update the atime, we
delay writing it to disk for as long as is reasonably possible.

When lazytime is enabled, dirty_inode will be called with only the
I_DIRTY_TIME flag whenever i_atime is updated. In that case we set
z_atime_dirty. We only write it to disk when the file is closed, the inode is
evicted, or setattr is called. Ideally, we should also write it whenever the SA
is going to be updated, but that is left for future improvement.

There's one thing that we should take care of now that we allow i_atime to be
dirty. In the original implementation, whenever the SA is modified,
zfs_inode_update is called to overwrite everything in the inode. This causes a
dirty i_atime to be discarded. We fix this by not overwriting i_atime in
zfs_inode_update. We only overwrite i_atime when allocating a new inode or
doing zfs_rezget with zfs_inode_update_new.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482

8 years agoFix atime handling and relatime
Chunwei Chen [Wed, 30 Mar 2016 00:53:34 +0000 (17:53 -0700)]
Fix atime handling and relatime

The problem for atime:

atime is kept in three places: inode->i_atime, znode->z_atime and the SA, and
its handling is a mess. A large part of the mess comes from
zfs_tstamp_update_setup, zfs_inode_update, and zfs_getattr, which treat those
three values inconsistently.

zfs_tstamp_update_setup clears z_atime_dirty unconditionally as long as you
don't pass ATTR_ATIME, which means every write(2) operation that only updates
ctime and mtime will cause pending atime changes not to be written to disk.

Also, zfs_inode_update from write(2) will replace inode->i_atime with what's
inside the SA (stale), but it doesn't touch z_atime. So after read(2) and
write(2) you'll have i_atime(stale), z_atime(new), SA(stale) and
z_atime_dirty=0.

Now, if you do stat(2), zfs_getattr will actually replace i_atime with what's
inside z_atime. So now you'll have i_atime(new), z_atime(new),
SA(stale) and z_atime_dirty=0. These will all be gone after umount, and you'll
be left with a stale atime.
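
The sequence above can be traced with a toy Python model.  This is a
deliberately simplified sketch, not the ZFS code: the class and method names
are hypothetical, and what read(2) does to each copy is an assumption chosen
to reproduce the states listed above.

```python
class OldAtimeModel:
    """Toy model of the three atime copies before this patch."""

    def __init__(self, atime):
        self.i_atime = atime    # inode->i_atime
        self.z_atime = atime    # znode->z_atime
        self.sa_atime = atime   # on-disk value in the SA
        self.z_atime_dirty = False

    def read(self, now):
        # Assumption: read(2) records the new access time in the znode only.
        self.z_atime = now
        self.z_atime_dirty = True

    def write(self):
        # zfs_tstamp_update_setup clears the dirty flag (no ATTR_ATIME), and
        # zfs_inode_update reloads i_atime from the stale SA.
        self.z_atime_dirty = False
        self.i_atime = self.sa_atime

    def stat(self):
        # zfs_getattr copies z_atime into i_atime.
        self.i_atime = self.z_atime

    def remount(self):
        # Only a dirty z_atime would be flushed; the in-memory copies are lost.
        if self.z_atime_dirty:
            self.sa_atime = self.z_atime
        self.i_atime = self.z_atime = self.sa_atime
        self.z_atime_dirty = False
```

Running read, write, stat and then remount on this model ends with every copy
back at the original, stale value, which is the failure described above.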

The problem for relatime:

We do have a relatime config inside the ZFS dataset, but how it should interact
with the mount flag MS_RELATIME is not well defined. It seems the relatime
mount option was meant to override the dataset config by showing it as
temporary in `zfs get`. But at the same time, `zfs set relatime=on|off` also
seems to want to override the mount option. Not to mention that the
MS_RELATIME flag is actually never passed into ZFS, so it never really worked.

How Linux handles atime:

The Linux kernel actually handles atime completely in VFS, except for writing
it to disk. So if we remove the atime handling in ZFS, things would just work,
whether it's strictatime, relatime, noatime, or even O_NOATIME. And whenever
VFS updates the i_atime, it will notify the underlying filesystem via
sb->dirty_inode().

There's also one thing to note about atime flags like MS_RELATIME and
other flags like MS_NODEV, etc. They are mount point flags rather than
filesystem(sb) flags. Since a native Linux filesystem can be mounted at
multiple places at the same time, each mount can have different atime
settings. So these flags are never passed down to filesystem drivers.

What this patch tries to do:

We remove znode->z_atime, since we won't gain anything from it. We remove most
of the atime handling and leave it to VFS. The only thing we do with atime is
to write it when dirty_inode() or setattr() is called. We also add
file_accessed() in zpl_read() since it's not provided in vfs_read().

After this patch, only the MS_RELATIME flag will have an effect. The setting in
the dataset won't do anything. A future patch will make zfsutil mount ZFS with
MS_RELATIME set according to the setting in the dataset.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482

8 years agoLinux 4.6 compat: PAGE_CACHE_SIZE removal
Brian Behlendorf [Tue, 5 Apr 2016 19:39:37 +0000 (12:39 -0700)]
Linux 4.6 compat: PAGE_CACHE_SIZE removal

As described in torvalds/linux@4a2d057e the
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were originally introduced
to make it possible to add bigger chunks to the page cache.  This
never panned out and they have therefore been removed from the kernel.

ZFS has been updated to use the PAGE_{SIZE,SHIFT,MASK,ALIGN} macros
and calls to page_cache_release() have been replaced with put_page().

There was no need to introduce a configure check for this because
these interfaces have existed for a very long time.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes #4489

8 years agoFix WANT_DEVNAME2DEVID configure error
Brian Behlendorf [Fri, 1 Apr 2016 15:49:19 +0000 (08:49 -0700)]
Fix WANT_DEVNAME2DEVID configure error

Accidentally introduced by commit e4023e4.  The AM_CONDITIONAL
cannot be located where it can be invoked conditionally, as in
the `--with-config=user` case.  Relocate it to the top level
ZFS_AC_CONFIG macro along with the other AM_CONDITIONALs.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4416

8 years agoAdd support 32 bit FS_IOC32_{GET|SET}FLAGS compat ioctls
Colin Ian King [Wed, 30 Mar 2016 22:00:23 +0000 (23:00 +0100)]
Add support 32 bit FS_IOC32_{GET|SET}FLAGS compat ioctls

We need 32 bit userspace FS_IOC32_GETFLAGS and FS_IOC32_SETFLAGS
compat ioctls for systems such as powerpc64.  We use the normal
compat ioctl idiom as used by a variety of file systems to provide
this support.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4477

8 years agoOnly build devname2devid when libudev headers are available
Brian Behlendorf [Thu, 31 Mar 2016 21:50:16 +0000 (14:50 -0700)]
Only build devname2devid when libudev headers are available

Accidentally introduced by commit 39fc0cb.  The devname2devid utility
which depends on libudev must only be built when libudev headers are
available.  This is accomplished through an AM_CONDITIONAL.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4416

8 years agoAdd support for devid and phys_path keys in vdev disk labels
Don Brady [Mon, 14 Mar 2016 16:04:21 +0000 (10:04 -0600)]
Add support for devid and phys_path keys in vdev disk labels

This is foundational work for ZED.

Updates a leaf vdev's persistent device strings on Linux platform

* only applies for a dedicated leaf vdev (aka whole disk)
* updated during pool create|add|attach|import
* used for device matching during auto-{online,expand,replace}
* stored in a leaf disk config label (i.e. alongside 'path' NVP)
* can opt-out using env var ZFS_VDEV_DEVID_OPT_OUT=YES

Some examples:

    path: '/dev/sdb1'
    devid: 'scsi-350000394a8ca4fbc-part1'
    phys_path: 'pci-0000:04:00.0-sas-0x50000394a8ca4fbf-lun-0'

    path: '/dev/mapper/mpatha'
    devid: 'dm-uuid-mpath-35000c5006304de3f'

Signed-off-by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2856
Closes #3978
Closes #4416

8 years agoExpand EDQUOT variable
Andriy Gapon [Fri, 25 Mar 2016 14:29:35 +0000 (16:29 +0200)]
Expand EDQUOT variable

Results in failures with ksh version 93v- 2014-06-25.  This appears
to not be an issue with ksh version 93u+ 2012-08-01.  The expanded
version works correctly for both.

Signed-off-by: Andriy Gapon <andriy.gapon@clusterhq.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4452

8 years agozfs_main: fix `zfs userspace` squashing unresolved entries
Pavel Boldin [Sun, 27 Mar 2016 22:28:32 +0000 (01:28 +0300)]
zfs_main: fix `zfs userspace` squashing unresolved entries

The `zfs userspace` command squashes all entries with unresolved numeric
values into a single output entry because the comparison is always
made by the string name, which is empty for unresolved IDs.

Fix this by falling back to a numerical comparison when either one
of the string values is not found. Entries with unresolved numeric
values are then compared after all entries with a resolved name.
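
A Python sketch of such a comparator follows; it is not the zfs_main.c code.
The tuple layout and function name are hypothetical, and placing an entry with
a resolved name before an unresolved one is an assumption made here to keep
the ordering consistent.

```python
from functools import cmp_to_key


def compare_entries(a, b):
    """Compare (name, numeric_id) pairs; name is None when the ID is unresolved."""
    name_a, id_a = a
    name_b, id_b = b
    if name_a is not None and name_b is not None:
        return (name_a > name_b) - (name_a < name_b)
    if name_a is None and name_b is None:
        # Fall back to the numeric IDs instead of comparing two empty names.
        return (id_a > id_b) - (id_a < id_b)
    # Exactly one name resolved: the named entry sorts first (assumption).
    return 1 if name_a is None else -1


entries = [(None, 1002), ("alice", 1000), (None, 1001), ("bob", 1003)]
ordered = sorted(entries, key=cmp_to_key(compare_entries))
```

The point is that two entries with different unresolved IDs no longer compare
as equal, so they are not merged into one output row.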

Signed-off-by: Pavel Boldin <boldin.pavel@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4440

8 years agoRemove complicated libspl assert wrappers
Brian Behlendorf [Tue, 1 Mar 2016 14:45:43 +0000 (15:45 +0100)]
Remove complicated libspl assert wrappers

Effectively provide our own version of assert()/verify() for use
in user space.  This minimizes our dependencies and aligns the
user space assertion handling with what's used in the kernel.

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4449

8 years agogcc build error: -Wbool-compare in metaslab.c
DHE [Sun, 27 Mar 2016 19:58:27 +0000 (15:58 -0400)]
gcc build error: -Wbool-compare in metaslab.c

When debugging is enabled on a very recent version of gcc
(tested with 5.3.0), DVA_SET_GANG(dva, !!(flags)) fails
because an assertion causes a comparison between what is
technically a boolean and an integer.

Signed-off-by: DHE <git@dehacked.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4465

8 years agoFix zpool_scrub_* test cases
Brian Behlendorf [Sat, 26 Mar 2016 21:17:26 +0000 (14:17 -0700)]
Fix zpool_scrub_* test cases

The zpool_scrub_002, zpool_scrub_003, zpool_scrub_004 test cases fail
reliably when running against small pools or fast storage.  This
occurs because the scrub/resilver operation completes before subsequent
commands can be run.

A one second delay has been added to 10% of zio's in order to ensure
the scrub/resilver operation will run for at least several seconds.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4450

8 years agoUse the correct macro to include backtrace
Carlo Landmeter [Tue, 1 Mar 2016 14:23:09 +0000 (15:23 +0100)]
Use the correct macro to include backtrace

execinfo.h and backtrace() are GNU extensions provided by glibc
and not by gcc, see:

http://www.gnu.org/software/libc/manual/html_mono/libc.html#Backtraces

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4453

8 years agoMove hrtime_t timestruc_t and timespec_t
Carlo Landmeter [Fri, 25 Mar 2016 12:21:53 +0000 (13:21 +0100)]
Move hrtime_t timestruc_t and timespec_t

hrtime_t, timestruc_t and timespec_t should have originally been
included in sys/time.h, so let's move them.

longlong_t is not defined by any standard, so change it to long long.

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4459

8 years agoSet _DATE_FMT to '%+' if not defined in libspl/timestamp.c
Carlo Landmeter [Tue, 1 Mar 2016 15:23:12 +0000 (16:23 +0100)]
Set _DATE_FMT to '%+' if not defined in libspl/timestamp.c

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4458

8 years agoEnsure correct return value type
Carlo Landmeter [Tue, 1 Mar 2016 14:32:52 +0000 (15:32 +0100)]
Ensure correct return value type

When compiling with musl libc the return type will be incorrect.

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4454

8 years agoAdd missing fcntl.h to includes in mount_zfs.c
Carlo Landmeter [Fri, 25 Mar 2016 19:47:03 +0000 (20:47 +0100)]
Add missing fcntl.h to includes in mount_zfs.c

This is needed for musl libc

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4456

8 years agoInclude sys/types.h in devid.h
Carlo Landmeter [Tue, 1 Mar 2016 14:56:26 +0000 (15:56 +0100)]
Include sys/types.h in devid.h

This is needed for musl libc

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4454

8 years agoCorrect typo in spa_load_verify_metadata docs
Richard Laager [Mon, 28 Mar 2016 22:13:42 +0000 (17:13 -0500)]
Correct typo in spa_load_verify_metadata docs

Signed-off-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4471

8 years agozloop.sh requires bash
Brian Behlendorf [Fri, 25 Mar 2016 19:40:58 +0000 (12:40 -0700)]
zloop.sh requires bash

The zloop.sh script requires bash.  It will require further improvements
to be compatible with the alternatives such as dash.  This resolves the
ztest failures observed under Ubuntu in the automated testing.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4441

8 years agowrite_dirs: set_partition expects zero-based partition indices
Andriy Gapon [Fri, 25 Mar 2016 14:32:11 +0000 (16:32 +0200)]
write_dirs: set_partition expects zero-based partition indices

... despite partition names being 1-based.

Signed-off-by: Andriy Gapon <andriy.gapon@clusterhq.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4446

8 years agozfs_copies: do_vol_test must wait for device
Brian Behlendorf [Fri, 25 Mar 2016 18:51:01 +0000 (11:51 -0700)]
zfs_copies: do_vol_test must wait for device

Occasionally zfs_copies_* tests which rely on do_vol_test() will fail
because udev hasn't yet created the minor device.  Wait for it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
8 years agoAdd zloop.sh test script
Brian Behlendorf [Wed, 23 Mar 2016 01:08:59 +0000 (18:08 -0700)]
Add zloop.sh test script

Add Chris Williamson's "new" zloop script so that it may be
integrated with ZoL's automated testing.  The original script may
be found in the openzfs-build repository on GitHub.

Minor modifications were made to the script so it can be run
directly from the ZoL source tree or from installed packages.

Additionally it was updated to use gdb instead of mdb to
extract debugging information from a core dump.

References:
  https://github.com/openzfs/openzfs-build/commit/7fb5d8b
  https://github.com/openzfs/openzfs-build/blob/master/ansible/roles/openzfs-jenkins-slave/files/usr/local/zloop.sh

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4441

8 years agoFix zdb -e and zhack thread_init()
Brian Behlendorf [Thu, 17 Mar 2016 22:32:33 +0000 (15:32 -0700)]
Fix zdb -e and zhack thread_init()

This issue was caused by calling `thread_init()` and `thread_fini()`
multiple times resulting in `kthread_key` being invalid.  To resolve
the issue the explicit calls to `thread_init()` and `thread_fini()`
required by the `zpool` command have been moved into the command.
Consumers such as `zdb` and `zhack` perform the same initialization
through `kernel_init()` and `kernel_fini()`.

Resolving this issue allows multiple additional test cases to
be enabled.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #4331

8 years agoSupport for vectorized algorithms on x86
Gvozden Neskovic [Mon, 29 Feb 2016 18:42:27 +0000 (19:42 +0100)]
Support for vectorized algorithms on x86

This is initial support for x86 vectorized implementations of ZFS parity
and checksum algorithms.

For the compilation phase, the configure step checks whether the toolchain
supports the relevant instruction sets. Each implementation must ensure that
its code is not passed to the compiler if the relevant instruction set is not
supported. For this purpose, the following new defines are provided if the
instruction set is supported:
- HAVE_SSE,
- HAVE_SSE2,
- HAVE_SSE3,
- HAVE_SSSE3,
- HAVE_SSE4_1,
- HAVE_SSE4_2,
- HAVE_AVX,
- HAVE_AVX2.

To detect at runtime whether an instruction set can be used, the following
functions are provided in include/linux/simd_x86.h:
- zfs_sse_available()
- zfs_sse2_available()
- zfs_sse3_available()
- zfs_ssse3_available()
- zfs_sse4_1_available()
- zfs_sse4_2_available()
- zfs_avx_available()
- zfs_avx2_available()
- zfs_bmi1_available()
- zfs_bmi2_available()

These functions should be called once, on module load or initialization.
They are safe to use from user and kernel space.
If an implementation uses more than a single instruction set, both compiler
and runtime support for all relevant instruction sets should be checked.

Kernel fpu methods:
- kfpu_begin()
- kfpu_end()

Use __get_cpuid_max and __cpuid_count from <cpuid.h>.
Both gcc and clang have support for these. They also handle the ebx register
in case it is used for PIC code.
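
The rule that every required instruction set must be available can be
sketched in user-space Python.  This is only an analogy: the real checks are
the C functions listed above, which use cpuid, whereas this sketch reads
/proc/cpuinfo-style text.  The function names and the candidate list are
hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pick_implementation(candidates, available):
    """Return the first candidate whose required flags are all available.

    `candidates` is an ordered list of (name, required_flags), preferred first.
    """
    for name, required in candidates:
        if set(required) <= available:
            return name
    return None


sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 avx\n"
candidates = [
    ("avx2", ["avx", "avx2"]),     # needs both sets at runtime
    ("ssse3", ["sse2", "ssse3"]),
    ("scalar", []),                # always usable fallback
]
chosen = pick_implementation(candidates, parse_cpu_flags(sample))
```

On the sample flags the avx2 candidate is rejected because avx2 is missing
even though avx is present, so the ssse3 candidate is chosen.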

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes #4381

8 years agoCleanup linking
Richard Yao [Tue, 15 Mar 2016 17:28:07 +0000 (13:28 -0400)]
Cleanup linking

I noticed during code review of zfsonlinux/zfs#4385 that the author of a
commit had peppered the various Makefile.am files with `$(TIRPC_LIBS)`
when putting it into `lib/libspl/Makefile.am` should have sufficed. Upon
further examination, it seems that he had copied what we do with
`$(ZLIB)`. We also have a bit of that with `-ldl` too.  Unfortunately,
what we do is wrong, so let's fix it to set a good example for future
contributors.

In addition, we have multiple `-lz` and `-luuid` passed to the compiler
because each `AC_CHECK_LIB` adds it to `$LIBS`. That is somewhat
annoying to see, so we switch to `AC_SEARCH_LIBS` to avoid it.  This is
consistent with the recommendation to use `AC_SEARCH_LIBS` over
`AC_CHECK_LIB` by autotools upstream:

https://www.gnu.org/software/autoconf/manual/autoconf-2.66/html_node/Libraries.html

In an ideal world, this would translate into improvements in ELF's
`DT_NEEDED` entries, but that is not the case because of a couple of
bugs in libtool.

The first bug causes libtool to overlink by using static link
dependencies for dynamic linking:

https://wiki.mageia.org/en/Overlinking_issues_in_packaging#libtool_issues

The workaround for this should be to pass `-Wl,--as-needed` in
`LDFLAGS`. That leads us to the second bug, where libtool passes
`LDFLAGS` after the libraries are specified and `ld` will only honor
`--as-needed` on libraries specified before it:

https://sigquit.wordpress.com/2011/02/16/why-asneeded-doesnt-work-as-expected-for-your-libraries-on-your-autotools-project/

There are a few possible workarounds for the second bug. One is to
either patch the compiler spec file to specify `-Wl,--as-needed` or pass
`-Wl,--as-needed` via `CC` like `CC='gcc -Wl,--as-needed'` so that it is
specified early. Another is to patch ltmain.sh like Gentoo does:

https://gitweb.gentoo.org/repo/gentoo.git/tree/eclass/ELT-patches/as-needed

Without one of those workarounds, this cleanup provides no benefit in
terms of `DT_NEEDED` entry generation. It should still be an improvement
because it nicely simplifies the code while encouraging good habits when
patching autotools scripts.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4426

8 years agoAdd support for s390[x].
Dimitri John Ledkov [Wed, 16 Mar 2016 21:53:20 +0000 (21:53 +0000)]
Add support for s390[x].

Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4425

8 years agoDisable zpool_add_004_pos test case
Brian Behlendorf [Thu, 17 Mar 2016 16:47:54 +0000 (09:47 -0700)]
Disable zpool_add_004_pos test case

This test case adds a zvol as a vdev to an existing pool.  This
use case is currently known to be racy.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
8 years agoAdd the ZFS Test Suite
Brian Behlendorf [Wed, 1 Jul 2015 22:23:09 +0000 (15:23 -0700)]
Add the ZFS Test Suite

Add the ZFS Test Suite and test-runner framework from illumos.
This is a continuation of the work done by Turbo Fredriksson to
port the ZFS Test Suite to Linux.  While this work was originally
conceived as a stand alone project integrating it directly with
the ZoL source tree has several advantages:

  * Allows the ZFS Test Suite to be packaged in zfs-test package.
    * Facilitates easy integration with the CI testing.
    * Users can locally run the ZFS Test Suite to validate ZFS.
      This testing should ONLY be done on a dedicated test system
      because the ZFS Test Suite in its current form is destructive.
  * Allows the ZFS Test Suite to be run directly in the ZoL source
    tree, enabling developers to iterate quickly during development.
  * Developers can easily add/modify tests in the framework as
    features are added or functionality is changed.  The tests
    will then always be in sync with the implementation.

Full documentation for how to run the ZFS Test Suite is available
in the tests/README.md file.

Warning: This test suite is designed to be run on a dedicated test
system.  It will make modifications to the system including, but
not limited to, the following.

  * Adding new users
  * Adding new groups
  * Modifying the following /proc files:
    * /proc/sys/kernel/core_pattern
    * /proc/sys/kernel/core_uses_pid
  * Creating directories under /

Notes:
  * Not all of the test cases are expected to pass and by default
    these test cases are disabled.  The failures are primarily due
    to assumptions made for illumos which are invalid under Linux.
  * When updating these test cases it should be done in as generic
    a way as possible so the patch can be submitted back upstream.
    Most existing library functions have been updated to be Linux
    aware, and the following functions and variables have been added.
    * Functions:
      * is_linux          - Used to wrap a Linux specific section.
      * block_device_wait - Waits for block devices to be added to /dev/.
    * Variables:            Linux          Illumos
      * ZVOL_DEVDIR         "/dev/zvol"    "/dev/zvol/dsk"
      * ZVOL_RDEVDIR        "/dev/zvol"    "/dev/zvol/rdsk"
      * DEV_DSKDIR          "/dev"         "/dev/dsk"
      * DEV_RDSKDIR         "/dev"         "/dev/rdsk"
      * NEWFS_DEFAULT_FS    "ext2"         "ufs"
  * Many of the disabled test cases fail because 'zfs/zpool destroy'
    returns EBUSY.  This is largely caused by the asynchronous nature
    of device handling on Linux and is expected; the impacted test
    cases will need to be updated to handle this.
  * There are several test cases which have been disabled because
    they can trigger a deadlock.  A primary example of this is to
    recursively create zpools within zpools.  These tests have been
    disabled until the root issue can be addressed.
  * Illumos specific utilities such as (mkfile) should be added to
    the tests/zfs-tests/cmd/ directory.  Custom programs required by
    the test scripts can also be added here.
  * SELinux should be either in permissive mode or disabled when
    running the tests.  The test cases should be updated to conform
    to a standard policy.
  * Redundant test functionality has been removed (zfault.sh).
  * Existing test scripts (zconfig.sh) should be migrated to use
    the framework for consistency and ease of testing.
  * The DISKS environment variable currently only supports loopback
    devices because of how the ZFS Test Suite expects partitions to
    be named (p1, p2, etc).  Support must be added to generate the
    correct partition name based on the device location and name.
  * The ZFS Test Suite is part of the illumos code base at:
    https://github.com/illumos/illumos-gate/tree/master/usr/src/test

Original-patch-by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6
Closes #1534

8 years agoIllumos 6681 - zfs list burning lots of time in dodefault() via dsl_prop_*
Alex Wilson [Fri, 11 Mar 2016 23:25:32 +0000 (00:25 +0100)]
Illumos 6681 - zfs list burning lots of time in dodefault() via dsl_prop_*

6681 zfs list burning lots of time in dodefault() via dsl_prop_*
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>

References:
  https://www.illumos.org/issues/6681
  https://github.com/illumos/illumos-gate/commit/d09e447

Ported-by: kernelOfTruth <kerneloftruth@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4406

8 years agoFix aarch64 compilation
Gordan Bobic [Tue, 15 Mar 2016 17:17:51 +0000 (17:17 +0000)]
Fix aarch64 compilation

sys/param.h depends on types defined in sys/types.h
(hrtime_t & timestruc_t).

Signed-off-by: Gordan Bobic <gordan@redsleeve.org>
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4420

8 years agoIllumos 6370 - ZFS send fails to transmit some holes
Paul Dagnelie [Fri, 26 Feb 2016 01:45:19 +0000 (20:45 -0500)]
Illumos 6370 - ZFS send fails to transmit some holes

6370 ZFS send fails to transmit some holes
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Stefan Ring <stefanrin@gmail.com>
Reviewed by: Steven Burgess <sburgess@datto.com>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Robert Mustacchi <rm@joyent.com>

References:
  https://www.illumos.org/issues/6370
  https://github.com/illumos/illumos-gate/commit/286ef71

In certain circumstances, "zfs send -i" (incremental send) can produce
a stream which will result in incorrect sparse file contents on the
target.

The problem manifests as regions of the received file that should be
sparse (and read as zero-filled) actually contain data from a file that
was deleted (and which happened to share this file's object ID).

Note: this can happen only with filesystems (not zvols, because they do
not free (and thus can not reuse) object IDs).

Note: This can happen only if, since the incremental source (FromSnap),
a file was deleted and then another file was created, and the new file
is sparse (i.e. has areas that were never written to and should be
implicitly zero-filled).

We suspect that this was introduced by 4370 (applies only if hole_birth
feature is enabled), and made worse by 5243 (applies if hole_birth
feature is disabled, and we never send any holes).

The bug is caused by the hole birth feature. When an object is deleted
and replaced, all the holes in the object have birth time zero. However,
zfs send cannot tell that the holes are new since the file was replaced,
so it doesn't send them in an incremental. As a result, you can end up
with invalid data when you receive incremental send streams. As a
short-term fix, we can always send holes with birth time 0 (unless it's
a zvol or a dataset where we can guarantee that no objects have been
reused).
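
The short-term rule above can be sketched as a small predicate.  This is
a simplified illustration and not the code from the patch: the name
should_send_hole(), its parameters, and the objects_may_be_reused flag
are hypothetical, and the first check (hole born after the incremental
source) is assumed to be the ordinary hole_birth behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an incremental send should include a hole.
 * Hypothetical helper for illustration only.
 *
 * hole_birth:            birth txg recorded for the hole (0 if none)
 * from_txg:              txg of the incremental source snapshot
 * objects_may_be_reused: false for zvols, or for datasets where no
 *                        object ID can have been reused
 */
static bool
should_send_hole(uint64_t hole_birth, uint64_t from_txg,
    bool objects_may_be_reused)
{
	/* Assumed normal case: the hole was created after the source. */
	if (hole_birth > from_txg)
		return (true);

	/*
	 * Birth time 0 cannot distinguish an old hole from a hole in a
	 * deleted-and-replaced object, so send it to be safe.
	 */
	if (hole_birth == 0 && objects_may_be_reused)
		return (true);

	return (false);
}
```

With from_txg = 100, a hole with birth time 0 is sent for a filesystem
but skipped for a zvol, which is the conservative behavior the
short-term fix describes.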

Ported-by: Steven Burgess <sburgess@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4369
Closes #4050

8 years agoRelax MBR partition scanning requirement
Brian Behlendorf [Thu, 17 Jul 2014 17:46:46 +0000 (10:46 -0700)]
Relax MBR partition scanning requirement

When checking a whole disk to see if it can be safely added to
the pool a variety of checks are done.  One of those checks is
to attempt to determine the partition information and scan all
the partitions for existing filesystems.

Since ZoL contains an EFI library this partition scanning is
easy to do for GPT partitioned disks.  However, for non-GPT
partitioned disks (MBR/EBR) things are a bit harder.  The lack of
a convenient library means non-GPT partitioned disks will not
have all their partitions checked.  For this reason, the default
behavior was to require the force option.  For example:

invalid vdev specification
use '-f' to override the following errors:
/dev/vdb does not contain an GPT label but it may contain partition
information in the MBR.

However, in practice requiring the force option for this case is
counter-intuitively less safe.  The reason is that only the first
error is returned.  Passing the force option suppresses this first
warning and potentially others you were not aware of.

Therefore this patch inverts the default behavior for non-GPT
formatted disks (unformatted, MBR/EBR, etc).  If no GPT table is
detected and no file system is detected on the provided block
device, then the block device is assumed to be safe to use.

Longer term it would be nice to see MBR/EBR scanning added to
the utilities.  This should be fairly straightforward to do.
However these days it's somewhat less critical because Linux
defaults to GPT partition tables for devices 2TB or larger.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2660
Closes #2274

8 years agoFix lock order inversion with zvol_open()
Boris Protopopov [Wed, 23 Sep 2015 16:34:51 +0000 (12:34 -0400)]
Fix lock order inversion with zvol_open()

zfsonlinux issue #3681 - lock order inversion between zvol_open() and
dsl_pool_sync()...zvol_rename_minors()

Remove trylock of spa_namespace_lock as it is no longer needed when
zvol minor operations are performed in a separate context with no
prior locking state; the spa_namespace_lock is no longer held
when bdev->bd_mutex or zfs_state_lock might be taken in the code
paths originating from the zvol minor operation callbacks.

Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3681

8 years agoAdd support for asynchronous zvol minor operations
Boris Protopopov [Sat, 22 Mar 2014 09:07:14 +0000 (05:07 -0400)]
Add support for asynchronous zvol minor operations

zfsonlinux issue #2217 - zvol minor operations: check snapdev
property before traversing snapshots of a dataset

zfsonlinux issue #3681 - lock order inversion between zvol_open()
and dsl_pool_sync()...zvol_rename_minors()

Create a per-pool zvol taskq for asynchronous zvol tasks.
There are a few key design decisions to be aware of.

* Each taskq must be single threaded to ensure tasks are always
  processed in the order in which they were dispatched.

* There is a taskq per-pool in order to keep the pools independent.
  This way if one pool is suspended it will not impact another.

* The preferred location to dispatch a zvol minor task is a sync
  task.  In this context there is easy access to the spa_t and
  minimal error handling is required because the sync task must
  succeed.
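
The ordering guarantee in the first point can be illustrated with a
minimal single-consumer FIFO.  This is only a sketch and does not use
the real taskq API: task_queue_t, queue_dispatch(), queue_drain() and
log_task() are hypothetical names, and the worker thread is modelled by
a plain drain loop.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (task_func_t)(void *);

typedef struct task {
	task_func_t	*func;
	void		*arg;
	struct task	*next;
} task_t;

/* Hypothetical FIFO standing in for one per-pool task queue. */
typedef struct task_queue {
	task_t	*head;
	task_t	*tail;
} task_queue_t;

/* Append a task at the tail; returns 0 on success, -1 on failure. */
static int
queue_dispatch(task_queue_t *q, task_func_t *func, void *arg)
{
	task_t *t = malloc(sizeof (*t));

	if (t == NULL)
		return (-1);
	t->func = func;
	t->arg = arg;
	t->next = NULL;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
	return (0);
}

/* A single consumer runs tasks strictly in dispatch order. */
static void
queue_drain(task_queue_t *q)
{
	task_t *t;

	while ((t = q->head) != NULL) {
		q->head = t->next;
		if (q->head == NULL)
			q->tail = NULL;
		t->func(t->arg);
		free(t);
	}
}

/* Example task: record the first character of its argument. */
static char task_log[16];
static size_t task_log_len;

static void
log_task(void *arg)
{
	if (task_log_len < sizeof (task_log) - 1)
		task_log[task_log_len++] = *(const char *)arg;
}
```

Dispatching log_task with "c", "r" and "d" (say, create, rename and
destroy of a minor) and then draining leaves task_log as "crd".  With
more than one consumer thread that order would no longer be guaranteed.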

Support for asynchronous zvol minor operations addresses issue #3681.

Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2217
Closes #3678
Closes #3681

8 years agoRemove RPM package restriction
Brian Behlendorf [Thu, 10 Mar 2016 17:14:27 +0000 (09:14 -0800)]
Remove RPM package restriction

ZFS on Linux is regularly tested on arm, ppc, ppc64, i686 and x86_64
architectures.  Given this the artificial architecture restriction in
the packaging has been removed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

8 years agoChange KM_SLEEP to TQ_SLEEP in spa_deadman()
Tim Chase [Mon, 7 Mar 2016 13:35:29 +0000 (07:35 -0600)]
Change KM_SLEEP to TQ_SLEEP in spa_deadman()

Since they both evaluate to zero, this is a semi-cosmetic change
but the latter is the proper value to use as an argument to
taskq_dispatch_delay().

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4393

8 years agoUpdated paths to scan when importing zpool(s)
Thijs Cramer [Thu, 4 Feb 2016 21:34:49 +0000 (22:34 +0100)]
Updated paths to scan when importing zpool(s)

Added by-partlabel and by-partuuid to the default device search
path.  Made device names in by-label more preferable.

Signed-off-by: Thijs Cramer <thijs.cramer@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3892

8 years agoRequire libblkid
Brian Behlendorf [Fri, 19 Feb 2016 23:43:43 +0000 (15:43 -0800)]
Require libblkid

Historically libblkid support was detected as part of configure
and optionally enabled.  This was done because at the time support
for detecting ZFS pool vdevs had just been added to libblkid and
those updated packages were not yet part of many distributions.
This is no longer the case and any reasonably current distribution
will ship a version of libblkid which can detect ZFS pool vdevs.

This patch makes libblkid mandatory at build time and libblkid
the preferred method of scanning for ZFS pools.  For distributions
which include a modern version of libblkid there is no change in
behavior.  Explicitly scanning the default search paths is still
supported and can be enabled with the '-s' command line option.

Additionally making libblkid mandatory means that the 'zpool create'
command can reliably detect if a specified device has an existing
non-ZFS filesystem (ext4, xfs) and print a warning.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2448

8 years agoEnsure zed _finish_daemonize() leaves fds 0-2 open
Chris Dunlap [Tue, 1 Mar 2016 20:23:55 +0000 (12:23 -0800)]
Ensure zed _finish_daemonize() leaves fds 0-2 open

In zed's _finish_daemonize(), /dev/null is open()d onto a temporary
file descriptor which is then dup()d onto stdin, stdout, and stderr.
But if file descriptors 0, 1, or 2 are not already open at the start
of this function, then the temporary file descriptor will fall within
this range and be inadvertently closed when the function cleans up.

This commit adds a check to prevent inadvertently closing this
(presumably temporary) file descriptor when it shouldn't.
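
The pattern described above can be sketched as follows.  This is not
zed's actual code: redirect_stdio_to_devnull() is a hypothetical helper
that only demonstrates the "close the temporary descriptor only if it
is above 2" check.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Redirect stdin, stdout, and stderr to /dev/null.
 * Hypothetical helper for illustration; returns 0 on success, -1 on
 * failure.
 */
static int
redirect_stdio_to_devnull(void)
{
	int fd = open("/dev/null", O_RDWR);

	if (fd < 0)
		return (-1);

	/* dup2() leaves fd untouched when it already equals the target. */
	if (dup2(fd, STDIN_FILENO) < 0 ||
	    dup2(fd, STDOUT_FILENO) < 0 ||
	    dup2(fd, STDERR_FILENO) < 0) {
		if (fd > STDERR_FILENO)
			(void) close(fd);
		return (-1);
	}

	/*
	 * If fds 0-2 were not all open on entry, open() returned one of
	 * them.  Closing it here would undo the redirection, so only
	 * close a descriptor that is genuinely temporary.
	 */
	if (fd > STDERR_FILENO)
		(void) close(fd);

	return (0);
}
```

If the process starts with stdin closed, open() returns 0; without the
final check the function would close fd 0 again and leave stdin
unopened.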

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4384

8 years agoFix zpool iostat bandwidth/ops calculation
Tony Hutter [Wed, 2 Mar 2016 20:57:06 +0000 (12:57 -0800)]
Fix zpool iostat bandwidth/ops calculation

print_vdev_stats() subtracts the old bandwidth/ops stats from the new stats
to calculate the bandwidth/ops numbers in "zpool iostat".  However when the
TXG numbers change between stats, zpool_refresh_stats() will incorrectly assign
a NULL to the old stats. This causes print_vdev_stats() to use zeroes for
the old bandwidth/ops numbers, resulting in an inaccurate calculation.

This fix allows the calculation to happen even when TXGs change.
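
The effect of the NULL old stats can be shown with a small calculation.
The sketch below is not the zpool code: io_counters_t and
bytes_per_second() are hypothetical, simplified stand-ins for the
cumulative vdev stats and the subtraction done in print_vdev_stats().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cumulative counters for one vdev. */
typedef struct io_counters {
	uint64_t	ops;
	uint64_t	bytes;
} io_counters_t;

/*
 * Bandwidth over one sampling interval.  A NULL oldc is treated as all
 * zeroes, so the result reflects the entire cumulative total rather
 * than the activity in the last interval.
 */
static uint64_t
bytes_per_second(const io_counters_t *oldc, const io_counters_t *newc,
    uint64_t interval_secs)
{
	uint64_t prev = (oldc != NULL) ? oldc->bytes : 0;

	if (interval_secs == 0)
		return (0);
	return ((newc->bytes - prev) / interval_secs);
}
```

With an old sample of 1000000 bytes, a new sample of 1500000 bytes and
a 5 second interval, the result is 100000 bytes per second.  If the old
sample is replaced by NULL, the same call reports 300000, which is the
kind of inflated figure the bug produced.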

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4387