]> granicus.if.org Git - zfs/log
zfs
6 years agoFix some typos
John Eismeier [Wed, 28 Feb 2018 16:57:10 +0000 (11:57 -0500)]
Fix some typos

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: John Eismeier <john.eismeier@gmail.com>
Closes #7237

6 years agoFix zpool(8) list example to match actual format
Tomohiro Kusumi [Wed, 28 Feb 2018 16:54:53 +0000 (01:54 +0900)]
Fix zpool(8) list example to match actual format

a05dfd00 (Illumos 5147) has swapped FRAG and EXPANDSZ,
so it's natural to modify these examples.

 # zpool list | head -1
 NAME     SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
                              ^^^^^^^^^^^^^^^

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #7244

6 years agoAdd SMART self-test results to zpool status -c
Tony Hutter [Tue, 27 Feb 2018 17:31:27 +0000 (09:31 -0800)]
Add SMART self-test results to zpool status -c

Add in SMART self-test results to zpool status|iostat -c.  This
works for both SAS and SATA drives.

Also, add plumbing to allow the 'smart' script to take smartctl
output from a directory of output text files instead of running
it against the vdevs.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7178

6 years agoAdd scrub after resilver zed script
Tony Hutter [Fri, 23 Feb 2018 19:38:05 +0000 (11:38 -0800)]
Add scrub after resilver zed script

* Add a zed script to kick off a scrub after a resilver.  The script is
disabled by default.

* Add a optional $PATH (-P) option to zed to allow it to use a custom
$PATH for its zedlets.  This is needed when you're running zed under
the ZTS in a local workspace.

* Update test scripts to not copy in all-debug.sh and all-syslog.sh by
default.  They can be optionally copied in as part of zed_setup().
These scripts slow down zed considerably under heavy events loads and
can cause events to be dropped or their delivery delayed. This was
causing some sporadic failures in the 'fault' tests.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4662
Closes #7086

6 years agoFix free memory calculation on v3.14+
chrisrd [Fri, 23 Feb 2018 16:50:06 +0000 (03:50 +1100)]
Fix free memory calculation on v3.14+

Provide infrastructure to auto-configure to enum and API changes in the
global page stats used for our free memory calculations.

arc_free_memory has been broken since an API change in Linux v3.14:

2016-07-28 v4.8 599d0c95 mm, vmscan: move LRU lists to node
2016-07-28 v4.8 75ef7184 mm, vmstat: add infrastructure for per-node
  vmstats

These commits moved some of global_page_state() into
global_node_page_state(). The API change was particularly egregious as,
instead of breaking the old code, it silently did the wrong thing and we
continued using global_page_state() where we should have been using
global_node_page_state(), thus indexing into the wrong array via
NR_SLAB_RECLAIMABLE et al.

There have been further API changes along the way:

2017-07-06 v4.13 385386cf mm: vmstat: move slab statistics from zone to
  node counters
2017-09-06 v4.14 c41f012a mm: rename global_page_state to
  global_zone_page_state

...and various (incomplete, as it turns out) attempts to accomodate
these changes in ZoL:

2017-08-24 2209e409 Linux 4.8+ compatibility fix for vm stats
2017-09-16 787acae0 Linux 3.14 compat: IO acct, global_page_state, etc
2017-09-19 661907e6 Linux 4.14 compat: IO acct, global_page_state, etc

The config infrastructure provided here resolves these issues going back
to the original API change in v3.14 and is robust against further Linux
changes in this area.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7170

6 years agoReport duration and error in mmp_history entries
Olaf Faaland [Thu, 22 Feb 2018 23:34:34 +0000 (15:34 -0800)]
Report duration and error in mmp_history entries

After an MMP write completes, update the relevant mmp_history entry
with the time between submission and completion, and the error
status of the write.

[faaland1@toss3a zfs]$ cat /proc/spl/kstat/zfs/pool/multihost
39 0 0x01 100 8800 69147946270893 72723903122926
id       txg     timestamp  error  duration   mmp_delay    vdev_guid
10607    1166    1518985089 0      138301     637785455    4882...
10608    1166    1518985089 0      136154     635407747    1151...
10609    1166    1518985089 0      803618560  633048078    9740...
10610    1166    1518985090 0      144826     633048078    4882...
10611    1166    1518985090 0      164527     666187671    1151...

Where duration = gethrtime_in_done_fn - gethrtime_at_submission, and
error = zio->io_error.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7190

6 years agoDo not initiate MMP writes while pool is suspended
Olaf Faaland [Thu, 22 Feb 2018 17:14:46 +0000 (09:14 -0800)]
Do not initiate MMP writes while pool is suspended

While the pool is suspended on host A, it may be imported on host B.
If host A continued to write MMP blocks, it would be blindly
overwriting MMP blocks written by host B, and the blocks written by
host A would have outdated txg information.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7182

6 years agoLinux 4.16 compat: use correct *_dec_and_test()
Tony Hutter [Thu, 22 Feb 2018 17:02:06 +0000 (09:02 -0800)]
Linux 4.16 compat: use correct *_dec_and_test()

Use refcount_dec_and_test() on 4.16+ kernels, atomic_dec_and_test()
on older kernels.  https://lwn.net/Articles/714974/

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7179
Closes: #7211
6 years agoAllow modprobe to fail when called within systemd
Matthew Thode [Wed, 21 Feb 2018 22:45:35 +0000 (22:45 +0000)]
Allow modprobe to fail when called within systemd

This allows for systems with zfs built into the kernel manually to run
these services.  Otherwise the service will fail to start.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #7174

6 years agoAdd SMART attributes for SSD and NVMe
bunder2015 [Wed, 21 Feb 2018 21:52:47 +0000 (16:52 -0500)]
Add SMART attributes for SSD and NVMe

This adds the SMART attributes required to probe Samsung SSD and NVMe
(and possibly others) disks when using the "zpool status -c" command.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7183
Closes #7193

6 years agoCorrect count_uberblocks in mmp.kshlib
Giuseppe Di Natale [Wed, 21 Feb 2018 00:28:52 +0000 (16:28 -0800)]
Correct count_uberblocks in mmp.kshlib

A log_must call was causing count_uberblocks to return more
than just the uberblock count. Remove the log_must since it
was only logging a sleep.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7191

6 years agoFix config issues: frame size and headers
chrisrd [Thu, 15 Feb 2018 20:58:23 +0000 (07:58 +1100)]
Fix config issues: frame size and headers

1. With various (debug and/or tracing?) kernel options enabled it's
possible for 'struct inode' and 'struct super_block' to exceed the
default frame size, leaving errors like this in config.log:

build/conftest.c:116:1: error: the frame size of 1048 bytes is larger
than 1024 bytes [-Werror=frame-larger-than=]

Fix this by removing the frame size warning for config checks

2. Without the correct headers included, it's possible for declarations
to be missed, leaving errors like this in the config.log:

build/conftest.c:131:14: error: ‘struct nameidata’ declared inside
parameter list [-Werror]

Fix this by adding appropriate headers.

Note: Both these issues can result in silent config failures because
the compile failure is taken to mean "this option is not supported by
this kernel" rather than "there's something wrong with the config
test". This can lead to something merely annoying (compile failures) to
something potentially serious (miscompiled or misused kernel primitives
or functions). E.g. the fixes included here resulted in these
additional defines in zfs_config.h with linux v4.14.19:

Also, drive-by whitespace fixes in config/* files which don't mention
"GNU" (those ones look to be imported from elsewhere so leave them
alone).

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7169

6 years agoClarify zinject(8) explanation of -e
Olaf Faaland [Thu, 15 Feb 2018 17:50:06 +0000 (09:50 -0800)]
Clarify zinject(8) explanation of -e

Error injection of EIO or ENXIO simply sets the zio's io_error value,
rather than preventing the read or write from occurring.  This is
important information as it affects how the probes must be used.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7172

6 years agoOpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio
George Wilson [Thu, 8 Feb 2018 20:04:14 +0000 (15:04 -0500)]
OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio

PROBLEM
=======
It's possible for a parent zio to complete even though it has children
which have not completed. This can result in the following panic:
    > $C
    ffffff01809128c0 vpanic()
    ffffff01809128e0 mutex_panic+0x58(fffffffffb94c904ffffff597dde7f80)
    ffffff0180912950 mutex_vector_enter+0x347(ffffff597dde7f80)
    ffffff01809129b0 zio_remove_child+0x50(ffffff597dde7c58ffffff32bd901ac0,
    ffffff3373370908)
    ffffff0180912a40 zio_done+0x390(ffffff32bd901ac0)
    ffffff0180912a70 zio_execute+0x78(ffffff32bd901ac0)
    ffffff0180912b30 taskq_thread+0x2d0(ffffff33bae44140)
    ffffff0180912b40 thread_start+8()
    > ::status
    debugging crash dump vmcore.2 (64-bit) from batfs0390
    operating system: 5.11 joyent_20170911T171900Z (i86pc)
    image uuid: (not set)
    panic message: mutex_enter: bad mutex, lp=ffffff597dde7f80
    owner=ffffff3c59b39480 thread=ffffff0180912c40
    dump content: kernel pages only
The problem is that dbuf_prefetch along with l2arc can create a zio tree
which confuses the parent zio and allows it to complete with while children
still exist. Here's the scenario:
    zio tree:
        pio
         |--- lio
The parent zio, pio, has entered the zio_done stage and begins to check its
children to see there are still some that have not completed. In zio_done(),
the children are checked in the following order:
    zio_wait_for_children(zio, ZIO_CHILD_VDEV, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_GANG, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_DDT, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_LOGICAL, ZIO_WAIT_DONE)
If pio, finds any child which has not completed then it stops executing and
goes to sleep. Each call to zio_wait_for_children() will grab the io_lock
while checking the particular child.
In this scenario, the pio has completed the first call to
zio_wait_for_children() to check for any ZIO_CHILD_VDEV children. Since
the only zio in the zio tree right now is the logical zio, lio, then it
completes that call and prepares to check the next child type.
In the meantime, the lio completes and in its callback creates a child vdev
zio, cio. The zio tree looks like this:
    zio tree:
        pio
         |--- lio
         |--- cio
The lio then grabs the parent's io_lock and removes itself.
    zio tree:
        pio
         |--- cio
The pio continues to run but has already completed its check for ZIO_CHILD_VDEV
and will erroneously complete. When the child zio, cio, completes it will panic
the system trying to reference the parent zio which has been destroyed.
SOLUTION
========
The fix is to rework the zio_wait_for_children() logic to accept a bitfield
for all the children types that it's interested in checking. The
io_lock will is held the entire time we check all the children types. Since
the function now accepts a bitfield, a simple ZIO_CHILD_BIT() macro is provided
to allow for the conversion between a ZIO_CHILD type and the bitfield used by
the zio_wiat_for_children logic.

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Youzhong Yang <youzhong@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8857
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/862ff6d99c
Issue #5918
Closes #7168

6 years ago'zfs receive' fails with "dataset is busy"
LOLi [Mon, 12 Feb 2018 20:28:59 +0000 (21:28 +0100)]
'zfs receive' fails with "dataset is busy"

Receiving an incremental stream after an interrupted "zfs receive -s"
fails with the message "dataset is busy": this is because we still have
the hidden clone ../%recv from the resumable receive.

Improve the error message suggesting the existence of a partially
complete resumable stream from "zfs receive -s" which can be either
aborted ("zfs receive -A") or resumed ("zfs send -t").

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7129
Closes #7154

6 years agocontrib/initramfs: add missing conf.d/zfs
LOLi [Mon, 12 Feb 2018 19:40:00 +0000 (20:40 +0100)]
contrib/initramfs: add missing conf.d/zfs

When upgrading from the distribution-provided zfs-initramfs package on
root-on-zfs Ubuntu and Debian the system may fail to boot: this change
adds the missing initramfs configuration file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7158

6 years agommp should use a fixed tag for spa_config locks
sanjeevbagewadi [Mon, 12 Feb 2018 19:30:38 +0000 (01:00 +0530)]
mmp should use a fixed tag for spa_config locks

mmp_write_uberblock() and mmp_write_done() should the same tag
for spa_config_locks.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #6530
Closes #7155

6 years agoHandle zap_add() failures in mixed case mode
sanjeevbagewadi [Fri, 9 Feb 2018 18:15:53 +0000 (23:45 +0530)]
Handle zap_add() failures in mixed case mode

With "casesensitivity=mixed", zap_add() could fail when the number of
files/directories with the same name (varying in case) exceed the
capacity of the leaf node of a Fatzap. This results in a ASSERT()
failure as zfs_link_create() does not expect zap_add() to fail. The fix
is to handle these failures and rollback the transactions.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #7011
Closes #7054

6 years agoFix zdb -ed on objset for exported pool
Chunwei Chen [Fri, 2 Feb 2018 00:36:40 +0000 (16:36 -0800)]
Fix zdb -ed on objset for exported pool

zdb -ed on objset for exported pool would failed with:
  failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb pass objset name to spa_import, it uses that
name to create a spa. Later, when dmu_objset_own tries to lookup the spa
using real pool name, it can't find one.

We fix this by make sure we pass pool name rather than objset name to
spa_import.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #6464

6 years agoFix zdb -E segfault
Chunwei Chen [Fri, 2 Feb 2018 00:28:11 +0000 (16:28 -0800)]
Fix zdb -E segfault

SPA_MAXBLOCKSIZE is too large for stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099

6 years agoFix zdb -R decompression
Chunwei Chen [Fri, 2 Feb 2018 00:19:36 +0000 (16:19 -0800)]
Fix zdb -R decompression

There are some issues in the zdb -R decompression implementation.

The first is that ZLE can easily decompress non-ZLE streams. So we add
ZDB_NO_ZLE env to make zdb skip ZLE.

The second is the random bytes appended to pabd, pbuf2 stuff. This serve
no purpose at all, those bytes shouldn't be read during decompression
anyway. Instead, we randomize lbuf2, so that we can make sure
decompression fill exactly to lsize by bcmp lbuf and lbuf2.

The last one is the condition to detect fail is wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #4984

6 years agoFix racy assignment of zcb.zcb_haderrors
Chunwei Chen [Thu, 1 Feb 2018 23:42:41 +0000 (15:42 -0800)]
Fix racy assignment of zcb.zcb_haderrors

zcb_haderrors will be modified in zdb_blkptr_done, which is
asynchronous. So we must move this assignment after zio_wait.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099

6 years agoFix zle_decompress out of bound access
Chunwei Chen [Thu, 1 Feb 2018 23:41:05 +0000 (15:41 -0800)]
Fix zle_decompress out of bound access

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099

6 years agoFix zdb -c traverse stop on damaged objset root
Chunwei Chen [Tue, 30 Jan 2018 21:39:11 +0000 (13:39 -0800)]
Fix zdb -c traverse stop on damaged objset root

If a corruption happens to be on a root block of an objset, zdb -c will
not correctly report the error, and it will not traverse the datasets
that come after. This is because traverse_visitbp, which does the
callback and reset error for TRAVERSE_HARD, is skipped when traversing
zil is failed in traverse_impl.

Here's example of what 'zdb -eLcc' command looks like on a pool with
damaged objset root:

== before patch:

Traversing all blocks to verify checksums ...

Error counts:

errno  count
block traversal size 379392 != alloc 33987072 (unreachable 33607680)

bp count:             172
ganged count:           0
bp logical:       1678336      avg:   9757
bp physical:       130560      avg:    759     compression:  12.85
bp allocated:      379392      avg:   2205     compression:   4.42
bp deduped:             0    ref>1:      0   deduplication:   1.00
SPA allocated:   33987072     used:  0.80%

additional, non-pointer bps of type 0:         71
Dittoed blocks on same vdev: 101

== after patch:

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0>  -- skipping

Error counts:

errno  count
   52  1
block traversal size 33963520 != alloc 33987072 (unreachable 23552)

bp count:             447
ganged count:           0
bp logical:      36093440      avg:  80745
bp physical:     33699840      avg:  75391     compression:   1.07
bp allocated:    33963520      avg:  75981     compression:   1.06
bp deduped:             0    ref>1:      0   deduplication:   1.00
SPA allocated:   33987072     used:  0.80%

additional, non-pointer bps of type 0:         76
Dittoed blocks on same vdev: 115

==

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099

6 years agoLinux 4.11 compat: avoid refcount_t name conflict
Brian Behlendorf [Thu, 8 Feb 2018 22:31:08 +0000 (14:31 -0800)]
Linux 4.11 compat: avoid refcount_t name conflict

Related to commit 4859fe796, when directly using the kernel's
refcount functions in kernel compatibility code do not map
refcount_t to zfs_refcount_t.  This leads to a type mismatch.

Longer term we should consider renaming refcount_t to
zfs_refcount_t in the zfs code base.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148

6 years agoLinux 4.16 compat: inode_set_iversion()
Brian Behlendorf [Thu, 8 Feb 2018 22:27:59 +0000 (14:27 -0800)]
Linux 4.16 compat: inode_set_iversion()

A new interface was added to manipulate the version field of an
inode.  Add a inode_set_iversion() wrapper for older kernels and
use the new interface when available.

The i_version field was dropped from the trace point due to the
switch to an atomic64_t i_version type.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148

6 years agoOpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after...
WHR [Sun, 14 Jan 2018 20:57:54 +0000 (23:57 +0300)]
OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable

Authored by: WHR <msl0000023508@gmail.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8966
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c95549fcdc
Closes #7141

6 years agoRemove deprecated zfs_arc_p_aggressive_disable
Richard Elling [Wed, 7 Feb 2018 19:54:20 +0000 (11:54 -0800)]
Remove deprecated zfs_arc_p_aggressive_disable

zfs_arc_p_aggressive_disable is no more. This PR removes docs
and module parameters for zfs_arc_p_aggressive_disable.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Richard Elling <Richard.Elling@RichardElling.com>
Closes #7135

6 years agoFix default libdir for Debian/Ubuntu
Brian Behlendorf [Tue, 6 Feb 2018 04:42:52 +0000 (20:42 -0800)]
Fix default libdir for Debian/Ubuntu

The distribution provided architecture specific RPM macro files
for x86_64 and other architectures on Debian/Ubuntu specify the
wrong default libdir install location.  When building deb packages
override _lib with the correct location.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7083
Closes #7101

6 years agoBug fix in qat_compress.c for vmalloc addr check
wli5 [Mon, 5 Feb 2018 18:26:27 +0000 (02:26 +0800)]
Bug fix in qat_compress.c for vmalloc addr check

Remove the unused vmalloc address check, and function mem_to_page
will handle the non-vmalloc address when map it to a physical
address.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #7125

6 years agoFix systemd_ RPM macros usage on Debian-based distributions
LOLi [Fri, 2 Feb 2018 21:50:42 +0000 (22:50 +0100)]
Fix systemd_ RPM macros usage on Debian-based distributions

Debian-based distributions do not seem to provide RPM macros for
dealing with systemd pre- and post- (un)install actions: this results
in errors when installing or upgrading .deb packages because the
resulting control scripts contain the following unresolved macros:

 * %systemd_post
 * %systemd_preun
 * %systemd_postun

Fix this by providing default values for postinstall, preuninstall and
postuninstall scripts when these macros are not defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7074
Closes #7100

6 years agoEmit an error message before MMP suspends pool
John L. Hammond [Wed, 17 Jan 2018 20:24:42 +0000 (14:24 -0600)]
Emit an error message before MMP suspends pool

In mmp_thread(), emit an MMP specific error message before calling
zio_suspend() so that the administrator will understand why the pool
is being suspended.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Closes #7048

6 years agoZTS: Fix create-o_ashift test case
LOLi [Tue, 19 Dec 2017 18:49:33 +0000 (19:49 +0100)]
ZTS: Fix create-o_ashift test case

The function that fills the uberblock ring buffer on every device label
has been reworked to avoid occasional failures caused by a race
condition that prevents 'zpool sync' from writing some uberblock
sequentially: this happens when the pool sync ioctl dispatch code calls
txg_wait_synced() while we're already waiting for a TXG to sync.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6924
Closes #6977

6 years agoFix --with-systemd on Debian-based distributions (#6963)
LOLi [Sun, 17 Dec 2017 22:08:48 +0000 (23:08 +0100)]
Fix --with-systemd on Debian-based distributions (#6963)

These changes propagate the "--with-systemd" configure option to the
RPM spec file, allowing Debian-based distributions to package
systemd-related files.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6591
Closes #6963

6 years agoRemove vn_rename and vn_remove dependency
Brian Behlendorf [Thu, 19 Oct 2017 17:06:55 +0000 (10:06 -0700)]
Remove vn_rename and vn_remove dependency

The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file.  By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely.  Functionally an empty cache file is
treated the same as a missing cache file.  This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.

The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.

The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists.  This
means that after exporting all pools and rebooting new pools
will not the scanned for on the next boot.  This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.

Documentation was updated accordingly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#648
Closes #6753

6 years agoFix "--enable-code-coverage" debug build
Brian Behlendorf [Sat, 23 Sep 2017 05:16:18 +0000 (22:16 -0700)]
Fix "--enable-code-coverage" debug build

When --enable-code-coverage is provided it should not result
in NDEBUG being defined.  This is controlled by --enable-debug.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6674

6 years agoUpdate codecov.yml
Brian Behlendorf [Sat, 23 Sep 2017 01:54:34 +0000 (18:54 -0700)]
Update codecov.yml

Update the codecov.yml to make the following functional changes.

* Do not require the CI testing to pass before posting results.
* Set red-yellow-green coverage percent from 50%-100%
* Allow a 1% drop in coverage to still be considered a pass.
* Reduce the size of the comment posted to the issue.

Additionally, the top level README.markdown has been updated
to include the codecov.io badge and the project summary reworded.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6669

6 years agoAdd support for "--enable-code-coverage" option
Prakash Surya [Sat, 23 Sep 2017 01:49:57 +0000 (18:49 -0700)]
Add support for "--enable-code-coverage" option

This change adds support for a new option that can be passed to the
configure script: "--enable-code-coverage". Further, the "--enable-gcov"
option has been removed, as this new option provides the same
functionality (plus more).

When using this new option the following make targets are available:

 * check-code-coverage
 * code-coverage-capture
 * code-coverage-clean

Note: these make targets can only be run from the root of the project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6670

6 years agoMake "-fno-inline" compile option more accessible
Prakash Surya [Fri, 15 Sep 2017 18:47:11 +0000 (11:47 -0700)]
Make "-fno-inline" compile option more accessible

When functions are inlined, it can make the system much more difficult
to instrument using tools such as ftrace, BPF, crash, etc. Thus, to aid
development and increase the system's observability, when the
"--enable-debuginfo" flag is specified, the "-fno-inline" compilation
option will be used for both userspace and kernel modules.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6605

6 years agoAdd configure option to enable gcov analysis
Brian Behlendorf [Fri, 15 Sep 2017 17:24:13 +0000 (10:24 -0700)]
Add configure option to enable gcov analysis

* Add configure option to enable gcov analysis.
* Includes a few minor ctime fixes.
* Add codecov.yml configuration.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6642

6 years agoImplement --enable-debuginfo to force debuginfo
Richard Yao [Tue, 23 Sep 2014 18:29:30 +0000 (14:29 -0400)]
Implement --enable-debuginfo to force debuginfo

Inspection of a Ubuntu 14.04 x64 system revealed that the config file
used to build the kernel image differs from the config file used to
build kernel modules by the presence of CONFIG_DEBUG_INFO=y:

This in itself is insufficient to show that the kernel is built with
debuginfo, but a cursory analysis of the debuginfo provided and the
size of the kernel strongly suggests that it was built with
CONFIG_DEBUG_INFO=y while the modules were not. Installing
linux-image-$(uname -r)-dbgsym had no obvious effect on the debuginfo
provided by either the modules or the kernel.

The consequence is that issue reports from distributions such as Ubuntu
and its derivatives build kernel modules without debuginfo contain
nonsensical backtraces. It is therefore desireable to force generation
of debuginfo, so we implement --enable-debuginfo. Since the build system
can build both userspace components and kernel modules, the generic
--enable-debuginfo option will force debuginfo for both. However, it
also supports --enable-debuginfo=kernel and --enable-debuginfo=user for
finer grained control.

Enabling debuginfo for the kernel modules works by injecting
CONFIG_DEBUG_INFO=y into the make environment. This is enables
generation of debuginfo by the kernel build systems on all Linux
kernels, but the build environment is slightly different int hat
CONFIG_DEBUG_INFO has not been in the CPP. Adding -DCONFIG_DEBUG_INFO
would fix that, but it would also cause build failures on kernels where
CONFIG_DEBUG_INFO=y is already set. That would complicate its use in
DKMS environments that support a range of kernels and is therefore
undesireable. We could write a compatibility shim to enable
CONFIG_DEBUG_INFO only when it is explicitly disabled, but we forgo
doing that because it is unnecessary. Nothing in ZoL or the kernel uses
CONFIG_DEBUG_INFO in the CPP at this time and that is unlikely to
change.

Enabling debuginfo for the userspace components is done by injecting -g
into CPPFLAGS. This is not necessary because the build system honors the
environment's CPPFLAGS by appending them to the actual CPPFLAGS used,
but it is supported for consistency.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734

6 years agoMake --enable-debug fail when given bogus args
Richard Yao [Tue, 23 Sep 2014 17:31:33 +0000 (13:31 -0400)]
Make --enable-debug fail when given bogus args

Currently, bogus options to --enable-debug become --disable-debug. That
means that passing --enable-debug=true is analogous to --disable-debug,
but the result is counterintuitive. We switch to AS_CASE to allow us to
fail when given a bogus option.

Also, we modify the text printed to clarify that --enable-debug enables
assertions.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734

6 years agoTag zfs-0.7.6 zfs-0.7.6
Tony Hutter [Thu, 1 Feb 2018 18:02:58 +0000 (10:02 -0800)]
Tag zfs-0.7.6

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
6 years agoFix 'zfs receive -o' when used with '-e|-d'
LOLi [Tue, 30 Jan 2018 23:54:33 +0000 (00:54 +0100)]
Fix 'zfs receive -o' when used with '-e|-d'

When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().

Fix this by correctly detecting the toplevel dataset.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7088

Requires-spl: refs/pull/679/head

6 years agoExtend zloop.sh for automated testing
Brian Behlendorf [Mon, 22 Jan 2018 20:48:39 +0000 (12:48 -0800)]
Extend zloop.sh for automated testing

In order to debug issues encountered by ztest during automated
testing it's important that as much debugging information as
possible by dumped at the time of the failure.  The following
changes extend the zloop.sh script in order to make it easier
to integrate with buildbot.

* Add the `-m <maximum cores>` option to zloop.sh to place a
  limit of the number of core dumps generated.  By default, the
  existing behavior is maintained and no limit is set.

* Add the `-l` option to create a 'ztest.core.N' symlink in the
  current directory to the core directory. This functionality
  is provided primarily for buildbot which expects log files to
  have well known names.

* Rename 'ztest.ddt' to 'ztest.zdb' and extend it to dump
  additional basic information on failure for latter analysis.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6999

Conflicts:
scripts/zloop.sh

6 years agoCleanup zloop working directory after each pass
Don Brady [Thu, 21 Sep 2017 17:17:56 +0000 (11:17 -0600)]
Cleanup zloop working directory after each pass

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: John Kennedy <jwk404@gmail.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Issue #6595
Closes #6663

6 years agoOpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads
Alexander Motin [Mon, 20 Nov 2017 17:56:01 +0000 (19:56 +0200)]
OpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads

In case of misaligned I/O sequential requests are not detected as such
due to overlaps in logical block sequence:

    dmu_zfetch(fffff80198dd0ae0, 27347, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27355, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27363, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27371, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27379, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27387, 9, 1)

This patch makes single block overlap to be counted as a stream hit,
improving performance up to several times.

Authored by: Alexander Motin <mav@FreeBSD.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8835
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/aab6dd482a
Closes #7062

6 years agoFix Debian packaging on ARMv7/ARM64
LOLi [Thu, 18 Jan 2018 18:15:41 +0000 (19:15 +0100)]
Fix Debian packaging on ARMv7/ARM64

When building packages on Debian-based systems specify the target
architecture used by 'alien' to convert .rpm packages into .deb: this
avoids detecting an incorrect value which results in the following
errors:

<package>.aarch64.rpm is for architecture aarch64 ; the package cannot be built on this system
<package>.armv7l.rpm is for architecture armel ; the package cannot be built on this system

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7046
Closes #7058

6 years agoFix shellcheck v0.4.6 warnings
Brian Behlendorf [Wed, 17 Jan 2018 18:17:16 +0000 (10:17 -0800)]
Fix shellcheck v0.4.6 warnings

Resolve new warnings reported after upgrading to shellcheck
version 0.4.6.  This patch contains no functional changes.

* egrep is non-standard and deprecated. Use grep -E instead. [SC2196]
* Check exit code directly with e.g. 'if mycmd;', not indirectly
  with $?.  [SC2181]  Suppressed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7040

6 years agoRemove l2arc_nocompress from zfs-module-parameters(5)
DeHackEd [Tue, 16 Jan 2018 18:18:08 +0000 (13:18 -0500)]
Remove l2arc_nocompress from zfs-module-parameters(5)

Parameter was removed in d3c2ae1c0806
(OpenZFS 6950 - ARC should cache compressed data)

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7043

6 years agoFix incompatibility with Reiser4 patched kernels
Richard Yao [Wed, 10 Jan 2018 00:18:19 +0000 (19:18 -0500)]
Fix incompatibility with Reiser4 patched kernels

In ZFSOnLinux, our sources and build system are self contained such that
we do not need to make changes to the Linux kernel sources. Reiser4 on
the other hand exists solely as a kernel tree patch and opts to make
changes to the kernel rather than adapt to it. After Linux 4.1 made a
VFS change that replaced new_sync_read with do_sync_read, Reiser4's
maintainer decided to modify the kernel VFS to export the old function.
This caused our autotools check to misidentify the kernel API as
predating Linux 4.1 on kernels that have been patched with Reiser4
support, which breaks our build.

Reiser4 really should be patched to stop doing this, but lets modify our
check to be more strict to help the affected users of both filesystems.

Also, we were not checking the types of arguments and return value of
new_sync_read() and new_sync_write() . Lets fix that too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #6241
Closes #7021

6 years agoUse zap_count instead of cached z_size for unlink
Alex Zhuravlev [Mon, 8 Jan 2018 18:57:47 +0000 (10:57 -0800)]
Use zap_count instead of cached z_size for unlink

As a performance optimization Lustre does not strictly update
the SA_ZPL_SIZE when adding/removing from non-directory entries.
This results in entries which cannot be removed through the ZPL
layer even though the ZAP is empty and safe to remove.

Resolve this issue by checking the zap_count() directly instead
on relying on the cached SA_ZPL_SIZE.  Micro-benchmarks show no
significant performance impact due to the additional overhead
of using zap_count().

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7019

6 years agoRevert raidz_map and _col structure types
Nathaniel Wesley Filardo [Tue, 9 Jan 2018 22:46:52 +0000 (17:46 -0500)]
Revert raidz_map and _col structure types

As part of the refactoring of ab9f4b0b824ab4cc64a4fa382c037f4154de12d6,
several uint64_t-s and uint8_t-s were changed to other types.  This
caused ZoL github issue #6981, an overflow of a size_t on a 32-bit ARM
machine.  In absense of any strong motivation for the type changes, this
simply puts them back, modulo the changes accumulated for ABD.

Compile-tested on amd64 and run-tested on armhf.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #6981
Closes #7023

6 years agozhack: fix getopt return type
Nathaniel Wesley Filardo [Tue, 9 Jan 2018 19:14:45 +0000 (14:14 -0500)]
zhack: fix getopt return type

This fixes zhack's command processing on ARM.  On ARM char
is unsigned, and so, in promotion to an int, it will never
compare equal to -1.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #7016

6 years agoFix ARC hit rate
Brian Behlendorf [Mon, 8 Jan 2018 17:52:36 +0000 (09:52 -0800)]
Fix ARC hit rate

When the compressed ARC feature was added in commit d3c2ae1
the method of reference counting in the ARC was modified.  As
part of this accounting change the arc_buf_add_ref() function
was removed entirely.

This would have be fine but the arc_buf_add_ref() function
served a second undocumented purpose of updating the ARC access
information when taking a hold on a dbuf.  Without this logic
in place a cached dbuf would not migrate its associated
arc_buf_hdr_t to the MFU list.  This would negatively impact
the ARC hit rate, particularly on systems with a small ARC.

This change reinstates the missing call to arc_access() from
dbuf_hold() by implementing a new arc_buf_access() function.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6171
Closes #6852
Closes #6989

6 years agoFix 'zpool add' handling of nested interior VDEVs
LOLi [Thu, 28 Dec 2017 18:15:32 +0000 (19:15 +0100)]
Fix 'zpool add' handling of nested interior VDEVs

When replacing a faulted device which was previously handled by a spare
multiple levels of nested interior VDEVs will be present in the pool
configuration; the following example illustrates one of the possible
situations:

   NAME                          STATE     READ WRITE CKSUM
   testpool                      DEGRADED     0     0     0
     raidz1-0                    DEGRADED     0     0     0
       spare-0                   DEGRADED     0     0     0
         replacing-0             DEGRADED     0     0     0
           /var/tmp/fault-dev    UNAVAIL      0     0     0  cannot open
           /var/tmp/replace-dev  ONLINE       0     0     0
         /var/tmp/spare-dev1     ONLINE       0     0     0
       /var/tmp/safe-dev         ONLINE       0     0     0
   spares
     /var/tmp/spare-dev1         INUSE     currently in use

This is safe and allowed, but get_replication() needs to handle this
situation gracefully to let zpool add new devices to the pool.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6678
Closes #6996

6 years agoCall commit callbacks from the tail of the list
lidongyang [Fri, 22 Dec 2017 18:19:51 +0000 (05:19 +1100)]
Call commit callbacks from the tail of the list

Our zfs backed Lustre MDT had soft lockups while under heavy metadata
workloads while handling transaction callbacks from osd_zfs.

The problem is zfs is not taking advantage of the fast path in
Lustre's trans callback handling, where Lustre will skip the calls
to ptlrpc_commit_replies() when it already saw a higher transaction
number.

This patch corrects this, it also has a positive impact on metadata
performance on Lustre with osd_zfs, plus some cleanup in the headers.

A similar issue for ext4/ldiskfs is described on:
https://jira.hpdd.intel.com/browse/LU-6527

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Closes #6986

6 years agoHandle broken pipes in arc_summary
Giuseppe Di Natale [Tue, 19 Dec 2017 21:19:24 +0000 (13:19 -0800)]
Handle broken pipes in arc_summary

Using a command similar to 'arc_summary.py | head' causes
a broken pipe exception. Gracefully exit in the case of a
broken pipe in arc_summary.py.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6965
Closes #6969

6 years agoHandle invalid options in arc_summary
LOLi [Tue, 19 Dec 2017 21:02:40 +0000 (22:02 +0100)]
Handle invalid options in arc_summary

If an invalid option is provided to arc_summary.py we handle any error
thrown from the getopt Python module and print the usage help message.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6983

6 years agoOpenZFS 8794 - cstyle generates warnings with recent perl
Dominik Hassler [Thu, 9 Nov 2017 14:22:07 +0000 (15:22 +0100)]
OpenZFS 8794 - cstyle generates warnings with recent perl

Authored by: Dominik Hassler <hadfl@omniosce.org>
Reviewed by: Andy Fiddaman <andy@omniosce.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8794
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/578f67364c
Closes #6973

6 years agoUpdate for cppcheck v1.80
Brian Behlendorf [Sat, 18 Nov 2017 22:08:00 +0000 (14:08 -0800)]
Update for cppcheck v1.80

Resolve new warnings and errors from cppcheck v1.80.

* [lib/libshare/libshare.c:543]: (warning)
  Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
  Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
  Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
   Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p

Convert the following suppression to inline.

* [module/zfs/zfs_vnops.c:840]: (error)
  Possible null pointer dereference: aiov

Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macro's will never be defined until this functionality
is implemented.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6879

6 years agoFix data on evict_skips in arc_summary.py
Scot W. Stevenson [Sat, 18 Nov 2017 22:07:04 +0000 (23:07 +0100)]
Fix data on evict_skips in arc_summary.py

Display correct data from kstat arcstats for evict_skips,
which is currently repeating the data from mutex_misses.
Fixes #6882

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6882
Closes #6883

6 years agoMinor code cleanups in arc_python.py
Scot W. Stevenson [Wed, 15 Nov 2017 18:28:11 +0000 (19:28 +0100)]
Minor code cleanups in arc_python.py

Remove unused library re and associated variable kstat_pobj. Add note
to documentation at start of program about required support for old
versions of Python. Change variable "format" (which is a built-in
function) to "fmt".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6869

6 years agoFix arc_summary.py -d crash with Python3
Scot W. Stevenson [Sun, 12 Nov 2017 04:27:43 +0000 (05:27 +0100)]
Fix arc_summary.py -d crash with Python3

Prevents arc_summary.py crashing when called with parameter -d or
long form --description with Python3.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6849
Closes #6850

6 years agoSort output of tunables in arc_summary.py
Scot W. Stevenson [Tue, 7 Nov 2017 22:50:15 +0000 (23:50 +0100)]
Sort output of tunables in arc_summary.py

Sort list of tunables printed by _tunable_summary()
alphabetically

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6828

6 years agoAdd documentation strings to arc_summary.py
Scot W. Stevenson [Sun, 5 Nov 2017 21:11:37 +0000 (22:11 +0100)]
Add documentation strings to arc_summary.py

Include docstrings (PEP8, PEP257) for module and all functions.
Separately, remove outdated section in comment at start of
module. Separately, remove unused global constant "usetunable".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6818

6 years agoRewrite fHits() in arc_summary.py with SI units
Scot W. Stevenson [Sat, 4 Nov 2017 20:33:28 +0000 (21:33 +0100)]
Rewrite fHits() in arc_summary.py with SI units

Complete rewrite of fHits(). Move units from non-standard English
abbreviations to SI units, thereby avoiding confusion because of
"long scale" and "short scale" numbers. Remove unused parameter
"Decimal". Add function string. Aim to confirm to PEP8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6815

6 years agoMinor code cleanup in arc_summary.py
Scot W. Stevenson [Fri, 3 Nov 2017 22:43:53 +0000 (23:43 +0100)]
Minor code cleanup in arc_summary.py

Simplify and inline single-use function div1(); inline twice-used
function div2(); add function comment to zfs_header(); replace
variable "unused" in get_Kstat() with "_" following convention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6802

6 years agoRewrite of function fBytes() in arc_summary.py
Scot W. Stevenson [Wed, 25 Oct 2017 06:29:02 +0000 (08:29 +0200)]
Rewrite of function fBytes() in arc_summary.py

Replace if-elif-else construction with shorter loop;
remove unused parameter "Decimal"; centralize format
string; add function documentation string; conform to
PEP8.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6784

6 years agoFix bug in distclean which removes needed files
David Quigley [Wed, 13 Sep 2017 18:45:04 +0000 (12:45 -0600)]
Fix bug in distclean which removes needed files

Running distclean removes the following files because of an error
in Makefile.am

deleted:    tests/zfs-tests/include/commands.cfg
deleted:    tests/zfs-tests/include/libtest.shlib
deleted:    tests/zfs-tests/include/math.shlib
deleted:    tests/zfs-tests/include/properties.shlib
deleted:    tests/zfs-tests/include/zpool_script.shlib

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes #6636

6 years agodmu_objset: release bonus buffer in failure path
Gvozden Neskovic [Wed, 30 Aug 2017 19:09:18 +0000 (21:09 +0200)]
dmu_objset: release bonus buffer in failure path

Reported by kmemleak during testing of a new patch:

```
unreferenced object 0xffff9f1c12e38800 (size 1024):
  comm "z_upgrade", pid 17842, jiffies 4296870904 (age 8746.268s)
  backtrace:
    kmemleak_alloc+0x7a/0x100
    __kmalloc_node+0x26c/0x510
    range_tree_create+0x39/0xa0 [zfs]
    dmu_zfetch_init+0x73/0xe0 [zfs]
    dnode_create+0x12c/0x3b0 [zfs]
    dnode_hold_impl+0x1096/0x1130 [zfs]
    dnode_hold+0x23/0x30 [zfs]
    dmu_bonus_hold_impl+0x6b/0x370 [zfs]
    dmu_bonus_hold+0x1e/0x30 [zfs]
    dmu_objset_space_upgrade+0x114/0x310 [zfs]
    dmu_objset_userobjspace_upgrade_cb+0xd8/0x150 [zfs]
    dmu_objset_upgrade_task_cb+0x136/0x1e0 [zfs]
    kthread+0x119/0x150
```

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6575

6 years agoFix zfs_ioc_pool_sync should not use fnvlist
Chunwei Chen [Mon, 21 Aug 2017 20:11:11 +0000 (13:11 -0700)]
Fix zfs_ioc_pool_sync should not use fnvlist

Use fnvlist on user input would allow user to easily panic zfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6529

6 years agovdev_mirror: load balancing fixes
Gvozden Neskovic [Fri, 4 Aug 2017 09:29:56 +0000 (11:29 +0200)]
vdev_mirror: load balancing fixes

vdev_queue:
- Track the last position of each vdev, including the io size,
  in order to detect linear access of the following zio.
- Remove duplicate `vq_lastoffset`

vdev_mirror:
- Correctly calculate the zio offset (signedness issue)
- Deprecate `vdev_queue_register_lastoffset()`
- Add `VDEV_LABEL_START_SIZE` to zio offset of leaf vdevs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6461

6 years agoUse /sbin/openrc-run for openrc init scripts
BtbN [Wed, 16 Aug 2017 22:51:51 +0000 (00:51 +0200)]
Use /sbin/openrc-run for openrc init scripts

Using /sbin/runscript is deprecated and throws a QA warning
when still used in init scripts.

Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: BtbN <btbn@btbn.de>
Closes #6519

7 years agoTag zfs-0.7.5 zfs-0.7.5
Tony Hutter [Mon, 18 Dec 2017 18:57:47 +0000 (10:57 -0800)]
Tag zfs-0.7.5

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
7 years agoFix multihost stale cache file import
Brian Behlendorf [Mon, 18 Dec 2017 18:28:27 +0000 (10:28 -0800)]
Fix multihost stale cache file import

When the multihost property is enabled it should be impossible to
import an active pool even using the force (-f) option.  This patch
prevents a forced import from succeeding when importing with a
stale cache file.

The root cause of the problem is that the kernel modules trusted
the hostid provided in configuration.  This is always correct when
the configuration is generated by scanning for the pool.  However,
when using an existing cache file the hostid could be stale which
would result in the activity check being skipped.

Resolve the issue by always using the hostid read from the label
configuration where the best uberblock was found.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6933
Closes #6971

7 years agoFix ZTS MMP tests and ztest -M behavior
Olaf Faaland [Sat, 23 Sep 2017 16:28:18 +0000 (09:28 -0700)]
Fix ZTS MMP tests and ztest -M behavior

Quote "$MMP_IMPORT_MSG" when it is passed as an argument, as it is a
multi-word string.  Some tests were passing when they should not have,
because the grep was only testing for the first word.

Correct the message expected when no hostid is set and the test attempts
to enable multihost.  It did not match the actual output in that
situation.

Disable ztest_reguid() when ztest is invoked with the -M option.  If
ztest performs a reguid, a concurrent import attempt may fail with the
error "one or more devices is currently unavailable" if the guid sum is
calculated on the original device guids but compared against the guid
sum ztest wrote based on the new device guids.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6666

7 years agoEnable QAT support in zfs-dkms RPM
David Qian [Thu, 7 Dec 2017 08:43:13 +0000 (16:43 +0800)]
Enable QAT support in zfs-dkms RPM

Enable QAT accelerated gzip compression in zfs-dkms RPM package when
environment variant ICP_ROOT is set to QAT drive source code folder
and QAT hardware presence.  Otherwise, use default gzip compression.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: David Qian <david.qian@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6932

7 years agoAdd zfs-import.target services in spec file
Lalufu [Thu, 14 Dec 2017 01:09:22 +0000 (02:09 +0100)]
Add zfs-import.target services in spec file

Add missing zfs-import.target to list of systemd services in zfs
RPM spec file.

Reviewed-by: Niklas Wagner <Skaro@Skaronator.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Issue #6953
Closes #6955

7 years agoEnable zfs-import.target in systemd preset (#6968)
Antonio Russo [Mon, 18 Dec 2017 17:43:55 +0000 (12:43 -0500)]
Enable zfs-import.target in systemd preset (#6968)

Cherry picked line from PR #6822, this enables the new
target introduced in PR #6764.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
7 years agoTag zfs-0.7.4 zfs-0.7.4
Tony Hutter [Thu, 7 Dec 2017 18:25:36 +0000 (10:25 -0800)]
Tag zfs-0.7.4

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
7 years agoRevert "Long hold the dataset during upgrade"
Tony Hutter [Thu, 7 Dec 2017 00:04:57 +0000 (16:04 -0800)]
Revert "Long hold the dataset during upgrade"

This reverts commit a5c8119eba0a3b8a2a780780099eb82e250ba7f9.

The commit (which was modified to remove encryption) was hitting
ASSERT(dsl_pool_config_held(dmu_objset_pool(os))) in
dmu_objset_upgrade() during automated testing.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
7 years agoAllow test-runner to filter test groups by tag
Giuseppe Di Natale [Fri, 3 Nov 2017 16:53:32 +0000 (09:53 -0700)]
Allow test-runner to filter test groups by tag

Enable test-runner to accept a list of tags to identify
which test groups the user wishes to run.

Also allow test-runner to perform multiple iterations
of a test run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6788

7 years agoFix NFS sticky bit permission denied error
Brian Behlendorf [Mon, 4 Dec 2017 19:55:57 +0000 (11:55 -0800)]
Fix NFS sticky bit permission denied error

When zfs_sticky_remove_access() was originally adapted for Linux
a typo was made which altered the intended behavior.  As described
in the block comment, the intended behavior is that permission
should be granted when the entry is a regular file and you have
write access.  That is, S_ISREG should have been used instead of
S_ISDIR.

Restricting permission to regular files made good sense for older
systems where setting the bit on executable files would instruct
the system to save the program's text segment on the swap device.

On modern systems this behavior has been replaced by the sticky
bit acting as a restricted deletion flag and the plain file
restriction has been relaxed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6889
Closes #6910

7 years agoAdd /usr/bin/env to COPY_EXEC_LIST initramfs hook
JKDingwall [Mon, 4 Dec 2017 19:53:57 +0000 (19:53 +0000)]
Add /usr/bin/env to COPY_EXEC_LIST initramfs hook

5dc1ff29 changed the user space program to mount a zfs snapshot
from /bin/sh to /usr/bin/env.  If the executable is not present
in the initramfs then snapshots cannot be automounted.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: James Dingwall <james.dingwall@zynstra.com>
Closes #5360
Closes #6913
Conflicts:
contrib/initramfs/hooks/zfs

7 years agoFix 'zpool create|add' replication level check
Brian Behlendorf [Mon, 4 Dec 2017 19:50:35 +0000 (11:50 -0800)]
Fix 'zpool create|add' replication level check

When the pool configuration contains a hole due to a previous device
removal ignore this top level vdev.  Failure to do so will result in
the current configuration being assessed to have a non-uniform
replication level and the expected warning will be disabled.

The zpool_add_010_pos test case was extended to cover this scenario.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6907
Closes #6911

7 years agoPreserve itx alloc size for zio_data_buf_free()
Brian Behlendorf [Mon, 4 Dec 2017 19:44:39 +0000 (11:44 -0800)]
Preserve itx alloc size for zio_data_buf_free()

Using zio_data_buf_alloc() to allocate the itx's may be unsafe
because the itx->itx_lr.lrc_reclen field is not constant from
allocation to free.  Using a different itx->itx_lr.lrc_reclen
size in zio_data_buf_free() can result in the allocation being
returned to the wrong kmem cache.

This issue can be avoided entirely by storing the allocation size
in itx->itx_size and using that for zio_data_buf_free().

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6912

7 years agoFix 'zfs get {user|group}objused@' functionality
LOLi [Wed, 29 Nov 2017 19:59:22 +0000 (20:59 +0100)]
Fix 'zfs get {user|group}objused@' functionality

Fix a regression accidentally introduced in 1b81ab4 that prevents
'zfs get {user|group}objused@' from correctly reporting the requested
value.

Update "userspace_003_pos.ksh" and "groupspace_003_pos.ksh" to verify
this functionality.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6908

7 years agoLinux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT
Mark Wright [Tue, 28 Nov 2017 23:33:48 +0000 (10:33 +1100)]
Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT

Fix build errors with gcc 7.2.0 on Gentoo with kernel 4.14
built with CONFIG_GCC_PLUGIN_RANDSTRUCT=y such as:

module/nvpair/nvpair.c:2810:2:error:
positional initialization of field in ?struct? declared with
'designated_init' attribute [-Werror=designated-init]
  nvs_native_nvlist,
  ^~~~~~~~~~~~~~~~~

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Wright <gienah@gentoo.org>
Closes #5390
Closes #6903

7 years agoinitramfs: Honor canmount=off
Richard Laager [Fri, 24 Nov 2017 05:08:02 +0000 (23:08 -0600)]
initramfs: Honor canmount=off

The initramfs script was not honoring canmount=off.  With this change,
it does.  If the administrator has asked that a filesystem not be
mounted, that should be honored.

As an exception, the initramfs script ignores canmount=off on the
rootfs.  The rootfs should not have canmount=off set either.  However,
mounting it anyway seems harmless because it is being asked for
explicitly.  The point of this exception is to avoid the risk of
breaking existing systems, just in case someone has canmount=off set on
their rootfs.

The initramfs still mounts filesystems with canmount=noauto.  This is
necessary because it is typical to set that on the rootfs so that it can
be cloned.  Without canmount=noauto, the clones' duplicate mountpoints
would conflict.

This is the remainder of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897

7 years agoinitramfs: Honor mountpoint=none/legacy
Richard Laager [Fri, 24 Nov 2017 03:54:48 +0000 (21:54 -0600)]
initramfs: Honor mountpoint=none/legacy

For filesystems that are children of the rootfs, when mountpoint=none or
mountpoint=legacy, the initrafms script would assume a mountpoint based
on the dataset path.  Given that the rootfs should have mountpoint=/ and
mountpoint inheritance is is the default behavior of ZFS, this behavior
seems unnecessary.  In any event, it turns mountpoint=none into a no-op.
That removes this option from the administrator, and if someone uses it,
it does not work as expected.  Worse yet, if the mountpoint directory
does not exist (which is the typical case for mountpoint=none), the
mounting and thus the boot process will fail.  For the case of
mountpoint=legacy, the assumed mountpoint may not be the correct value
set in /etc/fstab.

This change makes the initramfs script not mount the filesystem in
either case.  For mountpoint=none, this means we are correctly honoring
the setting.  For mountpoint=legacy, there are two scenarios:  If
canmount=on, the filesystem will be mounted by the normal mechanisms
later in the boot process.  If canmount=noauto, the filesystem will not
be mounted at all, unless the administrator has done something special.
If they're not doing something special and they want it mounted by the
initramfs, they can simply not set mountpoint=legacy.

This is part of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897

7 years agozpool(8): Fix "zpool import -t"
DeHackEd [Tue, 28 Nov 2017 17:10:52 +0000 (12:10 -0500)]
zpool(8): Fix "zpool import -t"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: DHE <git@dehacked.net>
Closes #6894

7 years agoFix column alignment with long zpool names
George G [Sun, 5 Nov 2017 21:09:56 +0000 (21:09 +0000)]
Fix column alignment with long zpool names

`zpool status` normally aligns NAME/STATE/etc columns:

    NAME                       STATE     READ WRITE CKSUM
    dummy                      ONLINE       0     0     0
      mirror-0                 ONLINE       0     0     0
        /tmp/dummy-long-1.bin  ONLINE       0     0     0
        /tmp/dummy-long-2.bin  ONLINE       0     0     0
      mirror-1                 ONLINE       0     0     0
        /tmp/dummy-long-3.bin  ONLINE       0     0     0
        /tmp/dummy-long-4.bin  ONLINE       0     0     0

However, if the zpool name is longer than the zvol names, alignment
issues arise:

    NAME                  STATE     READ WRITE CKSUM
    dummy-very-very-long-zpool-name  ONLINE       0     0     0
      mirror-0            ONLINE       0     0     0
        /tmp/dummy-1.bin  ONLINE       0     0     0
        /tmp/dummy-2.bin  ONLINE       0     0     0
      mirror-1            ONLINE       0     0     0
        /tmp/dummy-3.bin  ONLINE       0     0     0
        /tmp/dummy-4.bin  ONLINE       0     0     0

`zpool iostat` and `zpool import` are also affected:

                  capacity     operations     bandwidth
    pool        alloc   free   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    dummy        104K  1.97G      0      0    152  9.84K
    dummy-very-very-long-zpool-name   152K  1.97G      0      1    144  13.1K
    ----------  -----  -----  -----  -----  -----  -----

    dummy-very-very-long-zpool-name  ONLINE
      mirror-0            ONLINE
        /tmp/dummy-1.bin  ONLINE
        /tmp/dummy-2.bin  ONLINE
      mirror-1            ONLINE
        /tmp/dummy-3.bin  ONLINE
        /tmp/dummy-4.bin  ONLINE

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Gaydarov <git@gg7.io>
Closes #6786

7 years agoEmit history events for 'zpool create'
Brian Behlendorf [Mon, 23 Oct 2017 16:45:59 +0000 (09:45 -0700)]
Emit history events for 'zpool create'

History commands and events were being suppressed for the
'zpool create' command since the history object did not
yet exist.  Create the object earlier so this history
doesn't get lost.

Split the pool_destroy event in to pool_destroy and
pool_export so they may be distinguished.

Updated events_001_pos and events_002_pos test cases.  They
now check for the expected history events and were reworked
to be more reliable.

Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6712
Closes #6486
Conflicts:
tests/zfs-tests/tests/functional/events/events_002_pos.ksh

7 years agoFix dirty check in dmu_offset_next()
Brian Behlendorf [Wed, 15 Nov 2017 18:19:32 +0000 (10:19 -0800)]
Fix dirty check in dmu_offset_next()

The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3125
Closes #6867

7 years agoDisable automatic dependencies in zfs-test package
Brian Behlendorf [Wed, 15 Nov 2017 17:12:52 +0000 (09:12 -0800)]
Disable automatic dependencies in zfs-test package

All of the ZTS test scripts specify /bin/ksh as the interpreter.
Unfortunately, as of Fedora 27 only /usr/bin/ksh is provided by
the package manager.  Rather than change all the scripts to
accommodate the latest Fedora disable automatic dependencies
for the zfs-test package.  Functionally this will not cause
any problems since /bin is a symlink to /usr/bin.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6868

7 years agoFix truncate(2) mtime and ctime handling
LOLi [Mon, 13 Nov 2017 17:24:26 +0000 (18:24 +0100)]
Fix truncate(2) mtime and ctime handling

On Linux, ftruncate(2) always changes the file timestamps, even if the
file size is not changed. However, in case of a successfull
truncate(2), the timestamps are updated only if the file size changes.
This translates to the VFS calling the ZFS Posix Layer "setattr"
function (zpl_setattr) with ATTR_MTIME and ATTR_CTIME unconditionally
set on the iattr mask only when doing a ftruncate(2), while the
truncate(2) is left to the filesystem implementation to be dealt with.

This behaviour is consistent with POSIX:2004/SUSv3 specifications
where there's no explicit requirement for file size changes to update
the timestamps only for ftruncate(2):

http://pubs.opengroup.org/onlinepubs/009695399/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/ftruncate.html

This has been later updated in POSIX:2008/SUSv4 where, for both
truncate(2)/ftruncate(2), there's no mention of this size change
requirement:

http://austingroupbugs.net/view.php?id=489
http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html

Unfortunately the Linux VFS is still calling into the ZPL without
ATTR_MTIME/ATTR_CTIME set in the truncate(2) case: we fix this by
explicitly updating the timestamps when detecting the ATTR_SIZE bit,
which is always set in do_truncate(), on the iattr mask.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6811
Closes #6819

7 years agoOpenZFS 7531 - Assign correct flags to prefetched buffers
benrubson [Thu, 2 Nov 2017 15:01:56 +0000 (16:01 +0100)]
OpenZFS 7531 - Assign correct flags to prefetched buffers

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Authored by: abraunegg <alex.braunegg@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7531
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/468008cb

7 years agoLong hold the dataset during upgrade
Arkadiusz Bubała [Fri, 10 Nov 2017 21:37:10 +0000 (22:37 +0100)]
Long hold the dataset during upgrade

If the receive or rollback is performed while filesystem is upgrading
the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This
will lead to NULL pointer dereference when upgrade tries to access
evicted objset.

This commit adds long hold of dataset during whole upgrade process.
The receive and rollback will return an EBUSY error until the
upgrade is not finished.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #5295
Closes #6837

7 years agoHandle compressed buffers in __dbuf_hold_impl()
Tim Chase [Wed, 8 Nov 2017 21:32:15 +0000 (15:32 -0600)]
Handle compressed buffers in __dbuf_hold_impl()

In __dbuf_hold_impl(), if a buffer is currently syncing and is still
referenced from db_data, a copy is made in case it is dirtied again in
the txg.  Previously, the buffer for the copy was simply allocated with
arc_alloc_buf() which doesn't handle compressed or encrypted buffers
(which are a special case of a compressed buffer).  The result was
typically an invalid memory access because the newly-allocated buffer
was of the uncompressed size.

This commit fixes the problem by handling the 2 compressed cases,
encrypted and unencrypted, respectively, with arc_alloc_raw_buf() and
arc_alloc_compressed_buf().

Although using the proper allocation functions fixes the invalid memory
access by allocating a buffer of the compressed size, another unrelated
issue made it impossible to properly detect compressed buffers in the
first place.  The header's compression flag was set to ZIO_COMPRESS_OFF
in arc_write() when it was possible that an attached buffer was actually
compressed.  This commit adds logic to only set ZIO_COMPRESS_OFF in
the non-ZIO_RAW case which wil handle both cases of compressed buffers
(encrypted or unencrypted).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5742
Closes #6797