Commit graph

22 commits

Author SHA1 Message Date
Pawel Jakub Dawidek
665b912a0a MFC r207920,r207934,r207936,r207937,r207970,r208142,r208147,r208148,r208166,
r208454,r208455,r208458:

r207920:

Back out r205134. It is not stable.

r207934:

Add missing new line characters to the warnings.

r207936:

Eventhough r203504 eliminates taste traffic provoked by vdev_geom.c,
ZFS still like to open all vdevs, close them and open them again,
which in turn provokes taste traffic anyway.

I don't know of any clean way to fix it, so do it the hard way - if we can't
open provider for writing just retry 5 times with 0.5 pauses. This should
elimitate accidental races caused by other classes tasting providers created on
top of our vdevs.

Reported by:	James R. Van Artsdalen <james-freebsd-fs2@jrv.org>
Reported by:	Yuri Pankov <yuri.pankov@gmail.com>

r207937:

I added vfs_lowvnodes event, but it was only used for a short while and now
it is totally unused. Remove it.

r207970:

When there is no memory or KVA, try to help by reclaiming some vnodes.
This helps with 'kmem_map too small' panics.

No objections from:	kib
Tested by:		Alexander V. Ribchansky <shurik@zk.informjust.ua>

r208142:

The whole point of having dedicated worker thread for each leaf VDEV was to
avoid calling zio_interrupt() from geom_up thread context. It turns out that
when provider is forcibly removed from the system and we kill worker thread
there can still be some ZIOs pending. To complete pending ZIOs when there is
no worker thread anymore we still have to call zio_interrupt() from geom_up
context. To avoid this race just remove use of worker threads altogether.
This should be more or less fine, because I also thought that zio_interrupt()
does more work, but it only makes small UMA allocation with M_WAITOK.
It also saves one context switch per I/O request.

PR:		kern/145339
Reported by:	Alex Bakhtin <Alex.Bakhtin@gmail.com>

r208147:

Add task structure to zio and use it instead of allocating one.
This eliminates the only place where we can sleep when calling zio_interrupt().
As a side-effect this can actually improve performance a little as we
allocate one less thing for every I/O.

Prodded by:	kib

r208148:

Allow to configure UMA usage for ZIO data via loader and turn it on by
default for amd64. On i386 I saw performance degradation when UMA was used,
but for amd64 it should help.

r208166:

Fix userland build by making io_task available only for the kernel and by
providing taskq_dispatch_safe() macro.

r208454:

Remove ZIO_USE_UMA from arc.c as well.

r208455:

ZIO_USE_UMA is no longer used.

r208458:

Create UMA zones unconditionally.
2010-05-24 10:09:36 +00:00
Pawel Jakub Dawidek
3a482ccc5e MFC r203504,r204067,r204073,r204101,r204804,r205079,r205080,r205132,r205133,
r205134,r205231,r205253,r205264,r205346,r206051,r206667,r206792,r206793,
    r206794,r206795,r206796,r206797:

r203504:

Open provider for writting when we find the right one. Opening too much
providers for writing provokes huge traffic related to taste events send
by GEOM on close. This can lead to various problems with opening GEOM
providers that are created on top of other GEOM providers.

Reorted by:	Kurt Touet <ktouet@gmail.com>, mr
Tested by:	mr, Baginski Darren <kickbsd@ya.ru>

r204067:

Update comment. We also look for GPT partitions.

r204073:

Add tunable and sysctl to skip hostid check on pool import.

r204101:

Don't set f_bsize to recordsize. It might confuse some software (like squid).

Submitted by:	Alexander Zagrebin <alexz@visp.ru>

r204804:

Remove racy assertion.

Reported by:	Attila Nagy <bra@fsn.hu>
Obtained from:	OpenSolaris, Bug ID 6827260

r205079:

Remove bogus assertion.

Reported by:	Johan Ström <johan@stromnet.se>
Obtained from:	OpenSolaris, Bug ID 6920880

r205080:

Force commit to correct Bug ID:

Obtained from:	OpenSolaris, Bug ID 6920880

r205132:

Don't bottleneck on acquiring the stream locks - this avoids a massive
drop off in throughput with large numbers of simultaneous reads

r205133:

fix compilation under ZIO_USE_UMA

r205134:

make UMA the default allocator for ZFS buffers - this avoids
a great deal of contention in kmem_alloc

r205231:

- reduce contention by breaking up ARC state locks in to 16 for data
  and 16 for metadata
- export L2ARC tunables as sysctls
- add several kstats to track L2ARC state more precisely
- avoid holding a contended lock when atomically incrementing a
  contended counter (no lock protection needed for atomics)

r205253:

use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad

pointed out by Marius Nuennerich and jhb@

r205264:

- cache line align arcs_lock array (h/t Marius Nuennerich)
- fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE
- cache line align buf_hash_table ht_locks array

r205346:

The same code is used to import and to create pool.
The order of operations is the following:
1. Try to open vdev by remembered path and guid.
2. If 1 failed, try to find vdev which guid matches and ignore the path.
3. If 2 failed this means either that the vdev we're looking for is gone
   or that pool is being created and vdev doesn't contain proper guid yet.
   To be able to handle pool creation we open vdev by path anyway.

Because of 3 it is possible that we open wrong vdev on import which can lead to
confusions.

The solution for this is to check spa_load_state. On pool creation it will be
equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it
is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails,
we open by guid. We no longer open wrong vdev on import.

r206051:

IOCPARM_MAX defines maximum size of a structure that can be passed
directly to ioctl(2). Because of how ioctl command is build using _IO*()
macros we have only 13 bits to encode structure size. So the structure
can be up to 8kB-1.

Currently we define IOCPARM_MAX as PAGE_SIZE.

This is IMHO wrong for three main reasons:

1. It is confusing on archs with page size larger than 8kB (not really
   sure if we support such archs (sparc64?)), as even if PAGE_SIZE is
   bigger than 8kB, we won't be able to encode anything larger in ioctl
   command.

2. It is a waste. Why the structure can be only 4kB on most archs if we
   have 13 bits dedicated for that, not 12?

3. It shouldn't depend on architecture and page size. My ioctl command
   can work on one arch, but can't on the other?

Increase IOCPARM_MAX to 8kB and make it independed of PAGE_SIZE and
architecture it is compiled for. This allows to use all the bits on all the
archs for size. Note that this doesn't mean we will copy more on every ioctl(2)
call. No. We still copyin(9)/copyout(9) only exact number of bytes encoded in
ioctl command.

Practical use for this change is ZFS. zfs_cmd_t structure used for ZFS
ioctls is larger than 4kB.

Silence on:	arch@

r206667:

Fix 3-way deadlock that can happen because of ZFS and vnode lock
order reversal.

thread0 (vfs_fhtovp)	thread1 (vop_getattr)	thread2 (zfs_recv)
--------------------	---------------------	------------------
			vn_lock
rrw_enter_read
						rrw_enter_write (hangs)
			rrw_enter_read (hangs)
vn_lock (hangs)

Reported by:	Attila Nagy <bra@fsn.hu>

r206792:

Set ARC_L2_WRITING on L2ARC header creation.

Obtained from:	OpenSolaris

r206793:

Remove racy assertion.

Obtained from:	OpenSolaris

r206794:

Extend locks scope to match OpenSolaris.

r206795:

Add missing list and lock destruction.

r206796:

Style fixes.

r206797:

Restore previous order.
2010-04-18 21:36:34 +00:00
Pawel Jakub Dawidek
e43f173602 MFC r196295:
Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make it possible implement taskqueue_member() function which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:03:47 +00:00
Kip Macy
762169b50a fix xdrmem_control to be safe in an if statement
fix zfs to depend on krpc
remove xdr from zfs makefile

Submitted by:	dchagin@freebsd.org
2009-05-30 22:23:58 +00:00
Kip Macy
c334d2d544 MFdevbranch 192944
- add FreeBSD implementation of xdrmem_control needed by zfs
 - have zfs define xdr_ops using FreeBSD's definition
 - remove solaris xdr files from zfs compile
2009-05-28 08:18:12 +00:00
Edward Tomasz Napierala
0970b4bae0 MFp4 changes neccessary for NFSv4 ACLs support in ZFS. This is mostly
about removing a few #ifdefs and providing compatibility wrappers and
VOP implementations to get and set an ACL; ZFS does ACL enforcement all
by itself.

Note that the VOPs are ifdefed out for now, so this change should be
a no-op.

Reviewed by:	pjd
2009-05-26 08:21:59 +00:00
Kip Macy
469ef3e563 rename xdr support files to avoid conflicts when linking in to the kernel 2009-05-11 04:18:58 +00:00
Kip Macy
126b14f7e8 fix atomic.S rename and vimage breakage
The latter was pointed out by Artem Belevich
2009-05-09 05:45:13 +00:00
Kip Macy
8569258bf8 - rename atomic.S and crc32.c to avoid collisions when linking zfs in to the kernel
- update Makefile
- ifdef out acl_{alloc, free}, they aren't used by zfs and conflict with existing in-kernel routines
2009-05-09 01:45:55 +00:00
Pawel Jakub Dawidek
1ba4a712dd Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.
This bring huge amount of changes, I'll enumerate only user-visible changes:

- Delegated Administration

	Allows regular users to perform ZFS operations, like file system
	creation, snapshot creation, etc.

- L2ARC

	Level 2 cache for ZFS - allows to use additional disks for cache.
	Huge performance improvements mostly for random read of mostly
	static content.

- slog

	Allow to use additional disks for ZFS Intent Log to speed up
	operations like fsync(2).

- vfs.zfs.super_owner

	Allows regular users to perform privileged operations on files stored
	on ZFS file systems owned by him. Very careful with this one.

- chflags(2)

	Not all the flags are supported. This still needs work.

- ZFSBoot

	Support to boot off of ZFS pool. Not finished, AFAIK.

	Submitted by:	dfr

- Snapshot properties

- New failure modes

	Before if write requested failed, system paniced. Now one
	can select from one of three failure modes:
	- panic - panic on write error
	- wait - wait for disk to reappear
	- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

	Just quota and reservation properties, but don't count space consumed
	by children file systems, clones and snapshots.

- Sparse volumes

	ZVOLs that don't reserve space in the pool.

- External attributes

	Compatible with extattr(2).

- NFSv4-ACLs

	Not sure about the status, might not be complete yet.

	Submitted by:	trasz

- Creation-time properties

- Regression tests for zpool(8) command.

Obtained from:	OpenSolaris
2008-11-17 20:49:29 +00:00
Craig Rodrigues
e506f34b24 Merge latest DTrace changes from Perforce.
Approved by:	jb
2008-11-05 19:40:36 +00:00
Marius Strobl
5b20de10b9 Add atomic operations for ZFS/sparc64.
Approved by:	core, pjd
Obtained from:	OpenSolaris (w/ adaptations)
MFC after:	2 weeks
2008-04-11 22:59:33 +00:00
John Birrell
e327f52446 The sources covered by Sun's CDDL have been repo copied below the
src/cddl and src/sys/cddl directories per the core@ decision following
the license review.

This change modifies the affected Makefiles to reference the sources
in their new location.
2008-03-27 23:21:25 +00:00
David E. O'Brien
0a3374af71 "root" the include path so there is less duplication. 2008-03-08 19:14:43 +00:00
Ruslan Ermilov
741ac35cd4 Remove WARNS from here and compile with default kernel flags.
Switch off those warnings that ZFS sources do not pass.
2008-02-21 11:11:06 +00:00
John Birrell
ee8a5fa77d Remove _SOLARIS_C_SOURCE now that it doesn't do anything in FreeBSD
headers. All OpenSolaris compatibility comes via the set of specific
compatibility headers in src/compat/opensolaris and
src/sys/compat/opensolaris.
2007-11-28 22:58:09 +00:00
Pawel Jakub Dawidek
3b7917d766 - Reduce number of atomic operations needed to be implemented in asm by
implementing some of them using existing ones.
- Allow to compile ZFS on all archs and use atomic operations surrounded
  by global mutex on archs we don't have or can't have all atomic
  operations needed by ZFS.
2007-06-08 12:35:47 +00:00
Pawel Jakub Dawidek
d4c4dfe96f FreeBSD's namecache works quite well with ZFS, so remove DNLC. 2007-05-23 21:33:02 +00:00
Pawel Jakub Dawidek
8b384c52c0 MFp4: Now that ZFS can use FreeBSD's namecache, turn it off by default and
turn off DNLC, but don't remove DNLC yet just in case.
2007-04-24 16:59:20 +00:00
Pawel Jakub Dawidek
ffe54ff0ec MFp4: Synchronize with recent OpenSolaris changes. 2007-04-08 16:29:25 +00:00
Pawel Jakub Dawidek
3dc4488c91 Move atomic.S files to directories that better fit OpenSolaris directory
layout.
2007-04-07 23:54:54 +00:00
Pawel Jakub Dawidek
2109a92fd1 Add Makefile for zfs.ko kernel module. 2007-04-06 01:35:16 +00:00