Commit graph

118955 commits

Author SHA1 Message Date
Alan Somers
913b932900 Remove artificial restriction on lio_listio's operation count
In r322258 I made p1003_1b.aio_listio_max a tunable. However, further
investigation shows that there was never any good reason for that limit to
exist in the first place. It's used in two completely different ways:

* To size a UMA zone, which globally limits the number of concurrent
  aio_suspend calls.

* To artificially limit the number of operations in a single lio_listio call.
  There doesn't seem to be any memory allocation associated with this limit.

This change does two things:

* Properly names aio_suspend's UMA zone, and sizes it based on a new constant.

* Eliminates the artificial restriction on lio_listio. Instead, lio_listio
  calls will now be limited by the more generous max_aio_queue_per_proc. The
  old p1003_1b.aio_listio_max is now an alias for
  vfs.aio.max_aio_queue_per_proc, so sysconf(3) will still work with
  _SC_AIO_LISTIO_MAX.
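
A minimal sketch of how a userland program might read this limit through sysconf(3), assuming a Python interpreter on a POSIX system. The helper name `aio_listio_max` is hypothetical, and the value returned depends on the platform and its tunables.

```python
import os

def aio_listio_max():
    """Return AIO_LISTIO_MAX as reported by sysconf(3), or None if unavailable."""
    name = "SC_AIO_LISTIO_MAX"
    # os.sysconf_names lists the configuration names this platform defines.
    if name not in getattr(os, "sysconf_names", {}):
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # sysconf(3) reports -1 when the limit is indeterminate.
    return None if value < 0 else value

if __name__ == "__main__":
    print("AIO_LISTIO_MAX:", aio_listio_max())
```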

Reported by:	bde
Reviewed by:	jhb
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12120
2017-10-23 23:12:01 +00:00
Alan Somers
659058b06f Fix the error message when creating a zpool on a too-small device
Don't check for SPA_MINDEVSIZE in vdev_geom_attach when opening by path.
It's redundant with the check in vdev_open, and failing to attach here
results in the wrong error message being printed.  However, still check for
it in some other situations:

* When opening by guids, so we don't get bogged down reading from slow
  devices like floppy drives.
* In vdev_geom_read_pool_label for the same reason, because we iterate over
  all providers.
* If the caller requests that we verify the guid, because then we'll have to
  read from the device before vdev_open verifies the size.
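
The cases above can be sketched as a small decision function. This is a simplified model in Python, not the vdev_geom code; the function and parameter names are hypothetical.

```python
def must_check_min_size_at_attach(open_by_guid, reading_pool_label, verify_guid):
    """Model of when the attach path still checks the minimum device size.

    Opening by path with none of these conditions set skips the early
    check and leaves the size verification to the later open step.
    """
    return open_by_guid or reading_pool_label or verify_guid

if __name__ == "__main__":
    # Plain open by path: no early check, so the later check reports the error.
    print(must_check_min_size_at_attach(False, False, False))
    # Open by guid: check early to avoid reading from slow devices.
    print(must_check_min_size_at_attach(True, False, False))
```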

PR:		222227
Reported by:	Marie Helene Kvello-Aune <marieheleneka@gmail.com>
Reviewed by:	avg, mav
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12531
2017-10-23 23:05:29 +00:00
Stephen Hurd
3429c02f82 Some cache related optimizations
1. prefetch 128 bytes of mbufs.
2. Re-order filling the pkt_info so cache stalls happen at the end
3. Define empty prefetch2cachelines() macro when the function isn't present.

Provides small performance improvements on some hardware.

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12447
2017-10-23 20:50:08 +00:00
Mark Johnston
c9748df8ff Remove resource_set_*() declarations from sys/bus.h.
The corresponding definitions were removed in r78135.

PR:		223189
Submitted by:	marc.priggemeyer@gmail.com
MFC after:	1 week
2017-10-23 16:02:48 +00:00
Matt Joras
ba19246e07 Move clear_unrhdr to tmpfs_free_tmp.
Clearing the unr in tmpfs_unmount is not correct. In the case of
multiple references to the tmpfs mount (e.g. when there are lookup
threads using it) it will not be the one to finish tmpfs_free_tmp. In
those cases tmpfs_free_node_locked will be the final one to execute
tmpfs_free_tmp, and until then the unr must be valid.
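
The ordering problem can be illustrated with a toy reference-counted model in Python. It is only a sketch under simplifying assumptions: the class and attribute names are hypothetical, and `unr_valid` merely stands in for the state that clear_unrhdr tears down.

```python
class TmpfsMountModel:
    """Toy model: shared state must stay valid until the last reference drops."""

    def __init__(self):
        self.refcount = 1      # reference held by the mount itself
        self.unr_valid = True  # stands in for the unit-number allocator state

    def ref(self):
        """Take an extra reference, as a lookup thread would."""
        self.refcount += 1

    def free_tmp(self):
        """Drop one reference; tear down only if it was the last one."""
        assert self.refcount > 0
        self.refcount -= 1
        if self.refcount == 0:
            self.unr_valid = False  # the point where the clear would run
            return True
        return False

    def unmount(self):
        # Unmount drops its own reference and does not clear the state directly.
        return self.free_tmp()
```

With one outstanding lookup reference, `unmount()` returns False and the allocator state stays valid; the later `free_tmp()` call from the last holder performs the teardown.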

Reported by:	pho
Approved/reviewed by:	rstone (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12749
2017-10-23 15:43:38 +00:00
Mark Johnston
5fca1d90c1 Fix the VM_NRESERVLEVEL == 0 build.
Add VM_NRESERVLEVEL guards in the pmaps that implement transparent
superpage promotion using reservations.

Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12764
2017-10-23 15:34:05 +00:00
Mateusz Guzik
5132933a08 Bump WITNESS_PENDLIST to accommodate sleepq chain bump.
Reported by:	ngie
2017-10-23 01:00:35 +00:00
Warner Losh
6c458a6a56 Use preferred defined paths, rather than relative paths in fdt.
Sponsored by: Netflix
2017-10-22 22:52:27 +00:00
Warner Losh
404a0a2f40 Use SYSDIR instead of ${.CURDIR}/../..<etc>/sys.
Sponsored by: Netflix
2017-10-22 22:50:28 +00:00
Warner Losh
b022947e0e Use BOOTSRC here.
Sponsored by: Netflix
2017-10-22 22:50:23 +00:00
Warner Losh
45e0c11351 Prefer SRCTOP paths for bits we're grabbing from libc.
Sponsored by: Netflix
2017-10-22 22:50:19 +00:00
Warner Losh
254ba2c985 Make at91 boot loader compile again.
No clue if it actually still works, but it might.
2017-10-22 22:50:15 +00:00
Warner Losh
cb3ee5c694 End source directories with SRC rather than a hodgepodge of names
BOOTDIR->BOOTSRC
FICLDIR->FICLSRC
LDR_MI->LDRSRC

This matches the patterns used in the rest of the system a bit better.

Suggested by: rgrimes@
Sponsored by: Netflix
2017-10-22 22:50:08 +00:00
Warner Losh
392c874409 Move fdt and uboot defines into common uboot.mk.
Sponsored by: Netflix
2017-10-22 22:49:51 +00:00
Mateusz Guzik
9e68989764 Make the sleepq chain hash size configurable per-arch and bump on amd64.
While here cache-align chains.

This shortens the longest chain found during poudriere -j 80 from 32 to 16.
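
The effect of a larger table on chain length can be illustrated with a toy model in Python. The hash function, key set, and table sizes below are made up for the illustration and are not the kernel's; only the general relationship (doubling the number of chains roughly halves the longest chain for evenly spread keys) is the point.

```python
from collections import Counter

def chain_index(key, nchains):
    # Hypothetical hash: nchains is assumed to be a power of two.
    return key & (nchains - 1)

def longest_chain(keys, nchains):
    """Length of the longest chain when keys are hashed into nchains chains."""
    counts = Counter(chain_index(k, nchains) for k in keys)
    return max(counts.values(), default=0)

if __name__ == "__main__":
    keys = range(4096)  # evenly spread synthetic keys
    for nchains in (128, 256):
        print(nchains, "chains -> longest chain", longest_chain(keys, nchains))
```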

Pushing this higher up will probably require allocation on boot.
2017-10-22 20:43:50 +00:00
Mateusz Guzik
5a17c5524f sdt: make all sdt probe sites test one variable
This saves on cache misses at the expense of a slight growth of .text.

Note this is a bandaid for lack of hotpatching.

Discussed with:	markj
2017-10-22 20:22:23 +00:00
Mark Johnston
cc1307b67f Delete declarations of struct pfs_bitmap, removed in r143841.
MFC after:	1 week
2017-10-22 20:22:11 +00:00
Mateusz Guzik
4f7663af9f sdt: whack unused SDT_PROBE_ENABLED
2017-10-22 20:14:48 +00:00
Mateusz Guzik
614e1868d6 Change kdb_active type to u_char.
Fixes warnings from gcc and keeps the small size. Perhaps nesting should be moved
to another variable.

Reported by:	ngie
2017-10-22 13:42:56 +00:00
Enji Cooper
f2374e0cc5 Clean up trailing whitespace in kdb_thr_ctx(..)
MFC after:	1 week
2017-10-22 12:12:52 +00:00
Bruce M Simpson
1d4c696a71 Add Prolific PL27A1 USB 3.0 Host-Host device to udbp(4).
Tested with a Plugable cable in VirtualBox against Linux 4.11.

MFC after:	2 weeks
2017-10-22 11:15:58 +00:00
Edward Tomasz Napierala
be7d4ac586 Add OID for the vm.overcommit sysctl. This makes it possible to remove
one call to sysctl(2) from jemalloc startup code. (That also requires
changes to jemalloc, but I plan to push those to upstream first.)

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D12745
2017-10-22 10:35:29 +00:00
Konstantin Belousov
456a73ef01 Remove the support for mknod(S_IFMT), which created dummy vnodes with
VBAD type.

FFS ffs_write() VOP catches such vnodes and panics, other VOPs do not
check for the type and their behaviour is really undefined.  The
comment claims that this support was done for 'badsect' to flag bad
sectors; we do not have such a facility in the kernel anyway.

Reported by:	Dmitry Vyukov <dvyukov@google.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-10-22 08:11:45 +00:00
Warner Losh
cced8b11d9 Define LIBSA32 to LIBSA on i386 to fix build.
Sponsored by: Netflix
2017-10-22 07:25:28 +00:00
Warner Losh
4b165a5f12 Use BOOTOBJ and BOOTDIR to find geli includes and libraries.
Sponsored by: Netflix
2017-10-22 03:52:22 +00:00
Warner Losh
4b32ccfd81 When building standalone, don't define errno. Let the definition from
stand.h override. This is similar to what we do in the kernel.

Sponsored by: Netflix
2017-10-22 03:52:17 +00:00
Warner Losh
cee8ef91e3 Stopgap fix to the mismatch between LOADER_GELI_SUPPORT and
LOADER_NO_GELI_SUPPORT. To disable geli support in the loader, define
LOADER_GELI_SUPPORT=no. Proper warnings for old build options to
follow.

Sponsored by: Netflix
2017-10-22 03:52:12 +00:00
Warner Losh
d1d01d440c Introduce BOOTOBJ: The top level object directory for the boot tree
and use it in preference to spelling out the path.

Sponsored by: Netflix
2017-10-22 03:52:08 +00:00
Warner Losh
913dd6b0fa Use BOOTDIR more consistently in defs.mk rather than repeat sys/boot.
Sponsored by: Netflix
2017-10-22 03:52:03 +00:00
Mateusz Guzik
be49509eea mtx: implement thread lock fastpath
MFC after:	1 week
2017-10-21 22:40:09 +00:00
Konstantin Belousov
422fe502b3 Check that the page which is freed as zeroed, indeed has all-zero content.
This catches some rare mysterious failures at the source.  The check
is only performed on architectures which implement direct map, and
only enabled with option DIAGNOSTIC, similar to other costly
consistency checks.
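
A minimal sketch of this kind of check, written in Python as a userland model rather than kernel code. The names `PAGE_SIZE`, `DIAGNOSTIC`, `page_is_zero`, and `free_zeroed_page` are hypothetical stand-ins chosen for the illustration.

```python
PAGE_SIZE = 4096   # assumed page size for the illustration
DIAGNOSTIC = True  # stands in for the kernel's DIAGNOSTIC option

def page_is_zero(page):
    """Return True if every byte of the page is zero."""
    return not any(page)

def free_zeroed_page(page):
    """Model of freeing a page that the caller claims is already zeroed."""
    # The scan is costly, so it only runs when diagnostics are enabled.
    if DIAGNOSTIC and not page_is_zero(page):
        raise AssertionError("page freed as zeroed has non-zero content")

if __name__ == "__main__":
    free_zeroed_page(bytes(PAGE_SIZE))  # passes silently
```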

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-10-21 17:28:12 +00:00
Emmanuel Vadot
65854c616e dtb/allwinner: Disconnect sinovoip-bpi-m3.dts from the build
No active committer has this board and we diverged too much from
the upstream DTS.
2017-10-21 16:12:00 +00:00
Michal Meloun
32c48d07c6 Fix spelling.
Reported by:	lidl
MFC after:	1 month
2017-10-21 15:48:16 +00:00
Emmanuel Vadot
749885aa20 dts: Update our device tree source files from Linux 4.13
2017-10-21 15:47:40 +00:00
Michal Meloun
0cbf724ed0 Complete the implementation of AT_HWCAP and AT_HWCAP2 for ARMv6,7.
This makes elf_aux_info(3) useable for ARM ports.

MFC after:	1 month
2017-10-21 12:16:21 +00:00
Michal Meloun
c8759f0996 Add C++ decoration to auxv.h forgotten in r324815.
MFC after:	1 month
2017-10-21 12:15:12 +00:00
Michal Meloun
0b08ae2120 Make elf_aux_info() a public libc function.
- Teach elf aux vector functions about newly added AT_HWCAP and AT_HWCAP2
  vectors.
- Export _elf_aux_info() as new public libc function elf_aux_info(3)

elf_aux_info(3) should be considered the FreeBSD counterpart of glibc's
getauxval(), with a more robust interface.

Note:
We cannot name this new function getauxval() with a glibc-compatible
interface. Some ports autodetect its existence and then expect that all
Linux-specific AT_<*> vectors are defined and implemented.

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12743
2017-10-21 12:06:18 +00:00
Michal Meloun
904d8c492f Add AT_HWCAP2 ELF auxiliary vector.
- allocate a value for the new AT_HWCAP2 auxiliary vector on all platforms.
- expand 'struct sysentvec' by a new 'u_long *sv_hwcap2', in exactly
  the same way as for AT_HWCAP.

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12699
2017-10-21 12:05:01 +00:00
Ryan Libby
2efbd10ade pms/freebsd: fix compiler warnings
- A number of unused variable warnings,
- a missing prototype warning (actually a dead function),
- and a potential use of an uninitialized variable.

Reviewed by:	pfg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12683
2017-10-21 07:23:45 +00:00
Bjoern A. Zeeb
8e94025b41 With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD.  Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by:		many
Reviewed by:		kristof, emaste, hiren
X-MFC after:		never
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D12639
2017-10-20 21:40:59 +00:00
Mark Johnston
eadbeae5e7 Free the right address range if kmem_back() fails in memguard_alloc().
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-10-20 21:13:19 +00:00
Mateusz Guzik
e66167764a amd64: plug missed dt_lock in cpu_fork
2017-10-20 18:58:11 +00:00
Konstantin Belousov
b3d4ab6645 Take the vm object lock in read mode in vnode_generic_putpages().
Only upgrade it to write mode if we need to clear dirty bits of the
partially valid page after EOF.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2017-10-20 18:40:29 +00:00
Mark Johnston
a3e8a25a52 Avoid the nbp lookup in the final loop iteration in flushbuflist().
The end of the loop must re-lookup the next buf since the bufobj lock
is dropped in the loop body. If the lookup fails, the loop is restarted.
This mechanism non-obviously also terminates the loop when the end of
the buf list is reached. Split up the two loop termination cases to
make the code a bit less fragile. No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12730
2017-10-20 14:56:13 +00:00
Mateusz Guzik
62bf13cbf9 mtx: fix up UP build after r324778
Reported by:	Michael Butler
2017-10-20 14:04:01 +00:00
Konstantin Belousov
ac04195ba6 Move swapout code into vm/vm_swapout.c.
There is no NO_SWAPPING #ifdef left in the code.

Requested by:	alc
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D12663
2017-10-20 09:10:49 +00:00
Konstantin Belousov
05877a8595 Do not overwrite clean blocks on pageout.
If filesystem block size is less than the page size, it is possible
that the page-out run contains partially clean pages.  E.g., the chunk
of the page might be bdwrite()-ed, or some thread performed bwrite()
on a buffer which references a chunk of the paged out page.  As a
result, the assertion added in r319975, which checked that all pages
in the run are dirty, does not hold on such filesystems.

One solution is to remove the assert, but it is undesirable, because
we do overwrite the valid on-disk content. I cannot provide a scenario
where such write would corrupt the file data, but I do not like it on
principle.  Another, in my opinion proper, solution is to only write
parts of the pages still marked dirty.  The patch implements this: it
skips clean blocks and only writes the dirty block runs.

Note that due to clustering, writing one page might clean other pages in
the run, so the next write range must be calculated only after the
current range is written out.
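
The idea of writing only dirty runs, and recomputing the next run after each write, can be sketched in Python as follows. This is a simplified model, not the vnode pager code: the dirty state is a plain list of booleans, and `next_dirty_run`, `write_dirty_runs`, and the `write` callback are hypothetical names.

```python
def next_dirty_run(dirty, pos):
    """Return (start, end) of the first run of dirty blocks at or after pos, or None."""
    n = len(dirty)
    while pos < n and not dirty[pos]:
        pos += 1  # skip clean blocks
    if pos == n:
        return None
    end = pos
    while end < n and dirty[end]:
        end += 1
    return pos, end

def write_dirty_runs(dirty, write):
    """Write each dirty run, looking up the next run only after the current write."""
    written = []
    pos = 0
    while True:
        run = next_dirty_run(dirty, pos)
        if run is None:
            break
        start, end = run
        write(start, end)  # may clean other blocks as a side effect (clustering)
        for i in range(start, end):
            dirty[i] = False
        written.append(run)
        pos = end
    return written
```

Because the next run is looked up only after `write` returns, blocks that the write cleaned as a side effect are skipped rather than written a second time. An all-clean input simply yields no writes, which matches the observation that the method cannot assert there is work to do.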

Moreover, due to a possible invalidation, and the fact that the object
lock is dropped and reacquired before the checks, it is possible that
the whole page-out pages run appears to consist of only clean pages.
For this reason, it is impossible to assert that there is some work
for the pageout method to do (i.e. assert that there is at least one
dirty page in the run).  But such clearing can only occur due to
invalidation, and not due to a parallel write, because we own the
vnode lock exclusive.

Reported by:	fsu
In collaboration with:	pho
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D12668
2017-10-20 08:32:37 +00:00
Konstantin Belousov
4313989360 In vm_page_free_phys_pglist(), do not take vm_page_queue_free_mtx if
there is nothing to do.

Suggested by:	mjg
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-10-20 08:25:49 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP cannot be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device is present. This resolves
interoperability issues between iWarp and RoCE in ibcore.

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Mateusz Guzik
c48a94251d Mark kdb_active as __read_frequently and switch to bool to eat less space.
2017-10-20 04:02:53 +00:00