Commit graph

1815 commits

Author SHA1 Message Date
Jamie Gritton
c542c43ef1 Revert r337922, except for some documention-only bits. This needs to wait
until user is changed to stop using jail(2).

Differential Revision:	D14791
2018-08-16 19:09:43 +00:00
Jamie Gritton
284001a222 Put jail(2) under COMPAT_FREEBSD11. It has been the "old" way of creating
jails since FreeBSD 7.

Along with the system call, put the various security.jail.allow_foo and
security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or
BURN_BRIDGES).  These sysctls had two disparate uses: on the system side,
they were global permissions for jails created via jail(2) which lacked
fine-grained permission controls; inside a jail, they're read-only
descriptions of what the current jail is allowed to do.  The first use
is obsolete along with jail(2), but keep them for the second-read-only use.

Differential Revision:	D14791
2018-08-16 18:40:16 +00:00
Conrad Meyer
ba9ace7436 settimeofday(2): Remove stale note about timezone
Contrary to the removed comment, the kernel does appear to use the timezone
argument of settimeofday.  The comment dates to the BSD4.4 import; I assume it
is just stale.
2018-08-04 22:08:24 +00:00
Ruslan Bukin
42570cd1d4 MAXLOGNAME changed to 33 in r243023.
Update man pages.

Sponsored by:	DARPA, AFRL
2018-08-03 16:05:03 +00:00
David Bright
95c05062ec Allow a EVFILT_TIMER kevent to be updated.
If a timer is updated (re-added) with a different time period
(specified in the .data field of the kevent), the new time period has
no effect; the timer will not expire until the original time has
elapsed. This violates the documented behavior as the kqueue(2) man
page says (in part) "Re-adding an existing event will modify the
parameters of the original event, and not result in a duplicate
entry."

This modification, adapted from a patch submitted by cem@ to PR214987,
fixes the kqueue system to allow updating a timer entry. The
kevent timer behavior is changed to:

  * When a timer is re-added, update the timer parameters to and
    re-start the timer using the new parameters.
  * Allow updating both active and already expired timers.
  * When the timer has already expired, dequeue any undelivered events
    and clear the count of expirations.

All of these changes address the original PR and also bring the
FreeBSD and macOS kevent timer behaviors into agreement.

A few other changes were made along the way:

  * Update the kqueue(2) man page to reflect the new timer behavior.
  * Fix man page style issues in kqueue(2) diagnosed by igor.
  * Update the timer libkqueue system test to test for the updated
    timer behavior.
  * Fix the (test) libkqueue common.h file so that it includes
    config.h which defines various HAVE_* feature defines, before the
    #if tests for such variables in common.h. This enables the use of
    the actual err(3) family of functions.
  * Fix the usages of the err(3) functions in the tests for incorrect
    type of variables. Those were formerly undiagnosed due to the
    disablement of the err(3) functions (see previous bullet point).

PR:		214987
Reported by:	Brian Wellington <bwelling@xbill.org>
Reviewed by:	kib
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D15778
2018-07-27 13:49:17 +00:00
Konstantin Belousov
b3042426d0 Remove bits of the old NUMA.
Remove numactl(1), edit numa(4) to bring it some closer to reality,
provide libc ABI shims for old NUMA syscalls.

Noted and reviewed by:	brooks (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D16142
2018-07-10 22:00:20 +00:00
Brooks Davis
7cc923f8a8 Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary very called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

This is a respin of r335983 with a workaround for the ancient BFD linker
in the libc stubs.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16193
2018-07-10 13:32:04 +00:00
Warner Losh
bdea3adca6 Tweak documentation to RB_ constants to reflect current use
RB_ASKNAME is no longer instructions to the boot loader to request a
prompt for which kernel to boot. Instead, it asks for what the root
file system to use. RB_INITNAME is unused, and never has been in
FreeBSD as far as I can tell. Remove it from the documentation and fix
comment. RB_SELFTEST and RB_MINIROOT likewise (though they were
completely undocumented). These last three constants can likely just
be deleted as nothing references them (even to set useless bits).

RB_ASKNAME doesn't actually survive reboot, however, so needs to be
communicated to the bootloader via other means. If the bootloader sets
it, though, it will be honored.
2018-07-10 00:01:14 +00:00
Brooks Davis
714c03c81e Revert r335983.
The bfd linker in tree doesn't support multiple names for the same
symbol (at least with current flags).
2018-07-05 16:03:03 +00:00
Brooks Davis
5b04a71dae Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary ever called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15814
2018-07-05 14:12:56 +00:00
Conrad Meyer
e02d32f72e sigaction.2: Minor cleanups
Add vertical space between struct definition and function prototype.

Use "NULL" to describe zero pointers, instead of "zero."

Remove perhaps unclear "can not" and replace.  Tag struct member names used
with appropriate tags.
2018-06-28 18:17:20 +00:00
Ian Lepore
25b10ed4b7 Add some words clarifying that rename(2) does nothing when the 'from' and
'to' args are the same file.  Wording borrowed from POSIX.1-2017, but
the freebsd code to implement this behavior was added in 2002 (r103180).
2018-06-21 15:21:17 +00:00
Sean Bruno
1a43cff92a Load balance sockets with new SO_REUSEPORT_LB option.
This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple
programs or threads to bind to the same port and incoming connections will be
load balanced using a hash function.

Most of the code was copied from a similar patch for DragonflyBSD.

However, in DragonflyBSD, load balancing is a global on/off setting and can not
be set per socket. This patch allows for simultaneous use of both the current
SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system.

Required changes to structures:
Globally change so_options from 16 to 32 bit value to allow for more options.
Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets.

Limitations:
As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or
threads sharing the same socket).

This is a substantially different contribution as compared to its original
incarnation at svn r332894 and reverted at svn r332967.  Thanks to rwatson@
for the substantive feedback that is included in this commit.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Obtained from:	DragonflyBSD
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11003
2018-06-06 15:45:57 +00:00
Mark Johnston
9f9c9b22ec Reimplement brk() and sbrk() to avoid the use of _end.
Previously, libc.so would initialize its notion of the break address
using _end, a special symbol emitted by the static linker following
the bss section.  Compatibility issues between lld and ld.bfd could
cause the wrong definition of _end (libc.so's definition rather than
that of the executable) to be used, breaking the brk()/sbrk()
interface.

Avoid this problem and future interoperability issues by simply not
relying on _end.  Instead, modify the break() system call to return
the kernel's view of the current break address, and have libc
initialize its state using an extra syscall upon the first use of the
interface.  As a side effect, this appears to fix brk()/sbrk() usage
in executables run with rtld direct exec, since the kernel and libc.so
no longer maintain separate views of the process' break address.

PR:		228574
Reviewed by:	kib (previous version)
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D15663
2018-06-04 19:35:15 +00:00
Justin Hibbits
2e65567500 Added ptrace support for reading/writing powerpc VSX registers
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).

This is necessary to add support for VSX registers in debuggers.

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
2018-06-02 19:17:11 +00:00
Mark Johnston
e2c1730299 Remove an inaccuracy from mincore.2.
Super pages are supported on non-x86 architectures, so just remove the
incorrect note.  While here, change terminology to be consistent with
mmap.2.

MFC after:	1 week
2018-06-01 23:40:43 +00:00
Brooks Davis
7351a8bdb5 Make vadvise compat freebsd11.
The vadvise syscall (aka ovadvise) is undocumented and has always been
implmented as returning EINVAL.  Put the syscall under COMPAT11 and
provide a userspace implementation.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15557
2018-05-25 20:40:23 +00:00
Brooks Davis
2357535254 Indicate the brk/sbrk are deprecated and not portable.
More firmly suggest mmap(2) instead.

Include the history of arm64 and riscv shipping without brk/sbrk.

Mention that sbrk(0) produces unreliable results.

Reviewed by:	emaste, Marcin Cieślak
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15535
2018-05-24 18:32:54 +00:00
Konstantin Belousov
84ffdd6a81 Note that PT_SETSTEP is auto-cleared.
Wording and reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D15054
2018-05-23 17:55:30 +00:00
Sevan Janiyan
d3fff23be8 Use St macro for specifying C standards.
Reported by:	rgrimes@
2018-05-20 21:56:08 +00:00
Sevan Janiyan
d55b77df03 Fix a typo and remove an unneeded Tn macro as highlighted by mandoc -Tlint.
Submitted by:		Mateusz Piotrowski
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15204
2018-05-20 20:28:17 +00:00
Lawrence Stewart
9891578a40 Plug a memory leak and potential NULL-pointer dereference introduced in r331214.
Each TCP connection that uses the system default cc_newreno(4) congestion
control algorithm module leaks a "struct newreno" (8 bytes of memory) at
connection initialisation time. The NULL-pointer dereference is only germane
when using the ABE feature, which is disabled by default.

While at it:

- Defer the allocation of memory until it is actually needed given that ABE is
  optional and disabled by default.

- Document the ENOMEM errno in getsockopt(2)/setsockopt(2).

- Document ENOMEM and ENOBUFS in tcp(4) as being synonymous given that they are
  used interchangeably throughout the code.

- Fix a few other nits also accidentally omitted from the original patch.

Reported by:	Harsh Jain on freebsd-net@
Tested by:	tjh@
Differential Revision:	https://reviews.freebsd.org/D15358
2018-05-17 02:46:27 +00:00
Konstantin Belousov
450cd8475a PROC_PDEATHSIG_CTL will appear first in 11.2.
Submitted by:	Thomas Munro
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D15399
2018-05-12 10:11:33 +00:00
Xin LI
b6f7731dba Remove "All rights reserved" from my files.
See r333391 for the rationale.

MFC after:	1 week
2018-05-10 06:41:08 +00:00
Eric van Gyzen
488ab515d6 Remove 'All rights reserved' from my files
See r333391 for the rationale.

Approved by:	emaste (for the Foundation copyright)
Sponsored by:	Dell EMC
2018-05-09 20:12:59 +00:00
Kyle Evans
1921252001 fcntl(2): Vaguely document that ENOTTY is possible, with light examples
Reported by:	vs (2006, FreeBSD 6.1-BETA3)
Reported by:	me (2018, angry debugging session)
MFC after:	1 month
2018-05-03 02:42:13 +00:00
Ed Maste
e2811155f1 Clarify bindat/connectat use with AT_FDCWD
Discovered during investigation into the PR - the description of
AT_FDCWD was somewhat confusing.

PR:		222632
Submitted by:	Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after:	1 week
2018-04-30 17:16:17 +00:00
Konstantin Belousov
1302eea7bb Rename PROC_PDEATHSIG_SET -> PROC_PDEATHSIG_CTL and PROC_PDEATHSIG_GET
-> PROC_PDEATHSIG_STATUS for consistency with other procctl(2)
operations names.

Requested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2018-04-20 15:19:27 +00:00
Konstantin Belousov
b940886338 Add PROC_PDEATHSIG_SET to procctl interface.
Allow processes to request the delivery of a signal upon death of
their parent process.  Supposed consumer of the feature is PostgreSQL.

Submitted by:	Thomas Munro
Reviewed by:	jilles, mjg
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D15106
2018-04-18 21:31:13 +00:00
Edward Tomasz Napierala
604f1c416c Don't put multiple names on a single .Nm line. This fixes apropos(1)
output, from this:

strnlen, strlen, strlen,(3) - find length of string                                                                                                                                                     │·······

... to this:

strlen, strnlen(3) - find length of string

PR:		223525
MFC after:	2 weeks
2018-04-17 09:05:46 +00:00
Jeff Roberson
ac8f2d6e4b Add missing file from 4331508
Document cpuset_{get,set}domain()
2018-03-25 07:42:44 +00:00
Jeff Roberson
93f31533df Document new NUMA related syscalls and utility options.
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-24 23:58:44 +00:00
Conrad Meyer
08a7e74c7c getentropy(3): Fallback to kern.arandom sysctl on older kernels
On older kernels, when userspace program disables SIGSYS, catch ENOSYS and
emulate getrandom(2) syscall with the kern.arandom sysctl (via existing
arc4_sysctl wrapper).

Special care is taken to faithfully emulate EFAULT on NULL pointers, because
sysctl(3) as used by kern.arandom ignores NULL oldp.  (This was caught by
getentropy(3) ATF tests.)

Reported by:	kib
Reviewed by:	kib
Discussed with:	delphij
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14785
2018-03-21 23:52:37 +00:00
Conrad Meyer
e9ac27430c Implement getrandom(2) and getentropy(3)
The general idea here is to provide userspace programs with well-defined
sources of entropy, in a fashion that doesn't require opening a new file
descriptor (ulimits) or accessing paths (/dev/urandom may be restricted
by chroot or capsicum).

getrandom(2) is the more general API, and comes from the Linux world.
Since our urandom and random devices are identical, the GRND_RANDOM flag
is ignored.

getentropy(3) is added as a compatibility shim for the OpenBSD API.

truss(1) support is included.

Tests for both system calls are provided.  Coverage is believed to be at
least as comprehensive as LTP getrandom(2) test coverage.  Additionally,
instructions for running the LTP tests directly against FreeBSD are provided
in the "Test Plan" section of the Differential revision linked below.  (They
pass, of course.)

PR:		194204
Reported by:	David CARLIER <david.carlier AT hardenedbsd.org>
Discussed with:	cperciva, delphij, jhb, markj
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D14500
2018-03-21 01:15:45 +00:00
Mark Johnston
f0eaf8ec5e Remove a lingering inaccuracy from mlock.2.
User wirings of the same address range don't stack.

Noted by:	Dan Nelson
MFC after:	3 days
2018-03-20 20:45:47 +00:00
Mark Johnston
d09fcbd30e Add a space between a section number and a following comma.
Fix some nits from igor while here.

MFC after:	3 days
2018-03-15 19:03:54 +00:00
Brooks Davis
b85a98949f Refer to SysV IPC permissions as numeric constants.
POSIX defines no macros for these permissions.

Also remove unneeded headers from synopsis.

PR:		225905
Reviewed by:	wblock
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14461
2018-03-04 20:06:02 +00:00
Brooks Davis
6d0fe480a8 Don't declare union semun in userspace unless _WANT_SEMUN is defined.
POSIX explicitly states that the application must declare union semun.
This makes no sense, but it is what it is.  This brings us into line
with Linux, MacOS/Darwin, and NetBSD.

In a ports exp-run a moderate number of ports fail due to a lack of
approprate autotools-like discovery mechanisms or local patches.  A
commit to address them will follow shortly.

PR:		224300, 224443 (exp-run)
Reviewed by:	emaste, jhb, kib
Exp-run by:	antoine
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14492
2018-03-02 22:32:53 +00:00
Brooks Davis
93e48a303a Rename kernel-only members of semid_ds and msgid_ds.
This deliberately breaks the API in preperation for future syscall
revisions which will remove these nonstandard members.

In an exp-run a single port (devel/qemu-user-static) was found to
use them which it did becuase it emulates system calls.  This has
been fixed in the ports tree.

PR:		224443 (exp-run)
Reviewed by:	kib, jhb (previous version)
Exp-run by:	antoine
Sponsored by:	DARPA, AFRP
Differential Revision:	https://reviews.freebsd.org/D14490
2018-03-02 22:10:48 +00:00
Conrad Meyer
e9180d6956 socketpair.2: Reference relevant POSIX standards
Sponsored by:	Dell EMC Isilon
2018-02-10 19:41:32 +00:00
Conrad Meyer
6e876d695e fsync.2: Cross-reference fsync(1)
Reported by:	rpokala
Sponsored by:	Dell EMC Isilon
2018-02-06 23:12:47 +00:00
Maxim Konovalov
c042d0ca4a o EMFILE errno documented.
PR:		219209
Submitted by:	yuri (with minor adjustment)
Reviewed by:	brooks
2018-01-26 08:38:26 +00:00
Kirk McKusick
4cfb30ed21 Update .Dd missed in -r328304.
Reported by: Bjoern Zeeb (bz)
MFC with:    328304
2018-01-24 22:36:21 +00:00
Kirk McKusick
8557409f20 In the C library, the setting up of the group array by various
utilities is done by calling gr_addgid() for each group to be
added (usually found by traversing /etc/group) then calling the
setgroups() system call after the group set has been created.
The gr_addgid() function (helpfully?) deduplicates the addition
of group members. So, if you call it to add a group member that
already exists, it is just dropped. Because group[0] is the
effective group-ID and is over-written when a setgid program
is run, The value in group[0] is usually duplicated so that
group value is not lost when a setgid program is run.

Historically this happened because the group value indicated
in the password file also appears in /etc/group (e.g., if you
are group staff in the password file, you will also appear in
the staff line in /etc/group). But, with the addition of the
deduplication, the attempt to add group staff was lost because
it already appeared in group[0]. So, the fix is to deduplicate
starting from group[1] which allows a duplicate of the entry in
group[0], but not in later entries.

There is some confusion about the setgroups system call because in
BSD it has (always) set the entire group including the egid group
(in group[0]). However, in Linux, it skips over group[0] and starts
setting from group[1]. See this comment from linux_setgroups:

      /*
       * cr_groups[0] holds egid. Setting the whole set from
       * the supplied set will cause egid to be changed too.
       * Keep cr_groups[0] unchanged to prevent that.
       */

To make it clear what the BSD setgroups system call does, I
added the following paragraph to the setgroups(2) manual page:

   The first entry of the group array (gidset[0]) is used as the effective
   group-ID for the process.  This entry is over-written when a setgid
   program is run.  To avoid losing access to the privileges of the
   gidset[0] entry, it should be duplicated later in the group array.
   By convention, this happens because the group value indicated in the
   password file also appears in /etc/group.  The group value in the
   password file is placed in gidset[0] and that value then gets added a
   second time when the /etc/group file is scanned to create the group set.

Reported by: Paul McMath  paulm at tetrardus.net
Reviewed by: kib
MFC after:   2 weeks
2018-01-23 22:18:45 +00:00
Alan Somers
76f9d2759b mlock(2): correct documentation for error conditions.
The man page is years out of date regarding errors. Our implementation _does_
allow unaligned addresses, and it _does_not_ check for negative lengths,
because the length is unsigned. It checks for overflow instead.

Update the tests accordingly.

Reviewed by:	bcr
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D13826
2018-01-22 21:45:54 +00:00
Jeff Roberson
3f289c3fcf Implement 'domainset', a cpuset based NUMA policy mechanism. This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Cleanup some header polution created by having seq.h in proc.h.

Reviewed by:	markj, kib
Discussed with:	alc
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13403
2018-01-12 22:48:23 +00:00
Eitan Adler
837fe32558 Fix a few more speelling errors
Reviewed by:		bjk
Reviewed by:		jilles (incl formal "accept")
Differential Revision:	https://reviews.freebsd.org/D13650
2017-12-28 01:31:28 +00:00
Benjamin Kaduk
9e6e05e43f Note that old sys/event.h required manual sys/types.h inclusion
ed fixed this in r313704 but older versions are still affected.
2017-12-07 01:50:17 +00:00
Ed Maste
19164ee6cd use @@@ instead of @@ in __sym_default
Using
    .symver foo,foo@@VER
causes foo and foo@@VER to be output to the .o file. This requires foo
to be weak since the linker handles foo@@VER as foo.

Using
    .symver foo,foo@@@VER
causes just foo@@ver to be output and avoid the need for making foo
weak. It also reduces the constraint on how exactly a linker has to
handle foo and foo@@VER being present.

Submitted by:	Rafael Espíndola
Reviewed by:	dim, kib
Differential Revision:	https://reviews.freebsd.org/D11653
2017-12-05 20:19:13 +00:00
Warner Losh
94ebc05f37 Fix missing .Dd bump 2017-12-01 22:52:45 +00:00