was issued during VM-initiated i/o (pageout), so that the function
does not try to flush or remove pages or wait for the vm object
paging-in-progress counter.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-Differential revision: https://reviews.freebsd.org/D10241
This patch adds a general mechanism for providing encryption keys to the
kernel from the boot loader. This is intended to enable GELI support at
boot time, providing a better mechanism for passing keys to the kernel
than environment variables. It is designed to be extensible to other
applications, and can easily handle multiple encrypted volumes with
different keys.
This mechanism is currently used by the pending GELI EFI work.
Additionally, this mechanism can potentially be used to interface with
GRUB, opening up options for coreboot+GRUB configurations with completely
encrypted disks.
Another benefit over the existing system is that it does not require
re-deriving the user key from the password at each boot stage.
Most of this patch was written by Eric McCorkle. Allan Jude extended it
with a number of minor enhancements and carried the keybuf feature into
boot2.
GELI user keys are now derived once, in boot2, then passed to the loader,
which reuses the key, then passes it to the kernel, where the GELI module
destroys the keybuf after decrypting the volumes.
Submitted by: Eric McCorkle <eric@metricspace.net> (Original Version)
Reviewed by: oshogbo (earlier version), cem (earlier version)
MFC after: 3 weeks
Relnotes: yes
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D9575
resulting in a process dumping core in the corefile.
Also extend procstat to view select members of 'struct ptrace_lwpinfo'
from the contents of the note.
Sponsored by: Dell EMC Isilon
The prerequisite for '#if __EXT1_VISIBLE' functionality is the
inclusion of sys/cdefs.h. errno.h only auto-includes the header in
non-kernel environments, and the EXT1 block is only useful for
non-kernel code as well.
Reported by: lwhsu
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
9899:2011 Appendix K 3.7.4.1.
Other needed supporting types, defines and constraint_handler
infrastructure are added as specified in the C11 spec.
Submitted by: Tom Rix <trix@juniper.net>
Sponsored by: Juniper Networks
Discussed with: ed
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D9903
Differential revision: https://reviews.freebsd.org/D10161
This is done so that the thread state changes during the switch
are not confused with the thread state changes reported when the thread
spins on a lock.
Here is an example: three consecutive entries for the same thread (from top to
bottom):
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"sleep", attributes: prio:84, wmesg:"-", lockname:"(null)"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"spinning", attributes: lockname:"sched lock 1"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"running", attributes: none
The above trace could leave the impression that the final state of
the thread was "running".
After this change the sleep state will be reported after the "spinning"
and "running" states reported for the sched lock.
Reviewed by: jhb, markj
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9961
matches static binaries.
The interpretation of 'static' here is that the binary must not
specify an interpreter. In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.
This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.
Revert r315701.
Discussed with and tested by: ed (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
For three years now CAM has not used the SIM lock, but it still forces SIMs
to use it. Remove this requirement, allowing SIMs to use any locking they
prefer if they pass no mutex to cam_sim_alloc().
MFC after: 2 weeks
This is a painful change, but it is needed. On the one hand, we avoid
modifying them, and this slows down some ideas; on the other hand, we still
eventually modify them, and tools like netstat(1) never work on the next
version of FreeBSD. We maintain a ton of spares in them, and we already have
some ifdef hell at the end of tcpcb.
Details:
- Hide struct inpcb, struct tcpcb under _KERNEL || _WANT_FOO.
- Make struct xinpcb, struct xtcpcb pure API structures, not including
kernel structures inpcb and tcpcb inside. Export into these structures
the fields from inpcb and tcpcb that are known to be used, and put there
a ton of spare space.
- Make kernel and userland utilities compilable after these changes.
- Bump __FreeBSD_version.
Reviewed by: rrs, gnn
Differential Revision: D10018
Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)
Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.
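As a rough userland illustration of the rmtp behavior described above
(a sketch only, not code from this change; the helper name
sleep_ms_relative is hypothetical), a relative sleep can be restarted
after a signal using the remaining time, which is read only after EINTR:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `ms` milliseconds on CLOCK_MONOTONIC. */
static int
sleep_ms_relative(long ms)
{
	struct timespec req, rem;
	int error;

	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;
	/*
	 * clock_nanosleep() returns the error number directly instead of
	 * setting errno.  rem is consulted only after EINTR, the one case
	 * in which it is guaranteed to have been updated.
	 */
	while ((error = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
		req = rem;
	return (error);
}
```

The loop never reads rem after a successful return, so it behaves the
same whether the system updates rmtp unconditionally or only on EINTR.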
Reviewed by: kib, ngie, jilles
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10020
the default partition, eMMC v4.41 and later devices can additionally
provide up to:
1 enhanced user data area partition
2 boot partitions
1 RPMB (Replay Protected Memory Block) partition
4 general purpose partitions (optionally with an enhanced or extended
attribute)
Of these "partitions", only the enhanced user data area one actually
slices the user data area partition and, thus, gets handled with the
help of geom_flashmap(4). The other types of partitions have address
space independent from the default partition and need to be switched
to via CMD6 (SWITCH), i.e. constitute a set of additional "disks".
The second kind of these "partitions" doesn't fit that well into the
design of mmc(4) and mmcsd(4). I've decided to let mmcsd(4) hook all
of these "partitions" up as disk(9)'s (except for the RPMB partition
as it didn't seem to make much sense to be able to put a file-system
there and may require authentication; therefore, RPMB partitions are
solely accessible via the newly added IOCTL interface currently; see
also below). This approach for one resulted in cleaner code. Second,
it retains the notion of mmcsd(4) children corresponding to a single
physical device each. With the addition of some layering violations,
it also would have been possible for mmc(4) to add separate mmcsd(4)
instances with one disk each for all of these "partitions", however.
Still, both mmc(4) and mmcsd(4) share some common code now, e.g. for
issuing CMD6, which has been factored out into mmc_subr.c.
Besides simply subdividing eMMC devices (some Intel NUCs keep UEFI
code in the boot partitions, for example), another use case for the
partition support is the activation of pseudo-SLC mode, which
manufacturers of eMMC chips typically associate with the enhanced user
data area and/or the enhanced attribute of general purpose partitions.
CAVEAT EMPTOR: Partitioning eMMC devices is a one-time operation.
- Now that properly issuing CMD6 is crucial (so data isn't written to
the wrong partition, for example), take a step in the direction of
correctly handling the timeout for these commands in the MMC layer.
Also, do a SEND_STATUS when CMD6 is invoked with an R1B response as
recommended by relevant specifications. However, quite some work is
left to be done in this regard; all other R1B-type commands done by
the MMC layer also should be followed by a SEND_STATUS (CMD13), the
erase timeout calculations/handling as documented in specifications
are entirely ignored so far, the MMC layer doesn't provide timeouts
applicable up to the bridge drivers and at least sdhci(4) currently
is hardcoding 1 s as timeout for all command types unconditionally.
Not to mention that already available return codes often are not
checked in the MMC layer ...
- Add an IOCTL interface to mmcsd(4); this is sufficiently compatible
with Linux so that the GNU mmc-utils can be ported to and used with
FreeBSD (note that due to the remaining deficiencies outlined above
SANITIZE operations issued by/with `mmc` currently most likely will
fail). The latter will be added to ports as sysutils/mmc-utils
shortly. Among others, the `mmc` tool of the GNU mmc-utils allows for
partitioning eMMC devices (tested working).
- For devices following the eMMC specification v4.41 or later, year 0
is 2013 rather than 1997; so correct this for assembling the device
ID string properly.
- Let mmcsd.ko depend on mmc.ko. Additionally, bump MMC_VERSION as at
least for some of the above a matching pair is required.
- In the ACPI front-end of sdhci(4) describe the Intel eMMC and SDXC
controllers as such in order to match the PCI one.
Additionally, in the entry for the 80860F14 SDXC controller remove
the eMMC-only SDHCI_QUIRK_INTEL_POWER_UP_RESET.
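The year 0 correction mentioned in the list above can be sketched as
follows. This is an illustration only: the helper name example_mdt_year
and the assumption that the year is a 4-bit field of the CID are mine,
not taken from this change.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: turn the raw manufacturing-date year field into
 * a calendar year.  The field is assumed to be 4 bits wide.
 */
static int
example_mdt_year(unsigned int year_field, bool is_4_41_or_later)
{
	/* Year 0 is 2013 for eMMC v4.41 or later and 1997 otherwise. */
	return ((int)(year_field & 0xf) + (is_4_41_or_later ? 2013 : 1997));
}
```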
OKed by: imp
Submitted by: ian (mmc_switch_status() implementation)
I moved this branch from github to a private server, and pulled from the
wrong one when committing r315280, so I failed to include two recent commits.
Thankfully, they were only cosmetic and were included in the review.
Specifically:
Add documentation, polish comments, and improve style(9).
Tested by: pho (r315280)
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(2),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
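The re-evaluation step can be modeled in userland roughly as below.
This is a sketch under stated assumptions, not the kernel code: the
function name abs_sleep_remaining_ns is hypothetical, and it only shows
the arithmetic a woken sleeper would redo against the new clock value.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical userland model: nanoseconds left until an absolute
 * deadline, given the current CLOCK_REALTIME reading.  Returns 0 when
 * the deadline has already passed, i.e. the sleep expires immediately.
 */
static int64_t
abs_sleep_remaining_ns(const struct timespec *deadline,
    const struct timespec *now)
{
	int64_t diff;

	diff = (int64_t)(deadline->tv_sec - now->tv_sec) * 1000000000LL +
	    (deadline->tv_nsec - now->tv_nsec);
	return (diff > 0 ? diff : 0);
}
```

If the clock is stepped past the deadline while a thread sleeps, the
recomputed remainder is zero, which matches the POSIX requirement that
the time service expire immediately.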
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
INHERIT_ZERO is an OpenBSD feature.
When a page is marked as such, it is zeroed in the child
upon fork().
This will be used by the new arc4random(3) functions.
PR: 182610
Reviewed by: kib (earlier version)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D427
overly large allocation requests.
When ktrace-ing io, sys_kevent() allocates memory to copy the
requested changes and reported events. Allocations are sized by the
incoming syscall length arguments, which are user-controlled and
might cause overflow in calculations or overly large allocations.
Since io trace chunks are limited by ktr_geniosize, there is no sense
in even trying to satisfy unbounded allocations. Export ktr_geniosize
and clamp the buffer sizes in advance.
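A minimal sketch of overflow-safe clamping, assuming a byte limit such
as ktr_geniosize is passed in as max_bytes. The helper name clamp_elems
is hypothetical, and this is not the code from the commit.

```c
#include <stddef.h>

/*
 * Hypothetical helper: limit a user-supplied element count so that
 * count * elem_size never exceeds max_bytes.  Dividing the limit
 * instead of multiplying the count avoids integer overflow.
 * elem_size must be nonzero.
 */
static size_t
clamp_elems(size_t nelems, size_t elem_size, size_t max_bytes)
{
	size_t max_elems;

	max_elems = max_bytes / elem_size;
	return (nelems < max_elems ? nelems : max_elems);
}
```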
PR: 217435
Reported by: Tim Newsham <tim.newsham@nccgroup.trust>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
all the clocks that they provide.
Each clock is exported under the node 'clock.<clkname>' and has the following
child nodes:
- frequency
- parent (The selected parent, if any)
- parents (The list of parents, if any)
- childrens (The list of children, if any)
- enable_cnt (The enabled counter)
This gives us the possibility to examine clocks at runtime and to graph
the clock flow.
Reviewed by: mmel
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D9833
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LLD setting in src.conf(5).
Please be aware, though, that this is a work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC after: 1 month
Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.
Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock
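For illustration, the intended barrier pairing looks like this in
portable C11 atomics. This is a toy userland lock with hypothetical
names, not the kernel mutex code: acquiring uses acquire ordering and
releasing uses release ordering.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock, used only to show the barrier pairing. */
struct toy_lock {
	atomic_flag held;
};

/* Try to take the lock; returns true on success. */
static bool
toy_trylock(struct toy_lock *l)
{
	/* Acquire: later accesses may not be reordered before this. */
	return (!atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire));
}

static void
toy_unlock(struct toy_lock *l)
{
	/* Release: earlier accesses may not be reordered after this. */
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Using release ordering on the lock side would let accesses from inside
the critical section move ahead of the lock acquisition.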
I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.
ps.
.-'---`-.
,' `.
| \
| \
\ _ \
,\ _ ,'-,/-)\
( * \ \,' ,' ,'-)
`._,) -',-')
\/ ''/
) / /
/ ,'-'
Hardware provided by: IBM LTC
Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
A thread might create a condition for delayed SU cleanup, which creates
a reference to the mount point in td_su, but exit without returning
through userret(), e.g. when terminating due to single-threading or
process exit. In this case, the td_su reference is not dropped and the
mount point cannot be freed.
Handle the situation by clearing td_su also in the thread destructor
and in exit1(). softdep_ast_cleanup() has to receive the thread as an
argument, since e.g. the thread destructor is executed in a different
context.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
On Core2 and older Intel CPUs, where the TSC stops in C2, the system does
not allow entering C2 if the timecounter hardware is the TSC. This is done
by tc_windup(), which tests for the TC_FLAGS_C2STOP flag of the new
timecounter and increases cpu_disable_c2_sleep if the flag is set. Right
now init_TSC_tc() only sets the flag if cpu_deepest_sleep >= 2, but
the TSC is initialized too early for this variable to be set by
acpi_cpu.c.
There is no reason to require that ACPI report C2 or deeper states
in order to set TC_FLAGS_C2STOP, so remove the cpu_deepest_sleep test
from the init_TSC_tc() condition. And since this is the only use of the
variable, remove it altogether.
Reported and submitted by: Jia-Shiun Li <jiashiun@gmail.com>
Suggested by: jhb
MFC after: 2 weeks
For example, the FreeBSD GCC (4.2.1) has spotty support for that
feature. If the static keyword is used with an unnamed array parameter
in a function declaration, then the compilation fails with:
error: static or type qualifiers in abstract declarator
The feature does work if the parameter is named.
So, the restriction introduced in this commit can be removed when all
affected function prototypes have the workaround.
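A small sketch of the workaround, using a made-up function sum4 (not
one of the affected prototypes): the parameter is named in the
declaration so that the static qualifier is accepted.

```c
/*
 * Per the description above, FreeBSD GCC 4.2.1 rejects the unnamed
 * form in a declaration:
 *     int sum4(const int [static 4]);
 * Naming the parameter works.
 */
int sum4(const int v[static 4]);

/* Hypothetical example: the caller promises at least 4 elements. */
int
sum4(const int v[static 4])
{
	return (v[0] + v[1] + v[2] + v[3]);
}
```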
MFC after: 1 week
Sponsored by: Panzura
with geom_flashmap(4) and teach it about MMC for slicing enhanced
user data area partitions. The FDT slicer still is the default for
CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
in question to the slicers as a single device may provide more than
one provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes
Submitted by: jhibbits (RouterBoard bits)
Reviewed by: jhibbits
A set of helper functions has been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct on demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.
This change lays the ground for emulating the Linux kernel more
closely, which the code using the LinuxKPI APIs depends on.
A new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].
Fix some header file inclusions to make LINT kernel build properly
after this change.
Bump the __FreeBSD_version to force a rebuild of all kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When a thread is stopped in ptracestop(), the ptrace(2) user may request
a signal be delivered upon resumption of the thread. Heretofore, those signals
were discarded unless ptracestop()'s caller was issignal(). Fix this by
modifying ptracestop() to queue up signals requested by the ptrace user that
will be delivered when possible. Take special care when the signal is SIGKILL
(usually generated from a PT_KILL request); no new stop events should be
triggered after a PT_KILL.
Add a number of tests for the new functionality. Several tests were authored
by jhb.
PR: 212607
Reviewed by: kib
Approved by: kib (mentor)
MFC after: 2 weeks
Sponsored by: Dell EMC
In collaboration with: jhb
Differential Revision: https://reviews.freebsd.org/D9260