- Add macros to allow preinitialization of cap_rights_t.
- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.
Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.
- Avoid unlocking Giant right before aquiring the sleepqueue lock.
This can save a task switch.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.
MFC after: 1 week
Sponsored by: Mellanox Technologies
instead of pause() in the msleep() function to avoid rounding errors when
converting delay values forth and back. Add a guard for a delay value
of zero milliseconds which is undefined.
MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
delayed_work in the LinuxKPI. This allows the timer_pending() function
macro to be used with delayed work structures.
No functional nor structural change.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
signal in the LinuxKPI.
The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Make sure all RCU dereferencing use the READ_ONCE() function macro.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
printk_ratelimited() function macro to return a boolean stating if there
was a printout, true, or not, false.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
LinuxKPI to be compatible with Linux. No functional change.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
handler in the LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
When the owner of the wire reference releases the last reference, it
might be that the page was already attempted to be freed (but free
cannot be performed at that time due to wire). Check that the page
was removed from the object as the indicator of the free attempt and
finish the free operation if so.
Reported and tested by: Slava Shwartsman
Reviewed by: hselasky
Sponsored by: Mellanox Technologies
MFC after: 1 week
actually return the computed result instead of the input value.
This is a regression issue after r289572.
Found by: gcc6
MFC after: 3 days
Sponsored by: Mellanox Technologies
LinuxKPI task struct. Change type of "state" variable from "int" to
"atomic_t" to simplify code and avoid unneccessary casting.
MFC after: 1 week
Sponsored by: Mellanox Technologies
gets drained before invoking the work function. Else the timer
mutex may still be in use which can lead to use-after-free situations,
because the work function might free the work structure before returning.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Bump the FreeBSD version to force recompilation of external
kernel modules due to structure change.
PR: 222504
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies
have a scope ID. Change size of the searched scope ID to the full
16-bits. There can typically be more than 255 interfaces.
Suggested by: ae @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Workaround problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a maching IPv6 address. For now
simply try all valid scope IDs until a match is found.
MFC after: 1 week
Sponsored by: Mellanox Technologies
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.
Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.
Bump the FreeBSD version to force recompilation of external kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.
While at it fix some redundant use of parenthesing inside some related
macros.
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Such drivers attach to a vgapci bus rather than directly to a pci bus. For
the rest of the LinuxKPI to work correctly in this case, we override the
vgapci bus' ivars with those of the grandparent.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11932
after r319757.
1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.
2) Make sure the "timeout" argument is casted to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.
3) The wait_event() function should not have a return value.
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
While there, switch to FreeBSD internal callout active status.
Reviewed by: markj, hselasky
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11900
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.
While at it remove superfluous use of parenthesis inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.
MFC after: 1 week
Sponsored by: Mellanox Technologies
It looks like the __acquire and __release macros are for the consumption
of static analysis tools and have no semantic effect. Transform the
definitions from constant expressions to empty statements in order to
avoid -Wunused-value from gcc.
Likewise avoid future warnings for __chk_{user,io}_ptr, but with a cast
to void, because it looks like some linux kernel code may use those in
expression contexts.
Reviewed by: hselasky, markj
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11695
Also add some checks for overflow to existing functions.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11533
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11286
list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11249
In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
calling thread to sleep.
This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.
Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10986
ARM and MIPS fail universe builds.
ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING
Pointy-hat to: jhibbits
arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks. powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well. Not tested except by
compiling powerpc.
Reviewed by: markj
polling contexts in the LinuxKPI.
After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
printk_ratelimited() in the LinuxKPI.
While at it fix the inclusion guard of printk.h to be similar to the
rest of the LinuxKPI header files.
MFC after: 1 week
Sponsored by: Mellanox Technologies
task structure to avoid deadlock when tearing down the VM object
during a process exit.
Found by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Allow "struct linux_file" to be refcounted when its "_file" member
is NULL by using its "f_count" field. The reference counts are
transferred to the file structure when the file descriptor is
installed.
- Add missing vdrop() calls for error cases during open().
- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.
MFC after: 1 week
Sponsored by: Mellanox Technologies
kit, CK, in the LinuxKPI.
When threads are pinned to a CPU core or when there is only one CPU,
it can happen that a higher priority thread can call the CK
synchronize function while a lower priority thread holds the read
lock. Because the CK's synchronize is a simple wait loop this can lead
to a deadlock situation. To solve this problem use the recently
introduced CK's wait callback function.
When detecting a CK blocking condition figure out the lowest priority
among the blockers and update the calling thread's priority and
yield. If another CPU core is holding the read lock, pin the thread to
the blocked CPU core and update the priority. The calling threads
priority and CPU bindings are restored before return.
If a thread holding a CK read lock is detected to be sleeping, pause()
will be used instead of yield().
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Move all bitmap related functions from bitops.h to bitmap.h, similar
to what Linux does.
- Apply some minor code cleanup and simplifications to optimize the
generated code when using static inline functions.
- Implement the following list of bitmap functions which are needed by
drm-next and ibcore:
- bitmap_find_next_zero_area_off()
- bitmap_find_next_zero_area()
- bitmap_or()
- bitmap_and()
- bitmap_xor()
- Add missing include directives to the qlnxe driver
(davidcs@ has been notified)
MFC after: 1 week
Sponsored by: Mellanox Technologies
In FreeBSD thread IDs and procedure IDs have distinct number
spaces. When asking for the group leader task ID in the LinuxKPI,
return the procedure ID and let this resolve to the first task in the
procedure having a valid LinuxKPI task structure pointer.
MFC after: 1 week
Sponsored by: Mellanox Technologies
like open, close and fault using the character device pager.
Some notes about the implementation:
1) Linux drivers set the vm_ops and vm_private_data fields during a
mmap() call to indicate that the driver wants to use the LinuxKPI VM
operations. Else these operations are not used.
2) The vm_private_data pointer is associated with a VM area structure
and inserted into an internal LinuxKPI list. If the vm_private_data
pointer already exists, the existing VM area structure is used instead
of the allocated one which gets freed.
3) The LinuxKPI's vm_private_data pointer is used as the callback
handle for the FreeBSD VM object. The VM subsystem in FreeBSD has a
similar list to identify equal handles and will only call the
character device pager's close function once.
4) All LinuxKPI VM operations are serialized through the mmap_sem
sempaphore, which is per procedure, which prevents simultaneous access
to the shared VM area structure when receiving page faults.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
the LinuxKPI for accessing user-space memory in the kernel.
Add functions to hold and wire physical page(s) based on a given range
of user-space virtual addresses.
Add functions to get and put a reference on, wire, hold, mark
accessed, copy and dirty a physical page.
Add new VM related structures and defines as a preparation step for
advancing the memory map capabilities of the LinuxKPI.
Add function to figure out if a virtual address was allocated using
malloc().
Add function to convert a virtual kernel address into its physical
page pointer.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
functions in the LinuxKPI. Add a usage atomic to the task_struct
structure to facilitate refcounting the task structure when returned
from get_pid_task(). The get_task_struct() and put_task_struct()
function is used to manage atomic refcounting. After this change the
task_struct should only be freed through put_task_struct().
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
some associated helper functions in the LinuxKPI. Let the existing
linux_alloc_current() function allocate and initialize the new
structure and let linux_free_current() drop the refcount on the memory
mapping structure. When the mm_struct's refcount reaches zero, the
structure is freed.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
pairwise to support the FreeBSD way of pushing and popping the page
fault flags. Ensure this by requiring every occurrence of pagefault
disable function call to have a corresponding pagefault enable call.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Support is implemented by mapping Linux's "struct net" into FreeBSD's
"struct vnet". Currently only vnet0 is supported by ibcore.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When locking a mutex and deadlock is detected the first mutex lock
call that sees the deadlock will return -EDEADLK .
MFC after: 1 week
Sponsored by: Mellanox Technologies
Add support for using mutexes during KDB and shutdown. This is also
required for doing mode-switching during panic for drm-next.
Add new mutex functions mutex_init_witness() and mutex_destroy()
allowing LinuxKPI mutexes to be tracked by witness.
Declare mutex_is_locked() and mutex_is_owned() like inline functions
to get cleaner warnings. These functions are used inside WARN_ON()
statements which might look a bit odd if these functions get fully
expanded.
Give mutexes better debug names through the mutex_name() macro when
WITNESS_ALL is defined. The mutex_name() macro can prefix parts of the
filename and line number before the mutex name.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Put large functions into linux_slab.c instead of declaring them static
inline.
Add support for more memory allocation wrappers like kmalloc_array()
and __vmalloc().
Make sure either the M_WAITOK or the M_NOWAIT flag is set and mask
away unused memory allocation flags before calling FreeBSD's malloc()
routine.
Move kmalloc_node() definition to slab.h where it belongs.
Implement support for the SLAB_DESTROY_BY_RCU feature when creating a
kmem_cache which basically means kmem_cache memory is freed using
call_rcu().
MFC after: 1 week
Sponsored by: Mellanox Technologies
LinuxKPI. When the type of the argument is constant the temporary
variable cannot be assigned after the barrier. Instead assign the
temporary variable by initialization.
MFC after: 1 week
Sponsored by: Mellanox Technologies
This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.
All workqueue code has been moved to linux_work.c
Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change struct_work was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.
Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.
Implement a few more workqueue related functions and macros.
Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.
MFC after: 1 week
Sponsored by: Mellanox Technologies
WITNESS_ALL is defined. The lock name is based on the filename and
line number where the initialisation happens.
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.
- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.
- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.
- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle() are not re-entrant and needs a critical section or
a mutex to operate in the LinuxKPI, after inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of taskswitching and how many threads
are involved with the RCU and SRCU operations.
- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out the ck_epoch_recycle() always
have the records on an internal list and use a flag in the epoch
record to track allocated and free entries. This would lead to use
after free during module unload.
- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.
MFC after: 1 week
Sponsored by: Mellanox Technologies
The clang compiler will optimise these functions down to three AMD64
instructions if the bit argument is a constant during compilation.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When allocating unmapped pages, take advantage of the direct map on
AMD64 to get the virtual address corresponding to a page. Else all
pages allocated must be mapped because sometimes the virtual address
of a page is requested.
Move all page allocation and deallocation code into an own C-file.
Add support for GFP_DMA32, GFP_KERNEL, GFP_ATOMIC and __GFP_ZERO
allocation flags.
Make a clear separation between mapped and unmapped allocations.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
The i915kms driver in Linux 4.9 reimplement parts of the scatter list
functions with regards to performance. In other words there is not so
much room for changing structure layouts and functionality if the
i915kms should be built AS-IS. This patch aligns the scatter list
support to what is expected by the i915kms driver. Remove some
comments not needed while at it.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
1) Add better spinlock debug names when WITNESS_ALL is defined.
2) Make sure that the calling thread gets bound to the current CPU
while a spinlock is locked. Some Linux kernel code depends on that the
CPU ID doesn't change while a spinlock is locked.
3) Add support for using LinuxKPI spinlocks during a panic().
MFC after: 1 week
Sponsored by: Mellanox Technologies
Tasklets are implemented using a taskqueue and a small statemachine on
top. The additional statemachine is required to ensure all LinuxKPI
tasklets get serialized. FreeBSD taskqueues do not guarantee
serialisation of its tasks, except when there is only one worker
thread configured.
MFC after: 1 week
Sponsored by: Mellanox Technologies
A set of helper functions have been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct by demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.
This change lays the ground for emulating the Linux kernel more
closely which is a dependency by the code using the LinuxKPI APIs.
Add new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].
Fix some header file inclusions to make LINT kernel build properly
after this change.
Bump the __FreeBSD_version to force a rebuild of all kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
The intended use is to annotate frequently used globals which either rarely
change (and thus can be grouped in the same cacheline) or are an atomic counter
(which means it may benefit from being the only variable in the cacheline).
Linker script support is provided only for amd64. Architectures without it risk
having other variables put in, i.e. as if they were not annotated. This is
harmless from correctness point of view.
Reviewed by: bde (previous version)
MFC after: 1 month
the ones obtained through devclass_get_device(). Some minor code
cleanups while at it.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Linux list_for_each_entry() does not neccessarily end with the iterator
NULL (it may be an offset from NULL if the list member is not the first
element of the member struct).
Reported by: Coverity
CID: 1366940
Reviewed by: hselasky@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8780
My plan is to change this function's prototype at some point in the
future to add a new label argument, which can be used to export all of
sysctl as metrics that can be scraped by Prometheus. Switch over this
caller to use the macro wrapper counterpart.
conflict with the opensolaris kernel module.
This patch solves a problem where the kernel linker will incorrectly
resolve opensolaris kmem_xxx() functions as linuxkpi ones, which leads
to a panic when these functions are used.
Submitted by: gallatin @
Sponsored by: Mellanox Technologies
MFC after: 1 week
FreeBSD supports lazy allocation of PCI BAR, that is, when a device
driver's attach method is invoked, even if the device's PCI BAR
address wasn't initialized, the invocation of bus_alloc_resource_any()
(the call chain: pci_alloc_resource() -> pci_alloc_multi_resource() ->
pci_reserve_map() -> pci_write_bar()) would allocate a proper address
for the PCI BAR and write this 'lazy allocated' address into the PCI
BAR.
This model works fine for native FreeBSD device drivers, but _not_ for
device drivers shared with Linux (e.g. dev/mlx5/mlx5_core/mlx5_main.c
and ofed/drivers/net/mlx4/main.c. Both of them use
pci_request_regions(), which doesn't work properly with the PCI BAR
lazy allocation, because pci_resource_type() -> _pci_get_rle() always
returns NULL, so pci_request_regions() doesn't have the opportunity to
invoke bus_alloc_resource_any(). We now use pci_find_bar() in
pci_resource_type(), which is able to locate all available PCI BARs
even if some of them will be lazy allocated.
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: hps
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8071
Linux module parameters have a permissions value. If any write bits
are set we are allowed to modify the module parameter runtime. Reflect
this when creating the static SYSCTL nodes.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Bool module parameters are no longer supported, because there is no
equivalent in FreeBSD.
There are two macros available which control the behaviour of the
LinuxKPI module parameters:
- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the modules parameters will be created.
- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
added to all created module parameters.
Sponsored by: Mellanox Technologies
MFC after: 1 week
run after a panic(). This for example allows a LinuxKPI based graphics
stack to receive prints during a panic.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
at it use NULL for some pointer checks.
Bump the FreeBSD version to force recompilation of all kernel modules
due to a structure size change.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
streamline the rest of the xxx_to_jiffies() functions to have a
constant 64-bit argument and use identical range checks for the
result.
Specifically preserve msecs_to_jiffies(0) returning 0. See r282743 for
further details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
error code checks might fail. ERESTART is in the BSD world defined as
-1. While at it add more Linux error codes.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Linux requires that all IOCTL data resides in userspace. FreeBSD
always moves the main IOCTL structure into a kernel buffer before
invoking the IOCTL handler and then copies it back into userspace,
before returning. Hide this difference in the "linux_copyin()" and
"linux_copyout()" functions by remapping userspace addresses in the
range from 0x10000 to 0x20000, to the kernel IOCTL data buffer.
It is assumed that the userspace code, data and stack segments starts
no lower than memory address 0x400000, which is also stated by "man 1
ld", which means any valid userspace pointer can be passed to regular
LinuxKPI handled IOCTLs.
Bump the FreeBSD version to force recompilation of all kernel modules.
Discussed with: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
"current" inside all LinuxKPI file operation callbacks. The "current"
is frequently used for various debug prints, printing the thread name
and thread ID for example.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
This is kinda critical to the performance when the CPU is slow and
network bandwidth is high, e.g. in the hypervisor.
Reviewed by: rrs, gallatin, Dexuan Cui <decui microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5765