vm_phys_seg_paddr_to_vm_page() expects a PA that's in bounds, but
vm_phys_find_range() purposefully returns a pointer to the end of the
last page in a segment.
Fixes: 69cbb18746 ("vm_phys: Add a vm_phys_seg_paddr_to_vm_page() helper")
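The off-by-one can be illustrated with a minimal sketch. The types and
helpers below (struct seg, seg_contains(), seg_page_index(),
seg_end_index()) are hypothetical stand-ins, not the kernel's
definitions; they only show why an exclusive end address cannot be
passed to a lookup helper that asserts its argument is in bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for a physical segment covering [start, end). */
struct seg {
	uint64_t start;
	uint64_t end;	/* exclusive: one byte past the last page */
};

/* True only for addresses that fall inside the segment. */
static bool
seg_contains(const struct seg *s, uint64_t pa)
{
	return (pa >= s->start && pa < s->end);
}

/* Index of the page containing pa; the caller must pass an in-bounds PA. */
static uint64_t
seg_page_index(const struct seg *s, uint64_t pa)
{
	assert(seg_contains(s, pa));
	return ((pa - s->start) / SKETCH_PAGE_SIZE);
}

/*
 * Index one past the last page, computed without the bounds-checked
 * helper, which is what a caller holding an "end" address needs.
 */
static uint64_t
seg_end_index(const struct seg *s)
{
	return ((s->end - s->start) / SKETCH_PAGE_SIZE);
}
```

In this sketch seg_page_index(s, s->end) would trip the assertion,
while seg_end_index(s) yields the one-past-the-end index directly.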
Most of vmm.h is machine-independent. Simplify merging amd64 and arm64
vmm code by removing this machine-dependent routine from arm64's vmm.h.
No functional change intended.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D45557
FreeBSD's boot times have decreased to the point where vm_page array
initialization represents a significant fraction of the total boot time.
For example, when booting FreeBSD in Firecracker (a VMM designed to
support lightweight VMs) with 128MB and 1GB of RAM, vm_page
initialization consumes 9% (3ms) and 37% (21.5ms) of the kernel boot
time, respectively. This is generally relevant in cloud environments,
where one wants to be able to spin up VMs as quickly as possible.
This patch implements lazy initialization of (most) page structures,
following a suggestion from cperciva@. The idea is to introduce a new
free pool, VM_FREEPOOL_LAZYINIT, into which all vm_page structures are
initially placed. For this to work, we need only initialize the first
free page of each chunk placed into the buddy allocator. Then, early
page allocations draw from the lazy init pool and initialize vm_page
chunks (up to 16MB, 4096 pages) on demand. Once APs are started, an
idle-priority thread drains the lazy init pool in the background to
avoid introducing extra latency in the allocator. With this scheme,
almost all of the initialization work is moved out of the critical path.
A couple of vm_phys operations require the pool to be drained before
they can run: vm_phys_find_range() and vm_phys_unfree_page(). However,
these are rare operations. I believe that
vm_phys_find_freelist_contig() does not require any special treatment,
as it only ever accesses the first page in a power-of-2-sized free page
chunk, which is always initialized.
For now the new pool is only used on amd64 and arm64, since that's where
I can easily test and those platforms would get the most benefit.
Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D40403
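The lazy initialization scheme can be sketched in user space as
follows. This is a toy model under stated assumptions, not the vm_phys
implementation: the page array, the chunk size, the pool numbering and
every function name here are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES		64	/* toy "physical memory" size, in pages */
#define CHUNK_PAGES	16	/* pages initialized per on-demand step */

struct page {
	bool	initialized;
	int	pool;	/* 0 = lazy-init pool, 1 = default pool */
};

static struct page pages[NPAGES];
static size_t init_count;	/* how many pages have been initialized */

/* Initialize every page in the chunk that contains index idx. */
static void
init_chunk(size_t idx)
{
	size_t first = idx - (idx % CHUNK_PAGES);

	for (size_t i = first; i < first + CHUNK_PAGES; i++) {
		if (!pages[i].initialized) {
			pages[i].initialized = true;
			pages[i].pool = 1;
			init_count++;
		}
	}
}

/* Allocation path: pay the initialization cost only for this chunk. */
static struct page *
alloc_page(size_t idx)
{
	assert(idx < NPAGES);
	if (!pages[idx].initialized)
		init_chunk(idx);
	return (&pages[idx]);
}

/* Background drain: initialize whatever is still in the lazy pool. */
static void
drain_lazy_pool(void)
{
	for (size_t i = 0; i < NPAGES; i += CHUNK_PAGES)
		init_chunk(i);
}
```

An early alloc_page() call touches only one chunk; operations that
need every page initialized would call drain_lazy_pool() first.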
A subsequent patch will make this factoring more worthwhile.
No functional change intended.
Reviewed by: dougm, alc, kib, emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40400
This is useful for a subsequent patch which implements lazy
initialization of vm_page structures using a dedicated vm_phys free page
pool.
No functional change intended.
Reviewed by: alc, kib, emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40399
jemalloc performs two types of virtual memory allocations: (1) large
chunks of virtual memory, where the chunk size is a multiple of a
superpage and explicitly aligned, and (2) small allocations, mostly
128KB, where no alignment is requested. Typically, it starts with a
small allocation, and over time it makes both types of allocation.
With anon_loc being updated on every allocation, we wind up with a
repeating pattern of a small allocation, a large gap, and a large,
aligned allocation. (As an aside, we wind up allocating a reservation
for these small allocations, but it will never fill because the next
large, aligned allocation updates anon_loc, leaving a gap that will
never be filled with other small allocations.)
With this change, anon_loc isn't updated on every allocation. So, the
small allocations will be clustered together, the large allocations will
be clustered together, and there will be fewer gaps between the
anonymous memory allocations. In addition, I see a small reduction in
reservations allocated (e.g., 1.6% during buildworld), fewer partially
populated reservations, and a small increase in 64KB page promotions on
arm64.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39845
Replace the lookup-remove loop in rangeset_remove_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45584
Replace the lookup-remove loop in swp_pager_meta_free_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45583
PCTRIE_RECLAIM frees all the interior nodes in a pctrie, but is little
used because most trie-destroyers want to free leaves of the tree
too. Add PCTRIE_RECLAIM_CALLBACK, with two extra arguments, a callback
function and an auxiliary argument, that is invoked on every non-NULL
leaf in the tree as the tree is destroyed.
Reviewed by: rlibby, kib (previous version)
Differential Revision: https://reviews.freebsd.org/D45565
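The idea of reclaiming a trie while visiting its leaves can be
sketched as below. This is not the pctrie code: struct tnode, its
fixed fanout, and the function names are assumptions made for the
example. It shows a single post-order walk that frees interior nodes
and hands each non-NULL leaf to a callback with an auxiliary argument.

```c
#include <assert.h>
#include <stdlib.h>

#define FANOUT 4

/* Hypothetical interior node: each slot may hold a child and a leaf. */
struct tnode {
	struct tnode	*child[FANOUT];
	void		*leaf[FANOUT];
};

typedef void leaf_cb_t(void *leaf, void *arg);

/*
 * Free all interior nodes in one post-order walk, invoking cb on every
 * non-NULL leaf, so no per-leaf lookup or removal is needed.
 */
static void
trie_reclaim_callback(struct tnode *n, leaf_cb_t *cb, void *arg)
{
	if (n == NULL)
		return;
	for (int i = 0; i < FANOUT; i++) {
		trie_reclaim_callback(n->child[i], cb, arg);
		if (n->leaf[i] != NULL)
			cb(n->leaf[i], arg);
	}
	free(n);
}

/* Example callback: free the leaf and count it via the auxiliary arg. */
static void
free_and_count(void *leaf, void *arg)
{
	free(leaf);
	(*(int *)arg)++;
}
```

Compared with a lookup-remove loop, the walk visits each node once and
never searches the trie from the root again.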
For NFSv4.1/4.2, an atomic upgrade of a delegation from a
read delegation to a write delegation is allowed and can
result in significantly improved performance.
This patch adds this upgrade to the NFSv4.1/4.2 client and
enables use of read delegations.
For a test case of building a FreeBSD kernel (sources and
output objects) over a NFSv4.2 mount, these changes reduced
the elapsed time by 30% and included a reduction of 80% for
RPC counts when delegations were enabled. As such, with this
patch there are at least certain cases where enabling
delegations seems to be worth the increased complexity they
bring.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
Since delegations are only issued for regular files, check
v_type to see if the query is for a regular file. This is
a simple optimization for the non-VREG case.
While here, fix a couple of global variable declarations.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
NFSv4.1/4.2 defined new OPEN_WANT_xxx flags that a client
can use to hint to the server that delegations are or are
not wanted. This patch adds use of those delegations to
the client.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
The data of a TCP packet must fit into the announced window, but this is not
required for the sequence number of the FIN. A packet with the FIN bit set and
containing data that fits exactly into the announced window was blocked. Our
stack generates such packets when the receive buffer size is set to 1024. Now
pf uses only the data length for window comparison.
OK henning@
Obtained From: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
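The difference between the two window checks can be sketched as
follows. This is a simplified model, not pf's state-tracking code: the
struct and both functions are hypothetical, and sequence-number
wraparound is handled only by unsigned subtraction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one TCP segment. */
struct segment {
	uint32_t seq;		/* sequence number of the first data byte */
	uint32_t data_len;	/* payload length in bytes */
	bool	 fin;		/* FIN flag set */
};

/* Check that counts the FIN's sequence number against the window. */
static bool
fits_window_with_fin(const struct segment *s, uint32_t win_start,
    uint32_t win)
{
	uint32_t len = s->data_len + (s->fin ? 1 : 0);

	return ((s->seq - win_start) + len <= win);
}

/* Check that compares only the data length against the window. */
static bool
fits_window_data_only(const struct segment *s, uint32_t win_start,
    uint32_t win)
{
	return ((s->seq - win_start) + s->data_len <= win);
}
```

For a segment carrying exactly 1024 bytes plus FIN into a 1024-byte
window, the first check rejects the segment and the second accepts it.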
pf was setting max_win to 0 and discarded retransmitted SYN-ACK segments without
wscale if the original SYN contained a wscale option. With gerhard@, ok
henning@
Obtained From: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
This will be used when we add SVE support to reduce the registers
needed to be saved on context switch.
Reviewed by: imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43305
When returning from an exception to userspace clear the saved td_frame.
On the next exception this should point to the frame, however this is
not guaranteed.
To ensure the trap frame pointer is either valid or NULL clear it
before returning to userspace in the EL0 synchronous exception handler.
Reviewed by: kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D44807
Some LinuxKPI lock macros need a flags field passed in. This is
written to but never read from so gcc complains.
Fix this by marking the flags variables as unused to quieten the
compiler.
Reviewed by: brooks (earlier version), kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45303
When a variable is write-only and can't be removed, e.g. for API
reasons, it is useful to document this fact similar to __diagused
and __witness_used.
Add __writeonly to tell the compiler and anyone looking at the code
that this variable is expected to only be written to, and to not
raise an error.
Reviewed by: imp, kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45561
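A minimal sketch of how such an annotation is used follows. The macro
SKETCH_WRITEONLY is a hypothetical stand-in, assumed here to expand to
the GCC/Clang "unused" attribute; the toy lock and its macros are also
invented, loosely modelled on lock macros that take a flags argument
they only ever write.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the new annotation; assumed here to expand
 * to the compiler's "unused" attribute (GCC/Clang syntax).
 */
#define SKETCH_WRITEONLY __attribute__((__unused__))

/* Toy lock API whose macros insist on a flags lvalue. */
struct toy_lock { int held; };
#define toy_lock_irqsave(l, flags) \
	do { (flags) = 0; (l)->held = 1; } while (0)
#define toy_unlock_irqrestore(l, flags) \
	do { (l)->held = 0; } while (0)

static int
bump(struct toy_lock *l, int *counter)
{
	unsigned long flags SKETCH_WRITEONLY;	/* written, never read */

	toy_lock_irqsave(l, flags);
	assert(l->held);
	(*counter)++;
	toy_unlock_irqrestore(l, flags);
	return (*counter);
}
```

Without the annotation, a compiler that warns about variables which
are set but never used may complain about flags in bump().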
In several places, a loop tests for powers of two, or iterates through
powers of two. In those places, replace the loop with an invocation
of fls or ilog2 without changing the meaning of the code.
Reviewed by: alc, markj, kib, np, erj, avg (previous version)
Differential Revision: https://reviews.freebsd.org/D45494
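The kind of replacement described can be sketched in portable user
space C. The function names are illustrative only, and the bit-scan
version relies on the GCC/Clang builtin __builtin_clz rather than the
kernel's fls() or ilog2(); it also assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Loop form: count shifts until the value is exhausted. */
static int
ilog2_loop(uint32_t x)
{
	int n = -1;

	while (x != 0) {
		x >>= 1;
		n++;
	}
	return (n);
}

/* Bit-scan form; __builtin_clz is undefined for 0, hence the assert. */
static int
ilog2_clz(uint32_t x)
{
	assert(x != 0);
	return (31 - __builtin_clz(x));
}

/* Power-of-two test without a loop. */
static bool
is_pow2(uint32_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}
```

Both log functions return the index of the highest set bit, so the
loop can be swapped for the bit-scan without changing results for
nonzero inputs.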
Define a page_range struct to pair up the two values passed to
freerange functions. Have swp_pager_freeswapspace also take a
page_range argument rather than a pair of arguments.
In swp_pager_meta_free_all, drop a needless test and use a new
helper function to do the cleanup for each swap block.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D45562
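Pairing the two values in one struct might look like the sketch below.
Only the name page_range comes from the commit; the field names, the
accumulate-and-flush helper and the toy counters are assumptions made
for illustration, not the swap pager's code.

```c
#include <stdint.h>

/* A run of contiguous blocks; num == 0 means the range is empty. */
struct page_range {
	uint64_t start;	/* first block in the run */
	uint64_t num;	/* number of blocks in the run */
};

static uint64_t freed_blocks;	/* toy accounting for the example */
static unsigned int free_calls;

/* Release one whole run with a single call. */
static void
free_range(const struct page_range *r)
{
	if (r->num == 0)
		return;
	freed_blocks += r->num;
	free_calls++;
}

/* Extend the run if blk is adjacent; otherwise flush and restart. */
static void
range_update(struct page_range *r, uint64_t blk)
{
	if (r->num != 0 && r->start + r->num == blk) {
		r->num++;
		return;
	}
	free_range(r);
	r->start = blk;
	r->num = 1;
}
```

Passing one struct keeps the start and count together, so a helper
cannot be handed a mismatched pair.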
Drop an unneeded test, a branch and a needless computation to save a
few instructions.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D45558
Add boot parameter hw.vmbus.tlb_hcall for tlb flush hypercall.
By default it is set to 1 to allow hypercall tlb flush. It can be
set to 0 in loader.conf to turn off hypercall and use system
provided tlb flush routine.
The change also switches the per-CPU contiguous memory allocation to
the no-wait flag, to avoid a panic in some cases where there is not
enough contiguous memory available at boot time.
Reported by: gbe
Tested by: whu
MFC after: 1 week
Fixes: 2b887687ed
Sponsored by: Microsoft
When removing a port, the ioctl frontend requires the "-p" argument.
But other frontends, like cfiscsi, do not. So don't require that
argument in the ctladm command. The frontend driver will report an
error if any required argument is missing.
MFC after: 2 weeks
Sponsored by: Axcient
Reviewed by: mav
Pull Request: https://github.com/freebsd/freebsd-src/pull/1279
It isn't used, and only masks/unmasks FIQs on the local CPU so will be
broken on SMP.
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D33804
Adjust the mair_el1 macro indentation to be consistent with the
surrounding macros.
Reviewed by: emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45524
This code runs at EL2 while the kernel runs at EL1. We build these
files for EL2 through a dependency in vmm_hyp_blob.elf.full so there
is no need to include them in SRCS.
Reviewed by: imp, kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45467
When we update credits there is a potential for a race causing an
overflow of vxcr_next (i.e. incrementing it past vxcr_ndesc). Change the
check to >= rather than == to be more robust against this.
Reviewed by: emaste
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43712
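The more defensive wrap check can be sketched as follows. The ring
structure is a hypothetical simplification; its field names only
loosely mirror vxcr_next and vxcr_ndesc, and the generation bit is an
assumption added to make the wrap observable.

```c
#include <assert.h>

/* Hypothetical completion ring; field names loosely mirror the commit. */
struct comp_ring {
	unsigned int next;	/* next descriptor index to examine */
	unsigned int ndesc;	/* number of descriptors in the ring */
	unsigned int gen;	/* generation bit, flipped on each wrap */
};

/* Advance the index, wrapping even if it has already run past the end. */
static void
ring_advance(struct comp_ring *r)
{
	assert(r->ndesc > 0);
	if (++r->next >= r->ndesc) {	/* ">=" rather than "==" */
		r->next = 0;
		r->gen ^= 1;
	}
}
```

With an equality test, an index that had already been pushed past
ndesc would keep growing instead of wrapping back to zero.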
Extend SNDST_DSPS_PROVIDER_INFO for sound(4) to include information
about each channel in a given device, similar to how cat'ing
/dev/sndstat with hw.snd.verbose=2 works.
While here, document all provider_info fields.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D45501
On creating the pfsync(4) interface, pfsync_clone_create() does an
unconditional bpfattach(). Use bpf_peers_present() which was introduced
in commit 16d878cc99 [1] to check the presence of bpf peers.
This will save a few CPU cycles and some memory when the
synchronisation interface is not configured and there are no bpf peers
present. There should be no functional change.
1. 16d878cc99 Fix the following bpf(4) race condition which can result in a panic
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45533
On creating the pflog(4) interface, pflog_clone_create() does an
unconditional bpfattach(). Use bpf_peers_present() which was introduced
in commit 16d878cc99 [1] to check the presence of bpf peers.
This will save a few CPU cycles when no bpf peers are present. There
should be no functional change.
1. 16d878cc99 Fix the following bpf(4) race condition which can result in a panic
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45532
Defer the bti lookup until after page table page allocation is complete.
We sometimes release the pmap lock and sleep during page table page
allocation. Consequently, the result of a bti lookup from before
page table page allocation could be stale when we finally create the
mapping based on it.
Modify pmap_bti_same() to update the prototype PTE at the same time as
checking the address range. This eliminates the need for calling
pmap_pte_bti() in addition to pmap_bti_same(). pmap_bti_same() was
already doing most of the work of pmap_pte_bti().
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D45502
With multiple flags passed in, e.g., CTLFLAG_RD | CTLFLAG_CAPRD, due to
the precedence rules, this will result in a false positive assertion. Fix
that by surrounding the replacement lists with parentheses.
Reviewed by: imp, erj
Fixes: 10a1e981d4 iflib: mark isc_driver_version as constant
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D45531
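The precedence problem can be reproduced with a minimal sketch. The
flag values and macro names below are invented for illustration and
are not the sysctl or iflib definitions. Note that compilers may warn
about the unparenthesized form, which is the point of the example.

```c
#define FLAG_RD		0x1u
#define FLAG_WR		0x2u
#define FLAG_CAPRD	0x4u

/*
 * Unparenthesized: with access = FLAG_RD | FLAG_CAPRD this expands to
 * (FLAG_RD | FLAG_CAPRD & FLAG_WR), and "&" binds tighter than "|".
 */
#define IS_WRITABLE_BAD(access)		(access & FLAG_WR)

/* Parenthesized: the whole argument is masked with FLAG_WR. */
#define IS_WRITABLE_GOOD(access)	((access) & FLAG_WR)
```

IS_WRITABLE_BAD(FLAG_RD | FLAG_CAPRD) evaluates to a nonzero value
even though FLAG_WR is not set, so an assertion built on it fires
spuriously; the parenthesized form evaluates to zero.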
Rather than using the values and leaving net80211 names in a comment,
define the LinuxKPI IEEE80211_HT_CAP_* to the net80211 IEEE80211_HTCAP_*
names. That way errors like the one fixed in 3e0915b7b6 are less
likely to happen.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Add the SET_SYSTEM_SLEEP_PM_OPS() by factoring some other macro code
out in order to set the suspend/resume functions when the struct is
already given. Such is the case in iwlwifi d3.
Also add an initial implementation of device_can_wakeup(). Though
this is likely all we need we have no way of setting the flag for it
yet so leave a pr_debug() and a comment there as well. Until we want
to support WoWLAN this is likely not needed for wireless.
Doing it the proper way checking a bool in 'struct dev_pm_info' would
change 'struct device' and with that 'struct pci_dev' and break the
KBI. In favour of mergeability this version does not implement the
full functionality yet.
Both help to make an updated iwlwifi d3 compile.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D45358
The "Invalid TXQ" error from iwlwifi seems to be triggered by a
frame being sent for a sta which is no longer known to the driver/fw.
While we make sure to trigger the sending of the frame in net80211
early enough (by calling (*iv_newstate)() early on rather than at
the end), TX in LinuxKPI is run in a deferred task. When we drop the
net80211 ic lock again and re-acquire the LHW lock the packet may not
yet have made it to the driver.
Work around this between the (ic and lhw) locks by making sure
(a) no new packets get queued after we return from (*iv_newstate)(),
and (b) the TX task has run or gets cancelled and we manually push
any remaining packets out (or let lsta_free() clean them up).
The disabled packet queuing now also needs to be re-enabled in
scan_to_auth() in case an lsta is staying in service or gets re-used.
Also make sure that any following lkpi_wake_tx_queues() calls no
longer ignore queues which have not seen a prior dequeue.
This former workaround "feature" (ltxq->seen_dequeue) should be
fully garbage collected in a later change on its own.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
PR: 274382
Tested by: emaste, lwhsu, thj, rkoberman at gmail.com
Accepted by: adrian
Differential Revision: https://reviews.freebsd.org/D45508
We have to unlock the net80211 ic lock in order to be able to call
sleepable downcalls to the driver/firmware; a 2nd thread may go through
net80211::join1() and (*iv_update_bss)() after we checked and unlocked.
Re-check status at the end of the function under the ic lock so that we
do not accidentally set lvif_bss_synched to true again despite it no
longer being true.
This should fix a race where we lost the (*iv_update_bss)() state
during startup where one SCAN->AUTH is followed by a (then) AUTH->AUTH
and lkpi_sta_a_to_a() did the wrong thing.
Once we re-consider net80211 state and allowing a second join
on a different node or iv_bss update without previously tearing down
the older node we can likely undo a lot of these extra checks and
workarounds.
Sponsored by: The FreeBSD Foundation (updated version)
Tested by: emaste (on and off)
MFC after: 3 days
Reviewed by: cc
Differential Revision: https://reviews.freebsd.org/D43967
IEEE80211_HT_CAP_RX_STBC was set to 0x100 instead of 0x300.
Correct to get the expected behavior.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Fixes: b0f7376822 LinuxKPI: 802.11 header updates
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D45506
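The effect of the wrong mask can be shown with a short sketch. The
values 0x100 and 0x300 come from the commit; the macro names, the
shift of 8, and the helper function are assumptions for illustration.

```c
#include <stdint.h>

#define SKETCH_HT_CAP_RX_STBC		0x0300u	/* two-bit field */
#define SKETCH_HT_CAP_RX_STBC_SHIFT	8	/* assumed field position */

/* Extract the RX STBC field from a capability word using mask. */
static unsigned int
rx_stbc_streams(uint16_t cap, uint16_t mask)
{
	return ((cap & mask) >> SKETCH_HT_CAP_RX_STBC_SHIFT);
}
```

With the one-bit mask 0x100, a field value of 3 reads back as 1 and a
field value of 2 reads back as 0; the two-bit mask 0x300 preserves
both.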
AMPDU_RX was added as a second AMPDU_TX, and LDPC_TX and LDPC_RX were
missing; correct the former and add the latter.
Makes ddb output (and other debugging) look more correct.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D45505