Introduce SLEEPQ_DROP sleepq_signal() flag, allowing one to drop the
sleep queue chain lock before returning. Reduced lock scope allows
significantly reduce lock contention inside taskqueue_enqueue() for
ZFS worker threads doing ~350K disk reads/s on 40-thread system.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
The last commit (538ef055b7) accidentally
set the executable bits for this file. This is not intended to be
executed at all. (Would that even work?)
Sponsored by: Intel Corporation
This version is intended to be used with the 0.29.4 version of the
ice(4) driver, which will be be committed afterwards.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reviewed by: stallamr_netapp.com
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D30887
Hide the vDSO defines to the linux32_sysvec as they are not intended to
be used outside of it. Fix LINUX32_PS_STRINGS, use the size of
struct linux32_ps_strings instead of a numeric constant.
MFC after: 2 weeks
Hardware TLS is now supported in some interface cards and it works well. Except that
when we have connections that retransmit a lot we get into trouble with all the retransmits.
This prep step makes way for change that Drew will be making so that we can "kick out" a
session from hardware TLS.
Reviewed by: mtuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30895
When NFSv4.1 support was added to the client, the implementation was
still experimental and, as such, the default minor version was set to 0.
Since the NFSv4.1 client implementation is now believed to be solid
and the NFSv4.1/4.2 protocol is significantly better than NFSv4.0,
I beieve that NFSv4.1/4.2 should be used where possible.
This patch changes the default minor version for NFSv4 to be the highest
minor version supported by the NFSv4 server. If a specific minor version
is desired, the "minorversion" mount option can be used to override
this default. This is compatible with the Linux NFSv4 client behaviour.
This was discussed on freebsd-current@ in mid-May 2021 under
the subject "changing the default NFSv4 minor version" and
the consensus seemed to be support for this change.
It also appeared that changing this for FreeBSD 13.1 was
not considered a POLA violation, so long as UPDATING
and RELNOTES entries were made for it.
MFC after: 2 weeks
Remove an #if 0 that results in a compilation error if PV_STATS is
defined. Aside from this #if 0, there is nothing wrong with the
PV_STATS code.
MFC after: 1 week
There were two bugs that prevented V4 sockets from connecting to
a rack server running a V4/V6 socket. As well as a bug that stops the
mapped v4 in V6 address from working.
Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30885
Some of the changes in this release:
* Large LLQ headers,
* Bug/stability fixes,
* Change of the README/Documentation.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Update the driver in order not to break its compilation
and make use of the new ENA logging system
Migrate platform code to the new logging system provided by ena_com
layer.
Make ENA_INFO the new default log level.
Remove all explicit use of `device_printf`, all new logs requiring one
of the log macros to be used.
IO queue related attributes are registered statically at driver attach
with the rest of the ENA specific sysctl nodes. However, the number of
queues can be changed at runtime via the `ena_sysctl_io_queues_nb`
request, leading to a potential exposure of attributes for non-existing
queues.
Introduce a new `ena_sysctl_update_queue_node_nb` function, which
updates the sysctl nodes after the number of queues is altered.
This happens by either registering or unregistering node specific oids,
based on a delta between the previous and current queue count.
NOTE: All unregistered oids must be registered again before the driver
detach, e.g. by another call to this function.
Submitted by: Artur Rojek <ar@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Calling free on a NULL pointer is valid, as appropriate check is already
done internally:
/* free(NULL, ...) does nothing */
if (addr == NULL)
return;
Submitted by: Artur Rojek <ar@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Default LLQ (Low-latency queue) maximum header size is 96 bytes and can
be too small for some types of packets - like IPv6 packets with multiple
extension. This can be fixed, by using large LLQ headers.
If the device supports larger LLQ headers, the user can activate this
feature by setting sysctl tunable 'hw.ena.force_large_llq_header' to '1'
in the /boot/loader.conf file.
In case the device isn't supporting this feature, the default value (96B)
will be used.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
According to man style(9), only C-style comments should be used.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Implement support for the NXP LS1028A SoC MDIO controller.
It is attached to the internal PCI root complex.
The controller is used to communicate with PHYs of ports connected
to the internal switch.
Submitted by: Lukasz Hajec <lha@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30731
ENETC it a gigabit Ethernet controller found on the LS1028A board.
It supports basic VLAN offloads - tag extraction, injection and hardware
filtering. Inband MDIO connectivity is used for link status
monitoring through the miibus interface. Fixed-link mode is also
supported, which allows for operation of internal cpu to switch port.
Since no admin interrupts are present in hardware, link status polling
has to be used.
Due to a hardware bug software reset of the NIC results in a external
abort. Because of that most of the hardware initialization is done
during attach. This also means that in the case of an fatal error full
board reset is required.
The enetc_hw.h header was imporoted from Linux. It is dual licensed.
Submitted by: Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30729
Provide common MDIO code for two LS1028 ENETC controllers -
an external one found on the PCI bus and internal one found in ENETC.
Submitted by: Lukasz Hajec <lha@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30730
ENETC NIC found in LS1028A has a bug where clearing TX pidx/cidx
causes the ring to hang after being re-enabled.
Add a new flag, if set iflib will preserve the indices during restart.
Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: gallatin, erj
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30728
By definition ofw_bus_get_node() should consistently return -1 when there
is no associated OF node.
MFC after: 4 weeks
Discussed with: nwhitehorn
Analyzed in: https://reviews.freebsd.org/D30761
The "function entry" needs to be recorded with TSRAW() rather than
TSENTER() since curthread is not yet defined. (The same approach
is taken in amd64's hammer_time.)
The warning message "ERROR loading DTB" (for systems without a device
tree blob) is printed extremely early in the boot process -- among
other things, before curthread or other pcpu data has been set up.
Unfortunately, printf is instrumented with TSLOG, which cannot run
quite this early.
Wrap the printf in #ifndef TSLOG; the situations where the printf
will be useful are not ones where TSLOG would be in use.
Revise pmap_remove_l2() to use the constant-time function page_to_pvh()
instead of the linear-time function pa_to_pvh().
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30876
This effectively causes syncing of the mount point from softdep_prealloc(),
softdep_prerename(), and softdep_prelink(). Typically it avoids the need
for journal suspension at this point, at all.
Suggested and reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
We call into softdep_prerename() and softdep_prelink() when there is
low free space in the journal. Functions sync all vnodes participating
in the VOP, in the hope that this would reduce journal utilization. But
if the vnodes are already synced, doing sync would only spend writes,
journal is filled not due to the records from modifications of our
vnodes.
Remember original seqc numbers for vnodes, and only initiate syncs when
seqc changed.
Reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
so call it only in SU+J case
Reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
Allocate nameidata on stack and NDPREINIT() it, for compatibility with
assumptions from other filesystems' lookup code.
Reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
Its intent is to do the initialization of the future part of struct nameidata
which should be used across several namei() and VOPs. Right now it is NOP.
Reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
The page table entry for a 4KB page mapping must be valid if a PV entry
for the mapping exists, so there is no point in testing each page table
entry's validity when iterating over a PV list.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30875
When allocating the mbuf we used m_get2 which fails
if len is superior to MJUMPAGESIZE, if its the case,
use m_getjcl instead.
Reviewed by: kp@
PR: 205164
Pull Request: https://github.com/freebsd/freebsd-src/pull/131
Add devd event on network iface address add/remove. Can be used to
automate actions on any address change.
Reviewed by: imp@ (and minor style tweaks)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30840
Compiling libalias results in warnings about unused functions.
Those warnings are caused by clang's heuristic to consider an inline
function as in use, iff the declaration is in a *.c file.
Declarations in *.h files do not emit those warnings.
Hence the declarations must be moved to an extra *.h file.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D30844
ddfc9c4c59 was missing changes to two files to complete the
bus_child_pnpinfo_str->bus_child_pnpinfo. This fixes the broken kernel
builds.
Sponsored by: Netflix
Now that the upper layers all go through a layer to tie into these
information functions that translates an sbuf into char * and len. The
current interface suffers issues of what to do in cases of truncation,
etc. Instead, migrate all these functions to using struct sbuf and these
issues go away. The caller is also in charge of any memory allocation
and/or expansion that's needed during this process.
Create a bus_generic_child_{pnpinfo,location} and make it default. It
just returns success. This is for those busses that have no information
for these items. Migrate the now-empty routines to using this as
appropriate.
Document these new interfaces with man pages, and oversight from before.
Reviewed by: jhb, bcr
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29937
tcp_twcheck now expects a read lock on the inp for the SYN case
instead of a write lock.
Reviewed by: np
Fixes: 1db08fbe3f tcp_input: always request read-locking of PCB for any pure SYN segment.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D30782
If the connection is in the process of disconnecting, ic_socket can be
NULL. For icl_cxgbei_conn_transfer_setup(), lock the connection and
check ic_socket before using it. For icl_cxgbei_conn_task_setup(),
the caller already holds the connection lock, so assert it and bail
early with ECONNRESET if the connection is disconnecting.
Reported by: Jithesh Arakkan @ Chelsio
Fixes: f949967c8e cxgbei: Fix a race between transfer setup and a peer reset.
are unneccessary and used to be there before TFO as an invariant. With
TFO and after 8d5719aa74 the "so" value is still needed.
Reported & tested by: tuexen
Fixes: 8d5719aa74
When support for a sparse pv_table was added, the implementation of
pa_to_pvh() changed from a simple constant-time calculation to iterating
over the array vm_phys_segs[]. To mitigate this issue, an alternative
function, page_to_pvh(), was introduced that still runs in constant time
but requires the vm_page_t to be known. However, three cases where the
vm_page_t is known were not converted to page_to_pvh(). This change
converts those three cases.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30832
The INQUIRY command may return a CAM_DATA_RUN_ERR code, even when
it succeeds. This happens during driver startup, causing the
current and further inquiries to be aborted, resulting in some
missing information about the controller.
Reviewed by: imp
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D30843
Reserve one page for the DMA subsystem, that may need it when the I/O
buffer is not page aligned.
Without this change, writes with the maximum allowed size failed, if:
- physical memory was fragmented, making it necessary to use one DMA
segment for each page
- the buffer to be written was not page aligned, causing the DMA
subsystem to need one extra segment
In the scenario above, the DMA subsystem would run out of segments,
resulting in a write with no SG segments, that would fail.
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D30798