43165 commits

Author SHA1 Message Date
Warner Losh
b2fd259edd uart: Add a signal to compute rclk from baudrate
With newer, more diverse hardware designs, the rclk can be
unknown. Currently deployed systems have no standard way to discover the
baud-clock generator frequency. However, sometimes we have a fairly good
idea that the firmware programmed the UART to be the baud rate that it's
telling us it's at. Create a way to instruct the uart class drivers to
compute the baud clock frequency the first time their init routines are
called. Usually the 'divisors' are relatively small, meaning we will
likely have a fairly large error (goes as 1 / (divisor + 1)). However,
we also know that the baud-generator clock needs to be divided down
to the baud-rate +/- about 5% (so while the error could be large for
an arbitrary baud-clock, standard baud rates generally will give
an error of 5% or less).

Often, the console speed and the getty-configured speed are the same, so
this heuristic allows boot messages and login sessions to work.

Sponsored by:		Netflix
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D47072
2024-10-14 16:03:58 -06:00
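The rclk estimate described in b2fd259edd can be sketched roughly as follows. This is only an illustration, not the driver's code: it assumes an ns8250-style UART whose baud generator divides the reference clock by 16 times the divisor, and the names `OVERSAMPLE`, `estimate_rclk` and `actual_baud` are hypothetical.

```c
#include <stdint.h>

/*
 * Assumption: ns8250-style baud generator, where
 * baud = rclk / (16 * divisor). Other UART classes may differ.
 */
#define OVERSAMPLE 16u

/* Estimate rclk (Hz) from the configured baud rate and the divisor
 * that the firmware left programmed in the UART. */
uint32_t
estimate_rclk(uint32_t baud, uint32_t divisor)
{
	return (baud * OVERSAMPLE * divisor);
}

/* Baud rate the UART produces for a given rclk and divisor. */
uint32_t
actual_baud(uint32_t rclk, uint32_t divisor)
{
	return (rclk / (OVERSAMPLE * divisor));
}
```

For example, a console running at 115200 baud with a divisor of 1 implies an rclk of 1843200 Hz under this assumption. Because the divisor is an integer, the true clock may differ from the estimate, which is the error the commit message discusses.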
Warner Losh
cc7854e109 uart: export rclk via sysctl
To help debugging, export the rclk a uart is using as
dev.uart.X.rclk. It can be opaque when it is wrong since any error
messages printed to the system console using the wrong rclk aren't
informative.

Sponsored by:		Netflix
Reviewed by:		andrew, markj
Differential Revision:	https://reviews.freebsd.org/D47070
2024-10-14 16:03:58 -06:00
Warner Losh
fa93443af9 uart: Prefer rclk passed in over rclk in the class
If rclk is set in sysdev, then it was set during the boot process and is
intended to override the defaults. By preferring the sysdev one over the
class, xo=XXXX in hw.uart.console can give the user a usable console for
non-traditional UARTs, especially on !x86 platforms. The default rclk
is generally only good for I/O mapped UARTs or PCI ones that we can do a
table lookup on. Other times, it can be hard to know what a good default
is without more information.

Sponsored by:		Netflix
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D47069
2024-10-14 16:03:58 -06:00
Bjoern A. Zeeb
e69e172d40 dpaa2: allow tapping of tx packets in dpni
Packet capturing on dpni is only half-working given the BPF_MTAP call
in the TX path is missing. Add it to see packets in both directions.

MFC after:	3 days
Reviewed by:	dsl
Differential Revision: https://reviews.freebsd.org/D47103
2024-10-14 17:41:35 +00:00
Kevin Bowling
7763b194d8 igc: txrx function prototype cleanup
Drop variable names from function prototypes, since the file is
inconsistent about listing them and they fall out of sync.

MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 09:10:59 -07:00
Kevin Bowling
9dc452b983 e1000: txrx function prototype cleanup
Drop variable names from function prototypes, since the file is
inconsistent about listing them and they fall out of sync.

MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 09:10:59 -07:00
Kevin Bowling
1b0e41ddff igc: Function prototype cleanup
Drop variable names from function prototypes, since the file is
inconsistent about listing them and they fall out of sync.

MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 06:52:31 -07:00
Kevin Bowling
542f5d5631 igc: Rename 'struct adapter' to 'struct igc_softc'
Rename the 'struct adapter' to 'struct igc_softc' to avoid type
ambiguity in things like kgdb and make sharing code with e1000 and
ixgbe easier.

MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 06:52:31 -07:00
Mark Johnston
1bae9dc584 netmap: Make memory pools NUMA-aware
Each netmap adapter associated with a physical adapter is attached to a
netmap memory pool.  contigmalloc() is used to allocate physically
contiguous memory for the pool, but ideally we would ensure that all
such memory is allocated from the NUMA domain local to the adapter.

Augment netmap's memory pools with a NUMA domain ID, similar to how
IOMMU groups are handled in the Linux port.  That is, when attaching to
a physical adapter, ensure that the associated memory pools are local to
the adapter's associated memory domain, creating new pools as needed.

Some types of ifnets do not have any defined NUMA affinity; in this case
the domain ID in question is the sentinel value -1.

Add a sysctl, dev.netmap.port_numa_affinity, which can be used to enable
the new behaviour.  Keep it disabled for now to avoid surprises in case
netmap applications are relying on zero-copy optimizations to forward
packets between ports belonging to different NUMA domains.

Reviewed by:	vmaffione
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D46666
2024-10-14 13:33:33 +00:00
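The "find the pool for this domain, creating it as needed" behaviour described in 1bae9dc584 might look something like the sketch below. It is a simplified illustration and not netmap's implementation: the fixed-size table, `struct mem_pool`, `pool_for_domain` and `DOMAIN_NONE` are all hypothetical names, and real pools would also carry their allocated memory.

```c
#include <stddef.h>

#define MAX_POOLS   8
#define DOMAIN_NONE (-1)	/* sentinel: ifnet has no NUMA affinity */

struct mem_pool {
	int in_use;
	int domain;		/* NUMA domain ID, or DOMAIN_NONE */
};

static struct mem_pool pools[MAX_POOLS];

/* Return the pool bound to `domain`, creating one if none exists yet. */
struct mem_pool *
pool_for_domain(int domain)
{
	struct mem_pool *free_slot = NULL;

	for (size_t i = 0; i < MAX_POOLS; i++) {
		if (pools[i].in_use && pools[i].domain == domain)
			return (&pools[i]);
		if (!pools[i].in_use && free_slot == NULL)
			free_slot = &pools[i];
	}
	if (free_slot != NULL) {
		free_slot->in_use = 1;
		free_slot->domain = domain;
	}
	return (free_slot);	/* NULL if the table is full */
}
```

Adapters in the same domain share a pool, while an adapter without NUMA affinity is keyed by the sentinel value and gets its own.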
Kevin Bowling
09526a771a igc: Add sysctls for some missing MAC stats
MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 06:07:41 -07:00
Kevin Bowling
68b1f5dc59 igc: Add sysctl for DMA Coalesce
This feature can increase efficiency at the expense of latency.

It does not work well with the default interrupt delay, but expose
the otherwise unconnected code in the driver in case people want to
experiment.

See
https://www.intel.com/content/dam/support/us/en/documents/network/adapter/pro100/sb/466827_intel_r__dma_coalescing_white_paper_v003.pdf

MFC after:	1 week
Sponsored by:	BBOX.io
2024-10-14 05:56:39 -07:00
Emmanuel Vadot
c875e976f6 vt_splash: Remove debug print
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2024-10-14 11:01:59 +02:00
Konstantin Belousov
4bf34c597c md(4): always trim the last partial sector
Do it also for the preloaded disk, in addition to the dynamically
configured device.  This is needed to avoid geom checking alignment and
panicking on read of the last sector, e.g. for partition schemes and
label tasting.

PR:	281978
Reported by:	bz
Reviewed by:	bz, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D47102
2024-10-14 11:08:21 +03:00
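Trimming the last partial sector, as 4bf34c597c describes, amounts to rounding the media size down to a multiple of the sector size. A minimal sketch, with the hypothetical helper name `trim_partial_sector` (the md(4) code itself is not reproduced here):

```c
#include <stdint.h>

/* Round a media size in bytes down to a whole number of sectors.
 * sectorsize must be non-zero. */
uint64_t
trim_partial_sector(uint64_t mediasize, uint32_t sectorsize)
{
	return (mediasize - mediasize % sectorsize);
}
```

With 512-byte sectors, a 1000-byte image is exposed as 512 bytes, so a read of the last sector never runs past the end of the backing store.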
Peter Eriksson
b339ab1491 ciss: Don't panic on null CR ciss_dequeue_notify
Apparently, sometimes on hot plug/unplug, a null cr comes back from
ciss_dequeue_notify. This is clearly a bug, and by ignoring it we're
papering over that bug. We only ever wake the thread after enqueuing a
notification or setting a bit about killing the thread, so once we check
the bit isn't the cause, cr can't be NULL unless something else has
dequeued it.

Ideally, this would be fixed, rather than papered over, but this makes a
very old card somewhat more useable for external enclosures. I suspect
it's a race when we set CISS_THREAD_SHUT and another flag (the latter
w/o ciss_mtx held), but I don't see it and w/o hardware to reproduce
it would be hard to know for sure.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:37:46 -06:00
Peter Eriksson
fd95966af5 ciss: hw.ciss.initiator_id to set the initiator ID
Add hw.ciss.initiator_id to set the initiator to something other than the
default.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:37:46 -06:00
Peter Eriksson
45645518ea ciss: Add max physical target
Add support for tracking the maximum physical target and using that to
override the maximum logical target.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:24:15 -06:00
Peter Eriksson
7c74337e2c ciss: Expose tunable hw.ciss.force_interrupt as sysctl
Expose the hw.ciss.force_interrupt tunable as a sysctl and make it
writeable at runtime.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:24:06 -06:00
Peter Eriksson
77af8c6db2 ciss: Expose tunable hw.ciss.force_transport as sysctl
Expose the hw.ciss.force_transport tunable as a sysctl and make it
writeable at runtime.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:54 -06:00
Peter Eriksson
cec58bba64 ciss: Expose tunable hw.ciss.nop_message_heartbeat as sysctl
Expose the hw.ciss.nop_message_heartbeat tunable as a sysctl and make
it writeable at runtime.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:45 -06:00
Peter Eriksson
a35564358a ciss: Expose tunable hw.ciss.expose_hidden_physical as sysctl
Expose the hw.ciss.expose_hidden_physical tunable as a sysctl
and make it writeable at runtime.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:36 -06:00
Peter Eriksson
d8b024673b ciss: Report more errors at higher ciss_verbose levels
Report more information on errors, including the opcode.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:25 -06:00
Peter Eriksson
f373e6b866 ciss: Add sysctl/tunable hw.ciss.verbose
Add a tunable to turn on/off verbosity for debugging purposes. This is
approximately the same as bootverbose, but will print even more
information when > 1.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:17 -06:00
Peter Eriksson
74575d1428 ciss: Add sysctl/tunable hw.ciss and hw.ciss.base_transfer_speed
Add a sysctl/tunable to report a different base transfer speed than the
default of 132*1024.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:23:08 -06:00
Peter Eriksson
cafc839393 ciss: Ignore data over/under run on RECEIVE_DIAGNOSTIC
This appears to be harmless, so ignore data over/under run on
diagnostics.

PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:22:19 -06:00
Peter Eriksson
f03e1a42e9 ciss: Minor formatting nit.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
2024-10-13 23:22:01 -06:00
Konstantin Belousov
e9d948cfe0 iommu: move context link and ref count into device-independent parts
This also allows moving some bits of ddb print routines into
iommu_utils.c common for x86 iommu drivers.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-10-14 01:30:26 +03:00
Konstantin Belousov
7896b03fff iommu_get_requester(): do not panic if asked about non-pci device
For now, return zero rid and the device itself.  Add a comment noting
that eventually ACPI HID can be used to calculate rid for some more
devices.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-10-14 01:30:26 +03:00
Konstantin Belousov
26ff9d2050 iommu.h: improve header self-sufficiency
The header embeds struct task into defined structures.  Also it needs
the PCI_BUSMAX constant.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-10-14 01:30:25 +03:00
prateek sethi
c0e0e530ce mps/mpr: Add workaround for firmware not responding to IOC_FACTS or IOC_INIT
Sometimes, especially with older firmware, mps(4) would have trouble
initializing the card in one of these two steps. Add in a retry after a
short delay. Sean Bruno and Stephen McConnell thought this was OK in the
bug discussions, but never committed it.  Steve indicated the delay
might not be necessary, but the OP clearly needed to make it longer to
make things work. I've kept the delay, and added the suggested comment.

Ported the iocfacts part to mpr as well, since we see similar errors
about once every month or two over a few thousand controllers at
work. We've not seen it with IOC_INIT as far back as I can query the
error log database, so I didn't port that forward. We'll see if this
helps, but won't know for sure until next year (so I'm committing it now
since it won't hurt and might help). We usually see this failure in
connection with complicated recovery operations with a drive that's
failing, though, at least in the last year's worth of failures. It's
not clear this is the same as OP or not.

PR: 212841
Sponsored by: Netflix
Co-authored-by: imp
2024-10-13 15:38:01 -06:00
Matthias Lanter
ecbe99e162 amdtemp: add support for AMD Family 19h Models 40h-4Fh
PR:		281962
MFC after:	2 weeks
2024-10-13 13:21:19 +00:00
Matthias Lanter
a76e28d10f amdsmn: add support for AMD Family 19h Models 40h-4Fh
PR:		281962
MFC after:	2 weeks
2024-10-13 13:20:01 +00:00
Kevin Bowling
669d26e576 igc: Want AIM at 2.5G
This should have been committed with bc9402a, need to account for
link_speed of 2500 as well on igc.

MFC after:	6 days
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Sponsored by:	BBOX.io
2024-10-12 23:13:08 -07:00
Bjoern A. Zeeb
fa3dfeff95 dpaa2: fix MRU for dpni (and software vlans along)
0480dccd3f tried to fix the MTU for software VLANs given dpni
announces IFCAP_VLAN_MTU.  Unfortunately the initial MRU during
setup is reduced from the maximum supported by the HW to our
maximum ethernet RX frame length so only after further mtu toggles
the solution there would work.
Set the maximum RX frame size (without CRC) to jumbo length +
vlan encap len by default given we also announce IFCAP_JUMBO_MTU.

While here improve the manual (ioctl) MTU setting by checking if
IFCAP_VLAN_MTU is currently enabled and only then add the extra
bytes.

Fixes:		0480dccd3f
MFC after:	3 days
Reviewed by:	dsl
Differential Revision: https://reviews.freebsd.org/D47066
2024-10-12 22:13:39 +00:00
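The manual MTU handling described in fa3dfeff95 (add the VLAN bytes only when IFCAP_VLAN_MTU is enabled) can be illustrated as below. The constants use the conventional Ethernet sizes and the function name `max_rx_frame_len` is hypothetical; this is a sketch of the arithmetic, not the dpni driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conventional Ethernet sizes, named here for illustration only. */
#define HDR_LEN		14u	/* dst MAC + src MAC + ethertype */
#define VLAN_ENCAP_LEN	 4u	/* 802.1Q tag */

/* Maximum RX frame length (without CRC) for a given MTU. The VLAN
 * tag bytes are added only if the VLAN MTU capability is enabled. */
uint32_t
max_rx_frame_len(uint32_t mtu, bool vlan_mtu_enabled)
{
	return (mtu + HDR_LEN + (vlan_mtu_enabled ? VLAN_ENCAP_LEN : 0u));
}
```

For an MTU of 1500 this gives 1518 bytes with the capability enabled and 1514 bytes without it.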
Arvydas Sidorenko
4584c8f0ef lpt: check readiness with predefined macros
Replace spelled-out bits with pre-defined macros for those same bits.
No functional change.

PR: 170076
Reviewed by: imp
2024-10-12 14:40:24 -06:00
Colin Percival
c808132731 acpi_gpiobus: OR GPIO_PIN_(IN|OUT)PUT into flags
Right now flags is set to 0 before this "=" -> "|=" change, but it will
matter when the NOT_YET section above becomes effective.

MFC after:	2 weeks
Sponsored by:	Amazon
2024-10-12 11:14:25 -07:00
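The difference between "=" and "|=" that c808132731 addresses is easy to see in isolation. In this sketch the flag names and values are hypothetical stand-ins chosen for illustration, and `set_direction` is not a function from the driver.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration. */
#define PIN_INPUT	0x0001u
#define PIN_OUTPUT	0x0002u
#define PIN_PULLUP	0x0020u

/* OR the direction bit into flags so that bits set earlier
 * (for example a pull-up request) are preserved. */
uint32_t
set_direction(uint32_t flags, int is_output)
{
	flags |= is_output ? PIN_OUTPUT : PIN_INPUT;
	return (flags);
}
```

With plain assignment, a previously set `PIN_PULLUP` would be lost. While flags starts at 0 the two forms behave identically, which is why the change only matters once earlier code begins setting bits.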
Zhenlei Huang
c7a2636889 axgbe: Fix setting promisc mode
Ethernet drivers should respect IFF_PROMISC rather than IFF_PPROMISC.
The latter is for user-requested promisc mode, it implies the former
but not vice versa. Some in-kernel components such as if_bridge(4) and
bpf(4) will set promisc mode for interfaces on-demand.

While here, update the debugging message to be less confusing.

This was spotted while reviewing markj@'s work D46524.

Test from Franco shows that the interface seems to be unconditionally
initialized to promisc mode regardless of this fix. That needs further
investigation.

Reviewed by:	markj, Franco Fichtner <franco@opnsense.org>
Tested by:	Franco Fichtner <franco@opnsense.org>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D46794
2024-10-12 21:56:56 +08:00
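The distinction c7a2636889 draws between the two flags can be shown with a small sketch. The `X_`-prefixed constants are local stand-ins whose values are assumed to mirror FreeBSD's IFF_PROMISC and IFF_PPROMISC, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; values assumed to match FreeBSD's net/if.h. */
#define X_IFF_PROMISC	0x00100u	/* receive all packets */
#define X_IFF_PPROMISC	0x20000u	/* user-requested promisc mode */

/* What a driver should test: the effective promiscuous state. */
bool
want_promisc(uint32_t if_flags)
{
	return ((if_flags & X_IFF_PROMISC) != 0);
}

/* The mistaken test: only true for user-requested promisc mode. */
bool
want_promisc_buggy(uint32_t if_flags)
{
	return ((if_flags & X_IFF_PPROMISC) != 0);
}
```

When an in-kernel consumer such as bpf(4) or if_bridge(4) enables promiscuous mode, only the first flag is set, so the mistaken test would leave the hardware filtering packets.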
Kevin Bowling
516d92304a igc: Improve a comment and update copyright dates
MFC after:	1 week
2024-10-12 00:53:57 -07:00
Kevin Bowling
bc9402abdd igc: Add AIM
igc is derived from igb and has never had an AIM implementation. The
same algorithm from e1000 is appropriate here.

Upon more detailed study of the Linux driver which has a newer AIM
implementation, it finally became clear to me this is actually a
holdoff timer and not an interrupt limit as it is conventionally
(statically) programmed and displayed as an interrupt rate. The data
sheets also make this somewhat clear.

Thus, AIM accomplishes two beneficial things for a wide variety of
workloads[1]:

1. At low throughput/packet rates, it will significantly lower latency
(by counter-intuitively "increasing" the interrupt rate.. better
thought of as decreasing the holdoff timer because you will modulate
down before coming anywhere near these interrupt rates).
2. At bulk data rates, it is tuned to achieve a lower interrupt rate
(by increasing the holdoff timer) than the current static 8000/s. This
decreases processing overhead and yields more headroom for other work
such as packet filters or userland.

For a single NIC this might be worth a few sys% on common CPUs, but may
be meaningful when multiplied such as if_lagg, if_bridge and forwarding
setups.

The AIM algorithm was re-introduced from the older igb or out of tree
driver, and then modernized with permission to use Intel code from other
drivers.

[1]: http://iommu.com/datasheets/ethernet/controllers-nics/intel/e1000/gbe-controllers-interrupt-moderation-appl-note.pdf

MFC after:	1 week
Relnotes:	yes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D47053
2024-10-12 00:36:39 -07:00
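The point in bc9402abdd that the "interrupt rate" is really a holdoff timer can be made concrete with a unit conversion. This sketch only shows the arithmetic; the function names are hypothetical and the hardware's actual register units are not modelled.

```c
#include <stdint.h>

/* Minimum spacing between interrupts, in microseconds, implied by a
 * maximum interrupt rate. ints_per_sec must be non-zero. */
uint32_t
rate_to_holdoff_us(uint32_t ints_per_sec)
{
	return (1000000u / ints_per_sec);
}

/* Upper bound on interrupts per second for a given holdoff.
 * holdoff_us must be non-zero. */
uint32_t
holdoff_us_to_rate(uint32_t holdoff_us)
{
	return (1000000u / holdoff_us);
}
```

The static 8000/s setting corresponds to a 125 microsecond holdoff. Shortening the holdoff at low packet rates lowers latency without the interrupt rate actually approaching the nominal limit, while lengthening it at bulk rates batches more work per interrupt.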
Warner Losh
0bd23ca0ee uart: Fix cut-n-paste error in DBG2 code
This code is parsing the DBG2 ACPI table, not the SPCR table, so tweak
the comment.

Sponsored by:		Netflix
2024-10-11 15:27:46 -06:00
Warner Losh
9bb56359e6 uart: Document rw:XXX field of hw.uart.console
Add a one-liner description of rw - Register Width added in eae36de826.

Fixes: eae36de826
Sponsored by:		Netflix
2024-10-11 15:27:46 -06:00
Warner Losh
852233cf76 uart: Small style tweak
Use if (err == 0) rather than if (!err) to follow style(9) and also the
rest of the file.

Sponsored by:		Netflix
2024-10-11 15:27:46 -06:00
Kevin Bowling
3e501ef896 e1000: Re-add AIM
We originally left this out because iflib modulates interrupts and
accomplishes some level of batching versus the custom queues in the
older driver. Upon more detailed study of the Linux driver which has a
newer implementation, it finally became clear to me this is actually a
holdoff timer and not an interrupt limit as it is conventionally
(statically) programmed and displayed as an interrupt rate. The data
sheets also make this somewhat clear.

Thus, AIM accomplishes two beneficial things for a wide variety of
workloads[1]:

1. At low throughput/packet rates, it will significantly lower latency
(by counter-intuitively "increasing" the interrupt rate.. better
thought of as decreasing the holdoff timer because you will modulate
down before coming anywhere near these interrupt rates).
2. At bulk data rates, it is tuned to achieve a lower interrupt rate
(by increasing the holdoff timer) than the current static 8000/s. This
decreases processing overhead and yields more headroom for other work
such as packet filters or userland.

For a single NIC this might be worth a few sys% on common CPUs, but may
be meaningful when multiplied such as if_lagg, if_bridge and forwarding
setups.

The AIM algorithm was re-introduced from the older igb or out of tree
driver, and then modernized with permission to use Intel code from other
drivers.

I have retroactively added it to lem(4) and em(4) where the same concept
applies, albeit to a single ITR register.

[1]: http://iommu.com/datasheets/ethernet/controllers-nics/intel/e1000/gbe-controllers-interrupt-moderation-appl-note.pdf

Tested by:	cc (https://wiki.freebsd.org/chengcui/testD46768)
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D46768
2024-10-10 22:36:43 -07:00
Konstantin Belousov
8e5b07dd08 mlx5_ipsec: add enough #ifdef IPSEC_OFFLOAD to make LINT_NOIP compilable
Reported by:	kp
Sponsored by:	NVidia networking
Fixes:	2851aafe96
2024-10-10 16:18:11 +03:00
Konstantin Belousov
2851aafe96 mlx5 ipsec_offload: ensure that driver does not dereference dead sahindex
Take the sahtree rlock and check for the DEAD SA state before validating
and filling the SA xfrm attributes.

Sponsored by:	NVidia networking
2024-10-10 12:55:45 +03:00
Roger Pau Monné
e7fe856437 xen/blk{front,back}: fix usage of sector sizes different than 512b
The units of the size reported in the 'sectors' xenbus node are always 512b,
regardless of the value of the 'sector-size' node.  The sector offsets in
the ring requests are also always based on 512b sectors, regardless of the
'sector-size' reported in xenbus.

Fix both blkfront and blkback to assume 512b sectors in the required fields.

The blkif.h public header has been recently updated in upstream Xen repository
to fix the regressions in the specification introduced by later modifications,
and clarify the base units of xenstore and shared ring fields.

PR: 280884
Reported by: Christian Kujau
MFC after: 1 week
Sponsored by: Cloud Software Group
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D46756
2024-10-08 09:29:13 +02:00
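The unit handling fixed in e7fe856437 can be sketched as follows. The helper names and the `XEN_UNIT` constant are hypothetical; the only fact taken from the commit message is that the 'sectors' node and the ring sector offsets are always in 512-byte units, whatever 'sector-size' says.

```c
#include <stdint.h>

#define XEN_UNIT 512u	/* fixed unit of 'sectors' and ring offsets */

/* Disk size in bytes from the 'sectors' xenbus node. */
uint64_t
disk_size_bytes(uint64_t sectors_node)
{
	return (sectors_node * XEN_UNIT);
}

/* Number of logical sectors of size sector_size on the disk. */
uint64_t
logical_sectors(uint64_t sectors_node, uint32_t sector_size)
{
	return (sectors_node * XEN_UNIT / sector_size);
}

/* Convert a logical block number into the 512-byte based sector
 * number placed in a ring request. */
uint64_t
ring_sector_number(uint64_t logical_block, uint32_t sector_size)
{
	return (logical_block * (sector_size / XEN_UNIT));
}
```

For a disk advertising 4096-byte sectors, a 'sectors' value of 16 therefore means 8192 bytes (two logical sectors), and logical block 3 is addressed as sector 24 on the ring.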
Warner Losh
6c711019f2 nvme: Don't create sysctl for io queues not created
When we can't set the number of I/O queues on the admin queue, we
continue on. However, we don't create the I/O queue structures, so
having pointers (NULL) into them for sysctls makes no sense and leads to
a panic when accessed. When summing up different stats, also skip the
ioq stats when it's NULL.

Sponsored by:		Netflix
2024-10-07 22:22:40 -06:00
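The NULL check that 6c711019f2 adds when summing statistics follows a common pattern, sketched here with hypothetical types. `struct queue_stats` and `sum_cmds` are illustrative names and do not reflect the nvme(4) driver's actual structures.

```c
#include <stddef.h>
#include <stdint.h>

struct queue_stats {
	uint64_t num_cmds;
};

/* Sum a counter over the admin queue and all I/O queues. The I/O
 * queue array may be NULL if the queues were never created. */
uint64_t
sum_cmds(const struct queue_stats *adminq, const struct queue_stats *ioq,
    size_t num_io_queues)
{
	uint64_t total = adminq->num_cmds;

	if (ioq != NULL) {
		for (size_t i = 0; i < num_io_queues; i++)
			total += ioq[i].num_cmds;
	}
	return (total);
}
```

Without the check, a controller that failed to set up its I/O queues would dereference a NULL pointer as soon as the aggregate counter was read.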
Navdeep Parhar
52e5a66eac cxgbe(4): Use correct synchronization when marking the adapter offline.
adapter->flags are guarded by a synch_op, as noted in the comment in
adapter.h where the flags are defined.

Fixes:	5241b210a4 cxgbe(4): Basic infrastructure for ULDs to participate in adapter reset.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2024-10-07 10:25:53 -07:00
Roger Pau Monné
9a73b5b1e8 xen: remove PV suspend/resume support copyright
The code for PV suspend/resume support has long been removed, also remove the
copyright notice associated with it.

There are still two copyright blocks with (to my understanding) slightly
different wordings of the BSD 2 clause license.  I however don't feel like
merging them due to those wording differences.

The removal of the PV suspend/resume code was done in
ed95805e90.

Sponsored by: Cloud Software Group
Reviewed by: imp
Differential revision: https://reviews.freebsd.org/D46860
2024-10-07 18:59:45 +02:00
Roger Pau Monné
9dd5105f22 xen: expose support for poweroff/reboot/suspend on xenbus
Some toolstacks won't attempt to signal power actions on xenbus unless the VM
explicitly exposes support for them.  FreeBSD supports all power actions, hence
signal such support on xenbus by setting the nodes to the value of "1".

Sponsored by: Cloud Software Group
Reviewed by: markj
Differential review: https://reviews.freebsd.org/D46859
2024-10-07 18:59:45 +02:00
Florian Walpen
e0c37c160b snd_hdsp(4): Support AO4S-192 and AI4S-192 extension boards.
Create an additional 4 channel pcm device for RME HDSP 9632 sound cards,
to support the optional AO4S-192 and AI4S-192 extension boards. For
simplicity, the <HDSP 9632 [ext]> pcm device is always present, even if
the extension boards are not installed.

Unfortunately I cannot test this with actual hardware, but I made sure
the additional channels do not affect the functionality of the HDSP 9632
as currently in src.

Reviewed by: christos, br
Differential Revision: https://reviews.freebsd.org/D46837
2024-10-04 19:51:49 +01:00