Change 4787572d05 made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().
No functional change intended.
Reviewed by: kp, imp, glebius, stevek
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D45740
(cherry picked from commit aa3860851b9f6a6002d135b1cac7736e0995eedc)
Some platforms require an adjustment of the ethernet hearders. Rather
than make this be on __NO_STRICT_ALIGNMENT being defined, define
VTNET_ETHER_ALIGN to be either 0 or ETHER_ALIGN (aka 2). Add a test to
the if statements to only do them when != 0. This eliminates the #ifdef
sprinkled in the code, still communicates the intent and gives the same
compiled results.
Sponsored by: Netflix
Reviewed by: bz, bryanv
Differential Revision: https://reviews.freebsd.org/D43654
(cherry picked from commit 0ea4b4084845bfeedc8c692e4d34252023b78cb3)
While we account for the padding in the length of the mbuf we use, we do
not account for it when we 'guess' the size of the mbuf to allocate
based in the MTU of the device. This leads to a situation where we might
fail if the mtu is close to a bucket size (say 2018) such that the added
padding would push us over the edge for a full-sized packet. mtu of 2018
is super rare (2016 and 2020 would both work), but fix it none-the-less.
It's a shame we can't just set VTNET_RX_HEADER_PAD to 2 in this case. The 4
seems hard-coded somewhere I've not found documented (I think it's in the
protocol given the comments about VIRTIO_F_ANY_LAYOUT).
Sponsored by: Netflix
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D43656
(cherry picked from commit d9e0e42627613b56abf0f8fa1ad601e5690d775c)
If the header that we add to the packet's size is 0 % 4 and we're
strictly aligning, then we need to adjust where we store the header so
the packet that follows will have it's struct ip header properly
aligned. We do this on allocation (and when we check the length of the
mbufs in the lro_nomrg case). We can't just adjust the clustersz in the
softc, because it's also used to allocate the mbufs and it needs to be
the proper size for that. Since we otherwise use the size of the mbuf
(or sometimes the smaller size of the received packet) to compute how
much we can buffer, this ensures no overflows. The 2 byte adjustment
also does not affect how many packets we can receive in the lro_nomrg
case.
PR: 271288
Sponsored by: Netflix
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D43224
(cherry picked from commit 3be59adbb5a2ae7600d46432d3bc82286e507e95)
If the host doesn't announce VIRTIO_NET_F_CTRL_RX we cannot disable all
multicast traffic. Previously we'd refuse to set the IFF_ALLMULTI flag,
which is the exact opposite of what is actually happening.
This broke things such as igmpproxy.
See also: https://redmine.pfsense.org/issues/14301
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41356
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.
This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch. A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.
Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH. But the tcp_lro_flush_all() asserts the presence
of network epoch. Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.
Reviewed by: zlei, gallatin, erj
Differential Revision: https://reviews.freebsd.org/D39510
Disable software LRO during kernel dumping, because having it enabled
requires to be in a network epoch, which might or might not be the
case depending on the code path resulting in the panic.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D34787
ALTQ only works on network drivers which use if_start (rather than
if_transmit). vtnet uses if_start if built with VTNET_LEGACY_TX. Default
to that the kernel is built with ALTQ enabled, to reduce user surprise.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
The decision whether a TCP packet is sent over IPv4 or IPv6 was
based on ethertype, which works correctly. In D27926 the criteria
was changed to checking if the CSUM_IP_TSO flag is set in the
csum-flags and then considering it to be TCP/IPv4.
However, the TCP stack sets the flag to CSUM_TSO for IPv4 and IPv6,
where CSUM_TSO is defined as CSUM_IP_TSO|CSUM_IP6_TSO.
Therefore TCP/IPv6 packets gets mis-classified as TCP/IPv4,
which breaks TSO for TCP/IPv6.
This patch bases the check again on the ethertype.
This fix will be MFC instantly as discussed with re(gjb).
MFC after: instantly
PR: 254366
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D29331
Rather than have every device register itself for both virtio_pci and
virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both,
merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the
latter register for both buses. This also has the benefit of abstracting
away the available transports and their names.
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D28073
Try to standardize how drivers negotiate feature and the
function names
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27930
- Add and fix a few error path counters
- Improve sysctl descriptions
- Use flags consistently to determine IPv4 vs IPv6
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27926
Prior to V1, the driver would enable interrupts and then notify the
host that DRIVER_OK. Since for V1, DRIVER_OK needs to be set before
notifying the virtqueues, there may be items in the queues waiting
to be processed by the time interrupts are enabled.
This fixes a bug where the Rx queue would appear stuck, only being
usable after an interface down/up cycle.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27922
For multiqueue, we may use fewer than the provided maximum number of
queues. Try to limit allocations of the unused queues: no interrupts,
no indirect descriptors, and no taskqueues.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27921
Verify the max_virtqueue_pairs is within the range allowed by
the spec.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27920
This useful when running on hosts that support checksum offloading
but not the GUEST_TSO (LRO) feature. Or potentially, some GRO-like
support when doing forwarding.
Only enable SW LRO when the host LRO is not available since both
tends to be harmful, and difficult to enable/disable selectively
with only a single IFCAP_LRO flag.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27919
This allows the Rx checksum and LRO to be modified without a full
reinit of the device.
Remove IFCAP_RXCSUM_IPV6 from the interface capabilities since in
VirtIO Rx checksums are just enabled or disabled for all protocols.
Properly update IFCAP_LRO if LRO is becomes disabled when Rx
checksums are disabled.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27916
In modern VirtIO, the virtqueues cannot be notified before setting
DRIVER_OK status.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27932
Defer the ether_ifattach until the interface capabilities
are configured
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27913
This improves spec compliance because the driver is not suppose
to notify the device prior to setting the DRIVER_OK status, which
could happen with the VIRTIO_NET_F_CTRL_MAC_ADDR.
The VIRTIO_NET_F_MAC feature should always be negotiated so would
be a rare situation.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27910
This may have been required in an early, early, early version of the
specification but I cannot find any reference to it, and a promiscuous
default seems very odd so remove this code.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27909
This features lets the guest driver know the speed and duplex of
the "link". Instead of trying to support many media types based
on the possible/likely speeds/duplexes, only use the speed to
set the interface baudrate.
Cleanup ifmedia code to match other drivers.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27908
This feature lets the guest driver know the maximum MTU size
supported by the host device. If set, use this to limit the
acceptable MTUs, and improve how the receive mbuf cluster size
then is selected.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27907
- Fix the NEEDS_CSUM and DATA_VALID checksum flags. The NEEDS_CSUM
checksum is incomplete (partial) so offer a fallback for the driver
to calculate the checksum. Simplify DATA_VALID because we know
the host has validated the checksum.
- Default 4K mbuf clusters for mergeable buffers. May need to
scale this down to 2K clusters in certain configurations such
many queue pairs, big queues (like 4096 in GCP), and low memory.
- Use the MTU when calculated the receive mbuf cluster size
when not doing TSO/LRO. This will need more adjustment once
the MTU feature is supported.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27906