These variables are tunables, so in principle they never change at runtime.
That would mean they don't need to be tracked per-vnet.
However, they both can be decreased (back to their default values) if the
memory allocations for their respective tables fail, and these allocations are
per-vnet. That is, it's possible for a few vnets to be started and have the
tuned size for the hash and srchash tables only to have later vnets fail the
initial allocation and fall back to smaller allocations. That would confuse
the previously created vnets (because their actual table size and size/mask
variables would no longer match).
Avoid this by turning these into per-vnet variables.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
The extensions allow to restrict interface where SP or SA are offloaded,
and to receive software and hardware offload counters for given SA.
Sponsored by: NVIDIA networking
Differential revision: https://reviews.freebsd.org/D44316
Similarly, mtu is needed to decide inline IPSEC offloiad for the driver.
Sponsored by: NVIDIA networking
Differential revision: https://reviews.freebsd.org/D44224
The information about the interface is needed to coordinate inline
offloading of IPSEC processing with corresponding driver.
Sponsored by: NVIDIA networking
Differential revision: https://reviews.freebsd.org/D44223
The mbuf flag M_HASFCS was introduced for drivers to indicate the net
stack that packets include FCS (Frame Check Sequence). In principle, to
be efficient, FCS should always be processed by hardware, firmware, or
at last sort the driver. Well, Ethernet specifies that damaged frames
should be discarded, thus only good ones will be passed up to the net
stack, then it makes no senses for the net stack to see FCS just to trim
it.
The last consumer of the flag M_HASFCS has been removed since change [1].
It is time to retire it.
1. 105a4f7b3c ng_atmllc: remove
Reviewed by: kp
MFC after: never
Differential Revision: https://reviews.freebsd.org/D42391
Some drivers, e.g. if_enc(4), only allow one instance to be created, but
the KPI ifc_attach_cloner() treat zero value of maxunit as not limited,
aka IF_MAXUNIT.
Introduce a new flag IFC_F_LIMITUNIT to indicate that the requested
maxunit is limited and should be respected.
Consumers should use the new flag if there is an intended limit.
Reviewed by: glebius
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D45757
On clone creating, either failure from vxlan_set_user_config() or
ifc_copyin() will result in leaking previous allocated counters.
Since counter_u64_alloc(M_WAITOK) never fails, make vxlan_stats_alloc()
void and move the allocation for counters below checking ifd->params to
avoid memory leak.
Reviewed by: kp, glebius
Fixes: b092fd6c97 if_vxlan(4): add support for hardware assisted checksumming, TSO, and RSS
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45822
The encapsulated (original) frame does not count in FCS as per Section 5
of RFC 7348.
Reviewed by: afedorov, bryanv, #network
Fixes: b7592822d5 Allow set MTU more than 1500 bytes
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45195
Since change [1], if_bpf will not be detached by the interface departure
eventhandler and will not be NULL. Then the logic to re-attach if_bpf
becomes dead and serves no purpose any more.
This partially reverts commit 05fc416403.
1. 9ce40d321d bpf: Fix incorrect cleanup
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45599
Change 4787572d05 made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().
No functional change intended.
Reviewed by: kp, imp, glebius, stevek
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D45740
This improves readability a little. As a side effect, a redundant
CURVNET_RESTORE is removed.
No functional change intended.
Reviewed by: glebius
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45595
While here remove a pointless static local variable lo_cloner.
No functional change intended.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45728
When a lagg interface is destroyed, it destroys all of the lagg ports,
which triggers an asynchronous link state change handler. This in turn
may generate a netlink message, a portion of which requires netlink to
invoke the SIOCGIFMEDIA ioctl of the lagg interface, which involves
scanning the list of interface media. This list is not internally
locked, it requires the interface driver to provide some kind of
synchronization.
Shortly after the link state notification has been raised, the lagg
interface detaches itself from the network stack. As a part of this, it
blocks in order to wait for link state handlers to drain, but before
that it destroys the interface media list. Reverse this order of
operations so that the link state change handlers drain first, avoiding
a use-after-free that is very occasionally triggered by lagg stress
tests. This matches other ethernet drivers in the tree.
MFC after: 2 weeks
State keys are trivially const in lookup routines, so annotate them as
such. No functional change intended.
Reviewed by: kp
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Modirum
Differential Revision: https://reviews.freebsd.org/D45671
The machinery to support 64-bit counters even on 32-bit kernels had a
bug where it would unitentionally truncate the value back to 32-bits
when transferring to a new counter. This resulted in buggy be behavior
on 64-bit kernels as well.
Sponsored by: Rubicon Communications, LLC ("Netgate")
This function was introduced in commit [1] and is actually used as a
boolean function although it was not defined as so.
No functional change intended.
1. 16d878cc99 Fix the following bpf(4) race condition which can result in a panic
Reviewed by: markj, kp, #network
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45509
777a4702c changed how we copy out the anchor_call string, and
incorrectly limited it to 8 (4 on 32-bit systems) bytes. Fix that so we
get the full anchor path, rather than just the first few characters.
PR: 279225
Sponsored by: Rubicon Communications, LLC ("Netgate")
If we fail to change the vlan id we have to undo the removal (and vlan id
change) in the error path. Otherwise we'll have removed the vlan object from the
hash table, and have the wrong vlan id as well. Subsequent modification attempts
will then try to remove an entry which doesn't exist, and panic.
Undo the vlan id modification if the insertion in the hash table fails, and
re-insert it under the original vlan id.
PR: 279195
Reviewed by: zlei
MFC atfer: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D45285
User misconfiguration, either tunnel loops, or a large number of
different nested tunnels, can overflow the kernel stack. Prevent that
by using if_tunnel_check_nesting().
PR: 278394
Diagnosed by: markj
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45197
User misconfiguration may lead to routing loops where we try to send the tunnel
packet into the tunnel. This eventually leads to stack overflows and panics.
Avoid this using if_tunnel_check_nesting(), which will drop the packet if we're
looping or we hit three layers of nested tunnels.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Rely on LAGG_SLOCK() instead. The use of network epoch(9) here was added
in 6573d7580b (later tidied by 87bf9b9cbe) as a large sweep that
blindly substituted blocking kernel primitives with epoch(9). In these
particular code paths use of epoch(9) is incorrect and doesn't provide any
protection against a stale pointer. Recent fix 48698ead6f, which should
actually have removed the epoch use, created a potential sleeping in epoch
problem.
Based on the old submission from asomers@. With modern state of locking
in lagg(4), the patch got much simplier. Enable the test that was
waiting for this change.
PR: 226144
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D44605
There are situations where an struct ifnet has a NULL if_ioctl pointer.
For example, e6000sw creates such struct ifnets for each of its ports so it can
call into the MII code.
If there is then a link state event this calls do_link_state_change()
-> rtnl_handle_ifevent() -> dump_iface() -> get_operstate() ->
get_operstate_ether(). That wants to know if the link is up or down, so it tries
to ioctl(SIOCGIFMEDIA), which doesn't go well if if_ioctl is NULL.
Guard against this, and return EOPNOTSUPP.
PR: 275920
MFC ater: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
And more comments on the #ifdef INET blocks to improve readability.
While here, revert the order of two prototypes to produce minimal diff
compared to stable branches.
MFC with: 65767e6126
This is used by 802.3 Ethernet. (Also be used by 802.4 Token Bus and
802.5 Token Ring, but we don't support those.)
This was accidentally removed along with FDDI support in commit
0437c8e3b1, presumably because comments implied it was used only by
FDDI or Token Ring.
Fixes: 0437c8e3b1 ("Remove support for FDDI networks.")
Reviewed-by: emaste
Signed-off-by: Denny Page <dennypage@me.com>
Pull-request: https://github.com/freebsd/freebsd-src/pull/1166
- add const to the appropriate places in the libipsec public API and the
relevant internal functions needed to support that.
- replace caddr_t with c_caddr_t in ipsec_dump_policy()
- update the ipsec_dump_policy manpage to use c_caddr_t (this manpage
was already wrong as it had "char *" instead of caddr_t previously).
While here, update pfkeyv2.h to not cast away const in the PFKEY_*()
macros.
This should not cause any ABI changes as the actual types have not
changed.
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1099
The if_bridge contains several instances of:
if (AF_INET code ...
#ifdef INET6
AF_INET6 code ...
#endif
) {
...
Clean this up by adding a couple of macros at the top of the file that
are conditionally defined based on whether INET and/or INET6 are enabled,
which makes the code more readable and easier to maintain.
No functional change intended.
Reviewed by: zlei, markj
MFC after: 1 week
Pull Request: https://github.com/freebsd/freebsd-src/pull/1191
The pseudo_AF_HDRCMPLT check is already being done in if_loop and
just needed to be ported over to if_ic, if_wg, if_disc, if_gif,
if_gre, if_me, if_tuntap and ng_iface. This is needed in order to
allow these interfaces to work properly with e.g., tcpreplay.
PR: 256587
Reviewed by: markj
MFC after: 2 weeks
Pull Request: https://github.com/freebsd/freebsd-src/pull/876
The ice(4) driver will add the ability to create extra interfaces
that hang off of the base interface; to do that the driver requires
a method for the subinterface to request hardware interrupt resources
from the base interface.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D39930
Intended to be used with upcoming feature to add sub-interfaces, since
those new interfaces will be dynamically created and will need to have
spare MSI-X interrupts already allocated for them on driver load.
This sysctl is marked as a tunable since it will need to be set before
the driver is loaded since MSI-X interrupt allocation and setup is
done during the attach process.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D41326
Some of the QUAD sysctls are actually for unsigned quad values.
Switch to using UQUAD instead, as that is meant for unsigned.
Reviewed by: erj, jhb
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44620
Both the mbuf length and the total packet length are signed.
While here, update a stall comment to reflect the current practice.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42390
There is no need for a separate structure neither for a linked list.
Provide each VNET with an array of pointers to if_clone that has the same
size as the driver list.
Reviewed by: zlei, kevans, kp
Differential Revision: https://reviews.freebsd.org/D44307