Commit graph

5239 commits

Author SHA1 Message Date
Gleb Smirnoff
0fac350c54 sockets: don't malloc/free sockaddr memory on getpeername/getsockname
Just like it was done for accept(2) in cfb1e92912, use the same approach
for two simpler syscalls that return socket addresses.  Although
these two syscalls aren't performance critical, this change generalizes
some code between the 3 syscalls, trimming code size.

Following the example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.

Reviewed by:		tuexen
Differential Revision:	https://reviews.freebsd.org/D42694
2023-11-30 08:31:10 -08:00
Kristof Provost
44f323ecde pf: implement DIOCGETRULES via netlink
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-11-27 21:36:49 +01:00
Warner Losh
fdafd315ad sys: Automated cleanup of cdefs and other formatting
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by:		Netflix
2023-11-26 22:24:00 -07:00
Warner Losh
29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Michael Tuexen
99c79cab42 if_tuntap: add LRO support to tap devices
This allows testing the LRO code with packetdrill in local mode.

Reviewed by:		rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D42548
2023-11-19 15:57:53 +01:00
Kristof Provost
7093414c63 pf: sctp heartbeats confirm a connection
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) the new connection will
never see a COOKIE/COOKIE_ACK exchange. We should consider HEARTBEAT_ACK
to be a confirmation that the connection is established.

This ensures that such connections do not time out earlier than
expected.

MFC after:	1 week
Sponsored by:	Orange Business Services
2023-11-17 23:33:44 +01:00
Michael Tuexen
44669b7650 if_tuntap: remove redundant check
eh can't be NULL, so there is no need to check for it.
Reported by:	zlei
MFC after:	1 week
Sponsored by:	Netflix, Inc.
2023-11-09 11:43:54 +01:00
Michael Tuexen
ff69d13a50 if_tuntap: support receive checksum offloading for tap interfaces
When enabled, pretend that the IPv4 and transport layer checksum
is correct for packets injected via the character device.
This is a prerequisite for adding support for LRO, which will
be added next. Then packetdrill can be used to test the LRO
code in local mode.

Reviewed by:		rscheff, zlei
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D42477
2023-11-09 11:37:27 +01:00
Michael Tuexen
35af22ac98 if_tuntap: trigger the bpf hook on transmitting for the tap interface
The tun interface triggers the bpf hook when a packet is transmitted,
the tap interface triggers it when the packet is read from the
character device. This is inconsistent.
So fix the tap device such that it behaves like the tun device.
This is needed for adding support for the tap device to packetdrill.

Reviewed by:		kevans, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D42467
2023-11-05 20:32:46 +01:00
Michael Tuexen
4ffe410e40 if_tuntap: improve code consistency
No functional change intended.

Reviewed by:		rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D42462
2023-11-04 10:22:42 +01:00
Kristof Provost
586fae3cd2 Revert "pf: remove COMPAT_FREEBSD14 #ifdef from pfvar.h"
This reverts commit 9eff639071.

The libpfctl port has been fixed (to avoid using DIOCGETSTATESV2), so we
can now safely revert this.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-30 09:17:56 +01:00
Kristof Provost
4f33755051 pf: allow states to be killed by their pre-NAT address
If a connection is NAT-ed we could previously only terminate it by its
ID or the post-NAT IP address. Allow users to specify they want to look for
the state by its pre-NAT address. Usage: `pfctl -k nat -k <address>`.

See also:	https://redmine.pfsense.org/issues/11556
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42312
2023-10-23 16:37:05 +02:00
Kristof Provost
ffbf25951e pf: convert rule addition to netlink
The nvlist-based version will be removed in FreeBSD 16.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42279
2023-10-23 16:24:51 +02:00
Zhenlei Huang
7a974a6498 bpf: Make dead_bpf_if const
The dead_bpf_if is not supposed to be written to. Make it const so that
an errant write to it makes the kernel panic instead of silently
corrupting memory.

No functional change intended.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42189
2023-10-21 12:52:27 +08:00
Kristof Provost
9eff639071 pf: remove COMPAT_FREEBSD14 #ifdef from pfvar.h
When userspace includes pfvar.h it doesn't get the kernel's COMPAT_*
defines, so we end up not having required symbols in userspace. This
caused the libpfctl port to fail to build.

libpfctl will be updated to use the new netlink-based state export code
soon, which will also fix this build issue.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-19 16:19:39 +02:00
Gleb Smirnoff
28f6910714 net/route: properly brace the RT_LOG() macro
2023-10-18 22:21:53 -07:00
R. Christian McDonald
6e281255ea lltable: fix ddb show llentry l3_addr pretty printer
The ddb commands for lltable do not produce useful l3_addr information.

This fixes the llentry pretty printer to correctly display the l3_addr.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42253
2023-10-17 19:03:49 +02:00
Justin Hibbits
8f31b879ec bpf: Add IfAPI analogue for bpf_peers_present()
An interface's bpf could feasibly not exist, in which case
bpf_peers_present() would panic from a NULL pointer dereference.  Solve
this by adding a new IfAPI that could deal with a NULL bpf, if such
could occur in the network stack.

Reviewed by:	zlei
Sponsored by:	Juniper Networks, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42082
2023-10-13 14:33:31 -04:00
Justin Hibbits
5e444deec0 Revert "bpf: Add IfAPI analogue for bpf_peers_present()"
This reverts commit c81dd8e5fe.

Commit message needs revised.
2023-10-13 14:33:31 -04:00
Justin Hibbits
c81dd8e5fe bpf: Add IfAPI analogue for bpf_peers_present()
An interface's bpf could feasibly not exist, in which case
bpf_peers_present() would panic from a NULL pointer dereference.  Solve
this by adding a new IfAPI that includes a NULL check.  Since this API
is used in only a handful of locations, it reduces the NULL check
scope over inserting the check into bpf_peers_present().

Sponsored by:	Juniper Networks, Inc.
MFC after:	1 week
2023-10-13 13:12:44 -04:00
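The NULL-checking wrapper described in c81dd8e5fe can be sketched roughly as follows. This is a minimal userland illustration under stated assumptions, not the kernel code: the struct and function names (`bpf_if_sketch`, `ifnet_sketch`, `peers_present_sketch`, `if_peers_present_sketch`) are hypothetical stand-ins for the real types and API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-interface bpf state. */
struct bpf_if_sketch {
	int	listeners;	/* number of attached listeners */
};

/* Hypothetical stand-in for an interface; its bpf pointer may be NULL. */
struct ifnet_sketch {
	struct bpf_if_sketch	*bpf;
};

/* Unchecked helper: dereferences bp, so callers must pass non-NULL. */
static bool
peers_present_sketch(const struct bpf_if_sketch *bp)
{
	return (bp->listeners > 0);
}

/* Interface-level wrapper: performs the NULL check in one place. */
static bool
if_peers_present_sketch(const struct ifnet_sketch *ifp)
{
	return (ifp->bpf != NULL && peers_present_sketch(ifp->bpf));
}

/* Usage: an interface without bpf state reports "no peers" safely. */
static void
nullcheck_usage_example(void)
{
	struct bpf_if_sketch bp = { .listeners = 1 };
	struct ifnet_sketch with_bpf = { .bpf = &bp };
	struct ifnet_sketch without_bpf = { .bpf = NULL };

	assert(if_peers_present_sketch(&with_bpf));
	assert(!if_peers_present_sketch(&without_bpf));
}
```

Keeping the check in the wrapper leaves the unchecked helper untouched for callers that already know the pointer is valid.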
Kristof Provost
81647eb60e pf: implement start/stop calls via netlink
Implement equivalents to DIOCSTART and DIOCSTOP in netlink. Provide a
libpfctl implementation and add a basic test case, mostly to verify that
we still return the same errors as before the conversion.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42145
2023-10-13 09:53:22 +02:00
Kristof Provost
ab393e9548 netlink: move NETLINK define to opt_global.h
Move the NETLINK define into opt_global.h so we can rely on it being
set correctly, without having to remember to include opt_netlink.h.
This ensures that the NETLINK define is correctly set. If not we
may end up with unloadable modules, due to missing symbols (such as
nlmsg_get_group_writer).

PR:		274306
Reviewed by:	imp, markj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D42179
2023-10-13 09:23:47 +02:00
Kristof Provost
ebfd3b229a pf: move DIOCGETSTATES(V2) to COMPAT_FREEBSD14
We now have an improved version (via netlink). The old-style ioctl will
be removed in FreeBSD 16.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42101
2023-10-10 11:48:22 +02:00
Konstantin Belousov
27f1ec0be2 tun/tap: correct ref count on cloned cdevs
Reported and tested by:	eugen
PR:	273418
Discussed with:	jah, kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42008
2023-10-10 02:36:59 +03:00
Zhenlei Huang
21a722d959 rtsock: Add sysctl flag CTLFLAG_TUN to loader tunable
The sysctl variable `net.route.netisr_maxqlen` is actually a loader
tunable. Add sysctl flag CTLFLAG_TUN to it so that `sysctl -T` will
report it correctly.

No functional change intended.

Reviewed by:	glebius
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D41928
2023-09-25 18:10:46 +08:00
Zhenlei Huang
cf7974fd9e sysctl: Update 'master' copy of vnet SYSCTLs on kernel environment variables change
Complete phase three of 3da1cf1e88.

With commit 110113bc08, vnet sysctl variables can be loader tunable
but the feature is limited. Once the kernel modules have been initialized,
any changes (e.g. via kenv) to kernel environment variables will not affect
subsequently created VNETs.

This change relaxes the limitation by listening for kernel environment
variable set / unset events, and then updating the 'master' copy of the vnet
SYSCTL or restoring it to its initial value.

With this change, TUNABLE_XXX_FETCH can be greatly eliminated for vnet
loader tunables.

Reviewed by:	glebius
Fixes:	110113bc08 sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D41825
2023-09-21 12:11:28 +08:00
Dag-Erling Smørgrav
9a071e4e57 Assert that ifnet_detach_sxlock is held where needed.
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D41770
2023-09-08 10:06:11 +00:00
Kristof Provost
4d3af82f78 pf: mark removed connections within a multihome association as shutting down
Parse IP removal in ASCONF chunks, find the affected state(s) and mark
them as shutting down. This will cause them to time out according to
PFTM_TCP_CLOSING timeouts, rather than waiting for the established
session timeout.

MFC after:	3 weeks
Sponsored by:	Orange Business Services
2023-09-07 19:05:01 +02:00
Kristof Provost
51a78dd276 pf: improve SCTP state validation
Only create new states for INIT chunks, or when we're creating a
secondary state for a multihomed association.

Store and verify verification tag.

MFC after:	3 weeks
Sponsored by:	Orange Business Services
2023-09-07 19:05:01 +02:00
Kristof Provost
10aa9ddb4d pf: support SCTP multihoming
SCTP may announce additional IP addresses it'll use in the INIT/INIT_ACK
chunks, or in ASCONF chunks at any time during the connection. Parse these
parameters, evaluate the ruleset for the new connection and if allowed
create the corresponding states.

MFC after:	3 weeks
Sponsored by:	Orange Business Services
Differential Revision:	https://reviews.freebsd.org/D41637
2023-09-07 19:05:00 +02:00
Zhenlei Huang
49d6743da1 net: Check per-flow priority code point for untagged traffic
Commit 868aabb470 introduced per-flow priority. There's a defect in the
logic for untagged traffic: it does not check for M_VLANTAG set in the mbuf
packet header or an MTAG_8021Q/MTAG_8021Q_PCP_OUT tag set by the firewall,
which can result in the desired priority missing from outbound packets.

For mbuf packets with M_VLANTAG in the header, some interfaces happen to work
due to the driver bug mentioned in D39499. As modern interfaces have
VLAN hardware offloading, the defect is barely noticeable unless the
per-flow priority feature is widely tested.

As a side effect of this defect, the soft padding to work around buggy
bridges is bypassed. That may result in a regression if soft padding is
requested.

PR:		273431
Discussed with:	kib
Fixes:	868aabb470 Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D39536
2023-09-06 18:15:14 +08:00
Dag-Erling Smørgrav
b451dcc84f if_vlan: Always default to 802.1q.
There is no reason for this fallback to be conditional on COMPAT_FREEBSD12.

PR:		273539
MFC after:	1 week
Sponsored by:	Klara, Inc.
Sponsored by:	NetApp, Inc.
Reviewed by:	melifaro, allanjude
Differential Revision:	https://reviews.freebsd.org/D41717
2023-09-04 23:26:18 +00:00
Kristof Provost
8d49fd7331 pf: remove DIOCGETRULE and DIOCGETSTATUS
These calls have nvlist variants that completely supersede them.
Remove the old code.

Reviewed by:	mjg
MFC after:	never
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D41651
2023-08-31 10:56:32 +02:00
Zhenlei Huang
b22aae410b net: Remove vlan metadata on pcp / vlan encapsulation
For outbound traffic, the flag M_VLANTAG is set in the mbuf packet header to
tell the underlying interface to do hardware VLAN tag insertion if
capable; otherwise the net stack does 802.1Q encapsulation instead.

Commit 868aabb470 introduced per-flow priority, which sets the priority ID
in the mbuf packet header. There's a corner case: when hardware VLAN tag
insertion is disabled on the driver and the net stack does the 802.1Q
encapsulation, the result is double-tagged packets if the driver does
not check the enabled capability (hardware VLAN tag insertion).

Unfortunately some drivers, currently known to be cxgbe(4), re(4), ure(4),
igc(4) and vmx(4), have this issue. From a quick review of other interface
drivers I believe many more drivers have the same issue. It makes more
sense to fix this in the net stack than to try to change every single driver.

PR:	270736
Reviewed by:	kp
Fixes:	868aabb470 Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D39499
2023-08-30 17:36:38 +08:00
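The double-tagging corner case in b22aae410b, and the idea of removing the vlan metadata after software encapsulation, can be illustrated with a small userland model. Everything here is a hypothetical simplification: `PKT_VLANTAG` stands in for M_VLANTAG, `pkt_sketch` for an mbuf, and the "driver" simply counts the tags it would insert.

```c
#include <assert.h>
#include <stdint.h>

#define PKT_VLANTAG	0x1u	/* hypothetical analogue of M_VLANTAG */

/* Hypothetical packet model: metadata plus a count of tags in the frame. */
struct pkt_sketch {
	unsigned int	flags;		/* metadata flags */
	uint16_t	vtag;		/* tag metadata for hardware insertion */
	int		tags_in_frame;	/* 802.1Q headers present in the frame */
};

/* Software encapsulation: put the tag in the frame, then drop the metadata. */
static void
sw_encap_sketch(struct pkt_sketch *p)
{
	p->tags_in_frame++;
	p->flags &= ~PKT_VLANTAG;
	p->vtag = 0;
}

/*
 * A driver that inserts a tag whenever the flag is set, without checking
 * whether hardware tag insertion is actually enabled.
 */
static void
careless_driver_tx_sketch(struct pkt_sketch *p)
{
	if (p->flags & PKT_VLANTAG)
		p->tags_in_frame++;
}

/* Usage: with the metadata cleared, the frame ends up with a single tag. */
static void
encap_usage_example(void)
{
	struct pkt_sketch p = { .flags = PKT_VLANTAG, .vtag = 11 };

	sw_encap_sketch(&p);
	careless_driver_tx_sketch(&p);
	assert(p.tags_in_frame == 1);
}
```

If `sw_encap_sketch()` left the flag set, the same driver would add a second tag, which is the double-tagged case the commit message describes.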
Kristof Provost
2e8edbc285 pf: Remove DIOCCLRSTATES and DIOCKILLSTATES
These now have nvlist based alternatives, so remove them.

Reviewed by:	mjg, Pau Amma <pauamma@gundo.com> (man page)
MFC after:	never
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30056
2023-08-29 11:01:51 +02:00
Justin Hibbits
2a3716432d IfAPI: Retire if_etherbpfmtap() and if_bpfmtap()
Summary:
These came in the original DrvAPI commits in 2014, and are obsoleted by
bpf_mtap_if() and ether_bpf_mtap_if().  The `_if` suffix, rather than
prefix, conveys that it's operating on the bpf of the interface, rather
than the interface itself.

Reviewed by:	glebius
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D41146
2023-08-25 12:50:14 -04:00
Kevin Bowling
725e4008ef iflib: invert default restart on VLAN changes
In rS360398, a new iflib device method was added to opt out of VLAN
events needing an interface reset.

I am switching the default to not requiring a restart for:
* VLAN events
* unknown events

After fixing various bugs, I do not think this would be a common need
of hardware, and it is undesirable from the user's perspective, causing
link flaps and much slower VLAN configuration. Currently, there are no
other restart events besides VLAN events, and setting the
ifdi_needs_restart default to false will alleviate the need to churn
every driver if an odd event is added in the future for specific
hardware.

markj points out this could cause churn in the other direction; I will
solve that problem with an event registration system as he mentions in
the review should we need it in the future.

These drivers will opt into restart and need further inspection or work:
* ixv (needs code audit, 61a8231 fixed principal issue; re-init probably
not necessary)
* axgbe (needs code audit; re-init probably not necessary)
* iavf - (needs code audit; interaction with Malicious Driver Detection
mentioned in rS360398)
* mgb - no VLAN functions are currently implemented. Left a comment.

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:48:19 -07:00
Kajetan Staszkiewicz
d10de21f2f pf: Access r->rpool.cur->kif under mutex protection
pf_route() sends traffic to a specified next hop over a specific
interface. The next hop is obtained in pf_map_addr() but the interface
is obtained directly via `r->rpool.cur->kif` outside of the lock held in
pf_map_addr() in multiple places around pf. The chosen interface is not
stored in the source node.

Move the interface selection into pf_map_addr(), have the function
return it together with the chosen IP address, ensure it is stored
in the source node (struct pf_ksrc_node), and use the stored
value when needed.

Sponsored by:	InnoGames GmbH
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D41570
2023-08-24 13:05:33 +02:00
Zhenlei Huang
838c8c4786 net: Do not overwrite if_vlan's PCP
In commit c7cffd65c5 the function ether_8021q_frame() was slightly
refactored to use a pointer to struct ether_8021q_tag as the parameter qtag,
to include the new option proto.

It is wrong to write to qtag->pcp as it will effectively change the memory
that qtag points to. Unfortunately the transmit routine of if_vlan passes a
pointer to the member ifv_qtag of its softc, which stores the vlan interface's
PCP internally; when transmitting mbufs that contain a PCP, the vlan
interface's PCP will get overwritten.

Fix by operating on a local copy of qtag->pcp. Also mark 'struct ether_8021q_tag'
as const so that compilers can catch this kind of bug.

PR:	273304
Reviewed by:	kp
Fixes:	c7cffd65c5 Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D39505
2023-08-23 17:53:48 +08:00
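The fix described in 838c8c4786 (work on a local copy of qtag->pcp and take the tag through a const pointer) can be sketched as below. This is a simplified userland illustration, not the kernel's ether_8021q_frame(): `qtag_sketch` and `build_tci_sketch()` are hypothetical names, and a negative `pkt_pcp` is an assumed convention for "the packet carries no PCP of its own".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ether_8021q_tag. */
struct qtag_sketch {
	uint16_t	vid;	/* VLAN ID (12 bits used) */
	uint8_t		pcp;	/* interface's priority code point */
};

/*
 * Build the 16-bit tag control field.  The const qualifier lets the
 * compiler reject any write through qtag; the per-packet override is
 * applied to a local copy only.
 */
static uint16_t
build_tci_sketch(const struct qtag_sketch *qtag, int pkt_pcp)
{
	uint8_t pcp = qtag->pcp;	/* local copy, caller's tag untouched */

	if (pkt_pcp >= 0)
		pcp = (uint8_t)(pkt_pcp & 0x7);
	return ((uint16_t)((pcp << 13) | (qtag->vid & 0x0fff)));
}

/* Usage: a per-packet PCP does not leak into the stored interface PCP. */
static void
qtag_usage_example(void)
{
	struct qtag_sketch softc_tag = { .vid = 11, .pcp = 2 };

	assert(build_tci_sketch(&softc_tag, 5) == 0xa00b);
	assert(softc_tag.pcp == 2);
	assert(build_tci_sketch(&softc_tag, -1) == 0x400b);
}
```

Writing `qtag->pcp = ...` inside `build_tci_sketch()` would fail to compile, which is the class of bug the const qualifier is meant to expose.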
Kristof Provost
949491f2a6 if_ovpn: clear mbuf flags on rx
When we receive a packet and remove the encapsulating layer we should
also clear out protocol flags and any mbuf tags.

If we do not we risk confusing firewalls filtering the tunneled packet.

See also: 	https://redmine.pfsense.org/issues/14682#change-69073
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-08-22 20:30:11 +02:00
Warner Losh
031beb4e23 sys: Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:58 -06:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
dfc016587a sys: Remove $FreeBSD$: two-line .c pattern
Remove /^#include\s+<sys/cdefs.h>.*$\n\s+__FBSDID\("\$FreeBSD\$"\);\n/
2023-08-16 11:54:30 -06:00
Warner Losh
71625ec9ad sys: Remove $FreeBSD$: one-line .c comment pattern
Remove /^/[*/]\s*\$FreeBSD\$.*\n/
2023-08-16 11:54:24 -06:00
Warner Losh
2ff63af9b8 sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:18 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Kevin Bowling
b1a39c31a3 vlan: Respect IFCAP_LRO mask
vlan_capabilities(), used by the IFCAP ioctl, was not respecting the
IFCAP_LRO bit if it was masked by the requestor.

This prevented if_bridge(4) from automasking LRO with a message like:
bridge0: can't disable some capabilities on em3.11: 0x400

This also prevented manually disabling LRO from any vlan interface.

PR:		254596
Reported by:	Paul Vixie <paul@redbarn.org>
MFC after:	1 week
2023-08-12 09:39:23 -07:00
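The behaviour described in b1a39c31a3 can be illustrated with a small bitmask sketch. This is an assumption-laden model, not the actual vlan_capabilities() code: `CAP_LRO_SKETCH` reuses the 0x400 bit from the message above, and both functions are hypothetical.

```c
#include <assert.h>

#define CAP_LRO_SKETCH	0x400u	/* the bit reported in the bridge message */

/* Assumed shape of the old behaviour: LRO follows the parent only. */
static unsigned int
lro_ignoring_mask_sketch(unsigned int parent_enabled, unsigned int requested)
{
	(void)requested;	/* the requested mask is not consulted */
	return (parent_enabled & CAP_LRO_SKETCH);
}

/* Sketch of the intended behaviour: the requested mask is respected. */
static unsigned int
lro_respecting_mask_sketch(unsigned int parent_enabled, unsigned int requested)
{
	return (parent_enabled & requested & CAP_LRO_SKETCH);
}

/* Usage: a requester that masks LRO out gets it disabled on the vlan. */
static void
lro_usage_example(void)
{
	unsigned int parent = CAP_LRO_SKETCH;

	assert(lro_ignoring_mask_sketch(parent, 0) == CAP_LRO_SKETCH);
	assert(lro_respecting_mask_sketch(parent, 0) == 0);
	assert(lro_respecting_mask_sketch(parent, CAP_LRO_SKETCH) ==
	    CAP_LRO_SKETCH);
}
```

In the first function the bit stays set even when the requester cleared it, which matches the "can't disable some capabilities" symptom quoted in the message.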
Kristof Provost
fb69ed397e Revert "if_vlan: do not enable LRO for bridge interaces"
This reverts commit 5f11a33cee.

As requested by Kevin Bowling. He explains:

> The subtle bug was that vlan_capabilities() in if_vlan was not obeying
> the requested mask from its IFCAP ioctl.
2023-08-12 15:56:21 +02:00
Paul Vixie
5f11a33cee if_vlan: do not enable LRO for bridge interaces
If the parent interface is not a bridge and can do LRO and
checksum offloading on VLANs, then guess it may do LRO on VLANs.
A false positive here costs nothing, while a false negative may lead
to some confusion. According to Wikipedia:

"LRO should not operate on machines acting as routers, as it breaks
the end-to-end principle and can significantly impact performance."

The same reasoning applies to machines acting as bridges.

PR:		254596
MFC after:	3 weeks
2023-08-12 00:50:37 +02:00
Eric Joyner
d2dd3d5a98 iflib: Remove redundant variable
In iflib_init_locked(), sctx and scctx both point to the same value,
which is the ifc_softc_ctx field in the iflib softc. Remove the
declaration and assignment to sctx since scctx can be used instead, and
the name of scctx follows the naming convention used for local variables
that point to ifc_softc_ctx.

In theory there should be no functional impact with this change.

Signed-off-by: Eric Joyner <erj@FreeBSD.org>

Reviewed by:	kbowling@
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D41325
2023-08-07 15:46:48 -07:00