Commit graph

4658 commits

Author SHA1 Message Date
Kristof Provost
5a1bc5f902 altq: Fix typo in features sysctl description
Reported by:	Jose Luis Duran

(cherry picked from commit 35dabb7b9c)
2021-07-31 10:12:01 +02:00
Kristof Provost
b0e7f371cd Add FEATURE sysctls for ALTQ disciplines
This will allow userspace to more easily figure out if ALTQ is built
into the kernel and what disciplines are supported.

Reviewed by:		donner@
Differential Revision:	https://reviews.freebsd.org/D28302

(cherry picked from commit e111d79806)
2021-07-31 10:12:01 +02:00
Mark Johnston
b76e41fca9 Add required sysctl name length checks to various handlers
Reported by:	KMSAN
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 0dcef81de9)
2021-07-29 20:32:58 -04:00
Kristof Provost
7d226e964a pf: clean up syncookie callout on vnet shutdown
Ensure that we cancel any outstanding callouts for syncookies when we
terminate the vnet.

MFC after:	1 week
Sponsored by:	Modirum MDPay

(cherry picked from commit 32271c4d38)
2021-07-27 09:45:41 +02:00
Kristof Provost
2987a3643b pf: syncookie ioctl interface
Kernel side implementation to allow switching between on and off modes,
and allow this configuration to be retrieved.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31139

(cherry picked from commit 231e83d342)
2021-07-27 09:42:52 +02:00
Kristof Provost
c3d03672e1 pf: syncookie support
Import OpenBSD's syncookie support for pf. This feature helps pf resist
TCP SYN floods by only creating states once the remote host completes
the TCP handshake rather than when the initial SYN packet is received.

This is accomplished by using the initial sequence numbers to encode a
cookie (hence the name) in the SYN+ACK response and verifying this on
receipt of the client ACK.

Reviewed by:	kbowling
Obtained from:	OpenBSD
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31138

(cherry picked from commit 8e1864ed07)
2021-07-27 09:42:25 +02:00
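
The cookie mechanism described in this commit can be illustrated with a minimal sketch. It is not pf's implementation: the HMAC construction, the `SECRET` value, and the function names are assumptions chosen for illustration, and real syncookies also encode options such as the MSS and expire over time.

```python
import hashlib
import hmac
import struct

# Assumption: a fixed illustrative key. Real code would use rotating random keys.
SECRET = b"hypothetical-per-boot-secret"


def make_cookie(src, sport, dst, dport, client_isn, secret=SECRET):
    """Derive the 32-bit initial sequence number sent in the SYN+ACK."""
    msg = f"{src}:{sport}>{dst}:{dport}".encode() + struct.pack("!I", client_isn)
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def check_ack(src, sport, dst, dport, seq, ack, secret=SECRET):
    """Validate the client's final ACK without any stored state.

    The ACK carries seq = client_isn + 1 and ack = cookie + 1, so the
    cookie can be recomputed from the packet alone and compared.
    """
    client_isn = (seq - 1) & 0xFFFFFFFF
    expected = make_cookie(src, sport, dst, dport, client_isn, secret)
    return ((ack - 1) & 0xFFFFFFFF) == expected
```

Only when `check_ack` succeeds would a state be created, which is why a flood of bare SYNs consumes no state table entries.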
Mateusz Guzik
6ee77aab15 pf: embed a pointer to the lock in struct pf_kstate
This shaves calculation which in particular helps on arm.

Note using the & hack instead would still be more work.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 907257d696)
2021-07-25 07:00:37 +00:00
Mateusz Guzik
440c90da04 pf: shrink struct pf_kstate
Makes room for a pointer.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 9009d36afd)
2021-07-25 07:00:32 +00:00
Mateusz Guzik
3dc78db31e pf: add a comment to pf_kstate concerning compat with pf_state_cmp
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit f9aa757d8d)
2021-07-25 07:00:27 +00:00
Kristof Provost
e540d07879 pf: add DIOCGETSTATESV2
Add a new version of the DIOCGETSTATES call, which extends the struct to
include the original interface information.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31097

(cherry picked from commit c6bf20a2a4)
2021-07-16 11:08:27 +02:00
Mateusz Guzik
cfaec275f6 pf: add pf_find_state_all_exists
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 19d6e29b87)
2021-07-14 14:50:12 +00:00
Mateusz Guzik
8ef908c1d5 pf: padalign global locks found in pf.c
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit f649cff587)
2021-07-14 14:50:12 +00:00
Mateusz Guzik
a65a227398 pf: allow table stats clearing and reading with ruleset rlock
Instead serialize against these operations with a dedicated lock.

Prior to the change, when pushing 17 mln pps of traffic, calling
DIOCRGETTSTATS in a loop would restrict throughput to about 7 mln.  With
the change there is no slowdown.

Reviewed by:	kp (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit dc1ab04e4c)
2021-07-14 14:50:12 +00:00
Mateusz Guzik
ce02bfa524 pf: depessimize table handling
Creating tables and zeroing their counters induces excessive IPIs (14
per table), which in turn kills single- and multi-threaded performance.

Work around the problem by extending per-CPU counters with a general
counter populated on "zeroing" requests -- it stores the currently found
sum. Requests to report the current value then return the sum of the
per-CPU counters minus the saved value.

Sample timings when loading a config with 100k tables on a 104-way box:

stock:

pfctl -f tables100000.conf  0.39s user 69.37s system 99% cpu 1:09.76 total
pfctl -f tables100000.conf  0.40s user 68.14s system 99% cpu 1:08.54 total

patched:

pfctl -f tables100000.conf  0.35s user 6.41s system 99% cpu 6.771 total
pfctl -f tables100000.conf  0.48s user 6.47s system 99% cpu 6.949 total

Reviewed by:	kp (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit f92c21a28c)
2021-07-14 14:50:12 +00:00
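
The workaround described in this commit can be modelled with a short sketch. The class and method names are hypothetical and a plain list stands in for per-CPU storage. It shows only the bookkeeping, not the kernel's counter(9) API.

```python
class ZeroableCounter:
    """Per-CPU counter that is "zeroed" by recording a baseline."""

    def __init__(self, ncpus):
        self.per_cpu = [0] * ncpus  # each CPU only ever updates its own slot
        self.baseline = 0           # sum recorded at the last zeroing request

    def add(self, cpu, n=1):
        self.per_cpu[cpu] += n

    def zero(self):
        # No per-CPU slot is rewritten, so no cross-CPU work is needed:
        # just remember the total seen so far.
        self.baseline = sum(self.per_cpu)

    def fetch(self):
        # Reported value = current sum minus the saved baseline.
        return sum(self.per_cpu) - self.baseline
```

After `zero()`, the per-CPU slots keep their old values, yet `fetch()` reports 0 until new increments arrive.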
Kristof Provost
1fcecff2b3 pf: rename pf_state to pf_kstate
Indicate that this is a kernel-only structure, and make it easier to
distinguish from others used to communicate with userspace.

Reviewed by:	mjg
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31096

(cherry picked from commit 211cddf9e3)
2021-07-14 07:55:54 +02:00
Mateusz Guzik
eae6de0406 iflib: switch bare zone_mbuf use to m_free_raw
Reviewed by:	kbowling
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30961

(cherry picked from commit bad5f0b6c2)
2021-07-05 12:05:00 +00:00
Mateusz Guzik
aa9d233aad pf: revert: Use counter(9) for pf_state byte/packet tracking
stats are not shared and consequently per-CPU counters only waste
memory.

No slowdown was measured when passing over 20M pps.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 55cc305dfc)
2021-07-05 11:32:13 +00:00
Mateusz Guzik
0e69786dae pf: deduplicate V_pf_state_z handling with pfsync
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 803dfe3da0)
2021-07-05 11:32:12 +00:00
Mateusz Guzik
f75b6fce91 pf: assert that sizeof(struct pf_state) <= 312
To prevent accidentally going over a threshold which makes UMA fit only
12 objects per page instead of 13.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit e6dd0e2e8d)
2021-07-05 11:32:11 +00:00
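
The 12-versus-13 threshold can be checked with simple arithmetic. The sketch below assumes a 4096-byte page, 8-byte item alignment, and no in-page slab header. These are illustrative assumptions, not a description of UMA's actual layout.

```python
def items_per_page(item_size, page_size=4096, align=8, header=0):
    """Number of aligned items that fit in one page (assumed layout)."""
    padded = -(-item_size // align) * align  # round up to the alignment
    return (page_size - header) // padded


print(items_per_page(312))  # 13
print(items_per_page(313))  # 12, because 313 is padded to 320 bytes
```

Under these assumptions a single extra byte costs one object per page, which is what the compile-time assertion guards against.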
Mateusz Guzik
3ec1b75a0d pf: add pf_release_staten and use it in pf_unlink_state
Saves one atomic op.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit d09388d013)
2021-07-05 11:32:10 +00:00
Kristof Provost
c816b23784 pf: store L4 headers in pf_pdesc
Rather than pointers to the headers store full copies. This brings us
slightly closer to what OpenBSD does, and also makes more sense than
storing pointers to stack variable copies of the headers.

Reviewed by:	donner, scottl
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30719

(cherry picked from commit d38630f619)
2021-06-26 10:57:37 +02:00
Kristof Provost
b4c7aa06c5 pf: Mark struct pf_pdesc as kernel only
This structure is only used by the kernel module internally. It's not
shared with user space, so hide it behind #ifdef _KERNEL.

Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 29698ed904)
2021-06-26 10:57:37 +02:00
Randall Stewart
68d6663afb This pulls over all the changes that are in the netflix
tree that fix the ratelimit code. There were several bugs
in tcp_ratelimit itself and we needed further work to support
the multiple tag format coming for the joint TLS and Ratelimit dances.

    Sponsored by: Netflix Inc.
    Differential Revision:  https://reviews.freebsd.org/D28357

(cherry picked from commit 1a714ff204)
2021-06-08 01:18:32 +02:00
Alexander V. Chernikov
d40def01a4 Fix a use after free in update_rtm_from_rc().
update_rtm_from_rc() calls update_rtm_from_info() internally.
The latter one may update provided prtm pointer with a new rtm.
Reassign rtm from prtm after calling update_rtm_from_info() to
 avoid touching the freed rtm.

PR:		255871
Submitted by:	lylgood@foxmail.com
2021-05-30 10:30:53 +00:00
Alexander V. Chernikov
f279295521 Fix panic when trying to delete non-existent gateway in multipath route.
If a non-existent gateway was specified, the code responsible for calculating
 an updated nexthop group, returned the same already-used nexthop group.
After the route table update, the operation result contained the same
 old & new nexthop groups. Thus, the code responsible for decomposing
 the notification to the list of simple nexthop-level notifications,
 was not able to find any differences. As a result, it hasn't updated any
  of the "simple" notification fields, resulting in empty rtentry pointer.
This empty pointer was the direct cause of the panic.

Fix the problem by returning ESRCH when the new nexthop group is the same
 as the old one after applying gateway filter.

Reported by:	Michael <michael.adm at gmail.com>
PR:		255665
2021-05-30 10:30:45 +00:00
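
The fix in this commit can be sketched as follows. The function name and the list-of-strings representation of a nexthop group are hypothetical. The point is only that an unchanged group is reported as ESRCH rather than passed on as a "change".

```python
import errno


def remove_gateway(nhgrp, gateway):
    """Return (error, new_group) after dropping nexthops that use `gateway`."""
    filtered = [nh for nh in nhgrp if nh != gateway]
    if len(filtered) == len(nhgrp):
        # Nothing matched the filter: old and new groups would be identical,
        # so report "no such entry" instead of generating a notification.
        return errno.ESRCH, nhgrp
    return 0, filtered
```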
Kristof Provost
48d771e579 pf: Track the original kif for floating states
Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30245

(cherry picked from commit d0fdf2b28f)
2021-05-27 09:06:14 +02:00
Kristof Provost
def59341c9 pf: Add DIOCGETSTATESNV
Add DIOCGETSTATESNV, an nvlist-based alternative to DIOCGETSTATES.

MFC after:      1 week
Sponsored by:   Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30243

(cherry picked from commit 0592a4c83d)
2021-05-27 09:05:50 +02:00
Kristof Provost
70762ee0f2 pf: Add DIOCGETSTATENV
Add DIOCGETSTATENV, an nvlist-based alternative to DIOCGETSTATE.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30242

(cherry picked from commit 1732afaa0d)
2021-05-27 09:04:36 +02:00
Mark Johnston
178633e282 if: Remove unnecessary validation in the SIOCSIFNAME handler
A successful copyinstr() call guarantees that the returned string is
nul-terminated.  Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.

Reported by:	KMSAN
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ad22ba2b9f)
2021-05-19 09:32:11 -04:00
Jonah Caplan
61d771b63d bridgestp: validate timer values in config BPDU
IEEE Std 802.1D-2004 Section 17.14 defines permitted ranges for timers.
Incoming BPDU messages should be checked against the permitted ranges.
The rest of 17.14 appears to be enforced already.

PR:		254924
Reviewed by:	kp, donner
Differential Revision:	https://reviews.freebsd.org/D29782

(cherry picked from commit 0e4025bffa)
2021-05-18 12:00:38 +02:00
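
A range check of this kind might look like the sketch below. The ranges in `TIMER_RANGES` (in seconds) are the values usually cited from the standard's timer table and should be treated as assumptions to verify against IEEE Std 802.1D-2004 itself. The function and key names are hypothetical.

```python
# Assumed permitted ranges in seconds; verify against the standard.
TIMER_RANGES = {
    "hello_time": (1.0, 2.0),
    "max_age": (6.0, 40.0),
    "forward_delay": (4.0, 30.0),
}


def bpdu_timers_valid(timers, ranges=TIMER_RANGES):
    """Return True if every timer in the BPDU lies within its permitted range."""
    return all(lo <= timers[name] <= hi for name, (lo, hi) in ranges.items())
```

A BPDU failing the check would be discarded rather than allowed to reconfigure the bridge's timers.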
Kristof Provost
8c610ccac6 pf: Support killing 'matching' states
Optionally also kill states that match (i.e. are the NATed state or
opposite direction state entry for) the state we're killing.

See also https://redmine.pfsense.org/issues/8555

Submitted by:	Steven Brown
Reviewed by:	bcr (man page)
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30092

(cherry picked from commit 93abcf17e6)
2021-05-14 10:42:07 +02:00
Kristof Provost
a9620e7c70 pf: Allow states to be killed per 'gateway'
This allows us to kill states created from a rule with route-to/reply-to
set.  This is particularly useful in multi-wan setups, where one of the
WAN links goes down.

Submitted by:	Steven Brown
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30058

(cherry picked from commit abbcba9cf5)
2021-05-14 10:24:00 +02:00
Kristof Provost
e25df66606 pf: Introduce DIOCKILLSTATESNV
Introduce an nvlist based alternative to DIOCKILLSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30054

(cherry picked from commit e989530a09)
2021-05-14 10:17:17 +02:00
Kristof Provost
41bb01e095 pf: Introduce DIOCCLRSTATESNV
Introduce an nvlist variant of DIOCCLRSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30052

(cherry picked from commit 7606a45dcc)
2021-05-14 10:15:27 +02:00
Kristof Provost
898407819d pf: Optionally attempt to preserve rule counter values across ruleset updates
Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29780

(cherry picked from commit 42ec75f83a)
2021-05-11 17:04:45 +02:00
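
The idea of carrying counters across a ruleset update can be sketched as below. How pf actually decides that two rules match is not stated here. The sketch assumes, purely for illustration, that rules are compared by a `match` string, and the field names are hypothetical.

```python
def carry_over_counters(old_rules, new_rules):
    """Copy packet counters from old rules to identical rules in the new set."""
    old_by_key = {rule["match"]: rule for rule in old_rules}
    for rule in new_rules:
        previous = old_by_key.get(rule["match"])
        # Rules without a counterpart in the old ruleset start from zero.
        rule["packets"] = previous["packets"] if previous else 0
    return new_rules
```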
Kurosawa Takahiro
e49799dcf1 pf: Implement the NAT source port selection of MAP-E Customer Edge
MAP-E (RFC 7597) requires special care for selecting source ports
in NAT operation on the Customer Edge because a part of bits of the port
numbers are used by the Border Relay to distinguish another side of the
IPv4-over-IPv6 tunnel.

PR:		254577
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D29468

(cherry picked from commit 2aa21096c7)
2021-05-11 17:04:45 +02:00
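
The port restriction can be illustrated by enumerating a Customer Edge's port set along the lines of the RFC 7597 port mapping algorithm: a 16-bit port is split into an offset field A, the PSID, and a remainder j. The function name and parameter names are hypothetical, and this is a sketch of the arithmetic, not of pf's selection code.

```python
def mape_ports(psid, psid_len, psid_offset):
    """List the source ports whose PSID bits equal `psid`."""
    m = 16 - psid_offset - psid_len  # bits left for the remainder j
    if m < 0 or not 0 <= psid < (1 << psid_len):
        raise ValueError("PSID parameters do not fit a 16-bit port")
    # With a non-zero offset, A = 0 is skipped to exclude the lowest ports.
    first_a = 1 if psid_offset > 0 else 0
    ports = []
    for a in range(first_a, 1 << psid_offset):
        for j in range(1 << m):
            ports.append((a << (16 - psid_offset)) | (psid << m) | j)
    return ports


ports = mape_ports(psid=0x34, psid_len=8, psid_offset=6)
print(len(ports), ports[0])  # 252 1232
```

NAT on the Customer Edge must pick source ports from this set only, because the Border Relay uses the PSID bits to route return traffic to the right tunnel endpoint.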
Kristof Provost
fbbcc07976 pfsync: Expose PFSYNCF_OK flag to userspace
Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29948

(cherry picked from commit 5f5bf88949)
2021-05-10 21:45:57 +02:00
Kristof Provost
c93907df7b pf: Allow multiple labels to be set on a rule
Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the custom 'schedule' keyword used by pfSense to terminate states
according to a schedule.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29936

(cherry picked from commit 6fcc8e042a)
2021-05-10 21:45:57 +02:00
Hans Petter Selasky
b7622437f5 net: Introduce IPV6_DSCP(), IPV6_ECN() and IPV6_TRAFFIC_CLASS() macros
Introduce convenience macros to retrieve the DSCP, ECN or traffic class
bits from an IPv6 header.

Use them where appropriate.

Reviewed by:	ae (previous version), rscheff, tuexen, rgrimes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29056

(cherry picked from commit bb4a7d94b9)
2021-05-10 16:30:44 +02:00
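
The fields these macros retrieve can be shown with plain bit arithmetic on the first 32 bits of an IPv6 header (version, traffic class, flow label), taken in host byte order. The function names are hypothetical. The real macros operate on the header struct and may return the bits in a different position than this sketch does.

```python
def ipv6_traffic_class(flow_word):
    """8-bit traffic class: bits 20..27 of the first header word."""
    return (flow_word >> 20) & 0xFF


def ipv6_dscp(flow_word):
    """Upper 6 bits of the traffic class."""
    return ipv6_traffic_class(flow_word) >> 2


def ipv6_ecn(flow_word):
    """Lower 2 bits of the traffic class."""
    return ipv6_traffic_class(flow_word) & 0x03


word = (6 << 28) | (0xB9 << 20) | 0x12345
print(ipv6_traffic_class(word), ipv6_dscp(word), ipv6_ecn(word))  # 185 46 1
```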
Jose Luis Duran
e0c2f8156c ifconfig: Minor documentation fix
PR:	255557

(cherry picked from commit 0ea8a7f36d)
2021-05-10 03:48:05 +03:00
Kristof Provost
326f189d5b pf: PFRULE_REFS should not be user-visible
Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29778

(cherry picked from commit 4f1f67e888)
2021-05-07 10:15:43 +02:00
Kristof Provost
a3e4fd8b33 pf: Implement nvlist variant of DIOCGETRULE
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29559

(cherry picked from commit d710367d11)
2021-05-07 10:15:41 +02:00
Kristof Provost
f9b057eaf6 pf: Introduce nvlist variant of DIOCADDRULE
This will make future extensions of the API much easier.
The intent is to remove support for DIOCADDRULE in FreeBSD 14.

Reviewed by:	markj (previous version), glebius (previous version)
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29557

(cherry picked from commit 5c62eded5a)
2021-05-07 10:15:41 +02:00
Kristof Provost
95a06e369e pf: Remove unused variable rt_listid from struct pf_krule
Reviewed by:	donner
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29639

(cherry picked from commit 4967f672ef)
2021-05-07 10:15:40 +02:00
Alexander V. Chernikov
972fcfb34b [fib algo] Update fib_gen counter under FIB_MOD_LOCK.
MFC after:	3 days

(cherry picked from commit 41ce0e34ea)
2021-05-04 21:31:36 +00:00
Alexander V. Chernikov
939c41f3b8 Fix dtrace CTF for the rib_head.
33cb3cb2e3 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
 of the files including `route_var.h` do not have `fib_algo`
 defined.

Make dtrace happy by making the field unconditional.

Suggested by:	markj

(cherry picked from commit bc5ef45aec)
2021-05-04 21:31:25 +00:00
Alexander V. Chernikov
d0666c8718 Add rib_walk_from() wrapper for selective rib tree traversal.
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
 routing subsystem, this wrapper is necessary to maintain isolation
 from the external code.

Differential Revision: https://reviews.freebsd.org/D29971
MFC after:	1 week

(cherry picked from commit f9668e42b4)
2021-05-04 21:30:35 +00:00
Alexander V. Chernikov
83add84c00 [fib algo] Delay algo init at fib growth to allow reliable use of rib KPI.
Currently, most of the rib(9) KPI does not use rnh pointers, using
 fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
 during fib growth.
In that case, there is no mapping between fib/family and the new rib,
 as an entirely new rib pointer array is populated.

Address this by delaying fib algo initialization till after switching
 to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
 won't crash the kernel in the brief moment when the rib exists, but
 no fib algo is attached.

This change avoids creating duplicates of existing rib functions,
 with altered signature.

Differential Revision: https://reviews.freebsd.org/D29969
MFC after:	1 week

(cherry picked from commit 8a0d57baec)
2021-05-04 21:30:35 +00:00
Alexander V. Chernikov
3fd9848f15 [rtsock] Enforce netmask/RTF_HOST consistency.
Traditionally we had 2 sources of information whether the
 add/delete route request targets a network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.

The former one is tricky: netmask can be empty or can explicitly
 specify the host netmask. Parsing netmask sockaddr requires per-family
 parsing and that's what rtsock code traditionally avoided. As a result,
 consistency was not enforced and it was possible to specify network with
 the RTF_HOST flag and vice versa.

Continue normalization efforts from D29826 and D29826 and ensure that
 RTF_HOST flag always reflects host/network data from netmask field.

Differential Revision: https://reviews.freebsd.org/D29958
MFC after:	2 days

(cherry picked from commit 5d1403a79a)
2021-05-04 21:29:36 +00:00
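
The consistency rule can be sketched as deriving the flag from the netmask alone. The function name is hypothetical and a prefix length stands in for the netmask sockaddr, with `None` meaning an empty netmask. `RTF_HOST` uses the value from FreeBSD's net/route.h.

```python
import ipaddress

RTF_HOST = 0x4  # value from FreeBSD's net/route.h


def normalize_host_flag(dst, prefixlen, flags):
    """Set or clear RTF_HOST so it always agrees with the netmask."""
    max_len = ipaddress.ip_address(dst).max_prefixlen  # 32 or 128
    # An empty netmask and an explicit full-length netmask both mean "host".
    is_host = prefixlen is None or prefixlen == max_len
    return (flags | RTF_HOST) if is_host else (flags & ~RTF_HOST)
```

With this normalization a request can no longer describe a network while carrying RTF_HOST, or a host route without it.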
Tai-hwa Liang
89ed20a9b6 if_firewire: fixing panic upon packet reception for VNET build
netisr_dispatch_src() needs valid VNET pointer or firewire_input() will panic
when receiving a packet.

Reviewed by:	glebius
MFC after:	2 weeks

(cherry picked from commit d9b61e7153)
2021-05-03 07:51:53 +00:00