Commit graph

4636 commits

Author SHA1 Message Date
Randall Stewart
68d6663afb This pulls over all the changes that are in the netflix
tree that fix the ratelimit code. There were several bugs
in tcp_ratelimit itself and we needed further work to support
the multiple tag format coming for the joint TLS and Ratelimit dances.

    Sponsored by: Netflix Inc.
    Differential Revision:  https://reviews.freebsd.org/D28357

(cherry picked from commit 1a714ff204)
2021-06-08 01:18:32 +02:00
Alexander V. Chernikov
d40def01a4 Fix a use after free in update_rtm_from_rc().
update_rtm_from_rc() calls update_rtm_from_info() internally.
The latter one may update provided prtm pointer with a new rtm.
Reassign rtm from prtm afeter calling update_rtm_from_info() to
 avoid touching the freed rtm.

PR:		255871
Submitted by:	lylgood@foxmail.com
2021-05-30 10:30:53 +00:00
Alexander V. Chernikov
f279295521 Fix panic when trying to delete non-existent gateway in multipath route.
IF non-existend gateway was specified, the code responsible for calculating
 an updated nexthop group, returned the same already-used nexthop group.
After the route table update, the operation result contained the same
 old & new nexthop groups. Thus, the code responsible for decomposing
 the notification to the list of simple nexthop-level notifications,
 was not able to find any differences. As a result, it hasn't updated any
  of the "simple" notification fields, resulting in empty rtentry pointer.
This empty pointer was the direct reason of a panic.

Fix the problem by returning ESRCH when the new nexthop group is the same
 as the old one after applying gateway filter.

Reported by:	Michael <michael.adm at gmail.com>
PR:		255665
2021-05-30 10:30:45 +00:00
Kristof Provost
48d771e579 pf: Track the original kif for floating states
Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30245

(cherry picked from commit d0fdf2b28f)
2021-05-27 09:06:14 +02:00
Kristof Provost
def59341c9 pf: Add DIOCGETSTATESNV
Add DIOCGETSTATESNV, an nvlist-based alternative to DIOCGETSTATES.

MFC after:      1 week
Sponsored by:   Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30243

(cherry picked from commit 0592a4c83d)
2021-05-27 09:05:50 +02:00
Kristof Provost
70762ee0f2 pf: Add DIOCGETSTATENV
Add DIOCGETSTATENV, an nvlist-based alternative to DIOCGETSTATE.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30242

(cherry picked from commit 1732afaa0d)
2021-05-27 09:04:36 +02:00
Mark Johnston
178633e282 if: Remove unnecessary validation in the SIOCSIFNAME handler
A successful copyinstr() call guarantees that the returned string is
nul-terminated.  Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.

Reported by:	KMSAN
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ad22ba2b9f)
2021-05-19 09:32:11 -04:00
Jonah Caplan
61d771b63d bridgestp: validate timer values in config BPDU
IEEE Std 802.1D-2004 Section 17.14 defines permitted ranges for timers.
Incoming BPDU messages should be checked against the permitted ranges.
The rest of 17.14 appears to be enforced already.

PR:		254924
Reviewed by:	kp, donner
Differential Revision:	https://reviews.freebsd.org/D29782

(cherry picked from commit 0e4025bffa)
2021-05-18 12:00:38 +02:00
Kristof Provost
8c610ccac6 pf: Support killing 'matching' states
Optionally also kill states that match (i.e. are the NATed state or
opposite direction state entry for) the state we're killing.

See also https://redmine.pfsense.org/issues/8555

Submitted by:	Steven Brown
Reviewed by:	bcr (man page)
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30092

(cherry picked from commit 93abcf17e6)
2021-05-14 10:42:07 +02:00
Kristof Provost
a9620e7c70 pf: Allow states to by killed per 'gateway'
This allows us to kill states created from a rule with route-to/reply-to
set.  This is particularly useful in multi-wan setups, where one of the
WAN links goes down.

Submitted by:	Steven Brown
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30058

(cherry picked from commit abbcba9cf5)
2021-05-14 10:24:00 +02:00
Kristof Provost
e25df66606 pf: Introduce DIOCKILLSTATESNV
Introduce an nvlist based alternative to DIOCKILLSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30054

(cherry picked from commit e989530a09)
2021-05-14 10:17:17 +02:00
Kristof Provost
41bb01e095 pf: Introduce DIOCCLRSTATESNV
Introduce an nvlist variant of DIOCCLRSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30052

(cherry picked from commit 7606a45dcc)
2021-05-14 10:15:27 +02:00
Kristof Provost
898407819d pf: Optionally attempt to preserve rule counter values across ruleset updates
Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29780

(cherry picked from commit 42ec75f83a)
2021-05-11 17:04:45 +02:00
Kurosawa Takahiro
e49799dcf1 pf: Implement the NAT source port selection of MAP-E Customer Edge
MAP-E (RFC 7597) requires special care for selecting source ports
in NAT operation on the Customer Edge because a part of bits of the port
numbers are used by the Border Relay to distinguish another side of the
IPv4-over-IPv6 tunnel.

PR:		254577
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D29468

(cherry picked from commit 2aa21096c7)
2021-05-11 17:04:45 +02:00
Kristof Provost
fbbcc07976 pfsync: Expose PFSYNCF_OK flag to userspace
Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29948

(cherry picked from commit 5f5bf88949)
2021-05-10 21:45:57 +02:00
Kristof Provost
c93907df7b pf: Allow multiple labels to be set on a rule
Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the customer 'schedule' keyword used by pfSense to terminate states
according to a schedule.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29936

(cherry picked from commit 6fcc8e042a)
2021-05-10 21:45:57 +02:00
Hans Petter Selasky
b7622437f5 net: Introduce IPV6_DSCP(), IPV6_ECN() and IPV6_TRAFFIC_CLASS() macros
Introduce convenience macros to retrieve the DSCP, ECN or traffic class
bits from an IPv6 header.

Use them where appropriate.

Reviewed by:	ae (previous version), rscheff, tuexen, rgrimes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29056

(cherry picked from commit bb4a7d94b9)
2021-05-10 16:30:44 +02:00
Jose Luis Duran
e0c2f8156c ifconfig: Minor documentation fix
PR:	255557

(cherry picked from commit 0ea8a7f36d)
2021-05-10 03:48:05 +03:00
Kristof Provost
326f189d5b pf: PFRULE_REFS should not be user-visible
Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29778

(cherry picked from commit 4f1f67e888)
2021-05-07 10:15:43 +02:00
Kristof Provost
a3e4fd8b33 pf: Implement nvlist variant of DIOCGETRULE
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29559

(cherry picked from commit d710367d11)
2021-05-07 10:15:41 +02:00
Kristof Provost
f9b057eaf6 pf: Introduce nvlist variant of DIOCADDRULE
This will make future extensions of the API much easier.
The intent is to remove support for DIOCADDRULE in FreeBSD 14.

Reviewed by:	markj (previous version), glebius (previous version)
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29557

(cherry picked from commit 5c62eded5a)
2021-05-07 10:15:41 +02:00
Kristof Provost
95a06e369e pf: Remove unused variable rt_listid from struct pf_krule
Reviewed by:	donner
MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29639

(cherry picked from commit 4967f672ef)
2021-05-07 10:15:40 +02:00
Alexander V. Chernikov
972fcfb34b [fib algo] Update fib_gen counter under FIB_MOD_LOCK.
MFC after:	3 days

(cherry picked from commit 41ce0e34ea)
2021-05-04 21:31:36 +00:00
Alexander V. Chernikov
939c41f3b8 Fix drace CTF for the rib_head.
33cb3cb2e3 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
 of the files including `route_var.h` do not have `fib_algo`
 defined.

Make dtrace happy by making the field unconditional.

Suggested by:	markj

(cherry picked from commit bc5ef45aec)
2021-05-04 21:31:25 +00:00
Alexander V. Chernikov
d0666c8718 Add rib_walk_from() wrapper for selective rib tree traversal.
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
 routing subsystem, this wrapper is necessary to maintain isolation
 from the external code.

Differential Revision: https://reviews.freebsd.org/D29971
MFC after:	1 week

(cherry picked from commit f9668e42b4)
2021-05-04 21:30:35 +00:00
Alexander V. Chernikov
83add84c00 [fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.
Currently, most of the rib(9) KPI does not use rnh pointers, using
 fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
 during fib growth.
In that case, there is no mapping between fib/family and the new rib,
 as an entirely new rib pointer array is populated.

Address this by delaying fib algo initialization till after switching
 to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
 won't crash the kernel in the brief moment when the rib exists, but
 no fib algo is attached.

This change allows to avoid creating duplicates of existing rib functions,
 with altered signature.

Differential Revision: https://reviews.freebsd.org/D29969
MFC after:	1 week

(cherry picked from commit 8a0d57baec)
2021-05-04 21:30:35 +00:00
Alexander V. Chernikov
3fd9848f15 [rtsock] Enforce netmask/RTF_HOST consistency.
Traditionally we had 2 sources of information whether the
 added/delete route request targets network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.

The former one is tricky: netmask can be empty or can explicitly
 specify the host netmask. Parsing netmask sockaddr requires per-family
 parsing and that's what rtsock code traditionally avoided. As a result,
 consistency was not enforced and it was possible to specify network with
 the RTF_HOST flag and vice versa.

Continue normalization efforts from D29826 and D29826 and ensure that
 RTF_HOST flag always reflects host/network data from netmask field.

Differential Revision: https://reviews.freebsd.org/D29958
MFC after:	2 days

(cherry picked from commit 5d1403a79a)
2021-05-04 21:29:36 +00:00
Tai-hwa Liang
89ed20a9b6 if_firewire: fixing panic upon packet reception for VNET build
netisr_dispatch_src() needs valid VNET pointer or firewire_input() will panic
when receiving a packet.

Reviewed by:	glebius
MFC after:	2 weeks

(cherry picked from commit d9b61e7153)
2021-05-03 07:51:53 +00:00
Alexander V. Chernikov
59b3b210a6 [fib algo] always commit static routes synchronously.
Modular fib lookup framework features logic that allows
 route update batching for the algorithms that cannot easily
 apply the routing change without rebuilding. As a result,
 dataplane lookups may return old data until the the sync
 takes place. With the default sync timeout of 50ms, it is
 possible that new binary like ping(8) executed exactly after
 route(8) will still use the old fib data.

To address some aspects of the problem, framework executes
 all rtable changes without RTF_GATEWAY synchronously.

To fix the aforementioned problem, this diff extends sync
 execution for all RTF_STATIC routes (e.g. ones maintained by
 route(8).
This fixes a bunch of tests in the networking space.

Reported by:	ci, arichardson
MFC after:	2 weeks

(cherry picked from commit 439d087d0b)
2021-04-29 08:47:32 +00:00
Alexander V. Chernikov
2c0d16218e Fix NOINET[6],!VIMAGE builds after FIB_ALGO addition to GENERIC
Reported by:	jbeich
PR:		255390

(cherry picked from commit 7d222ce3c1)
2021-04-29 08:47:32 +00:00
Alexander V. Chernikov
a423e945c7 Fix NOINET[6] build after enabling FIB_ALGO in GENERIC.
Submitted by:	jbeich
PR:		255389

(cherry picked from commit 67372fb3e0)
2021-04-29 08:47:32 +00:00
Alexander V. Chernikov
1899138d5a Fix rib generation count for fib algo.
Currently, PCB caching mechanism relies on the rib generation
 counter (rnh_gen) to invalidate cached nhops/LLE entries.

With certain fib algorithms, it is now possible that the
 datapath lookup state applies RIB changes with some delay.
In that scenario, PCB cache will invalidate on the RIB change,
 but the new lookup may result in the same nexthop being returned.
When fib algo finally gets in sync with the RIB changes, PCB cache
 will not receive any notification and will end up caching the stale data.

To fix this, introduce additional counter, rnh_gen_rib, which is used
 only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time when fib algo
 synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.

Differential Revision: https://reviews.freebsd.org/D29812
Reviewed by:	donner
MFC after:	2 weeks

(cherry picked from commit 33cb3cb2e3)
2021-04-29 08:47:32 +00:00
Alexander V. Chernikov
6fa0160556 Add batched update support for the fib algo.
Initial fib algo implementation was build on a very simple set of
 principles w.r.t updates:

1) algorithm is ether able to apply the change synchronously (DIR24-8)
 or requires full rebuild (bsearch, lradix).
2) framework falls back to rebuild on every error (memory allocation,
 nhg limit, other internal algo errors, etc).

This changes brings the new "intermediate" concept - batched updates.
Algotirhm can indicate that the particular update has to be handled in
 batched fashion (FLM_BATCH).
The framework will write this update and other updates to the temporary
 buffer instead of pushing them to the algo callback.
Depending on the update rate, the framework will batch 50..1024 ms of updates
 and submit them to a different algo callback.

This functionality is handy for the slow-to-rebuild algorithms like DXR.

Differential Revision:	https://reviews.freebsd.org/D29588
Reviewed by:	zec
MFC after:	2 weeks

(cherry picked from commit 6b8ef0d428)
2021-04-29 08:47:32 +00:00
Stefan Eßer
553c1513a8 Fix build with gcc
Correctly declare function without arguments as f(void) instead of f().

(cherry picked from commit 6409e59427)
2021-04-29 08:47:31 +00:00
Alexander V. Chernikov
a07e7856e4 fib algo: do not reallocate datapath index for datapath ptr update.
Fib algo uses a per-family array indexed by the fibnum to store
 lookup function pointers and per-fib data.

Each algorithm rebuild currently requires re-allocating this array
 to support atomic change of two pointers.

As in reality most of the changes actually involve changing only
 data pointer, add a shortcut performing in-flight pointer update.

MFC after:	2 weeks

(cherry picked from commit 0abb6ff590)
2021-04-29 08:47:31 +00:00
Alexander V. Chernikov
c971873ace Implement better rebuild-delay fib algo policy.
The intent is to better handle time intervals with large amount of RIB
updates (e.g. BGP peer going up or down), while still keeping low sync
delay for the rest scenarios.

The implementation is the following: updates are bucketed into the
buckets of size 50ms. If the number of updates within a current bucket
 exceeds the threshold of 500 routes/sec (e.g. 10 updates per bucket
interval), the update is delayed for another 50ms. This can be repeated
 until the maximum update delay (1 sec) is reached.

All 3 variables are runtime tunables:

* net.route.algo.fib_max_sync_delay_ms: 1000
* net.route.algo.bucket_change_threshold_rate: 500
* net.route.algo.bucket_time_ms: 50

Differential Review:	https://reviews.freebsd.org/D29588
MFC after:		2 weeks

(cherry picked from commit ee2cf2b360)
2021-04-29 08:47:31 +00:00
Alexander V. Chernikov
b77bee7642 Appease -Wsign-compare in radix.c
Differential Revision:	https://reviews.freebsd.org/D29661
Submitted by:	zec
MFC after	2 weeks

(cherry picked from commit 63dceebe68)
2021-04-29 08:47:31 +00:00
Alexander V. Chernikov
c1ce6196d2 Allow to specify debugnet fib in sysctl/tunable.
Differential Revision:	https://reviews.freebsd.org/D29593
Reviewed by:		donner
MFC after:		2 weeks

(cherry picked from commit caf2f62765)
2021-04-29 08:47:31 +00:00
Alexander V. Chernikov
fbe149eef7 Fix rtsock sockaddr alignment.
b31fbebeb3 introduced alloc_sockaddr_aligned() which, in fact,
 failed to produce aligned addresses.

Reported by:	Oskar Holmlund <oskar.holmlund at yahoo.com>
MFC after:	immediately

(cherry picked from commit 25682e6a49)
2021-04-27 19:59:27 +00:00
Alexander V. Chernikov
d2f68847a3 [fib algo] Do not print algo attach/detach message on boot
MFC after:	1 day

(cherry picked from commit c23385612d)
2021-04-26 08:49:33 +00:00
Alexander V. Chernikov
3173872183 Make gcc happy by initializing error in rib_handle_ifaddr_info().
(cherry picked from commit a81e2e7890)
2021-04-26 08:49:24 +00:00
Alexander V. Chernikov
6f1e5d9169 Relax rtsock message restrictions.
Address multiple issues with strict rtsock message validation.

D28668 "normalisation" approach was based on the assumption that
 we always have at least "standard" sockaddr len.
It turned out to be false - certain older applications like quagga
 or routed abuse sin[6]_len field and set it to the offset to the
 first fully-zero bit in the mask. It is impossible to normalise
 such sockaddrs without reallocation.

With that in mind, change the approach to use a distinct memory
 buffer for the altered sockaddrs. This allows supporting the older
 software while maintaining the guarantee on the "standard" sockaddrs.

PR:	255273,255089
Differential Revision:	https://reviews.freebsd.org/D29826
MFC after:	3 days

(cherry picked from commit b31fbebeb3)
2021-04-26 08:48:47 +00:00
Alexander V. Chernikov
98a3c20696 Improve error reporting in rtsock.c
MFC after:	3 days

(cherry picked from commit 758c9d54d4)
2021-04-26 08:48:41 +00:00
Alexander V. Chernikov
45645b05b8 Fib algo: extend KPI by allowing algo to set datapath pointers.
Some algorithms may require updating datapath and control plane
 algo pointers after the (batched) updates.

Export fib_set_datapath_ptr() to allow setting the new datapath
 function or data pointer from the algo.
Add fib_set_algo_ptr() to allow updating algo control plane
 pointer from the algo.
Add fib_epoch_call() epoch(9) wrapper to simplify freeing old
 datapath state.

Reviewed by:		zec
Differential Revision: https://reviews.freebsd.org/D29799
MFC after:		1 week

(cherry picked from commit e2f79d9e51)
2021-04-26 08:48:14 +00:00
Alexander V. Chernikov
de703e98e6 Fix direct route installation with net/bird.
Slighly relax the gateway validation rules imposed by the
 2fe5a79425, by requiring only first 8 bytes (everyhing
 before sdl_data to be present in the AF_LINK gateway.

Reported by:	olivier
PR:		255089

(cherry picked from commit 7f5f3fcc32)
2021-04-19 18:52:57 +00:00
Alexander V. Chernikov
9abc85d17d Fix vlan creation for the older ifconfig(8) binaries.
Reported by:	allanjude
MFC after:	immediately

(cherry picked from commit afbb64f1d8)
2021-04-12 22:18:33 +00:00
Alexander V. Chernikov
c3254131a4 Fix fib algo rebuild delay calculation.
Submitted by:	Marco Zec <zec at fer.hr>

(cherry picked from commit e4ac3f7463)
2021-04-08 20:15:04 +00:00
Tai-hwa Liang
f87d56d4bf net: fixing a memory leak in if_deregister_com_alloc()
Drain the callbacks upon if_deregister_com_alloc() such that the
if_com_free[type] won't be nullified before if_destroy().

Taking fwip(4) as an example, before this fix, kldunload if_fwip will
go through the following:

  1. fwip_detach()
  2. if_free() -> schedule if_destroy() through NET_EPOCH_CALL
  3. fwip_detach() returns
  4. firewire_modevent(MOD_UNLOAD) -> if_deregister_com_alloc()
  5. kernel complains about:
	Warning: memory type fw_com leaked memory on destroy (1 allocations, 64 bytes leaked).
  6. EPOCH runs if_destroy() -> if_free_internal()i

By this time, if_com_free[if_alloctype] is NULL since it's already
nullified by if_deregister_com_alloc(); hence, firewire_free() won't
have a chance to release the allocated fw_com.

Reviewed by:	hselasky, glebius
MFC after:	2 weeks

(cherry picked from commit 092f3f0812)
2021-04-08 16:21:33 +00:00
Konstantin Belousov
5cbc9d80b8 mbuf: add a way to mark flowid as calculated from the internal headers
(cherry picked from commit e243367b64)
2021-04-07 06:32:39 +03:00
Konstantin Belousov
2f5491784e vxlan: correct interface MTU when using hw offloads
(cherry picked from commit baacf70137)
2021-04-07 06:32:39 +03:00