tree that fix the ratelimit code. There were several bugs
in tcp_ratelimit itself and we needed further work to support
the multiple tag format coming for the joint TLS and Ratelimit dances.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D28357
(cherry picked from commit 1a714ff204)
update_rtm_from_rc() calls update_rtm_from_info() internally.
The latter one may update provided prtm pointer with a new rtm.
Reassign rtm from prtm after calling update_rtm_from_info() to
avoid touching the freed rtm.
PR: 255871
Submitted by: lylgood@foxmail.com
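The hazard described in the message above can be sketched in userspace C.
This is only an illustration under stated assumptions: struct msg,
grow_msg() and fill_msg() are hypothetical stand-ins, not the rtsock code.
A helper may replace the buffer behind a double pointer, so the caller
re-reads its local pointer before touching the message again.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a routing message header. */
struct msg {
	size_t len;	/* bytes allocated for this message */
	int    flags;
};

/*
 * Hypothetical helper: may allocate a larger message, free the old one
 * and store the new pointer in *pm, much as the commit message says
 * update_rtm_from_info() may do with prtm.
 */
static int
grow_msg(struct msg **pm, size_t need)
{
	struct msg *om = *pm, *nm;

	if (om->len >= need)
		return (0);
	nm = calloc(1, need);
	if (nm == NULL)
		return (-1);
	memcpy(nm, om, om->len);
	nm->len = need;
	free(om);
	*pm = nm;
	return (0);
}

/* Caller: refresh the local pointer after the helper returns. */
static int
fill_msg(struct msg **pm, size_t need, int flags)
{
	struct msg *m = *pm;

	assert(m != NULL);
	if (grow_msg(pm, need) != 0)
		return (-1);
	m = *pm;	/* without this, m may point at freed memory */
	m->flags = flags;
	return (0);
}
```

Dropping the `m = *pm;` line reproduces the class of bug: the write to
m->flags would land in the freed allocation whenever the helper grew the
message.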
If a non-existent gateway was specified, the code responsible for calculating
an updated nexthop group, returned the same already-used nexthop group.
After the route table update, the operation result contained the same
old & new nexthop groups. Thus, the code responsible for decomposing
the notification to the list of simple nexthop-level notifications,
was not able to find any differences. As a result, it hasn't updated any
of the "simple" notification fields, resulting in empty rtentry pointer.
This empty pointer was the direct cause of the panic.
Fix the problem by returning ESRCH when the new nexthop group is the same
as the old one after applying gateway filter.
Reported by: Michael <michael.adm at gmail.com>
PR: 255665
Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30245
(cherry picked from commit d0fdf2b28f)
A successful copyinstr() call guarantees that the returned string is
nul-terminated. Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.
Reported by: KMSAN
Sponsored by: The FreeBSD Foundation
(cherry picked from commit ad22ba2b9f)
IEEE Std 802.1D-2004 Section 17.14 defines permitted ranges for timers.
Incoming BPDU messages should be checked against the permitted ranges.
The rest of 17.14 appears to be enforced already.
PR: 254924
Reviewed by: kp, donner
Differential Revision: https://reviews.freebsd.org/D29782
(cherry picked from commit 0e4025bffa)
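A range check of the kind described above might look like the sketch
below. The function and macro names are hypothetical, and the ranges
(hello time 1-2 s, max age 6-40 s, forward delay 4-30 s) are assumed from
Table 17-1 of IEEE Std 802.1D-2004; verify them against the standard
before relying on them. BPDU timer fields are encoded in units of 1/256
second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BPDU timers are carried in units of 1/256 second. */
#define TICKS(sec)	((uint16_t)((sec) * 256))

/* Assumed permitted ranges (IEEE Std 802.1D-2004, Table 17-1). */
#define MIN_HELLO_TIME	TICKS(1)
#define MAX_HELLO_TIME	TICKS(2)
#define MIN_MAX_AGE	TICKS(6)
#define MAX_MAX_AGE	TICKS(40)
#define MIN_FWD_DELAY	TICKS(4)
#define MAX_FWD_DELAY	TICKS(30)

static bool
in_range(uint16_t v, uint16_t lo, uint16_t hi)
{
	assert(lo <= hi);
	return (v >= lo && v <= hi);
}

/* Returns true if all three timers of a received BPDU are acceptable. */
static bool
bpdu_timers_valid(uint16_t hello, uint16_t max_age, uint16_t fwd_delay)
{
	return (in_range(hello, MIN_HELLO_TIME, MAX_HELLO_TIME) &&
	    in_range(max_age, MIN_MAX_AGE, MAX_MAX_AGE) &&
	    in_range(fwd_delay, MIN_FWD_DELAY, MAX_FWD_DELAY));
}
```

A receiver would drop or ignore a BPDU for which bpdu_timers_valid()
returns false rather than adopt its timer values.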
This allows us to kill states created from a rule with route-to/reply-to
set. This is particularly useful in multi-wan setups, where one of the
WAN links goes down.
Submitted by: Steven Brown
Obtained from: https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30058
(cherry picked from commit abbcba9cf5)
Introduce an nvlist based alternative to DIOCKILLSTATES.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30054
(cherry picked from commit e989530a09)
Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29780
(cherry picked from commit 42ec75f83a)
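The counter-preserving idea can be illustrated with a small sketch. How pf
actually decides that two rules match is not described in the message
above, so the identity used here (a canonical text string per rule) and
all names are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rule: identity is a canonical text form of the rule. */
struct rule {
	char     text[64];
	uint64_t packets;
	uint64_t bytes;
};

/*
 * For every rule in the new ruleset that also exists in the old one,
 * carry the counters over; unmatched new rules keep their zeroed counters.
 */
static void
keep_counters(const struct rule *oldr, size_t nold,
    struct rule *newr, size_t nnew)
{
	size_t i, j;

	assert(oldr != NULL && newr != NULL);
	for (i = 0; i < nnew; i++) {
		for (j = 0; j < nold; j++) {
			if (strcmp(newr[i].text, oldr[j].text) == 0) {
				newr[i].packets = oldr[j].packets;
				newr[i].bytes = oldr[j].bytes;
				break;
			}
		}
	}
}
```

The quadratic search keeps the sketch short; a real implementation would
more likely index the old rules by a hash of their contents.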
MAP-E (RFC 7597) requires special care when selecting source ports
in NAT operation on the Customer Edge because some bits of the port
number are used by the Border Relay to distinguish the other side of the
IPv4-over-IPv6 tunnel.
PR: 254577
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D29468
(cherry picked from commit 2aa21096c7)
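The port constraint can be sketched as follows, using the RFC 7597 port
layout of a-bit offset, k-bit PSID and m remaining bits (a + k + m = 16).
The struct and function names are hypothetical and this is not the pf
implementation; it only shows which ports a Customer Edge with a given
PSID may use as NAT source ports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a MAP-E port set. */
struct portset {
	unsigned offset;	/* a: PSID offset, in bits */
	unsigned psid_len;	/* k: PSID length, in bits */
	uint16_t psid;		/* PSID assigned to this CE */
};

/* Returns true if 'port' belongs to the port set described by 'ps'. */
static bool
port_in_set(const struct portset *ps, uint16_t port)
{
	unsigned m;

	assert(ps->offset + ps->psid_len <= 16);
	m = 16 - ps->offset - ps->psid_len;

	/* For a > 0 the leading a bits must be nonzero (skips system ports). */
	if (ps->offset > 0 && (port >> (16 - ps->offset)) == 0)
		return (false);
	return (((port >> m) & ((1u << ps->psid_len) - 1)) == ps->psid);
}
```

With offset 6, PSID length 8 and PSID 0x34, the first usable range is
1232-1235 and the last is 64720-64723, so a NAT source port picker would
retry until port_in_set() accepts the candidate.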
Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.
Reviewed by: donner
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29948
(cherry picked from commit 5f5bf88949)
Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the custom 'schedule' keyword used by pfSense to terminate states
according to a schedule.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29936
(cherry picked from commit 6fcc8e042a)
Introduce convenience macros to retrieve the DSCP, ECN or traffic class
bits from an IPv6 header.
Use them where appropriate.
Reviewed by: ae (previous version), rscheff, tuexen, rgrimes
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29056
(cherry picked from commit bb4a7d94b9)
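As an illustration of what such accessors compute, here is a sketch over
the first 32-bit word of the IPv6 header in host byte order (4 bits
version, 8 bits traffic class, 20 bits flow label). The function names are
hypothetical and are not the macros added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Build the first header word (host byte order) for version 6. */
static inline uint32_t
ip6_flow_word(uint8_t tclass, uint32_t label)
{
	assert(label < (1u << 20));
	return ((6u << 28) | ((uint32_t)tclass << 20) | label);
}

/* Traffic class: bits 27..20 of the first word. */
static inline uint8_t
ip6_tclass(uint32_t flow)
{
	return ((flow >> 20) & 0xff);
}

/* DSCP: upper six bits of the traffic class. */
static inline uint8_t
ip6_dscp(uint32_t flow)
{
	return (ip6_tclass(flow) >> 2);
}

/* ECN: lower two bits of the traffic class. */
static inline uint8_t
ip6_ecn(uint32_t flow)
{
	return (ip6_tclass(flow) & 0x03);
}
```

On a real packet the word would first be converted from network byte
order, for example with ntohl().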
Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29778
(cherry picked from commit 4f1f67e888)
This will make future extensions of the API much easier.
The intent is to remove support for DIOCADDRULE in FreeBSD 14.
Reviewed by: markj (previous version), glebius (previous version)
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29557
(cherry picked from commit 5c62eded5a)
33cb3cb2e3 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
of the files including `route_var.h` do not have `FIB_ALGO`
defined.
Make dtrace happy by making the field unconditional.
Suggested by: markj
(cherry picked from commit bc5ef45aec)
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
routing subsystem, this wrapper is necessary to maintain isolation
from the external code.
Differential Revision: https://reviews.freebsd.org/D29971
MFC after: 1 week
(cherry picked from commit f9668e42b4)
Currently, most of the rib(9) KPI does not use rnh pointers, using
fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
during fib growth.
In that case, there is no mapping between fib/family and the new rib,
as an entirely new rib pointer array is populated.
Address this by delaying fib algo initialization till after switching
to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
won't crash the kernel in the brief moment when the rib exists, but
no fib algo is attached.
This change avoids creating duplicates of existing rib functions
with altered signatures.
Differential Revision: https://reviews.freebsd.org/D29969
MFC after: 1 week
(cherry picked from commit 8a0d57baec)
Traditionally we had 2 sources of information on whether the
add/delete route request targets a network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.
The former one is tricky: netmask can be empty or can explicitly
specify the host netmask. Parsing netmask sockaddr requires per-family
parsing and that's what rtsock code traditionally avoided. As a result,
consistency was not enforced and it was possible to specify network with
the RTF_HOST flag and vice versa.
Continue normalization efforts from D29826 and D29826 and ensure that
RTF_HOST flag always reflects host/network data from netmask field.
Differential Revision: https://reviews.freebsd.org/D29958
MFC after: 2 days
(cherry picked from commit 5d1403a79a)
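The invariant described above can be sketched for IPv4 only. The helper
name is hypothetical and a real implementation has to parse the netmask
sockaddr per address family; the RTF_HOST value is the one from
net/route.h. An absent mask or an all-ones mask means a host route, and
anything else means a network route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTF_HOST	0x4	/* value as in net/route.h */

/*
 * Make RTF_HOST agree with the netmask: set it for a missing or
 * all-ones IPv4 mask, clear it otherwise. 'mask' may be NULL.
 */
static void
normalize_host_flag(int *flags, const uint32_t *mask)
{
	bool host;

	assert(flags != NULL);
	host = (mask == NULL || *mask == 0xffffffffu);
	if (host)
		*flags |= RTF_HOST;
	else
		*flags &= ~RTF_HOST;
}
```

After this step the rest of the code can trust the flag alone and never
has to look at the mask again to tell host routes from network routes.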
netisr_dispatch_src() needs a valid VNET pointer or firewire_input() will panic
when receiving a packet.
Reviewed by: glebius
MFC after: 2 weeks
(cherry picked from commit d9b61e7153)
Modular fib lookup framework features logic that allows
route update batching for the algorithms that cannot easily
apply the routing change without rebuilding. As a result,
dataplane lookups may return old data until the sync
takes place. With the default sync timeout of 50ms, it is
possible that a new binary like ping(8) executed right after
route(8) will still use the old fib data.
To address some aspects of the problem, framework executes
all rtable changes without RTF_GATEWAY synchronously.
To fix the aforementioned problem, this diff extends sync
execution for all RTF_STATIC routes (e.g. ones maintained by
route(8)).
This fixes a bunch of tests in the networking space.
Reported by: ci, arichardson
MFC after: 2 weeks
(cherry picked from commit 439d087d0b)
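The resulting rule can be written as a one-line predicate. The function
name is hypothetical; the flag values are those from net/route.h. A change
is applied synchronously if the route has no gateway or if it is a static
route.

```c
#include <assert.h>
#include <stdbool.h>

#define RTF_GATEWAY	0x2	/* values as in net/route.h */
#define RTF_STATIC	0x800

/* Returns true if a change to a route with these flags should be
 * pushed to the datapath immediately rather than batched. */
static bool
needs_sync_update(int rt_flags)
{
	assert(rt_flags >= 0);
	return ((rt_flags & RTF_GATEWAY) == 0 ||
	    (rt_flags & RTF_STATIC) != 0);
}
```

Only dynamic gateway routes, such as those installed by a routing daemon
without RTF_STATIC, remain eligible for delayed application.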
Currently, PCB caching mechanism relies on the rib generation
counter (rnh_gen) to invalidate cached nhops/LLE entries.
With certain fib algorithms, it is now possible that the
datapath lookup state applies RIB changes with some delay.
In that scenario, PCB cache will invalidate on the RIB change,
but the new lookup may result in the same nexthop being returned.
When fib algo finally gets in sync with the RIB changes, PCB cache
will not receive any notification and will end up caching the stale data.
To fix this, introduce an additional counter, rnh_gen_rib, which is used
only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time the fib algo
synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.
Differential Revision: https://reviews.freebsd.org/D29812
Reviewed by: donner
MFC after: 2 weeks
(cherry picked from commit 33cb3cb2e3)
The initial fib algo implementation was built on a very simple set of
principles w.r.t. updates:
1) algorithm is either able to apply the change synchronously (DIR24-8)
or requires full rebuild (bsearch, lradix).
2) framework falls back to rebuild on every error (memory allocation,
nhg limit, other internal algo errors, etc).
This change brings the new "intermediate" concept - batched updates.
An algorithm can indicate that a particular update has to be handled in
batched fashion (FLM_BATCH).
The framework will write this update and other updates to the temporary
buffer instead of pushing them to the algo callback.
Depending on the update rate, the framework will batch 50..1024 ms of updates
and submit them to a different algo callback.
This functionality is handy for the slow-to-rebuild algorithms like DXR.
Differential Revision: https://reviews.freebsd.org/D29588
Reviewed by: zec
MFC after: 2 weeks
(cherry picked from commit 6b8ef0d428)
Fib algo uses a per-family array indexed by the fibnum to store
lookup function pointers and per-fib data.
Each algorithm rebuild currently requires re-allocating this array
to support atomic change of two pointers.
As in reality most of the changes actually involve changing only
data pointer, add a shortcut performing in-flight pointer update.
MFC after: 2 weeks
(cherry picked from commit 0abb6ff590)
The intent is to better handle time intervals with a large number of RIB
updates (e.g. BGP peer going up or down), while still keeping a low sync
delay for the remaining scenarios.
The implementation is as follows: updates are grouped into buckets of
50ms. If the number of updates within the current bucket exceeds the
threshold of 500 routes/sec (i.e. 25 updates per 50ms bucket
interval), the update is delayed for another 50ms. This can be repeated
until the maximum update delay (1 sec) is reached.
All 3 variables are runtime tunables:
* net.route.algo.fib_max_sync_delay_ms: 1000
* net.route.algo.bucket_change_threshold_rate: 500
* net.route.algo.bucket_time_ms: 50
Differential Revision: https://reviews.freebsd.org/D29588
MFC after: 2 weeks
(cherry picked from commit ee2cf2b360)
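The delay calculation described above can be sketched as a pure function.
The struct, the function name and the per-bucket counts passed in are
hypothetical; the real framework tracks this state with timers. The
per-bucket threshold is derived from the per-second rate and the bucket
length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the three tunables listed in the message. */
struct sync_params {
	uint32_t max_delay_ms;		/* fib_max_sync_delay_ms */
	uint32_t threshold_rate;	/* bucket_change_threshold_rate, per sec */
	uint32_t bucket_ms;		/* bucket_time_ms */
};

/*
 * Given the number of updates seen in each consecutive bucket since the
 * first pending change, return how long the sync ends up being delayed.
 */
static uint32_t
sync_delay_ms(const struct sync_params *p, const uint32_t *updates,
    size_t nbuckets)
{
	uint32_t threshold, delay;
	size_t i;

	assert(p->bucket_ms > 0);
	threshold = p->threshold_rate * p->bucket_ms / 1000;
	delay = p->bucket_ms;		/* always wait out one bucket */
	for (i = 0; i < nbuckets && delay < p->max_delay_ms; i++) {
		if (updates[i] <= threshold)
			break;		/* quiet bucket: sync now */
		delay += p->bucket_ms;	/* busy bucket: wait one more */
	}
	return (delay);
}
```

A single quiet bucket yields the base delay of one bucket, while a
sustained burst is capped at the maximum delay.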
b31fbebeb3 introduced alloc_sockaddr_aligned() which, in fact,
failed to produce aligned addresses.
Reported by: Oskar Holmlund <oskar.holmlund at yahoo.com>
MFC after: immediately
(cherry picked from commit 25682e6a49)
Address multiple issues with strict rtsock message validation.
D28668 "normalisation" approach was based on the assumption that
we always have at least "standard" sockaddr len.
It turned out to be false - certain older applications like quagga
or routed abuse sin[6]_len field and set it to the offset to the
first fully-zero bit in the mask. It is impossible to normalise
such sockaddrs without reallocation.
With that in mind, change the approach to use a distinct memory
buffer for the altered sockaddrs. This allows supporting the older
software while maintaining the guarantee on the "standard" sockaddrs.
PR: 255273,255089
Differential Revision: https://reviews.freebsd.org/D29826
MFC after: 3 days
(cherry picked from commit b31fbebeb3)
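The "distinct buffer" approach can be illustrated with a simplified IPv4
example. The struct layout and function name are hypothetical and this is
not the rtsock code: a short sockaddr, as sent by older software, is
copied into a zero-filled buffer of standard size instead of being
extended in place.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical BSD-style IPv4 sockaddr layout (16 bytes, no padding). */
struct sa4 {
	uint8_t  len;
	uint8_t  family;
	uint16_t port;
	uint8_t  addr[4];
	uint8_t  zero[8];
};

/*
 * Copy a possibly truncated sockaddr (src[0] holds its length) into a
 * separate, zeroed, full-size buffer. The source is left untouched.
 */
static int
normalize_sa4(const uint8_t *src, uint8_t family, struct sa4 *dst)
{
	size_t len;

	assert(src != NULL && dst != NULL);
	len = src[0];
	if (len > sizeof(*dst))
		return (-1);
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
	dst->len = sizeof(*dst);
	dst->family = family;
	return (0);
}
```

For a netmask of 255.255.0.0 sent with a length of 6, the copy ends up
with a full 16-byte sockaddr whose trailing address bytes are zero, which
is what the stricter validation expects.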
Some algorithms may require updating datapath and control plane
algo pointers after the (batched) updates.
Export fib_set_datapath_ptr() to allow setting the new datapath
function or data pointer from the algo.
Add fib_set_algo_ptr() to allow updating algo control plane
pointer from the algo.
Add fib_epoch_call() epoch(9) wrapper to simplify freeing old
datapath state.
Reviewed by: zec
Differential Revision: https://reviews.freebsd.org/D29799
MFC after: 1 week
(cherry picked from commit e2f79d9e51)
Slightly relax the gateway validation rules imposed by
2fe5a79425, by requiring only the first 8 bytes (everything
before sdl_data) to be present in the AF_LINK gateway.
Reported by: olivier
PR: 255089
(cherry picked from commit 7f5f3fcc32)
Drain the callbacks upon if_deregister_com_alloc() such that the
if_com_free[type] won't be nullified before if_destroy().
Taking fwip(4) as an example, before this fix, kldunload if_fwip will
go through the following:
1. fwip_detach()
2. if_free() -> schedule if_destroy() through NET_EPOCH_CALL
3. fwip_detach() returns
4. firewire_modevent(MOD_UNLOAD) -> if_deregister_com_alloc()
5. kernel complains about:
Warning: memory type fw_com leaked memory on destroy (1 allocations, 64 bytes leaked).
6. EPOCH runs if_destroy() -> if_free_internal()
By this time, if_com_free[if_alloctype] is NULL since it's already
nullified by if_deregister_com_alloc(); hence, firewire_free() won't
have a chance to release the allocated fw_com.
Reviewed by: hselasky, glebius
MFC after: 2 weeks
(cherry picked from commit 092f3f0812)