The benefit is that in the debugger you will see PF_DIVERT_MTAG_DIR_IN
instead of 1 when looking at a structure. And compilation time failure
if anybody sets it to a wrong value. Using "port" instead of "ndir" when
assigning a port improves readability of code.
Suggested by: glebius
MFC after: 3 weeks
X-MFC-With: fabf705f4b
Resolved conflict between ipfw and pf if both are used and pf wants to
do divert(4) by having separate mtags for pf and ipfw.
Also fix the incorrect 'rulenum' check, which caused the reported loop.
While here add a few test cases to ensure that divert-to works as
expected, even if ipfw is loaded.
divert(4)
PR: 272770
MFC after: 3 weeks
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D42142
Coverity found that one safety check (kassert) was not
functional, as possible incorrect subtractions during
the accounting wouldn't show up as (invalid) negative
values.
Reported by: gallatin
Reviewed By: cc, #transport
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42180
Move the NETLINK define into opt_global.h so we can rely on it being
set correctly, without having to remember to include opt_netlink.h.
This ensures that the NETLINK define is correctly set. If not we
may end up with unloadable modules, due to missing symbols (such as
nlmsg_get_group_writer).
PR: 274306
Reviewed by: imp, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42179
When a Retransmission Timeout happens during an on-going SACK loss recovery
episode, the internal SACK accounting was not cleared.
Reported by: pho
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42162
Add more accounting while processing SACK data, to
keep track of when a packet is deemed lost using
the RFC6675 guidance.
Together with PRR (RFC6972) this allows a sender to
retransmit presumed lost packets faster, and loss
recovery to complete earlier.
Reviewed By: cc, rrs, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D39299
The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.
1. net.inet.sctp.tcbhashsize
2. net.inet.sctp.pcbhashsize
3. net.inet.sctp.chunkscale
The loader tunable 'net.inet.sctp.tcbhashsize' and 'net.inet.sctp.chunkscale'
are only used during vnet initializing, thus it make no senses to make them
writable tunable.
Validate the values of loader tunables on vnet initialize, reset them to
theirs defaults if invalid to prevent potential kernel panics.
Reviewed by: tuexen, #transport, #network
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42007
So when we call the fast_rsm retransmit path, we should always move
snd_nxt back up to snd_max. In fact during ack-processing if snd_nxt
falls behind it should be moved up there as well. Otherwise what
can happen is we have an incorrect mark on snd_nxt and incorrectly
calculate the offset when we go through the front path (which is
what skzyall was able to do) then when we go to clean up the
send the offset is all wrong and we crash.
Special thanks to Gleb for pointing out the problem and the email
that had the reproducer so I could find the issue.
Reported-by: syzbot+f5061a372f74f021ec02@syzkaller.appspotmail.com
Sponsored by: Netflix Inc
As implemented, this security policy would only prevent seeing processes
in sub-jails, but would not prevent sending signals to, changing
priority of or debugging processes in these, enabling attacks where
unprivileged users could tamper with random processes in sub-jails in
particular circumstances (conflated UIDs) despite the policy being
enforced.
PR: 272092
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40628
The loader tunable `net.inet.ip.mfchashsize` does not have corresponding
sysctl MIB entry. Just add it.
While here, the sysctl variable `net.inet.pim.squelch_wholepkt` is actually
a loader tunable. Add sysctl flag CTLFLAG_TUN to it so that `sysctl -T`
will report it correctly.
Reviewed by: kp
Fixes: 443fc3176d Introduce a number of changes to the MROUTING code
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D41997
When shutdown(..., SHUT_RD) or shutdown(..., SHUT_RDWR) is called,
really clean up the read queue and issue an ungraceful shutdown if
user messages are affected.
Reported by: syzbot+d4e1d30d578891245f59@syzkaller.appspotmail.com
MFC after: 3 days
With recent change 110113bc08, a vnet tunable can be initialized when
there is a corresponding kernel environment variable unless it is marked
with the flag CTLFLAG_NOFETCH.
The initialization may happen during early boot(linker preload), at that
time vnet0 has not been created. The hander carp_allow_sysctl() for the
tunable net.inet.carp.allow requires vnet, thus invoking it during early
boot will cause kernel panic.
The tunnable is initialized by vnet sysinit routine ipcarp_sysinit() so
let's just mark it with flag CTLFLAG_NOFETCH.
No functional change intended.
Fixes: 110113bc08 sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after: 2 week
Differential Revision: https://reviews.freebsd.org/D41525
All notifications are now queued via sctp_ulp_notify(). Do
the locking of the inp read lock there and validate this in all
functions being used.
This is one step in avoiding race conditions when closing the
read end of an SCTP socket.
MFC after: 3 days
This makes consistent use of the parameters and ensures that
all SCTP AUTH related notifications are using sctp_ulp_notify().
No functional change intended.
MFC after: 3 days
This vnet loader tunable is defined with SYSCTL_PROC, thus will not be
initialized by kernel on vnet creating and will always have the default
value TCP_FASTOPEN_CCACHE_BUCKET_LIMIT_DEFAULT.
Fix by fetching the value from the corresponding kernel environment during
vnet constructing.
PR: 273509
Reviewed by: #transport, tuexen
Fixes: c560df6f12 This is an implementation of the client side of TCP Fast Open (TFO) [RFC7413]
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D41691
The IGMP code buffers packets in the imf_inm->inm_scq mbufq, but does
not clear this queue when struct in_mfilter is freed by imf_purge().
This can cause memory leaks if IGMPv3 is used.
Purge the mbufq on imf_purge().
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41629
IGMP requires hosts to use the lowest version they've seen on the
network. When the IGMP timers expire we take the opportunity to upgrade again.
However, we did not take the net.inet.igmp.default_version sysctl
setting into account, so we could end up switching to IGMPv3 even if the
user had requested IGMPv2 or IGMPv1 via the sysctl.
Check V_igmp_default_version before we upgrade the IGMP version.
Reviewed by: adrian
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41628
In particular, don't use a socket level flag, use the inp level one.
After adding appropriate locking, this will close a race condition.
MFC after: 1 week
If a socket is marked as cannot read anymore, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.
Reported by: syzbot+291f6581cecb77097b16@syzkaller.appspotmail.com
MFC after: 1 week
When handling a SHUTDOWN or SHUTDOWN ACK chunk detect if the peer
is violating the protocol by not having made sure all user messages
are reveived by the peer. If this situation is detected, abort the
association.
MFC after: 1 week
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.
Sponsored by: NetApp, Inc. (originally)
This function actually only ever reads from the TCP PCB. Consequently,
also make the pointer to its TCP PCB parameter const.
Sponsored by: NetApp, Inc. (originally)
Don't handle a graceful shutdown of the peer as an implicit signal
that all partial messages are complete. First, this is not implemented
correctly and second this should not be done by the peer. It is more
appropriate to handle this as a protocol violation.
Remove the incorrect code and leave detecting the protocol violation
and its handling in a followup commit.
MFC after: 1 week