Commit graph

750 commits

Author SHA1 Message Date
Gleb Smirnoff
3789810845 tcp: avoid bcopy() in tcp_mss_update() 2024-11-20 16:37:24 -08:00
Gleb Smirnoff
09000cc133 tcp: mechanically rename hostcache metrics structure fields
Use hc_ prefix instead of rmx_.  The latter stands for "route metrics" and
is an artifact from the 90s, when TCP caching was embedded into the
routing table.  The rename should have happened back in 97d8d152c2.

No functional change. Done with sed(1) command:

s/rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|granularity|expire|q|hits|updates)/hc_\1/g
2024-11-20 16:29:00 -08:00
Richard Scheffenegger
8f5a2e216f tcp: fix cwnd recalculation during limited transmit
Properly calculate the expected flight size (cwnd) during
limited transmit. Exclude the SACK scoreboard from
consideration when still in limited transmit.

PR: 282605
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47541
2024-11-14 09:19:49 +01:00
Richard Scheffenegger
dded4e9e52 tcp: change SOCKBUF_* macros to SOCK_[RECV|SEND]BUF_* macros
Change the older LOCK related macros over to the
dedicated send/recv buffer macros in the base tcp stack.

No functional change intended.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47567
2024-11-14 02:08:12 +01:00
Richard Scheffenegger
7dc78150c7 tcp: refactor cwnd during SACK transmissions to allow TSO
Refactoring of cwnd and moving the adjustment for SACKed data into
tcp_output() - cwnd tracking the maximum extent starting at snd_una -
allows both SACK loss recovery as well as SACK transmissions after
RTO during slow start and if allowed, the use of TSO while in loss
recovery.

Reviewed By:		tuexen, cc, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43470
2024-10-29 19:04:12 +01:00
Richard Scheffenegger
440f4ba18e tcp: fix duplicate retransmissions when RTO happens during SACK loss recovery
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit
unexpected duplicate retransmissions. This is usually masked by
LRO not necessarily ACKing every individual segment, and prior
to RFC6675 SACK loss recovery, harder to trigger even when an
RTO happens while SACK loss recovery is ongoing.

Address this by improving the logic when to start a SACK loss recovery
and how to deal with a RTO, as well as improvements to the adjusted
congestion window during transmission selection.

Reviewed By:	tuexen, cc, #transport
Sponsored by:	NetApp, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D43355
2024-10-10 13:02:47 +02:00
Michael Tuexen
40299c55a0 tcp: implement challenge ACK throttling for the base stack
Implement ACK throttling of challenge ACKs as described in RFC 5961.

Reviewed by:		Peter Lei, rscheff, cc
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46066
2024-07-25 13:54:52 +02:00
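
A minimal sketch of the throttling idea, assuming a simple fixed-window counter. The struct, function name, and the limit of 10 challenge ACKs per 5 seconds are illustrative choices (RFC 5961 mentions such values only as an example) and are not taken from the committed code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits; real values would be tunable. */
#define CHALLENGE_ACK_WINDOW_MS 5000u
#define CHALLENGE_ACK_LIMIT     10u

/* Hypothetical limiter state, not a kernel structure. */
struct challenge_ack_limiter {
    uint64_t window_start_ms; /* start of the current counting window */
    uint32_t sent_in_window;  /* challenge ACKs sent in this window */
};

/* Returns true if another challenge ACK may be sent at time now_ms. */
static bool
challenge_ack_allowed(struct challenge_ack_limiter *l, uint64_t now_ms)
{
    /* Open a fresh window once the current one has expired. */
    if (now_ms - l->window_start_ms >= CHALLENGE_ACK_WINDOW_MS) {
        l->window_start_ms = now_ms;
        l->sent_in_window = 0;
    }
    if (l->sent_in_window >= CHALLENGE_ACK_LIMIT)
        return false; /* throttled: drop silently */
    l->sent_in_window++;
    return true;
}
```

A caller would consult challenge_ack_allowed() before emitting each challenge ACK and simply skip the ACK when it returns false.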
Michael Tuexen
37b3e6a660 tcp: use TCP_MAXWIN instead of 65535
This is suggested by cc@. No functional change.

Sponsored by:	Netflix, Inc.
2024-07-22 08:52:12 +02:00
Michael Tuexen
646c28ea80 tcp: improve SEG.ACK validation
Implement the improved SEG.ACK validation described in RFC 5961.
In addition to that, also detect ghost ACKs, which are ACKs for data
that has never been sent.
The additional checks are enabled by default, but can be disabled
by setting the sysctl-variable net.inet.tcp.insecure_ack to a
non-zero value.

PR:			250357
Reviewed by:		Peter Lei, rscheff (older version)
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D45894
2024-07-21 11:37:35 +02:00
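
The RFC 5961 acceptance window for SEG.ACK can be sketched as follows. This is a standalone illustration of the range check (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT using wraparound-safe comparisons; the helper names are hypothetical, and the ghost ACK detection and the sysctl handling from the commit are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a <= b" in 32-bit sequence space. */
static inline bool
seq_leq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/*
 * Returns true if seg_ack lies within the range RFC 5961 considers
 * acceptable. An ACK outside this range would be answered with a
 * challenge ACK and otherwise ignored.
 */
static bool
ack_is_acceptable(uint32_t seg_ack, uint32_t snd_una, uint32_t snd_nxt,
    uint32_t max_snd_wnd)
{
    return seq_leq(snd_una - max_snd_wnd, seg_ack) &&
        seq_leq(seg_ack, snd_nxt);
}
```

With snd_una = 1000, snd_nxt = 2000 and max_snd_wnd = 500, an ACK of 1500 is accepted, while ACKs of 400 (too old) and 2001 (beyond what was sent) are not.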
Konstantin Belousov
b6919741b7 ipsec_offload: handle TSO if supported
Allow for TSO to operate if network interface supports ipsec inline
offload and supports TSO over it.

Reviewed by:	tuexen
Sponsored by:	NVIDIA networking
Differential revision:	https://reviews.freebsd.org/D44222
2024-07-12 06:29:32 +03:00
Michael Tuexen
df9de82f54 tcp: fix sending RST after second inp lookup
When we first find an inp, we set also the tp. If then a second
lookup is necessary, the inp is recomputed. If this fails, the
tp is not cleared, which resulted in failing KASSERT.
Therefore, clear the tp when starting the inp lookup procedure.
Reported by:	Jenkins
Fixes:		02d15215ce ("tcp: improve blackhole support")
MFC after:	1 week
Sponsored by:	Netflix, Inc.
2024-05-25 19:58:48 +02:00
Michael Tuexen
02d15215ce tcp: improve blackhole support
There are two improvements to the TCP blackhole support:
(1) If net.inet.tcp.blackhole is set to 2, also send no RST whenever
    a segment is received on an existing closed socket or if there is
    a port mismatch when using UDP encapsulation.
(2) If net.inet.tcp.blackhole is set to 3, no RST segment is sent in
    response to incoming segments on closed sockets or in response to
    unexpected segments on listening sockets.
Thanks to gallatin@ for suggesting such an improvement.

Reviewed by:		gallatin
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D45304
2024-05-24 06:59:13 +02:00
Randall Stewart
fce03f85c5 TCP can be subject to Sack Attacks lets fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This is for sure in Rack or BBR and probably even the default stack if it uses lists in sack processing. The idea of the attack is that the attacker is driving you to look at 100's of sack blocks that only update 1 byte. So for example if you have 1 - 10,000 bytes outstanding the attacker sends in something like:

ACK 0 SACK(1-512) SACK(1024 - 1536), SACK(2048-2536), SACK(4096 - 4608), SACK(8192-8704)
This first sack looks fine but then the attacker sends

ACK 0 SACK(1-512) SACK(1025 - 1537), SACK(2049-2537), SACK(4097 - 4609), SACK(8193-8705)
ACK 0 SACK(1-512) SACK(1027 - 1539), SACK(2051-2539), SACK(4099 - 4611), SACK(8195-8707)
...
These blocks make you hunt across your linked list and split things up so that you have an entry for every other byte. As your list grows, you spend more and more CPU running through the lists. The idea here is that the attacker chooses entries as far apart as possible so that you have to run through the list. This example is small, but in theory, if the window is open to, say, 1Meg, you could end up with hundreds of thousands of linked list entries.

To combat this we introduce three things.

- When the peer requests a very small MSS, we stop processing SACKs from them. This prevents a malicious peer from just using a small MSS to do the same thing.
- Any time we get a sack block, we use the sack-filter to remove sacks that are smaller than the smallest v4 mss (minus 40 for max TCP options) unless it ties up to snd_max (since that is legal). All other sacks in theory should be at least an MSS. If we get such an attacker, that means we basically start skipping all but MSS-sized sacked blocks.
- The sack filter used to throw away data when its bounds were exceeded. Instead, we now increase its size to 15 and then throw away sacks if the filter gets overrun. This prevents a malicious attacker from overrunning the sack filter so that we start to process things anyway.
The default stack will need to start using the sack-filter which we have talked about in past conference calls to take full advantage of the protections offered by it (and reduce cpu consumption when processing sacks).

After this set of changes is in, rack can drop its SAD detection completely.

Reviewed by:		tuexen@, rscheff@
Differential Revision:	https://reviews.freebsd.org/D44903
2024-05-05 09:08:47 -04:00
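
The minimum-size rule for SACK blocks described above can be sketched as a standalone predicate. The function name and the min_len parameter are hypothetical; the caller is assumed to pass the "smallest v4 MSS minus 40" threshold, and sequence numbers are assumed not to be reordered (end is at or after start in sequence space).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a SACK block [start, end) should be processed.
 * Blocks shorter than min_len are skipped, unless the block ends
 * exactly at snd_max, where a short trailing block is legitimate.
 */
static bool
sack_block_worth_processing(uint32_t start, uint32_t end, uint32_t snd_max,
    uint32_t min_len)
{
    uint32_t len = end - start; /* modulo 2^32, safe across wraparound */

    if (end == snd_max)
        return true;
    return len >= min_len;
}
```

Assuming a threshold of 496 bytes, the block SACK(1024-1536) from the example would be processed, while a one-byte block in the middle of the window would be skipped.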
Michael Tuexen
c9cd686bd4 tcp: drop data received after a FIN has been processed
RFC 9293 describes the handling of data in the CLOSE-WAIT, CLOSING,
LAST-ACK, and TIME-WAIT states:
This should not occur since a FIN has been received from the remote
side. Ignore the segment text.
Therefore, implement this handling.

Reviewed by:		rrs, rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D44746
2024-04-18 21:54:42 +02:00
Michael Tuexen
e8c149ab85 tcp: add some debug output
Also log when dropping text or FIN after having received a FIN.
This is the intended behavior described in RFC 9293.
A follow-up patch will enforce this behavior for the base stack
and the RACK stack.
Reviewed by:		rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D44669
2024-04-07 22:41:24 +02:00
Michael Tuexen
3e1c8a35f7 tcp: improve consistency
No functional change intended.

Reported by:		Coverity Scan
CID:			1523781
Reviewed by:		rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D44645
2024-04-06 10:02:06 +02:00
Gleb Smirnoff
dd7b86e2a0 tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag
of t_flags.  All other t_flags bits are checked without a macro.

A bigger problem was that declaration of the macro in tcp_var.h depended
on a kernel option.  It is a bad practice to create such definitions in
installable headers.

Reviewed by:		rscheff, tuexen, kib
Differential Revision:	https://reviews.freebsd.org/D44362
2024-03-18 08:56:17 -07:00
Richard Scheffenegger
40fdc6d25f tcp: provide correct snd_fack on post_recovery
Ensure that snd_fack holds a valid value when doing
the post_recovery CC processing, for preparation of
the cc_cubic update, so that local pipe calculations
can correctly refer to snd_fack during and after CC events.

Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43957
2024-02-24 16:55:31 +01:00
Richard Scheffenegger
fcea1cc971 tcp: fix RTO ssthresh for non-6675 pipe calculation
Follow up on D43768 to properly deal with the non-default
pipe calculation. When CC_RTO is processed, the timeout
will have already pulled back snd_nxt. Further, snd_fack
is not pulled along with snd_una.

Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43876
2024-02-14 14:51:53 +01:00
Richard Scheffenegger
3eeb22cb81 tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extension of the socket
buffer. Remove it when the socket buffer goes away with
soisdisconnected(). Verify that this is also the expected
state in tcp_discardcb().

PR:			276761
Reviewed by:		glebius, tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43805
2024-02-10 10:20:00 +01:00
Richard Scheffenegger
0b3f9e435f tcp: move cc_post_recovery past snd_una update
The RFC6675 pipe calculation (sack.revised, enabled
by default since D28702), uses outdated information,
while the previous default calculated it correctly
with up-to-date information from the incoming ACK.

This difference can become as large as the receive
window (not the congestion window previously),
potentially triggering a massive burst of new packets.

MFC after:             1 week
Reviewed By:           tuexen, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43520
2024-01-28 00:18:51 +01:00
Richard Scheffenegger
2d05a1c81b tcp: commonize check for more data to send, style changes
Use SEQ_SUB instead of a plain subtraction, for an implicit
type conversion and prevention of a possible overflow.
Use curly brackets in stacked if statements throughout.
Use of the ? operator to enhance readability when clearing
the FIN flag in tcp_output().

None of the above changes the functionality.

Reviewed By:           tuexen, cc, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43539
2024-01-26 01:20:35 +01:00
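
To illustrate why a dedicated sequence subtraction helps, here is a small standalone sketch. The seq_sub() helper is a local stand-in written for this example and is only assumed to behave like the kernel's SEQ_SUB; it relies on the usual two's complement conversion from unsigned to signed.

```c
#include <stdint.h>

/*
 * Signed distance from b to a in 32-bit TCP sequence space.
 * The unsigned subtraction wraps modulo 2^32, and the cast turns
 * the result into a signed offset, so values on either side of
 * the wraparound point compare sensibly.
 */
static inline int32_t
seq_sub(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b);
}
```

For example, seq_sub(5, 0xFFFFFFFB) yields 10 even though the sequence space wrapped between the two values, and swapping the arguments yields -10.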
Richard Scheffenegger
c7c325d01d tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions)
slightly by calculating maxseg only once. This typically
saves one of two calls to tcp_maxseg().

Reviewed By:           glebius, tuexen, cc, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43536
2024-01-24 16:43:29 +01:00
Richard Scheffenegger
429f14f83a tcp: clean PRR state after ECN congestion recovery.
PRR state was not properly reset on subsequent ECN CE
events. Clean up after local transmission failures too.

Reviewed by:           tuexen, cc, #transport
MFC after:             3 days
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43170
2024-01-08 10:53:04 +01:00
Richard Scheffenegger
f4574e2dc5 tcp: prevent spurious empty segments and fix uncommon panic
Only try sending more data on pure ACKs when there is
more data available in the send buffer.

In the case of a retransmitted SYN not being sent due to
an internal error, the snd_una/snd_nxt accounting could
be off, leading to a panic. Pulling snd_nxt up to snd_una
prevents this from happening.

Reported by:           fengdreamer@126.com
Reviewed by:           cc, tuexen, #transport
MFC after:             1 week
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43343
2024-01-08 10:52:49 +01:00
Richard Scheffenegger
30409ecdb6 tcp: do not purge SACK scoreboard on first RTO
Keeping the SACK scoreboard intact after the first RTO
and retransmitting all data anew only on subsequent RTOs
allows a more timely and efficient loss recovery under
many adverse circumstances.

Reviewed By:           tuexen, #transport
MFC after:             10 weeks
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42906
2024-01-06 20:25:38 +01:00
Richard Scheffenegger
893ed42eca tcp: Make use of enum for sack_changed
No functional change.

Reviewed By:           tuexen, #transport
MFC after:             3 days
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43346
2024-01-06 20:23:52 +01:00
Gleb Smirnoff
513f2e2e71 tcp: always set tcp_tun_port to a correct value
The tcp_tun_port field that is used to pass port value between UDP
and TCP in case of tunneling is a generic field that is used to pass
data between network layers.  It can be contaminated on entry, e.g.
by a VLAN tag set by a NIC driver.  Explicitly set it, so that it
is zeroed out in a normal not-tunneled TCP.  If it contains garbage,
tcp_twcheck() later can enter wrong block of code and treat the packet
as incorrectly tunneled one.  On main and stable/14 that will end up
with sending incorrect responses, but on stable/13 with ipfw(8) and
pcb-matching rules it may end up in a panic.

This is a minimal conservative patch to be merged to stable branches.
Later we may redesign this.

PR:			275169
Reviewed by:		tuexen
Differential Revision:	https://reviews.freebsd.org/D43065
2023-12-19 11:24:17 -08:00
Richard Scheffenegger
9276ad23b8 tcp: shift PRR sending cadence slightly left
Don't let PRR pass up on the opportunity of clocking
out packets on arrival of ACKs - by pulling sends
forward by about half a packet. Prevents unexpectedly
long runs of incoming ACKs without eliciting a
packet transmission.

MFC after:             1 week
Reviewed By:           #transport, tuexen
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42918
2023-12-07 15:37:45 +01:00
Richard Scheffenegger
f42518ff12 tcp: for LRD move sysctl from tcp.do_lrd to tcp.sack.lrd, remove sockopt
Moving lrd sysctl to the tcp.sack branch, since LRD only works with SACK.
Remove the sockopt to programmatically control LRD per session.

Reviewed By:           #transport, tuexen, rrs
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42851
2023-11-30 21:11:45 +01:00
Richard Scheffenegger
34c45bc6a3 tcp: enable LRD by default
Lost Retransmission Detection was added as a
feature in May 2021, but disabled by default.

Enabling the feature by default to reduce the
flow completion time by avoiding RTOs when
retransmissions get lost too.

Reviewed By:           tuexen, #transport, zlei
MFC after:             10 weeks
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42845
2023-11-30 05:38:16 +01:00
Warner Losh
29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Richard Scheffenegger
49a6fbe387 [tcp] add PRR 6937bis heuristic and retire prr_conservative sysctl
Improve Proportional Rate Reduction (RFC6937) by using a
heuristic, which automatically chooses between
conservative CRB and more aggressive SSRB modes.
Only when snd_una advances (a partial ACK), SSRB may be
used. Also, that ACK must not have any indication of
ongoing loss - using the addition of new holes into the
scoreboard as proxy for such an event.

MFC after: 4 weeks
Reviewed By: #transport, kbowling, rrs
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28822
2023-11-15 23:10:29 +01:00
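
The mode selection described above can be sketched on top of the per-ACK send computation from RFC 6937. This is a simplified standalone illustration: the struct, its field names, and the function are hypothetical and follow the RFC's variable names rather than the kernel's, and recover_fs is assumed to be greater than zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-recovery bookkeeping, all values in bytes. */
struct prr_state {
    int64_t ssthresh;      /* target window after the reduction */
    int64_t recover_fs;    /* flight size at start of recovery (RecoverFS) */
    int64_t prr_delivered; /* delivered during recovery, incl. this ACK */
    int64_t prr_out;       /* sent during recovery */
    int64_t mss;
};

/* Bytes that may be sent in response to one ACK during recovery. */
static int64_t
prr_sndcnt(const struct prr_state *s, int64_t delivered_data, int64_t pipe,
    bool una_advanced, bool new_holes)
{
    int64_t sndcnt, limit;

    if (pipe > s->ssthresh) {
        /* Proportional part: CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out. */
        sndcnt = (s->prr_delivered * s->ssthresh + s->recover_fs - 1) /
            s->recover_fs - s->prr_out;
    } else {
        /* Conservative bound (CRB) by default. */
        limit = s->prr_delivered - s->prr_out;
        if (una_advanced && !new_holes) {
            /* Heuristic: progress without new loss permits SSRB. */
            if (delivered_data > limit)
                limit = delivered_data;
            limit += s->mss;
        }
        sndcnt = s->ssthresh - pipe;
        if (limit < sndcnt)
            sndcnt = limit;
    }
    return sndcnt > 0 ? sndcnt : 0;
}
```

The only difference between the two modes is the extra headroom granted in the SSRB branch, which is why gating it on a partial ACK without new scoreboard holes keeps the sender conservative while loss is still being discovered.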
Richard Scheffenegger
e2c6a6d29b tcp: include RFC6675 IsLost() in pipe calculation
Add more accounting while processing SACK data, to
keep track of when a packet is deemed lost using
the RFC6675 guidance.

Together with PRR (RFC6937) this allows a sender to
retransmit presumed lost packets faster, and loss
recovery to complete earlier.

Reviewed By: cc, rrs, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D39299
2023-10-09 12:37:20 +02:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Richard Scheffenegger
b352ef58c2 tcp: Handle <RST,ACK> in SYN-RCVD
Patch base stack to correctly handle the RST bit independently
of other header flags per TCP RFC.

MFC after: 1 week
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D40982
2023-07-27 00:42:26 +02:00
Cheng Cui
e5738ee04b Under RSS, assign a TCP flow's inp_flowid anyway.
Summary:
This brings some benefit of a tcp flow identification for some kernel
modules, such as siftr.

Reviewers: rrs, rscheff, tuexen, #transport!
Approved by: tuexen (mentor), rrs
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40061
2023-05-18 11:11:53 -04:00
Gleb Smirnoff
35bc0bcc51 tcp: reduce argument list to functions that pass a segment
The socket argument is superfluous, as a tcpcb always has one and
only one socket.

Reviewed by:		rrs
Differential Revision:	https://reviews.freebsd.org/D39434
2023-04-07 12:18:06 -07:00
Gleb Smirnoff
78e6c3aacc tcp: update error counter when dropping a packet due to bad source
Use the same counter that ip_input()/ip6_input() use for bad destination
address.  For IPv6 this is already heavily abused ip6s_badscope, which
needs to be split into several separate error counters.

Reviewed by:		markj
Differential Revision:	https://reviews.freebsd.org/D39234
2023-03-27 18:37:15 -07:00
Randall Stewart
69c7c81190 Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints
we need to move to using the new inline functions. This adds them and moves rack to now use
the tcp_tracepoints.

Reviewed by: tuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D38831
2023-03-16 11:43:16 -04:00
Mark Johnston
713264f6b8 netinet: Tighten checks for unspecified source addresses
The assertions added in commit b0ccf53f24 ("inpcb: Assert against
wildcard addrs in in_pcblookup_hash_locked()") revealed that protocol
layers may pass the unspecified address to in_pcblookup().

Add some checks to filter out such packets before we attempt an inpcb
lookup:
- Disallow the use of an unspecified source address in in_pcbladdr() and
  in6_pcbladdr().
- Disallow IP packets with an unspecified destination address.
- Disallow TCP packets with an unspecified source address, and add an
  assertion to verify the comment claiming that the case of an
  unspecified destination address is handled by the IP layer.

Reported by:	syzbot+9ca890fb84e984e82df2@syzkaller.appspotmail.com
Reported by:	syzbot+ae873c71d3c71d5f41cb@syzkaller.appspotmail.com
Reported by:	syzbot+e3e689aba1d442905067@syzkaller.appspotmail.com
Reviewed by:	glebius, melifaro
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38570
2023-03-06 15:06:00 -05:00
Richard Scheffenegger
18b83b626a tcp: reduce the size of t_rttupdated in tcpcb
During tcp session start, various mechanisms need to
track a few initial RTTs before becoming active.
Prevent overflows of the corresponding tracking counter
and reduce the size of tcpcb simultaneously.

Reviewed By:		#transport, tuexen, guest-ccui
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D21117
2023-01-26 18:08:00 +01:00
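
A narrow counter that must not overflow is typically incremented with saturation. The helper below is a generic sketch of that pattern with a hypothetical name; it is not the code from the commit.

```c
#include <stdint.h>

/*
 * Increment an 8-bit counter but stop at its maximum instead of
 * wrapping back to zero, so "at least N samples seen" checks stay
 * true once the threshold has been reached.
 */
static inline uint8_t
sat_inc_u8(uint8_t v)
{
    return (v < UINT8_MAX) ? (uint8_t)(v + 1) : v;
}
```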
Gleb Smirnoff
aab8c844b9 tcp/ipfw: fix "ipfw fwd localaddr,port"
The ipfw(4) feature of forwarding to local address without modifying
a packet was broken.  The first lookup always needs to be a non-wildcard
one, because its goal is to find an already existing socket.  Otherwise
a local wildcard listener with the same port number may match, resulting
in the connection being forwarded to the wrong port.

Reported by:	Pavel Polyakov <bsd kobyla.org>
Fixes:		d88eb4654f
2023-01-05 14:34:50 -08:00
Gleb Smirnoff
eaabc93764 tcp: retire TCPDEBUG
This subsystem is superseded by modern debugging facilities,
e.g. DTrace probes and TCP black box logging.

We intentionally leave SO_DEBUG in place, as many utilities may
set it on a socket.  Also the tcp::debug DTrace probes look at
this flag on a socket.

Reviewed by:		gnn, tuexen
Discussed with:		rscheff, rrs, jtl
Differential revision:	https://reviews.freebsd.org/D37694
2022-12-14 09:54:06 -08:00
Gleb Smirnoff
e68b379244 tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would
provide space to most of the data a TCP connection needs, embedding
into struct tcpcb several structures, that previously were allocated
separately.

The most important one is the inpcb itself.  With embedding we can provide
strong guarantee that with a valid TCP inpcb the tcpcb is always valid
and vice versa.  Also we reduce number of allocs/frees per connection.
The embedded inpcb is placed in the beginning of the struct tcpcb,
since in_pcballoc() requires that.  However, later we may want to move
it around for cache line efficiency, and this can be done with a little
effort.  The new intotcpcb() macro is ready for such move.

The congestion algorithm data, the TCP timers and osd(9) data are
also embedded into tcpcb, and temporary struct tcpcb_mem goes away.
There was no extra allocation here, but we went through extra pointer
every time we accessed this data.

One interesting side effect is that now TCP data is allocated from
SMR-protected zone.  Potentially this allows the TCP stacks or other
TCP related modules to utilize that for their own synchronization.

Large part of the change was done with sed script:

s/tp->ccv->/tp->t_ccv./g
s/tp->ccv/\&tp->t_ccv/g
s/tp->cc_algo/tp->t_cc/g
s/tp->t_timers->tt_/tp->tt_/g
s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g

Dependency side effect is that code that needs to know struct tcpcb
should also know struct inpcb, that added several <netinet/in_pcb.h>.

Differential revision:	https://reviews.freebsd.org/D37127
2022-12-07 09:00:48 -08:00
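
The embedding and the conversion macro can be illustrated with a miniature, standalone model. The mini_ structures and macros below are hypothetical stand-ins with made-up fields; they only show how an offset-based conversion keeps working if the embedded member is later moved away from the start of the enclosing structure.

```c
#include <stddef.h>

/* Miniature stand-ins for the real structures. */
struct mini_inpcb {
    int inp_flags;
};

struct mini_tcpcb {
    struct mini_inpcb t_inpcb; /* embedded, no separate allocation */
    unsigned int t_flags;
};

/* tcpcb -> inpcb: take the address of the embedded member. */
#define mini_tptoinpcb(tp) (&(tp)->t_inpcb)

/* inpcb -> tcpcb: subtract the member offset (container-of pattern). */
#define mini_intotcpcb(inp) \
    ((struct mini_tcpcb *)((char *)(inp) - offsetof(struct mini_tcpcb, t_inpcb)))
```

Because both directions are pure pointer arithmetic on a single allocation, a valid pointer to one structure always yields a valid pointer to the other.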
Michael Tuexen
bd4f986644 tcp: remove unused t_rttbest
No functional change intended.

Reviewed by:		rscheff@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D37401
2022-11-16 11:22:13 +01:00
Gleb Smirnoff
9eb0e8326d tcp: provide macros to access inpcb and socket from a tcpcb
There should be no functional changes with this commit.

Reviewed by:		rscheff
Differential revision:	https://reviews.freebsd.org/D37123
2022-11-08 10:24:40 -08:00
Gleb Smirnoff
f71cb9f748 tcp: inp_socket is valid through the lifetime of a TCP inpcb
The inp_socket is cleared only in in_pcbdetach(), which for TCP is
always accompanied by in_pcbfree().  An inpcb that went through
in_pcbfree() shall never be returned by any kind of pcb lookup.

Reviewed by:		tuexen
Differential revision:	https://reviews.freebsd.org/D37062
2022-11-08 10:24:39 -08:00
Gleb Smirnoff
f567d55f51 inpcb: don't return INP_DROPPED entries from pcb lookups
The in_pcbdrop() KPI, which is used solely by TCP, allows removing a
pcb from hash list and mark it as dropped.  The comment suggests that
such pcb won't be returned by lookups.  Indeed, every call to
in_pcblookup*() is accompanied by a check for INP_DROPPED.  Do what
comment suggests: never return such pcbs and remove unnecessary checks.

Reviewed by:		tuexen
Differential revision:	https://reviews.freebsd.org/D37061
2022-11-08 10:24:39 -08:00
Richard Scheffenegger
004bb636ca tcp: Move sysctl OIDs related to ECN to tcp_ecn.c
Keep all ECN related code in (mostly) one place.

No functional change.

Event:			IETF 115 Hackathon
Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D37285
2022-11-06 12:38:42 +01:00