Commit graph

141 commits

Author SHA1 Message Date
Cheng Cui
ee45061051
cc_cubic: use newreno to emulate AIMD in TCP-friendly region
Reviewed by: rscheff, tuexen
Differential Revision: https://reviews.freebsd.org/D46546
2024-09-17 10:37:00 -04:00
Cheng Cui
b6c137de0a
tcp cc: re-organize newreno functions into parts that can be re-used
Reviewed by: rscheff, tuexen
Differential Revision: https://reviews.freebsd.org/D46046
2024-09-17 09:54:17 -04:00
Cheng Cui
8cc528c682
tcp cc: clean up some un-used cc_var flags
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D46299
2024-08-15 09:33:04 -04:00
Michael Tuexen
00d3b74406 tcp cc: remove non-working sctp support
As suggested by lstewart, remove the non-working SCTP support in the
TCP congestion control modules. SCTP has a similar functionality
(although not using kernel loadable modules), on which the TCP stuff
was built on, but the integration was never done.
No functional change intended.

Reviewed by:		Peter Lei, cc
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46142
2024-07-28 22:25:48 +02:00
Cheng Cui
9565854ab4
cc_cubic: remove the redundant variable num_cong_events from struct cubic.
Summary:
This variable was added by commit eb5bfdd065, but is not actually needed.
No functional change.

Reviewed by: tuexen

Differential Revision: https://reviews.freebsd.org/D46042
2024-07-25 13:11:32 -04:00
Henrich Hartzer
674956e199 sys/netinet/cc: Switch from deprecated random() to prng32()
Related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277655

Signed-off-by: henrichhartzer@tuta.io
Reviewed by: imp, mav
Pull Request: https://github.com/freebsd/freebsd-src/pull/1162
2024-05-23 15:10:09 -06:00
Richard Scheffenegger
8917131e00 tcp: need default in switch statement for enum.
fix clang error after c9b6241e25

Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D44081
2024-02-25 08:24:13 +01:00
Richard Scheffenegger
c9b6241e25 tcp: address enum-int-mismatch
fix gcc13 error after f74352fbcf
2024-02-25 04:46:39 +01:00
Richard Scheffenegger
5e248c23d9 tcp: retain some CC signals outside of kernel scope
Summary: fix build error after f74352fbcf

Reviewers: #transport!

Subscribers: imp, melifaro, glebius

Differential Revision: https://reviews.freebsd.org/D44066
2024-02-24 21:01:54 +01:00
Richard Scheffenegger
038699a8f1 tcp: cubic - restart epoch after RTO
This is a mitigation to avoid sudden extreme jumps in
cwnd, as t_epoch can be very out of date after an RTO.
Per RFC9438, sec 4.8, t_epoch is to be reset whenever
cwnd grows beyond ssthresh (CC phase transitions from
slow start to congestion avoidance), to be fixed with
the upcoming cc_cubic changes.

MFC after:		3 days
Reviewed By:		cc, #transport
Sponsored by:		NetApp, Inc
Differential Revision:	https://reviews.freebsd.org/D44023
2024-02-24 17:07:46 +01:00
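To see why a stale t_epoch matters, here is a minimal sketch of the CUBIC window function from RFC 9438, W(t) = C*(t - K)^3 + W_max, where t is the time since the epoch started. The helper name, its parameters, and the sample values are illustrative assumptions, not the cc_cubic code.

```c
#include <assert.h>

/* CUBIC constant C as given in RFC 9438. */
#define CUBIC_C 0.4

/*
 * Hypothetical helper: target window in segments, t_secs after the
 * epoch began. k_secs is the time at which the curve returns to w_max.
 */
static double
cubic_window(double t_secs, double k_secs, double w_max)
{
	double d;

	assert(t_secs >= 0.0);
	d = t_secs - k_secs;
	return (CUBIC_C * d * d * d + w_max);
}
```

With K = 2 s and W_max = 100 segments, an epoch that is 1 s past K yields about 100.4 segments, while an epoch left running for 60 s past K yields roughly 86,500 segments. That is the kind of jump a restarted epoch avoids.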
Richard Scheffenegger
f74352fbcf tcp: use enum for all congestion control signals
Facilitate easier troubleshooting by enumerating
all congestion control signals. Typecast the
enum to int, when a congestion control module uses
private signals.

No external change.

Reviewed By:		glebius, tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43838
2024-02-24 16:41:48 +01:00
Richard Scheffenegger
38983d40c1 tcp: prevent div by zero in cc_htcp
Make sure the dividend is at least one. While cwnd should
never be smaller than t_maxseg, this can happen during
Path MTU Discovery, or when TCP options are considered
in other parts of the stack.

PR:			276674
MFC after:		3 days
Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43797
2024-02-24 16:35:59 +01:00
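The guard described in this commit can be sketched roughly as follows: when cwnd drops below one segment, the integer quotient cwnd / maxseg is zero, so it is clamped to one before being used as a divisor. Both function names and the increment formula are hypothetical and only illustrate the clamp; they are not taken from cc_htcp.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: cwnd expressed in whole segments, never zero. */
static uint32_t
cwnd_in_segments(uint32_t cwnd, uint32_t maxseg)
{
	uint32_t segs;

	assert(maxseg > 0);
	segs = cwnd / maxseg;
	return (segs > 0 ? segs : 1);
}

/* Illustrative per-ACK increment that divides by the clamped value. */
static uint32_t
per_ack_increment(uint32_t alpha, uint32_t cwnd, uint32_t maxseg)
{
	return (alpha * maxseg / cwnd_in_segments(cwnd, maxseg));
}
```

Without the clamp, a cwnd of 1000 bytes with a 1460-byte segment would divide by zero. With it, the increment falls back to alpha * maxseg.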
Richard Scheffenegger
fcea1cc971 tcp: fix RTO ssthresh for non-6675 pipe calculation
Follow up on D43768 to properly deal with the non-default
pipe calculation. When CC_RTO is processed, the timeout
will have already pulled back snd_nxt. Further, snd_fack
is not pulled along with snd_una.

Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43876
2024-02-14 14:51:53 +01:00
Richard Scheffenegger
32a6df57df tcp: calculate ssthresh on RTO according to RFC5681
Per RFC5681, only adjust ssthresh on the initial
retransmission timeout. Since RTO often happens
during loss recovery, while cwnd no longer tracks
all data in flight, calculate pipe properly.

Reviewed By:           tuexen, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43768
2024-02-08 19:18:26 +01:00
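As a rough illustration of the rule this commit refers to, RFC 5681 sets ssthresh to max(FlightSize / 2, 2 * SMSS) when the retransmission timer first expires for a segment, and leaves it alone on repeated timeouts. The sketch below assumes a hypothetical helper with a caller-supplied pipe estimate and a first_rto flag. It is not the FreeBSD function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: new ssthresh after a retransmission timeout.
 * pipe is the caller's estimate of bytes in flight; first_rto is
 * nonzero only for the initial timeout of a segment.
 */
static uint32_t
rto_ssthresh(uint32_t ssthresh, uint32_t pipe, uint32_t maxseg,
    int first_rto)
{
	uint32_t half, floor;

	assert(maxseg > 0);
	if (!first_rto)
		return (ssthresh);	/* Repeated timeout: keep ssthresh. */
	half = pipe / 2;
	floor = 2 * maxseg;
	return (half > floor ? half : floor);
}
```

Feeding in pipe rather than cwnd is the point of the change: during loss recovery cwnd may no longer reflect the data actually in flight.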
Richard Scheffenegger
1adab814e8 tcp: use tcp_fixed_maxseg instead of tcp_maxseg in cc modules
tcp_fixed_maxseg() is the streamlined calculation of typical
tcp options and more suitable for heavy use in the congestion
control modules on every received packet.

No external functional change.

Reviewed By:           tuexen, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43779
2024-02-08 18:36:59 +01:00
Warner Losh
fdafd315ad sys: Automated cleanup of cdefs and other formatting
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by:		Netflix
2023-11-26 22:24:00 -07:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Richard Scheffenegger
eb5bfdd065 tcp: Add and update cubic module variable names
Prepare the cubic congestion control module to better align with
the specifications in RFC8312bis.

Rename a few cubic state variables to the variable names found in
the RFC8312bis specification. This makes the code more understandable
for someone reading the RFC and the code. It also makes the variable
naming convention more uniform. Add some variables needed subsequently.

No functional change.

Submitted By:		Bhaskar Pardeshi, VMware Inc.
Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D40436
2023-06-06 23:09:28 +02:00
Cheng Cui
a3aa6f6529
cc_cubic: Use units of microseconds (usecs) instead of ticks in rtt.
This improves the TCP-friendly cwnd on low-latency, high-drop-rate
networks. Tests show +42% and +37% better performance in the 1Gbps and 10Gbps
cases.

Reported by: Bhaskar Pardeshi from VMware.
Reviewed By: rscheff, tuexen
Approved by: rscheff (mentor), tuexen (mentor)
2023-06-01 07:55:01 -04:00
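A small sketch of why tick granularity hurts on low-latency paths: converting an RTT from microseconds to ticks truncates anything shorter than one tick to zero. The helper name and the hz value of 1000 used in the example are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert microseconds to ticks at a given hz. */
static uint32_t
usecs_to_ticks(uint32_t usecs, uint32_t hz)
{
	assert(hz > 0);
	return ((uint32_t)((uint64_t)usecs * hz / 1000000));
}
```

With hz set to 1000, a 200 usec RTT becomes 0 ticks and a 2500 usec RTT becomes 2 ticks, so any calculation that depends on the RTT loses most of its precision.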
Randall Stewart
ec6d620b19 There are congestion control algorithms that will pull in srtt, and this can cause issues with rack.
When using rack, cubic and htcp will grab the srtt, but they think it is in ticks. For rack
it is in microseconds (which we should probably move all stacks to actually). This causes
issues, so instead let's make a new interface so that any CC module can pull the srtt in
whatever granularity it wants.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D40146
2023-05-19 11:16:28 -04:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Randall Stewart
69c7c81190 Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints
we need to move to using the new inline functions. This adds them and moves rack to now use
the tcp_tracepoints.

Reviewed by: tuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D38831
2023-03-16 11:43:16 -04:00
Randall Stewart
e2e088ae86 Rack cannot be loaded without cc_newreno compiled into the kernel.
Right now rack will fail to load due to its hack in accessing symbol names
in cc_newreno. This was fine when newreno was always compiled into the
kernel but now ... not so much. Instead let's fix up rack to use the socket
option queries to get the information it wants and set the parameters. We
also fix the CC parameters so they are always settable.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D37622
2022-12-14 15:37:48 -05:00
Gleb Smirnoff
e68b379244 tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would
provide space to most of the data a TCP connection needs, embedding
into struct tcpcb several structures, that previously were allocated
separately.

The most important one is the inpcb itself.  With embedding we can provide
strong guarantee that with a valid TCP inpcb the tcpcb is always valid
and vice versa.  Also we reduce number of allocs/frees per connection.
The embedded inpcb is placed in the beginning of the struct tcpcb,
since in_pcballoc() requires that.  However, later we may want to move
it around for cache line efficiency, and this can be done with a little
effort.  The new intotcpcb() macro is ready for such move.

The congestion algorithm data, the TCP timers and osd(9) data are
also embedded into tcpcb, and temporary struct tcpcb_mem goes away.
There was no extra allocation here, but we went through extra pointer
every time we accessed this data.

One interesting side effect is that now TCP data is allocated from
SMR-protected zone.  Potentially this allows the TCP stacks or other
TCP related modules to utilize that for their own synchronization.

Large part of the change was done with sed script:

s/tp->ccv->/tp->t_ccv./g
s/tp->ccv/\&tp->t_ccv/g
s/tp->cc_algo/tp->t_cc/g
s/tp->t_timers->tt_/tp->tt_/g
s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g

Dependency side effect is that code that needs to know struct tcpcb
should also know struct inpcb, that added several <netinet/in_pcb.h>.

Differential revision:	https://reviews.freebsd.org/D37127
2022-12-07 09:00:48 -08:00
Gleb Smirnoff
9eb0e8326d tcp: provide macros to access inpcb and socket from a tcpcb
There should be no functional changes with this commit.

Reviewed by:		rscheff
Differential revision:	https://reviews.freebsd.org/D37123
2022-11-08 10:24:40 -08:00
Richard Scheffenegger
dc9daa04fb tcp: allow packets to be marked as ECT1 instead of ECT0
This adds the capability for a modular congestion control
to select which variant of ECN-capable-transport it wants to use
when sending out eligible segments. As an initial CC to utilize
this, DCTCP was selected.

Event:			IETF 115 Hackathon
Reviewed By:		tuexen, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D24869
2022-11-08 18:36:38 +01:00
Michael Tuexen
0fdc247274 tcp: make RACK loadable again using the default configuration
Without this patch, loading the RACK stack required the newreno
CC module to be compiled into the kernel. This is not the case
anymore since CUBIC is the default now.

Reviewed by:		rscheff@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D36707
2022-09-26 12:30:50 +02:00
Richard Scheffenegger
bb1d472d79 tcp: make CUBIC the default congestion control mechanism.
This changes the default TCP Congestion Control (CC) to CUBIC.
For small, transactional exchanges (e.g. web objects <15kB), this
will not have a material effect. However, for long duration data
transfers, CUBIC allocates a slightly higher fraction of the
available bandwidth, when competing against NewReno CC.

Reviewed By: tuexen, mav, #transport, guest-ccui, emaste
Relnotes: Yes
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36537
2022-09-13 12:09:21 +02:00
Richard Scheffenegger
ea6d0de299 tcp: Make all references to CUBIC uppercase
Consistently refer to the CUBIC congestion control
mechanism in uppercase throughout all comments.

No functional change.

Reviewed By: #transport, tuexen, mav, guest-ccui, emaste
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36547
2022-09-13 12:07:06 +02:00
Michael Tuexen
ccdfd621d0 tcp cc: don't recurse on non recursive mutex
This issue was found by syzkaller.

Reviewed by:		rrs
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D34743
2022-04-05 13:52:36 +02:00
Michael Tuexen
d4290f7e62 Revert "sctp: remove a test, which isn't safe"
It included unrelated changes still under review.
This reverts commit b1fe92b28b.
2022-04-02 14:49:14 +02:00
Michael Tuexen
b1fe92b28b sctp: remove a test, which isn't safe
We can't ensure the stcb is still around. This issue was found
by syzkaller.

MFC after:	3 days
2022-04-02 14:44:06 +02:00
Gordon Bergling
17628f1b79 cc_vegas(4): Fix a typo in a source code comment
- s/measurment/measurement/

MFC after:	3 days
2022-04-02 14:07:44 +02:00
Randall Stewart
e88412d89b Oops sorry, typo in the cc_cubic fix when morphing it from nreno.
2022-04-01 08:37:04 -04:00
Randall Stewart
653cf466f0 hystart++ may not properly exit CSS back to slowstart.
In the changes to get hystart++ into cubic an inadvertent line
was removed in the conditional to figure out if you need to exit
hystart++ back to slowstart. The line of course is the most crucial
one (the others are valid but not critical) i.e. is the new rtt
less than the point where we entered hystart++. Without the line
we end up bouncing in and out of CSS.

Reported By: Reese Enghardt
Sponsored By: Netflix Inc.
2022-04-01 08:33:44 -04:00
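The check this fix restores can be sketched as below: while in conservative slow start (CSS), a round whose minimum RTT falls below the RTT recorded on entry means the exit from slow start was premature, so the connection returns to slow start. Otherwise CSS ends after a fixed number of rounds. All names here are hypothetical, and the five-round limit is taken from the hystart++ commit message elsewhere in this log. This is not the cc_cubic or cc_newreno code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum hs_phase { HS_SLOW_START, HS_CSS, HS_CONG_AVOID };

/* Hypothetical per-connection hystart++ state. */
struct hs_state {
	enum hs_phase phase;
	uint32_t css_baseline_rtt;	/* Min RTT when CSS was entered. */
	unsigned css_rounds;		/* Rounds spent in CSS so far. */
};

#define HS_CSS_ROUNDS 5	/* Assumed CSS length in round trips. */

/* Call once per round trip with that round's minimum RTT. */
static void
hs_css_round_end(struct hs_state *hs, uint32_t round_min_rtt)
{
	assert(hs != NULL);
	if (hs->phase != HS_CSS)
		return;
	if (round_min_rtt < hs->css_baseline_rtt) {
		/* RTT recovered: the slow start exit was too early. */
		hs->phase = HS_SLOW_START;
		hs->css_rounds = 0;
		return;
	}
	if (++hs->css_rounds >= HS_CSS_ROUNDS)
		hs->phase = HS_CONG_AVOID;
}
```

Without the RTT comparison, the only remaining exits from CSS are unrelated to whether the RTT actually recovered, which matches the bouncing in and out of CSS described in the commit.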
Randall Stewart
ea9017fb25 tcp: Congestion control move to using reference counting.
In the transport call on 12/3 Gleb asked to move the CC modules towards
using reference counting to prevent folks from unloading a module in use.
It was also agreed that Michael would do a user space utility like tcp_drop
that could be used to move all connections that are using a specific CC
to some other CC.

This is the half I committed to doing, making it so that we maintain a refcount
on a cc module every time a pcb refers to it and decrementing that every
time a pcb no longer uses a cc module. This also helps us simplify the
whole unloading process by getting rid of tcp_ccunload() which munged
through all the tcb's. Instead we mark a module as being removed and
prevent further references to it. We also make sure that if a module is
marked as being removed it cannot be made the default and, conversely,
if it is the default the removal fails and the module is not marked as being
removed.

Reviewed by: Michael Tuexen, Gleb Smirnoff
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D33249
2022-02-21 06:30:17 -05:00
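The rules in this commit message can be sketched as a small state machine: a reference is taken whenever a pcb starts using a module and dropped when it stops, a module marked as being removed accepts no new references and cannot become the default, and the default module cannot be marked for removal. The struct, function names, and the choice of errno values other than EBUSY are assumptions for illustration. Locking is omitted, and this is not the code in cc.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one congestion control module. */
struct cc_module {
	unsigned refcount;	/* Number of pcbs using the module. */
	bool removing;		/* Module is marked as being removed. */
	bool is_default;	/* Module is the current default. */
};

/* A pcb starts using the module. */
static int
cc_acquire(struct cc_module *m)
{
	if (m->removing)
		return (ENOENT);	/* No new references once removing. */
	m->refcount++;
	return (0);
}

/* A pcb stops using the module. */
static void
cc_release(struct cc_module *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

static int
cc_set_default(struct cc_module *m)
{
	if (m->removing)
		return (ENOENT);
	m->is_default = true;
	return (0);
}

static int
cc_mark_removing(struct cc_module *m)
{
	if (m->is_default)
		return (EBUSY);		/* The default cannot be removed. */
	m->removing = true;
	return (0);
}

/* The module can go away once marked and no pcb refers to it. */
static bool
cc_can_unload(const struct cc_module *m)
{
	return (m->removing && m->refcount == 0);
}
```

The benefit of this shape is that unloading no longer needs to walk every tcb. It only has to wait for the count to drain.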
Randall Stewart
a9696510f5 tcp: Add hystart++ to our cubic implementation.
As promised to the transport call on 11/4/22 here is an implementation
of hystart++ for cubic. It also cleans up the tcp_congestion function
to have a better name. Common variables are moved into the general
cc.h structure so that both cubic and newreno can use them for
hystart++

Reviewed by: Michael Tuexen, Richard Scheffenegger
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D33035
2022-02-07 06:37:46 -05:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Randall Stewart
dcf2dfed26 tcp: unloading a module that is set to default should error.
I just discovered that the return of the EBUSY error was incorrectly
rigged so that you could unload a CC module that was set to default.
It's supposed to be an EBUSY error. Make it so.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D33229
2021-12-02 06:12:16 -05:00
Gordon Bergling
b4fbc855a5 cc_newreno(4): Fix a typo in a source code comment
- s/conditons/conditions/

MFC after:	3 days
2021-11-19 19:16:02 +01:00
Mark Johnston
034a924009 tcp: Ensure that vnets have an initialized V_default_cc_ptr
This causes new vnets to inherit the cc algorithm from vnet0. This is a
temporary patch to fix vnet jail creation.

With encouragement from: glebius
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Differential Revision: https://reviews.freebsd.org/D32970
2021-11-12 12:18:12 -07:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overridden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Reviewed by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Let's put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll their own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"'
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Michael Tuexen
b15b053596 tcp: allow new reno functions to be called from other CC modules
Some new reno functions use the internal data, but are also called
from functions of other CC modules. Ensure that in this case, the
internal data is not accessed.

Reported by:		syzbot+1d219ea351caa5109d4b@syzkaller.appspotmail.com
Reported by:    	syzbot+b08144f8cad9c67258c5@syzkaller.appspotmail.com
Reviewed by:		rrs
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D32649
2021-10-25 22:53:49 +02:00
Randall Stewart
4e4c84f8d1 tcp: Add hystart-plus to cc_newreno and rack.
TCP Hystart draft version -03:
https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-hystartplusplus

This is a new version of hystart that allows one to carefully exit slow start if the RTT
spikes too much. The newer version has a slower slow start, so to speak, that then
kicks in for five round trips to see if you exited too early; if not, it moves into congestion avoidance.
This commit adds that feature to our newreno CC and adds the needed bits in rack to
be able to enable it.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D32373
2021-10-22 07:10:28 -04:00
Michael Tuexen
fa3746be42 tcp: fix two bugs in new reno
* Completely initialise the CC module specific data
* Use beta_ecn in case of an ECN event whenever ABE is enabled
  or it is requested by the stack.

Reviewed by:		rscheff, rrs
MFC after:		3 days
Sponsored by:		Netflix, Inc.
2021-06-11 15:40:34 +02:00
Richard Scheffenegger
c358f1857f tcp: Use local CC data only in the correct context
Most CC algos do use local data, and when calling
newreno_cong_signal from there, the latter misinterprets
the data as its own struct, leading to incorrect behavior.

Reported by:  chengc_netapp.com
Reviewed By:  chengc_netapp.com, tuexen, #transport
MFC after:    3 days
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D30470
2021-05-26 20:15:53 +02:00
Randall Stewart
5d8fd932e4 This brings into sync FreeBSD with the netflix versions of rack and bbr.
This fixes several breakages (panics) since the tcp_lro code was
committed that have been reported. Quite a few new features are
now in rack (perfecting of DGP -- Dynamic Goodput Pacing among the
largest). There is also support for ack-war prevention. Documents
coming soon on rack.

Sponsored by:           Netflix
Reviewed by:		rscheff, mtuexen
Differential Revision:	https://reviews.freebsd.org/D30036
2021-05-06 11:22:26 -04:00