opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-05-20 00:45:45 -04:00

Author	SHA1	Message	Date
Mark Johnston	668a555de6	rip: Add missing minimum length validation in rip_output() If the socket is configured such that the sender is expected to supply the IP header, then we need to verify that it actually did so. Reported by: syzkaller+KMSAN Reviewed by: donner Sponsored by: The FreeBSD Foundation (cherry picked from commit `ba21825202`)	2021-08-02 15:01:11 -04:00
Richard Scheffenegger	e4ee2a39ad	tcp: Add PRR cwnd reduction for non-SACK loss This completes PRR cwnd reduction in all circumstances for the base TCP stack (SACK loss recovery, ECN window reduction, non-SACK loss recovery), preventing the arriving ACKs to clock out new data at the old, too high rate. This reduces the chance to induce additional losses while recovering from loss (during congested network conditions). For non-SACK loss recovery, each ACK is assumed to have one MSS delivered. In order to prevent ACK-split attacks, only one window worth of ACKs is considered to actually have delivered new data. MFC after: 6 weeks Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29441 (cherry picked from commit `74d7fc8753`)	2021-08-02 13:59:23 +02:00
Kristof Provost	c3d03672e1	pf: syncookie support Import OpenBSD's syncookie support for pf. This feature help pf resist TCP SYN floods by only creating states once the remote host completes the TCP handshake rather than when the initial SYN packet is received. This is accomplished by using the initial sequence numbers to encode a cookie (hence the name) in the SYN+ACK response and verifying this on receipt of the client ACK. Reviewed by: kbowling Obtained from: OpenBSD MFC after: 1 week Sponsored by: Modirum MDPay Differential Revision: https://reviews.freebsd.org/D31138 (cherry picked from commit `8e1864ed07`)	2021-07-27 09:42:25 +02:00
Michael Tuexen	9b1219b24a	tcp: fix RACK and BBR when using VIMAGE enabled kernel Fix a bug in VNET handling, which occurs when using specific NICs. PR: 257195 Reviewed by: rrs Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D31212 (cherry picked from commit `a730d82378`)	2021-07-22 11:13:31 +02:00
Stefan Eßer	791035c8da	libalias: fix divide by zero causing panic The packet_limit can fall to 0, leading to a divide by zero abort in the "packets % packet_limit". An possible solution would be to apply a lower limit of 1 after the calculation of packet_limit, but since any number modulo 1 gives 0, the more efficient solution is to skip the modulo operation for packet_limit <= 1. Reported by: Karl Denninger <karl@denninger.net> (cherry picked from commit `58080fbca0`)	2021-07-14 13:49:21 +02:00
Andrew Gallatin	7751a6b585	tcp: fix alternate stack build with LINT-NO{INET,INET6,IP} When fixing another bug, I noticed that the alternate TCP stacks do not build when various combinations of ipv4 and ipv6 are disabled. Reviewed by: rrs, tuexen Differential Revision: https://reviews.freebsd.org/D31094 Sponsored by: Netflix (cherry picked from commit `b1e806c0ed`)	2021-07-13 22:00:50 +02:00
Randall Stewart	1bb521ab7d	tcp: Fix 32 bit platform breakage This fixes the incorrect use of a sysctl add to u64. It was for a useconds time, but on 32 bit platforms its not a u64. Instead use the long directive. Reviewed by: tuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D31107 (cherry picked from commit `7312e4e5cf`)	2021-07-13 21:59:50 +02:00
Randall Stewart	deb3c279d1	tcp: HPTS performance enhancements HPTS drives both rack and bbr, and yet there have been many complaints about performance. This bit of work restructures hpts to help reduce CPU overhead. It does this by now instead of relying on the timer/callout to drive it instead use user return from a system call as well as lro flushes to drive hpts. The timer becomes a backstop that dynamically adjusts based on how "late" we are. Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D31083 (cherry picked from commit `d7955cc0ff`)	2021-07-13 21:58:30 +02:00
Randall Stewart	3b2aeae726	tcp: Address goodput and TLP edge cases. There are several cases where we make a goodput measurement and we are running out of data when we decide to make the measurement. In reality we should not make such a measurement if there is no chance we can have "enough" data. There is also some corner case TLP's that end up not registering as a TLP like they should, we fix this by pushing the doing_tlp setup to the actual timeout that knows it did a TLP. This makes it so we always have the appropriate flag on the sendmap indicating a TLP being done as well as count correctly so we make no more that two TLP's. In addressing the goodput lets also add a "quality" metric that can be viewed via blackbox logs so that a casual observer does not have to figure out how good of a measurement it is. This is needed due to the fact that we may still make a measurement that is of a poorer quality as we run out of data but still have a minimal amount of data to make a measurement. Reviewed by: tuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D31076 (cherry picked from commit `e834f9a44a`)	2021-07-13 21:57:20 +02:00
Randall Stewart	2e1fdc728b	tcp: Preparation for allowing hardware TLS to be able to kick a tcp connection that is retransmitting too much out of hardware and back to software. Hardware TLS is now supported in some interface cards and it works well. Except that when we have connections that retransmit a lot we get into trouble with all the retransmits. This prep step makes way for change that Drew will be making so that we can "kick out" a session from hardware TLS. Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30895 (cherry picked from commit `9e4d9e4c4d`)	2021-07-13 21:56:06 +02:00
Andrew Gallatin	1843b06dee	tcp: enter network epoch when calling tfb_tcp_fb_fini We need to enter the network epoch when calling into tfb_tcp_fb_fini. I noticed this when I hit an assert running the latest rack Differential Revision: https://reviews.freebsd.org/D30407 Reviewed by: rrs, tuexen Sponsored by: Netflix (cherry picked from commit `086a35562f`)	2021-07-13 21:54:43 +02:00
Randall Stewart	648c68168c	tcp: Rack not being very friendly with V6:4 socket and having a connection from V4 There were two bugs that prevented V4 sockets from connecting to a rack server running a V4/V6 socket. As well as a bug that stops the mapped v4 in V6 address from working. Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30885 PR: 256657 (cherry picked from commit `66aec14a53`)	2021-07-13 20:51:55 +02:00
Michael Tuexen	a3f96f1e57	sctp: Fix errno in case of association setup failures Do not report always ETIMEDOUT, but only when appropriate. In other cases report ECONNABORTED. (cherry picked from commit `105b68b42d`)	2021-07-13 20:30:57 +02:00
Michael Tuexen	9e8d5ea2b5	sctp: provide consistent stream information in case of early errors While there, make sure the function is called correctly. (cherry picked from commit `ce64352a70`)	2021-07-13 20:30:21 +02:00
Michael Tuexen	7f21500f49	sctp: provide sac_error also for ABORT chunk being sent Thanks to Florent Castelli for bringing this issue up for the userland stack and providing an initial patch. (cherry picked from commit `84992a3251`)	2021-07-13 20:29:33 +02:00
Michael Tuexen	6a4f29a3c4	sctp: initialize sequence numbers for ECN correctly Reported by: Junseok Yang (for the userland stack) (cherry picked from commit `c7f048ab35`)	2021-07-13 20:28:48 +02:00
Michael Tuexen	24df96b642	sctp: Fix length check for ECNE chunks (cherry picked from commit `6587a2bd1e`)	2021-07-13 20:27:58 +02:00
Michael Tuexen	c5ba872129	tcp: tolerate missing timestamps Some TCP stacks negotiate TS support, but do not send TS at all or not for keep-alive segments. Since this includes modern widely deployed stacks, tolerate the violation of RFC 7323 per default. Reviewed by: rgrimes, rrs, rscheff Differential Revision: https://reviews.freebsd.org/D30740 Sponsored by: Netflix, Inc. (cherry picked from commit `870af3f4dc`)	2021-07-13 20:24:09 +02:00
Lutz Donnerhacke	3a96a25da8	libalias: Switch to SPLAY trees Current data structure is using a hash of unordered lists. Those unordered lists are quite efficient, because the least recently inserted entries are most likely to be used again. In order to avoid long search times in other cases, the lists are hashed into many buckets. Unfortunatly a search for a miss needs an exhaustive inspection and a careful definition of the hash. Splay trees offer a similar feature - almost O(1) for access of the least recently used entries), and amortized O(ln(n) - for almost all other cases. Get rid of the hash. Now the data structure should able to quickly react to external packets without eating CPU cycles for breakfast, preventing a DoS. PR: 192888 Discussed with: Dimitry Luhtionov Differential Revision: https://reviews.freebsd.org/D30516 Differential Revision: https://reviews.freebsd.org/D30536 Differential Revision: https://reviews.freebsd.org/D30844 (cherry picked from commit `935fc93af1`) (cherry picked from commit `d261e57dea`) (cherry picked from commit `f70c98a2f5`) (cherry picked from commit `25392fac94`) (cherry picked from commit `2f4d91f9cb`) (cherry picked from commit `4060e77f49`)	2021-07-06 08:55:53 +02:00
Lutz Donnerhacke	78d515b222	libalias: Restructure Clean up the database handling in order to switch to more efficient data structures. The development of this patch was artificially split in to many small steps to ease reviewing. - Common search terms - Separate fully qualified search - Separate table for partial links - Cleanup _FindLinkIn - Factor out the outgoing search function - Factor out a common idiom to return found links - Reorder incoming links by grouping of common search terms - Remove LSNAT from outgoing search - Group internal structure semantically - Separate table for PPTP - Use AliasRange instead of PORT_BASE - Remove temporary state deleteAllLinks from global struct - Avoid uninitialized expiration Discussed with: Dimitry Luhtionov Differential Revision: https://reviews.freebsd.org/D30568 Differential Revision: https://reviews.freebsd.org/D30569 Differential Revision: https://reviews.freebsd.org/D30570 Differential Revision: https://reviews.freebsd.org/D30571 Differential Revision: https://reviews.freebsd.org/D30572 Differential Revision: https://reviews.freebsd.org/D30573 Differential Revision: https://reviews.freebsd.org/D30574 Differential Revision: https://reviews.freebsd.org/D30575 Differential Revision: https://reviews.freebsd.org/D30580 Differential Revision: https://reviews.freebsd.org/D30581 Differential Revision: https://reviews.freebsd.org/D30604 Differential Revision: https://reviews.freebsd.org/D30582 (cherry picked from commit `d41044ddfd`) (cherry picked from commit `32f9c2ceb3`) (cherry picked from commit `cac129e603`) (cherry picked from commit `19dcc4f225`) (cherry picked from commit `d541903438`) (cherry picked from commit `d4ab07d2ae`) (cherry picked from commit `492d3b7109`) (cherry picked from commit `7b44ff4c52`) (cherry picked from commit `1178dda53d`) (cherry picked from commit `9efcad61d8`) (cherry picked from commit `fe83900f9f`) (cherry picked from commit `d989935b5b`) (cherry picked from commit `b50a4dce18`) (cherry picked from commit `f284553444`)	2021-07-06 08:55:53 +02:00
Lutz Donnerhacke	390866d47e	libalias: Promote per instance global variable timeStamp Summary: - Use LibAliasTime as a real global variable for central timekeeping. - Reduce number of syscalls in user space considerably. - Dynamically adjust the packet counters to match the second resolution. - Only check the first few packets after a time increase for expiry. Discussed with: hselasky Differential Revision: https://reviews.freebsd.org/D30566 (cherry picked from commit `ef828d39be`)	2021-07-06 08:55:53 +02:00
Lutz Donnerhacke	69965155a5	libalias: Stats are unsigned Stats counters are used as unsigned valued (i.e. printf("%u")) but are defined as signed int. This causes trouble later, so fix it early. Differential Revision: https://reviews.freebsd.org/D30587 (cherry picked from commit `3fd20a79e7`)	2021-07-06 08:55:52 +02:00
Lutz Donnerhacke	3423d44cd1	libalias: tidy up housekeeping Replace current expensive, but sparsly called housekeeping by a single, repetive action. This is part of a larger restructure of libalias in order to switch to more efficient data structures. The whole restructure process is split into 15 reviews to ease reviewing. All those steps will be squashed into a single commit for MFC in order to hide the intermediate states from production systems. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D30277 (cherry picked from commit `294799c6b0`)	2021-07-06 08:55:52 +02:00
Mark Johnston	d77e57f125	Consistently use the SOCKBUF_MTX() and SOCK_MTX() macros This makes it easier to change the socket locking protocols. No functional change intended. Sponsored by: The FreeBSD Foundation (cherry picked from commit `a100217489`)	2021-06-21 09:14:48 -04:00
Mark Johnston	46d8116cae	Consistently use the SOLISTENING() macro Some code was using it already, but in many places we were testing SO_ACCEPTCONN directly. As a small step towards fixing some bugs involving synchronization with listen(2), make the kernel consistently use SOLISTENING(). No functional change intended. Sponsored by: The FreeBSD Foundation (cherry picked from commit `f4bb1869dd`)	2021-06-21 09:14:40 -04:00
Marko Zec	4715d948c5	Introduce DXR as an IPv4 longest prefix matching / FIB module DXR maintains compressed lookup structures with a trivial search procedure. A two-stage trie is indexed by the more significant bits of the search key (IPv4 address), while the remaining bits are used for finding the next hop in a sorted array. The tradeoff between memory footprint and search speed depends on the split between the trie and the remaining binary search. The default of 20 bits of the key being used for trie indexing yields good performance (see below) with footprints of around 2.5 Bytes per prefix with current BGP snapshots. Rebuilding lookup structures takes some time, which is compensated for by batching several RIB change requests into a single FIB update, i.e. FIB synchronization with the RIB may be delayed for a fraction of a second. RIB to FIB synchronization, next-hop table housekeeping, and lockless lookup capability is provided by the FIB_ALGO infrastructure. DXR works well on modern CPUs with several MBytes of caches, especially in VMs, where is outperforms other currently available IPv4 FIB algorithms by a large margin. Reviewed by: melifaro MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D29821 (cherry picked from commit `2aca58e16f`)	2021-06-17 12:07:05 +02:00
Zhenlei Huang	7da8312f7b	Do not forward datagrams originated by link-local addresses The current implement of ip_input() reject packets destined for 169.254.0.0/16, but not those original from 169.254.0.0/16 link-local addresses. Fix to fully respect RFC 3927 section 2.7. PR: 255388 Reviewed by: donner, rgrimes, karels Differential Revision: https://reviews.freebsd.org/D29968 Reviewed by: rgrimes, donner, karels, marcus, emaste Differential Revision: https://reviews.freebsd.org/D30374 (cherry picked from commit `3d846e4822`) (cherry picked from commit `03b0505b8f`)	2021-06-17 10:08:59 +02:00
Randall Stewart	d0eaf95edc	tcp: Missing mfree in rack and bbr Recently (Nov) we added logic that protects against a peer negotiating a timestamp, and then not including a timestamp. This involved in the input path doing a goto done_with_input label. Now I suspect the code was cribbed from one in Rack that has to do with the SYN. This had a bug, i.e. it should have a m_freem(m) before going to the label (bbr had this missing m_freem() but rack did not). This then caused the missing m_freem to show up in both BBR and Rack. Also looking at the code referencing m->m_pkthdr.lro_nsegs later (after processing) is not a good idea, even though its only for logging. Best to copy that off before any frees can take place. Reviewed by: mtuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30727 (cherry picked from commit `ba1b3e48f5`)	2021-06-14 23:00:17 +02:00
Randall Stewart	8ecbecdcfd	tcp: Mbuf leak while holding a socket buffer lock. When running at NF the current Rack and BBR changes with the recent commits from Richard that cause the socket buffer lock to be held over the ip_output() call and then finally culminating in a call to tcp_handle_wakeup() we get a lot of leaked mbufs. I don't think that this leak is actually caused by holding the lock or what Richard has done, but is exposing some other bug that has probably been lying dormant for a long time. I will continue to look (using his changes) at what is going on to try to root cause out the issue. In the meantime I can't leave the leaks out for everyone else. So this commit will revert all of Richards changes and move both Rack and BBR back to just doing the old sorwakeup_locked() calls after messing with the so_rcv buffer. We may want to look at adding back in Richards changes after I have pinpointed the root cause of the mbuf leak and fixed it. Reviewed by: mtuexen,rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30704 (cherry picked from commit `67e892819b`)	2021-06-14 22:51:42 +02:00
Randall Stewart	2071c3fb0d	tcp: LRO timestamps have lost their previous precision Recently we had a rewrite to tcp_lro.c that was tested but one subtle change was the move to a less precise timestamp. This causes all kinds of chaos in tcp's that do pacing and needs to be fixed to use the more precise time that was there before. Reviewed by: mtuexen, gallatin, hselasky Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30695 (cherry picked from commit `b45daaea95`)	2021-06-14 22:49:27 +02:00
Michael Tuexen	82f75079f1	tcp: fix two bugs in new reno * Completely initialise the CC module specific data * Use beta_ecn in case of an ECN event whenever ABE is enabled or it is requested by the stack. Reviewed by: rscheff, rrs Sponsored by: Netflix, Inc. (cherry picked from commit `fa3746be42`)	2021-06-14 01:29:14 +02:00
Michael Tuexen	fce16041a8	tcp: remove debug output from RACK Reported by: iron.udjin@gmail.com, Marek Zarychta Reviewed by: rrs PR: 256538 Differential Revision: https://reviews.freebsd.org/D30723 Sponsored by: Netflix, Inc. (cherry picked from commit `f1536bb538`)	2021-06-14 01:28:19 +02:00
Michael Tuexen	7a2030a106	tcp: fix compilation of IPv4-only builds PR: 256538 Reported by: iron.udjin@gmail.com Sponsored by: Netflix, Inc. (cherry picked from commit `224cf7b35b`)	2021-06-14 01:27:17 +02:00
Hans Petter Selasky	ca81bcbbf1	Add missing chunks after cherry-picking `9ca874cf74` from main to stable/13: Add TCP LRO support for VLAN and VxLAN. Make sure all counters are allocated. This is a direct commit. Reported by: Herbert J. Skuhra <herbert@gojira.at> Sponsored by: Mellanox Technologies // NVIDIA Networking	2021-06-11 15:51:05 +02:00
Michael Tuexen	71b88ee39f	tcp: fix a RACK socket buffer lock issue Fix a missing socket buffer unlocking of the socket receive buffer. Reviewed by: gallatin, rrs MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D30402 (cherry picked from commit `9bbd1a8fcb`)	2021-06-09 12:51:50 +02:00
Richard Scheffenegger	0230e6cf56	rack: honor prior socket buffer lock when doing the upcall While partially reverting D24237 with D29690, due to introducing some unintended effects for in-kernel TCP consumers, the preexisting lock on the socket send buffer was not considered properly. Found by: markj MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D30390 (cherry picked from commit `3975688563`)	2021-06-09 12:51:36 +02:00
Richard Scheffenegger	55cc0a4785	[tcp] Keep socket buffer locked until upcall r367492 would unlock the socket buffer before eventually calling the upcall. This leads to problematic interaction with NFS kernel server/client components (MP threads) accessing the socket buffer with potentially not correctly updated state. Reported by: rmacklem Reviewed By: tuexen, #transport Tested by: rmacklem, otis MFC after: 2 weeks Sponsored By: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29690 (cherry picked from commit `032bf749fd`)	2021-06-09 12:51:19 +02:00
Randall Stewart	fc53b7269f	tcp: A better fix for the previously attempted fix of the ack-war issue with tcp. So it turns out that my fix before was not correct. It ended with us failing some of the "improved" SYN tests, since we are not in the correct states. With more digging I have figured out the root of the problem is that when we receive a SYN\|FIN the reassembly code made it so we create a segq entry to hold the FIN. In the established state where we were not in order this would be correct i.e. a 0 len with a FIN would need to be accepted. But if you are in a front state we need to strip the FIN so we correctly handle the ACK but ignore the FIN. This gets us into the proper states and avoids the previous ack war. I back out some of the previous changes but then add a new change here in tcp_reass() that fixes the root cause of the issue. We still leave the rack panic fixes in place however. Reviewed by: mtuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30627 (cherry picked from commit `4747500dea`)	2021-06-09 02:19:47 +02:00
Randall Stewart	eb91abb4ba	tcp: When we have an out-of-order FIN we do want to strip off the FIN bit. The last set of commits fixed both a panic (in rack) and an ACK-war (in freebsd and bbr). However there was a missing case, i.e. where we get an out-of-order FIN by itself. In such a case we don't want to leave the FIN bit set, otherwise we will do the wrong thing and ack the FIN incorrectly. Instead we need to go through the tcp_reasm() code and that way the FIN will be stripped and all will be well. Reviewed by: mtuexen,rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30497 (cherry picked from commit `8c69d988a8`)	2021-06-09 02:18:36 +02:00
Randall Stewart	362f95f528	tcp: Add a socket option to rack so we can test various changes to the slop value in timers. Timer_slop, in TCP, has been 200ms for a long time. This value dates back a long time when delayed ack timers were longer and links were slower. A 200ms timer slop allows 1 MSS to be sent over a 60kbps link. Its possible that lowering this value to something more in line with todays delayed ack values (40ms) might improve TCP. This bit of code makes it so rack can, via a socket option, adjust the timer slop. Reviewed by: mtuexen Sponsered by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30249 (cherry picked from commit `4f3addd94b`)	2021-06-09 02:16:00 +02:00
Randall Stewart	12e181b672	tcp: Fix bugs related to the PUSH bit and rack and an ack war Michaels testing with UDP tunneling found an issue with the push bit, which was only partly fixed in the last commit. The problem is the left edge gets transmitted before the adjustments are done to the send_map, this means that right edge bits must be considered to be added only if the entire RSM is being retransmitted. Now syzkaller also continued to find a crash, which Michael sent me the reproducer for. Turns out that the reproducer on default (freebsd) stack made the stack get into an ack-war with itself. After fixing the reference issues in rack the same ack-war was found in rack (and bbr). Basically what happens is we go into the reassembly code and lose the FIN bit. The trick here is we should not be going into the reassembly code if tlen == 0 i.e. the peer never sent you anything. That then gets the proper action on the FIN bit but then you end up in LAST_ACK with no timers running. This is because the usrclosed function gets called and the FIN's and such have already been exchanged. So when we should be entering FIN_WAIT2 (or even FIN_WAIT1) we get stuck in LAST_ACK. Fixing this means tweaking the usrclosed function so that we properly recognize the condition and drop into FIN_WAIT2 where a timer will allow at least TP_MAXIDLE before closing (to allow time for the peer to retransmit its FIN if the ack is lost). Setting the fast_finwait2 timer can speed this up in testing. Reviewed by: mtuexen,rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30451 (cherry picked from commit `13c0e198ca`)	2021-06-09 02:13:32 +02:00
Randall Stewart	e99fa57b98	tcp: Fix an issue with the PUSH bit as well as fill in the missing mtu change for fsb's The push bit itself was also not actually being properly moved to the right edge. The FIN bit was incorrectly on the left edge. We fix these two issues as well as plumb in the mtu_change for alternate stacks. Reviewed by: mtuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30413 (cherry picked from commit `631449d5d0`)	2021-06-09 02:12:21 +02:00
Michael Tuexen	6264ff9bd9	tcp: Handle stack switch while processing socket options Handle the case where during socket option processing, the user switches a stack such that processing the stack specific socket option does not make sense anymore. Return an error in this case. Reviewed by: markj Reported by: syzbot+a6e1d91f240ad5d72cd1@syzkaller.appspotmail.com Sponsored by: Netflix, Inc. Differential revision: https://reviews.freebsd.org/D30395 (cherry picked from commit `8923ce6304`)	2021-06-09 02:09:37 +02:00
Michael Tuexen	56aeedd2fd	tcp: Fix sending of TCP segments with IP level options When bringing in TCP over UDP support in https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605, the length of IP level options was considered when locating the transport header. This was incorrect and is fixed by this patch. X-MFC with: https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605 Reviewed by: markj, rscheff Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D30358 (cherry picked from commit `500eb6dd80`)	2021-06-09 02:06:26 +02:00
Randall Stewart	3a3bba7df5	tcp: Incorrect KASSERT causes a panic in rack Skyzall found an interesting panic in rack. When a SYN and FIN are both sent together a KASSERT gets tripped where it is validating that a mbuf pointer is in the sendmap. But a SYN and FIN often will not have a mbuf pointer. So the fix is two fold a) make sure that the SYN and FIN split the right way when cloning an RSM SYN on left edge and FIN on right. And also make sure the KASSERT properly accounts for the case that we have a SYN or FIN so we don't panic. Reviewed by: mtuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D30241 (cherry picked from commit `02cffbc250`)	2021-06-09 02:05:15 +02:00
Michael Tuexen	6170c93c03	tcp rack: improve initialisation of retransmit timeout When the TCP is in the front states, don't take the slop variable into account. This improves consistency with the base stack. Reviewed by: rrs@ Differential Revision: https://reviews.freebsd.org/D30230 Sponsored by: Netflix, Inc. (cherry picked from commit `251842c639`)	2021-06-09 02:02:39 +02:00
Randall Stewart	ecfc25f05b	tcp: In rack, we must only convert restored rtt when the hostcache does restore them. Rack now after the previous commit is very careful to translate any value in the hostcache for srtt/rttvar into its proper format. However there is a snafu here in that if tp->srtt is 0 is the only time that the HC will actually restore the srtt. We need to then only convert the srtt restored when it is actually restored. We do this by making sure it was zero before the call to cc_conn_init and it is non-zero afterwards. Reviewed by: Michael Tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30213 (cherry picked from commit `4b86a24a76`)	2021-06-09 02:01:32 +02:00
Randall Stewart	87cf5dcc33	tcp:Host cache and rack ending up with incorrect values. The hostcache up to now as been updated in the discard callback but without checking if we are all done (the race where there are more than one calls and the counter has not yet reached zero). This means that when the race occurs, we end up calling the hc_upate more than once. Also alternate stacks can keep there srtt/rttvar in different formats (example rack keeps its values in microseconds). Since we call the hc_update before the stack fini() then the values will be in the wrong format. Rack on the other hand, needs to convert items pulled from the hostcache into its internal format else it may end up with very much incorrect values from the hostcache. In the process lets commonize the update mechanism for srtt/rttvar since we now have more than one place that needs to call it. Reviewed by: Michael Tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30172 (cherry picked from commit `9867224bab`)	2021-06-09 02:00:27 +02:00
Randall Stewart	4651125ac6	This takes Warners suggested approach to making it so that platforms that for whatever reason cannot include the RATELIMIT option can still work with rack. It adds two dummy functions that rack will call and find out that the highest hw supported b/w is 0 (which kinda makes sense and rack is already prepared to handle). Reviewed by: Michael Tuexen, Warner Losh Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30163 (cherry picked from commit `5a4333a537`)	2021-06-09 01:59:21 +02:00
Randall Stewart	f5d0badc70	Fix a UDP tunneling issue with rack. Basically there are two issues. A) Not enough hdrlen was being calculated when a UDP tunnel is in place. and B) Not enough memory is allocated in racks fsb. We need to overbook the fsb to include a udphdr just in case. Submitted by: Peter Lei Reviewed by: Michael Tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D30157 (cherry picked from commit `a16cee0218`)	2021-06-09 01:58:06 +02:00

1 2 3 4 5 ...

6928 commits