opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-04-29 18:32:49 -04:00

Author	SHA1	Message	Date
Michael Tuexen	b7e08865e8	Unbreak INET-less build. Reported by bz@ MFC after: 1 week	2011-05-18 19:49:39 +00:00
Michael Tuexen	4f36da915f	Copy out the mtu when calling getsockopt() with SCTP_GET_PEER_ADDR_INFO. MFC after: 1 week.	2011-05-17 15:57:31 +00:00
Michael Tuexen	c954cac48b	Fix whitespacing. Reported by scf@ MFC after: 1 week.	2011-05-17 15:46:28 +00:00
Michael Tuexen	96f4bcfff2	Fix the source address selection for boundall sockets when sending INITs to a global IPv4 address having only private IPv4 address. Allow the usage of a private address and make sure that no other private address will be used by the association. Initial work was done by rrs@. MFC after: 1 week.	2011-05-14 18:22:14 +00:00
John Baldwin	5891ebd6cd	Oops, fix order of sequence numbers in KASSERT()'s to catch negative receive windows to match the labels in the panic message. Submitted by: trociny	2011-05-14 14:41:40 +00:00
Alexander Motin	bc7d18ae72	Refactor TCP ISN increment logic. Instead of firing callout at 100Hz to keep constant ISN growth rate, do the same directly inside tcp_new_isn(), taking into account how much time (ticks) passed since the last call. On my test systems this decreases idle interrupt rate from 140Hz to 70Hz.	2011-05-09 07:37:47 +00:00
Michael Tuexen	689e6a5fa3	Fix a locking issue showing up on Mac OS X when subscribing to authentication events. DTLS/SCTP renegotiations trigger the bug. MFC after: 2 weeks.	2011-05-08 09:11:59 +00:00
Michael Tuexen	936fc35bb3	Change the name of an internal structure, since the name is used by a structure of the (new) SCTP API. MFC after: 1 week.	2011-05-06 20:40:33 +00:00
Andrey V. Elsukov	318b735cc3	Convert delay parameter back to ms when reporting to user. PR: 156838 MFC after: 1 week	2011-05-06 07:13:34 +00:00
Michael Tuexen	c3d72c80d3	Implement Resource Pooling V2 and an MPTCP like congestion control. Based on a patch received from Martin Becke. MFC after: 2 weeks.	2011-05-04 21:27:05 +00:00
Michael Tuexen	274b0bd51d	Remove code without any effect.	2011-05-03 20:34:02 +00:00
Michael Tuexen	1d663b4658	Add a missing break. This bug was introduced in r221249. MFC after: 1 week	2011-05-03 20:32:21 +00:00
John Baldwin	f701e30d7f	Handle a rare edge case with nearly full TCP receive buffers. If a TCP buffer fills up causing the remote sender to enter into persist mode, but there is still room available in the receive buffer when a window probe arrives (either due to window scaling, or due to the local application very slowly draining data from the receive buffer), then the single byte of data in the window probe is accepted. However, this can cause rcv_nxt to be greater than rcv_adv. This condition will only last until the next ACK packet is pushed out via tcp_output(), and since the previous ACK advertised a zero window, the ACK should be pushed out while the TCP pcb is write-locked. During the window while rcv_nxt is greater than rcv_adv, a few places would compute the remaining receive window via rcv_adv - rcv_nxt. However, this value was then (uint32_t)-1. On a 64 bit machine this could expand to a positive 2^32 - 1 when cast to a long. In particular, when calculating the receive window in tcp_output(), the result would be that the receive window was computed as 2^32 - 1 resulting in advertising a far larger window to the remote peer than actually existed. Fix various places that compute the remaining receive window to either assert that it is not negative (i.e. rcv_nxt <= rcv_adv), or treat the window as full if rcv_nxt is greater than rcv_adv. Reviewed by: bz MFC after: 1 month	2011-05-02 21:05:52 +00:00
Michael Tuexen	ea5eba1157	Some more cleanups related to a kernel without INET. MFC after: 1 week	2011-05-02 15:53:00 +00:00
Bjoern A. Zeeb	29bd2010d4	Fix a mismerge from p4 in that in_localaddr() is not available without INET. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 16:30:18 +00:00
Michael Tuexen	d085528d04	Remove some leftover debug code. MFC after: 1 week	2011-04-30 11:22:30 +00:00
Bjoern A. Zeeb	b287c6c70c	Make the TCP code compile without INET. Sort #includes and add #ifdef INETs. Add some comments at #endifs given more nestedness. To make the compiler happy, some default initializations were added in accordance with the style on the files. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 11:21:29 +00:00
Michael Tuexen	e6194c2ed4	Improve compilation of SCTP code without INET support. Some bugs were fixed while doing this: * ASCONF-ACK messages might use wrong port number when using IPv6. * Checking for additional addresses takes the correct address into account and also does not do more comparisons than necessary. This patch is based on one received from bz@ who was sponsored by The FreeBSD Foundation and iXsystems. MFC after: 1 week	2011-04-30 11:18:16 +00:00
Bjoern A. Zeeb	79288c112c	Make the UDP code compile without INET. Expose udp_usrreq.c to IPv6 only as well compiling out most functions adding or extending #ifdef INET coverage. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 11:17:00 +00:00
Bjoern A. Zeeb	67107f4594	Make the PCB code compile without INET support by adding #ifdef INETs and correcting few #includes. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 11:04:34 +00:00
John Baldwin	672dc4aea2	TCP reuses t_rxtshift to determine the backoff timer used for both the persist state and the retransmit timer. However, the code that implements "bad retransmit recovery" only checks t_rxtshift to see if an ACK has been received during the first retransmit timeout window. As a result, if ticks has wrapped over to a negative value and a socket is in the persist state, it can incorrectly treat an ACK from the remote peer as a "bad retransmit recovery" and restore saved values such as snd_ssthresh and snd_cwnd. However, if the socket has never had a retransmit timeout, then these saved values will be zero, so snd_ssthresh and snd_cwnd will be set to 0. If the socket is in fast recovery (this can be caused by excessive duplicate ACKs such as those fixed by 220794), then each ACK that arrives triggers either NewReno or SACK partial ACK handling which clamps snd_cwnd to be no larger than snd_ssthresh. In effect, the socket's send window is permanently stuck at 0 even though the remote peer is advertising a much larger window and pending data is only sent via TCP window probes (so one byte every few seconds). Fix this by adding a new TCP pcb flag (TF_PREVVALID) that indicates that the various snd_*_prev fields in the pcb are valid and only perform "bad retransmit recovery" if this flag is set in the pcb. The flag is set on the first retransmit timeout that occurs and is cleared on subsequent retransmit timeouts or when entering the persist state. Reviewed by: bz MFC after: 2 weeks	2011-04-29 15:40:12 +00:00
Bjoern A. Zeeb	b8e463e644	MfP4 CH=192029: Expose ip_icmp.c to INET6 as well and only export badport_bandlim() along with the two sysctls in the non-INET case. The bandlim types work for all cases I reviewed in IPv6 as well and the sysctls are available as we export net.inet.* from in_proto.c. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:36:35 +00:00
Bjoern A. Zeeb	74e9dcf786	MfP4 CH=192004: Move ip_defttl to raw_ip.c where it is actually used. In an IPv6 only world we do not want to compile ip_input.c in for that and it is a shared default with INET6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:32:27 +00:00
Bjoern A. Zeeb	a0ae8f04e8	Make various (pseudo) interfaces compile without INET in the kernel adding appropriate #ifdefs. For module builds the framework needs adjustments for at least carp. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:30:44 +00:00
Attilio Rao	2903309aca	Add the possibility to verify MD5 hash of incoming TCP packets. As long as this is a costly function, even when compiled in (along with the option TCP_SIGNATURE), it can be disabled via the net.inet.tcp.signature_verify_input sysctl. Sponsored by: Sandvine Incorporated Reviewed by: emaste, bz MFC after: 2 weeks	2011-04-25 17:13:40 +00:00
Bjoern A. Zeeb	acaeca65b3	Be less strict on includes than in r220746. We need in.h for both INET or INET6 as it holds all the IPPROTO_* definitions needed for the SYSCTL_NODE definitions. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 5 days	2011-04-25 16:36:16 +00:00
Gleb Smirnoff	acdef0460e	Use size_t for sopt_valsize. Submitted by: Brandon Gooch <jamesbrandongooch gmail.com>	2011-04-21 08:18:55 +00:00
Bjoern A. Zeeb	00c081e908	MFp4 CH=191760: When compiling out INET we still need the initialization routines as well as the tuning and monitoring sysctls shared with IPv6. Move the two send/recvspace variables up from the middle of the file to ease compiling out the INET only code. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 3 days	2011-04-20 08:03:22 +00:00
Bjoern A. Zeeb	aae49dd304	MFp4 CH=191470: Move the ipport_tick_callout and related functions from ip_input.c to in_pcb.c. The random source port allocation code has been merged and is now local to in_pcb.c only. Use a SYSINIT to get the callout started and no longer depend on initialization from the inet code, which would not work in an IPv6 only setup. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-20 08:00:29 +00:00
Bjoern A. Zeeb	ec4f97277f	MFp4 CH=191466: Move fw_one_pass to where it belongs: it is a property of ipfw, not of ip_input. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 3 days	2011-04-20 07:55:33 +00:00
Gleb Smirnoff	9d0a2ddf69	- Rewrite functions that copyin/out NAT configuration, so that they calculate required memory size dynamically. - Fix races on chain re-lock. - Introduce new field to ip_fw_chain - generation count. Now utilized only in the NAT configuration, but can be utilized wider in ipfw. - Get rid of NAT_BUF_LEN in ip_fw.h PR: kern/143653	2011-04-19 15:06:33 +00:00
Andrey V. Elsukov	e3665201f5	Add sysctl handlers for net.inet.ip.dummynet.hash_size, .pipe_byte_limit and .pipe_slot_limit oids to prevent setting incorrect values. MFC after: 2 weeks	2011-04-19 11:33:39 +00:00
Andrey V. Elsukov	8ad66025f6	The ipdn_bound_var() function is designed to bound a variable between a specified minimum and maximum. When the specified default value is out of bounds, it does not work as expected and does not limit the variable. Check that the default value is in range and limit it if needed. Also bump max_hash_size value to 65536 to correspond with manual page. PR: kern/152887 MFC after: 2 weeks	2011-04-19 11:29:09 +00:00
Andrey V. Elsukov	3ab4af737d	Use M_WAITOK instead of M_WAIT for malloc. Remove unneeded checks. MFC after: 1 week	2011-04-19 05:59:37 +00:00
Gleb Smirnoff	ca47294ddf	LibAliasInit() should allocate memory with M_WAITOK flag. Modify it and its callers.	2011-04-18 20:07:08 +00:00
Gleb Smirnoff	d0e16e0d1e	Pullup up to TCP header length before matching against 'tcpopts'. PR: kern/156180 Reviewed by: luigi	2011-04-18 18:22:10 +00:00
John Baldwin	da84b2e6c5	When checking to see if a window update should be sent to the remote peer, don't force a window update if the window would not actually grow due to window scaling. Specifically, if the window scaling factor is larger than 2 * MSS, then after the local reader has drained 2 * MSS bytes from the socket, a window update can end up advertising the same window. If this happens, the supposed window update actually ends up being a duplicate ACK. This can result in an excessive number of duplicate ACKs when using a higher maximum socket buffer size. Reviewed by: bz MFC after: 1 month	2011-04-18 17:43:16 +00:00
Bjoern A. Zeeb	336d023b2e	Make in_proto.c dependent on either inet or inet6. While it does not provide any functionality for IPv6, it provides the sysctl nodes for net.inet.* that a lot of functionality shared between IPv4 and IPv6 depends on. We cannot change these anymore without breaking a lot of management and tuning. In case of IPv6 only, we compile out everything but the sysctl node declarations. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC After: 5 days	2011-04-17 16:35:16 +00:00
Edward Tomasz Napierala	79bb84fb15	Refactor udp_input(), moving calls to u_tun_func() into udp_append(). Obtained from: Wheel Systems Sp. z o.o. Reviewed by: bz@	2011-04-14 10:40:57 +00:00
Bjoern A. Zeeb	05b9d121aa	The mbuf_frag_size always was and is file local and not queried from base user space tools via kvm. Mark it static. MFC after: 3 days	2011-04-14 09:47:09 +00:00
Sergey Kandaurov	6bed196c35	Staticize malloc types. Approved by: lstewart MFC after: 1 week	2011-04-13 11:28:46 +00:00
Andrey V. Elsukov	9974d151ec	Restore previous behaviour - always match the rule when we are doing tagging, even when the tag already exists. Reported by: Vadim Goncharov MFC after: 1 week	2011-04-12 15:20:34 +00:00
Lawrence Stewart	891b8ed467	Use the full and proper company name for Swinburne University of Technology throughout the source tree. Requested by: Grenville Armitage, Director of CAIA at Swinburne University of Technology MFC after: 3 days	2011-04-12 08:13:18 +00:00
Jack F Vogel	c31aa19c53	Port of the LRO fix from mxge driver to the generic LRO code. Thanks to Andrew Gallatin for the change. MFC after: 7 days	2011-04-07 21:20:26 +00:00
Andrey V. Elsukov	a5620cc6c5	Fill up src_port and dst_port variables for SCTP over IPv4. PR: kern/153415 MFC after: 1 week	2011-03-31 16:30:14 +00:00
Andrey V. Elsukov	5600c92750	Fix malloc types. MFC after: 1 week	2011-03-31 15:11:12 +00:00
Andrey V. Elsukov	3d10d64fd3	Fix a memory leak. Memory that is allocated for schedulers hash table was not freed. PR: kern/156083 MFC after: 1 week	2011-03-31 15:10:41 +00:00
John Baldwin	766282cbe7	Clamp the initial advertised receive window when responding to a SYN/ACK to the maximum allowed window. Growing the window too large would cause an underflow in the calculations in tcp_output() to decide if a window update should be sent which would prevent the persist timer from being started if data was pending and the other end of the connection advertised an initial window size of 0. PR: kern/154006 Submitted by: Stefan `Sec` Zehl sec 42 org Reviewed by: bz MFC after: 1 week	2011-03-30 12:35:39 +00:00
Weongyo Jeong	c45e1b3cad	Covers values if (BYTES_THIS_ACK(tp, th) / tp->t_maxseg) value is from 2.0 to 3.0. Reviewed by: lstewart	2011-03-28 19:03:56 +00:00
Sergey Kandaurov	79d514355c	Reference ifaddr object before unlocking as it can be freed from another context at the moment of later access. PR: kern/155555 Submitted by: Andrew Boyer <aboyer att averesystems.com> Approved by: avg (mentor) MFC after: 2 weeks	2011-03-21 14:19:40 +00:00
Jeff Roberson	e4cd31dd3c	- Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.	2011-03-21 09:40:01 +00:00
Bjoern A. Zeeb	4d457387fe	Properly check for an IPv4 socket after r219579. In some cases as udp6_connect() without an earlier bind(2) to an address, v4-mapped sockets allowed and a non mapped destination address, we can end up here with both v4 and v6 indicated: inp_vflag = (INP_IPV4|INP_IPV6|INP_IPV6PROTO) In that case however laddrp is NULL as the IPv6 path does not pass in a copy currently. Reported by: Pawel Worach (pawel.worach gmail.com) Tested by: Pawel Worach (pawel.worach gmail.com) MFC after: 6 days X-MFC with: r219579	2011-03-19 19:08:54 +00:00
Bjoern A. Zeeb	efc76f729a	Merge the two identical implementations for local port selections from in_pcbbind_setup() and in6_pcbsetport() in a single in_pcb_lport(). MFC after: 2 weeks	2011-03-12 21:46:37 +00:00
Randall Stewart	f79aab1866	Tunes and fixes the new DC-CC to seem to hit the right mix. Still may need some tweaks but it appears to almost not give away too much to an RFC2581 flow, but can really minimize the amount of buffers used in the net. MFC after: 3 months	2011-03-08 11:58:25 +00:00
Randall Stewart	48b6c64938	Adds a new Congestion Control that helps reduce the RTT that a flow will build up in buffers in transit. It is a slight modification to RFC2581 but is more friendly i.e. less aggressive. MFC after: 3 months	2011-03-01 00:37:46 +00:00
Dimitry Andric	cb8750c269	Fix breakage in sys/netinet/sctp_sysctl.c, introduced by r219057. If SCTP_HAS_RTTC is not defined, this file fails to compile. Insert the necessary #ifdefs to make it work. Pointy hat to: rrs	2011-02-26 22:45:40 +00:00
Randall Stewart	299108c5a2	Improvements to CC modules: 1) Add four new points that allow you to get more information to cc algo's 2) Fix the case where user changes module on an existing TCB, in such a case, the initialization module needs to be called on all nets. 3) Move htcp_cc structure to a union that other modules can use. 4) Add 5th point for get/set socket options for cc_module specific options MFC after: 2 months	2011-02-26 15:23:46 +00:00
Michael Tuexen	0191fb6de2	* Fix several bugs where the scaled versions of srtt and rttvar were used incorrectly. * Use appropriate variable names for RTO instead of RTT. MFC after: 3 months.	2011-02-24 22:58:15 +00:00
Michael Tuexen	be1d917696	* Cleanup the code computing the retransmission timeout. * Fix an initialization bug for the scaled variance of the RTO. MFC after: 3 months.	2011-02-24 22:36:40 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Michael Tuexen	f0878bdcc5	Bugfix: Get per vnet sysctl variables and statistics working. MFC after: 3 months.	2011-02-18 20:30:58 +00:00
Bjoern A. Zeeb	1fb51a12f2	Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks	2011-02-16 21:29:13 +00:00
Sergey Kandaurov	4fd8408ae7	Bump dummynet module version to meet dummynet schedulers' requirements, and thus unbreak loading dummynet.ko via /boot/loader.conf. Reported by: rihad <rihad att mail.ru> on freebsd-net Approved by: kib (mentor)	2011-02-16 15:43:35 +00:00
Randall Stewart	d69e7322cb	Fix a bug reported by Jonathan Leighton in his web-sctp testing at the Univ-of-Del. Basically when a 1-to-1 socket did a socket/bind/send(data)/close. If the timing was right we would dereference a socket that is NULL. MFC after: 1 month	2011-02-13 14:48:11 +00:00
Michael Tuexen	be2a6988a1	Fix several bugs related to stream scheduling. Obtained from: Robin Seggelmann MFC after: 3 months.	2011-02-13 13:53:28 +00:00
Daniel Eischen	9d22191d17	Oops, revert an accidental local change that got added in my last commit (r218627). No damage was done in the last commit, just some duplicated code was added (which is now removed).	2011-02-13 04:44:06 +00:00
Daniel Eischen	f7e6ce6d7a	Allow the SO_SETFIB socket option to select the default (0) routing table. Reviewed by: julian	2011-02-13 00:14:13 +00:00
Michael Tuexen	2678fe1ee9	Remove addresses from endpoint when there are no associations. This fixes a bug reported by brucec@. MFC after: 3 months.	2011-02-10 14:46:37 +00:00
Michael Tuexen	4c97400f86	Fix bugs related to M_FLOWID: * Store the flowid when receiving an SCTP/IPv6 packet. * Store the flowid when receiving an SCTP packet with wrong CRC. * Initialize flowid correctly. * Put test code under INVARIANTS. MFC after: 3 months.	2011-02-07 15:04:23 +00:00
Randall Stewart	f8140f7291	If not set (due to some error Michael is working on fixing) set it for the net. MFC after: 3 months	2011-02-07 08:12:24 +00:00
Randall Stewart	73403d4141	1) Track when flowid does get set. MFC after: 3 months	2011-02-07 08:10:29 +00:00
Randall Stewart	38521fb9b4	1) Use same scheme Michael and I discussed for a selected for a flowid 2) If flowid is not set, arrange so it is stored. 3) If flowid is set by lower layer, use it. MFC after: 3 Months	2011-02-06 13:17:40 +00:00
Luigi Rizzo	9b0456f075	correct the 'output_time' of packets generated by dummynet. In the dec.2009 rewrite I introduced a bug, using for the computation the arrival time instead of the time the packet has exited from the queue. The bandwidth computation was still correct because it is computed elsewhere, but traffic was sent out in bursts. The bug is also present in RELENG_8 after dec.2009 Thanks to Daikichi Osuga for investigating, finding and fixing the bug with detailed graphs of the behaviour before and after the fix. Submitted by: Daikichi Osuga MFC after: 2 weeks	2011-02-05 23:32:17 +00:00
Michael Tuexen	a4ae38f117	Add support for M_FLOWID.	2011-02-05 19:13:38 +00:00
Randall Stewart	5d40cf5d23	1) Typo correction in comments and one spacing change. 2) Mass update to all copyrights. MFC after: 3 Months	2011-02-05 12:12:51 +00:00
John Baldwin	d28b9e89a9	When turning off TCP_NOPUSH, only call tcp_output() to immediately flush any pending data if the connection is established. Submitted by: csjp Reviewed by: lstewart MFC after: 1 week	2011-02-04 14:13:15 +00:00
Randall Stewart	0071ee5ede	1) Fix cpu mapping per JB's suggestions 2) Fix it so INIT's don't always end up on CPU0 MFC after: 3 months	2011-02-04 13:50:30 +00:00
Rebecca Cran	492fddb2c4	Fix typo (Tuneable -> Tunable).	2011-02-04 12:03:48 +00:00
Michael Tuexen	252f7f93b0	Fix several bugs in the stream schedulers. From Robin Seggelmann. MFC after: 3 months.	2011-02-03 20:44:49 +00:00
Michael Tuexen	c446091b1e	Make sure that changing the ECN sysctl does not affect existing associations and endpoints. MFC after: 3 months.	2011-02-03 19:59:00 +00:00
Randall Stewart	dec0177df6	1) Move per John Baldwin to mp_maxid 2) Some signed/unsigned errors found by Mac OS compiler (from Michael) 3) a couple of copyright updates on the affected files. MFC after: 3 months	2011-02-03 19:22:21 +00:00
Randall Stewart	ae26e0a472	Fix the per CPU stats so that: 1) They don't use the giant "MAX_CPU" define and instead are allocated dynamically based on mp_ncpus 2) Will zero with the netstat -z -s -p sctp 3) Will be properly handled by both the sctp_init and finish (the multi-net stuff was incorrectly bzero'ing in sctp_init the wrong size.. the bzero is now moved to the right places). And of course the free is put in at the very end. MFC after: 3 Months	2011-02-03 11:52:22 +00:00
Randall Stewart	bfc46083b9	Adds an experimental option to create a pool of threads. These serve as input threads and are queued packets based on the V-tag number. This is similar to what a modern card can do with queue's for TCP... but alas modern cards know nothing about SCTP. MFC after: 3 months (maybe)	2011-02-03 10:05:30 +00:00
Randall Stewart	899288ae4b	1) Allow a chunk to track the cwnd it was at when sent. 2) Add separate max-bursts for retransmit and hb. These are set to sysctlable values but not settable via the socket api. This makes sure we don't blast out HB's or fast-retransmits. 3) Determine on the first data transmission on a net if its local-lan (by being under or over a RTT). This can later be used to think about different algorithms based on locallan vs big-i (experimental) 4) The cwnd should NOT be allowed to grow when an ECNEcho is seen (TCP has this same bug). We fix this in SCTP so an ECNe being seen prevents an advance of cwnd. 5) CWR's should not be sent multiple times to the same network, instead just updating the TSN being transmitted if needed. MFC after: 1 Month	2011-02-02 11:13:23 +00:00
Lawrence Stewart	03f0843bdb	Algorithm modules can define their own private congestion signal types in the top 8 bits of the 32 bit signal bit field space for internal use. These private signals should not be leaked outside of a module. Given that many algorithm modules use the NewReno hook functions to simplify their implementation, the obvious place such a leak would show up is in the NewReno cong_signal hook function. - Show the full number of significant bits in the signal type definitions in <netinet/cc.h>. - Add a bitmask to simplify figuring out if a given signal is in the private or public bit range. - Add a sanity check in newreno_cong_signal() to ensure private signals are not being leaked into the hook function. Sponsored by: FreeBSD Foundation Discussed with: David Hayes <dahayes at swin edu au> MFC after: 1 week X-MFC with: r215166	2011-02-01 13:32:27 +00:00
Lawrence Stewart	ec943febbb	Fix typo in comment: "course" -> "coarse" Sponsored by: FreeBSD Foundation Submitted by: jmallett MFC after: 3 months X-MFC with: r218152	2011-02-01 07:10:13 +00:00
Lawrence Stewart	0927e1a18b	Import an implementation of the CAIA-Hamilton-Delay (CHD) congestion control algorithm described in the paper "Improved coexistence and loss tolerance for delay based TCP congestion control" by Hayes and Armitage. It is implemented as a kernel module compatible with the recently committed modular congestion control framework. CHD enhances the approach taken by the Hamilton-Delay (HD) algorithm to provide tolerance to non-congestion related packet loss and improvements to coexistence with loss-based congestion control algorithms. A key idea in improving coexistence with loss-based congestion control algorithms is the use of a shadow window, which attempts to track how NewReno's congestion window (cwnd) would evolve. At the next packet loss congestion event, CHD uses the shadow window to correct cwnd in a way that reduces the amount of unfairness CHD experiences when competing with loss-based algorithms. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: FreeBSD Foundation Reviewed by: bz and others along the way MFC after: 3 months	2011-02-01 07:05:14 +00:00
Lawrence Stewart	ac230a79e1	Import a clean-room implementation of the Hamilton-Delay (HD) congestion control algorithm based on the paper "A strategy for fair coexistence of loss and delay-based congestion control algorithms" by Budzisz, Stanojevic, Shorten and Baker. It is implemented as a kernel module compatible with the recently committed modular congestion control framework. HD uses a probabilistic approach to reacting to delay-based congestion. The probability of reducing cwnd is zero when the queuing delay is very small, increasing to a maximum at a set threshold, then back down to zero again when the queuing delay is high. Normal operation keeps the queuing delay below the set threshold. However, since loss-based congestion control algorithms push the queuing delay high when probing for bandwidth, having the probability of reducing cwnd drop back to zero for high delays allows HD to compete with loss-based algorithms. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: FreeBSD Foundation Reviewed by: bz and others along the way MFC after: 3 months	2011-02-01 06:42:46 +00:00
Lawrence Stewart	1d4ed791d0	Import a clean-room implementation of the VEGAS congestion control algorithm based on the paper "TCP Vegas: end to end congestion avoidance on a global internet" by Brakmo and Peterson. It is implemented as a kernel module compatible with the recently committed modular congestion control framework. VEGAS uses network delay as a congestion indicator and unlike regular loss-based algorithms, attempts to keep the network operating with stable queuing delays and no congestion losses. By keeping network buffers used along the path within a set range, queuing delays are kept low while maintaining high throughput. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: FreeBSD Foundation Reviewed by: bz and others along the way MFC after: 3 months	2011-02-01 06:17:00 +00:00
Randall Stewart	493d8e5a83	More ECN fixes: 1) We now remove ECN-Nonce since it will no longer continue as an I-D 2) Eliminate last_tsn_echo, this tied us to an assoc not the net and thus we were not doing m-homing on the ECN-Echo senders side right. 3) Increment the count going out even if the TSN is lower in the pending ECN-Echo, this way the receiver knows exactly how many packets were marked even with network re-ordering 4) Fix so we DO NOT stop doing delayed sack if an ECN Echo is in queue MFC after: 1 month	2011-01-31 11:50:11 +00:00
Bjoern A. Zeeb	7f79e7e4db	Remove duplicate printing of TF_NOPUSH in db_print_tflags(). MFC after: 10 days	2011-01-29 22:11:13 +00:00
Randall Stewart	a21779f050	Fixes to ECN in SCTP. 1) ECN was on an association basis, this is incorrect and will not work with CMT or for that matter if the user is sending to multiple addresses. This commit makes ECN on a per path basis. 2) Adopt the new format for the ECN internet draft. This also maintains compatibility with old format chunks as well. 3) Keep track of the real time of a RTT down to micro seconds. For some future conditional features (for like a data center this is good information to have). MFC after: 1 month	2011-01-29 19:55:29 +00:00
Randall Stewart	410bcbef0a	Keep track of the real last RTT on each net. This will be used for Data Center congestion control, we won't want to engage it in the ECN code unless we KNOW that the RTT is less than 500us. MFC after: 1 week	2011-01-28 21:05:21 +00:00
Randall Stewart	d77e2e42b3	Fix a bug in the way ECN-Echo chunk sends were being accounted for. The counting was such that we counted only when we queued a chunk, not when we sent it. Now keep an additional counter for queuing and one for sending. MFC after: 1 week	2011-01-28 20:49:15 +00:00
Michael Tuexen	f8cdf87663	* Use 300 ms as the default for RTO_MIN. * Disable burst mitigation by default. * Remove unused constant. Discussed with rrs. MFC after: 3 months.	2011-01-26 21:38:17 +00:00
Michael Tuexen	507c72969d	Make SCTP_MAX_BURST compliant with the latest version of the socket API ID. This is not compatible with the API in stable/8.	2011-01-26 19:55:54 +00:00
Michael Tuexen	90fed1d88e	Change infrastructure for SCTP_MAX_BURST to allow compliance with the latest socket API ID. Especially it can be disabled. Full compliance needs changing the structure used in the socket option. Since this breaks the API, it will be a separate commit which will not be MFCed to stable/8. MFC after: 3 months.	2011-01-26 19:49:03 +00:00
Daniel Eischen	e691be70f9	Prison check addresses set with multicast interface options. Reviewed by: bz MFC after: 1 week	2011-01-26 17:31:03 +00:00
Andrew Thompson	965615476e	When matching an incoming ARP against a bridge, ensure both interfaces belong to the same bridge. Submitted by: Alexander Zagrebin	2011-01-25 17:15:23 +00:00
Lawrence Stewart	050570efa7	Import the ERTT (Enhanced Round Trip Time) Khelp module. ERTT uses the Khelp/Hhook KPIs to hook into the TCP stack and maintain a per-connection, low noise estimate of the instantaneous RTT. ERTT's implementation is robust even in the face of delayed acknowledgements and/or TSO being in use for a connection. A high quality, low noise RTT estimate is a requirement for applications such as delay-based congestion control, for which we will be importing some algorithm implementations shortly. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: FreeBSD Foundation Reviewed by: bz and others along the way MFC after: 3 months	2011-01-24 23:08:38 +00:00
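
The receive-window underflow described in the f701e30d7f entry above can be illustrated with a small user-space sketch. This is not the kernel code: the `tcp_seq` typedef, the `SEQ_GT` macro as written here, and the three function names are assumptions made for illustration, and `int64_t` stands in for the 64-bit `long` mentioned in the commit message. It shows how an unsigned 32-bit difference turns into 2^32 - 1 once widened, and the two remedies the message names (assert, or treat the window as full).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit TCP sequence number type. */
typedef uint32_t tcp_seq;

/* Sequence-space comparison: true when a is ahead of b modulo 2^32. */
#define SEQ_GT(a, b) ((int32_t)((a) - (b)) > 0)

/*
 * Problematic pattern: the subtraction is done in 32-bit unsigned
 * arithmetic, so when rcv_nxt is one past rcv_adv the difference wraps
 * to 0xFFFFFFFF and widens to a large positive value.
 */
int64_t
remaining_window_naive(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
	return ((int64_t)(rcv_adv - rcv_nxt));
}

/* Remedy 1: treat the window as full when rcv_nxt has passed rcv_adv. */
int64_t
remaining_window_clamped(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
	if (SEQ_GT(rcv_nxt, rcv_adv))
		return (0);
	return ((int64_t)(rcv_adv - rcv_nxt));
}

/* Remedy 2: assert the invariant rcv_nxt <= rcv_adv before subtracting. */
int64_t
remaining_window_asserted(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
	assert(!SEQ_GT(rcv_nxt, rcv_adv));
	return ((int64_t)(rcv_adv - rcv_nxt));
}
```

With `rcv_adv = 1000` and `rcv_nxt = 1001`, the naive version yields 4294967295 while the clamped version yields 0. The clamped form suits paths where the transient condition is legitimate; the asserting form suits paths where it should never be reachable.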

4137 commits