In most cases, usage does not return, so mark them as __dead2. For the
cases where they do return, they have not been marked __dead2.
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/735
Restructure c and C flag checks for string length to
work properly. Quickly bypass for non TCP protos too.
Reviewed By: tuexen
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D40480
Make struct pfsync_state contents configurable by sending out new
versions of the structure in separate subheader actions. Both old and
new version of struct pfsync_state can be understood, so replication of
states from a system running an older kernel is possible. The version
being sent out is configured using ifconfig pfsync0 … version XXXX. The
version is an user-friendly string - 1301 stands for FreeBSD 13.1 (I
have checked synchronization against a host running 13.1), 1400 stands
for 14.0.
A host running an older kernel will just ignore the messages and count
them as "packets discarded for bad action".
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D39392
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
protopr does not support reading from a core anymore.
So don't state that it can.
Reviewed by: glebius, rscheff, rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D39688
This change touches both kernel and netstat(1), but either of the changes
will fix printing pcb addresses with -A.
The thing is that historically netstat(1) treated TCP differently, and
printed tcpcb address instead of inpcb address. This is not documented
anywhere! With e68b379244 these two addresses became the same. It is
highly likely they will be the same for a long time, but it might be they
will start to differ again in a far future. My proposal is to stop
treating TCP differently with netstat(1) and right now is a good opportunity
to do that, since there will be no behavior change at all. The kernel
change to tcp_inptoxtp() will go into stable/14 to make it compatible with
netstat(1) binary from stable/13. We can drop it later, probably together
with in_ppcb pointer from inpcb. The in_ppcb in xinpcb will stay for size
compatibility.
Reviewed by: tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D39736
Make userland tools such as netstat, route, arp and ndp use
either netlink or rtsock interfaces based on the NETLINK_SUPPORT
options.
Both NETLINK and NETLINK_SUPPORT options are turned on by default.
Reviewed By: eugen
Differential Revision: https://reviews.freebsd.org/D39148
A number of improvements has commited to snl(3) recently.
A notable one is snl(3) build-in parsers for all of the objects
exported by the kernel.
This change updates netlink handling code to the latest available snl(3)
API.
netstat(1) allows to specify both -i (all interfaces) and -I <ifname>.
However, when both are specified, -I always overrides -i.
Add a comment where appropriate the same way we do in rm(1) for -f and -i.
PR: 202708
Reported by: demon@
Approved by: manpages (pauamma@)
Differential Revision: https://reviews.freebsd.org/D38654
* Parse and export newly-added NL_RTA_WEIGHT attribute, providing path
weight for a non-multipath route. This fixes a number of tests in
sys/net/routing which rely on this data.
* Remove handling of NL_RTA_KNH_ID in multipath routes, as it is
not provided.
* Improve kernel/user nexthop index export. As a result,
for multipath routes:
* nhg-kidx attribute represents kernel nhg index (always provided)
* nhg-uidx attribute represents user-provided nhg index (if set)
for non-multipath routes:
* nhop-kidx attribute represents kernel nhop index (always provided)
* nhop-udx attribute represents user-provided nexthop index (if set)
This change switches route listing in netstat to netlink, with fallback to rtsock.
The outputs are mostly identical, with an exception of not showing kernel
nexthop indexes for multipath routes.
Differential Revision: https://reviews.freebsd.org/D36529
trpt(8) was utility to pull TCP debugging data from the kernel
originating back from 4.2BSD. It is not used nowadays by TCP
developers. We have more powerful debugging facilities, e.g.
the Dtrace probing, the TCP black box logging and siftr.
Discussed with: rscheff, tuexen, rrs, jtl and others
Have tcpstats (netstat -s) differentiate between received and sent
ECN-marked packets. Also account for IP ECN bits (on TCP packets)
even when the tcp session has not negotiated ECN support.
Event: IETF 115 Hackathon
Reviewed By: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D37314
Rack has had the ability to timeout connections that just sit idle automatically. This
feature of course is off by default and requires the user set it on (though the socket option
has been missing in tcp_usrreq.c). Lets get the progress timeout fully supported in
the base stack as well as rack.
Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36716
Instead of incrementing pretty random counters in the IP statistics,
create divert socket statistics structure. Export via netstat(1).
Differential revision: https://reviews.freebsd.org/D36381
The divert(4) is not a protocol of IPv4. It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back. It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols. The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char. I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.
Moving divert(4) out of inetsw accomplished two important things:
1) IPDIVERT is getting much closer to be not dependent on INET.
This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
Domain/proto selection code won't need a hack for SOCK_RAW and
multiple entries in inetsw implementing different flavors of
raw socket can merge into one without requirement of raw IPv4
being the last member of dom_protosw.
Differential revision: https://reviews.freebsd.org/D36379
The field for interface names for netstat -i was 5 characters by
default, which is no longer sufficient with names like "vlan1234"
and "vtnet0". netstat -iW computed the necessary field width, but
also enlarged the address field by a lot (especially with IPv6 enabled).
Make netstat -i compute the field width for interface names with or
without -W. Note that the existing default output does not fit in
80 columns in any case. Update the man page accordingly, documenting
the remaining effect of -W with -i. Also add -W to the list of
General Options, as there are numerous pointers to this.
Reported by: Chris Ross
Reviewed by: melifaro, rgrimes, cy
Differential Revision: https://reviews.freebsd.org/D35703
MFC after: 1 week
Mbufs leak when manually removing incomplete NDP records with pending packet via ndp -d.
It happens because lltable_drop_entry_queue() rely on `la_numheld`
counter when dropping NDP entries (lles). It turned out NDP code never
increased `la_numheld`, so the actual free never happened.
Fix the issue by introducing unified lltable_append_entry_queue(),
common for both ARP and NDP code, properly addressing packet queue
maintenance.
Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35365
MFC after: 2 weeks
With M_EXTPG mbufs these two counters already do not represent the
reality. As we are moving towards protocol independent socket buffers,
which may not even use mbufs at all, the counters become less and less
relevant. The only userland seeing them was 'netstat -x'.
PR: 264181 (exp-run)
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35334
Reserve couters in the tcps struct in preparation
for AccECN, extend the debugging output for TF2
flags, optimize the syncache flags from individual
bits to a codepoint for the specifc ECN handshake.
This is in preparation of AccECN.
No functional chance except for extended debug
output capabilities.
Reviewed By: #transport, rrs
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34161
When attempting to characterize bound addresses, netstat was checking
for host 0 on a (historical) net using inet_lnaof(). Such addresses
are not normally bound, as they would not work, with the exception
of the unspecified address, INADDR_ANY. Check for that explicitly.
Similarly, don't check bound addresses for a match to a network name.
MFC after: 1 month
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D32714
DSACK accounting has been for quite some time under a NETFLIX_STATS ifdef. Statistics
on DSACKs however are very useful in figuring out how much bad retransmissions you
are doing. This is further complicated, however, by stacks that do TLP. A TLP
when discovering a lost ack in the reverse path will cause the generation
of a DSACK. For this situation we introduce a new dsack-tlp-bytes as well
as the more traditional dsack-bytes and dsack-packets. These will now
all display in netstat -p tcp -s. This also updates all stacks that
are currently built to keep track of these stats.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32158
API should work as following:
- periodicaly report Lower-or-EQual bandwidth (LEQ) connections
over kernel socket, if user application registered for such
per-flow notifications
- report Grater-or-EQual (GEQ) bandwidth as soon as it reaches
specified value in configured time window
Custom implementation of callouts was removed. There is no
point of doing calout-wheel here as generic callouts are
doing exactly the same. The performance is not critical
for such reporting, so the biggest concern should be
to have a code which can be easily maintained.
This is ia preparation for locking rework which is highly inefficient.
Approved by: mw
Sponsored by: Stormshield
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D30210
Recover from excessive losses without reverting to a
retransmission timeout (RTO). Disabled by default, enable
with sysctl net.inet.tcp.do_lrd=1
Reviewed By: #transport, rrs, tuexen, #manpages
Sponsored by: Netapp, Inc.
Differential Revision: https://reviews.freebsd.org/D28931
Adding support for TCP over UDP allows communication with
TCP stacks which can be implemented in userspace without
requiring special priviledges or specific support by the OS.
This is joint work with rrs.
Reviewed by: rrs
Sponsored by: Netflix, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29469
rttrash (unused but not yet delete entries) were eliminated
during routing rework. Remove reading these symbols from the kernel.
PR: 254681
Reported by: rashey@superbox.pl
MFC after: immediately
Pad the icmp6stat structure so that we can add more counters in the
future without breaking compatibility again, last done in r358620.
Annotate the rarely executed error paths with __predict_false while
here.
Reviewed by: bz, melifaro
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26578
Adding the "-c" option used to show detailed per-connection
congestion control state for TCP sessions.
This is one summary patch, which adds the relevant variables into
xtcpcb. As previous "spare" space is used, these changes are ABI
compatible.
Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26518
This change is based on the nexthop objects landed in D24232.
The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops with their
relative weights and a dataplane-optimized structure to enable
efficient nexthop selection.
Simular to the nexthops, nexthop groups are immutable. Dataplane part
gets compiled during group creation and is basically an array of
nexthop pointers, compiled w.r.t their weights.
With this change, `rt_nhop` field of `struct rtentry` contains either
nexthop or nexthop group. They are distinguished by the presense of
NHF_MULTIPATH flag.
All dataplane lookup functions returns pointer to the nexthop object,
leaving nexhop groups details inside routing subsystem.
User-visible changes:
The change is intended to be backward-compatible: all non-mpath operations
should work as before with ROUTE_MPATH and net.route.multipath=1.
All routes now comes with weight, default weight is 1, maximum is 2^24-1.
Current maximum multipath group width is statically set to 64.
This will become sysctl-tunable in the followup changes.
Using functionality:
* Recompile kernel with ROUTE_MPATH
* set net.route.multipath to 1
route add -6 2001:db8::/32 2001:db8::2 -weight 10
route add -6 2001:db8::/32 2001:db8::3 -weight 20
netstat -6On
Nexthop groups data
Internet6:
GrpIdx NhIdx Weight Slots Gateway Netif Refcnt
1 ------- ------- ------- --------------------------------------- --------- 1
13 10 1 2001:db8::2 vlan2
14 20 2 2001:db8::3 vlan2
Next steps:
* Land outbound hashing for locally-originated routes ( D26523 ).
* Fix net/bird multipath (net/frr seems to work fine)
* Add ROUTE_MPATH to GENERIC
* Set net.route.multipath=1 by default
Tested by: olivier
Reviewed by: glebius
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D26449
At this point, AES is the more common name for Rijndael128. setkey(8)
will still accept the old name, and old constants remain for
compatiblity.
Reviewed by: cem, bcr (manpages)
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24964
Examples of depecrated algorithms in manual pages and sample configs
are updated where relevant. I removed the one example of combining
ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this
combination is NOT RECOMMENDED.
Specifically, this removes support for the following ciphers:
- des-cbc
- 3des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc
This also removes support for the following authentication algorithms:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160
Reviewed by: cem, gnn (older verisons)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24342