For cases where code conditionally does something based on an address family
and later assumes one of the paths was taken. This was initially just calls
to panic until guenther suggested a function to reduce the number of strings
needed.
This reduces the amount of noise with static analysers and acts as a sanity
check.
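A minimal userland sketch of the pattern, assuming a helper that reports the unexpected family and stops. The kernel version calls panic(); here abort() stands in for it, and addr_len() is a hypothetical caller invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Userland stand-in for the kernel helper: report the family and stop. */
static void
unhandled_af(int af)
{
	fprintf(stderr, "unhandled af %d\n", af);
	abort();
}

/* Hypothetical caller: returns the address length in bytes for a family. */
static size_t
addr_len(int af)
{
	switch (af) {
	case AF_INET:
		return (4);
	case AF_INET6:
		return (16);
	default:
		/* One shared call instead of a per-site panic string. */
		unhandled_af(af);
	}
	return (0);	/* not reached */
}
```

Every switch on the address family can share the same default branch, which is what keeps the number of distinct message strings down.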
ok guenther@ bluhm@
Obtained from: OpenBSD, jsg <jsg@openbsd.org>, ba4138390b
Sponsored by: Rubicon Communications, LLC ("Netgate")
This changes the ABI due to the changed opcodes and includes the
following:
* rule numbers and named object indexes converted to 32-bits
* all hardcoded maximum rule numbers were replaced with the
IPFW_DEFAULT_RULE macro
* it is now possible to grow the maximum number of rules at
build time
* several opcodes converted to ipfw_insn_u32 to keep rulenum:
O_CALL, O_SKIPTO
* call stack modified to keep u32 rulenum. The behaviour of
O_CALL opcode was changed to avoid possible packets looping.
Now when call stack is overflowed or mbuf tag allocation
failed, a packet will be dropped instead of skipping to next
rule.
* 'return' action now has two modes to specify return point:
'next-rulenum' and 'next-rule'
* new lookup key added for O_IP_DST_LOOKUP opcode 'lookup rulenum'
* several opcodes converted to keep u32 named object indexes
in special structure ipfw_insn_kidx
* tables related opcodes modified to use two structures:
ipfw_insn_kidx and ipfw_insn_table
* added ability for table value matching for specific value type
in 'table(name,valtype=value)' opcode
* dynamic states and eaction code converted to use u32 rulenum
and named objects indexes
* added insntod() and insntoc() macros to cast to specific
ipfw instruction type
* default sockopt version was changed to IP_FW3_OPVER=1
* FreeBSD 7-11 rule format support was removed
* added ability to generate special rtsock messages via log opcode
* added IP_FW_SKIPTO_CACHE sockopt to enable/disable skipto cache.
It helps to reduce overhead when many rules are modified in batch.
* added ability to keep NAT64LSN states during sets swapping
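The call stack change in the list above can be sketched as follows. This is an illustration only, not the ipfw code: CALLSTACK_MAX, the verdict enum and the function names are hypothetical. It shows return points kept as 32-bit rule numbers and an overflow that results in a drop rather than a skip to the next rule.

```c
#include <assert.h>
#include <stdint.h>

#define CALLSTACK_MAX 4	/* hypothetical depth limit for this sketch */

enum verdict { VERDICT_CONTINUE, VERDICT_DROP };

struct callstack {
	uint32_t rulenum[CALLSTACK_MAX];	/* u32 return points */
	unsigned depth;
};

/* Push a return point; on overflow the packet is dropped. */
static enum verdict
call_push(struct callstack *cs, uint32_t return_rule)
{
	if (cs->depth >= CALLSTACK_MAX)
		return (VERDICT_DROP);
	cs->rulenum[cs->depth++] = return_rule;
	return (VERDICT_CONTINUE);
}

/* Pop the most recent return point; returns 0 when the stack is empty. */
static int
call_pop(struct callstack *cs, uint32_t *return_rule)
{
	if (cs->depth == 0)
		return (0);
	*return_rule = cs->rulenum[--cs->depth];
	return (1);
}
```

Dropping on overflow means a ruleset that calls itself recursively cannot keep a packet cycling through the rules.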
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D46183
Currently af-to works only on inbound interface by creating a reversed
NAT state key which is used to match traffic returning on the outbound
interface.
Such a limitation is not necessary. When an af-to state is created
for an outbound rule do not reverse the NAT state key, making it work
just as if it were created for a normal NAT rule. Depending on firewall
design it might be easier and more natural to use af-to on the outbound
interface.
Reviewed by: kp
Approved by: kp (mentor)
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D49122
To support DHCP for IPoIB links, DHCP clients and servers require the
ability to transmit link-layer broadcasts on the IB interfaces. BPF
provides the mechanism for doing this.
This change updates the if_infiniband driver to be capable of accepting
link-layer broadcast requests via BPF using Ethernet formatted frames
(the driver currently registers with BPF as DLT_EN10MB). Only broadcast
frames can reliably be interpreted using the Ethernet header format, so
unicast and multicast frames are detected and rejected if passed in using the
Ethernet format. This doesn't impact the ability to support native
unicast, broadcast or multicast frames if native infiniband header
support is added to BPF at a later date.
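A rough sketch of the acceptance check described above, assuming the decision is made purely on the Ethernet destination address. The function name and the EINVAL return value are assumptions for illustration; the driver's actual error code is not stated here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/*
 * Accept only frames whose Ethernet destination is the broadcast
 * address; unicast and multicast destinations are rejected.
 */
static int
check_bpf_ether_dst(const uint8_t dst[ETHER_ADDR_LEN])
{
	static const uint8_t bcast[ETHER_ADDR_LEN] =
	    { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	if (memcmp(dst, bcast, ETHER_ADDR_LEN) != 0)
		return (EINVAL);	/* assumed error code */
	return (0);
}
```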
Further to the above, this commit also addresses an issue in the existing
code that can result in separation of part of the packet header from the
rest of the payload if a BPF write was attempted. This was caused by
mbuf preallocation of the infiniband header length regardless of length
of the prepend data.
Reviewed by: rpokala; Greg Foster <gfoster@vdura.com>
Tested by: Greg Foster <gfoster@vdura.com>
MFC after: 1 week
Sponsored by: Vdura
Pull Request: https://github.com/freebsd/freebsd-src/pull/1591
change log(matches) semantics slightly to make it more useful. since it
is a debug tool change of semantics not considered problematic.
up until now, log(matches) forced logging on subsequent matching rules,
the actual logging used the log settings from that matched rule.
now, log(matches) causes subsequent matches to be logged with the log settings
from the log(matches) rule. in particular (this was the driving point),
log(matches, to pflog23) allows you to have the trace log going to a separate
pflog interface, not clobbering your regular pflogs, actually not affecting
them at all.
long conversation with bluhm about it, which didn't lead to a single bit
changed in the diff but was very very helpful. ok bluhm as well.
Obtained from: OpenBSD, henning <henning@openbsd.org>, f61b1efcce
Sponsored by: Rubicon Communications, LLC ("Netgate")
pfi_kkif_attach() annotates the kif with a flag indicating it is the "any" match.
pfi_kif_match() obeys that flag.
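As a simplified illustration of the idea, with a hypothetical flag name and structure (the real pf structures and the other matching rules are not shown):

```c
#include <assert.h>
#include <stddef.h>

#define KIF_FLAG_ANY 0x01	/* hypothetical flag value */

struct kif {
	const char *name;
	int flags;
};

/* The "any" kif matches every interface; otherwise compare identity. */
static int
kif_match(const struct kif *rule_kif, const struct kif *packet_kif)
{
	if (rule_kif->flags & KIF_FLAG_ANY)
		return (1);
	return (rule_kif == packet_kif);
}
```

With the flag set once at attach time, the match path needs no name comparison to recognise the "any" interface.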
ok benno
Obtained from: OpenBSD, henning <henning@openbsd.org>, 4be478ce5d
Sponsored by: Rubicon Communications, LLC ("Netgate")
No change to the underlying type, so no ABI change.
We define __time_t as __int64_t if __LP64__, otherwise __int32_t,
and only define __LP64__ if long is 64 bits.
In other words: __time_t == long.
ok henning@ deraadt@
Obtained from: OpenBSD, guenther <guenther@openbsd.org>, 6c1b69a0ff
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D48963
For every state pf creates up to two source nodes: a limiting one
struct pf_kstate -> src_node and a NAT one struct pf_kstate -> nat_src_node.
The limiting source node is tracking information needed for limits using
max-src-states and max-src-nodes and the NAT source node is tracking NAT
rules only.
On closer inspection some issues emerge:
- For route-to rules the redirection decision is stored in the limiting source
node. Thus sticky-address and source limiting can't be used separately.
- Global source tracking, as promised in the man page, is totally absent from
the code. Pfctl is capable of setting flags PFRULE_SRCTRACK (enable source
tracking) and PFRULE_RULESRCTRACK (make source tracking per rule). The kernel
code checks PFRULE_SRCTRACK but ignores PFRULE_RULESRCTRACK. That makes
source tracking work per-rule only.
This patch is based on the OpenBSD approach where source nodes have a type and each
state has an array of source node pointers indexed by source node type
instead of just two pointers. The conditions for limiting are applied
only to source nodes of PF_SN_LIMIT type. For global limit tracking
source nodes are attached to the default rule.
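The data layout described above might look roughly like this. The enum members other than the limiting type, the structure fields and the function name are assumptions made for the sketch, not the pf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Source node types; only the limiting type is named in the text above. */
enum sn_type { SN_LIMIT, SN_NAT, SN_ROUTE, SN_MAX };

struct src_node {
	enum sn_type type;
	unsigned states;	/* states currently using this node */
	unsigned max_states;	/* 0 means no limit */
};

struct state {
	struct src_node *sns[SN_MAX];	/* one slot per source node type */
};

/* Limits are checked only against the node in the limiting slot. */
static int
state_allowed(const struct state *s)
{
	const struct src_node *sn = s->sns[SN_LIMIT];

	if (sn == NULL || sn->max_states == 0)
		return (1);
	return (sn->states < sn->max_states);
}
```

Because each purpose has its own slot, a NAT or route-to source node can exist without a limiting node, and the reverse.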
Reviewed by: kp
Approved by: kp (mentor)
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D39880
Just like TCP and UDP we can fold the SCTP code into pf_test_state().
This does require a dummy variable to hold the protocol checksum, because unlike
TCP and UDP the SCTP checksum is 32-bits. We don't need to change the checksum
though, so simply pointing the pcksum pointer to a safe dummy location suffices
to re-use pf_test_state().
Sponsored by: Rubicon Communications, LLC ("Netgate")
Set it up in pf_setup_pdesc(). ok ryan benno mikeb bluhm
Obtained from: OpenBSD, henning <henning@openbsd.org>, 14255d4d87
Sponsored by: Rubicon Communications, LLC ("Netgate")
We lost the quick flag as soon as we stepped into a child anchor.
Simplify the logic, get rid of the match flag in the anchor stack, just
use the match variable we already had (and used in a boolean style) to track
the nest level we had a match at. When a child anchor had a match we also
have a match in the current anchor, so update the match level accordingly,
and thus correctly honour the quick flag.
Reported, along with the right idea on how to fix this, by Sean Gallagher
<sean at teletech.com.au>, who also helped testing the fix. ok ryan & benno
Obtained from: OpenBSD, henning <henning@openbsd.org>, 32a028bff7
Sponsored by: Rubicon Communications, LLC ("Netgate")
Renumber 1000BASE-BX and add 100BASE-BX sequentially
I added this 1000BASE-BX in 78c63ed260 but
did not connect it to any code yet; apologies for the churn.
MFC after: 3 days
The newly introduced function bpf_ifdetach() is only available when
device bpf is enabled.
Fixes: 1ed9b381d4 ifnet: Detach BPF descriptors on interface vmove event
Commit 20c4899a8e modified pf_test_eth_rule() to not acquire the
rules read lock, so pf_commit_eth() was changed to wait until the
now-inactive rules are no longer in use before freeing them. In
particular, it uses the net_epoch to schedule callbacks once the
inactive rules are no longer visible to packet processing threads.
However, since commit 812839e5aa, pf_test_eth_rule() acquires the
rules read lock, so this deferred action is unneeded. This patch
reverts a portion of 20c4899a8e such that we avoid using deferred
callbacks to free inactive rules.
The main motivation is performance: epoch_drain_callbacks() is quite
slow, especially on busy systems, and its use in the DIOCXBEGIN handler
in particular causes long stalls in relayd when reloading configuration.
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D48822
In particular, we store a FIB number in both struct socket and in struct
inpcb. When updating the FIB number with setsockopt(SO_SETFIB), make
the update atomic. This is required to support the new bind_all_fibs
mode, since in that mode changing the FIB of a bound socket is not
permitted.
This requires a bit more code, but avoids a layering violation in
sosetopt(), where we hard-code the list of protocol families that
implement SO_SETFIB.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D48666
remove confuzzling comment
:dlg: the xxx can go
...and this time commit to the real repo and not the one on my laptop
Obtained from: OpenBSD, henning <henning@openbsd.org>, 15e15606eb
Sponsored by: Rubicon Communications, LLC ("Netgate")
Just like we do for IPv6, generate an ICMP fragmentation needed packet if we're
going to need fragmentation for IPv4 as well (i.e. DF is set). Do so before full
processing, so we generate it with pre-NAT addresses, just as we do for IPv6.
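The decision itself can be sketched as below, assuming the offset field has already been converted to host byte order. The function name is hypothetical; 0x4000 is the standard "don't fragment" bit of the IPv4 flags/offset field.

```c
#include <assert.h>
#include <stdint.h>

#define IP_DF_FLAG 0x4000	/* "don't fragment" bit, host byte order */

/*
 * Returns non-zero when the packet is too large for the outgoing MTU
 * and may not be fragmented, i.e. when an ICMP "fragmentation needed"
 * error (type 3, code 4) should be sent instead of forwarding.
 */
static int
needs_frag_icmp(uint16_t ip_off_host, uint16_t ip_len, uint16_t mtu)
{
	return (ip_len > mtu && (ip_off_host & IP_DF_FLAG) != 0);
}
```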
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D48805
When an interface is moving to/from a vnet jail, it may still have BPF
descriptors attached. The userland (e.g. tcpdump) is not notified
that the interface is departing and keeps its BPF descriptors open, which
may result in leaking sensitive traffic (e.g. an interface is moved
back to the parent jail but a user is still sniffing traffic over it in
the child jail).
Detach BPF descriptors so that the userland will be signaled.
Reviewed by: ae
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D45727
if_detach_internal() can never fail since change [1]. As a consequence,
neither can its caller if_vmove(). While here, remove a stale comment.
No functional change intended.
This reverts commit c7bab2a7ca.
[1] a779388f8b if: Protect V_ifnet in vnet_if_return()
Reviewed by: glebius
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D48820
There're two possible race conditions,
1. Concurrent bpfattach() and bpf_setif(), i.e., BIOCSETIF ioctl,
2. Concurrent bpfdetach() and bpf_setif().
For the first case, userland may see BPF interface attached but it has
not been in the attached interfaces list `bpf_iflist` yet. Well it
will eventually be so this case does not matter.
For the second one, bpf_setif() may reference `dead_bpf_if` and the
kernel will panic (spotted by change [1]; without that change we would
end up silently corrupting memory).
A simple fix would be to add an additional check for `dead_bpf_if`
in the function `bpf_setif()`. But that requires extending the protection
of the global lock (BPF_LOCK), i.e., BPF_LOCK should also protect the
assignment of `ifp->if_bpf`. That simple fix works but is apparently
not a good design. Since the attached interfaces list `bpf_iflist` is
the single source of truth, we look through it rather than check
against the interface's side, aka `ifp->if_bpf`.
This change introduces a performance regression: the cost of the BPF
interface attach operation (BIOCSETIF ioctl) goes back from O(1) to O(N)
(where N is the number of BPF interfaces). We normally have a sane number
of interfaces, so O(N) should be affordable.
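The list-based lookup can be pictured with the following userland sketch. The structure and function names are hypothetical and locking is omitted; in the kernel the walk would happen under BPF_LOCK.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet {
	const char *name;
};

/* One entry of a simplified attached-interfaces list. */
struct bpf_if_entry {
	struct ifnet *ifp;
	struct bpf_if_entry *next;
};

/* O(N) walk: an interface counts as attached only if the list says so. */
static struct bpf_if_entry *
iflist_lookup(struct bpf_if_entry *head, const struct ifnet *ifp)
{
	for (struct bpf_if_entry *bp = head; bp != NULL; bp = bp->next)
		if (bp->ifp == ifp)
			return (bp);
	return (NULL);
}
```

A detached interface is simply absent from the list, so the lookup fails cleanly instead of following a pointer stored on the interface.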
[1] 7a974a6498 bpf: Make dead_bpf_if const
Fixes: 16d878cc99 Fix the following bpf(4) race condition ...
MFC after: 4 days
Differential Revision: https://reviews.freebsd.org/D45725
This driver does not need to retrieve those tunables during early boot.
Meanwhile SYSCTL_INT can provide rich info such as description.
Also `sysctl net.link.vxlan.[legacy_port|reuse_port]` can report the
current settings.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D48621
Using one taskqueue group with a single thread to execute all admin
tasks may lead to unexpected timeouts when a long-running task (e.g.
handling a reset after a FW update) for one interface prevents
tasks from other interfaces from being executed. The taskqueue group API
doesn't allow adding threads dynamically, and pre-allocating a thread
for each CPU, as is done for traffic queues, would be a waste
of resources on systems with a small number of interfaces. Replace the
global taskqueue group for admin tasks with a taskqueue allocated
for each interface to allow independent execution.
Signed-off-by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: imp, jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1336
There was a limit on the number of pflog interfaces - 16. remove that.
mostly by dynamically allocating pflogifs instead of making that a static
array. ok claudio zinke
Obtained from: OpenBSD, henning <henning@openbsd.org>, ab0a082ea6
Sponsored by: Rubicon Communications, LLC ("Netgate")
As suggested by henning.
Which unbreaks ie route-to after the recent pf changes.
With much help debugging and pointing out of missing bits from claudio@
ok claudio@ "looks good" henning@
Obtained from: OpenBSD, jsg <jsg@openbsd.org>, 7fa5c09028
Sponsored by: Rubicon Communications, LLC ("Netgate")
It is harmless but pointless to invoke vxlan_stop event handler when the
interface was not previously configured. This change will also prevent
an assert panic from t4_vxlan_stop_handler().
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D48494
This is part of the upcoming USB umb(4) work.
Differential Revision: https://reviews.freebsd.org/D48167
Approved by: adrian, zlei
Sponsored by: FreeBSD Foundation
PR: kern/263783
Submitted by: Pierre Pronchery <khorben@defora.org>
Teach dummymbuf to replace mbufs with larger ones.
This can be useful for testing for bugs that depend on mbuf layout.
Sponsored by: Rubicon Communications, LLC ("Netgate")
When we call pf_normalize_ip() or pf_normalize_ip6() we passed the mbuf twice.
Once as m0, and once inside the struct pf_pdesc. Remove the former to avoid
confusion when we free *m0, but don't update pd->m.
This could lead to use-after-free errors e.g. if reassembly failed.
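The hazard and the fix can be illustrated with a small userland analogue, using hypothetical types in place of the mbuf and struct pf_pdesc. The normalize step takes only the descriptor, so the one pointer it frees is also the one it clears.

```c
#include <assert.h>
#include <stdlib.h>

struct pkt {
	int len;
};

/* Simplified packet descriptor: the only reference to the packet. */
struct pdesc {
	struct pkt *m;
};

/*
 * Hypothetical normalize step. On failure the packet is freed and
 * pd->m is cleared, so no stale alias to the freed packet survives.
 */
static int
normalize(struct pdesc *pd, int max_len)
{
	if (pd->m->len > max_len) {
		free(pd->m);
		pd->m = NULL;
		return (-1);
	}
	return (0);
}
```

With a second pointer passed alongside the descriptor, the function could free through one and leave the other dangling, which is the situation the change removes.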
PR: 283705
Reported by: Yichen Chai <yichen.chai@gmail.com>, Zhuo Ying Jiang Li <zyj20@cl.cam.ac.uk>
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
Allow users to choose to allow permitted SCTP connections to set up additional
multihomed connections regardless of the ruleset. That is, allow an already
established connection to set up flows that would otherwise be disallowed.
In case of if-bound connections we initially set the extra associations to
be floating, because we don't know what path they'll be taking when they're
created. Once we see the first traffic we can bind them.
MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D48453
- Don't use bare vnet(4) definitions in the KASSERT, they aren't available
on a kernel without VIMAGE. Just use MPASS() here. This is more of a
documenting assertion rather than an assertion that may actually fire on
an unmodified kernel.
- V_netisr_enable is different to the rest of V_ prefixed globals. On a
kernel without VIMAGE it basically doesn't exist, instead of being
present as a single instance.
Fixes: a1be7978f1
This makes their argument list shorter. Also fix a bug where pf_walk_option6()
used the outer header in the pd2 case.
ok henning@ mikeb@
Obtained from: OpenBSD, bluhm <bluhm@openbsd.org>, dfff4707a1
Sponsored by: Rubicon Communications, LLC ("Netgate")
Only map mbuf when a policy is looked up and indicates that IPSEC needs
to transform the packet. If IPSEC is inline offloaded, it is up to the
interface driver to request remap if needed.
Fetch the IP header using m_copydata() instead of using mtod() to select
policy/SA.
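To show why copying is safer than casting a pointer into the first buffer, here is a userland sketch with a hypothetical segment chain standing in for an mbuf chain. Unlike the kernel's m_copydata(), which returns nothing, this helper returns a status so that it can be exercised in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a chain of packet buffers. */
struct seg {
	const uint8_t *data;
	size_t len;
	const struct seg *next;
};

/*
 * Copy len bytes starting at offset off out of the chain into dst.
 * Returns 0 on success, -1 if the chain is too short.
 */
static int
chain_copydata(const struct seg *s, size_t off, size_t len, void *dst)
{
	uint8_t *out = dst;

	/* Skip whole segments that lie before the requested offset. */
	while (s != NULL && off >= s->len) {
		off -= s->len;
		s = s->next;
	}
	/* Gather bytes, crossing segment boundaries as needed. */
	while (s != NULL && len > 0) {
		size_t n = s->len - off;

		if (n > len)
			n = len;
		memcpy(out, s->data + off, n);
		out += n;
		len -= n;
		off = 0;
		s = s->next;
	}
	return (len == 0 ? 0 : -1);
}
```

A header that straddles two segments is read correctly this way, whereas a cast of the first segment's data pointer would read past the end of that segment.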
Reviewed by: markj
Sponsored by: NVidia networking
Differential revision: https://reviews.freebsd.org/D48265