netmap: fix lock order reversal related to kqueue usage
When using poll(), select() or kevent() on netmap file descriptors,
netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands,
before collecting the events that are ready. In other words, the
poll/kevent callback has side effects. This is done to avoid the
overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).
When the kqueue subsystem invokes the kqueue(9) f_event callback
(netmap_knrw), it holds the lock of the struct knlist object associated
to the netmap port (the lock is provided at initialization, by calling
knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even
the same one), which means that it may need to call knote().
Since knote() needs the lock of the struct knlist object associated to
the to-be-wake-up netmap port, it is possible to have a lock order reversal
problem (AB/BA deadlock).
This change prevents the deadlock by executing the knote() call in a
per-selinfo taskqueue, where it is possible to hold a mutex.
The change also adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.
Reviewed by: aleksandr.fedorov_itglobal.com
Differential Revision: https://reviews.freebsd.org/D18956
netmap: refactor logging macros and pipes
Changelist:
- Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
and nm_prlim, to avoid possible naming conflicts.
- Add netmap_krings_mode_commit() helper function and use that
to reduce code duplication.
- Refactor pipes control code to export some functions that
can be reused by the veth driver (on Linux) and epair(4).
- Add check to reject API requests with version less than 11.
- Small code refactoring for the null adapter.
Conflicts:
sys/dev/netmap/netmap.c
netmap: upgrade sync-kloop support
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.
netmap: improvements to the netmap kloop (CSB mode)
Changelist:
- Add the proper memory barriers in the kloop ring processing
functions.
- Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write,
nm_sync_kloop_appl_read).
- Fix nm_kr_txempty() helper to look at rhead rather than rcur. This
is important since the kloop can read a value of rcur which is ahead
of the value of rhead (see explanation in nm_sync_kloop_appl_write)
- Remove obsolete ptnetmap_guest_write_kring_csb() and
ptnet_guest_read_kring_csb().
- Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(),
to make the kloop faster.
- Provide kernel and user implementation for nm_ldld_barrier() and
nm_ldst_barrier()
netmap: fix knote() argument to match the mutex state
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relavant lock state.
Also, comments have been updated to reflect current code.
PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D18876
netmap: move buf_size validation code to its own function
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.
netmap: align codebase to the current upstream (760279cfb2730a585)
Changelist:
- Replace netmap passthrough host support with a more general
mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop.
No kernel threads are used to use this feature: the application
is required to spawn a thread (or a process) and issue a
SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The
kernel loop is executed by the ioctl implementation, which returns
to userspace only when a different thread calls SYNC_KLOOP_STOP
or the netmap file descriptor is closed.
- Update the if_ptnet driver to cope with the new data structures,
and prune all the obsolete ptnetmap code.
- Add support for "null" netmap ports, useful to allocate netmap_if,
netmap_ring and netmap buffers to be used by specialized applications
(e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
- Various fixes and code refactoring.
Sponsored by: Sunny Valley Networks
Differential Revision: https://reviews.freebsd.org/D18015
netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch].
This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake
rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG
Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D17364
o Restore netmap emulation mode to working order, including
fixing the destructor panics on detach.
o Omit pipe additions to these fixes, likely problematic for
Suricata to pass traffic like it does on 11.0 without this
patch.
o Allow to build the module without errors in the tree.
Many thanks to Vincenzo Maffione for assistance and review! :)
From b497fe34fd275da6b850bf271f510d02b888b8bc Mon Sep 17 00:00:00 2001
From: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Date: Thu, 2 Jun 2016 00:21:40 +0200
Subject: [PATCH] allocate only the rings requested by the user
From 09936864fa5b67b82ef4a9907819b7018e9a38f2 Mon Sep 17 00:00:00 2001
From: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Date: Wed, 20 Jul 2016 20:35:12 +0000
Subject: [PATCH] freebsd: fix const-related warning
From ab90c6c10224fefbb6a6c6e0b92e6ba80e5b694d Mon Sep 17 00:00:00 2001
From: Vincenzo Maffione <v.maffione@gmail.com>
Date: Wed, 28 Sep 2016 18:39:55 +0200
Subject: [PATCH] freebsd: generic: change mbuf allocation management
From fe811e11b2c37fc274a1134e1c10b2f6ada1a91c Mon Sep 17 00:00:00 2001
From: Vincenzo Maffione <v.maffione@gmail.com>
Date: Thu, 29 Sep 2016 08:54:52 +0200
Subject: [PATCH] freebsd: generic: call m_extadd() only once for each mbuf