Commit graph

240 commits

Author SHA1 Message Date
Vincenzo Maffione
8c9874f5b1 netmap: fix knote() argument to match the mutex state
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relevant lock state.
Also, comments have been updated to reflect current code.

PR:	https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
Reported by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18876
2019-01-23 14:21:23 +00:00
Vincenzo Maffione
58e185425a netmap: fix txsync check in netmap poll
To check if txsync can be skipped, it is necessary to look for
unseen TX space. However, this means comparing ring->cur
against ring->tail, rather than ring->head against ring->tail
(like nm_ring_empty() does).
This change also adds some more comments to explain the optimization
performed at the beginning of netmap_poll().

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 16:23:42 +00:00
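The difference between the two comparisons in commit 58e185425a can be sketched in a few lines of C. This is a minimal illustration, not the netmap source: `struct ring_sketch` and both helper names are hypothetical stand-ins for the `head`, `cur` and `tail` ring fields named in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three indices of a netmap ring. */
struct ring_sketch {
	uint32_t head;	/* first slot still held by the application */
	uint32_t cur;	/* slots before cur have already been seen */
	uint32_t tail;	/* end of the space available to the application */
};

/* The head-vs-tail comparison the commit attributes to nm_ring_empty(). */
static bool
ring_empty_sketch(const struct ring_sketch *r)
{
	return (r->head == r->tail);
}

/* The cur-vs-tail comparison: is there space not yet seen? */
static bool
ring_has_unseen_space(const struct ring_sketch *r)
{
	return (r->cur != r->tail);
}
```

With `head = 2` and `cur = tail = 8`, the head-based check reports a non-empty ring even though the cur-based check finds no unseen space, which is the case the two comparisons disagree on.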
Vincenzo Maffione
e1ed1fbdea netmap: fix bug in netmap_poll() optimization
The bug was introduced by r339639, although it is present in the upstream
netmap code since 2015. It is due to resetting the want_rx variable to
POLLIN, rather than resetting it to POLLIN|POLLRDNORM.
It only affects select(), which uses POLLRDNORM. poll() is not affected,
because it uses POLLIN.
Also, it only affects FreeBSD, because Linux skips the optimization
implemented by the piece of code where the bug occurs.

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 15:15:45 +00:00
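The effect of the mask described in commit e1ed1fbdea can be shown with a small userspace sketch. It is not the netmap_poll() code: `caller_wants_rx` and the two `WANT_RX_*` constants are hypothetical names, and the fallback value for `POLLRDNORM` is an assumption used only when the header hides it.

```c
#include <poll.h>
#include <stdbool.h>

/* Fallback for strict build modes that hide POLLRDNORM (assumed value). */
#ifndef POLLRDNORM
#define POLLRDNORM 0x0040
#endif

/* Receive mask before the fix (hypothetical name). */
static const short WANT_RX_BUGGY = POLLIN;
/* Receive mask after the fix (hypothetical name). */
static const short WANT_RX_FIXED = POLLIN | POLLRDNORM;

/* True when the caller's requested events overlap the receive mask. */
static bool
caller_wants_rx(short events, short want_rx)
{
	return ((events & want_rx) != 0);
}
```

A caller that asks only for `POLLRDNORM`, as the commit says select() does, is missed by the first mask and matched by the second; a caller that asks for `POLLIN` is matched by both.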
Vincenzo Maffione
77a2baf551 netmap: move buf_size validation code to its own function
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.

MFC after:	3 days
2018-12-21 11:50:14 +00:00
Vincenzo Maffione
c52382bd40 netmap: pipes: make sure both ends use the same number of slots 2018-12-21 11:32:55 +00:00
Vincenzo Maffione
dde885de95 netmap: fix warning in netmap_kloop.c
Reported by:	markj
MFC after:	3 days
2018-12-12 16:32:15 +00:00
Vincenzo Maffione
2605ddfce9 netmap: remove dead code obsoleted by iflib
The iflib subsystem implements netmap support in a driver-independent
way (sys/net/iflib.c). We can therefore remove the headers that
used to implement netmap support for all the drivers now supported
by iflib (em, igb, ixl, ixgbe, lem).

MFC after:	1 week
2018-12-07 11:47:42 +00:00
Vincenzo Maffione
89a9a5b5c9 netmap: netmap_transmit should honor bpf packet tap hook
This allows tcpdump to capture outbound kernel packets while
in netmap mode

Submitted by:	Marc de la Gueronniere <mdelagueronniere@verisign.com>
Reviewed by:	vmaffione
MFC after:	1 week
Sponsored by:	Verisign, Inc.
Differential Revision:	https://reviews.freebsd.org/D17896
2018-12-06 09:45:25 +00:00
Vincenzo Maffione
b6e66be22b netmap: align codebase to the current upstream (760279cfb2730a585)
Changelist:
  - Replace netmap passthrough host support with a more general
    mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop.
    No kernel threads are needed to use this feature: the application
    is required to spawn a thread (or a process) and issue a
    SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The
    kernel loop is executed by the ioctl implementation, which returns
    to userspace only when a different thread calls SYNC_KLOOP_STOP
    or the netmap file descriptor is closed.
  - Update the if_ptnet driver to cope with the new data structures,
    and prune all the obsolete ptnetmap code.
  - Add support for "null" netmap ports, useful to allocate netmap_if,
    netmap_ring and netmap buffers to be used by specialized applications
    (e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
  - Various fixes and code refactoring.

Sponsored by:	Sunny Valley Networks
Differential Revision:	https://reviews.freebsd.org/D18015
2018-12-05 11:57:16 +00:00
Vincenzo Maffione
d55913f555 netmap: set IFCAP_NETMAP in if_capabilities
Revision r307394 removed (by mistake) the code that sets IFCAP_NETMAP
in if_capabilities on netmap_attach. This patch reverts this change.

Reviewed by:	np
Approved by:	gnn (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D17987
2018-11-28 14:07:34 +00:00
Vincenzo Maffione
2e42b74a6f vtnet: fix netmap support
netmap(4) support for vtnet(4) was incomplete and had multiple bugs.
This commit fixes those bugs to bring netmap on vtnet to a functional state.

Changelist:
  - handle errors returned by virtqueue_enqueue() properly (they were
    previously ignored)
  - make sure that either netmap or the rest of the kernel (never both)
    accesses each virtqueue.
  - compute the number of netmap slots for TX and RX separately, according to
    whether indirect descriptors are used or not for a given virtqueue.
  - make sure sglist are freed according to their type (mbufs or netmap
    buffers)
  - add support for multiqueue and netmap host (aka sw) rings.
  - intercept VQ interrupts directly instead of intercepting them in txq_eof
    and rxq_eof. This simplifies the code and makes it easier to make sure
    taskqueues are not running for a VQ while it is in netmap mode.
  - implement vtnet_netmap_config() to cope with changes in the number of queues.

Reviewed by:	bryanv
Approved by:	gnn (mentor)
MFC after:	3 days
Sponsored by:	Sunny Valley Networks
Differential Revision:	https://reviews.freebsd.org/D17916
2018-11-14 15:39:48 +00:00
Bjoern A. Zeeb
be01db051f Remove redundant redeclaration of netmap_vp_reg().
This should unbreak sparc64 and powerpc LINT builds.
2018-10-24 14:14:49 +00:00
Vincenzo Maffione
2a7db7a63d netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
    - Move large parts of VALE code to a new file and header netmap_bdg.[ch].
      This is useful to reuse the code within upcoming projects.
    - Improvements and bug fixes to pipes and monitors.
    - Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
      handle differences between FreeBSD and Linux.
    - Introduce some new helper functions to handle more host rings and fake
      rings (netmap_all_rings(), netmap_real_rings(), ...)
    - Added new sysctl to enable/disable hw checksum in emulated netmap mode.
    - nm_inject: add support for NS_MOREFRAG

Approved by:	gnn (mentor)
Differential Revision:	https://reviews.freebsd.org/D17364
2018-10-23 08:55:16 +00:00
David Bright
53e992cfb9 Fix several memory leaks.
The libkqueue tests have several places that leak memory by using an
idiom like:

puts(kevent_to_str(kevp));

Rework to save the pointer returned from kevent_to_str() and then
free() it after it has been used.

Reported by:	asomers (pointer to Coverity), Coverity
CID:		1296063, 1296064, 1296065, 1296066, 1296067, 1350287, 1394960
Sponsored by:	Dell EMC
2018-08-14 19:12:45 +00:00
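The rework described in commit 53e992cfb9 follows a common C pattern: keep the returned pointer, use it, then free it. The sketch below is not the libkqueue test code; `event_to_str` is a hypothetical stand-in for kevent_to_str() and is only assumed to return heap memory that the caller owns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in: returns a malloc()ed string owned by the caller. */
static char *
event_to_str(int ident)
{
	char *buf = malloc(32);

	if (buf != NULL)
		snprintf(buf, 32, "kevent ident=%d", ident);
	return (buf);
}

/* Leak-free variant of puts(event_to_str(...)). */
static int
print_event(int ident)
{
	char *s = event_to_str(ident);

	if (s == NULL)
		return (-1);
	puts(s);
	free(s);	/* release the string once it has been printed */
	return (0);
}
```

Passing the call result straight to puts() would drop the only copy of the pointer, so the buffer could never be freed.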
Matt Macy
24a7d6d3a6 netmap and iflib drivers, silence unused var warnings 2018-05-19 05:57:26 +00:00
Matt Macy
3535fae847 netmap: compare e1 with e2, not with itself 2018-05-19 05:37:18 +00:00
Matt Macy
cfa866f6a1 netmap: pull fix for 32-bit support from upstream
Approved by:	sbruno
2018-05-18 03:38:17 +00:00
Brooks Davis
1315f9b59f Fix build on 32-bit systems. 2018-04-13 19:43:23 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header have now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Vincenzo Maffione
4f80b14ce2 netmap: align codebase to upstream version v11.4
Changelist:
  - remove unused nkr_slot_flags
  - new nm_intr adapter callback to enable/disable interrupts
  - remove unused sysctls and document the other sysctls
  - new infrastructure to support NS_MOREFRAG for NIC ports
  - support for external memory allocator (for now linux-only),
    including linux-specific changes in common headers
  - optimizations within netmap pipes datapath
  - improvements on VALE control API
  - new nm_parse() helper function in netmap_user.h
  - various bug fixes and code clean up

Approved by:	hrs (mentor)
2018-04-09 09:24:26 +00:00
Vincenzo Maffione
46023447b6 netmap: align if_ptnet guest driver to the upstream code (commit 0e15788)
The change upgrades the driver to use the split Communication Status
Block (CSB) format. In this way the variables written by the guest
and read by the host are allocated in a different cacheline than
the variables written by the host and read by the guest; this is
needed to avoid cache thrashing.

Approved by:	hrs (mentor)
2018-04-04 21:31:12 +00:00
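The cacheline separation described in commit 46023447b6 can be illustrated with C11 alignment specifiers. This is a sketch under assumptions, not the if_ptnet layout: the struct names, the field names and the 64-byte line size are all hypothetical.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SKETCH 64	/* assumed cache line size in bytes */

/* Variables written by the guest and read by the host (hypothetical). */
struct csb_guest_to_host {
	alignas(CACHELINE_SKETCH) uint32_t head;
	uint32_t cur;
};

/* Variables written by the host and read by the guest (hypothetical). */
struct csb_host_to_guest {
	alignas(CACHELINE_SKETCH) uint32_t hwcur;
	uint32_t hwtail;
};

/* Both halves side by side: each one starts on its own cache line. */
struct csb_pair {
	struct csb_guest_to_host gh;
	struct csb_host_to_guest hg;
};
```

Because each half is aligned and padded to a full line, a write by one side does not invalidate the line the other side is writing.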
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
26c1d774b5 dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the surrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
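The kind of check that an array allocator adds on top of a plain multiplication inside malloc can be sketched in userspace C. This is not the kernel's mallocarray(9): `alloc_array_checked` is a hypothetical helper, and returning NULL on overflow is a choice made for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for nmemb elements of the given size.
 * Returns NULL instead of allocating when nmemb * size would wrap around.
 */
static void *
alloc_array_checked(size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size)
		return (NULL);
	return (malloc(nmemb * size));
}
```

A call such as `alloc_array_checked(SIZE_MAX, 2)` is rejected up front, whereas `malloc(SIZE_MAX * 2)` would silently request a much smaller buffer.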
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-27 14:52:40 +00:00
Gleb Smirnoff
e8fd18f306 Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list.  Not all functions need the second
argument, some don't even need the first one.  The second argument
lives in next cache line, so not dereferencing it is a performance
gain.  This was discovered in sendfile(2), which will be covered by
next commits.

The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.

Reviewed by:	gallatin, kbowling
Differential Revision:	https://reviews.freebsd.org/D12615
2017-10-09 20:35:31 +00:00
Luiz Otavio O Souza
4dd4446129 Restore the changes done in r313982: Replace zero with NULL for pointers.
Spotted by:	Harry Schmalzbauer
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:59:56 +00:00
Luiz Otavio O Souza
a02dbe4ca1 Do not allow the use of the loopback interface in netmap.
The generic support in netmap sends the packets using if_transmit() and the
loopback does not support packets coming from if_transmit()/if_start().

This avoids the use of the loopback interface and the subsequent crash that
happens when the application sends packets to the loopback interface.

Details in:	https://github.com/luigirizzo/netmap/issues/322
Reported by:	Vincenzo Maffione <v.maffione@gmail.com>
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:28:35 +00:00
Luiz Otavio O Souza
c3e9b4db8c Update the current version of netmap to bring it in sync with the github
version.

This commit contains mostly refactoring, a few fixes and minor added
functionality.

Submitted by:	Vincenzo Maffione <v.maffione at gmail.com>
Requested by:	many
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-06-12 22:53:18 +00:00
Pedro F. Giffuni
4d24901ac9 sys/dev: Replace zero with NULL for pointers.
Makes things easier to read, plus architectures may set NULL to something
different than zero.

Found with:	devel/coccinelle
MFC after:	3 weeks
2017-02-20 03:43:12 +00:00
Mark Johnston
4c55b4e8da Unbreak the gcc build of netmap.
This fixes several LINT targets.

Reviewed by:	Vincenzo Maffione
2017-02-14 21:36:18 +00:00
Sean Bruno
25a3341048 Fix panic on mb_free_ext() due to NULL destructor.
This used to happen because of the SET_MBUF_DESTRUCTOR() called
on unregif.

Submitted by:	Vincenzo Maffione <v.maffione@gmail.com>
2017-01-12 16:24:10 +00:00
Adrian Chadd
67ca1051e0 [netmap] call RLOCK /and/ RUNLOCK.
Reported by: olivier
2017-01-02 06:36:12 +00:00
Adrian Chadd
869d88787d [netmap] fix locking regressions
* Firmware oriented NICs may need to sleep in their configuration paths.
  Use RLOCK instead of WLOCK to allow this to again occur.

  This fixes netmap on cxgbe.

* Change the worker lock to a normal mutex rather than a spin lock.
  Drivers shouldn't be doing netmap work from the fast interrupt
  handlers, so it's not required to be a spinlock.

Submitted by:	luigi, Vincenzo Maffione <v.maffione@gmail.com>
Reviewed by:	jhb
2016-12-30 14:47:46 +00:00
Ed Maste
54c7693f2c netmap: add cast to fix powerpc64 LINT kernel
Attempt to fix powerpc64 LINT kernel broken by r308000. Netmap's use of
a uint64_t wchan seems odd, but in the interest of minimizing this
change just cast through uintptr_t to silence the compiler warning.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D8669
2016-11-30 02:00:30 +00:00
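The cast described in commit 54c7693f2c can be illustrated in isolation. The helpers below are hypothetical and are not taken from the netmap sources; they only show a 64-bit integer being converted to and from a pointer by way of `uintptr_t`.

```c
#include <stdint.h>

/* Convert a 64-bit handle back to a pointer via uintptr_t. */
static void *
handle_to_ptr(uint64_t handle)
{
	/* A direct (void *)handle cast can draw a compiler warning. */
	return ((void *)(uintptr_t)handle);
}

/* Store a pointer in a 64-bit handle via uintptr_t. */
static uint64_t
ptr_to_handle(const void *p)
{
	return ((uint64_t)(uintptr_t)p);
}
```

A pointer stored with `ptr_to_handle()` and recovered with `handle_to_ptr()` compares equal to the original.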
Sean Bruno
c371f1143c The buffer address is always overwritten in the extended descriptor format,
we have to refresh it ... always.  This fixes problems reported in NetMap
with em(4) devices after conversion to extended descriptor format in
svn r293331.

Submitted by:	luigi@
Reported by:	franco@opnsense.org
MFC after:	2 days
2016-10-28 13:37:58 +00:00
Luigi Rizzo
844a6f0c53 Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail:
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range
  (thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work.
2016-10-27 09:46:22 +00:00
Ed Maste
984ff0d910 netmap: fix kernel build on GCC-using architectures
GCC produced a multiple declaration warning from the
SYSCTL_DECL(_dev_netmap).
2016-10-21 13:51:47 +00:00
Sepherosa Ziehau
ffaa5deb38 netmap: Unbreak LINT-VIMAGE building
Sponsored by:	Microsoft
2016-10-21 06:32:45 +00:00
Sepherosa Ziehau
e3f94e5133 netmap: Unbreak i386 LINT building
Sponsored by:	Microsoft
2016-10-21 06:05:16 +00:00
Luigi Rizzo
a2a7409151 remove stale and unused code from various files
fix build on 32 bit platforms
simplify logic in netmap_virt.h

The commands (in net/netmap.h) to configure communication with the
hypervisor may be revised soon.
At the moment they are unused so this will not be a change of API.
2016-10-18 16:18:25 +00:00
Luigi Rizzo
6ad42d71b2 remove trailing whitespace. No code changes. 2016-10-18 15:41:57 +00:00
Sean Bruno
225d33ff5c Restore svn r306772 that was overwritten by netmap import at svn r307394
#include <sys/selinfo.h> should be here as all drivers that support
netmap need to use this file regardless.
2016-10-18 14:48:41 +00:00
Luigi Rizzo
a9e644cd24 add two missing files for the netmap import 2016-10-16 15:22:17 +00:00
Luigi Rizzo
37e3a6d349 Import the current version of netmap, aligned with the one on github.
This commit, long overdue, contains contributions in the last 2 years
from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including:
+ fixes on monitor ports
+ the 'ptnet' virtual device driver, and ptnetmap backend, for
  high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit)
+ improved emulated netmap mode
+ more robust error handling
+ removal of stale code
+ various fixes to code and documentation (some mixup between RX and TX
  parameters, and private and public variables)

We also include an additional tool, nmreplay, which is functionally
equivalent to tcpreplay but operating on netmap ports.
2016-10-16 14:13:32 +00:00
Sean Bruno
87de0cd185 Move netmap selinfo.h in to sensible location.
netmap_kern.h currently requires all drivers including it to include
selinfo.h.

Submitted by:	mmacy@nextbsd.org
Reviewed by:	gnn
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D5334
2016-10-06 17:54:34 +00:00
Eric Joyner
ff9b61ca07 Fix linker warnings (errors on gcc) that resulted from r304510.
The variables that are extern in the netmap header file should be
defined in ixl_txrx.c (the file that is included in both ixl(4)/ixlv(4)),
not in the main driver source files.

Reported by:	ed@, dim@, ngie@
2016-09-01 01:08:18 +00:00
Jean-Sébastien Pédron
bd937497ea Consistently use device_t
Several files use the internal name of `struct device` instead of
`device_t` which is part of the public API. This patch changes all
`struct device *` to `device_t`.

The remaining occurrences of `struct device` are those referring to the
Linux or OpenBSD version of the structure, or the code is not built on
FreeBSD and it's unclear what to do.

Submitted by:	Matthew Macy <mmacy@nextbsd.org> (previous version)
Approved by:	emaste, jhibbits, sbruno
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7447
2016-08-09 19:32:06 +00:00
Eitan Adler
cef367e6a1 Don't repeat the the word 'the'
(one manual change to fix grammar)

Confirmed With: db
Approved by: secteam (not really, but this is a comment typo fix)
2016-05-17 12:52:31 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
Navdeep Parhar
fddd4f6273 Plug leak in m_unshare.
m_unshare passes on the source mbuf's flags as-is to m_getcl and this
results in a leak if the flags include M_NOFREE.  The fix is to clear
the bits not listed in M_COPYALL before calling m_getcl.  M_RDONLY
should probably be filtered out too but that's outside the scope of this
fix.

Add assertions in the zone_mbuf and zone_pack ctors to catch similar
bugs.

Update netmap_get_mbuf to not pass M_NOFREE to m_getcl.  It's not clear
what the original code was trying to do but it's likely incorrect.
Updated code is no different functionally but it avoids the newly added
assertions.

Reviewed by:	gnn@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D5698
2016-03-26 23:39:53 +00:00
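The fix described in commit fddd4f6273, clearing every bit outside an allow-list before flags are passed on, can be sketched as follows. The flag names and values here are hypothetical; the real M_COPYALL and M_NOFREE definitions live in sys/mbuf.h and are not reproduced.

```c
#include <stdint.h>

/* Hypothetical flag bits standing in for mbuf flags. */
#define F_PKTHDR_SKETCH		0x01u
#define F_EOR_SKETCH		0x02u
#define F_NOFREE_SKETCH		0x08u

/* Hypothetical allow-list playing the role of M_COPYALL. */
#define F_COPYALL_SKETCH	(F_PKTHDR_SKETCH | F_EOR_SKETCH)

/* Keep only the flags that are safe to hand to a new allocation. */
static uint32_t
flags_for_new_buffer(uint32_t src_flags)
{
	return (src_flags & F_COPYALL_SKETCH);
}
```

A source buffer carrying the no-free bit yields a new buffer without it, so the new buffer is released normally.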
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Sean Bruno
b834dcea9a Switch em(4) to the extended RX descriptor format. This matches the
e1000/e1000e split in linux.

Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.

Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.

Change em_receive_checksum() to process the new rxdescriptor format
status bit.

MFC after:	2 weeks
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D3447
2016-01-07 16:42:48 +00:00
Kevin Lo
ddb1359877 Fix typo (s/harware/hardware/) 2015-12-25 14:51:36 +00:00
Adrian Chadd
15b1492c9b Don't call enable_all_rings if the adapter has been freed.
This is a subtle use-after-free race that results in some very undesirable
hang behaviour.

Reviewed by:	pkelsey
Obtained from:	Kip Macy, NextBSD (91a9bd1dbb)
2015-09-07 23:16:39 +00:00
Luigi Rizzo
847adfb7b3 add a use count so the netmap module cannot be unloaded while in use. 2015-07-19 18:07:25 +00:00
Luigi Rizzo
10b8ef3d6a properly destroy persistent vale ports 2015-07-19 18:06:30 +00:00
Luigi Rizzo
9694aad375 do not free NULL if pipe allocation fails 2015-07-19 18:05:49 +00:00
Luigi Rizzo
05f7605789 release a reference when stopping a monitor 2015-07-19 18:04:51 +00:00
Luigi Rizzo
85fe4e7c6b small documentation update 2015-07-19 17:54:42 +00:00
Patrick Kelsey
8aa7fdbd78 Add netmap support for ixgbe SRIOV VFs (that is, to if_ixv).
Differential Revision: https://reviews.freebsd.org/D2923
Reviewed by: erj, gnn
Approved by: jmallett (mentor)
Sponsored by: Norse Corp, Inc.
2015-07-15 01:02:01 +00:00
Luigi Rizzo
5f94000ee4 set the refcount for the structure (dropped by mistake in the last commit). 2015-07-13 10:23:52 +00:00
Luigi Rizzo
8fd44c9395 staticize functions only used in netmap.c
(detected by jenkins run with gcc 4.9)

Update documentation on the use of netmap_priv_d,
rename the refcount and use the same structure in
FreeBSD and linux

No functional changes.
2015-07-10 16:05:24 +00:00
Luigi Rizzo
847bf38369 Sync netmap sources with the version in our private tree.
This commit contains large contributions from Giuseppe Lettieri and
Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
and brings in the following:

- fix zerocopy monitor ports and introduce copying monitor ports
  (the latter are lower performance but give access to all traffic
  in parallel with the application)

- exclusive open mode, useful to implement solutions that recover
  from crashes of the main netmap client (suggested by Patrick Kelsey)

- revised memory allocator in preparation for the 'passthrough mode'
  (ptnetmap) recently presented at bsdcan. ptnetmap is described in
        S. Garzarella, G. Lettieri, L. Rizzo;
        Virtual device passthrough for high speed VM networking,
        ACM/IEEE ANCS 2015, Oakland (CA) May 2015
        http://info.iet.unipi.it/~luigi/research.html

- fix rx CRC handling on ixl

- add module dependencies for netmap when building drivers as modules

- minor simplifications to device-specific routines (*txsync, *rxsync)

- general code cleanup (remove unused variables, introduce macros
  to access rings and remove duplicate code)

Applications do not need to be recompiled, unless of course
they want to use the new features (monitors and exclusive open).

Those willing to try this code on stable/10 can just update the
sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
and apply the small patches to individual device drivers.

MFC after:	1 month
Sponsored by:	(partly) Verisign, Cisco
2015-07-10 05:51:36 +00:00
Sean Bruno
23c9098b2a Change EM_MULTIQUEUE to a real kernconf entry and enable support for
up to 2 rx/tx queues for the 82574.

Program the 82574 to enable 5 msix vectors, assign 1 to each rx queue,
1 to each tx queue and 1 to the link handler.

Inspired by DragonFlyBSD, enable some RSS logic for handling tx queue
handling/processing.

Move multiqueue handler functions so that they line up better in a diff
review to if_igb.c

Always enqueue tx work to be done in em_mq_start, if unable to acquire
the TX lock, then this will be processed in the background later by the
taskqueue.  Remove mbuf argument from em_start_mq_locked() as the work
is always enqueued.  (stolen from igb)

Setup TARC, TXDCTL and RXDCTL registers for better performance and stability
in multiqueue and singlequeue implementations. Handle Intel errata  3 and
generic multiqueue behavior with the initialization of TARC(0) and TARC(1)

Bind interrupt threads to cpus in order.  (stolen from igb)

Add 2 new DDB functions, one to display the queue(s) and their settings and
one to reset the adapter.  Primarily used for debugging.

In the multiqueue configuration, bump RXD and TXD ring size to max for the
adapter (4096).  Setup an RDTR of 64 and an RADV of 128 in multiqueue configuration
to cut down on the number of interrupts.  RADV was arbitrarily set to 2x RDTR
and can be adjusted as needed.

Cleanup the display in top a bit to make it clearer where the taskqueue threads
are running and what they should be doing.

Ensure that both queues are processed by em_local_timer() by writing them both
to the IMS register to generate soft interrupts.

Ensure that a soft interrupt is generated when em_msix_link() is run so that
any races between assertion of the link/status interrupt and a rx/tx interrupt
are handled.

Document existing tunables: hw.em.eee_setting, hw.em.msix, hw.em.smart_pwr_down, hw.em.sbp

Document use of hw.em.num_queues and the new kernel option EM_MULTIQUEUE

Thanks to Intel for their continued support of FreeBSD.

Reviewed by:	erj jfv hiren gnn wblock
Obtained from:	Intel Corporation
MFC after:	2 weeks
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D1994
2015-06-03 18:01:09 +00:00
Patrick Kelsey
dd4fcbc594 When a netmap process terminates without the full set of buffers it
was granted via rings and ni_bufs_list_head represented in those rings
and lists (e.g., via SIGKILL), those buffers are no longer available
for subsequent users for the lifetime of the system. To mitigate this
resource leak, reset the allocator state when the last ref to that
allocator is released.

Note that this only recovers leaked resources for an allocator when
there are no longer any users of that allocator, so there remain
circumstances in which leaked allocator resources may not ever be
recovered - consider a set of multiple netmap processes that are all
using the same allocator (say, the global allocator) where members of
that set may be killed and restarted over time but at any given point
there is one member of that set running.

Based on initial work by adrian@.

Reviewed by: Giuseppe Lettieri (g.lettieri@iet.unipi.it), luigi
Approved by: jmallett (mentor)
MFC after: 1 week
Sponsored by: Norse Corp, Inc.
2015-05-15 15:36:57 +00:00
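The mitigation in commit dd4fcbc594, and the limitation its second paragraph notes, can be modelled with a reference-counted pool. This is a toy sketch, not the netmap memory allocator: `struct allocator_sketch` and its helpers are hypothetical.

```c
/* Toy buffer pool with a reference count (all names hypothetical). */
struct allocator_sketch {
	unsigned int refs;		/* current users of the pool */
	unsigned int total_bufs;	/* buffers the pool was created with */
	unsigned int free_bufs;		/* buffers currently available */
};

/* Register one more user of the pool. */
static void
allocator_get(struct allocator_sketch *a)
{
	a->refs++;
}

/* Hand out one buffer; returns -1 when none are left. */
static int
allocator_take_buf(struct allocator_sketch *a)
{
	if (a->free_bufs == 0)
		return (-1);
	a->free_bufs--;
	return (0);
}

/* Drop one user; the last one out resets the pool to its initial state. */
static void
allocator_put(struct allocator_sketch *a)
{
	if (a->refs > 0 && --a->refs == 0)
		a->free_bufs = a->total_bufs;
}
```

If one of two users exits while holding a buffer, the buffer stays missing until the second user also drops its reference, which mirrors the case where a shared allocator never reaches zero users.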
Rui Paulo
d82f9014fa netmap: improve the netmap attach message on FreeBSD.
MFC after:	1 week
2015-04-11 06:20:46 +00:00
Bjoern A. Zeeb
69cfd6a666 Make ix_crcstrip a public symbol for the moment; it probably is not
the right solution but I will leave it to experts to untangle this
problem to properly stop the build failures.

At the moment only if_ix.c includes dev/netmap/ixgbe_netmap.h which is
good as ixgbe_netmap.h defines a couple of (file) static variables--thus
local to if_ix.c.
static int ix_crcstrip however now also got checked from ix_txrx.c
(as an extern) and should not be visible there.  In fact we do see
powerpc and powerpc64 build failures because of this.  It is unclear
to me why on other (clang built?) architectures this does not lead
to a reference of an undefined symbol and similar build breakage.
2015-03-24 09:46:47 +00:00
Luigi Rizzo
bc8b78d393 Add native netmap support to ixl.
Preliminary tests indicate 32 Mpps on tx, 24 Mpps on rx
with source and receiver on two different ports of the same 40G card.
Optimizations are likely possible.
The code follows closely the one for ixgbe so i do not
expect stability issues.

Hardware kindly supplied by Intel.

Reviewed by:	Jack Vogel
MFC after:	1 week
2015-02-24 06:20:50 +00:00
Luigi Rizzo
735c8d9528 add MODULE_VERSION, needed to track module dependencies
MFC after:	3 days
2015-02-23 07:28:31 +00:00
Luigi Rizzo
6641c68bcd two minor changes from the master netmap version:
1. handle errors from nm_config(), if any (none of the FreeBSD drivers
   currently returns an error on this function, so this change
   is a no-op at this time)
2. use a full memory barrier on ioctls
2015-02-14 19:03:11 +00:00
Luigi Rizzo
c929ca72c9 whitespace change:
clarify the role of MAKEDEV_ETERNAL_KLD, and remove an old
#ifdef __FreeBSD__ since the code is valid on all platforms.
2015-02-14 18:59:31 +00:00
Adrian Chadd
11c0b69c08 Change the permissions from 0660 to 0600.
Otherwise people in wheel can do things with netmap, including
but not limited to promisc transmit/receive.

Approved by:	luigi
MFC after:	1 week
2015-01-24 19:49:27 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
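The transmit-side check described in commit c25290420e can be sketched with a stand-in packet header. Nothing below comes from sys/mbuf.h: the struct, the enum values and `select_tx_queue` are hypothetical, and the modulo mapping of a flow ID to a queue is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical hash types standing in for the M_HASHTYPE_XXX values. */
enum hashtype_sketch {
	HASHTYPE_NONE_SKETCH = 0,	/* flowid is not valid */
	HASHTYPE_OPAQUE_SKETCH = 1	/* flowid is valid, no particular type */
};

/* Hypothetical stand-in for the relevant packet header fields. */
struct pkthdr_sketch {
	uint32_t flowid;
	uint8_t rsstype;
};

/*
 * Pick a transmit queue: use the flow ID when the hash type says it is
 * valid, otherwise fall back to a caller-supplied value.
 * nqueues must be non-zero.
 */
static uint32_t
select_tx_queue(const struct pkthdr_sketch *h, uint32_t nqueues,
    uint32_t fallback)
{
	if (h->rsstype != HASHTYPE_NONE_SKETCH)
		return (h->flowid % nqueues);
	return (fallback % nqueues);
}
```

The driver-side decision depends only on the hash type field, so no separate flag bit has to be tested before the flow ID is read.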
Luigi Rizzo
0e73f29ae2 add support for private knote lock (reduces lock contention),
adapting OS_selrecord accordingly.
Problem and fix suggested by adrian and jmg
2014-11-13 00:40:34 +00:00
Luigi Rizzo
ad15cc59e9 we need full barriers here 2014-11-13 00:14:25 +00:00
Luigi Rizzo
039dd540f5 in the Linux section, properly define the NMG_LOCK type.
Also import WITH_GENERIC in preparation to adding fine-grained
options to disable specific netmap components.
2014-11-11 00:13:28 +00:00
Luigi Rizzo
204f91dd3a - fix typo: use ring size from the rx ring, not the tx one (they should be
the same, but just in case);
- reuse the previously computed len-1 value
2014-11-11 00:10:44 +00:00
Luigi Rizzo
6435a0dc1b fix a typo 2014-11-10 21:00:23 +00:00
Luigi Rizzo
4e93beff92 initialize *color if passed as an argument 2014-11-10 20:25:33 +00:00
Luigi Rizzo
db5cb21105 sync a comment with our internal repo 2014-11-10 20:19:58 +00:00
Luigi Rizzo
b3d3758852 fix a panic when passing ifioctl from a netmap file descriptor to
the underlying device. This needs to be merged to 10.1

Reported by: Patrick Kelsey
MFC after:	3 days
2014-09-25 16:22:32 +00:00
Luigi Rizzo
7f154b713a adapt the code to different freebsd versions.
Not necessary to MFC
2014-09-25 15:57:57 +00:00
Gleb Smirnoff
c8dfaf382f Mechanically convert to if_inc_counter(). 2014-09-19 03:51:26 +00:00
Gleb Smirnoff
997d2d833f Provide pointer from struct ifnet to struct netmap_adapter,
instead of abusing spare field.
2014-08-31 11:33:19 +00:00
Navdeep Parhar
9721a22d4a Change netmap's global lock to sx instead of a mutex.
Reviewed by:	luigi@
MFC after:	1 day
2014-08-20 23:37:44 +00:00
Luigi Rizzo
1460a86867 staticize two functions, and use proper format for a struct sglist
(reported by bz)
2014-08-17 10:25:27 +00:00
Luigi Rizzo
4bf50f18eb Update to the current version of netmap.
Mostly bugfixes or features developed in the past 6 months,
so this is a 10.1 candidate.

Basically no user API changes (some bugfixes in sys/net/netmap_user.h).

In detail:

1. netmap support for virtio-net, including in netmap mode.
  Under bhyve and with a netmap backend [2] we reach over 1Mpps
  with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.

2. (kernel) add support for multiple memory allocators, so we can
  better partition physical and virtual interfaces giving access
  to separate users. The most visible effect is one additional
  argument to the various kernel functions to compute buffer
  addresses. All netmap-supported drivers are affected, but changes
  are mechanical and trivial

3. (kernel) simplify the prototype for *txsync() and *rxsync()
  driver methods. All netmap drivers affected, changes mostly mechanical.

4. add support for netmap-monitor ports. Think of it as a mirroring
  port on a physical switch: a netmap monitor port replicates traffic
  present on the main port. Restrictions apply. Drive carefully.

5. if_lem.c: support for various paravirtualization features,
  experimental and disabled by default.
  Most of these are described in our ANCS'13 paper [1].
  Paravirtualized support in netmap mode is new, and beats the
  numbers in the paper by a large factor (under qemu-kvm,
  we measured guest-host throughput up to 10-12 Mpps).

A lot of refactoring and additional documentation in the files
in sys/dev/netmap, but apart from #2 and #3 above, almost nothing
of this stuff is visible to other kernel parts.

Example programs in tools/tools/netmap have been updated with bugfixes
and to support more of the existing features.

This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.

A lot of this code has been contributed by my colleagues at UNIPI,
including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.

MFC after:	3 days.
2014-08-16 15:00:01 +00:00
Gleb Smirnoff
fcc34a238c Fix style bug: rename the refcount field of m_ext to ext_cnt, to match
other members.

Sponsored by:	Nginx, Inc.
2014-07-11 14:34:29 +00:00
Luigi Rizzo
63a3395e5d change the netmap mbuf destructor so the same code works also on FreeBSD 9.
For head and 10 this change has no effect, but on stable/9 it would cause
panics when using emulated netmap on top of a standard device driver.
2014-06-10 16:06:59 +00:00
Luigi Rizzo
348c44a5be Fixes from Franco Fichtner on transparent mode
* The way rings are updated changed with the last API bump.
  Also sync ->head when moving slots in netmap_sw_to_nic().

* Remove a crashing selrecord() call.

* Unclog the logic surrounding netmap_rxsync_from_host().

* Add timestamping to RX host ring.

* Remove a couple of obsolete comments.

Submitted by:	Franco Fichtner
MFC after:	3 days
Sponsored by:	Packetwerk
2014-06-09 15:46:11 +00:00
Luigi Rizzo
46aa1303f3 sync the code with the one in stable/10
(wrap the if_t compatibility function into a __FreeBSD_version
conditional block)
2014-06-09 15:44:31 +00:00
Luigi Rizzo
e4166283fb better handling of netmap emulation over standard device drivers:
plug a potential mbuf leak, and detect bogus drivers that
return ENOBUFS even when the packet has been queued.

MFC after:	3 days
2014-06-06 18:36:02 +00:00
Luigi Rizzo
997b054cf1 introduce mbq_lock() and mbq_unlock() for the mbq,
so it is easier to build the same code on linux
(this generalizes the change in svn 267142)

MFC after:	3 days
2014-06-06 18:02:32 +00:00
Luigi Rizzo
0dc809c034 move netmap_getna() to a freebsd-specific file 2014-06-06 16:23:08 +00:00
Luigi Rizzo
89cc25561c align comments with the ones in our development trunk 2014-06-06 14:58:25 +00:00
Luigi Rizzo
d8e1c53b15 rate limit some error messages 2014-06-06 14:57:40 +00:00
Luigi Rizzo
5899a007ae remove two debugging messages, align comments with the code
in our development trunk
2014-06-06 14:57:16 +00:00
Luigi Rizzo
e31c6ec7e2 add checks for invalid buffer pointers and lengths 2014-06-06 10:50:14 +00:00
Luigi Rizzo
441ab64f52 prevent a panic when the netdev/ifp is not set in attach
(internal  c63a7b85)

MFC after:	3 days
2014-06-06 10:40:20 +00:00
Andrey Zonov
dc8a95e62b Use mtx_lock_spin/mtx_unlock_spin primitives on spin lock
Reviewed by:	luigi
MFC after:	1 week
2014-06-06 00:24:04 +00:00