This commit adds a handler for the new aenq message
ENA_ADMIN_DEVICE_REQUEST_RESET,
which in turn causes the driver to trigger reset of a new type:
ENA_REGS_RESET_DEVICE_REQUEST. Also adds counting of such occurrences in
a new statistic for it.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
When attaching ENA driver, ena_netmap_attach() is invoked which, in turn
calls netmap_attach which, initializes a struct netmap_adapter,
allocating the struct's netmap_ring and the struct selinfo.
When we change the interface number of queues we need to reinit the
netmap adapter struct as well, so we need to detach it in order to free
the memory allocated by netmap_attach and allocate new memory based on
the new parameters like number of rings, ring size etc...
Without detaching and attaching the netmap interface, if we're to change
the number of queues from 8 to 2 for example and try to enable netmap,
the kernel will panic since the original netmap struct within the
kernel's possession still thinks that the driver has 8 queues which will
eventually cause a non-allocated virtual address access fault.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
When processing packets within the rx-flow
ena_netmap_rx_load_desc doesn't know the number of descriptors, so it
sets NS_MOREFRAG to all the slots to indicate that there are more
fragments for this packet.
The code calls ena_netmap_rx_load_desc() for every descriptor in
this packet to map the relevant buffer into the netmap shared memory.
After ena_netmap_rx_load_desc() calls, we need to unset the NS_MOREFRAG
for the last fragment to indicate that this is the last fragment,
so we explicitly turn off NS_MOREFRAG flag.
Current code overrides all other flags and sets NS_BUF_CHANGED.
This patch unsets the relevant flag only.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Netmap index wraps around based on the number of netmap kernel ring
slots.
Currently the driver prefetches the next slot using nm_i + 1 which may
be wrong since it does not handle wrap around.
This patch fixes that by using the kernel API for fetching the next
netmap index.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
In case ena_com_prepare_tx() fails within the netmap tx flow,
the driver will unmap the last socket chain.
Currently, the driver unmaps the wrong socket within
ena_netmap_unmap_last_socket_chain().
Illustration of the flow:
1- ena_netmap_tx_frames()
2- ena_netmap_tx_frame()
3- ena_netmap_tx_map_slots()
3.1- Map slot
3.2- Advance to the next socket
4- ena_com_prepare_tx()
4.1- ena_com_prepare_tx() fails
5- ena_netmap_unmap_last_socket_chain()
In step 5, where the driver unmaps the socket, the netmap
index already points at the next entry, meaning we're unmapping the
wrong socket in case ena_com_prepare_tx() fails.
In order to fix that, the driver should first update the netmap index to
point at the previous entry and only then update the socket parameters.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
This commit changes the code so all global counters will have the
same line break.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
The mbuf is NULL issue happens when the device sends the driver
a completion with a wrong request id.
Trigger a reset whenever this happens.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
This commit adds differentiation for a reset caused by missing tx
completions, by verifying if the driver didn't receive tx
completions caused by missing interrupts.
The cleanup_running field was added to ena_ring because
cleanup_task.ta_pending is zeroed before ena_cleanup() runs.
Also ena_increment_reset_counter() API was added in order to support
only incrementing the reset counter.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
This commit sets the default value for ena_min_poll_delay_us to 100.
This commit does not change the behavior of the driver, the delay is
calculated as MAX(ENA_MIN_ADMIN_POLL_US, delay_us), where the first
field is already defined as 100.
The second parameter, delay_us is taken from ena_min_poll_delay_us
which is currently unset - 0.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
There can be cases when we trigger reset if an admin interrupt
is missing.
In order to identify this use-case specifically,
this commit adds a new reset reason.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
RX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_RX_DESCRIPTOR_MALFORMED.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
TX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_TX_DESCRIPTOR_MALFORMED.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
The driver uses different reset reasons.
Some of them are counted and presented in the driver statistics.
There are cases where statistics are counted on a ring level,
but these are zeroed after a reset procedure takes place.
This commit makes the following changes:
1. Add statistics for the unrepresented reset reasons.
2. Add reset reasons which are counted on a ring level,
to be also global for better tracking.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
This commit is part of the effort of notifying the user of non-optimal
or performance impacting practices.
A new interface is serving as a communication channel
between the device and the driver. One of the goals of this channel is
to create a new mechanism of notifying the driver and user in case of
sub-optimal configuration using a bitmap.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Currently we count all of the newly added and already existing
missing tx completions in each iteration of
check_missing_comp_in_tx_queue() causing duplicate counts
to missing_tx_comp stat.
This commit adds a new counter new_missed_tx within the relevant
function which only counts the newly added missing tx completions
in each iteration of check_missing_comp_in_tx_queue().
This will allow us to update missing_tx_comp stat accurately without
counting duplicates.
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Upstream commit [1] made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().
Upstream commit [2] removed the NULL check conducted by the driver.
This commit also removes err_customer_metrics_alloc goto label.
Commit [2] leaves behind a floating free() statement that
deallocates customer_metrics_array. This commit places the
deallocation statement where it belongs.
[1] commit 4787572d05 ("ifnet: make if_alloc_domain() never fail")
[2] commit aa3860851b ("net: Remove unneeded NULL check for the allocated ifnet")
Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
The Arm True Random Number Generator Firmware Interface provides a way
to query the SMCCC firmware for up to 192 bits of entropy. Use it to
provide another source of randomness to the kernel.
Reviewed by: cem, markm
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46989
This will be used to create an integer with a given version. It can
then be used to check if the SMCCC version is late enough for a driver.
Sponsored by: Arm Ltd
This will be used by other drivers that manage SMCCC firmware services
to use as an attachment point.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46988
Simplify the calls into the SMCCC firmware with the new
arm_smccc_invoke* macros.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46987
When calling into SMCCC functions we often only need a few arguments.
As the current function needs all 8 possible arguments to be set the
unused values will be zero.
Create a macro to pass in the used values, followed by enough zeros,
then the result pointer.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46986
If we have a new enough SPCR, then use it when it provides a
PreciseBaudrate and/or a UartClkFreq.
Sponsored by: Netflix
Reviewed by: andrew,adrian
Differential Revision: https://reviews.freebsd.org/D47097
Two reasons for this: we know it's a uart after we call probe and it
returns successfully. Second, uart passes data between probe and attach
with softc. As it is now, we call probe twice, once in the bidding
process and once after bidding id done. However, the probe process for
uart isn't completely idempotent (we change state of the uart
sometimes). The second call can result in odd behavior (though so far
only in buggy version of other code I've not committed). The bigger
problem is the softc: newbus creates it, we populate it, then frees it
when we don't return 0 to claim the device. It then calls us again, we
repopulate it, and this time it doesn't free it before calling attach.
Returning 0 avoids both of these issues. The justification for doing it
in the commit that changed it was 'while I'm here', so there doesn't
seem to be a use case for it.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47095
When instructed to do so, compute the rclk (baud rate generator clock)
based on the currently programmed divisor and the communicated baud
rate. We only do this once and only for consoles that tell us the
configured rate and flag we can likely safely compute rclk.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47077
This allows one to specify dt:XXXX when the default class isn't compiled
into the kernel. It's not an error to not have a class until we're done
parsing the spec, so defer checking until then.
Sponsored by: Netflix
Reviewed by: adrian, andrew, markj
Differential Revision: https://reviews.freebsd.org/D47078
When instructed to do so, compute the rclk (baud rate generator clock)
based on the currently programmed divisor and the communicated baud
rate. We only do this once and only for consoles that tell us the
configured rate and flag we can likely safely compute rclk.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47076
Disable interrupts before we set the parameters for the UART. Usually,
it makes no difference, but it's possible that setting the baud rate, etc
could create problems if there's data pending, so move the interrupt
disabling ealier.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47075
We have two copies (soon to be three) of reading the divisor. Since it's
a complicated tricky process, abstract it to its own routine.
Sponsored by: Netflix
Reviewed by: andrew, markj
Differential Revision: https://reviews.freebsd.org/D47074
It can be confusing when the ns8250 driver prints error messages with
just ns8250 as the prefix. Add uart: to the live and commented out
printfs.
Sponsored by: Netflix
Reviewed by: andrew, markj
Differential Revision: https://reviews.freebsd.org/D47073
With newer, more diverse hardware designs, the rclk can be
unknown. Currently deployed systems have no standard way to discover the
baud-clock generator frequency. However, sometimes we have a fairly good
idea that the firmware programmed the UART to be the baud rate that it's
telling us it's at. Create a way to instruct the uart class drivers to
compute the baud clock frequency the first time their init routines are
called. Usually the 'divisors' are relatively small, meaning we will
likely have a fairly large error (goes as 1 / (divisor + 1). However,
we also know that the baud-generator clock needs to be divided down
to the baud-rate +/- about 5% (so while the error could be large for
an arbitrary baud-clock, standard baud rates generally will give
an error of 5% or less).
Often, the console speed and the getty-configured speed are the same, so
this heuristic allows boot messages and login sessions to work.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47072
To help debugging, export the rclk a uart is using as
dev.uart.X.rclk. It can be opaque when it is wrong since any error
messages printed to the system console using the wrong rclk aren't
informative.
Sponsored by: Netflix
Reviewed by: andrew, markj
Differential Revision: https://reviews.freebsd.org/D47070
If rclk is set in sysdev, then it was set during the boot process and is
intended to override the defaults. By prefering the sysdev one over the
class, xo=XXXX in hw.uart.console can give the user a usable console for
non-traditional UARTs, especially on !x86 platforms. The default rclk
generally only is good for I/O mapped UARTS or PCI ones that we can do a
table lookup on. Other times, it can be hard to know what a good default
is without more information.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D47069
Packet capturing on dpni is only half-working given the BPF_MTAP call
in the TX path is missing. Add it to see packets in both directions.
MFC after: 3 days
Reviewed by: dsl
Differential Revision: https://reviews.freebsd.org/D47103
Drop variable names of function prototypes since the file is mixed in
listing them or not and they fall out of sync.
MFC after: 1 week
Sponsored by: BBOX.io
Drop variable names of function prototypes since the file is mixed in
listing them or not and they fall out of sync.
MFC after: 1 week
Sponsored by: BBOX.io
Drop variable names of function prototypes since the file is mixed in
listing them or not and they fall out of sync.
MFC after: 1 week
Sponsored by: BBOX.io
Rename the 'struct adapter' to 'struct igc_softc' to avoid type
ambiguity in things like kgdb and make sharing code with e1000 and
ixgbe easier.
MFC after: 1 week
Sponsored by: BBOX.io
Each netmap adapter associated with a physical adapter is attached to a
netmap memory pool. contigmalloc() is used to allocate physically
contiguous memory for the pool, but ideally we would ensure that all
such memory is allocated from the NUMA domain local to the adapter.
Augment netmap's memory pools with a NUMA domain ID, similar to how
IOMMU groups are handled in the Linux port. That is, when attaching to
a physical adapter, ensure that the associated memory pools are local to
the adapter's associated memory domain, creating new pools as needed.
Some types of ifnets do not have any defined NUMA affinity; in this case
the domain ID in question is the sentinel value -1.
Add a sysctl, dev.netmap.port_numa_affinity, which can be used to enable
the new behaviour. Keep it disabled by now to avoid surprises in case
netmap applications are relying on zero-copy optimizations to forward
packets between ports belonging to different NUMA domains.
Reviewed by: vmaffione
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D46666
Do it also for the preloaded disk, in addition to the dynamically
configured device. This is needed to avoid geom checking alignment and
panicing on read of the last sector, e.g. for partition schemes and
label tasting.
PR: 281978
Reported by: bz
Reviewed by: bz, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D47102
Apparently, sometimes on hot plug/unplug, a null cr comes back from
ciss_dequeue_notify. This is clearly a bug, and by ignoring it we're
papering over that bug. We only ever wake the thread after enqueing a
notification or setting a bit about killing the thread, so once we check
the bit isn't the cause, cr can't be NULL unless something else has
dequeued it.
Ideally, this would be fixed, rather than papered over, but this makes a
very old card somewhat more useable for external enclosures. I suspect
it's a race when we set CISS_THREAD_SHUT and another flag (the latter
w/o ciss_mtx held), but I don't see it and w/o hardware to reproduce
it would be hard to know for sure.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
Add hw.ciss.inititor_id to set the initiator to something other than the
default.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
Add support for tracking the maximum physical target and using that to
override the maximum logical target.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
Expose the hw.ciss.force_interrupt tuneable as a sysctl and make it
writeable at runtime.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
Expose the hw.ciss.force_transport tuneable as a sysctl and make it
writeable at runtime.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155
Expose the hw.ciss.nop_message_heartbeat tuneable as a sysctl and make
it writeable at runtime.
PR: 246279
Reviewed by: imp
Tested by: Marek Zarychta
Differential Revision: https://reviews.freebsd.org/D25155