This is required to support > MCS15 as more than 32 bit rate entries are
suddenly available.
This is quite messy - instead of doing typecasts at each mask operation,
this should be migrated to use a macro and have that do the typecast.
For now, the only module implement is 'sample', and that's only partially
implemented. The main issue here with reusing this structure in userland
is that it uses 'rix' everywhere, which requires the userland code to
have access to the current HAL rate table.
For now, this is a very large work in progress.
Specific details:
* The rate control information is per-node at the moment and wrapped
in a TLV, to ease parsing and backwards compatibility.
* .. but so I can be slack for now, the userland statistics are just
a copy of the kernel-land sample node state.
* However, for now use a temporary copy and change the rix entries
to dot11rate entries to make it slightly easier to eyeball.
Problems:
* The actual rate information table is unfortunately indexed by rix
and it doesn't contain a rate code. So the userland side of this
currently has no way to extract out a mapping.
TODO:
* Add a TLV payload to dump out the rate control table mapping so
'rix' can be turned into a dot11 / MCS rate.
* .. then remove the temporary copy.
been bait-and-switched from the rate control code.
This will avoid the panic that I saw and will avoid sending invalid rates
(eg 11a/11g OFDM rates when in 11b, on 11b-only NICs (AR5211)) where the
rate table is not "big".
It also will point out situations where this occurs for the 11n NICs
which will have sufficiently large rate tables that "invalid rix" doesn't
occur.
I'll try to follow this up with a commit that adds a current operating mode
check. The "rix" is only relevant to the current operating mode and rate
table.
PR: kern/165475
for Atheros AR5416 and later wireless devices.
This is a very large commit - the complete history can be
found in the user/adrian/if_ath_tx branch.
Legacy (ie, pre-AR5416) devices also use the per-software
TXQ support and (in theory) can support non-aggregation
ADDBA sessions. However, the net80211 stack doesn't currently
support this.
In summary:
TX path:
* queued frames normally go onto a per-TID, per-node queue
* some special frames (eg ADDBA control frames) are thrown
directly onto the relevant hardware queue so they can
go out before any software queued frames are queued.
* Add methods to create, suspend, resume and tear down an
aggregation session.
* Add in software retransmission of both normal and aggregate
frames.
* Add in completion handling of aggregate frames, including
parsing the block ack bitmap provided by the hardware.
* Write an aggregation function which can assemble frames into
an aggregate based on the selected rate control and channel
configuration.
* The per-TID queues are locked based on their target hardware
TX queue. This matches what ath9k/atheros does, and thus
simplified porting over some of the aggregation logic.
* When doing TX aggregation, stick the sequence number allocation
in the TX path rather than net80211 TX path, and protect it
by the TXQ lock.
Rate control:
* Delay rate control selection until the frame is about to
be queued to the hardware, so retried frames can have their
rate control choices changed. Frames with a static rate
control selection have that applied before each TX, just
to simplify the TX path (ie, not have "static" and "dynamic"
rate control special cased.)
* Teach ath_rate_sample about aggregates - both completion and
errors.
* Add an EWMA for tracking what the current "good" MCS rate is
based on failure rates.
Misc:
* Introduce a bunch of dirty hacks and workarounds so TID mapping
and net80211 frame inspection can be kept out of the net80211
layer. Because of the way this code works (and it's from Atheros
and Linux ath9k), there is a consistent, 1:1 mapping between
TID and AC. So we need to ensure that frames going to a specific
TID will _always_ end up on the right AC, and vice versa, or the
completion/locking will simply get very confused. I plan on
addressing this mess in the future.
Known issues:
* There is no BAR frame transmission just yet. A whole lot of
tidying up needs to occur before BAR frame TX can occur in the
"correct" place - ie, once the TID TX queue has been drained.
* Interface reset/purge/etc results in frames in the TX and RX
queues being removed. This creates holes in the sequence numbers
being assigned and the TX/RX AMPDU code (on either side) just
hangs.
* There's no filtered frame support at the present moment, so
stations going into power saving mode will simply have a number
of frames dropped - likely resulting in a traffic "hang".
* Raw frame TX is going to just not function with 11n aggregation.
Likely this needs to be modified to always override the sequence
number if the frame is going into an aggregation session.
However, general raw frame injection currently doesn't work in
general in net80211, so let's just ignore this for now until
this is sorted out.
* HT protection is just not implemented and won't be until the above
is sorted out. In addition, the AR5416 has issues RTS protecting
large aggregates (anything >8k), so the work around needs to be
ported and tested. Thus, this will be put on hold until the above
work is complete.
* The rate control module 'sample' is the only currently supported
module; onoe/amrr haven't been tested and have likely bit rotted
a little. I'll follow up with some commits to make them work again
for non-11n rates, but they won't be updated to handle 11n and
aggregation. If someone wishes to do so then they're welcome to
send along patches.
* .. and "sample" doesn't really do a good job of 11n TX. Specifically,
the metrics used (packet TX time and failure/success rates) isn't as
useful for 11n. It's likely that it should be extended to take into
account the aggregate throughput possible and then choose a rate
which maximises that. Ie, it may be acceptable for a higher MCS rate
with a higher failure to be used if it gives a more acceptable
throughput/latency then a lower MCS rate @ a lower error rate.
Again, patches will be gratefully accepted.
Because of this, ATH_ENABLE_11N is still not enabled by default.
Sponsored by: Hobnob, Inc.
Obtained from: Linux, Atheros
* Use 64 bit integer types for the sample rate statistics.
When TX'ing 11n aggregates, a 32 bit counter will overflow in a few
hours due to the high packet throughput.
* Create a default label of "" rather than defaulting to "Mb" - that way
if a rate hasn't yet been selected, it won't say "-1 Mb".
Sponsored by: Hobnob, Inc.
packet duration for the ath_rate_sample module.
This doesn't affect the packet TX at all; only how much time the
sample rate module attributes to a completed TX.
This doesn't yet take into account HT40 packet durations as the node info
(needed to know if it's a HT20 or HT40 node) isn't available everywhere
it needs to be.
This removes the chipset-dependent TX DMA completion descriptor groveling.
It should now be (more) portable to other, later atheros chipsets when the
time comes.
o eliminate private state indexed by 802.11 rate codes; use the hal's
rate tables directly to get the same info
o calculate a mask of operational rates to optimize lookups and checks
(instead of using for loops and similar)
o optimize size bin operations
o ignore rates marked as "do not use" in the hal phy tables
o fix bug that caused upshifting to break in 11g once the rate dropped
below 11Mb/s
o add more intelligent multi-rate tx schedules
o add support for 1/2 and 1/4 width channels
o add dev.ath.X.sample_stats sysctl to dump runtime statistics to the console
(needs to go up to a user app)
o export more tuning knobs via sysctls (still a couple of magic constants)
Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral). Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.
Supported by: Hobnob and Marvell
Reviewed by: many
Obtained from: Atheros (some bits)
o no more ds_vdata in tx/rx descriptors
o split h/w tx/rx descriptor from s/w status
o as part of the descriptor split change the rate control module api
so the ath_buf is passed in to the module so it can fetch both
descriptor and status information as needed
o add some const poisoning
Also for sample rate control algorithm:
o split debug msgs (node, rate, any)
o uniformly bounds check rate indices (and in some cases correct checks)
o move array index ops to after bounds checking
o use final tsi from the status block instead of the h/w descriptor
o replace h/w descriptor struct's with proper mask+shift defs (this
doesn't belong here; everything is known by the driver and should
just be sent down so there's no h/w-specific knowledge)
MFC after: 1 month