Commit graph

37171 commits

Author SHA1 Message Date
Conrad Meyer
b6db1cc710 random(4): De-export random_sources list
The internal datastructures do not need to be visible outside of
random_harvestq, and this helps ensure they are not misused.

No functional change.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22485
2019-11-22 20:24:15 +00:00
Scott Long
02d4535d2d Mark hpt27xx for removal in 13.0; all CAM drivers will be Giant-free by then.
Relnotes:	yes
2019-11-22 20:23:22 +00:00
Conrad Meyer
d7a23f9f6b random(4): Use ordinary sysctl definitions
There's no need to dynamically populate them; the SYSCTL_ macros take care
of load/unload appropriately already (and random_harvestq is 'standard' and
cannot be unloaded anyway).

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22484
2019-11-22 20:22:29 +00:00
Conrad Meyer
f19de0a945 random(4): Abstract loader entropy injection
Break random_harvestq_prime up into some logical subroutines.  The goal
is that it becomes easier to add other early entropy sources.

While here, drop pre-12.0 compatibility logic.  loader default configuration
should preload the file as expected since 12.0.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22482
2019-11-22 20:20:37 +00:00
Conrad Meyer
92ebf15da5 random(4): Remove unused definitions
Approved by:	csprng(gordon, markm)
Differential Revision:	https://reviews.freebsd.org/D22481
2019-11-22 20:18:07 +00:00
Conrad Meyer
cb285f7c7c random/ivy: Provide mechanism to read independent seed values from rdrand
On x86 platforms with the intrinsic, rdrand is a deterministic bit generator
(AES-CTR) seeded from an entropic source.  On x86 platforms with rdseed, it
is something closer to the upstream entropic source.  (There is more nuance;
a block diagram is provided in [1].)

On devices with rdrand and without rdseed, there is no good intrinsic for
accessing the good entropic source directly.  However, the DRBG is guaranteed
to reseed every 8 kB on these platforms.  As a conservative option, on such
hardware we can read an extra 7.99kB samples every time we want a sample
from an independent seed.

As one can imagine, this drastically slows the effective read rate of
RDRAND (a factor of 1024 on amd64 and 2048 on ia32).  Microbenchmarks on AMD
Zen (has RDSEED) show an RDRAND rate of 25 MB/s and Intel Haswell (no
RDSEED) show RDRAND of 170 MB/s.  This would reduce the read rate on Haswell
to ~170 kB/s (at 100% CPU).  random(4)'s harvestq thread periodically
"feeds" from pure sources in amounts of 128-1024 bytes.  On Haswell,
enabling this feature increases the CPU time of RDRAND in each "feed" from
approximately 0.7-6 µs to 0.7-6 ms.
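
The slowdown factors and the Haswell estimate above follow from simple
arithmetic.  A minimal sketch, assuming an 8 kB (8192 byte) reseed interval,
RDRAND word sizes of 8 bytes (amd64) and 4 bytes (ia32), and 1 MB = 1024 kB.
The function names are illustrative and do not come from the change itself:

```c
#include <stdint.h>

/* Reseed interval of the RDRAND DRBG as quoted above (8 kB). */
#define RESEED_INTERVAL_BYTES 8192u

/*
 * Number of RDRAND reads needed per kept word when the rest of the reseed
 * interval is read and discarded.
 */
uint32_t
reads_per_independent_word(uint32_t word_bytes)
{
	return (RESEED_INTERVAL_BYTES / word_bytes);
}

/* Effective rate in kB/s, given the raw RDRAND rate in MB/s. */
uint32_t
effective_rate_kbs(uint32_t raw_rate_mbs, uint32_t word_bytes)
{
	return (raw_rate_mbs * 1024u /
	    reads_per_independent_word(word_bytes));
}
```

With the Haswell figure quoted above, effective_rate_kbs(170, 8) yields 170,
which matches the ~170 kB/s estimate.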

Because there is some performance penalty to this more conservative option,
a knob is provided to enable the change.  The change does not affect
platforms with RDSEED.

[1]: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide#inpage-nav-4-2

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22455
2019-11-22 19:30:31 +00:00
Scott Long
8823960b8d Schedule the trm(4) driver for removal. It relies on Giant and thus has
required compat shims in CAM for 12 years.

Relnotes:	yes
2019-11-22 18:50:53 +00:00
John Baldwin
bddf73433e NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.

NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine.  Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS.  This permits using the
TOE to segment the encrypted TLS records.  However, this approach does
have some limitations:

1) Regular TOE sockets cannot be used when the TOE is in this special
   mode.  One can use either TOE and TOE-based KTLS or NIC KTLS, but
   not both at the same time.

2) In NIC KTLS mode, the TOE is only able to accept a per-connection
   timestamp offset that varies in the upper 4 bits.  Put another way,
   only connections whose timestamp offset has the 28 lower bits
   cleared can use NIC KTLS and generate correct timestamps.  The
   driver will refuse to enable NIC KTLS on connections with a
   timestamp offset with any of the lower 28 bits set.  To use NIC
   KTLS, users can either disable TCP timestamps by setting the
   net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
   tcp_new_ts_offset() function to clear the lower 28 bits of the
   generated offset.

3) Because the TCP segmentation relies on fields mirrored in a TCB in
   the TOE, not all fields in a TCP packet can be sent in the TCP
   segments generated from a TLS record.  Specifically, for packets
   containing TCP options other than timestamps, the driver will
   inject an "empty" TCP packet holding the requested options (e.g. a
   SACK scoreboard) along with the segments from the TLS record.
   These empty TCP packets are counted by the
   dev.cc.N.txq.M.kern_tls_options sysctls.
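
The timestamp offset restriction in (2) amounts to a mask test on a 32-bit
value.  A minimal sketch; the macro and function names are hypothetical and
are not the driver's or the TCP stack's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the 28 low-order bits of a 32-bit timestamp offset. */
#define TS_OFFSET_LOW_MASK 0x0fffffffu

/* True if the offset varies only in its upper 4 bits. */
bool
ts_offset_usable(uint32_t ts_offset)
{
	return ((ts_offset & TS_OFFSET_LOW_MASK) == 0);
}

/* Clear the lower 28 bits, as a local patch to the generator might. */
uint32_t
ts_offset_clamp(uint32_t ts_offset)
{
	return (ts_offset & ~TS_OFFSET_LOW_MASK);
}
```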

Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer.  The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records.  In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.

TCP packets for TLS requests are broken down into the following
classes (with associated counters):

- Mbufs that send an entire TLS record in full do not have any waste
  bytes (dev.cc.N.txq.M.kern_tls_full).

- Mbufs that send a short TLS record that ends before the end of the
  trailer (dev.cc.N.txq.M.kern_tls_short).  For sockets using AES-CBC,
  the encryption must always start at the beginning, so if the mbuf
  starts at an offset into the TLS record, the offset bytes will be
  "waste" bytes.  For sockets using AES-GCM, the encryption can start
  at the 16 byte block before the starting offset capping the waste at
  15 bytes.

- Mbufs that send a partial TLS record that has a non-zero starting
  offset but ends at the end of the trailer
  (dev.cc.N.txq.M.kern_tls_partial).  In order to compute the
  authentication hash stored in the trailer, the entire TLS record
  must be sent as input to the crypto engine, so the bytes before the
  offset are always "waste" bytes.
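
The waste accounting for the short and partial classes can be sketched as
follows.  This is an illustration only: the names are hypothetical, and the
offset is assumed to be measured from the start of the encrypted payload so
that it lines up with 16 byte cipher blocks:

```c
#include <stdint.h>

enum cipher_mode { MODE_CBC, MODE_GCM };

/* AES block size in bytes; GCM can restart at any block boundary. */
#define CIPHER_BLOCK_LEN 16u

/* Waste bytes for a short record that starts at payload_offset. */
uint32_t
short_record_waste(enum cipher_mode mode, uint32_t payload_offset)
{
	if (mode == MODE_GCM)
		return (payload_offset % CIPHER_BLOCK_LEN); /* at most 15 */
	return (payload_offset);	/* CBC must start at the beginning */
}

/* Waste bytes for a partial record: everything before the offset. */
uint32_t
partial_record_waste(uint32_t payload_offset)
{
	return (payload_offset);
}
```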

In addition, other per-txq sysctls are provided:

- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
  using AES-CBC.

- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
  using AES-GCM.

- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
  compensate for the TOE engine not being able to set FIN on the last
  segment of a TLS record if the TLS record mbuf had FIN set.

- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
  txq including full, short, and partial records.

- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
  and payload) sent for TLS record requests.

- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
  record requests.

To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:

hw.cxgbe.config_file=kern_tls
hw.cxgbe.kern_tls=1

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21962
2019-11-21 19:30:31 +00:00
Ian Lepore
e3c42ad809 Rewrite iicdev_writeto() to use a single buffer and a single iic_msg, rather
than effectively doing scatter/gather IO with a pair of iic_msgs that direct
the controller to do a single transfer with no bus STOP/START between the
two buffers.  It turns out we have multiple i2c hardware drivers that don't
honor the NOSTOP and NOSTART flags; sometimes they just try to do the
transfers anyway, creating confusing failures or leading to corrupted data.
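
The single-buffer approach can be sketched in userland C as below.  This is
not the kernel code: build_write_buffer() is a hypothetical helper, and a
one-byte register address is assumed:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build one contiguous buffer holding the register address followed by the
 * payload, so the whole write can be described by a single message.
 * Returns NULL on allocation failure; the caller frees the result.
 */
uint8_t *
build_write_buffer(uint8_t regaddr, const void *data, size_t datalen,
    size_t *outlen)
{
	uint8_t *buf;

	buf = malloc(datalen + 1);
	if (buf == NULL)
		return (NULL);
	buf[0] = regaddr;
	if (datalen > 0)
		memcpy(buf + 1, data, datalen);
	*outlen = datalen + 1;
	return (buf);
}
```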
2019-11-21 19:13:05 +00:00
Hans Petter Selasky
c4e11f2231 Add USB ID for Diamond Multimedia BVU195 Display Link device.
Submitted by:	darius@dons.net.au
PR:		242128
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-11-21 16:42:25 +00:00
Gleb Smirnoff
71f0077631 Remove sio(4).
It had been disconnected from build in r181233 in 2008.

Reviewed by:	imp
2019-11-21 01:24:49 +00:00
Conrad Meyer
c41faf5591 random/ivy: Trivial refactoring
It is clearer to me to return success/error (true/false) instead of some
retry count linked to the inline assembly implementation.

No functional change.

Approved by:	core(csprng) => csprng(markm)
Differential Revision:	https://reviews.freebsd.org/D22454
2019-11-20 19:55:43 +00:00
Andriy Gapon
97d8f008af hyperv/storvsc: stash a pointer to hv_storvsc_request in ccb
A SIM-private field is used for that.
The pointer can be useful when examining a state of a queued ccb.
E.g., a ccb on a da_softc.pending_ccbs.

MFC after:	2 weeks
2019-11-19 07:20:59 +00:00
Alexander Motin
7280125e81 Add ioat_get_domain() to ioat(4) KPI.
This allows NUMA-aware consumers to reduce inter-domain traffic.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-11-19 02:09:04 +00:00
Alexander Motin
f0dd6a1787 Call bus_dma_dmar_set_buswide(9) added in r354830.
PLX NTB sends translated DMA requests not only from itself, but from all
slots and functions of its bus.  By default DMAR blocks those additional
requests.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-11-19 02:03:10 +00:00
Konstantin Belousov
fa83f68917 Add x86 msr tweak KPI.
Use the KPI to tweak MSRs in mitigation code.

Reviewed by:	markj, scottl
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D22431
2019-11-18 20:53:57 +00:00
Scott Long
e372160177 TSX Asynchronous Abort mitigation for Intel CVE-2019-11135.
This CVE has already been announced in FreeBSD SA-19:26.mcu.

Mitigation for TAA involves either turning off TSX or turning on the
VERW mitigation used for MDS. Some CPUs will also be self-mitigating
for TAA and require no software workaround.

Control knobs are:
machdep.mitigations.taa.enable:
        0 - no software mitigation is enabled
        1 - attempt to disable TSX
        2 - use the VERW mitigation
        3 - automatically select the mitigation based on processor
	    features.

machdep.mitigations.taa.state:
        inactive        - no mitigation is active/enabled
        TSX disable     - TSX is disabled in the bare metal CPU as well as
                          any virtualized CPUs
        VERW            - VERW instruction clears CPU buffers
	not vulnerable	- The CPU has identified itself as not being
			  vulnerable

Nothing in the base FreeBSD system uses TSX.  However, the instructions
are straight-forward to add to custom applications and require no kernel
support, so the mitigation is provided for users with untrusted
applications and tenants.

Reviewed by:	emaste, imp, kib, scottph
Sponsored by:	Intel
Differential Revision:	22374
2019-11-16 00:26:42 +00:00
Alexander Motin
348efb140e Initialize *comp_update with valid value.
I've noticed that sometimes with enabled DMAR initial write from device
to this address is somehow getting delayed, triggering assertion due to
zero default being invalid.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-15 23:01:09 +00:00
Alexander Motin
1f4a469d36 Cleanup address range checks in ioat(4).
- Deduce allowed address range for bus_dma(9) from the hardware version.
Different versions (CPU generations) have different documented limits.
 - Remove difference between address ranges for src/dst and crc.  At least
docs for few recent generations of CPUs do not mention anything like that,
while older are already limited with above limits.
 - Remove address assertions from arguments.  While I do not think the
addresses out of allowed ranges should realistically happen there due to
the platforms physical address limitations, there is now bus_dma(9) to
make sure of that, preferably via IOMMU.
 - Since crc now has the same address range as src/dst, remove crc_dmamap,
reusing dst2_dmamap instead.

Discussed with:	cem
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-15 22:47:59 +00:00
Navdeep Parhar
5877e649f0 cxgbev(4): Catch up with the pciids in the PF driver.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2019-11-15 18:48:14 +00:00
Gleb Smirnoff
782b97cb80 Fix regression from r353841: ctx.rc needs to be initialized,
otherwise driver might silently fail to initialize.

Pointy hat to:	glebius
2019-11-15 18:02:37 +00:00
Josh Paetzel
4c6bf7c398 Fix build with GCC
Fix suggested by:	jhb, scottl
Sponsored by:	Panzura
2019-11-15 01:07:39 +00:00
Josh Paetzel
052e12a508 Add the pvscsi driver to the tree.
This driver allows usage of the paravirt SCSI controller
in VMware products like ESXi.  The pvscsi driver provides a
substantial performance improvement in block devices versus
the emulated mpt and mps SCSI/SAS controllers.

Error handling in this driver has not been extensively tested
yet.

Submitted by:	vbhakta@vmware.com
Relnotes:	yes
Sponsored by:	VMware, Panzura
Differential Revision:	D18613
2019-11-14 23:31:20 +00:00
Alexander Motin
3eb70a09f4 Pass more reasonable WAIT flags to bus_dma(9) calls.
MFC after:	2 weeks
2019-11-14 04:39:48 +00:00
Alexander Motin
7f215e071e Make ntb(4) send bus_get_dma_tag() requests to parent buses passing real
bus' child pointers instead of grandchildren.

DMAR does not like requests from devices not parented directly by PCI.

MFC after:	2 weeks
2019-11-14 04:34:58 +00:00
Scott Long
2058e7dbde Stop the VESA driver from whining loudly in the dmesg during boot on
systems that use EFI instead of BIOS.
2019-11-13 15:31:31 +00:00
John Baldwin
a1b2b6e184 Create a file to hold shared routines for dealing with T6 key contexts.
ccr(4) and TLS support in cxgbe(4) construct key contexts used by the
crypto engine in the T6.  This consolidates some duplicated code for
helper functions used to build key contexts.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22156
2019-11-13 00:53:45 +00:00
Konstantin Belousov
c08973d09c Workaround for Intel SKL002/SKL012S errata.
Disable the use of executable 2M page mappings in EPT-format page
tables on affected CPUs.  For bhyve virtual machines, this effectively
disables all use of superpage mappings on affected CPUs.  The
vm.pmap.allow_2m_x_ept sysctl can be set to override the default and
enable mappings on affected CPUs.

Alternate approaches have been suggested, but at present we do not
believe the complexity is warranted for typical bhyve use cases.

Reviewed by:	alc, emaste, markj, scottl
Security:	CVE-2018-12207
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D21884
2019-11-12 18:01:33 +00:00
D Scott Phillips
178d6bc844 nvdimm(4): Fix various problems when using the second label index block
struct nvdimm_label_index is dynamically sized, with the `free`
bitfield expanding to hold `slot_cnt` entries. Fix a few places
where we were treating the struct as though it had a fixed size.
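
Why sizeof() alone is not enough for such a struct can be sketched as
follows.  The struct here is a simplified, hypothetical stand-in, not the
real struct nvdimm_label_index, and any padding required by the on-media
format is ignored:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an index header with a trailing bitmap. */
struct label_index {
	uint32_t slot_cnt;	/* number of label slots tracked */
	uint8_t  free[];	/* one bit per slot, sized at run time */
};

/* Bytes needed for a bitmap with one bit per slot, rounded up. */
size_t
free_bitmap_bytes(uint32_t slot_cnt)
{
	return (((size_t)slot_cnt + 7) / 8);
}

/* Total size of the index; sizeof() alone omits the bitmap. */
size_t
label_index_size(uint32_t slot_cnt)
{
	return (offsetof(struct label_index, free) +
	    free_bitmap_bytes(slot_cnt));
}
```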

Reviewed by:	cem
Approved by:	scottl (mentor)
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D22253
2019-11-12 16:24:37 +00:00
D Scott Phillips
cf8b104f04 nvdimm(4): Only expose namespaces for accessible data SPAs
Apply the same user accessible filter to namespaces as is applied
to full-SPA devices. Also, explicitly filter out control region
SPAs which don't expose the nvdimm data area.

Reviewed by:	cem
Approved by:	scottl (mentor)
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21987
2019-11-12 15:50:30 +00:00
Alexander Motin
028d96899b Add compact scratchpad protocol for ntb_transport(4).
Previously ntb_transport(4) required at least 6 scratchpad registers,
plus 2 more for each additional memory window.  That is too much for some
configurations, where several drivers have to share resources of the same
NTB hardware.  This patch introduces new compact version of the protocol,
requiring only 3 scratchpad registers, plus one more for each additional
memory window.  The optimization is based on the fact that none of version,
number of windows or number of queue pairs really needs more than one byte
each, and window sizes of 4GB are not very useful now.  The new protocol
is activated automatically when the configuration is low on scratchpad
registers, or it can be activated explicitly with loader tunable.
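
The idea of folding three one-byte fields into a single 32-bit scratchpad
register can be sketched as below.  The bit layout and the function names
are assumptions made for illustration; the actual register layout used by
the driver may differ:

```c
#include <stdint.h>

/* Pack version, memory window count and queue pair count into one word. */
uint32_t
spad_pack(uint8_t version, uint8_t mw_count, uint8_t qp_count)
{
	return (((uint32_t)version << 16) | ((uint32_t)mw_count << 8) |
	    (uint32_t)qp_count);
}

/* Extract the individual one-byte fields again. */
uint8_t
spad_version(uint32_t reg)
{
	return ((reg >> 16) & 0xff);
}

uint8_t
spad_mw_count(uint32_t reg)
{
	return ((reg >> 8) & 0xff);
}

uint8_t
spad_qp_count(uint32_t reg)
{
	return (reg & 0xff);
}
```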

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-10 03:37:45 +00:00
Alexander Motin
7aafa7c368 Allow splitting PLX NTB BAR2 into several memory windows.
When enabled, the Address Lookup Table (A-LUT) allows specifying a separate
translation for each 1/128th or 1/256th of the BAR2.  Previously it was
used only to limit effective window size by blocking access through some
of A-LUT elements.  This change allows A-LUT elements to also point to
different memory locations, providing upper layers with several (up to 128)
independent memory windows.  A-LUT hardware allows even more flexible
configurations than this, but the NTB KPI has no way to manage that now.
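
How a BAR offset maps to an A-LUT element when the BAR is split into equal
parts can be sketched as follows.  The helper names are hypothetical and the
BAR size is assumed to be an exact multiple of the element count:

```c
#include <stdint.h>

/* Size of one A-LUT element for a BAR split into nelem equal parts. */
uint64_t
alut_elem_size(uint64_t bar_size, uint32_t nelem)
{
	return (bar_size / nelem);
}

/* Index of the A-LUT element that translates the given BAR offset. */
uint32_t
alut_elem_index(uint64_t bar_size, uint32_t nelem, uint64_t offset)
{
	return ((uint32_t)(offset / alut_elem_size(bar_size, nelem)));
}
```

For a 1 GiB BAR2 split 128 ways, each element covers 8 MiB.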

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-10 03:24:53 +00:00
Emmanuel Vadot
306e46eb1e generic_ehci_fdt: Fix compile when EXT_RESOURCES isn't present
2019-11-09 22:25:45 +00:00
Michal Meloun
124a91ac18 Implement support for (soft)linked clocks.
This kind of clock node represents a temporary placeholder for clocks
defined later in the boot process. Also, these are necessary to break
circular dependencies occasionally occurring in complex clock graphs.

MFC after: 3 weeks
2019-11-08 18:57:41 +00:00
Navdeep Parhar
43b5712444 cxgbe(4): Query Vdd from the firmware if its last known value is 0.
TVSENSE may not be ready by the time t4_fw_initialize returns and the
firmware returns 0 if the driver asks for the Vdd before the sensor is
ready.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-08 01:13:12 +00:00
Mark Johnston
1903c60041 iwm: Sync device initialization and reset code with iwlwifi.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:39:17 +00:00
Mark Johnston
666c8655f2 iwm: Implement support for scans with "adaptive" dwell time.
This is required by 9000-series firmware.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:39:04 +00:00
Mark Johnston
c513f15bf0 iwm: Use the default station for all transmits.
This is what iwlwifi seems to do, and the previous behaviour triggered
firmware panics during transmit on a 9560.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:38:49 +00:00
Mark Johnston
d2c7b21a56 iwm: Set flag for pad bytes in offload_assist.
Though we don't otherwise use firmware's offload capabilities, we need
to set this flag when the MAC header's size isn't a multiple of four.
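
The condition described above is a simple alignment test.  A minimal sketch
with a hypothetical helper name; it does not show the driver's actual flag
handling:

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the MAC header length is not a multiple of four bytes. */
bool
mac_header_needs_pad(size_t hdrlen)
{
	return ((hdrlen & 3) != 0);
}
```

For example, a 24 byte header needs no pad flag, while a 26 byte header does.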

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:38:36 +00:00
Mark Johnston
49c76634fb iwm: Use antenna B for TX on 9000-series chips.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:38:17 +00:00
Mark Johnston
09a07cd5ea iwm: Update the station add command for the new RX API.
The firmware expects a new version of the add-station command in
9000-series chips.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:37:55 +00:00
Mark Johnston
c3bfecf3df iwm: Sync iwm_run_init_mvm_ucode() with iwlwifi.
Do not configure bluetooth on newer chips, it causes firmware panics.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:37:30 +00:00
Mark Johnston
1f0976dc58 iwm: Fix scheduler configuration for aux and cmd queue configuration.
- Configure the scheduler only for the management queue.
- Fix a bug when enabling the scheduler: the queues are specified using a
  bitmask.
- Fix style in the area.
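
The difference between a queue index and a queue bitmask can be sketched as
below.  The helper name is hypothetical and this is not the driver's code:

```c
#include <stdint.h>

/* Convert a queue index (0-31) into the bitmask that selects it. */
uint32_t
queue_mask(unsigned int qid)
{
	return ((uint32_t)1 << qid);
}
```

Passing the raw index where a mask is expected would select the wrong
queues: index 9 is the value 9 (bits 0 and 3), while its mask is 0x200.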

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:37:17 +00:00
Mark Johnston
96c5aa2f4b iwm: Implement the new receive path.
This is the multiqueue receive code required for 9000-series chips.
Note that we still only configure a single RX queue for now.  Multiqueue
support will require MSI-X configuration and a scheme for managing a
global pool of RX buffers.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:37:02 +00:00
Mark Johnston
2ca43dacca iwm: Enable all 31 tx queues.
For now iwm only ever uses queue 0 and the management queue, but my 9560
raises a software error interrupt during initialization if this flag is
not set.  iwlwifi sets it for all 7000- and 8000-series hardware, so we
might as well do it unconditionally.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:36:46 +00:00
Mark Johnston
b1a48ccc18 iwm: Explicitly enable MSI on newer chipsets.
9000-series chips implement support for MSI-X interrupts and disable MSI
by default.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:36:25 +00:00
Mark Johnston
1809534a1a iwm: Define the mqrx_supported capability.
The firmware for 9000-series and newer devices has a different receive
API which supports multiple queues.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:36:10 +00:00
Mark Johnston
5b3b7a2df1 iwm: Add device configuration definitions for 9000-series chips.
Match such chips using the device ID.  We should really be checking the
subdevice as well, since a smaller number of 9460 and 9560 devices
actually belong to a new series of devices and require different
firmware, but that will require some extra logic in iwm_attach().

Submitted by:	lwhsu, Guo Wen Jun <blockk2000@gmail.com>
MFC after:	2 weeks
2019-11-07 23:35:54 +00:00
Mark Johnston
d2ec5b521b iwm: Sync the firmware tx_cmd descriptor fields with iwlwifi.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:35:29 +00:00
Mark Johnston
be05a0fd77 iwm: Use the same delays as iwlwifi when resetting the device.
This is required for initialization to succeed for newer device
families.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-11-07 23:35:15 +00:00