amd64: compare TLB shootdown target to all_cpus

On amd64, the pmap code passes all_cpus to
smp_targeted_tlb_shootdown() when unmapping from the
kernel pmap.  This function has an optimized path to send IPIs
to all but itself, which it intends to do when the target
is all cpus.   However, we need to compare the target cpu mask
with all_cpus, rather than using CPU_ISFULLSET().  Comparing with
CPU_ISFULLSET() will only work when we have MAXCPU cpus active in
the system, otherwise, we'll be sending repeated IPIs, rather than
a single IPI to all CPUs but ourself.

Fixing this should reduce the time spent in native_lapic_ipi_wait()
as we will be sending ipis in parallel, rather than one-by-one.
This is confirmed by dtrace.

Reviewed by: alc, jhb, kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D28102
This commit is contained in:
Andrew Gallatin 2021-01-11 20:03:37 -05:00
parent a5584ace96
commit 7eaea04a5b

View file

@ -673,7 +673,7 @@ smp_targeted_tlb_shootdown(cpuset_t mask, pmap_t pmap, vm_offset_t addr1,
/*
* Check for other cpus. Return if none.
*/
if (CPU_ISFULLSET(&mask)) {
if (!CPU_CMP(&mask, &all_cpus)) {
if (mp_ncpus <= 1)
goto local_cb;
} else {
@ -719,7 +719,7 @@ smp_targeted_tlb_shootdown(cpuset_t mask, pmap_t pmap, vm_offset_t addr1,
* (zeroing slot) and reading from it below (wait for
* acknowledgment).
*/
if (CPU_ISFULLSET(&mask)) {
if (!CPU_CMP(&mask, &all_cpus)) {
ipi_all_but_self(IPI_INVLOP);
other_cpus = all_cpus;
CPU_CLR(PCPU_GET(cpuid), &other_cpus);