Commit graph

48 commits

Author SHA1 Message Date
Mark Johnston
9093dd9a66 Implement x86 dtrace_invop_(un)init() in C.
There is no reason for these routines to be written in assembly.  In
the ports of DTrace to other platforms, they are already written in C.
No functional change intended.

MFC after:	1 week
Sponsored by:	Netflix
2019-09-23 15:08:17 +00:00
Mark Johnston
a4b59d3db6 Use an explicit comparison with VM_GUEST_NO.
Reported by:	jhb
MFC with:	r345359
Sponsored by:	The FreeBSD Foundation
2019-03-21 20:07:50 +00:00
Mark Johnston
e362e590f9 Don't attempt to measure TSC skew when running as a VM guest.
It simply doesn't work in general since VCPUs may migrate between
physical cores.  The approach used to measure skew also doesn't
make much sense in a VM.

PR:		218452
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-03-21 02:52:22 +00:00
John Baldwin
d41e41f9f0 Remove very old and unused signal information codes.
These have been supplanted by the MI signal information codes in
<sys/signal.h> since 7.0.  The FPE_*_TRAP ones were deprecated even
earlier in 1999.

PR:		226579 (exp-run)
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D14637
2018-03-27 20:57:51 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Mark Johnston
6c7828a280 Avoid CPU migration in dtrace_gethrtime() on x86.
dtrace_gethrtime() may be called outside of probe context, and in
particular, from the DTRACEIOC_BUFSNAP handler.

Disable interrupts rather than using sched_pin() to help ensure that
we don't call any external functions when in probe context.

PR:		218452
MFC after:	1 week
2017-12-18 17:26:24 +00:00
Patrick Kelsey
67d955aab4 Corrected misspelled versions of rendezvous.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by:	gnn, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10313
2017-04-09 02:00:03 +00:00
Mark Johnston
7174af791e Directly include needed headers rather than relying on pollution.
We get machine/cpu.h via kmem.h -> proc.h -> _vm_domain.h -> seq.h.

Reported by:	Ryan Libby
Sponsored by:	Dell EMC Isilon
X-MFC with:	r313841
2017-02-17 03:27:20 +00:00
Mark Johnston
a11ac730a7 Prevent CPU migration when checking the DTrace nofault flag on x86.
dtrace_trap() consumes page and protection faults triggered by code running
in DTrace probe context. Such faults occur with interrupts disabled and are
detected using a per-CPU flag. Regular faults cause dtrace_trap() to be
called with interrupts enabled, and nothing was ensuring that the flag was
read from the correct CPU. This may result in dtrace_trap() consuming
unrelated page and protection faults when DTrace is enabled, causing the
fault handler to return without actually having handled the fault.

Diagnosed by:	Ryan Libby <rlibby@gmail.com>
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
2017-02-16 23:05:20 +00:00
Mark Johnston
e86e17af79 Merge {amd64,i386}/instr_size.c into x86_instr_size.c.
Also reduce the diff between us and upstream: the input data model will
always be DATAMODEL_NATIVE because of a bug (p_model is never set but is
always initialized to 0), so we don't need to override the caller anyway.
This change is also necessary to support the pid provider for 32-bit
processes on amd64.

MFC after:	2 weeks
2016-07-20 00:02:10 +00:00
John Baldwin
fdce57a042 Add an EARLY_AP_STARTUP option to start APs earlier during boot.
Currently, Application Processors (non-boot CPUs) are started by
MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
SI_SUB_SMP at which point they are released to run kernel threads.
SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
the scheduler and start running threads until fairly late in the
boot.

This change moves SI_SUB_SMP up to just before software interrupt
threads are created allowing the APs to start executing kernel
threads much sooner (before any devices are probed).  This allows
several initialization routines that need to perform initialization
on all CPUs to now perform that initialization in one step rather
than having to defer the AP initialization to a second SYSINIT run
at SI_SUB_SMP.  It also permits all CPUs to be available for
handling interrupts before any devices are probed.

This last feature fixes a problem on with interrupt vector exhaustion.
Specifically, in the old model all device interrupts were routed
onto the boot CPU during boot.  Later after the APs were released at
SI_SUB_SMP, interrupts were redistributed across all CPUs.

However, several drivers for multiqueue hardware allocate N interrupts
per CPU in the system.  In a system with many CPUs, just a few drivers
doing this could exhaust the available pool of interrupt vectors on
the boot CPU as each driver was allocating N * mp_ncpu vectors on the
boot CPU.  Now, drivers will allocate interrupts on their desired CPUs
during boot meaning that only N interrupts are allocated from the boot
CPU instead of N * mp_ncpu.

Some other bits of code can also be simplified as smp_started is
now true much earlier and will now always be true for these bits of
code.  This removes the need to treat the single-CPU boot environment
as a special case.

As a transition aid, the new behavior is available under a new kernel
option (EARLY_AP_STARTUP).  This will allow the option to be turned off
if need be during initial testing.  I plan to enable this on x86 by
default in a followup commit in the next few days and to have all
platforms moved over before 11.0.  Once the transition is complete,
the option will be removed along with the !EARLY_AP_STARTUP code.

These changes have only been tested on x86.  Other platform maintainers
are encouraged to port their architectures over as well.  The main
things to check for are any uses of smp_started in MD code that can be
simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

PR:		kern/199321
Reviewed by:	markj, gnn, kib
Sponsored by:	Netflix
2016-05-14 18:22:52 +00:00
Mark Johnston
6c2806594b Make the second argument of dtrace_invop() a trapframe pointer.
Currently this argument is a pointer into the stack which is used by FBT
to fetch the first five probe arguments. On all non-x86 architectures it's
simply the trapframe address, so this change has no functional impact. On
amd64 it's a pointer into the trapframe such that stack[1 .. 5] gives the
first five argument registers, which are deliberately grouped together in
the amd64 trapframe definition.

A trapframe argument simplifies the invop handlers on !x86 and makes the
x86 FBT invop handler easier to understand. Moreover, it allows for invop
handlers that may want to modify the register set of the interrupted thread.
2016-04-17 23:08:47 +00:00
Mark Johnston
e1e33ff912 Initialize DTrace hrtimer frequency during SI_SUB_CPU on i386 and amd64.
This allows the hrtimer to be used earlier during boot. This is required
for boot-time DTrace: anonymous enablings are created during
SI_SUB_DTRACE_ANON, which runs before APs are started. In particular,
the DTrace deadman timer requires that the hrtimer be functional.

MFC after:	2 weeks
2016-04-10 01:23:39 +00:00
Mark Johnston
48cc2d5e22 Remove unused variables dtrace_in_probe and dtrace_in_probe_addr. 2016-03-17 18:55:54 +00:00
Konstantin Belousov
888e282ab4 When checking for the valid value of the frame pointer, verify that it
belongs to the kernel stack address range for the thread.  Right now,
code checks that new frame is not farther then KSTACK_PAGES pages from
the current frame, which allows the address to point past the top of
the stack.

Reviewed by:	andrew, emaste, markj
Differential revision:	https://reviews.freebsd.org/D3108
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-07-16 19:40:18 +00:00
Konstantin Belousov
6fdfd88220 Use single instance of the identical INKERNEL() and PMC_IN_KERNEL()
macros on amd64 and i386.  Move the definition to machine/param.h.
kgdb defines INKERNEL() too, the conflict is resolved by renaming kgdb
version to PINKERNEL().

On i386, correct the lowest kernel address.  After the shared page was
introduced, USRSTACK no longer points to the last user address + 1 [*]

Submitted by:	Oliver Pinter [*]
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-07-02 14:37:21 +00:00
Mark Johnston
11027ebcbb Remove unused references to calltrap.
MFC after:	3 days
2015-05-25 01:22:56 +00:00
Mark Johnston
5a9f9cb38e Remove some commented-out upstream code for handling traps from usermode
DTrace probes. This handling is already done in trap() on i386 and amd64.
2015-05-10 22:27:48 +00:00
Mark Johnston
8241ee3b2c Fix DTrace's panic() action.
It would previously call into some unfinished Solaris compatibility code and
return without actually calling panic(9). The compatibility code is
unneeded, however, so just remove it and have dtrace_panic() call vpanic(9)
directly.

Differential Revision:	https://reviews.freebsd.org/D2349
Reviewed by:	avg
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2015-04-24 03:19:30 +00:00
Mark Johnston
09a15aa38d Import a missing piece of commit b8fac8e162eda7e98d from illumos-gate.
This adds an upper bound, dtrace_ustackdepth_max, to the number of frames
traversed when computing the userland stack depth. Some programs - notably
firefox - are otherwise able to trigger an infinite loop in
dtrace_getustack_common(), causing a panic.

MFC after:	1 week
2015-03-30 03:55:51 +00:00
Steven Hartland
bc96366c86 Mechanically convert cddl sun #ifdef's to illumos
Since the upstream for cddl code is now illumos not sun, mechanically
convert all sun #ifdef's to illumos #ifdef's which have been used in all
newer code for some time.

Also do a manual pass to correct the use if #ifdef comments as per style(9)
as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos.

MFC after:	1 month
Sponsored by:	Multiplay
2015-01-17 14:44:59 +00:00
Mark Johnston
cafe874475 Restore the trap type argument to the DTrace trap hook, removed in r268600.
It's redundant at the moment since it can be obtained from the trapframe
on the architectures where DTrace is supported, but this won't be the case
with ARM.
2014-12-23 15:38:19 +00:00
Mark Johnston
291624fdf6 Invoke the DTrace trap handler before calling trap() on amd64. This matches
the upstream implementation and helps ensure that a trap induced by tracing
fbt::trap:entry is handled without recursively generating another trap.

This makes it possible to run most (but not all) of the DTrace tests under
common/safety/ without triggering a kernel panic.

Submitted by:	Anton Rang <anton.rang@isilon.com> (original version)
Phabric:	D95
2014-07-14 04:38:17 +00:00
Mark Johnston
efa1aff675 Fix some bugs when fetching probe arguments in i386. Firstly ensure that
the 4 byte-aligned dtrace_invop_callsite can be found and that it
immediately follows the call to dtrace_invop(). Secondly, fix some pointer
arithmetic to account for differences between struct i386_frame and illumos'
struct frame. Finally, ensure that dtrace_getarg() isn't inlined. It works
by following a fixed number of frame pointers to the probe site, so inlining
breaks it.

MFC after:	3 weeks
2014-06-23 02:00:14 +00:00
Mark Johnston
0339a1c2b4 Move some files that are identical on i386 and amd64 to an x86 subdirectory
rather than keeping duplicate copies.

Discussed with:	avg
MFC after:	1 week
2014-02-27 01:04:35 +00:00
Andriy Gapon
0f09691df8 dtrace disassembler: take the latest/last CDDL code from OpenSolaris
OpenSolaris version is:
13108:33bb8a0301ab
6762020 Disassembly support for Intel Advanced Vector Extensions (AVX)

This corresponds to Illumos-gate (github) version
ab47273fedff893c8ae22ec39ffc666d4fa6fc8b

MFC after:	3 weeks
2013-07-29 16:56:38 +00:00
George V. Neville-Neil
d638e8dcde Change UL to ULL since time is 32 bits.
Pointed out by: avg@
MFC after:	2 weeks
2012-07-17 14:36:40 +00:00
George V. Neville-Neil
57d025c338 Add support for walltimestamp in DTrace.
Submitted by:	Fabian Keil
MFC after:	2 weeks
2012-07-16 20:17:19 +00:00
Andriy Gapon
e0bc3c3d99 r237748 continuation: fix nopw (0f 1f) behavior with respect to modifiers
To do: proper merge with Illumos vendor area.

Reported by:	emaste
Tested by:	emaste
Obtained from:	Illumos commit 13442:4adbe6de60c8
MFC after:	5 days
2012-07-06 14:45:30 +00:00
Andriy Gapon
7e18e35b14 r237748 continuation: segment-override prefixes are not invalid in long mode
Update DTrace disassembler accordingly.  The code to treat the prefixes
as null prefixes was already in place.
Although in practice compilers seem to generate only cs-prefix for use
in long NOPs, the same treatment is applied to all of cs, ds, es, ss for
consistency.

Reported by:	emaste
Tested by:	emaste
Obtained from:	Illumos commit 13442:4adbe6de60c8 (+ local changes)
MFC after:	5 days
2012-07-06 14:41:02 +00:00
Andriy Gapon
f724c6a137 dtrace instruction decoder: add 0x0f 0x1f NOP opcode support
According to the AMD manual the whole range from 0x09 to 0x1f are NOPs.
Intel manual mentions only 0x1f.  Use only Intel one for now, it seems
to be the one actually generated by compilers.
Use gdb mnemonic for the operation: "nopw".

[1] AMD64 Architecture Programmer's Manual
    Volume 3: General-Purpose and System Instructions
[2] Software Optimization Guide for AMD Family 10h Processors
[3] Intel(R) 64 and IA-32 Architectures Software Developer’s Manual
    Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z

Tested by:	Fabian Keil <freebsd-listen@fabiankeil.de> (earlier version)
MFC after:	3 days
2012-06-29 07:35:37 +00:00
George V. Neville-Neil
4737d389b0 Integrate a fix for a very odd signal delivery problem found
by Bryan Cantril and others in the Solaris/Illumos version of DTrace.

Obtained from: https://www.illumos.org/issues/789
MFC after:	2 weeks
2012-06-04 16:15:40 +00:00
Zachary Loafman
db5c7d363d Fix DTrace TSC skew calculation:
The skew calculation here is exactly backwards. We were able to repro
it on a multi-package ESX server running a FreeBSD VM, where the TSCs
can be pretty evil.

MFC after: 1 week

Submitted by: Jeff Ford <jeffrey.ford2@isilon.com>
Reviewed by: avg, gnn
2012-06-04 16:04:01 +00:00
Ryan Stone
cddcb8b4dc On i386, fbt probes are implemented by writing an invalid opcode over
certain instructions in a function prologue or epilogue.  DTrace has a
hook into the invalid opcode fault handler that checks whether the fault
was due to an probe and if so, runs the DTrace magic.

Upon returning from an invalid opcode fault caused by a probe, DTrace must
emulate the instruction that was replaced with the invalid opcode and then
return control to the instruction following the invalid opcode.

There were a pair of related bugs in the emulation for the leave
instruction.  The leave instruction is used to pop off a stack frame prior
to returning from a function.  The emulation for this instruction must
move the trap frame for the invalid opcode fault down the stack to the
bottom of the stack frame that is being removed, and then execute an iret.

At two points in this process, the emulation code was storing values above
the current value of the stack pointer.  This opened up a window in which
if we were two take an interrupt, the trap frame for the interrupt would
overwrite the values stored on the stack, causing the system to panic
later.

The first bug was that at one point the emulation code saves the new value
for $esp above the current stack pointer value.  The fix is to save this
value instead inside of the original trap frame.  At this point we do
not need the original trap frame so this is safe.

The second bug is that when the emulate code loads $esp from the stack, it
points part-way through the new trap frame instead of at its beginning.
The emulation code adjusts the stack pointer to the correct value
immediately afterwards, but this still leaves a one instruction window in
which an interrupt would corrupt this trap frame.  Fix this by adjusting
the stack frame value before loading it into $esp.

This fixes panics in invop_leave on i386 when using fbt return probes.

Reviewed by:	rpaulo, attilio
MFC after:	1 week
2011-11-10 22:03:35 +00:00
Attilio Rao
ada5b73915 Remove pc_cpumask usage from dtrace MD support 2011-06-28 13:14:39 +00:00
Attilio Rao
b68eda3b54 MFC 2011-05-10 15:54:37 +00:00
Andriy Gapon
d9b8935fb9 dtrace: remove unused code
Which is also useless, IMO.

MFC after:	5 days
2011-05-10 15:05:27 +00:00
Attilio Rao
71a19bdc64 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primirally done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
Jung-uk Kim
3453537fa5 Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now.  More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly).  Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
2011-04-07 23:28:28 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Andriy Gapon
fe8c7b3d77 dtrace_xcall: no need for special handling of curcpu
smp_rendezvous_cpus alreadt does the right thing in a very similar
fashion, so the code was kind of duplicating that.

MFC after:	3 weeks
2010-12-07 09:19:47 +00:00
Andriy Gapon
7becfa95b9 dtrace_gethrtime_init: pin to master while examining other CPUs
Also use pc_cpumask to be future-friendly.

Reviewed by:	jhb
MFC after:	2 weeks
2010-12-07 09:03:17 +00:00
Rui Paulo
c6f5742f90 Kernel DTrace support for:
o uregs  (sson@)
o ustack (sson@)
o /dev/dtrace/helper device (needed for USDT probes)

The work done by me was:
Sponsored by:	The FreeBSD Foundation
2010-08-22 10:53:32 +00:00
Rui Paulo
58f668bba5 Add a function compatibility function dtrace_instr_size_isa() that on
FreeBSD does the same as dtrace_dis_isize().

Sponsored by:	The FreeBSD Foundation
2010-08-22 10:40:15 +00:00
John Baldwin
3aa6d94e0c Update several places that iterate over CPUs to use CPU_FOREACH(). 2010-06-11 18:46:34 +00:00
Andriy Gapon
b064b6d1cd dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds
Currently dtrace_gethrtime uses formula similar to the following for
converting TSC ticks to nanoseconds:
rdtsc() * 10^9 / tsc_freq
The dividend overflows 64-bit type and wraps-around every 2^64/10^9 =
18446744073 ticks which is just a few seconds on modern machines.

Now we instead use precalculated scaling factor of
10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately
for each 32-bit half.  This allows to avoid overflow of the dividend
described above.
The idea is taken from OpenSolaris.
This has an added feature of always scaling TSC with invariant value
regardless of TSC frequency changes. Thus the timestamps will not be
accurate if TSC actually changes, but they are always proportional to
TSC ticks and thus monotonic. This should be much better than current
formula which produces wildly different non-monotonic results on when
tsc_freq changes.

Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init()
to make it identical to the i386 twin.

PR:		kern/127441
Tested by:	Thomas Backman <serenity@exscape.org>
Reviewed by:	jhb
Discussed with:	current@, bde, gnn
Silence from:	jb
Approved by:	re (gnn)
MFC after:	1 week
2009-07-15 17:07:39 +00:00
Ganbold Tsagaankhuu
79dae0aa0b Remove unused variable.
Found with:     Coverity Prevent(tm)
CID: 3669,3671

Approved by: jb
2008-11-25 19:25:54 +00:00
John Birrell
91eaf3e183 Custom DTrace kernel module files plus FreeBSD-specific DTrace providers. 2008-05-23 05:59:42 +00:00