remove the former.
All other EGA/VGA methods were already shared, with VGA-only features
mostly not used and no decisions in inner loops to optimize fof VGA,
but this method was split up because it is the only important one and
using VGA methods if possible is about twice as fast. The speed is
mostly not from splitting to reduce branches but from doing half as
many bus accesses, so make this easier to maintain by not splitting.
There is now 1 extra branch in an inner loop where it costs less than
1% of the bus access overhead on Haswell even if the compiler schedules
it poorly.
vga planar method (for testing that was supposed to be local that the
former still works). The ega method works on vga but is about twice
as slow. The vga method doesn't work on ega.
Optimize the main vga planar method a little. For changing the
background color (which was otherwise optimized better than most
things), don't switch the write mode from 3 to 0 just to select
the pixel mask of 0xff obscurely by writing 0. Just write 0xff
directly.
a pointer to the main ega drawing method which is misoptimized be in
a different function than the main vga planar mode drawing method.
Vga initialization handles everything with no extra code except for
selecting the different function.
corresponding to the gaps between characters. This fixes distortion
of the cursor due to expanding it across the gaps.
Again for character width 9, when the cursor characters are not in the
graphics range (0xb0-0xdf), the gaps were always there (filled in the
background color for the previous char). They still look strange, but
don't cause distortion. When the cursor characters are in the graphics
range, the gaps are filled by repeating the previous line. This gives
distortion with cilia. Removing vertical lines reduces the distortion
to vertical cilia.
Move the default for the cursor characters out of the graphics range.
With character width 9, this gives gaps instead of distortion and
other problems. With character width 8, it just fixes a smaller set
of other problems. Some distortion and other problems can be recovered
using vidcontrol -M. Presumably the default was to fill the gaps
intentionally, but it is much better to leave gaps. The gaps can even
be considered as a feature for text processing -- they give sub-pointers
to character boundaries. The other problems are: (1) with character
width 9, characters near the cursor are moved into the graphics range
and thus distorted if any of their 8th bits is set; (2) conflicts with
national characters in the graphics range.
The default range for the graphics cursor characters is now 8-11. This
doesn't conflict with anything, since the glyphs for the characters in
this range are unreachable.
Use the 10x16 mouse cursor in text mode too (if the font size is >= 14).
When the character width is 9, removal of 1 or 2 vertical lines makes
10x16 cursor no wider than the 9x13 one usually was. We could even
handle cursors 1 pixel wider in 2 character cells and gaps without
more clipping than given by the gaps (the worst case is 1 pixel in the
left cell, 1 removed in the middle gap, 8 in the right cell and 1
removed in the right gap. The pixel in the right gap is removed so
it doesn't matter if it is in the font).
When the character width is 8, we now clip the 10-wide cursor by 1
pixel in the worst case. This clipping is usually invisible since it
is of the border and and the border usually merges with the background
so is invisible. There should be an option to use reverse video to
highlight the border and its tip instead of the interior (graphics
mode can do better using separate colors). This needs the 9x13 cursor
again.
Ideas from: ache (especially about the bad default character range)
Note that KVA mapping of the framebuffer already uses write-combining
mode, so the change, besides improving speed of user mode writes, also
satisfies requirement of the IA32 architecture of using consistent
caching modes for multiple mappings of the same page.
Reported and tested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
not consider a "disabled" cpu as a CPU we have to ignore, and we should use
them if they provide a "enable-method".
While I'm there, support "ok" as well as "okay", while ePAPR only accepts
"okay", linux accepts "ok" too so we can expect it to be used.
Reviewed by: andrew (partially)
9 wide.
I only need this to improve the mouse cursor, but it has always been
needed to select and/or adjust fonts.
This is complicated because there are no standard parameter tables
giving this bit of information directly, and the device register bit
giving the information can't be trusted even if it is read from the
hardware. Use a heuristic to guess if the device register can be
trusted. (The device register is normally read from the BIOS mode
table, but on my system where the device register is wrong, the mode
table doesn't match the hardware and is not used; the device registers
are used in this case.)
mode.
Use the general DRAWPIXEL() macro with its bigger case statement
(twice) instead of our big case statement (once). DRAWPIXEL() is more
complicated since it is not missing support for depth 24 or
complications for colors in depth 16 (we currently hard-code black and
white so the complications for colors are not needed). DRAWPIXEL()
also does the bpp calculation in the inner loop. Compilers optimize
DRAWPIXEL() well enough, and the main text drawing method always
depended on this. In direct mode, mouse cursor drawing is now similar
to normal text drawing except it draws in 2 hard-coded colors instead
of 1 variable color.
This also fixes a nested hard-coding of colors. DRAWPIXEL() uses the
palette in all cases, but the direct code didn't use the palette for
its hard-coded black. This only had an effect in depth 8, since
changing the palette is not supported in other depths.
direct mode renderer. I thought that reads were not much slower than
writes, so that the method only tripled the time for the whole function,
but I recently measured that video memory reads can be up to 53 times
slower than writes in tighter loops than here. Loop overheap here
reduces the multiplier to only 16-20 on Haswell.
Start cleaning up and fixing larger bugs in this function. Only replace
the 22-line removal loop by a 3-line one for now, since adjusting the
old loop would have required many palette calculations which are better
done in the DRAW_PIXEL() macro. This also fixes missing support for
depth 24, but only for removal.
Removal is currently sloppy at the right bottom corner. It sometimes
leaks border color into the text window. This is soon cleaned up by the
caller. The planar renderer has complications to clip at the corner.
of our tweaked modes based on it. In practice, this means limiting the
tweaked modes to at most 80x50 based on 80x25, so there are no 90-column,
80x30 or 80x60 modes.
This happens when the the initial mode is is not in the parameter
table. We always detected this case, but assumed that the (necessarily
nonstandard) parameters of the initial mode could be tweaked just as
blindly as the probably-standard parameters of initial modes in the
table.
On 1 laptop system with near-VGA where the initial mode is nonstandard,
this is because the hardware apparently doesn't support 9-bit mode,
but otherwise has standard timing. The initial mode has 8-bit mode
CRTC horizontal parameters similar to those in syscons' 90-column modes
and in EGA modes. Tweaking these values for the 90-column modes has
little effect except to print the extra 10 columns off the screen.
Tweaking from 80x25 to 80x30 requires changing from 400 scan lines to
480. This can probably be made to work, but syscons blindly applies
values based on standard timing. This gives blank output. Tweaking
from 80x25 to 80x50 doesn't change the CRTC timing and works.
numbers with Chacha20. Keep the API, though, as that is what the
other *BSD's have done.
Use the boot-time entropy stash (if present) to bootstrap the
in-kernel entropy source.
Reviewed by: delphij,rwatson
Approved by: so(delphij)
MFC after: 2 months
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D10048
is no phy-handle property, fall back to using MII_PHY_ANY.
This still doesn't support an mdio bus with multiple PHYs on it, or the
possibility that the PHY being used by this instance of ffec is on the
mdio bus of some other instance (which is now a possibility with imx6ul).
Adding that support will require changes in fdt_get_phyaddr(), which is
currently making some assumptions that don't work with modern fdt data.
The only thing this phy needs that the ukphy driver doesn't provide is
that the value in the proprietary Phy Control 2 Register must be saved
before doing a soft reset and restored afterwards. Most modern phys have
"sticky bits" for low-level config that survive a reset, but on this one
the values in all registers go back to defaults, wiping out any board-
specific config set up by the bootloader/bios/whatever.
modes if the font size is >= 14.
This is the X cursor XC_left_ptr (#68) (glyph #45 in an X cursor font).
Also found in vt. The old 9x13 cursor is the 10x16 one trimmed not very
well.
8x8 fonts need a smaller cursor instead of a larger one, except when
the pixel size is small. Text mode is still limited to width and height
1 more than the font (so the 9x13 is already 4 pixels too high for it).
access it via pointers (still to only 1 instance, now with a less
generic name).
Restructure the "and" and "or" masks as border and interior masks
(where the "and" mask was for the union of the border and the interior).
"and" and "or" were only a detail in a not very good implementation,
and after fixing that the union was only used to calculate the border
at runtime.
Use the metric data in more places to clip to active pixels earlier.
available starting with T6. The values in the timer holdoff registers
are multiplied by the scaling factor before use.
dev.<nexus>.<n>.holdoff_timers shows the final values of the
timers in microseconds.
MFC after: 1 week
Sponsored by: Chelsio Communications
mode.
Direct mode always supported widths up to 32, except for its hard-coded
16s matching the pixmap size. Text mode is still limited to 9 its 2x2
character cell method and missing adjustments for the gap between
characters, if any.
Cursor heights can be almost anything in graphics modes.
much as possible, by avoiding null ANDs and ORs to the frame buffer.
Mouse cursors are fairly sparse, especially for their frame. Pixels
are written in groups of 8 in planar mode and the per-group sparseness
is not as large, but it still averages about 40% with the current
9x13 mouse cursor. The average drawing time is reduced by about this
amount (from 22 usec constant to 12.5 usec average on Haswell).
This optimization is relatively larger with larger cursors. Width 10
requires 6 frame buffer accesses per line instead of 4 if not done
sparsely, but rarely more than 4 if done sparsely.
mode.
Don't manually unroll the 2 inner loops. On Haswell, doing so gave a
speedup of about 0.5% (about 4 cycles per iteration out of 1400), but
hard-coded a limit of width 9 and made better better optimizations
harder to see. gcc-4.2.1 -O does the unrolling anyway, unless tricked
with a volatile hack. gcc's unrolling is not very good and gives a
a speedup of about half as much (about 2 cycles per iteration). (All
timing on i386.)
Manual unrolling was only feasible because the inner loop only iterates
once or twice. Usually twice, but a dynamic check is needed to decide,
and was not moved from the second-innermost loop manually or by gcc.
This commit basically adds another dynamic check in the inner loop.
Cursor widths of 10-17 require 3 iterations in the inner loop and this
is not so easy to unroll -- even gcc stops at 2.
TXP_CMD_WAIT argument allocates a response buffer. If the allocation
fails, txp_ext_command() returns an error and it's handed in caller.
Found by: PVS-Studio