is default and caller does not require dedicated thread. timer needs
a dedicated thread to maintain overrun count correctly in notification
context. mqueue and aio can use thread pool to do notification
concurrently, the thread pool has lifecycle control, some threads will
exit if they have idled for a while.
Staticize two tables thare are not visible in <resolv.h>
and which are also local in Solaris' libresolv.
Remove two functions that are not referenced in libc nor
anywhere else I can find, not visible in <resolv.h> and
which are also local in Solaris libresolv.
but then sizes the containing data structure at run-time to make room
for per-cpu cache data. Modify libmemstat to separately allocate a
buffer to hold per-cpu cache data, sized based on the run-time mp_maxid
variable when using libkvm to access UMA data. This avoids reading
invalid cache data from beyond the end of the uma_zone data structure
on the stack, which can result in invalid statistics and/or reads from
invalid kernel addresses.
Foot target practice by: ps
MFC after: 3 days
return a KVM error rather than an out of memory error, so that the caller
reports the KVM error state. This replaces a misleading error message
with a more accurate although equally confusing one.
MFC after: 3 days
cpu mask before looking at the cache entries for the CPU. For systems
with sparse CPU id arrays, this skips otherwise uninitialized cache
structures.
MFC after: 3 days
Remove the block of code that tries to use delayed regions in LIFO order,
since from a policy perspective, it conflicts with LRU caching of newly
coalesced regions in arena_undelay(). There are numerous policy
alternatives, and it isn't readily obvious which (if any) is superior;
this change at least has the virtue of being consistent with policy.
Add %M{essage} extension which prints an errno value as the
corresponding string if possible or numerically otherwise.
It is not currently possible to do the syslog(3) like %m extension
because errno would need to get capatured on entry to the first
function in the printf family, so %M requires you to supply errno
as an argument.
Add %Q{uote} extension which will print a string in double quotes with
appropriate back-slash escapes (only) if necessary.
fit regions are available, use the delayed regions in LIFO order, in order
to increase locality of reference. We might expect this to cause delayed
regions to be removed from the delay ring buffer more often (since we're
now re-using more recently buffered regions), but numerous tests indicate
that the overall impact on memory usage tends to be good (reduced
fragmentation).
Re-work arena_frag_reg_alloc() so that when large free regions are
exhausted, it uses small regions in a way that favors contiguous allocation
of sequentially allocated small regions. Use arena_frag_reg_alloc() in
this capacity, rather than directly attempting over-fitting of small
requests when no large regions are available.
Remove the bin overfit statistic, since it is no longer relevant due to
the arena_frag_reg_alloc() changes.
Do not specify arena_frag_reg_alloc() as an inline function. It is too
large to benefit much from being inlined, and it is also called in two
places, only one of which is in the critical path (the other call bloated
arena_reg_alloc()).
Call arena_coalesce() for a region before caching it with
arena_mru_cache().
Add assertions that detect the attempted caching of adjacent free regions,
so that we notice this problem when it is first created, rather than in
arena_coalesce(), when it's too late to know how the problem arose.
Reported by: Hans Blancke
behaviour of returning EINVAL when ".." is passed as either argument
has been restored.
rmdir("..") now returns EINVAL instead of EPERM. Document the
previously undocumented behaviour of rmdir(".") returning EINVAL
as required by POSIX and SUSv3. Bump the man page change date.
undelete("..") now returns EINVAL instead of EPERM. Bump the man
page change date.
MFC after: 3 days
problems in cases where regions are faked up for the purposes of red-black
tree searches, since those faked region headers reside on the stack, rather
than in a malloc chunk.
allowing the error to be fatal.
Move a label in order to make sure to properly handle errors in malloc(0).
Reported by: Alastair D'Silva, Saneto Takanori
routine fails or the first read fails), invoke the client close
routine immediately so the client can clean up. Also, don't store the
client pointers in this case, so that the client close routine can't
accidentally get called more than once.
A minor style fix to archive_read_open_fd.c while I'm here.
PR: 86453
Thanks to: Andrew Turner for reporting this and suggesting a fix.
archiver for Fourth Edition through Sixth Edition Unix; it was
replaced by tar in Seventh Edition. (First Edition through
Third Edition used "tap.")
Unfortunately, tp was not so very standard; there were a
few different variants. The code here attempts to support
what I believe were the most common variants.
tp support is not yet enabled by archive_read_support_format_all(),
as I'm not yet entirely comfortable with the detection
heuristics. People interested in experimenting can
add archive_read_support_format_tp() just after any calls
to archive_read_support_format_all() in bsdtar to see how
well this works.
TODO: tp format is roughly similar in structure to dump/restore
archive formats used by many systems. It should be possible
to generalize this code to handle many dump/restore variants.
Format detection heuristics are going to be rough, though.
Thanks to: Warren Toomey, whose very basic tp extraction programs
and documentation made this possible.
there is never any need to recursively call the main allocation functions.
Remove recursive spinlock support, since it is no longer needed.
Allow chunks to be as small as the page size.
Correctly propagate OOM errors from arena_new().
from /etc/login.conf, or an unterminated string buffer could result.
Probably, login_times.c should reject excessively long time strings as
unparseable, rather than truncating, which might render an invalid
string valid.
Found with: Coverity Prevent (tm)
Reviewed by: csjp
MFC after: 3 days
list, which could cause problems for multi-threaded applications
using libmemstat to monitor UMA in more than one thread
simultaneously.
MFC after: 3 days
broken for non-threaded shared processes in that __tls_get_addr()
assumes the thread pointer is always initialized. This is not the
case. When arenas_map is referenced in choose_arena() and it is
defined as a thread-local variable, it will result in a SIGSEGV.
PR: ia64/91846 (describes the TLS/ia64 bug).
had been replied, the reply was always delivered to the originator
synchronously.
With introduction of netgraph item callbacks and a wait channel with
mutex in ng_socket(4), we have fixed the problem with ngctl(8) returning
earlier than the command has been proceeded by target node. But still
ngctl(8) can return prior to the reply has arrived to its node.
To fix this:
- Introduce a new flag for netgraph(4) messages - NGM_HASREPLY.
This flag is or'ed with message like NGM_READONLY.
- In netgraph userland library if we have sent a message with
NGM_HASREPLY flag, then select(2) until reply comes.
- Mark appropriate generic commands with NGM_HASREPLY flag,
gathering them into one enum {}. Bump generic cookie.
* Add posix_memalign().
* Move calloc() from calloc.c to malloc.c. Add a calloc() implementation in
rtld-elf in order to make the loader happy (even though calloc() isn't
used in rtld-elf).
* Add _malloc_prefork() and _malloc_postfork(), and use them instead of
directly manipulating __malloc_lock.
Approved by: phk, markm (mentor)
While we don't use the NC_BROADCAST value of nc_flag anywhere in the
RPC code, it is parseable by getnetconfigent(3) from /etc/netconfig.
o Clean up some "see below"'s that were cut and pasted from netconfig.h.
operation, the caller is blocked util target threads are really
suspended, also avoid suspending a thread when it is holding a
critical lock.
Fix a bug in _thr_ref_delete which tests a never set flag.
commit broke the 2**24 cases where |x| > DBL_MAX/2. There are exponent
range problems not just for denormals (underflow) but for large values
(overflow). Doubles have more than enough exponent range to avoid the
problems, but I forgot to convert enough terms to double, so there was
an x+x term which was sometimes evaluated in float precision.
Unfortunately, this is a pessimization with some combinations of systems
and compilers (it makes no difference on Athlon XP's, but on Athlon64's
it gives a 5% pessimization with gcc-3.4 but not with gcc-3.3).
Exlain the problem better in comments.
algorithm for the second step significantly to also get a perfectly
rounded result in round-to-nearest mode. The resulting optimization
is about 25% on Athlon64's and 30% on Athlon XP's (about 25 cycles
out of 100 on the former).
Using extra precision, we don't need to do anything special to avoid
large rounding errors in the third step (Newton's method), so we can
regroup terms to avoid a division, increase clarity, and increase
opportunities for parallelism. Rearrangement for parallelism loses
the increase in clarity. We end up with the same number of operations
but with a division reduced to a multiplication.
Using specifically double precision, there is enough extra precision
for the third step to give enough precision for perfect rounding to
float precision provided the previous steps are accurate to 16 bits.
(They were accurate to 12 bits, which was almost minimal for imperfect
rounding in the old version but would be more than enough for imperfect
rounding in this version (9 bits would be enough now).) I couldn't
find any significant time optimizations from optimizing the previous
steps, so I decided to optimize for accuracy instead. The second step
needed a division although a previous commit optimized it to use a
polynomial approximation for its main detail, and this division dominated
the time for the second step. Use the same Newton's method for the
second step as for the third step since this is insignificantly slower
than the division plus the polynomial (now that Newton's method only
needs 1 division), significantly more accurate, and simpler. Single
precision would be precise enough for the second step, but doesn't
have enough exponent range to handle denormals without the special
grouping of terms (as in previous versions) that requires another
division, so we use double precision for both the second and third
steps.
functions in the child after a fork() from a threaded process,
use __sys_setprocmask() rather than setprocmask() to keep our
signal handling sane. Without this fix, signals are essentially
ignored in said child and things such as protection violations
result in an endless busy loop.
Reviewed by: deischen
similar the the Solaris implementation. Repackage the krb5 GSS mechanism
as a plugin library for the new implementation. This also includes a
comprehensive set of manpages for the GSS-API functions with text mostly
taken from the RFC.
Reviewed by: Love Hörnquist Åstrand <lha@it.su.se>, ru (build system), des (openssh parts)
between a 32-bit integer and a radix-64 ASCII string. The l64a_r() function
is a NetBSD addition.
PR: 51209 (based on submission, but very different)
Reviewed by: bde, ru
distributed non-large args, this saves about 14 of 134 cycles for
Athlon64s and about 5 of 199 cycles for AthlonXPs.
Moved the check for x == 0 inside the check for subnormals. With
gcc-3.4 on uniformly distributed non-large args, this saves another
5 cycles on Athlon64s and loses 1 cycle on AthlonXPs.
Use INSERT_WORDS() and not SET_HIGH_WORD() when converting the first
approximation from bits to double. With gcc-3.4 on uniformly distributed
non-large args, this saves another 4 cycles on both Athlon64s and and
AthlonXPs.
Accessing doubles as 2 words may be an optimization on old CPUs, but on
current CPUs it tends to cause extra operations and pipeline stalls,
especially for writes, even when only 1 of the words needs to be accessed.
Removed an unused variable.
function approximation for the second step. The polynomial has degree
2 for cbrtf() and 4 for cbrt(). These degrees are minimal for the final
accuracy to be essentially the same as before (slightly smaller).
Adjust the rounding between steps 2 and 3 to match. Unfortunately,
for cbrt(), this breaks the claimed accuracy slightly although incorrect
rounding doesn't. Claim less accuracy since its not worth pessimizing
the polynomial or relying on exhaustive testing to get insignificantly
more accuracy.
This saves about 30 cycles on Athlons (mainly by avoiding 2 divisions)
so it gives an overall optimization in the 10-25% range (a larger
percentage for float precision, especially in 32-bit mode, since other
overheads are more dominant for double precision, surprisingly more
in 32-bit mode).
- in preparing for the third approximation, actually make t larger in
magnitude than cbrt(x). After chopping, t must be incremented by 2
ulps to make it larger, not 1 ulp since chopping can reduce it by
almost 1 ulp and it might already be up to half a different-sized-ulp
smaller than cbrt(x). I have not found any cases where this is
essential, but the think-time error bound depends on it. The relative
smallness of the different-sized-ulp limited the bug. If there are
cases where this is essential, then the final error bound would be
5/6+epsilon instead of of 4/6+epsilon ulps (still < 1).
- in preparing for the third approximation, round more carefully (but
still sloppily to avoid branches) so that the claimed error bound of
0.667 ulps is satisfied in all cases tested for cbrt() and remains
satisfied in all cases for cbrtf(). There isn't enough spare precision
for very sloppy rounding to work:
- in cbrt(), even with the inadequate increment, the actual error was
0.6685 in some cases, and correcting the increment increased this
a little. The fix uses sloppy rounding to 25 bits instead of very
sloppy rounding to 21 bits, and starts using uint64_t instead of 2
words for bit manipulation so that rounding more bits is not much
costly.
- in cbrtf(), the 0.667 bound was already satisfied even with the
inadequate increment, but change the code to almost match cbrt()
anyway. There is not enough spare precision in the Newton
approximation to double the inadequate increment without exceeding
the 0.667 bound, and no spare precision to avoid this problem as
in cbrt(). The fix is to round using an increment of 2 smaller-ulps
before chopping so that an increment of 1 ulp is enough. In cbrt(),
we essentially do the same, but move the chop point so that the
increment of 1 is not needed.
Fixed comments to match code:
- in cbrt(), the second approximation is good to 25 bits, not quite 26 bits.
- in cbrt(), don't claim that the second approximation may be implemented
in single precision. Single precision cannot handle the full exponent
range without minor but pessimal changes to renormalize, and although
single precision is enough, 25 bit precision is now claimed and used.
Added comments about some of the magic for the error bound 4/6+epsilon.
I still don't understand why it is 4/6+ and not 6/6+ ulps.
Indent comments at the right of code more consistently.
to be compatible with symbol versioning support as implemented by
GNU libc and documented by http://people.redhat.com/~drepper/symbol-versioning
and LSB 3.0.
Implement dlvsym() function to allow lookups for a specific version of
a given symbol.
means:
o Remove Elf64_Quarter,
o Redefine Elf64_Half to be 16-bit,
o Redefine Elf64_Word to be 32-bit,
o Add Elf64_Xword and Elf64_Sxword for 64-bit entities,
o Use Elf_Size in MI code to abstract the difference between
Elf32_Word and Elf64_Word.
o Add Elf_Ssize as the signed counterpart of Elf_Size.
MFC after: 2 weeks
on probationary terms: it may go away again if it transpires it is
a bad idea.
This extensible printf version will only be used if either
environment variable USE_XPRINTF is defined
or
one of the extension functions are called.
or
the global variable __use_xprintf is set greater than zero.
In all other cases our traditional printf implementation will
be used.
The extensible version is slower than the default printf, mostly
because less opportunity for combining I/O operation exists when
faced with extensions. The default printf on the other hand
is a bad case of spaghetti code.
The extension API has a GLIBC compatible part and a FreeBSD version
of same. The FreeBSD version exists because the GLIBC version may
run afoul of our FILE * locking in multithreaded programs and it
even further eliminate the opportunities for combining I/O operations.
Include three demo extensions which can be enabled if desired: time
(%T), hexdump (%H) and strvis (%V).
%T can format time_t (%T), struct timeval (%lT) and struct timespec (%llT)
in one of two human readable duration formats:
"%.3llT" -> "20349.245"
"%#.3llT" -> "5h39m9.245"
%H will hexdump a sequence of bytes and takes a pointer and a length
argument. The width specifies number of bytes per line.
"%4H" -> "65 72 20 65"
"%+4H" -> "0000 65 72 20 65"
"%#4H" -> "65 72 20 65 |er e|"
"%+#4H" -> "0000 65 72 20 65 |er e|"
%V will dump a string in strvis format.
"%V" -> "Hello\tWor\377ld" (C-style)
"%0V" -> "Hello\011Wor\377ld" (octal)
"%+V" -> "Hello%09Wor%FFld" (http-style)
Tests, comments, bugreports etc are most welcome.
allocate a memory block. sscanf calls __svfscanf which in turn calls
fread, fread triggers mutex initialization but the mutex is not
destroyed in sscanf, this leads to memory leak. To avoid the memory
leak and performance issue, we create a none MT-safe version of fread:
__fread, and instead let __svfscanf call __fread.
PR: threads/90392
Patch submitted by: dhartmei
MFC after: 7 days
the second step of approximating cbrt(x). It turns out to be neither
very magic not nor very good. It is just the (2,2) Pade approximation
to 1/cbrt(r) at r = 1, arranged in a strange way to use fewer operations
at a cost of replacing 4 multiplications by 1 division, which is an
especially bad tradeoff on machines where some of the multiplications
can be done in parallel. A Remez rational approximation would give
at least 2 more bits of accuracy, but the (2,2) Pade approximation
already gives 6 more bits than needed. (Changed the comment which
essentially says that it gives 3 more bits.)
Lower order Pade approximations are not quite accurate enough for
double precision but are plenty for float precision. A lower order
Remez rational approximation might be enough for double precision too.
However, rational approximations inherently require an extra division,
and polynomial approximations work well for 1/cbrt(r) at r = 1, so I
plan to switch to using the latter. There are some technical
complications that tend to cost a division in another way.
This gives an optimization of between 9 and 22% on Athlons (largest
for cbrt() on amd64 -- from 205 to 159 cycles).
We extracted the sign bit and worked with |x|, and restored the sign
bit as the last step. We avoided branches to a fault by using accesses
to FP values as bits to clear and restore the sign bit. Avoiding
branches is usually good, but the bit access macros are not so good
(especially for setting FP values), and here they always caused pipeline
stalls on Athlons. Even using branches would be faster except on args
that give perfect branch misprediction, since only mispredicted branches
cause stalls, but it possible to avoid touching the sign bit in FP
values at all (except to preserve it in conversions from bits to FP
not related to the sign bit). Do this. The results are identical
except in 2 of the 3 unsupported rounding modes, since all the
approximations use odd rational functions so they work right on strictly
negative values, and the special case of -0 doesn't use an approximation.
For some denormalized long double values, a bug in __hldtoa() (called
from *printf()'s %A format) results in a base 16 digit being rounded
up from 0xf to 0x10.
When this digit is subsequently converted to string format, an index
of 10 reaches past the end of the uppper-case hex/char array, picking
up whatever the code segment happen to contain at that address.
This mostly seem to be some character from the upper half of the
byte range.
When using the %a format instead of %A, the first character past
the end of the lowercase hex/char table happens to be index 0 in
the uppercase hex/char table hextable and therefore the string
representation features a '0', which is supposedly correct.
This leads me to belive that the proper fix _may_ be as simple as
masking all but the lower four bits off after incrementing a hex-digit
in libc/gdtoa/_hdtoa.c:roundup(). I worry however that the upper
bit in 0x10 indicates a carry not carried.
Until das@ or bde@ finds time to visit this issue, extend the
hexdigit arrays with a 17th index containing '?' so that we get a
invalid but consistent and printable output in both %a and %A formats
whenever this bug strikes.
This unmasks the bug in the %a format therefore solving the real
issue may both become easier and more urgent.
Possibly related to: PR 85080
With help by: bde@
<cbrt(x) in bits> ~= <x in bits>/3 + BIAS.
Keep the large comments only in the double version as usual.
Fixed some style bugs (mainly grammar and spelling errors in comments).
It was because I forgot to translate the part of the double precision
algorithm that chops t so that t*t is exact. Now the maximum error
is the same as for double precision (almost exactly 2.0/3 ulps).
The maximum error was 3.56 ulps.
The bug was another translation error. The double precision version
has a comment saying "new cbrt to 23 bits, may be implemented in
precision". This means exactly what it says -- that the 23 bit second
approximation for the double precision cbrt() may be implemented in
single (i.e., float) precision. It doesn't mean what the translation
assumed -- that this approximation, when implemented in float precision,
is good enough for the the final approximation in float precision.
First, float precision needs a 24 bit approximation. The "23 bit"
approximation is actually good to 24 bits on float precision args, but
only if it is evaluated in double precision. Second, the algorithm
requires a cleanup step to ensure its error bound.
In float precision, any reasonable algorithm works for the cleanup
step. Use the same algorithm as for double precision, although this
is much more than enough and is a significant pessimization, and don't
optimize or simplify anything using double precision to implement the
float case, so that the whole double precision algorithm can be verified
in float precision. A maximum error of 0.667 ulps is claimed for cbrt()
and the max for cbrtf() using the same algorithm shouldn't be different,
but the actual max for cbrtf() on amd64 is now 0.9834 ulps. (On i386
-O1 the max is 0.5006 (down from < 0.7) due to extra precision.)
The threshold for not being tiny was too small. Use the usual 2**-12
threshold. As for sinhf, use a different method (now the same as for
sinhf) to set the inexact flag for tiny nonzero x so that the larger
threshold works, although this method is imperfect. As for sinhf,
this change is not just an optimization, since the general code that
we fell into has accuracy problems even for tiny x. On amd64, avoiding
it fixes tanhf on 2*13495596 args with errors of between 1 and 1.3
ulps and thus reduces the total number of args with errors of >= 1 ulp
from 37533748 to 5271278; the maximum error is unchanged at 2.2 ulps.
The magic number 22 is log(DBL_MAX)/2 plus slop. This is bogus for
float precision. Use 9 (log(FLT_MAX)/2 plus less slop than for
double precision). Unlike for coshf and tanhf, this is just an
optimization, and MAX isn't misspelled EPSILON in the commit log.
I started testing with nonstandard rounding modes, and verified that
the chosen thresholds work for all modes modulo problems not related
to thresholds. The best thresholds are not very dependent on the mode,
at least for tanhf.
shares its low half with pio2_hi. pio2_hi is rounded down although
rounding to nearest would be a tiny bit better, so pio4_hi must be
rounded down too. It was rounded to nearest, which happens to be
different in float precision but the same in double precision.
This fixes about 13.5 million errors of more than 1 ulp in asinf().
The largest error was 2.81 ulps on amd64 and 2.57 ulps on i386 -O1.
Now the largest error is 0.93 ulps on amd65 and 0.67 ulps on i386 -O1.
sqrt(2)/2-1. For log1p(), fixed the approximation to sqrt(2)/2-1.
The end result is to fix an error of 1.293 ulps in
log1pf(0.41421395540 (hex 0x3ed413da))
and an error of 1.783 ulps in
log1p(-0.292893409729003961761) (hex 0x12bec4 00000001)).
The former was the only error of > 1 ulp for log1pf() and the latter
is the only such error that I know of for log1p().
The approximations don't need to be very accurate, but the last 2 need
to be related to the first and be rounded up a little (even more than
1 ulp for sqrt(2)/2-1) for the following implementation-detail reason:
when the arg (x) is not between (the approximations to) sqrt(2)/2-1
and sqrt(2)-1, we commit to using a correction term, but we only
actually use it if 1+x is between sqrt(2)/2 and sqrt(2) according to
the first approximation. Thus we must ensure that
!(sqrt(2)/2-1 < x < sqrt(2)-1) implies !(sqrt(2)/2 < x+1 < sqrt(2)),
where all the sqrt(2)'s are really slightly different approximations
to sqrt(2) and some of the "<"'s are really "<="'s. This was not done.
In log1pf(), the last 2 approximations were rounded up by about 6 ulps
more than needed relative to a good approximation to sqrt(2), but the
actual approximation to sqrt(2) was off by 3 ulps. The approximation
to sqrt(2)-1 ended up being 4 ulps too small, so the algoritm was
broken in 4 cases. The result happened to be broken in 1 case. This
is fixed by using a natural approximation to sqrt(2) and derived
approximations for the others.
In logf(), all the approximations made sense, but the approximation
to sqrt(2)/2-1 was 2 ulps too small (a tiny amount, since we compare
with a granularity of 2**32 ulps), so the algorithm was broken in 2
cases. The result was broken in 1 case. This is fixed by rounding
up the approximation to sqrt(2)/2-1 by 2**32 ulps, so 2**32 cases are
now handled a little differently (still correctly according to my
assertion that the approximations don't need to be very accurate, but
this has not been checked).
through the history in sh.
| Refresh bug reported by Julien Torres:
|
| going from:
| activate -verbose
| to:
| reset -activation
| results in:
| reset -activationverbose"
| instead of:
| reset -activation
|
| This is because we choose to insert "reset -" before the current line,
| and the delete "e -" and insert "ion" in the appropriate place. The
| cleareol code did not handle this case properly; we now cleareol to
| the maximum number of characters of the first difference, the second
| difference and the difference in line length.
on assignment.
Extra precision on i386's broke hi+lo decomposition in the usual way.
It caused all except 1 of the 62343 errors of more than 1 ulp for
log1pf() on i386's with gcc -O [-fno-float-store].
according to the highest nonzero bit in a denormal was missing.
fdlibm ilogbf() and ilogb() have always had the adjustment, but only
use a small part of their method for handling denormals; use the
normalization method in log[f]() for the main part.
It was lost in rev.1.9. The log message for rev.1.9 says that the
special case of +-0 is handled twice, but it was only handled once,
so it became unhandled, and this happened to break half of the cases
that return +-0:
- round-towards-minus-infinity: 0 < x < 1: result was -0 not 0
- round-to-nearest: -0.5 <= x < 0: result was 0 not -0
- round-towards-plus-infinity: -1 < x < 0: result was 0 not -0
- round-towards-zero: -1 < x < 0: result was 0 not -0
TWO52[sx] to trick gcc into correctly converting TWO52[sx]+x to double
on assignment to "double w", force a correct assignment by assigning
to *(double *)&w. This is cleaner and avoids the double rounding
problem on machines that evaluate double expressions in double
precision. It is not necessary to convert w-TWO52[sx] to double
precision on return as implied in the comment in rev.1.3, since
the difference is exact.
(1) In round-to-nearest mode, on all machines, fdlibm rint() never
worked for |x| = n+0.75 where n is an even integer between 262144
and 524286 inclusive (2*131072 cases). To avoid double rounding
on some machines, we begin by adjusting x to a value with the 0.25
bit not set, essentially by moving the 0.25 bit to a lower bit
where it works well enough as a guard, but we botched the adjustment
when log2(|x|) == 18 (2*2**52 cases) and ended up just clearing
the 0.25 bit then. Most subcases still worked accidentally since
another lower bit serves as a guard. The case of odd n worked
accidentally because the rounding goes the right way then. However,
for even n, after mangling n+0.75 to 0.5, rounding gives n but the
correct result is n+1.
(2) In round-towards-minus-infinity mode, on all machines, fdlibm rint()
never for x = n+0.25 where n is any integer between -524287 and
-262144 inclusive (262144 cases). In these cases, after mangling
n+0.25 to n, rounding gives n but the correct result is n-1.
(3) In round-towards-plus-infinity mode, on all machines, fdlibm rint()
never for x = n+0.25 where n is any integer between 262144 and
524287 inclusive (262144 cases). In these cases, after mangling
n+0.25 to n, rounding gives n but the correct result is n+1.
A variant of this bug was fixed for the float case in rev.1.9 of s_rintf.c,
but the analysis there is incomplete (it only mentions (1)) and the fix
is buggy.
Example of the problem with double rounding: rint(1.375) on a machine
which evaluates double expressions with just 1 bit of extra precision
and is in round-to-nearest mode. We evaluate the result using
(double)(2**52 + 1.375) - 2**52. Evaluating 2**52 + 1.375 in (53+1) bit
prcision gives 2**52 + 1.5 (first rounding). (Second) rounding of this
to double gives 2**52 + 2.0. Subtracting 2**52 from this gives 2.0 but
we want 1.0. Evaluating 2**52 + 1.375 in double precision would have
given the desired intermediate result of 2**52 + 1.0.
The double rounding problem is relatively rare, so the botched adjustment
can be fixed for most machines by removing the entire adjustment. This
would be a wrong fix (using it is 1 of the bugs in rev.1.9 of s_rintf.c)
since fdlibm is supposed to be generic, but it works in the following cases:
- on all machines that evaluate double expressions in double precision,
provided either long double has the same precision as double (alpha,
and i386's with precision forced to double) or my earlier fix to use
a long double 2**52 is modified to avoid using long double precision.
- on all machines that evaluate double expressions in many more than 11
bits of extra precision. The 1 bit of extra precision in the example
is the worst case. With N bits of extra precision, it sufices to
adjust the bit N bits below the 0.5 bit. For N >= about 52 there is
no such bit so the adjustment is both impossible and unnecessary. The
fix in rev.1.9 of s_rintf.c apparently depends on corresponding magic
in float precision: on all supported machines N is either 0 or >= 24,
so double rounding doesn't occur in practice.
- on all machines that don't use fdlibm rint*() (i386's).
So under FreeBSD, the double rounding problem only affects amd64 now, but
should only affect i386 in future (when double expressions are evaluated
in long double precision).
Switch strncpy to strlcpy suggested by gad and issue found by pjd.
Add to uname(3) man page describing:
UNAME_s
UNAME_r
UNAME_v
UNAME_m
Add to getosreldate(3) man page describing:
OSVERSION
Submitted by: ru, pjd/gad
Reviewed by: ru (man pages)
- in round-towards-minus-infinity mode, on all machines, roundf(x) never
worked for 0 < |x| < 0.5 (2*0x3effffff cases in all, or almost half of
float space). It was -0 for 0 < x < 0.5 and 0 for -0.5 < x < 0, but
should be 0 and -0, respectively. This is because t = ceilf(|x|) = 1
for these args, and when we adjust t from 1 to 0 by subtracting 1, we
get -0 in this rounding mode, but we want and expected to get 0.
- in round-towards-minus-infinity, round towards zero and round-to-nearest
modes, on machines that evaluate float expressions in float precision
(most machines except i386's), roundf(x) never worked for |x| =
<float value immediately below 0.5> (2 cases in all). It was +-1 but
should have been +-0. This is because t = ceilf(|x|) = 1 for these
args, and when we try to classify |x| by subtracting it from 1 we
get an unexpected rounding error -- the result is 0.5 after rounding
to float in all 3 rounding modes, so we we have forgotten the
difference between |x| and 0.5 and end up returning the same value
as for +-0.5.
The fix is to use floorf() instead of ceilf() and to add 1 instead of
-1 in the adjustment. With floorf() all the expressions used are
always evaluated exactly so there are no rounding problems, and with
adjustments of +1 we don't go near -0 when adjusting.
Attempted to fix round() and roundl() by cloning the fix for roundf().
This has only been tested for round(), only on args representable as
floats. Double expressions are evaluated in double precision even on
i386's, so round(0.5-epsilon) was broken even on i386's. roundl()
must be completely broken on i386's since long double precision is not
really supported. There seem to be no other dependencies on the
precision.
FreeBSD machine. To do this add the man 1 uname changes to __xuname.c
so we can override the settings it reports. Add OSVERSION override
to getosreldate. Finally which Makefile.inc1 to use uname -m instead
of sysctl -n hw.machine_arch to get the arch. type.
With these change you can put a complete FreeBSD OS image into a
chroot set:
UNAME_s=FreeBSD
UNAME_r=4.7-RELEASE
UNAME_v="FreeBSD $UNAME_r #1: Fri Jul 22 20:32:52 PDT 2005 fake@fake:/usr/obj/usr/src/sys/FAKE"
UNAME_m=i386
UNAME_p=i386
OSVERSION=470000
on an amd64 or i386 and it just work including building ports and using
pkg_add -r etc. The caveat for this example is that these patches
have to be applied to FreeBSD 4.7 and the uname(1) changes need to
be merged. This also addresses issue with libtool.
This is usefull for when a build machine has been trashed for an
old release and we want to do a build on a new machine that FreeBSD
4.7 won't run on ...
other systems it prevents a tty from becoming a controlling tty on the
open. O_SYNC is the POSIX name for O_FSYNC.
The Markup Police may need to tweak my references to standards.
k_tanf.c but with different details.
The polynomial is odd with degree 13 for tanf() and odd with degree
9 for sinf(), so the details are not very different for sinf() -- the
term with the x**11 and x**13 coefficients goes awaym and (mysteriously)
it helps to do the evaluation of w = z*z early although moving it later
was a key optimization for tanf(). The details are different but simpler
for cosf() because the polynomial is even and of lower degree.
On Athlons, for uniformly distributed args in [-2pi, 2pi], this gives
an optimization of about 4 cycles (10%) in most cases (13% for sinf()
on AXP, but 0% for cosf() with gcc-3.3 -O1 on AXP). The best case
(sinf() with gcc-3.4 -O1 -fcaller-saves on A64) now takes 33-39 cycles
(was 37-45 cycles). Hardware sinf takes 74-129 cycles. Despite
being fine tuned for Athlons, the optimization is even larger on
some other arches (about 15% on ia64 (pluto2) and 20% on alpha (beast)
with gcc -O2 -fomit-frame-pointer).
cosf(x) is supposed to return something like x when x is a NaN, and
we actually fairly consistently return x-x which is normally very like
x (on i386 and and it is x if x is a quiet NaN and x with the quiet bit
set if x is a signaling NaN. Rev.1.10 broke this by normalising x to
fabsf(x). It's not clear if fabsf(x) is should preserve x if x is a NaN,
but it actually clears the sign bit, and other parts of the code depended
on this.
The bugs can be fixed by saving x before normalizing it, and using the
saved x only for NaNs, and using uint32_t instead of int32_t for ix
so that negative NaNs are not misclassified even if fabsf() doesn't
clear their sign bit, but gcc pessimizes the saving very well, especially
on Athlon XPs (it generates extra loads and stores, and mixes use of
the SSE and i387, and this somehow messes up pipelines). Normalizing
x is not a very good optimization anyway, so stop doing it. (It adds
latency to the FPU pipelines, but in previous versions it helped except
for |x| <= 3pi/4 by simplifying the integer pipelines.) Use the same
organization as in s_sinf.c and s_tanf.c with some branches reordered.
These changes combined recover most of the performance of the unfixed
version on A64 but still lose 10% on AXP with gcc-3.4 -O1 but not with
gcc-3.3 -O1.
that was used doesn't work normally here, since we want to be able to
multiply `hi' by the exponent of x _exactly_, and the exponent of x has
more than 7 significant bits for most denormal x's, so the multiplication
was not always exact despite a cloned comment claiming that it was. (The
comment is correct in the double precision case -- with the normal 33+53
bit decomposition the exponent can have 20 significant bits and the extra
bit for denormals is only the 11th.)
Fixing this had little or no effect for denormals (I think because
more precision is inherently lost for denormals than is lost by roundoff
errors in the multiplication).
The fix is to reduce the precision of the decomposition to 16+24 bits.
Due to 2 bugs in the old deomposition and numerical accidents, reducing
the precision actually increased the precision of hi+lo. The old hi+lo
had about 39 bits instead of at least 41 like it should have had.
There were off-by-1-bit errors in each of hi and lo, apparently due
to mistranslation from the double precision hi and lo. The correct
16 bit hi happens to give about 19 bits of precision, so the correct
hi+lo gives about 43 bits instead of at least 40. The end result is
that expf() is now perfectly rounded (to nearest) except in 52561 cases
instead of except in 67027 cases, and the maximum error is 0.5013 ulps
instead of 0.5023 ulps.
rather than forcing the state to LOOK. If we are in the middle of parsing
a line when we have to do a FILL we would have lost any token we were in
the middle of parsing and would have treated the next character as being
at the start of a new line instead.
PR: kern/89181
Submitted by: Antony Mawer gnats at mawer dot org
MFC after: 1 week
Instead of echoing the code in a comment, try to describe why we split
up the evaluation in a special way.
The new optimization is mostly to move the evaluation of w = z*z later
so that everything else (except z = x*x) doesn't have to wait for w.
On Athlons, FP multiplication has a latency of 4 cycles so this
optimization saves 4 cycles per call provided no new dependencies are
introduced. Tweaking the other terms in to reduce dependencies saves
a couple more cycles in some cases (more on AXP than on A64; up to 8
cycles out of 56 altogether in some cases). The previous version had
a similar optimization for s = z*x. Special optimizations like these
probably have a larger effect than the simple 2-way vectorization
permitted (but not activated by gcc) in the old version, since 2-way
vectorization is not enough and the polynomial's degree is so small
in the float case that non-vectorizable dependencies dominate.
On an AXP, tanf() on uniformly distributed args in [-2pi, 2pi] now
takes 34-55 cycles (was 39-59 cycles).
of between 1.0 and 1.8509 ulps for lgammaf(x) with x between -2**-21 and
-2**-70.
As usual, the cutoff for tiny args was not correctly translated to
float precision. It was 2**-70 but 2**-21 works. Not as usual, having
a too-small threshold was worse than a pessimization. It was just a
pessimization for (positive) args between 2**-70 and 2**-21, but for
the first ~50 million (negative) args below -2**-70, the general code
overflowed and gave a result of infinity instead of correct (finite)
results near 70*log(2). For the remaining ~361 million negative args
above -2**21, the general code gave almost-acceptable errors (lgamma[f]()
is not very accurate in general) but the pessimization was larger than
for misclassified tiny positive args.
Now the max error for lgammaf(x) with |x| < 2**-21 is 0.7885 ulps, and
speed and accuracy are almost the same for positive and negative args
in this range. The maximum error overall is still infinity ulps.
A cutoff of 2**-70 is probably wastefully small for the double precision
case. Smaller cutoffs can be used to reduce the max error to nearly
0.5 ulps for tiny args, but this is useless since the general algrorithm
for nearly-tiny args is not nearly that accurate -- it has a max error of
about 1 ulp.
gives a tiny but hopefully always free optimization in the 2 quadrants
to which it applies. On Athlons, it reduces maximum latency by 4 cycles
in these quadrants but has usually has a smaller effect on total time
(typically ~2 cycles (~5%), but sometimes 8 cycles when the compiler
generates poor code).
of the function name.
Added my (non-)copyright.
In k_tanf.c, added the first set of redundant parentheses to control
grouping which was claimed to be added in the previous commit.
returning float). The functions are renamed from __kernel_{cos,sin}f()
to __kernel_{cos,sin}df() so that misuses of them will cause link errors
and not crashes.
This version is an almost-routine translation with no special optimizations
for accuracy or efficiency. The not-quite-routine part is that in
__kernel_cosf(), regenerating the minimax polynomial with double
precision coefficients gives a coefficient for the x**2 term that is
not quite -0.5, so the literal 0.5 in the code and the related `hz'
variable need to be modified; also, the special code for reducing the
error in 1.0-x**2*0.5 is no longer needed, so it is convenient to
adjust all the logic for the x**2 term a little. Note that without
extra precision, it would be very bad to use a coefficient of other
than -0.5 for the x**2 term -- the old version depends on multiplication
by -0.5 being infinitely precise so as not to need even more special
code for reducing the error in 1-x**2*0.5.
This gives an unimportant increase in accuracy, from ~0.8 to ~0.501
ulps. Almost all of the error is from the final rounding step, since
the choice of the minimax polynomials so that their contribution to the
error is a bit less than 0.5 ulps just happens to give contributions that
are significantly less (~.001 ulps).
An Athlons, for uniformly distributed args in [-2pi, 2pi], this gives
overall speed increases in the 10-20% range, despite giving a speed
decrease of typically 19% (from 31 cycles up to 37) for sinf() on args
in [-pi/4, pi/4].
libarchive doesn't make malloc(0) requests, so the autoconf
checks aren't needed and the autoconf workarounds for
broken malloc(0) just create problems.
Thanks to: Dan Nelson, who reports that this fixes libarchive on AIX 5.2
- Remove dead code that I forgot to remove in the previous commit.
- Calculate the sum of the lower terms of the polynomial (divided by
x**5) in a single expression (sum of odd terms) + (sum of even terms)
with parentheses to control grouping. This is clearer and happens to
give better instruction scheduling for a tiny optimization (an
average of about ~0.5 cycles/call on Athlons).
- Calculate the final sum in a single expression with parentheses to
control grouping too. Change the grouping from
first_term + (second_term + sum_of_lower_terms) to
(first_term + second_term) + sum_of_lower_terms. Normally the first
grouping must be used for accuracy, but extra precision makes any
grouping give a correct result so we can group for efficiency. This
is a larger optimization (average 3-4 cycles/call or 5%).
- Use parentheses to indicate that the C order of left to right evaluation
is what is wanted (for efficiency) in a multiplication too.
The old fdlibm code has several optimizations related to these. 2
involve doing an extra operation that can be done almost in parallel
on some superscalar machines but are pessimizations on sequential
machines. Others involve statement ordering or expression grouping.
All of these except the ordering for the combining the sums of the odd
and even terms seem to be ideal for Athlons, but parallelism is still
limited so all of these optimizations combined together with the ones
in this commit save only ~6-8 cycles (~10%).
On an AXP, tanf() on uniformly distributed args in [-2pi, 2pi] now
takes 39-59 cycles. I don't know of any more optimizations for tanf()
short of writing it all in asm with very MD instruction scheduling.
Hardware fsin takes 122-138 cycles. Most of the optimizations for
tanf() don't work very well for tan[l](). fdlibm tan() now takes
145-365 cycles.
A single polynomial approximation for tan(x) works in infinite precision
up to |x| < pi/2, but in finite precision, to restrict the accumulated
roundoff error to < 1 ulp, |x| must be restricted to less than about
sqrt(0.5/((1.5+1.5)/3)) ~= 0.707. We restricted it a bit more to
give a safety margin including some slop for optimizations. Now that
we use double precision for the calculations, the accumulated roundoff
error is in double-precision ulps so it can easily be made almost 2**29
times smaller than a single-precision ulp. Near x = pi/4 its maximum
is about 0.5+(1.5+1.5)*x**2/3 ~= 1.117 double-precision ulps.
The minimax polynomial needs to be different to work for the larger
interval. I didn't increase its degree the old degree is just large
enough to keep the final error less than 1 ulp and increasing the
degree would be a pessimization. The maximum error is now ~0.80
ulps instead of ~0.53 ulps.
The speedup from this optimization for uniformly distributed args in
[-2pi, 2pi] is 28-43% on athlons, depending on how badly gcc selected
and scheduled the instructions in the old version. The old version
has some int-to-float conversions that are apparently difficult to schedule
well, but gcc-3.3 somehow did everything ~10 cycles or ~10% faster than
gcc-3.4, with the difference especially large on AXPs. On A64s, the
problem seems to be related to documented penalties for moving single
precision data to undead xmm registers. With this version, the speed
is cycles is almost independent of the athlon and gcc version despite
the large differences in instruction selection to use the FPU on AXPs
and SSE on A64s.
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will cause
link errors and not crashes.
This version is a routine translation with no special optimizations
for accuracy or efficiency. It gives an unimportant increase in
accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is
from the minimax polynomial (~0.03 ulps and the final rounding step
(< 0.5 ulps). It gives strange differences in efficiency in the -5
to +10% range, with -O1 fairly consistently becoming faster and -O2
slower on AXP and A64 with gcc-3.3 and gcc-3.4.
arg to __kernel_rem_pio2() gives 53-bit (double) precision, not single
precision and/or the array dimension like I thought. prec == 2 is
used in e_rem_pio2.c for double precision although it is documented
to be for 64-bit (extended) precision, and I just reduced it by 1
thinking that this would give the value suitable for 24-bit (float)
precision. Reducing it 1 more to the documented value for float
precision doesn't actually work (it gives errors of ~0.75 ulps in the
reduced arg, but errors of much less than 0.5 ulps are needed; the bug
seems to be in kernel_rem_pio2.c). Keep using a value 1 larger than
the documented value but supply an array large enough hold the extra
unused result from this.
The bug can also be fixed quickly by increasing init_jk[0] in
k_rem_pio2.c from 2 to 3. This gives behaviour identical to using
prec == 1 except it doesn't create the extra result. It isn't clear
how the precision bug affects higher precisions. 113-bit (quad) is
the largest precision, so there is no way to use a large precision
to fix it.
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_lgammaf_r.c).
__kernel_tanf() is too large and complicated for gcc to inline very well.
An athlons, this gives a speed increase under favourable pipeline
conditions of about 10% overall (larger for AXP, smaller for A64).
E.g., on AXP, sinf() on uniformly distributed args in [-2Pi, 2Pi]
now takes 30-56 cycles; it used to take 45-61 cycles; hardware fsin
takes 65-129.
On athlons, this gives a speedup of 10-20% for tanf() on uniformly
distributed args in [-2Pi, 2Pi]. (It only directly applies for 43%
of the args and gives a 16-20% speedup for these (more for AXP than
A64) and this gives an overall speedup of 10-12% which is all that it
should; however, it gives an overall speedup of 17-20% with gcc-3.3
on AXP-A64 by mysteriously effected cases where it isn't executed.)
I originally intended to use double precision for all internals of
float trig functions and will probably still do this, but benchmarking
showed that converting to double precision and back is a pessimization
in cases where a simple float precision calculation works, so it may
be optimal to switch precisions only when using extra precision is
much simpler.
__ieee754_rem_pio2f() to its 3 callers and manually inline them.
On Athlons, with favourable compiler flags and optimizations and
favourable pipeline conditions, this gives a speedup of 30-40 cycles
for cosf(), sinf() and tanf() on the range pi/4 < |x| <= 9pi/4, so
thes functions are now signifcantly faster than the hardware trig
functions in many cases. E.g., in a benchmark with uniformly distributed
x in [-2pi, 2pi], A64 hardware fcos took 72-129 cycles and cosf() took
37-55 cycles. Out-of-order execution is needed to get both of these
times. The optimizations in this commit apparently work more by
removing 1 serialization point than by reducing latency.
s_cosf.c and s_sinf.c:
Use a non-bogus magic constant for the threshold of pi/4. It was 2 ulps
smaller than pi/4 rounded down, but its value is not critical so it should
be the result of natural rounding.
s_cosf.c and s_tanf.c:
Use a literal 0.0 instead of an unnecessary variable initialized to
[(float)]0.0. Let the function prototype convert to 0.0F.
Improved wording in some comments.
Attempted to improve indentation of comments.
number of branches.
Use a non-bogus magic constant for the threshold of pi/4. It was 2 ulps
smaller than pi/4 rounded down, but its value is not critical so it should
be the result of natural rounding. Use "<=" comparisons with rounded-
down thresholds for all small multiples of pi/4.
Cleaned up previous commit:
- use static const variables instead of expressions for multiples of pi/2
to ensure that they are evaluated at compile time. gcc currently
evaluates them at compile time but C99 compilers are not required
to do so. We want compile time evaluation for optimization and don't
care about side effects.
- use M_PI_2 instead of a magic constant for pi/2. We need magic constants
related to pi/2 elsewhere but not here since we just want pi/2 rounded
to double and even prefer it to be rounded in the default rounding mode.
We can depend on the cmpiler being C99ish enough to round M_PI_2 correctly
just as much as we depended on it handling hex constants correctly. This
also fixes a harmless rounding error in the hex constant.
- keep using expressions n*<value for pi/2> in the initializers for the
static const variables. 2*M_PI_2 and 4*M_PI_2 are obviously rounded in
the same way as the corresponding infinite precision expressions for
multiples of pi/2, and 3*M_PI_2 happens to be rounded like this, so we
don't need magic constants for the multiples.
- fixed and/or updated some comments.
define also, but res_config.h was not included into libc/net/name6.c.
So getipnodebyaddr() ignored the multiple PTRs.
PR: kern/88241
Submitted by: Dan Lukes <dan__at__obluda.cz>
MFC after: 3 days
The threshold for not being tiny was too small. Use the usual 2**-12
threshold. This change is not just an optimization, since the general
code that we fell into has accuracy problems even for tiny x. Avoiding
it fixes 2*1366 args with errors of more than 1 ulp, with a maximum
error of 1.167 ulps.
The magic number 22 is log(DBL_EPSILON)/2 plus slop. This is bogus
for float precision. Use 9 (~log(FLT_EPSILON)/2 plus less slop than
for double precision). The code for handling the interval
[2**-28, 9_was_22] has accuracy problems even for [9, 22], so this
change happens to fix errors of more than 1 ulp in about 2*17000
cases. It leaves such errors in about 2*1074000 cases, with a max
error of 1.242 ulps.
The threshold for switching from returning exp(x)/2 to returning
exp(x/2)^2/2 was a little smaller than necessary. As for coshf(),
This was not quite harmless since the exp(x/2)^2/2 case is inaccurate,
and fixing it avoids accuracy problems in 2*6 cases, leaving problems
in 2*19997 cases.
Fixed naming errors in pseudo-code in comments.
The threshold for not being tiny was confusing and too small. Use the
usual 2**-12 threshold and simplify the algorithm slightly so that
this threshold works (now use the threshold for sinhf() instead of one
for 1+expm1()). This is just a small optimization.
The magic number 22 is log(DBL_EPSILON)/2 plus slop. This is bogus
for float precision. Use 9 (~log(FLT_EPSILON)/2 plus less slop than
for double precision).
The threshold for switching from returning exp(x)/2 to returning
exp(x/2)^2/2 was a little smaller than necessary. This was not quite
harmless since the exp(x/2)^2/2 case is inaccurate. Fixing it happens
to avoid accuracy problems for 2*6 of the 2*151 args that were handled
by the exp(x)/2 case. This leaves accuracy problems for about 2*19997
args near the overflow threshold (~89); the maximum error there is
2.5029 ulps.
There are also accuracy probles for args in +-[0.5*ln2, 9] -- 2*188885
args with errors of more than 1 ulp, with a maximum error of 1.384 ulps.
Fixed a syntax error and naming errors in pseudo-code in comments.
specialized for float precision. The new polynomial has degree 8
instead of 14, and a maximum error of 2**-34.34 (absolute) instead of
2**-30.66. This doesn't affect the final error significantly; the
maximum error was and is about 0.8879 ulps on amd64 -01.
The fdlibm expf() is not used on i386's (the "optimized" asm version
is used), but probably should be since it was already significantly
faster than the asm version on athlons. The asm version has the
advantage of being more accurate, so keep using it for now.
polynomial for __kernel_tanf(). The old one was the double-precision
polynomial with coefficients truncated to float. Truncation is not
a good way to convert minimax polynomials to lower precision. Optimize
for efficiency and use the lowest-degree polynomial that gives a
relative error of less than 1 ulp. It has degree 13 instead of 27,
and happens to be 2.5 times more accurate (in infinite precision) than
the old polynomial (the maximum error is 0.017 ulps instead of 0.041
ulps).
Unlike for cosf and sinf, the old accuracy was close to being inadequate
-- the polynomial for double precision has a max error of 0.014 ulps
and nearly this small an error is needed. The new accuracy is also a
bit small, but exhaustive checking shows that even the old accuracy
was enough. The increased accuracy reduces the maximum relative error
in the final result on amd64 -O1 from 0.9588 ulps to 0.9044 ulps.
special case pi/4 <= |x| < 3*pi/4. This gives a tiny optimization (it
saves 2 subtractions, which are scheduled well so they take a whole 1
cycle extra on an AthlonXP), and simplifies the code so that the
following optimization is not so ugly.
Optimize for the range 3*pi/4 < |x| < 9*Pi/2 in the same way. On
Athlon{XP,64} systems, this gives a 25-40% optimization (depending a
lot on CFLAGS) for the cosf() and sinf() consumers on this range.
Relative to i387 hardware fcos and fsin, it makes the software versions
faster in most cases instead of slower in most cases. The relative
optimization is smaller for tanf() the inefficient part is elsewhere.
The 53-bit approximation to pi/2 is good enough for pi/4 <= |x| <
3*pi/4 because after losing up to 24 bits to subtraction, we still
have 29 bits of precision and only need 25 bits. Even with only 5
extra bits, it is possible to get perfectly rounded results starting
with the reduced x, since if x is nearly a multiple of pi/2 then x is
not near a half-way case and if x is not nearly a multiple of pi/2
then we don't lose many bits. With our intentionally imperfect rounding
we get the same results for cosf(), sinf() and tanf() as without this
optimization.
standard in C99 and POSIX.1-2001+. They are also not deprecated, since
apart from being standard they can handle special args slightly better
than the ilogb() functions.
Move their documentation to ilogb.3. Try to use consistent and improved
wording for both sets of functions. All of ieee854, C99 and POSIX
have better wording and more details for special args.
Add history for the logb() functions and ilogbl(). Fix history for
ilogb().
so that it can be faster for tiny x and avoided for reduced x.
This improves things a little differently than for cosine and sine.
We still need to reclassify x in the "kernel" functions, but we get
an extra optimization for tiny x, and an overall optimization since
tiny reduced x rarely happens. We also get optimizations for space
and style. A large block of poorly duplicated code to fix a special
case is no longer needed. This supersedes the fixes in k_sin.c revs
1.9 and 1.11 and k_sinf.c 1.8 and 1.10.
Fixed wrong constant for the cutoff for "tiny" in tanf(). It was
2**-28, but should be almost the same as the cutoff in sinf() (2**-12).
The incorrect cutoff protected us from the bugs fixed in k_sinf.c 1.8
and 1.10, except 4 cases of reduced args passed the cutoff and needed
special handling in theory although not in practice. Now we essentially
use a cutoff of 0 for the case of reduced args, so we now have 0 special
args instead of 4.
This change makes no difference to the results for sinf() (since it
only changes the algorithm for the 4 special args and the results for
those happen not to change), but it changes lots of results for sin().
Exhaustive testing is impossible for sin(), but exhaustive testing
for sinf() (relative to a version with the old algorithm and a fixed
cutoff) shows that the changes in the error are either reductions or
from 0.5-epsilon ulps to 0.5+epsilon ulps. The new method just uses
some extra terms in approximations so it tends to give more accurate
results, and there are apparently no problems from having extra
accuracy. On amd64 with -O1, on all float args the error range in ulps
is reduced from (0.500, 0.665] to [0.335, 0.500) in 24168 cases and
increased from 0.500-epsilon to 0.500+epsilon in 24 cases. Non-
exhaustive testing by ucbtest shows no differences.
commit moved it). This includes a comment that the "kernel" sine no
longer works on arg -0, so callers must now handle this case. The kernel
sine still works on all other tiny args; without the optimization it is
just a little slower on these args. I intended it to keep working on
all tiny args, but that seems to be impossible without losing efficiency
or accuracy. (sin(x) ~ x * (1 + S1*x**2 + ...) would preserve -0, but
the approximation must be written as x + S1*x**3 + ... for accuracy.)
case never occurs since pi/2 is irrational so no multiple of it can
be represented as a float and we have precise arg reduction so we never
end up with a remainder of 0 in the "kernel" function unless the
original arg is 0.
If this case occurs, then we would now fall through to general code
that returns +-Inf (depending on the sign of the reduced arg) instead
of forcing +Inf. The correct handling would be to return NaN since
we would have lost so much precision that the correct result can be
anything _except_ +-Inf.
Don't reindent the else clause left over from this, although it was already
bogusly indented ("if (foo) return; else ..." just marches the indentation
to the right), since it will be removed too.
Index: k_tan.c
===================================================================
RCS file: /home/ncvs/src/lib/msun/src/k_tan.c,v
retrieving revision 1.10
diff -r1.10 k_tan.c
88,90c88
< if (((ix | low) | (iy + 1)) == 0)
< return one / fabs(x);
< else {
---
> {
a declaration was not translated to "float" although bit fiddling on
double variables was translated. This resulted in garbage being put
into the low word of one of the doubles instead of non-garbage being
put into the only word of the intended float. This had no effect on
any result because:
- with doubles, the algorithm for calculating -1/(x+y) is unnecessarily
complicated. Just returning -1/((double)x+y) would work, and the
misdeclaration gave something like that except for messing up some
low bits with the bit fiddling.
- doubles have plenty of bits to spare so messing up some of the low
bits is unlikely to matter.
- due to other bugs, the buggy code is reached for a whole 4 args out
of all 2**32 float args. The bug fixed by 1.8 only affects a small
percentage of cases and a small percentage of 4 is 0. The 4 args
happen to cause no problems without 1.8, so they are even less likely
to be affected by the bug in 1.8 than average args; in fact, neither
1.8 nor this commit makes any difference to the result for these 4
args (and thus for all args).
Corrections to the log message in 1.8: the bug only applies to tan()
and not tanf(), not because the float type can't represent numbers
large enough to trigger the problem (e.g., the example in the fdlibm-5.3
readme which is > 1.0e269), but because:
- the float type can't represent small enough numbers. For there to be
a possible problem, the original arg for tanf() must lie very near an
odd multiple of pi/2. Doubles can get nearer in absolute units. In
ulps there should be little difference, but ...
- ... the cutoff for "small" numbers is bogus in k_tanf.c. It is still
the double value (2**-28). Since this is 32 times smaller than
FLT_EPSILON and large float values are not very uniformly distributed,
only 6 args other than ones that are initially below the cutoff give
a reduced arg that passes the cutoff (the 4 problem cases mentioned
above and 2 non-problem cases).
Fixing the cutoff makes the bug affect tanf() and much easier to detect
than for tan(). With a cutoff of 2**-12 on amd64 with -O1, 670102
args pass the cutoff; of these, there are 337604 cases where there
might be an error of >= 1 ulp and 5826 cases where there is such an
error; the maximum error is 1.5382 ulps.
The fix in 1.8 works with the reduced cutoff in all cases despite the
bug in it. It changes the result in 84492 cases altogether to fix the
5826 broken cases. Fixing the fix by translating "double" to "float"
changes the result in 42 cases relative to 1.8. In 24 cases the
(absolute) error is increased and in 18 cases it is reduced, but it
remains less than 1 ulp in all cases.
rewritten, now timers created with same sigev_notify_attributes will
run in same thread, this allows user to organize which timers can
run in same thread to save some thread resource.
The log message for 1.5 said that some small (one or two ulp) inaccuracies
were fixed, and a comment implied that the critical change is to switch
the rounding mode to to-nearest, with a switch of the precision to
extended at no extra cost. Actually, the errors are very large (ucbtest
finds ones of several hundred ulps), and it is the switch of the
precision that is critical.
Another comment was wrong about NaNs being handled sloppily.
and medium size args too: instead of conditionally subtracting a float
17+24, 17+17+24 or 17+17+17+24 bit approximation to pi/2, always
subtract a double 33+53 bit one. The float version is now closer to
the double version than to old versions of itself -- it uses the same
33+53 bit approximation as the simplest cases in the double version,
and where the float version had to switch to the slow general case at
|x| == 2^7*pi/2, it now switches at |x| == 2^19*pi/2 the same as the
double version.
This speeds up arg reduction by a factor of 2 for |x| between 3*pi/4 and
2^7*pi/4, and by a factor of 7 for |x| between 2^7*pi/4 and 2^19*pi/4.
using under FreeBSD. Before this commit, all float precision functions
except exp2f() were implemented using only float precision, apparently
because Cygnus needed this in 1993 for embedded systems with slow or
inefficient double precision. For FreeBSD, except possibly on systems
that do floating point entirely in software (very old i386 and now
arm), this just gives a more complicated implementation, many bugs,
and usually worse performance for float precision than for double
precision. The bugs and worse performance were particulary large in
arg reduction for trig functions. We want to divide by an approximation
to pi/2 which has as many as 1584 bits, so we should use the widest
type that is efficient and/or easy to use, i.e., double. Use fdlibm's
__kernel_rem_pio2() to do this as Sun apparently intended. Cygnus's
k_rem_pio2f.c is now unused. e_rem_pio2f.c still needs to be separate
from e_rem_pio2.c so that it can be optimized for float args. Similarly
for long double precision.
This speeds up cosf(x) on large args by a factor of about 2. Correct
arg reduction on large args is still inherently very slow, so hopefully
these args rarely occur in practice. There is much more efficiency
to be gained by using double precision to speed up arg reduction on
medium and small float args.
__kernel_sinf(). The old ones were the double-precision polynomials
with coefficients truncated to float. Truncation is not a good way
to convert minimax polynomials to lower precision. Optimize for
efficiency and use the lowest-degree polynomials that give a relative
error of less than 1 ulp -- degree 8 instead of 14 for cosf and degree
9 instead of 13 for sinf. For sinf, the degree 8 polynomial happens
to be 6 times more accurate than the old degree 14 one, but this only
gives a tiny amount of extra accuracy in results -- we just need to
use a a degree high enough to give a polynomial whose relative accuracy
in infinite precision (but with float coefficients) is a small fraction
of a float ulp (fdlibm generally uses 1/32 for the small fraction, and
the fraction for our degree 8 polynomial is about 1/600).
The maximum relative errors for cosf() and sinf() are now 0.7719 ulps
and 0.7969 ulps, respectively.
This supersedes the fix for the old algorithm in rev.1.8 of k_cosf.c.
I want this change mainly because it is an optimization. It helps
make software cos[f](x) and sin[f](x) faster than the i387 hardware
versions for small x. It is also a simplification, and reduces the
maximum relative error for cosf() and sinf() on machines like amd64
from about 0.87 ulps to about 0.80 ulps. It was validated for cosf()
and sinf() by exhaustive testing. Exhaustive testing is not possible
for cos() and sin(), but ucbtest reports a similar reduction for the
worst case found by non-exhaustive testing. ucbtest's non-exhaustive
testing seems to be good enough to find problems in algorithms but not
maximum relative errors when there are spikes. E.g., short runs of
it find only 3 ulp error where the i387 hardware cos() has an error
of about 2**40 ulps near pi/2.
to floats (mainly i386's). All errors of more than 1 ulp for float
precision trig functions were supposed to have been fixed; however,
compiling with gcc -O2 uncovered 18250 more such errors for cosf(),
with a maximum error of 1.409 ulps.
Use essentially the same fix as in rev.1.8 of k_rem_pio2f.c (access a
non-volatile variable as a volatile). Here the -O1 case apparently
worked because the variable is in a 2-element array and it takes -O2
to mess up such a variable by putting it in a register.
The maximum error for cosf() on i386 with gcc -O2 is now 0.5467 (it
is still 0.5650 with gcc -O1). This shows that -O2 still causes some
extra precision, but the extra precision is now good.
Extra precision is harmful mainly for implementing extra precision in
software. We want to represent x+y as w+r where both "+" operations
are in infinite precision and r is tiny compared with w. There is a
standard algorithm for this (Knuth (1981) 4.2.2 Theorem C), and fdlibm
uses this routinely, but the algorithm requires w and r to have the
same precision as x and y. w is just x+y (calculated in the same
finite precision as x and y), and r is a tiny correction term. The
i386 gcc bugs tend to give extra precision in w, and then using this
extra precision in the calculation of r results in the correction
mostly staying in w and being missing from r. There still tends to
be no problem if the result is a simple expression involving w and r
-- modulo spills, w keeps its extra precision and r remains the right
correction for this wrong w. However, here we want to pass w and r
to extern functions. Extra precision is not retained in function args,
so w gets fixed up, but the change to the tiny r is tinier, so r almost
remains as a wrong correction for the right w.
{cos_sin}[f](x) so that x doesn't need to be reclassified in the
"kernel" functions to determine if it is tiny (it still needs to be
reclassified in the cosine case for other reasons that will go away).
This optimization is quite large for exponentially distributed x, since
x is tiny for almost half of the domain, but it is a pessimization for
uniformally distributed x since it takes a little time for all cases
but rarely applies. Arg reduction on exponentially distributed x
rarely gives a tiny x unless the reduction is null, so it is best to
only do the optimization if the initial x is tiny, which is what this
commit arranges. The imediate result is an average optimization of
1.4% relative to the previous version in a case that doesn't favour
the optimization (double cos(x) on all float x) and a large
pessimization for the relatively unimportant cases of lgamma[f][_r](x)
on tiny, negative, exponentially distributed x. The optimization should
be recovered for lgamma*() as part of fixing lgamma*()'s low-quality
arg reduction.
Fixed various wrong constants for the cutoff for "tiny". For cosine,
the cutoff is when x**2/2! == {FLT or DBL}_EPSILON/2. We round down
to an integral power of 2 (and for cos() reduce the power by another
1) because the exact cutoff doesn't matter and would take more work
to determine. For sine, the exact cutoff is larger due to the ration
of terms being x**2/3! instead of x**2/2!, but we use the same cutoff
as for cosine. We now use a cutoff of 2**-27 for double precision and
2**-12 for single precision. 2**-27 was used in all cases but was
misspelled 2**27 in comments. Wrong and sloppy cutoffs just cause
missed optimizations (provided the rounding mode is to nearest --
other modes just aren't supported).
readable on certain random memory configurations. If the libkvm consumer
tried to read something that was in the very last pdpe, pde or pte slot,
it would bogusly fail.
This is broken in RELENG_6 too.
expr and printf are not available during installworld, so
use /bin/sh arithmetic expansion instead of expr and simply
give up on vanity formatting. ;-)
systems (or on FreeBSD systems when using ports).
2) Overhaul the versioning logic. In particular,
SHLIB_MAJOR number is now computed as "major+minor",
which ensures library versions are the same for
the FreeBSD build system and the portable
libtool/autoconf/automake build system.
link names, usernames, or group names that contain
non-ASCII characters.
In particular, this corrects an inconsistency reported
by Ed Maste when archiving symlinks with odd characters:
long symlinks would get preserved, short ones would
be changed.
broken assignment to floats (e.g., i386 with gcc -O, but not amd64 or
ia64; i386 with gcc -O0 worked accidentally).
Use an unnamed volatile temporary variable to trick gcc -O into clipping
extra precision on assignment. It's surprising that only 1 place needed
to be changed.
For tanf() on i386 with gcc -O, the bug caused errors > 1 ulp with a
density of 2.3% for args larger in magnitude than 128*pi/2, with a
maximum error of 1.624 ulps.
After this fix, exhaustive testing shows that range reduction for
floats works as intended assuming that it is in within a factor of
about 2^16 of working as intended for doubles. It provides >= 8
extra bits of precision for all ranges. On i386:
range max error in double/single ulps extra precision
----- ------------------------------- ---------------
0 to 3*pi/4 0x000d3132 / 0.0016 9+ bits
3*pi/4 to 128*pi/2 0x00160445 / 0.0027 8+
128*pi/2 to +Inf 0x00000030 / 0.00000009 23+
128*pi/2 up, -O0 before fix 0x00000030 / 0.00000009 23+
128*pi/2 up, -O1 before fix 0x10000000 / 0.5 1
The 23+ bits of extra precision for large multiples corresponds to almost
perfect reduction to a pair of floats (24 extra would be perfect).
After this fix, the maximum relative error (relative to the corresponding
fdlibm double precision function) is < 1 ulp for all basic trig functions
on all 2^32 float args on all machines tested:
amd64 ia64 i386-O0 i386-O1
------ ------ ------ ------
cosf: 0.8681 0.8681 0.7927 0.5650
sinf: 0.8733 0.8610 0.7849 0.5651
tanf: 0.9708 0.9329 0.9329 0.7035
of pi/2 (1 line) and expand a comment about related magic (many lines).
The bug was essentially the same as for the +-pi/2 case (a mistranslated
mask), but was smaller so it only significantly affected multiples
starting near +-13*pi/2. At least on amd64, for cosf() on all 2^32
float args, the bug caused 128 errors of >= 1 ulp, with a maximum error
of 1.2393 ulps.
and add a comment about related magic (many lines)).
__kernel_cos[f]() needs a trick to reduce the error to below 1 ulp
when |x| >= 0.3 for the range-reduced x. Modulo other bugs, naive
code that doesn't use the trick would have an error of >= 1 ulp
in about 0.00006% of cases when |x| >= 0.3 for the unreduced x,
with a maximum relative error of about 1.03 ulps. Mistransation
of the trick from the double precision case resulted in errors in
about 0.2% of cases, with a maximum relative error of about 1.3 ulps.
The mistranslation involved not doing implicit masking of the 32-bit
float word corresponding to to implicit masking of the lower 32-bit
double word by clearing it.
sinf() uses __kernel_cosf() for half of all cases so its errors from
this bug are similar. tanf() is not affected.
The error bounds in the above and in my other recent commit messages
are for amd64. Extra precision for floats on i386's accidentally masks
this bug, but only if k_cosf.c is compiled with -O. Although the extra
precision helps here, this is accidental and depends on longstanding
gcc precision bugs (not clipping extra precision on assignment...),
and the gcc bugs are mostly avoided by compiling without -O. I now
develop libm mainly on amd64 systems to simplify error detection and
debugging.
17+17+24 bit pi/2 must only be used when subtraction of the first 2
terms in it from the arg is exact. This happens iff the the arg in
bits is one of the 2**17[-1] values on each side of (float)(pi/2).
Revert to the algorithm in rev.1.7 and only fix its threshold for using
the 3-term pi/2. Use the threshold that maximizes the number of values
for which the 3-term pi/2 is used, subject to not changing the algorithm
for comparing with the threshold. The 3-term pi/2 ends up being used
for about half of its usable range (about 64K values on each side).
a maximum error of 2.905 ulps for cosf(), but the algorithm for cosf()
is good for < 1 ulps and happens to give perfect rounding (< 0.5 ulps)
near +-pi/2 except for the bug. The extra relative errors for tanf()
were similar (slightly larger). The bug didn't affect sinf() since
sinf'(+-pi/2) is 0.
For range reduction in ~[-3pi/4, -pi/4] and ~[pi/4, 3pi/4] we must
subtract +-pi/2 and the only complication is that this must be done
in extra precision. We have handy 17+24-bit and 17+17+24-bit
approximations to pi/2. If we always used the former then we would
lose up to 24 bits of accuracy due to cancelation of leading bits, but
we need to keep at least 24 bits plus a guard digit or 2, and should
keep as many guard bits as efficiency permits. So we used the
less-precise pi/2 not very near +-pi/2 and switched to using the
more-precise pi/2 very near +-pi/2. However, we got the threshold for
the switch wrong by allowing 19 bits to cancel, so we ended up with
only 21 or 22 bits of accuracy in some cases, which is even worse than
naively subtracting pi/2 would have done.
Exhaustive checking shows that allowing only 17 bits to cancel (min.
accuracy ~24 bits) is sufficient to reduce the maximum error for cosf()
near +-pi/2 to 0.726 ulps, but allowing only 6 bits to cancel (min.
accuracy ~35-bits) happens to give perfect rounding for cosf() at
little extra cost so we prefer that.
We actually (in effect) allow 0 bits to cancel and always use the
17+17+24-bit pi/2 (min. accuracy ~41 bits). This is simpler and
probably always more efficient too. Classifying args to avoid using
this pi/2 when it is not needed takes several extra integer operations
and a branch, but just using it takes only 1 FP operation.
The patch also fixes misspelling of 17 as 24 in many comments.
For the double-precision version, the magic numbers include 33+53 bits
for the less-precise pi/2 and (53-32-1 = 20) bits being allowed to
cancel, so there are ~33-20 = 13 guard bits. This is sufficient except
probably for perfect rounding. The more-precise pi/2 has 33+33+53
bits and we still waste time classifying args to avoid using it.
The bug is apparently from mistranslation of the magic 32 in 53-32-1.
The number of bits allowed to cancel is not critical and we use 32 for
double precision because it allows efficient classification using a
32-bit comparison. For float precision, we must use an explicit mask,
and there are fewer bits so there is less margin for error in their
allocation. The 32 got reduced to 4 but should have been reduced
almost in proportion to the reduction of mantissa bits.
in 1993 in rev.1.5 of the i386 a.out version (csu/i386/crt0.c).
Profiling uses a magic label "eprol" to delimit the start of the part
of the text section covered by profiling. This label must be placed
before the call to main() to get main() properly profiled. It was
placed there in rev.1.1 of crt0.c. Rev.1.5 imported the initial
implementation of shared libraries in FreeBSD and misplaced the label.
Fortunately, the misplaced label was misspelled and the old label
wasn't removed, so the new label had no effect. Unfortunately, when
profiling was implemented for the ELF in 1998 in rev.1.2 of
csu/i386-elf/crt1.c, only the incorrectly placed label was copied
(after fixing its name). The bug was then copied to all other arches.
The label seems to be still misplaced in NetBSD for most arches. It
is in common.c for most arches so it is even further from being inside
the function that calls main().
I think "eprol" is short for "end of prologue", but it must be placed
before the end of the prologue so that it covers main(). crt0.c has
it before the calls atexit(_mcleanup) and monstartup(...), but it
cannot affect these calls so I moved it after the call to monstartup().
It now also covers the call to _init() but not the newer call to
_init_tls(). Profiling of _init() seems to be harmless, and the call
to _init_tls() seems to be misplaced.
Reviewed by: jdp (long ago, for a slightly different i386 version)
host name. This is matches the documented behaviro. The previous
behavior would remove the domain name even if the result retained a dot.
This fixes rsh connections from a.example.com to example.com.
Reviewed by: ceri (at least the concept)
o Don't reinitialise the atfork() handler list in the child. We
are meant to call the child handler, and on subsequent fork()s
should call all three functions as normal.
o Don't reinitialise the thread specific keyed data in the
child after a fork. Applications may require this for context.
o Reinitialise curthread->tlflags after removing ourselves from
(and reinitialising) the various internal thread lists.
o Reinitialise __malloc_lock in the child after fork() (to balance
our explicitly taking the lock prior to the fork()).
With these changes, it is possible to enable the NOTYET code in
thr_kern.c to allow the use of non-async-safe functions after
fork()ing from a threaded program.
Reviewed by: Daniel Eischen <deischen@freebsd.org>
[_malloc_lock reinitialisation has since been moved to avoid polluting the
!NOTYET code]
"HEADER" unless the open is successful. Instead, leave the state as
"NEW." In particular, if archive_read_open() fails, a subsequent call
to archive_read_next_header() will now cause an explicit assertion
failure instead of a silent segmentation fault.
This may need a little more work to fully realize the intention: If
archive_read_open() fails, you should be able to call it again on the
same archive handle to open a different archive (or the same archive
using a different mechanism).
sizeof(*list), not sizeof(**list). (i.e., sizeof(pointer) rather than
sizeof(char)).
It is possible that this buffer overflow is exploitable, but it was
added after RELENG_5 forked and hasn't been MFCed, so this will not
receive an advisory.
Submitted by: Vitezslav Novy
MFC after: 1 day
to doubles as bits. fdlibm-1.1 had similar aliasing bugs, but these
were fixed by NetBSD or Cygnus before a modified version of fdlibm was
imported in 1994. TRUNC() is only used by tgamma() and some
implementation-detail functions. The aliasing bugs were detected by
compiling with gcc -O2 but don't seem to have broken tgamma() on i386's
or amd64's. They broke my modified version of tgamma().
Moved the definition of TRUNC() to mathimpl.h so that it can be fixed
in one place, although the general version is even slower than necessary
because it has to operate on pointers to volatiles to handle its arg
sometimes being volatile. Inefficiency of the fdlibm macros slows
down libm generally, and tgamma() is a relatively unimportant part of
libm. The macros act as if on 32-bit words in memory, so they are
hard to optimize to direct actions on 64-bit double registers for
(non-i386) machines where this is possible. The optimization is too
hard for gcc on amd64's, and declaring variables as volatile makes it
impossible.
Change first MAXPATHLEN to more standard PATH_MAX
Change second MAXPATHLEN to 1024 (it is temp buffer not related)
Change comment to reflect that.
Suggested by: bde
just use MAXPATHLEN. It prevents potential buffer overflow with other
malloc implementations.
(this change based on submitted patch)
PR: 86135
Submitted by: Trevor Blackwell <tlb@tlb.org>
(wchar_t is defined in stddef.h, and only two files need more than that.)
Portability: Since the wchar requirements are really quite modest,
it's easy to define basic replacements for wcslen, wcscmp, wcscpy,
etc, for use on systems that lack <wchar.h>. In particular, this allows
libarchive to be used on older OpenBSD systems.
than the value of backlog argument.
- Document the fact that a subsequent listen(2) calls on the listening
socket change the backlog argument.
- Note that current listen queue lengths can be queried using netstat(1).
Submitted by: Igor Sysoev <is rambler-co.ru>
Wording by: gnn
It is the binary equivalent to strstr(3).
void *memmem(const void *big, size_t big_len,
const void *little, size_t little_len);
Submitted by: Pascal Gloor <pascal.gloor at spale.com>
MFC after: 3 days
correctly in the case of FTP_PROXY, because an empty FTP_PROXY has a
specific meaning ("don't use any proxy at all for ftp, even if HTTP_PROXY
is defined"), while an empty HTTP_PROXY has no meaning at all.
PR: bin/85185
Submitted by: Conall O'Brien <conallob=freebsd@maths.tcd.ie>
MFC after: 2 weeks
example in libmemstat.3 was not updated to take this rename into account.
Update the example.
PR: 84946
Submitted by: Wojciech A. Koszek <dunstan at freebsd dot czest dot pl>
MFC after: 1 day
inadvertently match a negative char in the RE being compiled.
This fixes compilation of "\376" (as an ERE) and "\376\376" (as a BRE).
PR: 84740
MFC after: 1 week
tokenizer.c:1.3). Contrary to the commit log there were no memory leaks,
but the change introduced a bug because the free'd pointer was not zeroed
and calling the appropriate _end() function would call free() a second time.
so that libmemstat can be used to view full memory statistics from
kernel core dumps and /dev/mem. This is provided via a new query
function, memstat_kvm_malloc(), which is also automatically invoked
by memstat_kvm_all(). A kvm handle must be passed in.
This will allow malloc(9)-specific code to be removed from vmstat(8).
opt_vmpage.h.
Remove definition of _KERNEL, it is no longer required in order to
include uma_int.h, as the sensitive parts of uma_int.h (a number of
inlines depending on kernel-only constants) are now protected by
_KERNEL.
extracted from tar archives. Otherwise, converting tar archives to
cpio format (with "bsdtar -cf out.cpio @in.tar") convert every entry
into a hard link to a single file. This simple logic breaks hard
links, but that's better than the alternative.
MFC after: 7 days
header of the pax extension entry, clip them to ustar limits. In particular,
this prevents an internal panic for very old files.
Thanks to: Chris Spiegel
MFC after: 7 days
GNU tar sparse files, people have extended cpio) and clarify an
important detail about pax format (that ustar-compliant archivers
can mostly read pax archives correctly).
MFC after: 7 days
that knows how to extract UMA(9) allocator statistics from a core dump or
live memory image using kvm(3). The caller is expected to provide the
necessary kvm_t handle, which is then used by libmemstat(3).
With these changes, it is trivially straight forward to re-introduce
vmstat -z support on core dumps, which was lost when UMA was introduced.
In the short term, this requires including vm/ include files that are not
intended for extra-kernel use, requiring in turn some ugliness.
- Move memory_type_list flushing logic from memstat_mtl_free() to
_memstat_mtl_empty(), a libmemstat-internal function that can
be called from other parts of the library. Invoke
_memstat_mtl_empty() from memstat_mtl_free(), which also frees
the containing list structure.
Invoke _memstat_mtl_empty() instead of memstat_mtl_free() in
various error cases in memstat_malloc.c and memstat_uma.c, which
previously resulted in the list being freed prematurely.
- Reverse the order of updating the mt_kegfree and mt_free fields
of the memory_type in memstat_uma.c, otherwise keg free items
won't be counted properly for non-secondary zones.
MFC after: 3 days
conversion routine, now change my mind and add one, memstat_strerror(3),
which returns a const char * pointer to a string describing the error,
to be used on the results of memstat_mtl_geterror().
While here, also correct a minor typo in the HISTORY man page.
Pointers on improving ease of internationalization would be
appreciated.
MFC after: 1 day
- Define a set of libmemstat(3) error constants, which are used by all
libmemstat(3) methods except for memstat_mtl_alloc(), which allocates
a memory type list and may return ENOMEM via errno.
- Define a per-memory_type_list current error value, which is set when a
call associated with a memory list fails. This requires wrapping a
structure around the queue(9) list head data structure, but this change
is not visible to libmemstat(3) consumers due to using access methods.
- Add a new accessor method, memstat_mtl_geterror() to retrieve the error
number.
- Consistently set the error number in a number of failure modes where
previously some combination of setting errno and printf'ing error
descriptions was used. libmemstat(3) will now no longer print to stdio
under any circumstances. Returns of NULL/-1 for errors remain the
same.
This avoids use of stdio, misuse of error numbers, and should make it
easier to program a libmemstat(3) consumer able to print useful error
messages. Currently, no error-to-string function is provided, as I'm
unsure how to address internationalization concerns.
MFC after: 1 day
try and discourage use outside the library.
Remove duplicate declaration of memstat_mtl_free() from memstat_internal.h,
as it's not internal, and the memstat.h definition suffices.
on top of a primary zone, sharing the same allocation "keg". When
reporting statistics for zones, do not report the free items in the
keg as part of the free items in the zone, or those free items will
be reported more than once: for the primary zone, and then any
secondary zones off the primary zone. Separately record and maintain
a kegfree statistic, and export via memstat_get_kegfree(), which is
available for use if needed. Since items free'd back to the keg are
not fully initialized, and hence may not actually be available (since
secondary zone ctor-time initialization can fail), this makes some
amount of sense.
This change corrects a bug made visible in the libmemstat(3)
modifications to netstat: mbufs freed back to the keg from the
packet zone would be counted twice, resulting in negative values
being printed in the mbuf free count.
Some further refinement of reporting relating to secondary zones may
still be required.
Reported by: ssouhlal
MFC after: 3 days
MEMSTAT_MAXCALLER (8), and expose MEMSTAT_MAXCALLER via memstat.h so
that applications can check their assumptions about how many slots
are available.
Remove 'spare' memory storage in struct malloc_type, since we now
don't expose the data structure internals to applications and rely
on accessor methods, this approach to ABI stability isn't required.
MFC after: 7 days
applications in tracking kernel memory statistics. It provides an
abstracted interface to uma(9) and malloc(9) statistics, wrapped
around the recently added binary stream sysctls for the allocators.
Using this interface, it is easy to build monitoring tools, query
specific memory types for usage information, etc. Facilities are
provided for binding caller-provided data to memory types,
incremental updates of memory types, and queries that span multiple
allocators.
Support for additional allocators is (relatively) easy to add.
The API for libmemstat(3) will probably change some over time as
consumers are written, and requirements evolve. It is written to
avoid encoding ABIs for data structure layout into consuming
applications for this reason.
MFC after: 1 week
- It is acceptable to call free(3) when the given pointer itself
is NULL, so we do not need to determine NULL before passing
a pointer to free(3)
- Handle failure of malloc(3)
MT6/5 Candidate
Submitted by: Dan Lukes <dan at obluda cz>
PR: bin/83352
if _FREEFALL_CONFIG is set gcc bails since pam_sm_setcred() in pam_krb5.c
no longer uses any of its parameters.
Pointy hat: kensmith
Approved by: re (scottl)
and writev() except that they take an additional offset argument and do
not change the current file position. In SAT speak:
preadv:readv::pread:read and pwritev:writev::pwrite:write.
- Try to reduce code duplication some by merging most of the old
kern_foov() and dofilefoo() functions into new dofilefoo() functions
that are called by kern_foov() and kern_pfoov(). The non-v functions
now all generate a simple uio on the stack from the passed in arguments
and then call kern_foov(). For example, read() now just builds a uio and
calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev().
PR: kern/80362
Submitted by: Marc Olzheim marcolz at stack dot nl (1)
Approved by: re (scottl)
MFC after: 1 week
a tty device instead of the legacy minor number approach. This is known to
fix gnome-vfs' sftp module as well as kio_sftp and kdesu on -CURRENT.
Thanks to scottl for the snprintf() approach idea.
Reviewed by: phk
Tested by: pav
mich
Approved by: re (scottl)
branches but missed HEAD. This patch extends his a little bit,
setting it up via the Makefiles so that adding _FREEFALL_CONFIG
to /etc/make.conf is the only thing needed to cluster-ize things
(current setup also requires overriding CFLAGS).
From Peter's commit to the RELENG_* branches:
> Add the freebsd.org custer's source modifications under #ifdefs to aid
> keeping things in sync. For ksu:
> * install suid-root by default
> * don't fall back to asking for a unix password (ie: be pure kerberos)
> * allow custom user instances for things like www and not just root
The Makefile tweaks will be MFC-ed, the rest is already done.
MFC after: 3 days
Approved by: re (dwhite)
- Allow libpmc(3) to support P4/EMT64 PMCs on the amd64 architecture
and AMD K8 PMCs on the i386. [2]
Submitted by: ps [1]
Pointy hat: myself [2]
Approved by: re (scottl)
- pmcstat(8) gprof output mode fixes:
lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h:
+ Add a 'is_usermode' field to the PMCLOG_PCSAMPLE event
+ Add an 'entryaddr' field to the PMCLOG_PROCEXEC event,
so that pmcstat(8) can determine where the runtime loader
/libexec/ld-elf.so.1 is getting loaded.
sys/kern/kern_exec.c:
+ Use a local struct to group the entry address of the image being
exec()'ed and the process credential changed flag to the exec
handling hook inside hwpmc(4).
usr.sbin/pmcstat/*:
+ Support "-k kernelpath", "-D sampledir".
+ Implement the ELF bits of 'gmon.out' profile generation in a new
file "pmcstat_log.c". Move all log related functions to this
file.
+ Move local definitions and prototypes to "pmcstat.h"
- Other bug fixes:
+ lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read().
+ sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all
attached PMCs when a process exits.
+ sys/sys/pmc.h: correct a function prototype.
+ Improve usage checks in pmcstat(8).
Approved by: re (blanket hwpmc)
Like on libthr, there is an i386_set_gsbase() stub implementation here
to avoid libc.so.5 issues. This should likely be a weak symbol and I
expect this will be fixed soon.
Approved by: re
returned an lseek offset in a "u_long *" value, which can't express >4GB
offsets on 32 bit machines (eg: PAE). Change to "off_t *" for all.
Support ELF crashdumps on i386 and amd64.
Support PAE crashdumps on i386. This is done by auto-detecting the
presence of the IdlePDPT which means that PAE is active.
I used Marcel's _kvm_pa2off strategy and ELF header reader for ELF support
on amd64. Paul Saab ported the amd64 changes to i386 and we implemented
the PAE support from there.
Note that gdb6 in the src tree uses whatever libkvm supports. If you want
to debug an old crash dump, you might want to keep an old libkvm.so handy
and use LD_PRELOAD or the like. This does not detect the old raw dump
format.
Approved by: re
cached state. Otherwise, a subsequent call to devinfo_init() would succeed
without reading the device tree from the kernel thinking that the cached
state was up to date since the generation count was the same. However,
since the cached state was actually free'd, attempts to examine the tree
after the second devinfo_init() would fail.
Reported by: Juho Vuori juho dot vuori at kepa dot fi
Submitted by: Stefan Farfeleder stefan at fafoe dot narf dot at
Approved by: re (dwhite)
MFC after: 1 week
Compliance Definition. On sparc64, GCC emits _Qp_cmp() calls for its
__builtin_isfoo() functions which are used for C99's isfoo() macros.
Approved by: re(dwhite)
PR: 73782
to point at libmap.conf(5). This will help answer questions about what
and why it is, although not in great detail.
Approved by: re (scottl)
MFC after: 1 week
MFC note: When MFC'd, don't MFC mention of work not yet MFC'd.
method of executing commands remotely. There are no rexec clients in
the FreeBSD tree, and the client function rexec(3) is present only in
libcompat. It has been documented as "obsolete" since 4.3BSD, and its
use has been discouraged in the man page for over 10 years.
reflects the actual behavior of the API
for listing extended attributes.
PR: docs/79261
Submitted by: rodrigc
Reviewed by: rwatson, kan
Approved by: das (mentor)
- Implement sampling modes and logging support in hwpmc(4).
- Separate MI and MD parts of hwpmc(4) and allow sharing of
PMC implementations across different architectures.
Add support for P4 (EMT64) style PMCs to the amd64 code.
- New pmcstat(8) options: -E (exit time counts) -W (counts
every context switch), -R (print log file).
- pmc(3) API changes, improve our ability to keep ABI compatibility
in the future. Add more 'alias' names for commonly used events.
- bug fixes & documentation.
aio_write(2) completion through kevent(2). This method does not work on
64-bit architectures. It was deprecated in FreeBSD 4.4. See revisions
1.87 and 1.70.2.7.
Change aio_physwakeup() to call psignal(9) directly rather than indirectly
through a timeout(9). Discussed with: bde
Correct a bug introduced in revision 1.65 that could result in premature
delivery of a signal if an lio_listio(2) consisted of a mixture of
direct/raw and queued I/O operations. Observed by: tegge
Eliminate a field from struct kaioinfo that is now unused.
Reviewed by: tegge
netent.
- Change 1st argument of getnetbyaddr() to an uint32_t on 64 bit
arch as well to confirm to POSIX-2001.
These changes break ABI compatibility on 64 bit arch.
There is similar padding issue for ai_addrlen of struct addrinfo.
However, it is leaved as is for now.
Discussed on: arch@, standards@ and current@
X-MFC after: never
compiling on IRIX and Solaris. Remove the "archive_check_magic" macro
that existed only to provide __func__ to the underlying __archive_check_magic
function.
Thanks to: Darin Broady
MFC after: 14 days
from mode before using mode for extended attributes entry, copy
mtime/atime/ctime to extended attributes entry so it's a little more
clear that it corresponds to the like-named regular entry.
MFC after: 14 days
BZ_NO_COMPRESS support to the bzip2 sources directly (yes, this takes file
off the vendor branch, but looks like bzip2 maintainer doesn't care), so that
it will not be removed when the next upgrade is performed. Also, add a short
note on how to test bzip2 support.
Pointy hat to: obrien
Correct comment (libz -> libbz2) and remove useless full path to zutil.h
while I am here.
long (and unsigned long long) to long double conversions.
- Add a parameter that specifies the position of the sign bit to the _QP_TTOQ
macro, previously it always looked at bit 31. Pass a negative number to
disable sign inspection for unsigned types. This fixes _Qp_xtoq(),
_Qp_uitoq() and _Qp_uxtoq().
- In the functions __fpu_itof() and __fpu_xtof(), look at the sign bit to
decide whether we're doing a conversion from an unsigned type. If so, don't
negate the mantissa if the integer exceeds the biggest signed number.
PR: 55773
Patch by: Stephen Paskaluk (based upon)
MFC after: 2 weeks
and restoring the metadata. In particular, the metadata-restore
functions now all accept a file descriptor and a pathname. If the
file descriptor is set and the platform supports the appropriate
syscall, restore the metadata through the file descriptor. Otherwise,
restore it through the pathname. This is complicated by varying
syscall support (FreeBSD has an fchmod(2) but no fchflags(2), for
example) and because non-file entries don't have an fd to use in
restoring attributes (for example, mknod(2) doesn't return a file
handle).
MFC after: 14 days
bzip2 support provided, and amd64 depended on. Amd64 has a custom
${.OBJDIR}/machine symlink in it and the -I. picked this up. Without
it, the libstand code was being compiled in 32 bit mode, but with 64 bit
machine headers.
that use SSE. The compiler does attempt to do this in main() but not very
successfully - it still manages to use unaligned offsets from %ebp in some
cases. Also we need to have an aligned stack in case something uses SSE
via _init().
MFC After: 1 week
RFC 2553. In XNS5.2, and subsequently in POSIX-2001 and RFC
3493, it was changed to a socklen_t. And, the n_net of a
struct netent used to be an unsigned long integer. In XNS5,
and subsequently in POSIX-2001, it was changed to an uint32_t.
To accomodate for this while preserving ABI compatibility with
the old interface, we need to prepend or append 32 bits of
padding, depending on the (LP64) architecture's endianness.
- Correct 1st argument of getnetbyaddr() to uint32_t on 32
bit arch. Stay as is on 64 bit arch for ABI backward
compatibility for now.
Reviewed by: das, peter
MFC after: 2 weeks
Reviewed by: rwatson at freebsd dot org
Approved by: rwatson at freebsd dot org
MFC after: 1 week
Fix the matchlen() function so that it handles the IPv4 (AF_INET)
case correctly. Until now it has been treating IPv4 addresses
as if they were IPv6 which could lead to corruption errors.
return the buffer immediately. This will permit ssh and/or PAM logins
broken by previous commit.
The (potential) underlying problem is still under investigation.
Point hat to: me
different from what has been offered in libc_r (the one spotted in the
original PR which is found in libthr has already been removed by David's
commit, which is rev. 1.44 of lib/libthr/thread/thr_private.h):
- Use POSIX standard prototype for ttyname_r, which is,
int ttyname_r(int, char *, size_t);
Instead of:
char *ttyname_r(int, char *, size_t);
This is to conform IEEE Std 1003.1, 2004 Edition [1].
- Since we need to use standard errno for return code, include
errno.h in ttyname.c
- Update ttyname(3) implementation according to reflect the API
change.
- Document new ttyname_r(3) behavior
- Since we already make use of a thread local storage for
ttyname(3), remove the BUGS section.
- Remove conflicting ttyname_r related declarations found in libc_r.
Hopefully this change should not have changed the API/ABI, as the ttyname_r
symbol was never introduced before the last unistd.h change which happens a
couple of days before.
[1] http://www.opengroup.org/onlinepubs/009695399/functions/ttyname.html
Requested by: Tom McLaughlin <tmclaugh sdf lonestar org>
Through PR: threads/76938
Patched by: Craig Rodrigues <rodrigc crodrigues org> (with minor changes)
Prompted by: mezz@
primarily of pointless $FreeBSD$ tags), sync most files in HEAD with
those in the ZLIB branch. This minimizes the differences between
HEAD and ZLIB and should simplify future imports.
After this, there are only three files with local modifications
(gzio.c, minigzip.c, and zconf.h) and two non-vendor files
(Makefile, zopen.c). The rest exactly match the vendor distribution.
PR: i386/76294
MFC after: 2 weeks
(symlink or hardlink) is already set. Instead, it was always setting
the hardlink field. In particular, this caused GNU tar format long
symlinks to be interpreted as hardlinks.
Thanks to: Brooks Davis
MFC after: 7 days
negative (in addition to returning EINVAL when called on a descriptor
that is not a socket).
Submitted by: Arne H Juul <arnej@europe.yahoo-inc.com>
PR: docs/80587
(((truncate to zero length) or (create)) (text file)) (for writing)
and not
((truncate file to zero length) or (create text file)) (for writing)
MFC after: 1 week
- Use /*- instead of /* for copyright section
- Include unistd.h for prototype of it
- Sort and separate includes as described in style(9)
- ANSIfy the function defination
- Use const for the traversing iterator
Have pmcstat(8) and pmccontrol(8) use these APIs.
Return PMC class-related constants (PMC widths and capabilities)
with the OP GETCPUINFO call leaving OP PMCINFO to return only the
dynamic information associated with a PMC (i.e., whether enabled,
owner pid, reload count etc.).
Allow pmc_read() (i.e., OPS PMCRW) on active self-attached PMCs to
get upto-date values from hardware since we can guarantee that the
hardware is running the correct PMC at the time of the call.
Bug fixes:
- (x86 class processors) Fix a bug that prevented an RDPMC
instruction from being recognized as permitted till after the
attached process had context switched out and back in again after
a pmc_start() call.
Tighten the rules for using RDPMC class instructions: a GETMSR
OP is now allowed only after an OP ATTACH has been done by the
PMC's owner to itself. OP GETMSR is not allowed for PMCs that
track descendants, for PMCs attached to processes other than
their owner processes.
- (P4/HTT processors only) Fix a bug that caused the MI and MD
layers to get out of sync. Add a new MD operation 'get_config()'
as part of this fix.
- Allow multiple system-mode PMCs at the same row-index but on
different CPUs to be allocated.
- Reject allocation of an administratively disabled PMC.
Misc. code cleanups and refactoring. Improve a few comments.
any query.
- don't query against IPv6 link-local address.
- use IN6_IS_ADDR_V4{MAPPED,COMPAT} macros.
- use memcpy() instead of bcopy().
Inspired by: NetBSD
These are two of the three files that have non-trivial differences from
the vendor branch. minigzip.c is the third, but there were no changes
from ZLib 1.2.1 to ZLib 1.2.2 in that file.
The rest of the files I intend to get reverted back to the vendor
branch (with cooperation of cvsadmin@).
PR: i386/76294
internal error if pax extended attributes were being generated. Being
< 255 characters, the first-pass path editing (to generate a
ustar-compatible name for the main entry) wouldn't occur, and the
second-pass path editing (to generate a ustar name for the pax
attributes entry) assumed the input was already < 245 chars.
The core problem here was using an abbreviated algorithm for the
second pass that relied on the first pass having already run. The
rewritten code is much simpler: It just uses the full path-shortening
algorithm for building both ustar pathnames. This way, the second
ustar pathname will always be short enough.
Thanks to: Mark Cammidge
Related to: bin/74385
us when <sys/pmc.h> is included.
o Replace "#if __i386__" and "#if __amd64__" with the equivalent of
"#ifdef __i386__" and "#ifdef __amd64__" (resp.) These tokens are
not defined on all platforms.
o Conditionally compile pmc_parse_mask() on i386 and amd64 only. It's
only referenced there. This will change when support for other
platforms is added, of course.
Ok'd by: jkoshy@
getnameinfo(3). POSIX standard does not require a sa_len field
in sockaddr struct, hence such requirement will cause problem
for portability.
PR: standards/80008
Requested by: Xin Liu <lx@knight.6test.edu.cn>
Reviewed by: freebsd-standards (das)
MFC After: 2 weeks
check the password or group database before attempting to parse as an
integer, as is done for the first {uid,gid} in an identity phrase.
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
a return instruction. (The latter is discouraged by the Opteron
optimization manual because it disables branch prediction for the return
instruction.)
Reviewed by: bde
* Handles entries with compressed size >2GB (signed/unsigned cleanup)
* Handles entries with compressed size >4GB ("ZIP64" extension)
* Handles Unix extensions (ctime, atime, mtime, mode, uid, etc)
* Format-specific "skip data" override allows ZIP reader to skip
entries without decompressing them, which makes "tar -t"
a lot faster.
* Handles "length-at-end" entries generated by, e.g., "zip -r - foo"
Many thanks to: Dan Nelson, who contributed the code and test files for
the first three items above and suggested the fourth.
libalias.
In /usr/src/lib/libalias/alias.c, the functions LibAliasIn and
LibAliasOutTry call the legacy PacketAliasIn/PacketAliasOut instead
of LibAliasIn/LibAliasOut when the PKT_ALIAS_REVERSE option is set.
In this case, the context variable "la" gets lost because the legacy
compatibility routines expect "la" to be global. This was obviously
an oversight when rewriting the PacketAlias* functions to the
LibAlias* functions.
The fix (as shown in the patch below) is to remove the legacy
subroutine calls and replace with the new ones using the "la" struct
as the first arg.
Submitted by: Gil Kloepfer <fgil@kloepfer.org>
Confirmed by: <nicolai@catpipe.net>
PR: 76839
MFC after: 3 days
implementations inspired by the ones in DragonFly. Unlike the
DragonFly versions, these have a small data cache footprint, and my
tests show that they're never slower than the old code except when the
charset or the span is 0 or 1 characters. This implementation is
generally faster than DragonFly until either the charset or the span
gets in the ballpark of 32 to 64 characters.
these at the moment, but applications that test for them will now
have a better chance of compiling.
I have intentionally omitted errnos that are only good for STREAMS,
since apps that use STREAMS won't compile anyway. The exception is
EPROTO, which was apparently intended for STREAMS, but worth having
anyway because Linux (mis)uses it for other things.
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works ( using signal ).
4. thread synchronization is fully based on umtx, mainly, condition
variable and other synchronization objects were rewritten by using
umtx directly. those objects can be shared between processes via
shared memory, it has to change ABI which does not happen yet.
5. default stack size is increased to 1M on 32 bits platform, 2M for
64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.
Okayed by: jeff, mtm, rwatson, scottl
floating-point arithmetic on i386. Now I'm going to make excuses
for why this code is kinda scary:
- To avoid breaking the ABI with 5.3-RELEASE, we can't change
sizeof(fenv_t). I stuck the saved mxcsr in some discontiguous
reserved bits in the existing structure.
- Attempting to access the mxcsr on older processors results
in an illegal instruction exception, so support for SSE must
be detected at runtime. (The extra baggage is optimized away
if either the application or libm is compiled with -msse{,2}.)
I didn't run tests to ensure that this doesn't SIGILL on older 486's
lacking the cpuid instruction or on other processors lacking SSE.
Results from running the fenv regression test on these processors
would be appreciated. (You'll need to compile the test with
-DNO_STRICT_DFL_ENV.) If you have an 80386, or if your processor
supports SSE but the kernel didn't enable it, then you're probably out
of luck.
Also, I un-inlined some of the functions that grew larger as a result
of this change, moving them from fenv.h to fenv.c.
fedisableexcept(), and fegetexcept(). These two sets of routines
provide the same functionality. I implemented the former as an
undocumented internal interface to make the regression test easier to
write. However, fe(enable|disable|get)except() is already part of
glibc, and I would like to avoid gratuitous differences. The only
major flaw in the glibc API is that there's no good way to report
errors on processors that don't support all the unmasked exceptions.
to mistakes from day 1, it has always had semantics inconsistent with
SVR4 and its successors. In particular, given argument M:
- On Solaris and FreeBSD/{alpha,sparc64}, it clobbers the old flags
and *sets* the new flag word to M. (NetBSD, too?)
- On FreeBSD/{amd64,i386}, it *clears* the flags that are specified in M
and leaves the remaining flags unchanged (modulo a small bug on amd64.)
- On FreeBSD/ia64, it is not implemented.
There is no way to fix fpsetsticky() to DTRT for both old FreeBSD apps
and apps ported from other operating systems, so the best approach
seems to be to kill the function and fix any apps that break. I
couldn't find any ports that use it, and any such ports would already
be broken on FreeBSD/ia64 and Linux anyway.
By the way, the routine has always been undocumented in FreeBSD,
except for an MLINK to a manpage that doesn't describe it. This
manpage has stated since 5.3-RELEASE that the functions it describes
are deprecated, so that must mean that functions that it is *supposed*
to describe but doesn't are even *more* deprecated. ;-)
Note that fpresetsticky() has been retained on FreeBSD/i386. As far
as I can tell, no other operating systems or ports of FreeBSD
implement it, so there's nothing for it to be inconsistent with.
PR: 75862
Suggested by: bde
particularly good reason to do this, except that __strong_reference
does type checking, whereas __weak_reference does not.
On Alpha, the compiler won't accept a 'long double' parameter in
place of a 'double' parameter even thought the two types are
identical.
an invalid exception and return an NaN.
- If a long double has 113 bits of precision, implement fma in terms
of simple long double arithmetic instead of complicated double arithmetic.
- If a long double is the same as a double, alias fma as fmal.
identical to scalbnf, which is now aliased as ldexpf. Note that the
old implementations made the mistake of setting errno and were the
only libm routines to do so.
- Add nexttoward{,f,l} and nextafterl. On all platforms,
nexttowardl is an alias for nextafterl.
- Add fmal.
- Add man pages for new routines: fmal, nextafterl,
nexttoward{,f,l}, scalb{,l}nl.
Note that on platforms where long double is the same as double, we
generally just alias the double versions of the routines, since doing
so avoids extra work on the source code level and redundant code in
the binary. In particular:
ldbl53 ldbl64/113
fmal s_fma.c s_fmal.c
ldexpl s_scalbn.c s_scalbnl.c
nextafterl s_nextafter.c s_nextafterl.c
nexttoward s_nextafter.c s_nexttoward.c
nexttowardf s_nexttowardf.c s_nexttowardf.c
nexttowardl s_nextafter.c s_nextafterl.c
scalbnl s_scalbn.c s_scalbnl.c
sparc64's 128-bit long doubles.
- Define FP_FAST_FMAL for ia64.
- Prototypes for fmal, frexpl, ldexpl, nextafterl, nexttoward{,f,l},
scalblnl, and scalbnl.
- In scalbln and scalblnf, check the bounds of the second argument.
This is probably unnecessary, but strictly speaking, we should
report an error if someone tries to compute scalbln(x, INT_MAX + 1ll).
nexttowardl. These are not needed on machines where long doubles
look like IEEE-754 doubles, so the implementation only supports
the usual long double formats with 15-bit exponents.
Anything bizarre, such as machines where floating-point and integer
data have different endianness, will cause problems. This is the case
with big endian ia64 according to libc/ia64/_fpmath.h. Please contact
me if you managed to get a machine running this way.
that are intended to raise underflow and inexact exceptions.
- On systems where long double is the same as double, nextafter
should be aliased as nexttoward, nexttowardl, and nextafterl.
bit in a long double. For architectures that don't have such a bit,
LDBL_NBIT is 0. This makes it possible to say `mantissa & ~LDBL_NBIT'
in places that previously used an #ifdef to select the right expression.
The optimizer should dispense with the extra arithmetic when LDBL_NBIT
is 0 anyway.
- Add an XXX comment for the big endian case.
bit in a long double. For architectures that don't have such a bit,
LDBL_NBIT is 0. This makes it possible to say `mantissa & ~LDBL_NBIT'
in places that previously used an #ifdef to select the right expression.
The optimizer should dispense with the extra arithmetic when LDBL_NBIT
is 0.
Symptoms of the problem included assembler warnings and
nondeterministic runtime behavior when a fe*() call that affects the
fpsr is closely followed by a float point op.
The bug (at least, I think it's a bug) is that gcc does not insert a
break between a volatile asm and a dependent instruction if the
volatile asm came from an inlined function. Volatile asms seem to be
fine in other circumstances, even without -mvolatile-asm-stop, so
perhaps the compiler adds the stop bits before inlining takes place.
The problem does not occur at -O0 because inlining is disabled, and it
doesn't happen at -O2 because -fschedule-insns2 knows better.
a libalias application (e.g. natd, ppp, etc.) to crash. Note: Skinny support
is not enabled in natd or ppp by default.
Approved by: secteam (nectar)
MFC after: 1 day
Secuiryt: This fixes a remote DoS exploit
any pending HTTP request rather than calling shutdown(2) with SHUT_WR.
This makes libfetch (and thus fetch(1)) work again with Squid proxies
configured to not allow half-closed connections.
Reported by: Pawel Worach (pawel.worach AT telia DOT com)
the lock is held by other thread, but not when nobody owns it. According
to deischen@, this part of code will never be hit in our threads
library, since it does not use locks without wait/wakeup functions.
Spotted by: mingyanguo via ChinaUnix.net forum
Reviewed by: deischen
surrounding the undef'ing it. It does not seem necessary to
undef some symbol that is not exist, and gcc does not complain
about whether a symbol is exist before #undef'ing it out.
Spotted by: mingyanguo via ChinaUnix.net forum
Reviewed by: phk
it type and endian clean and removing of stdio dependency from NLS
functions (catalog files now are processed via mmap())
Also following changes were done (against NetBSD version):
. If mmap() failed, set errno to EINVAL and do not try to munmap() file
Obtained from: NetBSD
. Replace inclusion of sys/param.h to sys/cdefs.h and sys/types.h where
appropriate.
. move _*_init() prototypes to mblocal.h, and remove these prototypes
from .c files
. use _none_init() in __setrunelocale() instead of duplicating code
. move __mb* variables from table.c to none.c allowing us to not to
export _none_*() externs, and appropriately remove them from mblocal.h
Ok'ed by: tjr
introducing the disk formats for _RuneLocale and friends.
The disk formats do not have (useless) pointers and have 32-bit
quantities instead of rune_t and long. (htonl(3) only works
with 32-bit quantities, so there's no loss).
Bootstrap mklocale(1) when necessary. (Bootstrapping from 4.x
would be trivial (verified), but we no longer provide pre-5.3
source upgrades and this is the first commit to actually break
it.)
inputs. The trouble with replacing two floats with a double is that
the latter has 6 extra bits of precision, which actually hurts
accuracy in many cases. All of the constants are optimal when float
arithmetic is used, and would need to be recomputed to do this right.
Noticed by: bde (ucbtest)
return a generic text message instead.
(Someday, I'll track down all the places that
are generating errors but not recording messages. ;-/
Thanks to: Jaakko Heinonen
results in a performance gain on the order of 10% for amd64 (sledge),
ia64 (pluto1), i386+SSE (Pentium 4), and sparc64 (panther), and a
negligible improvement for i386 without SSE. (The i386 port still
uses the hardware instruction, though.)
changed to use the statclock. Make sure we calculate the value
of a tick correctly in userland.
Noticed by: Kazuaki Oda <kaakun at highway dot ne dot jp>
occurred with large read-ahead requests. This only affected
formats that incorrectly make large requests (ZIP did this until
recently) or with block sizes over 32k.
When reading the bodies of Zip archive entries, request a minimum of 1
byte, rather than a minimum of the full entry size. This is faster
(since it does not force the decompression layer to combine reads) and
works around a bug in the "none" decompression handler (which I'm
testing a separate fix for now). I've also renamed "bytes_read" to
"bytes_avail" in several places to more accurately reflect that the
value returned from (a->compression_read_ahead) is the number of bytes
available, not necessarily the number of bytes requested.
(padding) entries, extract inode value from PX entry, recognize SP and
ST (start/end of SUSP extensions).
I don't enforce SP yet, as I've seen CDROMs which use Rockridge
extensions but don't have the SP record (which is officially
required).
The ISO9660 support is now mature enough to extract FreeBSD
distribution CDROMs created with mkisofs.
- Remove all:. It's redundant, and ${LIB} in it is just a bug.
- Remove .ORDER:. *.mgc files can safely be built in parallel.
- Remove PITA. The mkmagic tool is smart to put the binary file
into the current directory (${.OBJDIR}) even if the source file
lives somewhere else, which is just what we need.
manpages. They are not very related, so separating them makes it
easier to add meaningful cross-references and extend some of the
descriptions.
- Move the part of math(3) that discusses IEEE 754 to the ieee(3)
manpage.
Only supports "deflate" and "none" compression for now.
Also, add a few clarifications to the archive_read.3 manpage as
requested by William Dean DeVries.
copy the acquired TGT from the in-memory cache to the on-disk cache
at login. This was documented but un-implemented behavior.
MFC after: 1 week
PR: bin/64464
Reported and tested by: Eric van Gyzen <vangyzen at stat dot duke dot edu>
FreeBSD currently implements the most up to date IPv6 APIs for
option and route header parsing. This checkin marks the older APIs
as deprecated and points the reader to the newer pages.
Reviewed by: Jun-ichiro Itojun
Approved by: rwatson (mentor)
- Rearrange the list of functions into categories.
- Remove the ulps column. It was appropriate for only some
of the functions in the list, and correct for even fewer
of them.
- Add some new paragraphs, and remove some old ones about
NaNs that may do more harm than good.
- Document precisions other than double-precision.
- Although ldexp() is in libc for backwards compatibility, ldexpf() is
in its proper place in libm. Document both as being in libm.
- The ldexp() and ldexpf() functions conform to C99.
- Neither frexp() nor frexpf() set errno.
- Although frexp() is in libc for backwards compatibility, frexpf() is
in its proper place in libm. Document both as being in libm.
- The frexp() and frexpf() functions conform to C99.
Reviewed by: Kame Project (including Itojun-san, Jinmei-san and Suzuki-san)
Approved by: Robert Watson (robert at freebsd dot org)
Obtained from: Kame Project and OpenBSD
Replace manual pages that may have violated the IETF's Copyright.
All come from the Kame tree.
Several were from OpenBSD except for ip6.4, and the inet6* pages which were
rewritten by me.
All of the text is new and drawn from reading the code and
documentation.
Approved by: Robert Watson (robert at freebsd dot org)
Remove files in preparation for replacement with totally new versions
of the manual pages.
Update the Makefile to handle the new file to be added.
scalbn() implementation from libm. (The two functions are defined to
be identical, but ldexp() lives in libc for backwards compatibility.)
The old ldexp() implementation...
- was more complicated than this one
- set errno instead of raising FP exceptions
- got some corner cases wrong
(e.g. ldexp(1.0, 2000) in round-to-zero mode)
The new implementation lives in libc/gen instead of
libc/$MACHINE_ARCH/gen, since we don't need N copies of a
machine-independent file. The amd64 and i386 platforms
retain their fast and correct MD implementations and
override this one.
really so.
"If the value of base is 16, the characters 0x or 0X may optionally
precede the sequence of letters and digits, following the sign if
present."
Found by: joerg
instead use the FPU to convert subnormals to normals. (NB: Further
simplification is possible, such as using the FPU for the rounding
step.)
This fixes a bug reported by stefanf where long double subnormals in
the Intel 80-bit format would be output with one fewer digit than
necessary when the default precision was used.
for libarchive error messages. Mostly, this
avoids a portability headache related to
copying va_list arguments (some FreeBSD 5
platforms require va_copy; FreeBSD 4 doesn't
support va_copy at all). It also dramatically reduces the
size of libarchive for embedded applications:
a minimal "untar" program using libarchive can now be
under 64k statically linked (as opposed to ~100k
using library *printf() functions).
MFC after: 14 days
Eliminate gdtoa.mk and move its contents to ${MACHINE_ARCH}/Makefile.inc.
The purpose of having a separate file involved an abandoned scheme that
would have kept contrib/gdtoa out of the include path for the rest of libc.
flags, so they are not pure. Remove the __pure2 annotation from them.
I believe that the following routines and their float and long double
counterparts are the only ones here that can be __pure2:
copysign is* fabs finite fmax fmin fpclassify ilogb nan signbit
When gcc supports FENV_ACCESS, perhaps there will be a new annotation
that allows the other functions to be considered pure when FENV_ACCESS
is off.
Discussed with: bde
basically support this, subject to gcc's lack of FENV_ACCESS support.
In any case, the previous setting of math_errhandling to 0 is not
allowed by POSIX.
registers as volatile. Instructions that *wrote* to FP state were
already marked volatile, but apparently gcc has license to move
non-volatile asms past volatile asms. This broke amd64's feupdateenv
at -O2 due to a WAR conflict between fnstsw and fldenv there.
GNU) for determining whether a string is an affirmative or negative
response to a question according to the current locale. This is done
by matching the response against nl_langinfo(3) items YESEXPR and NOEXPR.
make utilities like du(1) 64bit-clean.
When this field is used, one cannot use 'fts_number' and 'fts_pointer'
fields.
This commit doesn't break API nor ABI.
This work is part of the BigDisk project:
http://www.FreeBSD.org/projects/bigdisk/
Discussed on: arch@
MFC after: 5 days
which doesn't end in \n, since it may be very confusing. Also this should
increase consistency, since most other config files work just fine regardless
of the presence of traling \n in the last line.
MFC After: 2 weeks
* Reference-count the directory data so that
we don't leak memory.
* Correctly step through the directory records
(skipping unrecognized extensions)
* Use better defaults for file modes
* Sort directory entries by offset of the end of the file
rather than the beginning of the file. This fixes a
lot of "out-of-order" problems with zero-length files,
in particular.
* Style fixes, remove some debug code, add some error messages.
This seems to be able to extract a TOC and extract files from
the couple of ISO images I've tested it with.
Treat this as experimental proof-of-concept code for the
moment. There are still a bunch of debug messages (there
are a few oddities in ISO9660 that I haven't yet figured
out how to handle), a lot of bugs to be addressed (this
code leaks memory very badly), and a lot of missing features (no
Rockridge support, in particular). I'd appreciate
feedback from anyone who understands ISO9660 format
better than I do. ;-)
Suggested by: Robert Watson
the regular ustar entry. The old code sometimes created
a too-long name that overflowed the ustar fields and triggered
an internal assertion failure. This version should be more
robust.
Thanks to: Michal Listos
Fixes: bin/74385
MFC after: 15 days
a "null pointer".''
Making good use of the excellent explanations sent to me by Ruslan
Ermilov, Garrett Wollman and Bruce Evans, correct the descriptions of
null pointers. They are just "null pointers", not nil, not NULL or
".Dv NULL".
Suggested by: ru, wollman, bde
Reviewed by: ru, wollman
Pointy hat: keramida
ustar fields. Later, we're going to permit numeric extensions
for these fields, so we can support large values here. In particular,
this allows GNU tar to correctly extract such entries even
though it doesn't support the pax extended attributes.
Note: r1.18 and r1.17.2.1 of this file allowed similar treatment
of the uid/gid fields.
Thanks to: Ben Mesander
improves the recognition of hardlink entries
with/without bodies (which is implemented through
a look-ahead that uses the bid function).
MFC after: 7 days
signals instead of having more intricate knowledge of thread state
within signal handling.
Simplify signal code because of above (by David Xu).
Use macros for libpthread usage of pthread_cleanup_push() and
pthread_cleanup_pop(). This removes some instances of malloc()
and free() from the semaphore and pthread_once() implementations.
When single threaded and forking(), make sure that the current
thread's signal mask is inherited by the forked thread.
Use private mutexes for libc and libpthread. Signals are
deferred while threads hold private mutexes. This fix also
breaks www/linuxpluginwrapper; a patch that fixes it is at
http://people.freebsd.org/~deischen/kse/linuxpluginwrapper.diff
Fix race condition in condition variables where handling a
signal (pthread_kill() or kill()) may not see a wakeup
(pthread_cond_signal() or pthread_cond_broadcast()).
In collaboration with: davidxu
sigdelset(3) and sigismember(3) were killed about five years ago.
o The functions (specifically sigismember(3)) could return -1 and
set errno.
PR: bin/75156
Obtained from: NetBSD
MFC after: 2 weeks
o Bump the date of the document.
device numbers. In particular, this should fix
a bug where archiving a device node with a very
large minor number would sometimes overflow and
corrupt the major number.
Thanks to: Ben Mesander
MFC after: 7 days
http://www.opengroup.org/onlinepubs/009695399/functions/swab.html
the prototype for swab() should be in <unistd.h> and not in <string.h>.
Move it, and update to match SUS. Leave the prototype in string.h for
now, for backwards compat.
PR: 74751
Submitted by: Craig Rodrigues <rodrigc@crodrigues.org>
Discussed with: das
ID ranges that consist of exactly one attribute ID. libsdp(3) will check start
and end of the attribute ID range and if they are the same the range will be
collapsed to one atribute ID.
The problem was observed on Audiovox SMT5600 and Palm Treo 650.
MFC after: 3 days
regular 'ustar' entry, use narrow-character version,
not wide-character version, as the ustar entry always
uses the narrow-character filename.
Thanks to: Michal Listos
Inspired by, but doesn't fix: bin/74385
operation (by subtracting the absolute result from 0), don't test
for overflow.
This avoids an arithmetic exception when dividing LONG_MIN by 1:
This is the only case that causes overflow, and the resulting value
is correct under 2's compliment arithmetic.
PR: 72024
Approved by: dwmalone@
Obtained from: NetBSD
MFC after: 4 days
reading past 'stop' in various places when converting multibyte characters.
Reading too far caused truncation to not be detected when it should have
been, eventually causing regexec() to loop infinitely in with certain
combinations of patterns and strings in multibyte locales.
PR: 74020
MFC after: 4 weeks
is supported.
-Document the new more preferred syntax
-Add examples for the new syntax
-Add a note that the old syntax will be deprecated in the future.
Reviewed by: rwatson
the the pax attributes, I shouldn't try using the public
API for finishing out the attribute entry, either.
This also removes some old dubious state manipulations.
because the code was using the external API
(archive_write_data) and assuming internal
error-return conventions. Use the internal
API for writing data.
Thanks to: Joe Marcus Clarke
If turned on no NIS support and related programs will be built.
Lost parts rediscovered by: Danny Braniss <danny at cs.huji.ac.il>
PR: bin/68303
No objections: des, gshapiro, nectar
Reviewed by: ru
Approved by: rwatson (mentor)
MFC after: 2 weeks
seed, the random number generator rand(3) still sucks and is unlikely
sufficient for crypto use. Correct what appears to be a cut and paste
error from the srandomdev() man page.
Submitted by: Ben Mesander
run as a 32 bit support library for an amd64 kernel. 32 bit consumers of
libthr have zero chance of running on an amd64 kernel since we don't
implement the i386_set_ldt() family of functions. Note that this commit
doesn't make it actually work, it just removes one more obstacle.
can't use the i386_set_ldt() family of routines, because they are not
implemented. Instead, use the recently exposed direct access sysarch
routines for setting what %fs and %gs point to.
Use this for the i386 TLS _set_tp() routine, but only when compiling to
run as a 32 bit support binary for amd64 kernels.
syslog(3) if we are a priveleged program (sshd, su, etc.).
- Make syslogd open an additional socket /var/run/logpriv, with 0600
permissions.
- In libc, try to use this socket.
- Do not loop forever if we are using this socket (partial backout of 1.31)
Reviewed by: dwmalone, Andrea Campi <andrea webcom it>
Approved by: julian (mentor)
MFC after: 1 month
_(use space as padding), and 0(zero padding).
These GNU extensions are widely used ones that is worthy for us to
have.
Discussed with: stefanf, roam, -current
Approved by: murray
Prodded by: ports/72722, ports/72723
MFC After: 1 month
packages expect and seems to be most correct according to the slightly-
ambiguous standards.
MFC after: 1 month
Corroborated by: POSIX <http://tinyurl.com/4uvub>
Reviewed by: silence on threads@
the GPT partition on i386 and adm64 as type=gpt, subtype=0 and with the
sname set to the UUID. This prevents sysinstall from bombing out. This
also makes sure the GPT partition shows up in sysinstall so as to avoid
accidental "clobberage".
PR: bin/72896
file. In particular, this allows bsdtar to append (-r) to
an empty file.
Thanks to: Ryan Sommers
While I'm here, straighten out a misleading comment about GNU-compatible
sparse file handling.
put DEAD thread on GC list, this closes a race between pthread_join
and thr_cleanup.
2. Introduce a mutex to protect tcb initialization, tls allocation and
deallocation code in rtld seems no lock protection or it is broken,
under stress testing, memory is corrupted.
Reviewed by: deischen
patch partly provided by: deischen
probably means it also requires a .so version
bump. Defer it until I finish some related
work on cleaning up error returns throughout
the library.
Thanks to: Conrad J. Sabatier
no userland locks are heald, the dead thread lock can no longer protect
access to it. Therefore, instead of using an if (!dead)...else clause
after walking the active threads list test the thread pointer before
deciding not to walk the dead threads list. If the thread pointer is null
it means it was not found in the active threads list and the dead threads
list should be checked.
2. Do not free the stack of a thread that is not marked dead. This is the
2nd and final part of eliminating the race to free a thread's stack.
MFC after: 3 days
Extract the struct cdev pointer and the tty device from inside rather than
incorrectly casting the 'struct cdev *' pointer to a 'dev_t' int. Not
that this was particularly important since it was only used for reading
vmcore files.
- Make some minor rearrangements in the introduction.
- Mention the problem with argument reduction on i386.
- Add recently-implemented functions to the table.
- Un-document the error bounds that only apply to the old 4BSD math
library, and fill in the correct values where I know them. No
attempt has been made to document bounds lower than 1 ulp, although
smaller bounds are usually achievable in round-to-nearest mode.
Requested by: bde
o Remove unneeded sys/types.h and netinet/in.h from the synopsis and
the example.
o We do have struct in_addr in arpa/inet.h, so no need for netinet/in.h.
o Mention where AF_* constants defined are.
Educated by: bde
- Add a comment noting that the ru_[us]times values being read aren't
actually valid and need to be computed from the raw values.
Submitted by: many (1)
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.
Discussed with: davidxu, deischen, marcel
MFC after: 3 days
/lib/{libm,libreadline}
/usr/lib/{libhistory,libopie,libpcap}
in preparation for doing the same thing to RELENG_5. HUGE amounts of
help for determining what to bump provided by kris.
Discussed on: freebsd-current
Approved by: re (not required for commit but something like this should be)
the signal mask and pending signals of the calling thread. These
are stored in userland in libpthread.
There is a small race condition in this patch which could cause
problems if a signal arrives after setting the (kernel) signal
mask and before exec'ing. The thread's set of pending signals
also are not yet installed in the exec'd process. Both of these
will be corrected with the addition of a special syscall.
Reported & Tested by: Joost Bekkers <joost at jodocus dot org>
Reviewed by: julian, davidxu
1. Install man files and links for the lwres library.
2. Fix the path in various files to say /etc/namedb/ instead of just /etc.
3. Correctly install the conf file man pages for named and rndc.
to match how similar syntax is used in the ports system. Thanks to kris
for pointing out my mistake here.
Install the lwres library unless the user defines NO_BIND, or the new
knob, NO_BIND_LIBS_LWRES. There is at least one potential customer
for this library in the wings. Thanks to nectar for the reminder.
but have a knob (WANT_BIND_LIBS) to build and install them in /usr/lib
and /usr/include. Rumors are that this may be useful at a later point,
let's see.
What this really means is that all BIND libraries are now internal to
buildworld (by default, unless WANT_BIND_LIBS is defined), and linked
statically into various BIND executables.
While here, removed redundant -I's from CFLAGS in lib/bind makefiles.
Sponsored by: des
OK'ed by: dougb
__isnan() and __isnanf() must remain in libc for hysterical raisins.
On the other hand, __isnanl() must live in libm because libm uses it
internally and can't depend on older versions of libc to provide it.
Fortunately, we don't need __isnanl() in both libraries.
Prodded by: ale
PR: 71698
MT5 candidate
specified mutex is invalid. In spec parlance 'MAY FAIL' means it's
up to the implementor. So, remove the check for NULL pointers for two
reasons:
1. A mutex may be invalid without necessarily being NULL.
2. If the pointer to the mutex is NULL core-dumping in the
vicinity of the problem is much much much better than failing
in some other part of the code (especially when the application
doesn't check the return value of the function that you oh so
helpfully set to EINVAL).
full KSE support still have -lpthread as an alias for -lc_r. The only
thing that's different is the name of the knob that turns it off.
Pointed out by: ru@
POSIX threads libraries are not available. Add crypto support if
the crypto libraries are available. Build dnssec-{keygen,signzone}
if crypto is available.
Submitted by: (in part) dougb@
1. The correct cutoff for large uid/gid handling is 1<<18, not 1<<20.
2. Limit the uid/gid in the 'x' extension header (where numeric extensions
are not permitted) to 1<<18, but use the correct value in the regular
header (where numeric extensions are permitted).
Thanks to: Dan Nelson
MFC after: 3 days
caused by refering broken (uninitialized?) pointer which is retrieved
from __bt_new() (and from mpool_new()).
I don't know why this linp[0] is read before stored because this
should be controlled by .lower and .upper member of PAGE structure
which are correctly initialized.
But this workaround fixes the problem on my environment and this
module has #ifdef PURIFY option which initializes new and reused
memory from mpool by memset(p, 0xff, size) like as I did.
Please feel free to fix the real bug instead of my workaround.
transaction id from the request, this is useful for debugging.
Fix the autoh_freeall(3) function to properly free the array of
auto handles. Before it was freeing individual members of the list
OK, however it was then advancing the pointer and freeing the wrong
data for the whole list.
multibyte character support:
- In CHadd(), avoid writing past the end of the character set bitmap when
the opposite-case counterpart of wide characters with values less than
NC have values greater than or equal to NC.
- In CHaddtype(), fix a braino that caused alphabetic characters to be
added to all character classes! (but only with REG_ICASE)
PR: 71367
but with slightly cleaned up interfaces.
The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.
The KSE (or td_sched) structure is now allocated per thread and has no
allocation code of its own.
Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.
Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.
The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.
A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.
Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.
Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by: scottl, peter
MFC after: 1 week
denote a directory. Unfortunately, in the presence of GNU or POSIX
extensions, this code was checking the truncated filename stored in the
regular header rather than the full filename stored in the extended
attribute. As a result, long filenames with '/' in just the right
position would trigger this check and be erroneously marked as
directories. Move the check so it only considers the full filename.
Note: the check can't simply be disabled for archives that contain
these extensions because there are some very broken archivers out
there.
Thanks to: Will Froning
MFC after: 3 days
since otherwise the initial seek offset will contain the directory
offset of the filesystem block that contained its directory entry.
This bug was mostly harmless because typically the directory is
less than one filesystem block in size so the offset would be zero.
It did however generally break loading a kernel from the (large)
kernel compile directory.
Also reset the seek pointer when a new inode is opened in read_inode(),
though this is not actually necessary now because all callers set
it afterwards.
By using r8 instead of r14 to do the swap, we put the dst argument
in the return register. Since bcopy() doesn't clobber r8, we don't
have to do anything else. This fixes ports/textproc/aspell.
documenting the obsoleteness of the msync(2) syscall and its single
remaining purpose.
PR: 70916
Submitted by: Radim Kolar <hsn@netmag.cz>
MFC after: 3 days
.h files. This simplifies the Makefile here a bit and makes it behave
better in a couple of situations. While I'm here, clean up some comments
and try to improve the organization a bit.
Thanks to: Ruslan Ermilov (The Marvelous Makefile Guru)
mpool_open(3) - it is *not* really used for synchronization; in fact,
it is not used at all.
PR: 70929
Submitted by: Martin Kammerhofer <dada@sbox.tugraz.at>
MFC after: 3 days
This should provide a big performance boost for folks using NIS or LDAP.
MFC after: 3 days
Thanks to: Jun Kuriyama (for reminding me that this was still on my TODO list)
This closes a security hole. Otherwise, libarchive will happily
extract into directories to which it lacks write permissions by
resetting the permissions during the extract.
Thanks to: Kris Kennaway
_mcount() stub when profiling is enabled. Emit this code sequence
for assembly routines as welli (MCOUNT definition in <machine/asm.h>.
We do not pass the GOT entry however as the 4th argument, because it's
not used. The _mcount() stub calls __mcount(), which does the actual
work. Define _MCOUNT_DECL to define __mcount. We do not have an
implementation of mcount(), so we define MCOUNT as empty, but have a
weak alias to _mcount() in _mcount.S.
Note that the _mcount() stub in the kernel is slightly different from
the stub in userland. This is because we do not have to worry about
nested routines in the kernel.
64 bit systems, years roughly -2^31 through 2^31 can be represented in
time_t without any trouble. 32 bit time_t systems only range from
roughly 1902 through 2038. As a consequence, none of the date munging
code for all the various calendar tweaks before then is present. There
are other problems including the fact that there was no 'year zero' and
so on. So rather than get excited about trying to figure out when the
calendar jumped by two weeks etc, simply disallow negative (ie: prior to
1900) years.
This happens to have an important side effect. If you bzero a 'struct
tm', it corresponds to 'Jan 0, 1900, 00:00 GMT'. This happens to be
representable (after canonification) in 64 bit time_t space. Zero tm
structs are generally an error and mktime normally returns -1 for them.
Interestingly, it tries to canonify the 'jan 0' to 'dec 31, 1899', ie:
year -1. This conveniently trips the negative year test above, which
means we can trivially detect the null 'tm' struct.
This actually tripped up code at work. :-/ (Don't ask)
GP register, because it's clobbered for calls across load modules. The
previous commit inserted the call to _init_tls() between the call to
atexit() and the restoration of the GP register clobbered by it. Fix:
restore GP before we call _init_tls().
Pointy hat: dfr@
19 column positions wide in the first line and 20 in the rest of the lines.
This fixes the example to provide the correct output.
PR: 53454
Noticed by: Kuang-che Wu <kcwu@kcwu.homeip.net>
Submitted by: Marc Silver <marcs@draenor.org>
Approved by: re (scottl)
simple errx() function.
Improve behavior when bzlib/zlib are missing by detecting and
issuing an error message on attempts to read gzip/bzip2 compressed
archives.
a knob to force process scope threads. If the environment variable
LIBPTHREAD_PROCESS_SCOPE is set, force all threads to be process
scope threads regardless of how the application creates them. If
LIBPTHREAD_SYSTEM_SCOPE is set (forcing system scope threads), it
overrides LIBPTHREAD_PROCESS_SCOPE.
$ # To force system scope threads
$ LIBPTHREAD_SYSTEM_SCOPE=anything threaded_app
$ # To force process scope threads
$ LIBPTHREAD_PROCESS_SCOPE=anything threaded_app
if it is true.
2.Add thread_db api td_thr_tls_get_addr to get tls address, the real code
is commented out util tls patch is committed.
Reviewed by: deischen
and close it) and "finish" (destroy the object) functions. For backwards
compat and simplicity, have "finish" invoke "close" transparently if needed.
This allows clients to close the archive and check end-of-operation
statistics before destroying the object.
LIBPTHREAD_SYSTEM_SCOPE in the environment.
You can still force libpthread to be built in strictly 1:1 by
adding -DSYSTEM_SCOPE_ONLY to CFLAGS. This is kept for archs
that don't yet support M:N mode.
Requested by: rwatson
Reviewed by: davidxu
is present for FreeBSD. If you "make distfile" on FreeBSD, you will
soon have a tar.gz file suitable for deploying to other systems
(complete with the expected "configure" script, etc). This latter
relies (at least for now) on the GNU auto??? tools. (I like autoconf
okay, but someday I hope to write a custom Makefile.in and dispense
with automake, which is somewhat odious.)
As part of this, I've cleaned up some of the conditional
compilation options, added make-foo to construct archive.h dynamically
(it now contains some version constants), and added some useful
informational files.
mode bits when setting permissions from ACL data.
Thanks to: David Gilbert for first reporting this and
Jimmy Olgeni for noticing that it only occurred on
ACL-enabled filesystems.
of releases. The -DNOCRYPT build option still exists for anyone who
really wants to build non-cryptographic binaries, but the "crypto"
release distribution is now part of "base", and anyone installing from a
release will get cryptographic binaries.
Approved by: re (scottl), markm
Discussed on: freebsd-current, in late April 2004
the chunk will never be added to the list in that case. Use type mbr
for GPT nested MBRs and use type part for any partition we don't know
or care about. Since the subtype is 0, this should not cause confusion.
libc. The externally-visible effect of this is to add __isnanl() to
libm, which means that libm.so.2 can once again link against libc.so.4
when LD_BIND_NOW is set. This was broken by the addition of fdiml(),
which calls __isnanl().
functions. Basically, the ip_next() function was used to get the PPTP and
Skinny headers when tcp_next() should have been used instead. Symptoms of
this included a segfault in natd when trying to process a PPTP or Skinny
packet.
Approved by: des
example. The externs haven't been needed in about 10 years, so
there's no reason to have them other than for hysterical raisins. And
the California Rasins haven't been around for a long time...
o In the rwlock code: move a duplicated check inside an if..else to after
the if...else clause.
o When initializing a static rwlock move the initialization check
inside the lock.
o In thr_setschedparam.c: When breaking out of the trylock...retry if busy
loop make sure to reset the mtx pointer to null if the mutex is nolonger
in a queue.
code on the existence of the relevant libraries (actually,
the existence of the include files).
This will allow the library to be easily ported to systems
that don't have these libraries. (Of course, it also means
that clients using the library on such systems will not be
able to take advantage of the automatic compression format
detection.)
to describe the 4.4BSD extension of accepting arguments outside the range
of unsigned char. This gives us freedom to remove this extension when we
remove the <rune.h> interface in FreeBSD 6.
These convert plain ASCII characters in-line, making them only slightly
slower than the single-byte ("NONE" encoding) version when processing
ASCII strings.
in the regular ustar header that are overridden by the pax
extended attributes. As a result, it makes perfect sense to
use numeric extensions in the regular ustar header so that readers
that don't understand pax extensions but do understand some other
extensions can still get useful information out of it.
This is especially important for filesizes, as the failure to
read a file size correctly can get the reader out of sync.
This commit introduces a "non-strict" option into the internal
function to format a ustar header. In non-strict mode, the formatter
will use longer octal values (overwriting terminators) or binary
("base-256") values as needed to ensure that large file sizes,
negative mtimes, etc, have the correct values stored in the regular
ustar header.
archive_version: Returns a text string, e.g., "libarchive 1.00.000"
archive_api_version: Returns the SHLIB major version
archive_api_feature: Returns a feature number useful for answering
questions such as "Is this recent enough to do XXX?"
The last two also have macros defined in archive.h, so you can compare
the compile-time and run-time environments. (In particular, you can
compare ARCHIVE_API_VERSION to archive_api_version() to detect library
version mismatches.)
With these in hand, it will soon be time to turn on the
shared-library version of libarchive... stay tuned.
Thanks to: David O'Brien for pointing this out.
Also, add in a few additional portability tweaks and make a few
more things conditional on features (HAVE_XXXX macros) rather
than platform.
In particular, this means we can now correctly read gtar archives that
contain timestamps prior to the start of the Epoch.
Also, make the code in this area more portable. ANSI C99 headers are
not yet ubiquitous (for example, FreeBSD 4 still lacks them), so be
prepared for systems that don't have the INT64_MAX, INT64_MIN, and
UINT64_MAX macros. This version still requires int64_t and uint64_t be
defined (which can be done in archive_platform.h if necessary), but
doesn't require them to be exactly 64 bits.
convenient when the source string isn't null-terminated.
Implement the other conversion functions (mbstowcs(), mbsrtowcs(), wcstombs(),
wcsrtombs()) in terms of these new functions.
for statfs(2). This is false, if the pathname specified
is a regular file, then the information for the file
system that the file lives on will be returned.
Approved by: bmilekic (mentor)
gcc is using. This fixes devstat consumers (like vmstat, iostat,
systat) so they don't print crazy zillion digit numbers for
disk transfers and bandwidth.
According to gcc, long doubles are 64-bits, rather than 128 bits
like the SVR4 ABI spec wants them to be.. Note that MacOSX also treats
long doubles as 64-bits, and not 128 bits, so we are in good company.
Reviewed by: das
Approved by: grehan
- It was added to libc instead of libm. Hopefully no programs rely
on this mistake.
- It didn't work properly on large long doubles because its argument
was converted to type double, resulting in undefined behavior.
notably, this restores some of the contents in thread_db.h as well
as David Xu's copyright notice. This also fixes the includes in
the MD libpthread files which Scott tried to provide a quick fix
for.
Pointy hat: marcel
i386, ia64 and sparc64. Add stubs for alpha, amd64, ia64 and sparc64 for
libpthread.
Restructure the source files to avoid unnecessary use of subdirectories
that also force us to use non-portable compilation flags to deal with
the uncommon compilation requirements (building archive libraries for
linkage into a shared library).
The libpthread support has been copied from the original local and
cleaned-up to make them WARNS=2 clean.
that also force us to use non-portable compilation flags to deal with
the uncommon compilation requirements (building archive libraries for
linkage into a shared library).
The libpthread support has been copied from the original local and
cleaned-up to make them WARNS=2 clean.
Tested on: amd64, i386, ia64
- Unlike the builtin relational operators, builtin floating-point
constants were not available until gcc 3.3, so account for this.[1]
- Apparently some versions of the Intel C Compiler fallaciously define
__GNUC__ without actually being compatible with the claimed gcc
version. Account for this, too.[2]
[1] Noticed by: Christian Hiris <4711@chello.at>
[2] Submitted by: Alexander Leidinger <Alexander@Leidinger.net>
The getfsstat(2) function expects a buffer and a count, and returns a count.
The confusing part is that the count it takes is a byte count, while the
return value is a count of the number of structures it has filled out.
Spell this out.
1. Add global varible _libkse_debug, debugger uses the varible to identify
libpthread. when the varible is written to non-zero by debugger, libpthread
will take some special action at context switch time, it will check
TMDF_DOTRUNUSER flags, if a thread has the flags set by debugger, it won't
be scheduled, when a thread leaves KSE critical region, thread checks
the flag, if it was set, the thread relinquish CPU.
2. Add pq_first_debug to select a thread allowd to run by debugger.
3. Some names prefixed with _thr are renamed to _thread prefix.
which is allowed to run by debugger.
idea is that we perform multibyte->wide character conversion while parsing
and compiling, then convert byte sequences to wide characters when they're
needed for comparison and stepping through the string during execution.
As with tr(1), the main complication is to efficiently represent sets of
characters in bracket expressions. The old bitmap representation is replaced
by a bitmap for the first 256 characters combined with a vector of individual
wide characters, a vector of character ranges (for [A-Z] etc.), and a vector
of character classes (for [[:alpha:]] etc.).
One other point of interest is that although the Boyer-Moore algorithm had
to be disabled in the general multibyte case, it is still enabled for UTF-8
because of its self-synchronizing nature. This greatly speeds up matching
by reducing the number of multibyte conversions that need to be done.
consequently the exponent is only 11 bits. Testing whether the
exponent equals 32767 in that case only effects to compiler warnings
and thus build breakage.
isnormal() the hard way, rather than relying on fpclassify(). This is
a lose in the sense that we need a total of 12 functions, but it is
necessary for binary compatibility because we have never bumped libm's
major version number. In particular, isinf(), isnan(), and isnanf()
were BSD libc functions before they were C99 macros, so we can't
reimplement them in terms of fpclassify() without adding a dependency
on libc.so.5. I have tried to arrange things so that programs that
could be compiled in FreeBSD 4.X will generate the same external
references when compiled in 5.X. At the same time, the new macros
should remain C99-compliant.
The isinf() and isnan() functions remain in libc for historical
reasons; however, I have moved the functions that implement the macros
isfinite() and isnormal() to libm where they belong. Moreover,
half a dozen MD versions of isinf() and isnan() have been replaced
with MI versions that work equally well.
Prodded by: kris
builtins are available: HUGE_VAL, HUGE_VALF, HUGE_VALL, INFINITY,
and NAN. These macros now expand to floating-point constant
expressions rather than external references, as required by C99.
Other compilers will retain the historical behavior. Note that
it is not possible say, e.g.
#define HUGE_VAL 1.0e9999
because the above may result in diagnostics at translation time
and spurious exceptions at runtime. Hence the need for compiler
support for these features.
Also use builtins to implement the macros isgreater(),
isgreaterequal(), isless(), islessequal(), islessgreater(),
and isunordered() when such builtins are available.
Although the old macros are correct, the builtin versions
are much faster, and they avoid double-expansion problems.
class. This is necessary in order to implement tr(1) efficiently in
multibyte locales, since the brute force method of finding all characters
in a class is infeasible with a 32-bit (or wider) wchar_t.
code. <whew!> This version handles all of the following edge cases:
* Restoring explicit dirs with 000 permissions (star fails this test)
* Restore of implicit or explicit dirs when umask=777
(gtar and star both fail this test)
* Restoring dir paths containing "." and ".." components
This version initially creates all dirs with permission 700 (ignoring
umask), then does a post-extract "fixup" pass to set the correct
permissions (which may or may not depend on umask, depending on the
restore flags and whether it's an explicit or implicit dir).
Permissions are restored depth-first so that permissions within
non-writable dirs can be correctly restored. (The depth-sorting does
correctly account for dirs with ".." components.)
under the RETURN VALUES section so it is consistent with others.
Cleanup the return value text for getenv(3) a little while I am here.
PR: docs/58033
MFC after: 3 days
{ip,udp,tcp} header and return a void * pointing to the payload (i.e. the
first byte past the end of the header and any required padding). Use them
consistently throughout libalias to a) reduce code duplication, b) improve
code legibility, c) get rid of a bunch of alignment warnings.
a short pointer. The previous implementation seems to be in a gray zone
of the C standard, and GCC generates incorrect code for it at -O2 or
higher on some platforms.
These trivial implementations are about 25 times slower than
rint{,f}() on x86 due to the FP environment save/restore.
They should eventually be redone in terms of fegetround() and
bit fiddling.
named link, foo_link or link_foo to lnk, foo_lnk or lnk_foo, fixing
signed / unsigned comparisons, and shoving unused function arguments
under the carpet.
I was hoping WARNS?=6 might reveal more serious problems, and perhaps
the source of the -O2 breakage, but found no smoking gun.
pointer to the corresponding struct thread to the thread ID (lwpid_t)
assigned to that thread. The primary reason for this change is that
libthr now internally uses the same ID as the debugger and the kernel
when referencing to a kernel thread. This allows us to implement the
support for debugging without additional translations and/or mappings.
To preserve the ABI, the 1:1 threading syscalls, including the umtx
locking API have not been changed to work on a lwpid_t. Instead the
1:1 threading syscalls operate on long and the umtx locking API has
not been changed except for the contested bit. Previously this was
the least significant bit. Now it's the most significant bit. Since
the contested bit should not be tested by userland, this change is
not expected to be visible. Just to be sure, UMTX_CONTESTED has been
removed from <sys/umtx.h>.
Reviewed by: mtm@
ABI preservation tested on: i386, ia64
applied to their permissions. Just calculate the
default dir mode once and use it consistently, rather than
trying to remember to calculate it everywhere it's needed.
* Rename some variables/functions/etc to try to make things clearer.
* Add separate flags to control fflag/acl restore
* Collect metadata restore into a single function for clarity
* Propagate errors in metadata restore back out to the client
* Fix some places where errors were being returned when they
shouldn't and vice-versa
* Modes are now always restored; ARCHIVE_EXTRACT_PERM just controls
whether or not umask is obeyed.
* Restore suid/sgid bits only if user/group matches archive
* Cache the last stat results to try to reduce the number of stat calls
archive_entry.
Update the Makefile MLINKS and manpage to bring it up-to-date with
the current status of archive_entry. At least the manpage actually
lists all of the functions now, even if it doesn't really yet explain
them all.
Mostly, these were being used correctly even though a lot of
variables and function names were mis-named.
In the process, I found and fixed a couple of latent bugs and
added a guard against adding an archive to itself.
a/././b/../b/../c/./../d/e/f now work correctly. And yes, a/b and a/c
both get created in this example; if you want, you can create an
entire dir heirarchy from a tar archive with only one entry.
More tweaks to umask support: umasks are now obeyed for all objects,
not just directories; the umask used is now the one in effect at the
corresponding call to archive_read_extract(), so clients that want to
tinker with umask during extract should get the expected behavior.
a fork, make sure that the current thread isn't detached and freed. As
a consequence the thread should be inserted into the head of the
active list only once (in the beginning).
umask in effect when the archive is closed
* Correct a typo that broke implicit dir creation for non-directories.
Thanks to: Garret A Wollman for pointing out my umask oversight
read_extract_dir (which creates directories in the archive). This
brings a number of advantages:
* FINALLY fix the problems creating dirs ending in "/." <sigh>
* Missing parent dirs now get created securely, just like explicit dirs.
(Created 0700 initially, then edited to 0755 at end of extraction.)
* Eliminate some duplicate code and some weird special cases.
While I'm cleaning, inline the regular-file creation code as well.
This change also pointed out one API deficiency: the
archive_read_data_into_XXX functions were originally defined to return
the total bytes read. This is, of course, ambiguous when dealing with
non-contiguous files. Change it to just return a status value.
with ``__'' to avoid polluting the namespace. This doesn't change the
documented rune interface at all, but breaks applications that accessed
_RuneLocale directly.
These routines are specified in C99 for the sake of
architectures where an int isn't big enough to represent
the full range of floating-point exponents. However,
even the 128-bit long double format has an exponent smaller
than 15 bits, so for all practical purposes, scalbln() and
scalblnf() are aliases for scalbn() and scalbnf(), respectively.
ki_childutime, and ki_emul. Also uses the timeradd() macro to correct
the calculation of ki_childtime. That will correct the value returned
when ki_childtime.tv_usec > 1,000,000.
This also implements a new KERN_PROC_GID option for kvm_getprocs().
It also implements the KERN_PROC_RGID and KERN_PROC_SESSION options
which were added to sys/kern/kern_proc.c revision 1.203.
PR: bin/65803 (a very tiny piece of the PR)
Submitted by: Cyrille Lefevre
kicking and screaming into the 1980's. This change converts most of
the markup from man(7) to mdoc(7) format, and I believe it removes or
updates everything that was flat out wrong. However, much work is
still needed to sanitize the markup, improve coverage, and reduce
overlap with other manpages. Some of the sections would better belong
in a philosophy_of_w_kahan.3 manpage, but they are informative and
remain at least as reminders of topics to cover.
Reviewed by: doc@, trhodes@
The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()
Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
file already exists on disk.
Pointed out by: www/resin3 port (whose distfile contains the same file
twice with different permissions and relies on the permissions associated
with the second instance)
Thanks again to: Kris Kennaway
This is a corresponding change to bin/67994. I'll soon commit
bin/67994 into 4-STABLE. Actually, 5-CURRENT's getaddrinfo()
doesn't have the problem mentiond in bin/67994. However, it is
good to be in sync variable name with 4-STABLE and KAME.
PR: bin/67994
Submitted by: JINMEI Tatuya <jinmei@ocean.jinmei.org>
* Restore directories with 0700 permissions initially,
then use the fixup pass to correct the permissions
* Trim trailing "/" and "/." in mkdirpath()
Suggested by: Garrett Wollman
sigsuspend, thread shouldn't wait, in old code, it may be
ignored.
When a signal handler is invoked in sigsuspend, thread gets
two different signal masks, one is in thread structure,
sigprocmask() can retrieve it, another is in ucontext
which is a third parameter of signal handler, the former is
the result of sigsuspend mask ORed with sigaction's sa_mask
and current signal, the later is the mask in thread structure
before sigsuspend is called. After signal handler is called,
the mask in ucontext should be copied into thread structure,
and becomes CURRENT signal mask, then sigsuspend returns to
user code.
Reviewed by: deischen
Tested by: Sean McNeil <sean@mcneil.com>
on all inputs of the form x.75, where x is an even integer and
log2(x) = 21. A similar problem occurred when rounding upward.
The bug involves the following snippet copied from rint():
i>>=1;
if((i0&i)!=0) i0 = (i0&(~i))|((0x100000)>>j0);
The constant 0x100000 should be 0x200000. Apparently this case was
never tested.
It turns out that the bit manipulation is completely superfluous
anyway, so remove it. (It tries to simulate 90% of the rounding
process that the FPU does anyway.) Also, the special case of +-0 is
handled twice (in different ways), so remove the second instance.
Throw in some related simplifications from bde:
- Work around a bug where gcc fails to clip to float precision by
declaring two float variables as volatile. Previously, we
tricked gcc into generating correct code by declaring some
float constants as doubles.
- Remove additional superfluous bit manipulation.
- Minor reorganization.
- Include <sys/types.h> explicitly.
Note that some of the equivalent lines in rint() also appear to be
unnecessary, but I'll defer to the numerical analysts who wrote it,
since I can't test all 2^64 cases.
Discussed with: bde
permission), try to continue in FTS_DONTCHDIR mode. Of course this
won't work for long paths, but we can't descend more than one pathname
component beyond the directory anyway if we lack search permission.
Here is a transcript demonstrating the change, where oldls is ls(1)
linked with the old fts(3):
das@VARK:~> mkdir t && touch t/{a,b,c} && chmod u-x t
das@VARK:~> oldls t
a b c
das@VARK:~> oldls -l t
das@VARK:~> \ls t
a b c
das@VARK:~> \ls -l t
ls: a: Permission denied
ls: b: Permission denied
ls: c: Permission denied
I had forgotten about this patch until bde reminded me. He reports
using it without problems for over a year.
PR: 45723
writable. Affected callers include fwrite(), put?(), and *printf().
The issue of whether this is the right errno for funopened streams is
unresolved, but that's an obscure case, and some errno is better than
no errno.
Discussed with: bde, jkh
distinguish files from dirs (trailing '/' indicated a dir). Since
POSIX.1-1987, this convention is no longer necessary. However, there
are current tar programs that pretend to write POSIX-compliant
archives, yet store directories as "regular files", relying on this
old filename convention to save them. <sigh> So, move the check for
this old convention so it applies to all tar archives, not just those
identified as "old."
Pointed out by: Broken distfile for audio/faad port
it sees a truncated input the first time it gets called.
(In particular, files shorter than 512 bytes cannot be tar archives.)
This allows the top-level archive_read_next_header code to
generate a proper error message for unrecognized file types.
Pointed out by: numerous ports that expect tar to extract non-tar files ;-(
Thanks to: Kris Kennaway
It does not appear to be possible to cross-build arm from i386 at the
moment, and I have no ARM hardware anyway. Thus, I'm sure there are
bugs. I will gladly fix these when the arm port is more mature.
Reviewed by: standards@
features appear to work, subject to the caveat that you tell gcc you
want standard rather than recklessly fast behavior
(-mieee-with-inexact -mfp-rounding-mode=d).
The non-standard feature of delivering a SIGFPE when an application
raises an unmasked exception does not work, presumably due to a kernel
bug. This isn't so bad given that floating-point exceptions on the
Alpha architecture are not precise, so making them useful in userland
requires a significant amount of wizardry.
Reviewed by: standards@
version called the higher-level archive_read_data and
archive_read_data_skip functions, which screwed up state management of
those functions. This bit of mis-design has existed for a long time,
but became a serious issue with the recent changes to the
archive_read_data APIs, which added more internal state to the
high-level archive_read_data function. Most common symptom was a
failure to correctly read 'L' entries (long filename) from GNU-style
archives, causing the message ": Can't open: No such file or
directory" with an empty filename.
Pointed out by: Numerous port build failures
Thanks to: Kris Kennaway
that as end-of-archive. Otherwise, a short read at this point
generates an error. This accomodates broken tar writers (such as the
one apparently in use at AT&T Labs) that don't even write a single
end-of-archive block.
Note that both star and pdtar behave this way as well.
In contrast, gtar doesn't complain in either case, and as a
result, will generate no warning for a lot of trashed archives.
Pointed out by: shells/ksh93 port (Thanks to Kris Kennaway)
push extract data down into archive_read_extract.c and out
of the library-global archive_private.h; push dir-specific
mode/time fixup down into dir restore function; now that the
fixup list is file-local, I can use somewhat more natural
naming.
Oh, yeah, update a bunch of comments to match current reality.
reflect src/libexec/rtld-elf/rtld.c rev. 1.68 - the globally-loaded
objects (RTLD_GLOBAL) are searched before the local object's DAG's.
PR: 62770
Submitted by: Kimura Fuyuki <fuyuki@nigredo.org>
We approximate pi with more than float precision using pi_hi+pi_lo in
the usual way (pi_hi is actually spelled pi in the source code), and
expect (float)0.5*pi_lo to give the low part of the corresponding
approximation for pi/2. However, the high part for pi/2 (pi_o_2) is
rounded to nearest, which happens to round up, while the high part for
pi was rounded down. Thus pi_o_2+(float)0.5*pi (in infinite precision)
was a very bad approximation for pi/2 -- the low term has the wrong
sign and increases the error drom less than half an ULP to a full ULP.
This fix rounds up instead of down for pi_hi. Consistently rounding
down instead of up should work, and is the method used in e_acosf.c
and e_asinf.c. The reason for the difference is that we sometimes
want to return precisely pi/2 in e_atan2f.c, so it is convenient to
have a correctly rounded (to nearest) value for pi/2 in a variable.
a_acosf.c and e_asinf.c also differ in directly approximating pi/2
instead pi; they multiply by 2.0 instead of dividing by 0.5 to convert
the approximation.
These complications are not directly visible in the double precision
versions because rounding to nearest happens to round down.
* New read_data_block is both sparse-file aware and uses zero-copy semantics
* Push read_data_block down into specific formats (opens door to
various encoded entry bodies, such as zip or gtar -S)
* Reimplement read_data, read_data_skip, read_data_into_fd in terms
of new read_data_block.
* Update documentation
It's unfortunate that I couldn't just call the new interface
archive_read_data, but didn't want to upset the API that much.
and not tanf() because float type can't represent numbers large enough
to trigger the problem. However, there seems to be a precedent that
the float versions of the fdlibm routines should mirror their double
counterparts.
Also update to the FDLIBM 5.3 license.
Obtained from: FDLIBM
Reviewed by: exhaustive comparison
where the exponent is an odd integer and the base is negative).
Obtained from: fdlibm-5.3
Sun finally released a new version of fdlibm just a coupe of weeks
ago. It only fixes 3 bugs (this one, another one in pow() that we
already have (rev.1.9), and one in tan(). I've learned too much about
powf() lately, so this fix was easy to merge. The patch is not verbatim,
because our base version has many differences for portability and I
didn't like global renaming of an unrelated variable to keep it separate
from the sign variable. This patch uses a new variable named sn for
the sign.
[t=p_l+p_h High]. We multiply t by lg2_h, and want the result to be
exact. For the bogus float case of the high-low decomposition trick,
we normally discard the lowest 12 bits of the fraction for the high
part, keeping 12 bits of precision. That was used for t here, but it
doesnt't work because for some reason we only discard the lowest 9
bits in the fraction for lg2_h. Discard another 3 bits of the fraction
for t to compensate.
This bug gave wrong results like:
powf(0.9999999, -2.9999995) = 1.0000002 (should be 1.0000001)
hex values: 3F7FFFFF C03FFFFE 3F800002 3F800001
As explained in the log for the previous commit, the bug is normally
masked by doing float calculations in extra precision on i386's, but
is easily detected by ucbtest on systems that don't have accidental
extra precision.
This completes fixing all the bugs in powf() that were routinely found
by ucbtest.
(1) The bit for the 1.0 part of bp[k] was right shifted by 4. This seems
to have been caused by a typo in converting e_pow.c to e_powf.c.
(2) The lower 12 bits of ax+bp[k] were not discarded, so t_h was actually
plain ax+bp[k]. This seems to have been caused by a logic error in
the conversion.
These bugs gave wrong results like:
powf(-1.1, 101.0) = -15158.703 (should be -15158.707)
hex values: BF8CCCCD 42CA0000 C66CDAD0 C66CDAD4
Fixing (1) gives a result wrong in the opposite direction (hex C66CDAD8),
and fixing (2) gives the correct result.
ucbtest has been reporting this particular wrong result on i386 systems
with unpatched libraries for 9 years. I finally figured out the extent
of the bugs. On i386's they are normally hidden by extra precision.
We use the trick of representing floats as a sum of 2 floats (one much
smaller) to get extra precision in intermediate calculations without
explicitly using more than float precision. This trick is just a
pessimization when extra precision is available naturally (as it always
is when dealing with IEEE single precision, so the float precision part
of the library is mostly misimplemented). (1) and (2) break the trick
in different ways, except on i386's it turns out that the intermediate
calculations are done in enough precision to mask both the bugs and
the limited precision of the float variables (as far as ucbtest can
check).
ucbtest detects the bugs because it forces float precision, but this
is not a normal mode of operation so the bug normally has little effect
on i386's.
On systems that do float arithmetic in float precision, e.g., amd64's,
there is no accidental extra precision and the bugs just give wrong
results.
there's no need to enable support for it separately
from 'tar.' (The call to enable gnutar support is
now just an alias for the tar support, left in to
avoid API breakage.)
multibyte representation in conversion state objects, store the
accumulated wide character, set number and number of bytes remaining
to avoid having to derive them every time mbrtowc() is called.
certain flags set (e.g., schg or uappend) would fail because the flags
were restored before the hardlink was created.
To address this, I've generalized the existing machinery for deferring
directory timestamp/mode restoration and used it to defer the
restoration of highly-restrictive flags to the end of the extraction,
after any links have been created.
Pointed out by: Pawel Jakub Dawidek (pjd@)
through byte by byte with mbrtowc(). In the usual case (buffer is big
enough to contain the multibyte character, character does not straddle
buffer boundary) this results in only one call to mbrtowc() for each
wide character read.
to the initial state when a stream is opened or seeked upon. Use the
stream's conversion state object instead of a freshly-zeroed one in
fgetwc(), fputwc() and ungetwc().
This is only a performance improvement for now, but it would also be
required in order to support state-dependent encodings.
followed are: Only 3 functions (pthread_cancel, pthread_setcancelstate,
pthread_setcanceltype) are required to be async-signal-safe by POSIX. None of
the rest of the pthread api is required to be async-signal-safe. This means
that only the three mentioned functions are safe to use from inside
signal handlers.
However, there are certain system/libc calls that are
cancellation points that a caller may call from within a signal handler,
and since they are cancellation points calls have to be made into libthr
to test for cancellation and exit the thread if necessary. So, the
cancellation test and thread exit code paths must be async-signal-safe
as well. A summary of the changes follows:
o Almost all of the code paths that masked signals, as well as locking the
pthread structure now lock only the pthread structure.
o Signals are masked (and left that way) as soon as a thread enters
pthread_exit().
o The active and dead threads locks now explicitly require that signals
are masked.
o Access to the isdead field of the pthread structure is protected by both
the active and dead list locks for writing. Either one is sufficient for
reading.
o The thread state and type fields have been combined into one three-state
switch to make it easier to read without requiring a lock. It doesn't need
a lock for writing (and therefore for reading either) because only the
current thread can write to it and it is an integer value.
o The thread state field of the pthread structure has been eliminated. It
was an unnecessary field that mostly duplicated the flags field, but
required additional locking that would make a lot more code paths require
signal masking. Any truly unique values (such as PS_DEAD) have been
reborn as separate members of the pthread structure.
o Since the mutex and condvar pthread functions are not async-signal-safe
there is no need to muck about with the wait queues when handling
a signal ...
o ... which also removes the need for wrapping signal handlers and sigaction(2).
o The condvar and mutex async-cancellation code had to be revised as a result
of some of these changes, which resulted in semi-unrelated changes which
would have been difficult to work on as a separate commit, so they are
included as well.
The only part of the changes I am worried about is related to locking for
the pthread joining fields. But, I will take a closer look at them once this
mega-patch is committed.
character, as some tar implementations incorrectly include a '/' with
the prefix.
Thanks to: Divacky Roman for the UnixWare 7 tarfile that
demonstrated this issue.
code available at tuhs.org, and found out that my chronology
is a bit off. In particular, Seventh Edition already used
the "linkflag" and "linkname" fields. Also, it appears that
there was no tar in Sixth Edition, contrary to what an earlier
tar.1 manpage claimed.
A few mdoc fixes also crept in here.
the size field for a hardlink entry. Specifically, ensure that
we do obey the size field for archives that we know are pax interchange
format archives, as required by POSIX.
Also, clarify the comment explaining why this is necessary and explain
the (very unusual) conditions under which it might fail.
a non-zero size but no body, some write a non-zero size and include
a body. To distinguish these cases, look for a valid tar header immediately
following a hardlink header with non-zero size.
of fcntl(2), flock(2), and lockf(3) advisory locks.
Add such a paragraph to the flock(2) manpage for the
sake of consistency.
Reviewed by: Cyrille Lefevre and Kirk McKusick on -arch
MFC after: 2 weeks
Correct my previous commit and add a comment to the manpage
indicating that the user must set errno to 0 if they wish to
distinguish "no such user" from "error".
Pointed out by: Jacques Vidrine (nectar@)
low bound, and the number of bytes remaining instead of storing the
raw byte sequence and deriving them every time mbrtowc() is called.
This is much faster -- about twice as fast in some crude benchmarks.
constants the wrong way on the VAX. Instead, use C99 hexadecimal
floating-point constants, which are guaranteed to be exact on binary
IEEE machines. (The correct hexadecimal values were already provided
in the source, but not used.) Also, convert the constants to
lowercase to work around a gcc bug that wasn't fixed until gcc 3.4.0.
Prompted by: stefanf
Main ones: mostly use conditional expressions in ifdefs instead of a
mixture of conditional expressions and nested ifdefs.
Nearby ones:
- don't do less than echo the code in the comment about libc_r
- fixed some internal insertion sort errors and indentation errors.
Remove "sys/types.h" as "sys/param.h" is already included
Use cast rather than back-pointer to convert from public to private
version of FTS data, and so avoid littering fts.h with any of the
details.
Pointed out By: bde, kientzle
"A trailing newline is added if none is present."
The code in syslogd, stderr, and console output always adds a newline
at the EOL. However, the existing code never actually removed a
trailing newline, and apparently relied on syslogd to convert it
into a space character. Thus, the existing newline was converted
to a trailing space at the EOL by syslogd, while stderr, and console
output resulted in an empty line.
MFC after: 2 weeks
RuneRange arrays. This is much faster when there are hundreds of
ranges (as is the case in UTF-8 locales) and was inspired by a
similar change made by Apple in Darwin.
of stat(2) calls by keeping an eye of the number of links a directory
has. It assumes that each subdirectory will have a hard link to its
parent, to represent the ".." node, and stops calling stat(2) when
all links are accounted for in a given directory.
This assumption is really only valid for UNIX-like filesystems: A
concrete example is NTFS. The NTFS "i-node" does contain a link
count, but most/all directories have a link count between 0 and 2
inclusive. The end result is that find on an NTFS volume won't
actually traverse the entire hierarchy of the directories passed
to it. (Those with a link count of two are not traversed at all)
The fix checks the "UFSness" of the filesystem before enabling the
optimisation.
Reviewed By: Tim Kientzle (kientzle@)
*printf() and *scanf(). Currently, this reduces the size of libc.so
by 9K on i386. But the real savings are for static binaries that use
*printf() or *scanf() but not strtod(); with an FP-disabled libc,
these binaries will not depend on the gdtoa routines, making each
binary about 22K smaller.
floating-point support, remove default definition of FLOATING_POINT
from the source, and change the compile-time option to
NO_FLOATING_POINT.
- Remove the HEXFLOAT option. It saves an insignificant amount of
space (<0.1% of the size of libc on i386) and complicates vfprintf()
and checkfmt().
-U flag to bsdtar. Essentially, this option breaks existing hard
links. According to SUSv2, tar is supposed to overwrite existing
files on extract by default which, in particular, preserves
existing hard links. Note that this is yet another bug in gtar; it
appears to always break existing links. (Maybe gtar's -U is broken?)
I'm unsure about how to handle this for other file types; the current
code always unlinks first unless the NO_OVERWRITE flag is specified.
I've commented this issue liberally and will come back to it later.
archive formats supported by libarchive, with some information about
the relative strengths and weaknesses of each format and notes about
issues with libarchive's support for those formats.
This page should make it unnecessary to list all of the libarchive
formats in the manpage of each program that uses libarchive.
Such programs can simply refer to libarchive-formats(5).
* little-endian old-style binary cpio archives
* big-endian old-style binary cpio archives
* SVR4 portable archives without CRC
* SVR4 portable archives with CRC
Note that I don't yet verify the CRC for the last one, and I'm
not quite certain I'm correctly parsing device numbers.
MS-CHAPv1 MPPE-keys).
- Added rad_demangle_mppe_key() for demangling mppe-keys (needed
for MPPE-keys).
- Added some typecasts for avoiding compiler warnings.
- Fix: better handle wrong usage of the lib (if the programmer
has not called rad_create_request() but rad_put_*(), then a
weird error message was returned).
- Added a new function for putting the Message-Authenticator.
- Verify the Message-Authenticator, if it was found inside a
response packet and silently drop the packet, if the validation
failed.
- Implicitly put the Message-Authenticator, if the EAP-Message
attribute was added.
- Added some missing defines.
Submitted by: Michael Bretterklieber
PR: 46555
The new fflags support in archive_entry supports Linux and FreeBSD
file flags and is a bit more gracious about unrecognized flag names
than strtofflags(3). This involves some minor API breakage.
The default tar format ("restricted pax") now enables pax extensions
when archiving files that have flags. In particular, copying dir
heirarchies with 'bsdtar cf - -C src . | bsdtar xpf - -C dest' now
preserves file flags. (Note the "p" on extract!)
While I'm here, fill in some additional explanation in the
archive_entry.3 manpage, fill in some missing MLINKS, mark some
overlooked internal functions 'static', and make a few minor style
fixes.
namespaces are visible. Previously, math.h failed to hide some C99-,
XSI-, and BSD-specific symbols in certain compilation environments.
The referenced PR has a nice listing of the appropriate conditions for
making symbols visible in math.h. The only non-stylistic difference
between the patch in the PR and this commit is that I superfluously
test for __BSD_VISIBLE in a few places to be more explicit about which
symbols have historically been part of the FreeBSD environment.
PR: 65939
Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>
makeing sure the spinlock isn't already in use might be a nice feature to
have in theory, it's hard to implement in practice since the passed in
pointer may not be NULL, but still be an invalid value (i.e. 1..2..3.. etc).
The original might have pointers to user-specified strings;
copying the string (instead of just the pointer) protects against
the client re-using their own buffers.
I'm trying hard to avoid dumping all of the 'set' string functions
in favor of slower, but more predictable 'copy' semantics.
after their change from an array of char to an array of enum.
This fixes problems that occurred when using positional arguments in
format strings, particularly with more than STATIC_ARG_TBL_SIZE (8)
of them.
PR: 65841
Submitted by: Steven Smith (mostly)
adjunct maps are used. One symtom of this bug is sshd saying:
login_get_lastlog: Cannot find account for uid X
when logging in. The problem here is caused by an incorrect reuse of the rv
variable when previous values are needed later.
High-resolution mtime/ctime/atime is not POSIX-standard, so hide
set/get of high-resolution time fields behind easily-mutable macros.
That makes it easier to change how those fields are accessed.
res_search only incremented got_servfail for h_errno == TRY_AGAIN *AND*
hp->rcode == SERVFAIL. However, there are cases such as timeouts where
rcode is not always set to SERVFAIL. This leads to inconsistent nameserver
operation during multi-domain and truncated dot searches, especially during
booting when portions of the network are being brought up simultanious with
dns lookups.
This patch attempts to correct the problem by unconditionally terminating
the search if TRY_AGAIN is returned (after res_query has gone through all
retries and name servers) instead of trying other domain elements in the
domain seach path.
This patch should fix reported problems (which I can reproduce) with some
NFS mounts failing during boot. This occured because mount_nfs thought the
host name lookup returned a definitive failure using a non-dotted host name
when, in fact, it timed out on the first part (host.search.domain.name) and
got a definitive host-not-found response on the second part (host.).
Generally speaking, search path name server timeouts can exceed 60 seconds
per element and most machines which consistently timeout on earlier portions
of a search path are effectively non-operational due to the imposed delays.
It is more important for DNS lookups to return the proper error code then
to be able to recover a valid lookup in later portions of the search path
in these situations.
Obtained from: DragonFly
MFC after: 3 weeks
Earlier versions of FreeBSD don't support ACLs.
Note that the ACL support code in archive_entry is standalone code and
unaffected by this. (In particular, it should be possible to
manipulate archives containing ACLs even if the ACLs cannot be
restored on the current system.)
* Re-use a single buffer for shar output formatting rather
than hammering the heap. (archive_write_set_format_shar.c)
* Fix a handful of minor memory leaks and clean up some of the
memory-management code.
try to set ACLs even if fflag restore fails, first cut at reading
Solaris tar ACLs
Code improvement: merge gnu tar read support into main tar reader;
this eliminates a lot of duplicate code and generalizes the tar
reader to handle formats with GNU-like extensions.
Style: Makefile cleanup, eliminate 'dmalloc' references, remove 'tartype'
from archive_entry (this makes archive_entry more format-agnostic)
Thanks to: David Magda for providing Solaris tar test files
mode (where the forked thread is the one and only thread and
is marked as system scope), set the system scope flag before
initializing the signal mask. This prevents trying to use
internal locks that haven't yet been initialized.
Reported by: Dan Nelson <dnelson at allantgroup.com>
Reviewed by: davidxu
* ACL storage is no longer erased before a group of entries are added.
* ACL text creation no longer tries to skip over non-existent text.
* UTF8 encoder no longer blows up on invalid wide characters.
* Fixed ACL state management for default ACLs.
Also, publicize function for obtaining text-format ACL in various
formats. The interface is now extensible through a "flags" argument
that allows you to select a variant format.
with 'star' ACL handling, though there's still a
bit more work needed in this area.
Added 'write_open_fd' and 'read_open_fd' to simplify, e.g.,
tar's u and r modes. Eliminated old 'write_open_file_position'
as a bad idea. (It required closing/reopening files to
do updates, which led to unpleasant implications.)
Various other minor fixes, API tweaks, etc.
on temporary nameserver failure. This is necessary to get
getipnodebyname(3) to correctly return h_errno=TRY_AGAIN instead
of HOST_NOT_FOUND.
Reviewed by: green, thomas
MFC after: 1 week
case where an /etc/nsswitch.conf file was present, but could not
be opened (e.g. due to permissions). Previously, the open failure
condition was suppressed, and the built-in defaults were used. In
revision 1.11, however, propagated the open failure causing all
nsdispatch() invocations to return NS_UNAVAIL, and thus many APIs
including getpwnam and gethostbyname unconditionally failed.
This commit restores the previous behavior.
Pointy hat: nectar (+1 for obstinance; ache had to use clue bat)
Reported by: ache
solved by a simple 'make world'. The signalcontext function was going
to the trouble of generating an even 16 byte alignment, but in fact it
needed to be odd aligned to simulate the 8-byte return address having
been pushed by the caller. This fixes yet another group of crashes in
applications using libpthread. And yet again, it was my fault all along.
While here, rename the duplicate internal ctx_wrapper() functions to
makectx_wrapper() and sigctx_wrapper() so that traces aren't ambiguous.
library, it may pull in that thread library at run time. If the
process started out single-threaded, this could cause attempts to
release locks that do not exist. Guard against this possibility by
checking __isthreaded before invoking thread primitives.
A similar problem remains if the process is linked against one thread
library, but the NSS module is linked against another. This can only
be avoided by careful design of the NSS module.
Submitted by: Sean McNeil <sean@mcneil.com> (mostly; bugs are mine)
functionality spelled out in SUSv3.
o Signal of 0 means do everything except send the signal
o Check that the signal is not invalid
o Check that the target thread is not dead/invalid
sigprocmask no longer needs to be wrapped.
o raise(3) is applied to the calling thread in a threaded program.
o In the sigaction wrapper reference the correct structure.
o Don't treat SIGTHR especially anymore (infact it won't exist in
a little while).
we still have to DTRT when an asynchronously cancellable thread is
cancelled while waiting for a mutex.
o While dequeueing a waiting mutex don't skip a thread if it has
a cancel pending. Only skip it if it is also async cancellable.
the cause of any bugs because it is *always* indirectly set
in the for...loop, but better to be explicit about it.
o Check the magic number of the passed in thread only after it has
been found in the active thread list. Otherwise, if the check is done
at the very beginning we may end up pointing to garbage if the
thread was once a valid thread, but has now been destroyed.
* Disabled shared-library building, as some API breakage is
still likely. (I didn't realize it was turned on by default.) If
you have an existing /usr/lib/libarchive.so.2, I recommend deleting it.
* Pax interchange format now correctly stores and reads UTF8
for extended attributes. In particular, pax format can portably
handle arbitrarily long pathnames containing arbitrary characters.
* Library compiles cleanly at -O2, -O3, and WARNS=6 on all
FreeBSD-CURRENT platforms.
* Minor portability improvements inspired by Juergen Lock
and Greg Lewis. (Less reliance on stdint.h, isolating of
various portability-challenged constructs.)
* archive_entry transparently converts multi-byte <-> wide character
strings, allowing clients and format handlers to deal with either
one, as appropriate.
* Support for reading 'L' and 'K' entries in standard tar archives
for star compatibility.
* Recognize (but don't yet handle) ACL entries from Solaris tar.
* Pushed format-specific data for format readers down into
format-specific storage and out of library-global storage. This
should make it easier to maintain individual formats without mucking
with the core library management.
* Documentation updates to track the above changes.
* Updates to tar.5 to correct a few mistakes and add some additional
information about GNU tar and Solaris tar formats.
Notes:
* The basic 'tar' reader is getting more general; there's not much
point in keeping the 'gnutar' reader separate. Merging the two
would lose a bunch of duplicate code.
* The libc ACL support is looking increasingly inadequate for my needs
here. I might need to assemble some fairly significant code for
parsing and building ACLs. <sigh>
ferror(), fileno() and clearerr(), using the value of __isthreaded to
decide between the fast inline single-threaded code and the more
general function equivalent. This gives most of the performance
benefits of the old unsafe macros while preserving thread safety.
addresses. For arch's with 64-bit longs, this is a nop, but for i386 this
allows sysinstall to properly handle disks and filesystems > 1 TB.
Changes from the original patch include:
- Use d_addr_t rather than inventing a blkcnt type based on int64_t.
- Use strtoimax() rather than strtoull() to parse d_addr_t's from config
files.
- Use intmax_t casts and %jd rather than %llu to printf d_addr_t values.
Tested on: i386
Tested by: kuriyama
Submitted by: julian
MFC after: 1 month
MATH_ERREXCEPTION and math_errhandling, so that C99 applications at
least have the possibility of determining that errno is not set for
math functions. Set math_errhandling to the non-standard-conforming
value of 0 for now to indicate that we don't support either method
of reporting errors. We intentionally don't support MATH_ERRNO
because errno is a mistake, and we are missing support for
MATH_ERREXCEPTION (<fenv.h>, compiler support for <fenv.h>, and
actually setting the exception flags correctly).
- Add DECL wrappers to libgeom.h.
- Rename structure members in libgeom.h to use a lg_ prefix for member
names. This is required because a few structures had members named
'class' which made g++ very unhappy.
- Catch gstat(8) and gconcat(8) up to these API changes.
Reviewed by: phk
Portability: Thanks to Juergen Lock, libarchive now compiles cleanly
on Linux. Along the way, I cleaned up a lot of error return codes and
reorganized some code to simplify conditional compilation of certain
sections.
Bug fixes:
* pax format now actually stores filenames that are 101-154
characters long.
* pax format now allows newline characters in extended attributes
(this fixes a long-standing bug in ACL handling)
* mtime/atime are now restored for directories
* directory list is now sorted prior to fix-up to permit
correct restore of non-writable dir heirarchies
structure and call stdio functions. In 5.X this was broken when FILE
locking was introduced into libc.
This change makes most (relevant) stdio functions work again when the
_extra file in FILE isn't initialised (and can't be without a libc
function to do it since the __sFILEX structure is private to libc).
This doesn't yet address the issue of selective restore
of hardlinked files. With cpio format, it's possible to correctly
restore any linked file; the API doesn't yet fully support this.
(There's no way for the library to inform a client whether or not
there's a file body associated with this entry. The assumption
right now is that "hardlink" entries have no file body.)
cleanups, handling 'ls -l-', handling '--*'
Note this is in the same time back out of our v1.3
"Don't print an error message if the bad option is '?'"
because it directly violates POSIX.
the size in the archive_entry object to zero if that format doesn't
store a body for that file type. This allows the client to determine
whether or not it should feed the file body to the archive. In
particular, cpio stores the file body for hardlinks, tar and shar
don't. With this change, bsdtar now correctly archives hardlinks in all
supported formats.
While I'm here, make shar output be more aggressive about creating directories.
Before this, commands such as:
bsdtar -cv -F shar some/explicit/path/to/a/file
wouldn't create the directory. Some simple logic to remember the last
directory creation helps reduce unnecessary mkdirs here.
At this point, I think the only flaw in libarchive's cpio support is
the failure to recognize hardlinks when reading.
While I'm here, fix a bug in reading filenames from
cpio files. (Copy should count the length of the name,
not the number of bytes available for input.)
that this provokes. "Wherever possible" means "In the kernel OR NOT
C++" (implying C).
There are places where (void *) pointers are not valid, such as for
function pointers, but in the special case of (void *)0, agreement
settles on it being OK.
Most of the fixes were NULL where an integer zero was needed; many
of the fixes were NULL where ascii <nul> ('\0') was needed, and a
few were just "other".
Tested on: i386 sparc64
Further contemplation has convinced me that this was
not going to really solve the problem of environment-poisoning
without raising serious administrative headaches. There
must be a better way...
- Fix syntax
- Remove the (slightly wrong) duplicate explanation of the error condition
- Change reference to invalid multibyte character into invalid wide character
This function removes all environment variables except
the ones listed on a "whitelist."
The function accepts two whitelist arguments.
If the first is NULL, a built-in default list will be
used. This allows callers to get a variety of behaviors:
* Default screening: provide NULL for both lists
* Custom screening: provide a custom list for the first argument
* Modified default screening: provide NULL for first arg,
list of additional variables to preserve in the second arg
Idea from: Jacques Vidrine
MFC after: 2 weeks
-static to CFLAGS). It just turned rev.1.5 into an obfuscated no-op.
As explained in the log for rev.1.5, testing should be done in the
host environment but there is a problem in cross-compilation environments.
As not explained in the log for rev.1.6, there was apparently a practical
problem with cross-compiling (makeworld should have set -static in
LDFLAGS but apparently didn't). Cross-compilation was especially
complicated because the relevant programs are test programs that were
run at beforeinstall time -- dynamic libraries might or might not exist
depending on the build options. The complications became moot in
rev.1.8 when beforeinstall was renamed "test".
The getaddrinfo(3), getipnodebyname(3) and resolver(3) can coincide now
with what should be totally reentrant, and h_errno values will now
be preserved correctly, but this does not affect interfaces such as
gethostbyname(3) which are still mostly non-reentrant.
In all of these relevant functions, the thread-safety has been pushed
down as far as it seems possible right now. This means that operations
that are selected via nsdispatch(3) (i.e. files, yp, dns) are protected
still under global locks that getaddrinfo(3) defines, but where possible
the locking is greatly reduced. The most noticeable improvement is
that multiple DNS lookups can now be run at the same time, and this
shows major improvement in performance of DNS-lookup threaded programs,
and solves the "Mozilla tab serialization" problem.
No single-threaded applications need to be recompiled. Multi-threaded
applications that reference "_res" to change resolver(3) options will
need to be recompiled, and ones which reference "h_errno" will also
if they desire the correct h_errno values. If the applications already
understood that _res and h_errno were not thread-safe and had their own
locking, they will see no performance improvement but will not
actually break in any way.
Please note that when NSS modules are used, or when nsdispatch(3)
defaults to adding any lookups of its own to the individual libc
_nsdispatch() calls, those MUST be reentrant as well.
the caller does not specify the rule number -- instead, the kernel
module is probed for the next available rule, which is then used.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
through a realloc like function.
Make the malloc_active variable a local static to this new function.
Don't warn about recursion more than once per base call.
constify malloc_func.
has been hit, this makes it cover more cases.
Call the message function directly rather than fiddle with flag-saving
when we find an unknown character in our options.
The 'A' flag should not trigger on legal out of memory conditions.
has been executed. On return from the signal handler
the call will either be restarted or EINTR will be returned,
but it will not go back to its previous state. So, it is
sufficient to simply change the state to 'running' without
actually trying to wake up the thread.
a PTHREAD_RWLOCK_INITIALIZER to do for rwlocks what
a similarly named symbol does for statically initialized mutexes.
This symbol was dropped in The Open Group Base Specifications Issue 6
and does not exist in IEEE Std 1003.1, 2003, but it should still be
supported for backwards compatibility.
Pointy hat: mtm
o Instead of checking both the passed in pointer and its value
for NULL, only check the latter. Any caller that passes in
a NULL pointer is obviously wrong.
o Fix mutex priority protocols. Keep separate counts of priority
inheritance and protection mutexes to make things easier.
This will not have much affect since this is only the
userland side, and the rest involves kernel scheduling.
Unfortunately, the stock zlib.h is not:
line 885: 'err' parameter shadows global 'err' definition from <err.h>
Back the WARNS level down to 3 to accomodate borked zlib.h.
reply with a 416 error code (requested range not satisfiable) because
we ask it to start at the end of the file. Handle this gracefully by
considering a 416 reply a success if the requested offset exactly
matches the length of the file and the requested length is zero.
This is the second of two commits; bring in the userland support to finish.
Teach libipsec and setkey about the tcp-md5 class of security associations,
thus allowing administrators to add per-host keys to the SADB for use by
the tcpsignature_compute() function.
Document that a single SPI must be used until such time as the code which
adds support to the SPD to specify flows for tcp-md5 treatment is suitable
for production.
Sponsored by: sentex.net
These files had tags after teh copyright notice,
inside the comment block (incorrect, removed),
and outside the comment block (correct).
Approved by: rwatson (mentor)
These files had tags after the copyright notice,
inside the comment block (incorrect, removed),
and outside the comment block (correct).
Approved by: rwatson (mentor)
What it is:
A library for reading and writing various streaming archive
formats, especially tar and cpio. Being a library, it should
be easy to incorporate into pkg_* tools, sysinstall, and any
other place that needs to read or write such archives.
Features:
* Full automatic detection of both compression and archive format.
* Extensible internal architecture to make it easy to add new formats.
* Support for "pax interchange format," a new POSIX-standard tar format
that eliminates essentially all of the restrictions of historic formats.
* BSD license
Thanks to: jkh for pushing me to start this work, gordon for
encouraging me to commit it, bde for answering endless style
questions, and many others for feedback and encouragement.
Status: Pretty good overall, though there are still a few rough edges and
the library could always use more testing. Feedback eagerly solicited.
overridden by the threads library to provide a userland version
of non-pshared semaphores and cancellation points. Also add
a sem_timedwait().
The libc version of semaphores always uses kernel semaphores
regardless of whether pshared is set or not. When threads are
not present, it is difficult to get sem_wait() or sem_timedwait()
to do the right thing (since pthread_cond_timedwait() and
pthread_cond_wait() are stubs in libc and just return immediately).
caller without closing the disk device and freeing allocated
memory. Not closing the disk device prevents GEOM from retasting
after spoiling.
Pointy hat: marcel
thread library for i386, amd64, and ia64. For alpha
and sparc64 the library is not changed and remains libkse,
and links are installed so that libpthread -> libc_r.
The gcc -pthread option will be changed in a separate
commit so that it links to -lpthread instead of -lc_r.
Approved by: re@
what do I get for my troubles? libc breaks offcourse!
Reimplement a hack (in libthr) that allows libc to use
rwlocks without initializing them first. The hack was reimplemented
so that only a private libc version of the rwlock locking functions
initializes an uninitialized rwlock. The application version will
correctly fail.
the system call got interrupted and the absolute timeout is
converted to a relative timeout, it may happen that we get a
negative number. In such a case, simply set the timeout to
zero so that if the event that the thread wants to wait for has
happened it can still return successfully, but if it hasn't
happened then the thread doesn't suspend indefinitely. This should
fix certain applications (including mozilla) that seem to hang
indefinitely sometimes.
Noticed and debugged by: Morten Johansen <root@morten-johansen.net>
return an error value that made Write_Disk() abort. While on the
subject, improve the initialization of the error variable in read_gpt()
and update_gpt() even though nothing was broken there.
and NgAllocRecvData(), that dynamically allocate buffer for a binary
message, an ascii message, and a data packet, respectively. The size
of the allocated buffer is equal to the socket's receive buffer size
to guarantee that a message or a data packet is not truncated.
- Get rid of the static size buffer in NgSendAsciiMsg().
OK'ed by: archie, julian
instead of creating them by hand and storing them in the CVS tree. Add
gensnmptree to the bootstrap tools (it is used to generated these files).
This simplifies the update procedure.
Submitted by: ru
- bzipfs and gzipfs now properly return errno values directly from their
read routines rather than returning -1.
- missing errno values on error returns for the seek routines on almost
all filesystems were added.
- fstat() now returns -1 if an error occurs rather than ignoring it.
- nfs's readdir() routine now reports valid errno values if an error or
EOF occurs rather than EPERM (It was just returning 0 for success and
1 for failure).
- nullfs used the wrong semantics for every function besides close() and
seek(). Getting it right for close() appears to be an accident at that.
- read() for buffered files no longer returns 0 (EOF) if an error occurs,
but returns -1 instead.
Extend libsdp(3) API to allow service registration and removal.
Fix uninitialized variable bug in sdpcontrol(8).
Reviewed by: imp (mentor)
No objection: ru
This results in no functional change, aside from fixing a data
corruption bug on LP64 platforms. The code here could still use a
significant amount of cleanup.
PR: 56502
Submitted by: hrs (earlier version)
o Simplify the logic by removing a lot of unnecesary nesting
o Reduce the amount of local variables
o Zero-out the allocated structure and get rid of
all the unnecessary setting to 0 and NULL;
Refactor _pthread_mutex_destroy
o Simplify the logic by removing a lot of unnecesary nesting
o No need to check pointer that the mutex attributes points
to. Checking passed in pointer is enough.
a list in the thread structure to keep track of the locks and
how many times they have been locked. This list is checked
on every lock and unlock. The traversal through the list is
O(n). Most applications don't hold so many locks at once that
this will become a problem. However, if it does become a problem
it might be a good idea to review this once libthr is
off probation and in the optimization cycle.
This fixes:
o deadlock when a thread tries to recursively acquire a
read lock when a writer is waiting on the lock.
o a thread could previously successfully unlock a lock it did not own
o deadlock when a thread tries to acquire a write lock on
a lock it already owns for reading or writing [ this is admittedly
not required by POSIX, but is nice to have ]
- Update and improve the documentation for %[aA]
o Like %[eE], %[aA] may round the result if a precision is specified.
o Grammar police: Fix a split infinitive.
o The FreeBSD implementation does better than the minimum required
by C99 (literal translation of the mantissa). The digit before
the hexadecimal-point is never 0 unless the number itself is 0.
o Clarify that the exponent field represents a decimal exponent of 2.
o Discuss the fact that multiple valid representations are possible.
o Remove the entry in the BUGS section claiming that %[aA] is not
implemented.
- Remove the entry in the BUGS section claiming that the ' flag for
printing thousands separators is unimplemented for floating-point.
- Remove the entry in the BUGS section claiming that the L modifier
reduces the precision to "double" before conversion.
on the release media -- only put what is different in the crypto
version compared to the base version. This reduces PAM entries
in /usr/lib in the "crypto" distribution to:
libpam.a
libpam.so@
libpam.so.2
pam_krb5.so@
pam_krb5.so.2
pam_ksu.so@
pam_ksu.so.2
pam_ssh.so@
pam_ssh.so.2
The libpam.so* is still redundant (it is identical to the "base"
version), but we can't set DISTRIBUTION differently for libpam.a
and libpam.so.
(The removal of libpam.so* from the crypto distribution could be
addressed by the release/scripts/crypto-make.sh script, but then
we'd also need to remove redundant PAM headers, and I'm not sure
this is worth a hassle.)
these are not fully implemented and ifdef'd out, the bugs have
never manifested themselves. Specifically:
- Fix a memory leak in the case where %a follows another
floating-point format.
- Make the %a/%A code behave like %e/%E with respect to
precision.
- It is no longer valid to assume that '-' and '0x' are
mutually exclusive.
- Address other minor issues.
Makes it possible to have multiple packet aliasing instances in a
single process by moving all static and global variables into an
instance structure called "struct libalias".
Redefine a new API based on s/PacketAlias/LibAlias/g
Add new "instance" argument to all functions in the new API.
Implement old API in terms of the new API.
For pshared semaphore, this commit still does not enable cancellation
point, I think there should be a pthread_enter_cancellation_point_np
for libc to implement a safe cancellation point.
code and simply return EINVAL (which is allowed by the standard) in
all those pthread functions that previously initialized it.
o Refactor the pthread_rwlock_[try]rdlock() and pthread_rwlock_[try]wrlock()
functions. They are now completeley condensed into rwlock_rdlock_common()
and rwlock_wrlock_common(), respectively.
o If the application tries to destroy an rwlock that is currently
held by a thread return EBUSY where it previously went ahead and
freed all resources associated with the lock.
o Refactor _pthread_rwlock_init() to make it look (relatively) sane.
o When obtaining a read lock on an rwlock the check for whether it
would exceed the maximum allowed read locks should happen *before*
we obtain the lock.
o The pthread_rwlock_* functions shall *never* return EINTR, so make
sure to requeue/resuspend the thread if it encounters such an error.
o Make a note that pthread_rwlock_unlock() needs to ensure it holds a
lock on an rwlock it tries to unlock. It will be implemented in a
separate commit because it requires some additional rwlock infrastructure.
associated floppy if needed into a static split_openfile() function.
- Use this function in splitfs_open() to open the first chunk rather
than using open() directly. This allows the first chunk to be located
on a different disk than the actual foo.split file.
getpwent(3) or getpwuid(3) when using NIS adjunct maps. The bug was
present in the internal `nis_passwd' function. The lookup in the
adjunct map used the name passed into `nis_passwd', however no name
was of course supplied by getpwent or getpwuid. Correctly use the
name from the `struct pwd' that was found instead.
PR: bin/59962
Submitted by: Gabriel Gomez <ggomez@fing.edu.uy>
in contributed sources with just a hack made possible
by bsd.sys.mk,v 1.33. This is better because it just
nulls out the warning flags rather than adding gcc(1)
specific -w option to CFLAGS.
must first attach to the traced process. If the tracing process
exits without detaching, the traced process will be killed rather
than continued. For the duration of the tracing session, the traced
process is reparented to the tracing process (with resulting expected
behaviors). It is permissible to trace more than one other process
at a time. When using waitpid() to monitor the behavior of the traced
process, signals are intercepted: they may optionally then be
forwarded using ptrace(). Signals are generated normally by and for
the process, but also by the tracing facility (SIGTRAP).
Product of: Suffering
Sponsored by: DARPA, AFRL
at it, use the ANSI C generic pointer type for the second argument,
thus matching the documentation.
Remove the now extraneous (and now conflicting) function declarations
in various libc sources. Remove now unnecessary casts.
Reviewed by: bde
incorrectly when encountering `large' groups (many members and/or many
long member names). The reporter tracked this down to the glibc NSS
module compatibility code (nss_compat.c): it would prematurely record
that a NSS module was finished iterating through its database in some
cases.
Two aspects are corrected:
1. nss_compat.c recorded that a NSS module was finished iterating
whenever the module reported something other than SUCCESS. The
correct logic is to continue iteration when the module reports
either SUCCESS or RETURN. The __nss_compat_getgrent_r and
__nss_compat_getpwent_r routines are updated to reflect this.
2. An internal helper macro __nss_compat_result is used to map glibc
NSS status codes to BSD NSS status codes (e.g. NSS_STATUS_SUCCESS ->
NS_SUCCESS). It provided the obvious mapping.
When a NSS routine is called with a too-small buffer, the
convention in the BSD NSS code is to report RETURN. (This is used
to implement reentrant APIs such as getpwnam_r(3).) However, the
convention in glibc for this case is to set errno = ERANGE and
overload TRYAGAIN. __nss_compat_result is updated to handle this
case.
PR: bin/60287
Reported by: Lachlan O'Dea <odela01@ca.com>
on a rwlock while there are writers waiting. We normally favor
writers but when a reader already has at least one other read lock,
we favor the reader. We don't track all the rwlocks owned by a
thread, nor all the threads that own a rwlock -- we just keep
a count of all the read locks owned by a thread.
PR: 24641
waiting on a locked mutex. This involves passing a struct timespec
from the pthread mutex locking interfaces all the way down to the
function that suspends the thread until the mutex is released.
The timeout is assumed to be an absolute time (i.e. not relative to
the current time).
Also, in _thread_suspend() make the passed in timespec const.
o Remove some code duplication between _thread_init(), which is run once
to initialize libthr and the intitial thread, and pthread_create(), which
initializes newly created threads, into a new function called from both
places: init_td_common()
o Move initialization of certain parts of libthr into a separate
function. These include:
- Active threads list and it's lock
- Dead threads list and it's lock & condition variable
- Naming and insertion of the initial thread into the
active threads list.
ó++ ABI document at http://www.codesourcery.com/cxx-abi/abi.html#dso-dtor
The ABI was initially defined for ia64, but GCC3 and Intel compilers
have adopted it on other platforms.
This is the patch from PR bin/59552 with a number of changes by
me.
PR: bin/59552
Submitted by: Bradley T Hughes (bhughes at trolltech dot com)
C++ ABI document at http://www.codesourcery.com/cxx-abi/abi.html#dso-dtor
The ABI was initially defined for ia64, but GCC3 and Intel compilers
have adopted it on other platforms.
This is the patch from PR bin/59552 with a number of changes by
me.
PR: bin/59552
Submitted by: Bradley T Hughes (bhughes at trolltech dot com)
work before anyways, and I didn't want to fix broken code I had no
way of testing. It was necessary however, in order to get rid of GIANT_LOCK.
Pthread priorities will have to wait a little longer to get fixed.
problems: (1) The wrong flag was being checked for in the attribute
(2) The pthread's state was not being set to indicate it was
suspended.
Noticed by: Igor Sysoev <is@rambler-co.ru>
call (pam_get_authtok() will return the previous token if try_first_pass
or use_first_pass is specified). Incidentally fix an ugly bug where the
buffer holding the prompt was freed immediately before use, instead of
after.
likely to be non-zero. When leaving the cancellation point, check
the return value against -1 to see if cancellation should be
checked. While I'm here, make the same change to connect() just
to be consisitent.
Pointed out by: davidxu
_thr_leave_cancellation_point to _thr_cancel_leave, add a parameter
to _thr_cancel_leave to indicate whether cancellation point should be
checked, this gives us an option to not check cancallation point if
a syscall successfully returns to avoid any leaks, current I have
creat(), open() and fcntl(F_DUPFD) to not check cancellation point
after they sucessfully returned.
Replace some members in structure kse with bit flags to same some
memory.
Conditionally compile THR_ASSERT to nothing if _PTHREAD_INVARIANTS is
not defined.
Inline some small functions in thr_cancel.c.
Use __predict_false in thr_kern.c for some executed only once code.
Reviewd by: deischen
flags. We now create asynchronous contexts or syscall contexts only.
Syscall contexts differ from the minimal ABI dictated contexts by
having the scratch registers saved and restored because that's where
we keep the syscall arguments and syscall return values.
Since this change affects KSE, have it use kse_switchin(2) for the
"new" syscall context.
Instead of just deleting it, turn the original page into a general
overview of the multibyte character conversion functions, somewhat
similar to stdio(3).
UTS with the stack correctly aligned. Also, while here, use an indirect
jump rather than the pushq/ret hack.
This fixes threaded apps that use floating point for me, although
it hasn't solved all the problems. It is an improvement though.
Preservation of the 128 byte red zone hasn't been resolved yet.
Approved by: re (scottl)
ABI-required stack alignment. C code expects that the push of the
return address disturbed the 16 byte alignment and it will take corrective
measures to fix it before making another call. Of course, if its wrong
to start with, then all hell breaks loose. Essentially we "fix" this
by making the stack alignment odd to start with.
This was one of the things that broke on libkse with apps that use
floating point/varargs/etc.
Approved by: re (scottl)
we can end up with some threads with a non-16-byte-aligned stack. This
causes some interesting side effects, including general protection
faults leading to a SIGBUS when doing floating point or varargs. This
should be just a verbose NOP for the other platforms.
Approved by: re (scottl)
to sendfile(2) being erroneously automatically restarted after a signal
is delivered. Fixed by converting ERESTART to EINTR prior to exiting.
Updated manual page to indicate the potential EINTR error, its cause
and consequences.
Approved by: re@freebsd.org
through branch predict as suggested in INTEL IA32 optimization guide.
2.Allocate siginfo arrary separately to avoid pthread to be allocated at
2K boundary, which hits L1 address alias problem and causes context
switch to be slow down.
3.Simplify context switch code by removing redundant code, code size is
reduced, so it is expected to run faster.
Reviewed by: deischen
Approved by: re (scottl)
in init_main_thread. Also don't initialize lock and lockuser again for initial
thread, it is already done by _thr_alloc().
Reviewed by: deischen
Approved by: re (scottl)
initialization overhead, there's a problem in that we never call
imalloc() and thus malloc_init() for zero-sized allocations. As a
result, malloc(0) returns NULL when it's the first or only malloc in
the program. Any non-zero allocation will initialize the malloc code
with the side-effect that subsequent zero-sized allocations return a
non-NULL pointer. This is because the pointer we return for zero-
sized allocations is calculated from malloc_pageshift, which needs
to be initialized at runtime on ia64.
The result of the inconsistent behaviour described above is that
configure scripts failed the test for a GNU compatible malloc. This
resulted in a lot of broken ports.
Other, even simpler, solutions were possible as well:
1. initialize malloc_pageshift with some non-zero value (say 13 for
8KB pages) and keep the runtime adjustment.
2. Stop using malloc_pageshift to calculate ZEROSIZEPTR.
Removal of the runtime adjustment was chosen because then ia64 is the
same as any other platform. It is not to say that using a page size
obtained at runtime is bad per se. It's that there's currently a high
level of gratuity for its existence and the moment it causes problems
is the moment you need to get rid of it. Hence, it's not unthinkable
that this commit is (partially) reverted some time in the future when
we do have a good reason for it and a good way to achieve it.
Approved by: re@ (rwatson)
Reported by: kris (portmgr@) -- may the ports be with you
that they will be installed before application constructors are invoked.
Its possible to link applications such that this fails, application code
is invoked before they are installed, but, well, Don't Do That.
Approved by: re (jhb)
was rejected as a range error, while any values less than LONG_MIN
were silently substituted with LONG_MIN. Furthermore, on some
platforms `time_t' has less range than `long' (e.g. alpha), which may
give incorrect results when parsing some strings.
context of sockets, and document EINVAL as a possible failure mode
based on the object selected, not just the label provided.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
SO_PEERLABEL. This provides an interface to query the label of a
socket peer without embedding implementation details of mac_t in
the application. Previously, sizeof(*mac_t) had to be specified
by an application when performing getsockopt().
Document mac_get_peer(3), and expand documentation of the other
mac_get(3) functions. Note that it's possible to get EINVAL back
from mac_get_fd(3) when pointing it at an inappropriate object.
NOTE: mac_get_fd() and mac_set_fd() support for sockets will
follow shortly, so the documentation is slightly ahead of the
code.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_free(3), which is used only for variables of type mac_t in
the FreeBSD implementation.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
constants NG_*SIZ that include the trailing NUL byte. This change
is mostly mechanical except for the replacement of a couple of snprintf()
and sprintf() calls with strlcpy.
is accessed for the first time as a result of an application looking
up label configuration information. Previously, the check and read
were kicked off by mac_prepare_(typename)() functions; since
mac_prepare_type() may now be directly employed by a user process,
push the check and initialization into that function.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Replace occurences of the magic constant 2 with an offsetof macro
call that computes the size of the leading members of the sockaddr.
Use strlcpy instead of sprintf where appropriate. Document the new changes
in the man page.
- In __sigreturn call sigprocmask() to restore our signal state rather than
returning through sigreturn(). jmp to ___sigreturn to restore our register
state following this.
Requested by: pete
symbols exported by newer versions of libc, and so we want applications
depending on the newer library code to be required to link against the
newer libc.
Discussed with: scottl, kris, imp
set NAS-IP-Address attribute in requests generated by the pam_radius
module. This attribute is mandatory for some Radius servers out there.
Reviewed by: des
MFC after: 2 weeks
on whether the parent chunk is of type whole. This also applies to
MBR slices for non-GPT disks. Since most of the GPT handling is
conditionally compiled, do the same with the partition naming.
This fixes a braino that caused slices to be named as GPT partitions
and generally messing up an install.
Pointy hat: marcel
string files (__SSTR flag set). This is necessary because __sputc()
does not respect the __SALC flag, and crashes trying to flush the buffer
instead of resizing it.
PR: 59167
sorting strings with common prefixes by noting
when all the strings land in just one bin.
Testing shows significant speedups (on the order of
30%) on strings with common prefixes and no slowdowns on any
of my test cases.
Submitted by: Markus Bjartveit Kruger <markusk@pvv.ntnu.no>
PR: 58860
Approved by: gordon (mentor)
by a parent that is a session leader (e.g., login shell) by ignoring
SIGHUP in before calling fork(2) and then restoring SIGHUP's action
after setsid(3). Based on the patch by Martin Kammerhofer
<mkamm@gmx.net>.
PR: bin/25462
Reviewed by: bde, alex.neyman@auriga.ru
signal handling mode, there is no chance to handle the signal, something
must be wrong in the library, just call kse_thr_interrupt to dump its core.
I have the code for a long time, but forgot to commit it.
Catch up with renaming of "Japanese" to "ja_JP.eucJP". Comment out the
statement that EUC is provided for compatibility with UNIX-based systems;
this is not a very good opening paragraph.
- fixed a length of the sadb extension in the case of pfkey_send_x5().
- used getprotobynumber() for printing a upper layer protocol name.
- modified the output format against the change of the setkey syntax
about a icmp6 type/code.
- don't enumerate reserved fields. use memset.
Obtained from: KAME
Aside from the POSIX requirements for pthread_atfork(), when
fork()ing, take the malloc lock to keep malloc state consistent
in the child.
Reviewed by: davidxu
it around an application's fork() call. Our new thread libraries
(libthr, libpthread) can now have threads running while another
thread calls fork(). In this case, it is possible for malloc
to be left in an inconsistent state in the child. Our thread
libraries, libpthread in particular, need to use malloc internally
after a fork (in the child).
Reviewed by: davidxu
mbstate_t object that they ignore. The zeroing is fairly expensive, and it
will never be necessary in these functions; when we support state-dependent
encodings, we will pass in a pointer to the file's mbstate_t object, and
only zero it at the time the file gets opened.
tcpdump -y ieee802_11 will work in the basic senses, including the
code compilation for filters (where you may specify "link[]" to refer
to parts of the 802.11 header, as well as treat it like a normal
Ethernet header). Previously, it was just too far off to do anything
useful for us.
* While I'm here, fix some compile problems that will result from lex
and yacc namespace polution when linking with -lpcap. The namespace
is now "pcapyy*" instead of "yy*", and it tests fine with world and
some external applications that may or may not use "yy*".
index referencing it. We need to know the original type and name
so that we know what to put in the table when we reconstruct it.
o Clear the table entries before we rebuild it to avoid that we
end up with stale data.
o Sequentially populate the table entries from the chunks. For the
chunks that have an index (now referencing the saved copy) we
use the saved type and name. This way we can handle unknown types
better. In all cases we update the start and end LBAs.
rather than generating an error. This is consistent with other tools
printing user and group names, and means you can read the ACL using
our tools rather than being up a creek.
PR: 56991
Submitted by: Michael Bretterklieber <mbretter@a-quadrat.at>
filling in the GPT entry. Both are already in sector numbers (LBA)
and exactly what we need for the entry. We now write a structurally
correct GPT partitioning.
part of the disk. The first appears to be a typo and instead of
dividing the media size with the sector size, we multiplied. The
second is an off-by-1 error that's the result of mixing up count
and index. The code in question is only applicable for virgin disks
and is used to create the "whole" chunk, which covers only the GPT
usable portion of the disk.
mbrtowc() and wcrtomb() directly. GB18030, GBK and UTF2 are left
unconverted; GB18030 will be done eventually, but GBK and UTF2 may just
be removed, as they are subsets of GB18030 and UTF-8 respectively.
platforms except ia64 and use Int_Open_Disk() in open_ia64_disk.c
on ia64. We need to know more than GEOM can provide us so we're
forced to read from the disk. Move uuid_type() to open_ia64_disk.c
and remove all references on non-ia64.
o Pass the GEOM conftxt to Int_Open_Disk() so that only Open_Disk()
needs to know about GEOM and libdisk can more easily be used with
media not handled by GEOM.
o Create an ia64 specific definiton of struct disk on ia64, because
we don't need/have most of the fields other platforms need and
other fields not applicable on platforms other than ia64.
o Do not compile change.c on ia64. It's too PC specific.
o In Fixup_Names() in create_chunk.c, try all partition numbers
that are valid for the GPT disk. We have the total number of
partitions that can be allocated in the disk structure on ia64.
Also, use the GPT partition naming if we're creating one under
a chunk of type "whole". It's a GPT partition in that case.
o In Create_Chunk(), compile-out the PC specific code on ia64 that
checks BIOS geometry restrictions.
o In Debug_Disk() in disk.c, dump the ia64 specific fields.
o Save the partition index in the chunk on ia64 so that we can
preserve it when we write the data back to disk. This avoids that
partitions get moved around or swapped after installing FreeBSD,
which may render a disk unusable.
Cyl_Aligned(), Prev_Cyl_Aligned() and Next_Cyl_Aligned() into
tautologies on ia64. GPT removes all notion of tracks, heads and
sectors per track, so there are no alignment considerations.
doesn't have any meaning and only results in lines longer than 80
characters.
o In Delete_Chunk2(), also look for chunks of type "part" under
chunks of type "whole" on ia64. They're not only under chunks of
type "freebsd" there.
as wrappers around the deprecated 4.4BSD rune functions. This paves the
way for state-dependent encodings, which the rune API does not support.
- Add __emulated_sgetrune() and __emulated_sputrune(), which are
implementations of sgetrune() and sputrune() in terms of
mbrtowc() and wcrtomb().
- Rename the old rune-wrapper mbrtowc() and wcrtomb() functions to
__emulated_mbrtowc() and __emulated_wcrtomb().
- Add __mbrtowc and __wcrtomb function pointers, which point to the
current locale's conversion functions, or the __emulated versions.
- Implement mbrtowc() and wcrtomb() as calls to these function pointers.
- Make the "NONE" encoding implement mbrtowc() and wcrtomb() directly.
All of this emulation mess will be removed, together with rune support,
in FreeBSD 6.
when the current implementation won't use it, anyway. Just pass NULL.
This will need to be changed when state-dependent encodings are
supported, but there's no need to take the performance hit
in the meantime.
in KAME implementation, even when no policy is installed
into kernel, getaddrinfo(3) sorts addresses. Since it
causes POLA violation, I modified to don't sort addresses
when no policy is installed into kernel,
Obtained from: KAME
This enable us to use /dev/fwmem* as a core file.
e.g.
ps -M /dev/fwmem0.0 -N kernel.debug
dmesg -M /dev/fwmem0.0 -N kernel.debug
gdb -k -c /dev/fwmem0.0 kernel.debug
You need to set target EUI64 in hw.firewire.fwmem.eui64_hi/lo before
opening the device. On the target arch, (PCI) bus address must be
equivalent to physical address.
(We cannot use this for sparc64 because of IOMMU.)
No objection in: -audit
send strhash(3) off to sleep with the fishes. Nothing in our tree uses it.
It has no documentation. It is nonstandard and in spite of the filename
strhash.c and strhash.h, it lives in application namespace by providing
compulsory global symbols hash_create()/hash_destroy()/hash_search()/
hash_traverse()/hash_purge()/hash_stats() regardless of whether you
#include <strhash.h> or not. If it turns out that there is a huge
application for this after all, I can repocopy it somewhere safer and
we can revive it elsewhere. But please, not in libc!
that are only in libc.so.5. This broke some 4.X applications linked
to libm and run under 5.X.
Background:
In C99, isinf() and isnan() cannot be implemented as regular
functions. We use macros that call libc functions in 5.X, but for
libm-internal use, we need to use the old versions until the next
time libm's major version number is bumped.
Submitted by: bde
Reported by: imp, kris
documented naming scheme (unfortunately the documentation isn't in the
tree as far as I can tell); no repocopy is required as there is no
history to preserve.
- replace simple and almost-correct implementation with slightly hackish
but definitely correct implementation (tested on i386, alpha, sparc64)
which requires pulling in fpmath.h and the MD _fpmath.h from libc.
- try not to make a mess of the Makefile in the process.
- enterprising minds are encouraged to implement more C99 long double
functions.
(aka RFC2292bis). Though I believe this commit doesn't break
backward compatibility againt existing binaries, it breaks
backward compatibility of API.
Now, the applications which use Advanced Sockets API such as
telnet, ping6, mld6query and traceroute6 use RFC3542 API.
Obtained from: KAME
the denormal/unnormal trap, is not a standard IEEE trap. We did
not exclude it from being returned by fpgetmask(), nor did we make
sure that fpsetmask() didn't clobber it. Since the non-IEEE trap
is not part of fp_except_t, users of ifpgetmask()/fpsetmask() would
be confronted with unexpected behaviour, one of which is a SIGFPE
for denormal/unnormal FP results.
This commit makes sure that we don't leak the denormal/unnormal mask
bit in fp_except_t and also that we don't clobber it.
closer to reality. More work remains to be done. st_mtime should
be the most complete based on IEEE Std 1003.1, 2003 Edition, a
review of ufs_vnops.c, and some experimentation.
about the fpu code here. It should be using fxsave/fxrstor instead of
saving/restoring the control word. The SSE registers are used a lot in
gcc generated code on amd64. I'm not sure how this all fits together
though.
section alignnment of 16 bytes for amd64 and this breaks file(1).
Before:
./cp: ELF 64-bit LSB executable, AMD x86-64, version 1 (FreeBSD), for \
FreeBSD 127.7.9, statically linked, stripped
after: ^^^^^^^
./ls: ELF 64-bit LSB executable, AMD x86-64, version 1 (FreeBSD), for \
FreeBSD 5.0.1, dynamically linked (uses shared libs), stripped
The reason for this is that the NOTE sections are not contiguous
internally. If the note section has an alignment of 16, then anything
that looks for the data is supposed to round up the payload start to
the next multiple of the alignment. But FreeBSD/amd64 broke because the
structure is declared as a single structure, not a (header,payload) group,
where the payload had an explicit alignment roundup.
The alternative is to change things like file(1) to ignore the ELF payload
alignment rules for the PT_NOTE section only for FreeBSD.
- fix hard sentence breaks
- sprinkle a few .Vt's where neccessary
- remove incorrect use of `\-'
- proper quoting using .Dq, instead of manual ``...''
Approved by: des@ (mentor)
Reviewed by: ru@