o Header file cleanup.
o bus_dma(9) conversion.
- Removed all consumers of vtophys(9) and converted to use
bus_dma(9).
- 64bit DMA support was disabled because DP83821 is not capable
of handling the DMA request. 64bit DMA request on DP83820
requires different descriptor structures and it's hard to
dynamically change descriptor format at run time so I disabled
it. Note, this is the same behavior as previous one but
previously nge(4) didn't explicitly disable 64bit mode on
DP83820.
- Added Tx/Rx descriptor ring alignment requirements(8 bytes
alignment).
- Limit maximum number of Tx DMA segments to 16. In fact,
controller does not seem to have limitations on number of Tx
DMA segments but 16 should be enough for most cases and
m_collapse(9) will handle highly fragmented frames without
consuming a lot of CPU cycles.
- Added Rx buffer alignment requirements(8 bytes alignment). This
means driver should fixup received frames to align on 16bits
boundary on strict-alignment architectures.
- Nuked driver private data structure in descriptor ring.
- Added endianness support code in Tx/Rx descriptor access.
o Prefer faster memory mapped register access to I/O mapped access.
Added fall-back mechanism to use alternative register access.
The hardware supports both memory and I/O mapped access.
o Added suspend/resume methods but it wasn't tested as controller I
have does not support PCI PME.
o Removed swap argument in nge_read_eeprom() since endianness
should be handled after reading EEPROM.
o Implemented experimental 802.3x full-duplex flow-control. ATM
it was commented out but will be activated after we have generic
flow-control framework in mii(4) layer.
o Rearranged promiscuous mode settings and simplified logic.
o Always disable Rx filter prior to changing Rx filter functions as
indicated in DP83820/DP83821 datasheet.
o Added an explicit DELAY in timeout loop of nge_reset().
o Added a sysctl variable dev.nge.%d.int_holdoff to control
interrupt moderation. Valid ranges are 1 to 255(default 1) in
units of 100us. The actual delivery of interrupt would be delayed
based on the sysctl value. The interface has to be brought down
and up again before a change takes effect. With proper tuning
value, users do not need to resort to polling(4) anymore.
o Added ALTQ(4) support.
o Added missing IFCAP_VLAN_HWCSUM as nge(4) can offload Tx/Rx
checksum calculation on VLAN tagged frames as well as VLAN tag
insertion/stripping. Also add IFCAP_VLAN_MTU capability as nge(4)
can handle VLAN tagged oversized frames.
o Fixed media header length for VLAN.
o Rearranged nge_detach routine such that it's now used for general
clean-up routine.
o Enabled MWI.
o Accessing EEPROM takes very long time so read 6 bytes ethernet
address with one call instead of 3 separate accesses.
o Don't set if_mtu in device attach, it's already set in
ether_ifattach().
o Don't do any special things for TBI interface. Remove TBI
specific media handling in the driver and have gentbi(4) handle
it. Add glue code to read/write TBI PHY registers in miibus
method. This change removes a lot of PHY handling code in driver
and now its functionality is handled by mii(4).
o Alignment fixup code is now applied only for strict-alignment
architectures. Previously the code was applied for all
architectures except i386. With this change amd64 will get
instant Rx performance boost.
o When driver fails to allocate a new mbuf, update if_qdrops so
users can see what was wrong in Rx path.
o Added a workaround for a hardware bug which resulted in short
VLAN tagged frames(e.g. ARP) was rejected as if runt frame was
received. With this workaround nge(4) now accepts the short VLAN
tagged frame and nge(4) can take full advantage of hardware VLAN
tag stripping. I have no idea how this bug wasn't known so far,
without the workaround nge(4) may never work on VLAN
environments.
o Fixed Rx checksum offload logic such that it now honors active
interface capability configured with ifconfig(8).
o In nge_start()/nge_txencap(), always leave at least one free
descriptor as indicated in datasheet. Without this the hardware
would be confused with ring descriptor structure(e.g. no clue
for the end of descriptor ring).
o Removed dead-code that checks interrupts on PHY hardware. The
code was designed to detect link state changes but it was
disabled as driving nge_tick clock would break auto-negotiation
timer. This code is no longer needed as nge(4) now uses mii(4)
and link state change handling is done with mii callback.
o Rearranged ethernet address programming logic such that it works
on strict-alignment architectures.
o Added IFCAP_VLAN_HWTAGGING/IFCAP_VLAN_HWCSUM handler in
nge_ioctl() such that the functionality is configurable with
ifconfig(8). DP83820/DP83821 can do checksum offload for VLAN
tagged frames so enable Tx/Rx checksum offload for VLAN
interfaces.
o Simplified IFCAP_POLLING selection logic in nge_ioctl().
o Fixed module unload panic when bpf listeners are active.
o Tx/Rx descriptor ring address uses 64bit DMA address for
readability. High address part of DMA would be 0 as nge(4)
disabled 64bit DMA transfers so it's ok for DP83821.
o Removed volatile keyword in softc as bus_dmamap_sync(9) should
take care of this.
o Removed extra driver private structures in descriptor ring. These
extra elements are not part of descriptor structure. Embedding
private driver structure into descriptor ring is not good idea
as its size may be different on 32bit/64bit architectures.
o Added miibus_linkchg method handler to catch link state changes.
o Removed unneeded nge_ifmedia in softc. All TBI access is handled
in gentbi(4). There is no difference between TBI and non-TBI case
now.
o Removed "gigabit link up" message handling in nge_tick. Link
state change notification is already performed by mii(4) and
checking link state by accessing PHY registers in periodic timer
handler of driver is wrong. All link state and speed/duplex
monitoring should be handled in PHY driver.
o Use our own timer for watchdog instead of if_watchdog/if_timer
interface.
o Added hardware MAC statistics counter, users canget current MAC
statistics from dev.nge.%d.stats sysctl node(%d is unit number of
a device).
o Removed unused macros, NGE_LASTDESC, NGE_MODE, NGE_OWNDESC,
NGE_RXBYTES.
o Increased number of Tx/Rx descriptors from 128 to 256. From my
experience on gigabit ethernet controllers, number of descriptors
should be 256 or higher to get an optimal performance on gigabit
link.
o Increased jumbo frame length to 9022 bytes to cope with other
gigabit ethernet drivers. Experimentation shows no problems with
9022 bytes.
o Removed unused member variables in softc.
o Switched from bus_space_{read|write}_4 to bus_{read|write}_4.
o Added support for WOL.
lock sysid to be used for non-nlm remote locking. This is required
for the experimental nfsv4 server, so that it can acquire byte
range locks correctly on behalf of nfsv4 clients.
Reviewed by: dfr
Approved by: kib (mentor)
route is also being deleted, the link-layer address table
(arp or nd6) will flush those L2 llinfo entries that match
the removed prefix.
Reviewed by: kmacy
o replace DLT_IEEE802_11 support in net80211 with DLT_IEEE802_11_RADIO
and remove explicit bpf support from wireless drivers; drivers now
use ieee80211_radiotap_attach to setup shared data structures that
hold the radiotap header for each packet tx/rx
o remove rx timestamp from the rx path; it was used only by the tdma support
for debugging and was mostly useless due to it being 32-bits and mostly
unavailable
o track DLT_IEEE80211_RADIO bpf attachments and maintain per-vap and
per-com state when there are active taps
o track the number of monitor mode vaps
o use bpf tap and monitor mode vap state to decide when to collect radiotap
state and dispatch frames; drivers no longer explicitly directly check
bpf state or use bpf calls to tap frames
o handle radiotap state updates on channel change in net80211; drivers
should not do this (unless they bypass net80211 which is almost always
a mistake)
o update various drivers to be more consistent/correct in handling radiotap
o update ral to include TSF in radiotap'd frames
o add promisc mode callback to wi
Reviewed by: cbzimmer, rpaulo, thompsa
when it runs out of clientids is reboot. I had replaced cpu_reboot()
with printf(), since cpu_reboot() doesn't exist for sparc64.
This change replaces the printf() with panic(), so the reboot
would occur for this highly unlikely occurrence.
Approved by: kib (mentor)
o Convert K&R function definitions to ANSI
o Eliminate spaces/tabs that should have been deleted as part of the de__P
efforts
o Use struct thread * in preference to d_thread_t *.
bug referencing a destroyed lock within TX callbacks during device
detach.
Submitted by: hps (original version)
Tested by: Lucius Windschuh <lwindschuh at googlemail.com>
shared uses of a resource are recorded on a sub-list hanging off
a main resource object on a main resource list;
without this change a shared resource (e.g. irq) is reported only
once by devinfo -r/-u;
with this change the resource is reported for each driver that
allocates it (which is even more than what vmstat -i -a reports).
Approved by: jhb (mentor)
Args argument is a pointer to the structure located in user space in
which the socketcall arguments are packed. The structure must be
copied to the kernel instead of direct dereferencing.
Approved by: kib (mentor)
MFC after: 1 week
net.inet.ip.fw.one_pass is a classic ip_input.c variable and is used in
the pfil and bridge code as well. As ipfw is loadable we need to always
provide it. That is the reason why it lives in struct vnet_inet and
not in struct vnet_ipfw.
machine check code. Disable it by default for now.
- When computing the mask of bits that determines a non-restartable event
during a machine check exception, or-in the overflow flag rather than
replacing the other flags.
PR: i386/134586 [2]
Submitted by: Andi Kleen andi-fbsd firstfloor.org
the NFSv4 Close operations until ncl_inactive(). This is
necessary so that the Open StateIDs are available for doing
I/O on mmap'd files after VOP_CLOSE(). I also changed some
indentation for the nfscl_getclose() function.
Approved by: kib (mentor)
possible future I-cache coherency operation can succeed. On ARM
for example the L1 cache can be (is) virtually mapped, which
means that any I/O that uses temporary mappings will not see the
I-cache made coherent. On ia64 a similar behaviour has been
observed. By flushing the D-cache, execution of binaries backed
by md(4) and/or NFS work reliably.
For Book-E (powerpc), execution over NFS exhibits SIGILL once in
a while as well, though cpu_flush_dcache() hasn't been implemented
yet.
Doing an explicit D-cache flush as part of the non-DMA based I/O
read operation eliminates the need to do it as part of the
I-cache coherency operation itself and as such avoids pessimizing
the DMA-based I/O read operations for which D-cache are already
flushed/invalidated. It also allows future optimizations whereby
the bcopy() followed by the D-cache flush can be integrated in a
single operation, which could be implemented using on-chips DMA
engines, by-passing the D-cache altogether.
register 0x52, not ctrl1. This appears to be a mistake in the bcm
reverse engineering page, and has been corrected there. Tracing
through the code, this is more in keeping with the "documented"
register. Sephe thinks it looks interesting and may be worth
fixing. :)
Submitted by: ddkprog at yahoo com
Reviewed by: Sepherosa Ziehau
affinity for the interrupt thread, and requesting that underlying
hardware direct interrupts to the CPU. For software interrupt
threads, implement a no-op interrupt event binder that returns
success, so that the interrupt management code will just set the
ithread's affinity and succeed.
Reviewed by: jhb
MFC after: 1 week
These sysctls don't need any form of locking. At least cp_times is used
by powerd very often, which means I get 50% less calls to non-MPSAFE
sysctls on my system. The other 50% is consumed by dev.cpu.0.freq, but
this seems to need Giant for Newbus.
-- A routing socket message is not generated when an IPv6 address is
either inserted or deleted from an interface. The missing routing
message problem was discovered by Randall Stewart and Michael Tuxen
during SCTP testing.
-- Previously when an IPv6 address is configured on an interface, if the
prefix length is /128, then a host route is instaleld in the kernel
for this address. But this host route is not deleted when that IPv6
address is removed from the interface.
-- Routes to the link-local all-nodes multicast address and the
interface-local all-nodes multicast address are not removed when
the last IPv6 address is removed from an interface.
Reviewed by: bz, gnn
- In bce_rx_intr(), use BUS_DMASYNC_POSTREAD instead of
BUS_DMASYNC_POSTWRITE, as we want to "read" from the
rx page chain pages.
- Document why we need to do PREWRITE after we have updated
the rx page chain pages.
- In bce_intr(), use BUS_DMASYNC_POSTREAD and
BUS_DMASYNC_PREREAD when before and after CPU "reading"
the status block.
- Adjust some nearby style mismatches/etc.
Pointed out by: yongari
Approved by: davidch (no objection) but bugs are mine :)
# Note: The driver doesn't support either these PHY types, so this is
# effectively a nop.
Submitted by: "ddk"
Obtained from: http://paradox.lissyara.su/bwi.diff
made LINT happy this does the proper looping over all vnets as we are
only called `globally' and not once per vnet instance.
Reported by: zec [1]
Missed by: bz [1] in r192264
Reviewed by: zec
believe it was a BCM4319. However, it is the a/b/g variation of the
BCM4318. The chip itself is labelled BCM4318EKFBG, and the board is
BCM94318MKABG.
Paradox's patch includes the type of 802.11 wireless for each card,
but changes all the names (I don't think the latter is quite right).
Import that part of the patch, but keep the current set of BCM names
(with a minor tweak for the 4306 ones). I'll need to verify them via
some other means.
Obtained from: http://paradox.lissyara.su/bwi.diff (partially)
Provide a more descriptive comment.
Eliminate dead code. The page cannot possibly have PG_ZERO set.
Eliminate unnecessary blank lines.
Reviewed by: tegge
experimental nfsv4 server. It was setting the a_id argument
to a fixed value, but that wasn't sufficient for FreeBSD8.
Instead, set l_pid and l_sysid to 0 plus set the F_REMOTE
flag to indicate that these fields are used to check for
same lock owner. Since, for NFSv4, a lockowner is a ClientID plus
an up to 1024byte name, it can't be put in l_sysid easily.
I also renamed the p variable to td, since it's a thread ptr.
Approved by: kib (mentor)
nfsrv_dolocallocks can be changed via sysctl. I also added some non-empty
descriptor strings and reformatted some overly long lines.
Approved by: kib (mentor)
This makes siginfo output look a lot better when pressing it the first
time when in sh(1), for example:
$ load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k
load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k
will now become:
$
load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k
load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k
- Only pick up PROC_LOCK once, which means we can drop the PGRP_LOCK
right after picking up PROC_LOCK for the first time.
- Print the process real time, making it consistent with tools like
time(1).
- Use `p' and `td' to reference the process/thread we are going to
print. Only use pick-variables inside the loops. We already did this
for the threads, but not the processes.
SOCK_NONBLOCK flags, that allow to save fcntl() calls.
Implement a variation of the socket() syscall which takes a flags
in addition to the type argument.
Approved by: kib (mentor)
MFC after: 1 month
Temporarily use 0 for pid member as the FreeBSD does not cache remote
UNIX domain socket peer pid.
PR: kern/102956
Reviewed by: rwatson
Approved by: kib (mentor)
MFC after: 1 month
required two changes: setting the program and version numbers before
connect and fixing the handling of the Null Rpc case in newnfs_request().
Approved by: kib (mentor)
experimental nfs subsystem from sys/fs/nfs/nfs.h and sys/fs/nfs/nfsproto.h
to sys/fs/nfs/nfsport.h and rename nfsstat to ext_nfsstat. This was done
so that src/usr.bin/nfsstat.c could use it alongside the regular nfs
include files and struct nfsstat.
Approved by: kib (mentor)
before the struct file is fully initialized in vn_open(), in particular,
fp->f_vnode is NULL. Other thread calling file operation before f_vnode
is set results in NULL pointer dereference in devvn_refthread().
Initialize f_vnode before calling d_fdopen() cdevsw method, that might
set file ops too.
Reported and tested by: Chris Timmons <cwt networks cwu edu>
(RELENG_7 version)
MFC after: 3 days
that expect that oldlen is filled with required buffer length even when
supplied buffer is too short and returned error is ENOMEM.
Redo the fix for kern.proc.filedesc, by reverting the req->oldidx when
remaining buffer space is too short for the current kinfo_file structure.
Also, only ignore ENOMEM. We have to convert ENOMEM to no error condition
to keep existing interface for the sysctl, though.
Reported by: ed, Florian Smeets <flo kasimir com>
Tested by: pho
Apart from the 16 virtual terminals, Syscons allocates two device nodes
that should not really be TTYs, even though they are. One of them is
consolectl. In RELENG_7 and before, these device nodes are used in
single user mode. After I simplified input path, we only use this device
node to call ioctl() on (moused, Xorg, vidcontrol).
When you call ioctl() on consolectl, it will behave the same as being
called on the first window.
drops and re-grabs the softc mutex in the middle, resulting in kernel
trap 12. This may happen when a lot of traffic is being hammered on
one bge(4) interface while the system is shutting down.
Reported by: Alexander Sack <pisymbol gmail com>
PR: kern/134548
MFC After: 2 weeks
sysctl requests to avoid wiring too much user memory. Only grab this
lock if the user's old buffer is larger than a page as a tradeoff to
allow more concurrency for common small requests.
- Just use a shared lock on the sysctl tree for user sysctl requests now.
MFC after: 1 week
- Remove vga0 and the disabled uart2/uart3 hints from both platforms.
- Remove hints for ISA adv0, bt0, aha0, aic0, ed0, cs0, sn0, ie0, fe0, and
le0 from i386. All these hints were marked 'disabled' and thus already
did not work "out of the box".
Discussed with: imp
With the arrival of 128+ cores it is necessary to handle more than that.
One of the first thing to change is the support for cpumask_t that needs
to handle more than 32 bits masking (which happens now). Some places,
however, still assume that cpumask_t is a 32 bits mask.
Fix that situation by using always correctly cpumask_t when needed.
While here, remove the part under STOP_NMI for the Xen support as it
is broken in any case.
Additively make ipi_nmi_pending as static.
Reviewed by: jhb, kmacy
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
- make mftb() shared, rewrite in C, provide complementary mttb()
- adjust SMP startup per the above, additional comments, minor naming
changes
- eliminate redundant TB defines, other minor cosmetics
Reviewed by: marcel, nwhitehorn
Obtained from: Freescale, Semihalf
chipset-specific code to attach chipset-specific data.
- Use chipset-specific data in the acard and promise chipsets rather than
changing the ivars of ATA PCI devices. ivars are reserved for use by the
parent bus driver and are _not_ available for use by devices directly.
This fixes a panic during sysctl -a with certain Promise controllers with
ACPI enabled.
Reviewed by: mav
Tested by: Magnus Kling (kingfon @ gmail) (on 7)
MFC after: 3 days
error due to copyout failure or short buffer.
The later breaks the usermode iterators of the sysctl results that pack
arbitrary number of variable-sized structures. Iterator expects that
kernel filled exactly oldlen bytes, and tries to interpret half-filled
or garbage structure at the end of the buffer. In particular,
kinfo_getfile(3) segfaulted.
Reported and tested by: pho
MFC after: 3 weeks
fget_unlocked().
- Save old file descriptor tables created on expansion until
the entire descriptor table is freed so that pointers may be
followed without regard for expanders.
- Mark the file zone as NOFREE so we may attempt to reference
potentially freed files.
- Convert several fget_locked() users to fget_unlocked(). This
requires us to manage reference counts explicitly but reduces
locking overhead in the common case.
new platform module. These are probed in early boot, and have the
responsibility of determining the layout of physical memory, determining
the CPU timebase frequency, and handling the zoo of SMP mechanisms
found on PowerPC.
Reviewed by: marcel, raj
Book-E parts by: raj
- For CPUs that only support MCE (the machine check exception) but not MCA
(i.e. Pentium), all this does is print out the value of the machine check
registers and then panic when a machine check exception occurs.
- For CPUs that support MCA (the machine check architecture), the support is
a bit more involved.
- First, there is limited support for decoding the CPU-independent MCA
error codes in the kernel, and the kernel uses this to output a short
description of any machine check events that occur.
- When a machine check exception occurs, all of the MCx banks on the
current CPU are scanned and any events are reported to the console
before panic'ing.
- To catch events for correctable errors, a periodic timer kicks off a
task which scans the MCx banks on all CPUs. The frequency of these
checks is controlled via the "hw.mca.interval" sysctl.
- Userland can request an immediate scan of the MCx banks by writing
a non-zero value to "hw.mca.force_scan".
- If any correctable events are encountered, the appropriate details
are stored in a 'struct mca_record' (defined in <machine/mca.h>).
The "hw.mca.count" is a count of such records and each record may
be queried via the "hw.mca.records" tree by specifying the record
index (0 .. count - 1) as the next name in the MIB similar to using
PIDs with the kern.proc.* sysctls. The idea is to export machine
check events to userland for more detailed processing.
- The periodic timer and hw.mca sysctls are only present if the CPU
supports MCA.
Discussed with: emaste (briefly)
MFC after: 1 month
direct dispatch policy for specific protocols (NETISR_USB). We leave
the additional 'flags' argument to netisr_register() for the time being,
even though it is no longer required.
NULL or change it. We initialize it before we set if_ioctl. It can
therefore never be NULL, and most other drivers don't bother with this
sanity check.
introduced in amd64 revision 1.540 and i386 revision 1.547. However, it
had no harmful effects until after a recent change, r189698, on amd64.
(In other words, the error is harmless in RELENG_7.)
The error is triggered by the failure to allocate a pv entry for the one
and only mapping in a page table page. I am addressing the error by
changing pmap_copy() to abort if either pv entry allocation or page
table page allocation fails. This is appropriate because the creation of
mappings by pmap_copy() is optional. They are a (possible) optimization,
and not a requirement.
Correct a nearby whitespace error in the i386 pmap_copy().
Crash reported by: jeff@
MFC after: 6 weeks
following changes:
Rename vfs_page_set_valid() to vfs_page_set_validclean() to reflect
what this function actually does. Suggested by: tegge
Introduce a new version of vfs_page_set_valid() that does no more than
what the function's name implies. Specifically, it does not update
the page's dirty mask, and thus it does not require the page queues
lock to be held.
Update two of the three callers to the old vfs_page_set_valid() to
call vfs_page_set_validclean() instead because they actually require
the page's dirty mask to be cleared.
Introduce vm_page_set_valid().
Reviewed by: tegge
registers contents.
- Use memory barriers to preserve the order of buffer space operations.
This might be needed if we'll ever use this driver on architectures
where ordering is not guaranteed.
major/minor numbers of the devices.
Pretending that the vnodes are character devices prevents file tree
walkers from descending into the directories opened by current process.
Also, not doing stat on the filedescriptors prevents the recursive entry
into the VFS.
Requested by: kientzle
Discussed with: Jilles Tjoelker <jilles stack nl>
representing too large integer, instead of overflowing and possibly
returning a random but valid vnode.
Noted by: Jilles Tjoelker <jilles stack nl>
MFC after: 3 days
to a non loopback/ppp link types) through the loopback interface. Prior
to the new L2/L3 rewrite, this host route is implicitly added by the L2
code during RTM_RESOLVE of that interface address. This host route is
deleted when that interface is removed.
Reviewed by: kmacy
commit and fix it correctly by removing the _KERNEL check entirely. Now
the kernel always sees the same value of NULL as userland meaning that it
sees __null, 0, or 0L when compiled as C++, and '(void *)0' when compiled
as C.
As an experiment, I changed snp(4) to use a mutex instead of an sx lock.
We can't enable this right now, because Syscons still picks up Giant.
It's nice to already have the framework there.
'(void *)0' for NULL for C++ compilers compiling kernel code. Together this
makes it easier to build kernel modules using C++.
Reviewed by: imp
MFC after: 3 days
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.
VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.
suffer from the race condition that motivated revision 1.94. Consequently,
the work-around that was implemented by revision 1.94 is no longer needed.
Moreover, reverting this work-around eliminates the need for
vfs_busy_pages() to acquire the page queues lock when preparing a buffer
for read.
Reviewed by: tegge
directly in cpu_reset() in order to idle the APs before exiting
the kernel and letting the BSP enter the firmware so that processes
like init(8) which still might be running on an AP at that point
don't cause a panic there when it crashes due to the fact it no
longer can be supported by the kernel.
MFC after: 3 days
adapted to MPSAFE cam(4) to a isp(4) specific callout structure.
Thanks to Florian Smeets for providing access to a machine exhibiting
this problem for debugging.
Approved by: mjacob
MFC after: 3 days
to 2.4.0, as it has appeared in the 2.4.0-rc7 first time.
Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK),
glibc falls back to the hard-coded CLK_TCK value when aux entry
is not present.
Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value.
For older applications/libc's which depends on hard-coded CLK_TCK
value user should set compat.linux.osrelease less than 2.4.0.
Approved by: kib (mentor)
designation of the emulated kernel version.
linux_kernver() returns integer value formatted as 'VVVMMMIII' where
VVV - version, MMM - major revision, III - minor revision.
Approved by: kib (mentor)
The frequency of the statistics clock is given by stathz.
Use stathz if it is available, otherwise use hz.
Pointed out by: bde
Approved by: kib (mentor)
- Just use #error when including <sys/ioctl.h> in the kernel. Code
hasn't used this header for years now and probably doesn't compile
anyway, because of -Werror.
- Get rid of struct ttysize, TIOCGSIZE and TIOCSSIZE. All code nowadays
use both TIOC[GS]SIZE and TIOC[GS]WINSZ. Because we have other popular
systems that don't implement the first, it's of little use to support
interfaces nowadays.
Credential might need to hang around longer than its parent and be used
outside of mnt_explock scope controlling netcred lifetime. Use separate
reference-counted ucred allocated separately instead.
While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up
some unused declarations in new NFS code.
Reported by: John Hickey
PR: kern/133439
Reviewed by: dfr, kib
At first I allowed ioctl_compat.h to be included, but it just returned
an empty file. I had to do this, to keep kdump happy. I really want to
raise a compiler error when including this header, so now it will just
throw an error if you don't set COMPAT_43TTY.
that vnode_pager_input_smlfs() zeroes the page, it should not mark the page
as valid until after the page is zeroed. Otherwise, the page could be
mapped for read access (e.g., by vm_map_pmap_enter()) before the page is
zeroed. Reviewed by: tegge
Eliminate gratuitous clearing of the page's dirty mask by
vnode_pager_input_smlfs(). Instead, assert that the page is clean.
Reviewed by: tegge
Eliminate some blank lines.
Eliminate pointless calls to pmap_clear_modify() and vm_page_undirty() from
vnode_pager_input_old(). The page is not mapped. Therefore, it cannot have
any page table entries that are modified.
Eliminate an incorrect comment from vnode_pager_generic_getpages().
'net.inet.ip.fw.default_to_accept'. The current value can also be queried
via a read-only sysctl of the same name.
Requested by: plosher
MFC after: 1 week
I really don't want any pieces of code to include ioctl_compat.h, so let
the ibcs2 and svr4 compat leave sgtty alone. If they want to support
sgtty, they should emulate it on top of termios, not sgtty.
The code has been marked with BURN_BRIDGES for a long time. ibcs2 and
svr4 are not really popular pieces of code anyway.
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.
As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.
Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage
branch.
Permit kldload / kldunload operations to be executed only from the default
vimage context.
This change should have no functional impact on nooptions VIMAGE kernel
builds.
Reviewed by: bz
Approved by: julian (mentor)
This is based on a fix that went in to opensolaris on March 9th. However, it uses a dedicated
thread instead of a Solaris' taskq to avoid doing a blocking memory allocation with the vnode
interlock held.
This fixes a long-time deadlock in ZFS. This is not, strictly speaking, an LOR. The spa_zio
thread releases a vnode, this calls in to vn_reclaim which in turn needs to acquire range locks
to sync dirty data out to disk. The range locks are already held by a user-level process waiting
on a condition variable that it the process is waiting on a spa_zio thread to signal it on. The
process could not be signalled because the spa_zio thread could not proceed.
The nature of this problem was not apparent due to ZFS locks opting out of witness which meant
that DDB did not know about the locks that were held by ZFS.
Reviewed by: pjd
MFC after: 7 days
OSD-based jail extensions. This allows the Linux MIB to accessed via
jail_set and jail_get, and serves as a demonstration of adding jail support
to a module.
Reviewed by: dchagin, kib
Approved by: bz (mentor)
It turns out if we called cfmakeraw() on a TTY with only a rint handler
in place, it could inject data into the TTY, even though it should be
redirected. Always take a look at the hooks before looking at the
termios flags.
which is available for Glibc as sysconf(_SC_CLK_TCK). If AT_CLKTCK entry is
not exported, Glibc uses 100.
linux_times() shall use the value that is exported to user space.
Pointyhat to: dchagin
PR: kern/134251
Approved by: kib (mentor)
MFC after: 2 weeks
The entire world seems to use the non-standard TIOCSCTTY ioctl to make a
TTY a controlling terminal of a session. Even though tcsetsid(3) is also
non-standard, I think it's a lot better to use in our own source code,
mainly because it's similar to tcsetpgrp(), tcgetpgrp() and tcgetsid().
I stole the idea from QNX. They do it the other way around; their
TIOCSCTTY is just a wrapper around tcsetsid(). tcsetsid() then calls
into an IPC framework.
Use the protocol family constants for the domain argument validation.
Return EAFNOSUPPORT in case when the incorrect domain argument
is specified.
Return EPROTONOSUPPORT instead of passing values that are not 0
to the BSD layer.
Suggested by: rwatson
Approved by: kib (mentor)
MFC after: 1 month
sc_rixmap is an inverse map
NB: could eliminate the check for an invalid rate by filling in 0 for
invalid entries but the rate control modules use it to identify
bogus rates so leave it for now
This is necessary for two reasons:
1) In order to avoid collisions with the use of a BIOs flags set by a consumer
or a provider
2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags was
available, so the consumer flags of a BIO had to be misused in order to
support enough flags. The new queue makes it possible to recycle the
GV_BIO_DONE flag into GV_BIO_GROW.
As a consequence, gvinum will now work with any other GEOM class under it or
on top of it.
- Use bio_pflags for storing internal flags on downgoing BIOs, as the requests
appear to come from a consumer of a gvinum volume. Use bio_cflags only for
cloned BIOs.
- Move gv_post_bio to be used internally for maintenance requests.
- Remove some cases where flags where set without need.
PR: kern/133604
as they do not really relate and to prepare for an additional queue to be
covered by the BIO queue mutex.
- Implement wrappers for fetching the next element from the event queue as well
as for putting a new element into the BIO queue.
(i.e. seems to be) already set.
This should reduce console noise due to curvnet recursion reports.
This change has no impact on nooptions VIMAGE builds.
Approved by: julian (mentor)