Commit graph

73908 commits

Author SHA1 Message Date
Max Laier
b5a6cecbcd If we cannot immediately get the pf_consistency_lock in the purge thread,
restart the scan after acquiring the lock the hard way.  Otherwise we might
end up with a dead reference.

Reported by:		pfsense
Reviewed by:		eri
Initial patch by:	eri
Tested by:		pfsense
Approved by:		re (kib)
2009-08-19 00:10:10 +00:00
Stanislav Sedov
7f21e273a8 - Do not try to reevaluate current RX production index on each
loop iteration as it can be updated by the card while we
  process the RX ring forcing us to process RX descriptors
  for which DMA synchronisation operation has not been
  performed.  This fixes the bug when bge(4) drops packets
  under high load.

Discussed with:	yongari, marius
Approved by:	re (kib)
MFC after:	1 week
2009-08-18 21:07:39 +00:00
Kip Macy
3ee42584f9 - change the interface to flowtable_lookup so that we don't rely on
the mbuf for obtaining the fib index
 - check that a cached flow corresponds to the same fib index as the
   packet for which we are doing the lookup
 - at interface detach time flush any flows referencing stale rtentrys
   associated with the interface that is going away (fixes reported
   panics)
 - reduce the time between cleans in case the cleaner is running at
   the time the eventhandler is called and the wakeup is missed less
   time will elapse before the eventhandler returns
 - separate per-vnet initialization from global initialization
   (pointed out by jeli@)

Reviewed by:	sam@
Approved by:	re@
2009-08-18 20:28:58 +00:00
Pyun YongHyeon
44d9075392 Backout r193289. r193289 restored page select bits to previous
value instead of blindly resetting it to 0. However, it seems page
select bits of some 88E1116 PHY is initialized to invalid one such
that restoring page select bits after programming broke MII
register access. The correct solution would be reset page select
bits to 0 in PHY attach stage but it would require more testing.
Since we're in BETA stage such a change would be dangerous so just
back it out.
This change should fix nfe(4) breakage on NVIDIA MCP55.

Reported by:	Ryan Rogers < webmaster <> doghouserepair dot com >
		Sam Fourman Jr. < sfourman <> gmail dot com >
Tested by:	Ryan Rogers < webmaster <> doghouserepair dot com >
		Sam Fourman Jr. < sfourman <> gmail dot com >
Approved by:	re (kib)
2009-08-18 20:20:15 +00:00
Michael Tuexen
627dfd6df9 Fix a crash when using one-to-one stlye socket in non-blocking
mode and there is no listening server.
PR: 137795
Approved by: re, rrs (mentor)
MFC after:immediately.
2009-08-18 19:58:49 +00:00
Pawel Jakub Dawidek
e477e4fe8e Remove unused taskqueue_find() function.
Reviewed by:	dfr
Approved by:	re (kib)
2009-08-18 13:55:48 +00:00
Alexander Motin
5daa555d8d Fix copy/paste bug, that requests data read during ATA device probe sequence
for ATA_SETFEATURES/ATA_SF_SETXFER command which by definition transfers no
data. Most of controllers are irrelevant to this bug, but some nVidia's
doesn't.

Tested on:	current@
Approved by:	re (kib)
2009-08-18 09:27:17 +00:00
Alexander Motin
33ea30fed7 Fix iSCSI initiator and vpo driver operation, broken by CAM changes.
Reviewed by:	scottl, Danny Braniss
Approved by:	re (rwatson)
2009-08-18 08:46:54 +00:00
Kip Macy
d53e359b9a fix netboot issue by disabling flowtable lookups until initialization has been run
Reviewed by:	rwatson@
Approved by:	re@
2009-08-17 19:09:28 +00:00
Attilio Rao
353998acc3 * Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
a pointer-fetching specific operation check. Consequently, rename the
  operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by checking
  directly alignment on the word boundry, for all the given specific
  architectures. That's a bit too strict for some common case, but it
  assures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)
2009-08-17 16:17:21 +00:00
Marcel Moolenaar
8530137252 The start of the EFI GPT partition in the PMBR can always be represented
by CHS addressing. Don't define these fields as 0xff, but rather define
them correctly. This prevents boot problems on PCs where GPT is being
used.

PR:		115406
Submitted by:	Kent Hauser <kent@khauser.net>
Approved by:	re (kib)
2009-08-17 16:16:46 +00:00
Rick Macklem
60f04e38ca Apply the same patch as r196205 for nfs_upgrade_lock() and
nfs_downgrade_lock() to the experimental nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-08-17 16:12:28 +00:00
John Hay
e8c91ce393 Fix parse() so that the partition to boot (load /boot/loader) from can
be set. The syntax as printed in main() is used: 0:ad(0p3)/boot/loader

Reviewed by:	jhb
Approved by:	re (kib)
2009-08-17 15:19:03 +00:00
Konstantin Belousov
faccac2d45 Correct a critical accounting error in pmap_demote_pde(). Specifically,
when pmap_demote_pde() allocates a page table page to implement a
user-space demotion, it must increment the pmap's resident page count.
Not doing so, can lead to an underflow during address space termination
that causes pmap_remove() to exit prematurely, before it has destroyed
all of the mappings within the specified range.  The ultimate effect or
symptom of this error is an assertion failure in vm_page_free_toq()
because the page being freed is still mapped.

This error is only possible when superpage promotion is enabled.  Thus,
it only affects FreeBSD  versions greater than 7.2.

Tested by:	pho, alc
Reviewed by:	alc
Approved by:	re (rwatson)
MFC after:	1 week
2009-08-17 13:27:55 +00:00
Rui Paulo
fbe007da74 Fix a typo in ifdef mesh support. This would make mesh unworkable if
TDMA support was compiled out.

Approved by:	re (kib)
2009-08-17 12:57:57 +00:00
Pawel Jakub Dawidek
8e9fd65fbf getcwd() (when __getcwd() fails) works by stating current directory, going up
(..), calling readdir and looking for previous directory inode.  In case of
.zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it
won't be visible in readdir output.

Fix this by implementing VPTOCNP for snapshot directories, so __getcwd()
doesn't fail and getcwd() doesn't have to use readdir method.

This fixes /bin/pwd from within .zfs/snapshot/<name>/.

Suggested by:	kib
Approved by:	re (rwatson)
2009-08-17 10:00:18 +00:00
Pawel Jakub Dawidek
8461b0f043 Manage asynchronous vnode release just like Solaris.
Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 09:48:34 +00:00
Pawel Jakub Dawidek
e35eb914f4 - Reduce z_teardown_lock lock scope a bit.
- The error variable is int, not bool.
- Convert spaces to tabs where needed.

Approved by:	re (kib)
2009-08-17 09:28:15 +00:00
Pawel Jakub Dawidek
0330a5dc10 If z_buf is NULL, we should free znode immediately.
Noticed by:	avg
Approved by:	re (kib)
2009-08-17 09:25:37 +00:00
Pawel Jakub Dawidek
d83cfc37a4 - We need to recycle vnode instead of freeing znode.
Submitted by:	avg

- Add missing vnode interlock unlock.
- Remove redundant znode locking.

Approved by:	re (kib)
2009-08-17 09:21:39 +00:00
Pawel Jakub Dawidek
f820bc079f Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have
0 usecount.

Reported by:	Thomas Backman <serenity@exscape.org>
Approved by:	re (kib)
2009-08-17 09:13:22 +00:00
Pawel Jakub Dawidek
159ef108e1 Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make it possible implement taskqueue_member() function which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:01:20 +00:00
Pawel Jakub Dawidek
6a3b289388 Because taskqueue_run() can drop tq_mutex, we need to check if the
TQ_FLAGS_ACTIVE flag wasn't removed in the meantime, which means we missed a
wakeup.

Approved by:	re (kib)
2009-08-17 08:42:34 +00:00
Pawel Jakub Dawidek
fddc954016 - Fix a race where /dev/zfs control device is created before ZFS is fully
initialized. Also destroy /dev/zfs before doing other deinitializations.
- Initialization through taskq is no longer needed and there is a race
  where one of the zpool/zfs command loads zfs.ko and tries to do some work
  immediately, but /dev/zfs is not there yet.

Reported by:	pav
Approved by:	re (kib)
2009-08-17 08:36:41 +00:00
Pawel Jakub Dawidek
830940567b Remove files that are no longer used.
Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 08:03:02 +00:00
Ed Schouten
f6b3749db3 Fix small style regression introduced by the MPSAFE newbus code.
Approved by:	re (rwatson)
2009-08-16 19:55:53 +00:00
Andrew Thompson
532b195250 Change the usb workers from kernel processes to threads, this is mostly a
cosmetic change to reduce cruft in the proc table.

Also change the idle wait message to `-` like how taskqueues are.

Reviewed by:	julian
Approved by:	re (kib)
2009-08-16 14:13:55 +00:00
Marcel Moolenaar
538e86713d Fix misalignment in nvpair_native_embedded() caused by the compiler
replacing the bzero(). See also revision 195627, which fixed the
misalignment in nvpair_native_embedded_array().

Approved by:	re (kensmith)
2009-08-16 01:48:46 +00:00
Marcel Moolenaar
97e84697ae Decouple ACPI CPU Ids from FreeBSD's cpuid. The ACPI Ids can be
sparse, which causes a kernel assert.

Approved by:	re (kensmith)
2009-08-16 01:43:08 +00:00
Robert Watson
a38fdf2946 Rather than fix questionable ifnet list locking in the implementation of
the kern.polling.enable sysctl, remove the sysctl.  It has been deprecated
since FreeBSD 6 in favour of per-ifnet polling flags.

Reviewed by:	luigi
Approved by:	re (kib)
2009-08-15 23:07:43 +00:00
Robert Watson
d931ea0961 Remove unused if_rawoutput() macro; it has been unused since at least
FreeBSD 2.

Approved by:	re (kib)
2009-08-15 22:26:26 +00:00
Michael Tuexen
810ec53688 * Fix a bug where PR-SCTP settings are ignore when using implicit
association setup.
* Fix a bug where message with illegal stream ids are not deleted.
* Fix a crash when reporting back unsent messages from the send_queue.
* Fix a bug related to INIT retransmission when the socket is already
  closed.
* Fix a bug where associations were stalled when partial delivery API
  was enabled.
* Fix a bug where the receive buffer size was smaller than the
  partial_delivery_point.

Approved by: re, rrs (mentor)
MFC after: One day.
2009-08-15 21:10:52 +00:00
Attilio Rao
68a5590836 Port recent IPI enhachements to en:
* Introduce the ipi_nmi_handler() function for the Xen infrastructure
* Fixup adeguately the ipi sender functions

Approved by:	re (kib)
2009-08-15 18:37:06 +00:00
Stanislav Sedov
2dc8b759bd - Proprely intialize UART parameters at probe stage, so uart(4)
will initialize the FIFO memory correctly on attach.  Before
  that this values was intialized in only in at91_usart_bus_attach
  which is called after the uart(4) memory allocation happens.

Approved by:	re (kib)
MFC after:	1 week
2009-08-15 15:15:20 +00:00
Qing Li
3ef5e21d01 In function ip_output(), the cached route is flushed when there is a
mismatch between the cached entry and the intended destination. The
cached rtentry{} is flushed but the associated llentry{} is not. This
causes the wrong destination MAC address being used in the output
packets. The fix is to flush the llentry{} when rtentry{} is cleared.

Reviewed by:	kmacy, rwatson
Approved by:	re
2009-08-14 23:44:59 +00:00
Marko Zec
9abb486279 Appease VNET_DEBUG - in if_vmove we temporarily switch i.e.
recurse from one vnet to another which is OK, so no need
to flood the console with warnings here.

Approved by:	re (rwatson), julian (mentor)
2009-08-14 22:46:45 +00:00
Marko Zec
f92ae4d706 SCTP is not yet compatible with options VIMAGE kernels although it compiles
with VIMAGE defined, so explicitly disallow building such kernels.

Reviewed by:	rrs
Approved by:	re (rwatson), julian (mentor)
2009-08-14 22:43:25 +00:00
Marko Zec
67addcde86 Make VNET_DEBUG a standalone compile-time option, i.e. decouple it from
INVARIANTS.

Reviewed by:	bz
Approved by:	re (rwatson), julian (mentor)
2009-08-14 22:41:39 +00:00
Bjoern A. Zeeb
8d518523cc Add a new macro to test that a variable could be loaded atomically.
Check that the given variable is at most uintptr_t in size and that
it is aligned.

Note: ASSERT_ATOMIC_LOAD() uses ALIGN() to check for adequate
      alignment -- however, the function of ALIGN() is to guarantee
      alignment, and therefore may lead to stronger alignment
      enforcement than necessary for types that are smaller than
      sizeof(uintptr_t).

Add checks to mtx, rw and sx locks init functions to detect possible
breakage. This was used during debugging of the problem fixed with
r196118 where a pointer was on an un-aligned address in the dpcpu area.

In collaboration with:	rwatson
Reviewed by:		rwatson
Approved by:		re (kib)
2009-08-14 21:46:54 +00:00
John Baldwin
21157ad3b1 Adjust the handling of the local APIC PMC interrupt vector:
- Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc()
  routines in the local APIC code that the hwpmc(4) driver can use to
  manage the local APIC PMC interrupt vector.
- Do not enable the local APIC PMC interrupt vector by default when
  HWPMC_HOOKS is enabled.  Instead, the hwpmc(4) driver explicitly
  enables the interrupt when it is succesfully initialized and disables
  the interrupt when it is unloaded.  This avoids enabling the interrupt
  on unsupported CPUs which may result in spurious NMIs.

Reported by:	rnoland
Reviewed by:	jkoshy
Approved by:	re (kib)
MFC after:	2 weeks
2009-08-14 21:05:08 +00:00
Konstantin Belousov
165a3b418f When a UFS node is truncated to the zero length, e.g. by explicit
truncate(2) call, or by being removed or truncated on open, either
new softupdate freeblks structure is allocated to track the freed
blocks of the node, or truncation is done syncronously when too many SU
dependencies are accumulated. The decision does not take into account
the allocated freeblks dependencies, allowing workloads that do huge
amount of truncations to exhaust the kernel memory.

Take the number of allocated freeblks into consideration for
softdep_slowdown().

Reported by:	pluknet gmail com
Diagnosed and tested by:	pho
Approved by:	re (rwatson)
MFC after:	1 month
2009-08-14 11:00:38 +00:00
Konstantin Belousov
48bd6d4a49 In nfs_upgrade_vnlock(), assert that the vnode is locked. It is for all
pathes, as far as I see and testing seems to confirm it. Comparision of
old_lock with LK_SHARED make sense only if vnode is locked by current
thread.

When downgrading, pass LK_RETRY to the vn_lock(), since otherwise
vn_lock() unlocks the doomed vnode, causing extra unlock.

Reported and tested by:	pho
Approved by:	re (rwatson)
MFC after:	3 weeks
2009-08-14 10:59:17 +00:00
Konstantin Belousov
aabd624cf4 Add the address of the lock to the KTR_LOCK trace.
Tested by:	pho
Approved by:	re (rwatson)
2009-08-14 10:57:57 +00:00
Konstantin Belousov
8f40845151 Correctly handle unlock for !MAKEENTRY case, after successfull attempt of
lock upgrade cache shall be unlocked from write.

Reported by:	Lucius Windschuh <lwindschuh googlemail com>
Reviewed by:	kan
Approved by:	re (rwatson)
2009-08-14 10:57:28 +00:00
Julian Elischer
72034f5548 Fix ipfw crash on uid or gid check.
Receiving any ip packet for which there is no existing socket will
crash if ipfw has a uid or gid test rule, as the uid/gid
of the non existent owner of said non existent socket is tested.
Brooks introduced this error as part of his >16 gids patch.
It appears to be a cut-n-paste error from similar code a few lines
before. The old code used the 'pcb' variable here, but in the
new code that switched the 'inp' variable, which is often NULL
and what is tested in the code further up. The rest of the multi-gid
patch for ipfw seems solid (and cleaner than previous code).

Reviewed by:	brooks
Approved by:	re (rwatson)
2009-08-14 10:09:45 +00:00
Scott Long
763fae7918 ntroduce mfiutil, a basic utility for managing LSI SAS-RAID & Dell PERC5/6
controllers.  Controller, array, and drive status can be checked, basic
attributes can be changed, and arrays and spares can be created and deleted.
Controller firmware can also be flashed.

This does not replace MegaCLI, found in ports, as that is officially sanctioned
and supported by LSI and includes vastly more functionality.  However, mfiutil
is open source and guaranteed to provide basic functionality, which can be
especially useful if you have a problem and can't get MegaCLI to work.

Approved by:    re
Obtained from:  Yahoo! Inc.
2009-08-13 23:18:45 +00:00
Attilio Rao
dc6fbf6545 * Completely Remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using a NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions.  This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Adds an new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when disponible. In other cases it just does match a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
architectures, while on the other has a normal IPI_STOP effect. It is
responsibility of maintainers to eventually implement an hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
function called stop_cpus_hard(). That is specular to stop_cpu() but
it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by:	jhb
Tested by:	pho, bz, rink
Approved by:	re (kib)
2009-08-13 17:09:45 +00:00
Rafal Jaworowski
cbb3cb151c Use correct wbinv operation in pmap_l2cache_wbinv_range().
Submitted by:	Michal Hajduk
Reviewed by:	stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-08-13 15:56:09 +00:00
Edward Tomasz Napierala
abd370a36b Remove CDDL warning.
Approved by:	re (kib), core
2009-08-13 12:28:30 +00:00
Bjoern A. Zeeb
eb79e1c76e Make it possible to change the vnet sysctl variables on jails
with their own virtual network stack. Jails only inheriting a
network stack cannot change anything that cannot be changed from
within a prison.

Reviewed by:	rwatson, zec
Approved by:	re (kib)
2009-08-13 10:26:34 +00:00