numerous error recovery buglets.
Many thanks to Tor Egge for his assistance in diagnosing problems with
the error recovery code.
aic7xxx.c:
Report missed bus free events using their own sequencer interrupt
code to avoid confusion with other "bad phase" interrupts.
Remove a delay used in debugging. This delay could only be hit
in certain, very extreme, error recovery scenarios.
Handle transceiver state changes correctly. You can now
plug an SE device into a hot-plug LVD bus without hanging
the controller.
When stepping through a critical section, panic if we step
more than a reasonable number of times.
After a bus reset, disable bus reset interupts until we either
our first attempt to (re)select another device, or another device
attemps to select us. This removes the need to busy wait in
kernel for the scsi reset line to fall yet still ensures we
see any reset events that impact the state of either our initiator
or target roles. Before this change, we had the potential of
servicing a "storm" of reset interrupts if the reset line was
held for a significant amount of time.
Indicate the current sequencer address whenever we dump the
card's state.
aic7xxx.reg:
Transceiver state change register definitions.
Add the missed bussfree sequencer interrupt code.
Re-enable the scsi reset interrupt if it has been
disabled before every attempt to (re)select a device
and when we have been selected as a target.
When being (re)selected, check to see if the selection
dissappeared just after we enabled our bus free interrupt.
If the bus has gone free again, go back to the idle loop
and wait for another selection.
Note two locations where we should change our behavior
if ATN is still raised. If ATN is raised during the
presentation of a command complete or disconnect message,
we should ignore the message and expect the target to put
us in msgout phase. We don't currently do this as it
requires some code re-arrangement so that critical sections
can be properly placed around our handling of these two
events. Otherwise, we cannot guarantee that the check of
ATN is atomic relative to our acking of the message in
byte (the kernel could assert ATN).
Only set the IDENTIFY_SEEN flag after we have settled
on the SCB for this transaction. The kernel looks at
this flag before assuming that SCB_TAG is valid. This
avoids confusion during certain types of error recovery.
Add a critical section around findSCB. We cannot allow
the kernel to remove an entry from the disconnected
list while we are traversing it. Ditto for get_free_or_disc_scb.
aic7xxx_freebsd.c:
Only assume that SCB_TAG is accurate if IDENTIFY_SEEN is
set in SEQ_FLAGS.
Fix a typo that caused us to execute some code for the
non-SCB paging case when paging SCBs. This only occurred
during error recovery.
block bitmaps before unmount() completes. They were written using
bdwrite(), so they were normally written less than 32 seconds after
unmount(), but this is too late if the media is removed or the system
is rebooted soon after unmount(). sync()ing before unmount() didn't
help, because ext2fs uses buggy private caching for these blocks --
it doesn't even bdwrite() them until they are uncached or the filesystem
is unmounted. sync()ing after unmount() didn't help, because sync()
only applies to (vnodes for) mounted filesystems.
PR: 22726
As of this patchset, the loader builds (under NetBSD/macppc), boots, interacts
and talks to BOOTP/NFS servers.
(main.c was moved from boot/ofw/libofw to boot/ofw/common but has no revision
history)
Reviewed by: obrien
This brings the loader up to the point where I can compile it under
NetBSD/macppc and have it boot, interact and talk to NFS servers.
sys/boot/ofw/libofw/main.c has been deleted (it has no revision history) and
replaced with sys/boot/ofw/common/main.c
Reviewed by: obrien
device tree and resource manager contents. This is the kernel side of
the upcoming libdevinfo, which will expose this information to userspace
applications in a trivial fashion.
Remove the now-obsolete DEVICE_SYSCTLS code.
syscall compare against a variable sv_minsigstksz in struct
sysentvec as to properly take the size of the machine- and
ABI dependent struct sigframe into account.
The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.
The native MINSIGSTKSZ is now defined as follows:
Arch MINSIGSTKSZ
---- -----------
alpha 4096
i386 2048
ia64 12288
Reviewed by: mjacob
Suggested by: bde
This code has help us comprehence ACPI spec .
Contributors of this code is as follows(except for FreeBSD commiter):
Yasuo Yokoyama,
Munehiro Matsuda,
and ALL acpi-jp@jp.freebsd.org people.
Thanks.
R.I.P.
for an interrupt to enable/disable from the vector (and GID too, if we
had multiple GIDs)- so, stupidly for now, search for the right mcpcia's
softc so we have the right base address for the bridge CSR to apply
IRQ bit-twiddle's to. Alas- this doesn't yet allow us to run, but it's
the right direction.
Previously we had to include <machine/param.h> or <sys/param.h> bogusly
due to the fact that <sys/socket.h> CMSG macros needed the ALIGN macro,
which was defined in param.h. However, including param.h was a disaster
for namespace pollution.
This solution, as contributed by shin a while ago, fixes it elegantly
by wrapping the definitions around some namespace pollution preventer
definitions.
This patch was long overdue.
This should allow any network programmer to use <sys/socket.h> as
before.
PR: 19971, 20530
Submitted by: Martin Kaeske <MartinKaeske@lausitz.net>
Mark Andrews <Mark.Andrews@nominum.com>
Patch submitted by: shin
Reviewed by: bde
systems.
From the PR:
When 'probe.slot' is PCI_SLOTMAX (== 31) and 'probe.func' is 7,
call to 'pci_cfgread()' here and machine suddenly hangs up.
I don't know why... (or 450GX chipset's bug?)
PR: i386/20379
Submitted by: Masayuki FUKUI <fukui@sonic.nm.fujitsu.co.jp>
comments on the same line like so:
device foo # FooInc Brand NetEther cards
Also, move the wireless NIC cards to their own section.
Add commented out wl driver in wireless section.
Remove obsolete or redundant comments about some of the wireless cards
that used to apply but don't since we've removed 'at foobus'.
There should be no functional changes in this change.
happen when the vm system maps past the end of an object or tries
to map a zero length object, the pmap layer misses the fact that
offsets wrap into negative numbers and we get stuck.
Found by: Joost Pol aka Nohican <nohican@marcella.niets.org>
Submitted by: tegge
When the printer is turned off the pipe write will cause and error,
which causes lpd to close the device and reopen it to clear the error.
After a short while the device will disappear from the bus but lpd will
have opened the ulpt0 port by then. ulpt_status will check for status
without checking the sc->dying flag and panic the kernel when the device
finally disappears from the bus.
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
also
- sync with netbsd
- fix a bug that miscalculates tx cell counts when the pointer size isn't 4
tested both ENI and Adaptec cards on both i386 and alpha.
function declared in kern_ktr.c. The only inline checks left are the
checks that compare KTR_COMPILE with the supplied mask and thus should
be optimized away into either nothing or a direct call to ktr_tracepoint().
- Move several KTR-related options to opt_ktr.h now that they are only
needed by kern_ktr.c and not by ktr.h.
- Add in the ktr_verbose functionality if KTR_EXTEND is turned on. If the
global variable 'ktr_verbose' is non-zero, then KTR messages will be
dumped to the console. This variable can be set by either kernel code
or via the 'debug.ktr_verbose' sysctl. It defaults to off unless the
KTR_VERBOSE kernel option is specified in which case it defaults to on.
This can be useful when the machine locks up spinning in a loop with
interrupts disabled as you might be able to see what it is doing when it
locks up.
Requested by: phk
interface. In addition to using newbus, it also uses bus_space rather
than inb/outb to make it MI. The grody static softc allocation stuff
has been removed as well.
When restarting the sequencer, ensure that the SCBCNT register
is 0. A non-zero count will prevent the setting of the CCSCBDIR
bit in any future dma operations. The only time CCSCBCNT would
be non-zero is if we happened to halt the dma during a reset,
but even that should never happen. Better safe than sorry.
When a command completes before the target responds to an
ATN for a recovery command, we now notify the kernel so that
any recovery operation requeued in the qinfifo can be removed
safely. In the past, we did this in ahc_done(), but ahc_done()
may be called without the card paused. This also avoids a
recursive call to ahc_search_qinifo() which could have occurred if
ahc_search_qinififo() happened to be the routine to complete
a recovery action.
Fix 8bit math used for adjusting the qinfifo. The index must
be wrapped properly within the 256 entry array. We rely on the
fact that qinfifonext is a uint8_t in most cases to handle
this wrap, but we missed a few spots where the resultant
calculation was promoted to an int.
Change the way that we deal with aborting the first or second
entry from the qinfifo. We now swap the first entry in the
qinfifo with the "next queued scb" to force the sequencer
to see an abort collision if we ever touch the qinififo while
the sequencer is mid SCB dma.
aic7xxx.reg:
Add new MKMSG_FAILED sequencer interrupt. This displaced
the BOGUS_TAG interrupt used in some previous sequencer code
debugging.
aic7xxx.seq:
Increment our position in the qinfifo only once the dma
is complete and we have verified that the queue has not
been changed during our DMA. This simplifies code in the
kernel.
Protect against "instruction creep" when issuing a pausing
sequencer interrupt. On at least the 7890/91/96/97, the
sequencer will coast after issuing the interrupt for up
to two instructions. In the past we delt with this by
using carefully placed nops. Now we call a routine to
issue the interrupt followed by a nop and a ret.
Tell the kernel should an SCB complete with the MK_MESSAGE
flag still set. This means the target ignored our ATN request.
Clear the channel twice as we exit the data phase. On the
aic7890/91, the S/G preload logic may require the second
clearing to get the last S/G out of the FIFO.
aic7xxx_freebsd.c:
Don't bother searching the qinfifo for a doubly queued
recovery scb in ahc_done. This case is handled by the
core driver now.
Free the path used to issue async callbacks after the callback
is complete.
aic7xxx_inline.h:
Split the SCB queue routine into a routine that swaps
the SCB with the "next queued SCB" and a routine that
calls the swapping routine and notifies the card of
the new SCB. The swapping routine is now also used by
ahc_search_qinfifo.
The offset field in struct dirent was set to the offset of
the next dirent in rev 1.36. The offset was calculated from
the current offset and the record length. This offset does
not necessarily match the real offset when we are using
cookies. Therefore, also use the cookies to set the offset
field in struct dirent if we're using cookies to iterate
through the dirents.
circuit generates too much jitter to be used directly as xmit clock.
Don't miscount pending bytes in weird error conditions.
Drop the rest of a packet if we run out of tx-md's.
Trig the xmit-frame signal on rising edge, this fixed the one-bit-too-late
position of the HDLC frames in E1 mode.
wakeup all of the sleeping threads when we free only one buffer. This
avoids us having to needlessly try again (and fail, and go back to
sleep) for all the threads sleeping. We will now only wakeup the
thread we know will succeed.
Reviewed by: green
only be checked if the system is currently performing New Reno style
fast recovery. However, this value was being checked regardless of the
NR state, with the end result being that the congestion window was never
opened.
Change the logic to check t_dupack instead; the only code path that
allows it to be nonzero at this point is NewReno, so if it is nonzero,
we are in fast recovery mode and should not touch the congestion window.
Tested by: phk
pollution in <sys/mutex.h>. This was half fixed in rev.1.3 of
midwayreg.h. The pollution exposed the bug that this driver was using
toy versions of the bus space macros under FreeBSD. Disabling the
toy versions made this driver compile but dependent on the pollution.
There was still a toy version of bus_space_read_1() in unreachable code.
namespace pollution in <sys/mutex.h>. This was half fixed in rev.1.3
of midwayreg.h. The pollution exposed the bug that this driver was
using toy versions of the bus space macros under FreeBSD. Disabling
the toy versions made this driver compile and maybe support PIO space,
but dependent on the pollution.
ufs_vnops.c:
1) i_ino was confused with i_number, so the inode number passed to
VFS_VGET() was usually wrong (usually 0U).
2) ip was dereferenced after vgone() freed it, so the inode number
passed to VFS_VGET() was sometimes not even wrong.
Bug (1) was usually fatal in ext2_mknod(), since ext2fs doesn't have
space for inode 0 on the disk; ino_to_fsba() subtracts 1 from the
inode number, so inode number 0U gives a way out of bounds array
index. Bug(1) was usually harmless in ufs_mknod(); ino_to_fsba()
doesn't subtract 1, and VFS_VGET() reads suitable garbage (all 0's?)
from the disk for the invalid inode number 0U; ufs_mknod() returns
a wrong vnode, but most callers just vput() it; the correct vnode is
eventually obtained by an implicit VFS_VGET() just like it used to be.
Bug (2) usually doesn't happen.
freelist. Should now be thread-friendly, in part.
Note: More work is needed in uipc_syscalls.c, but it will have to wait until
the socket locking issues are at least 80% implemented and committed.
CDs.
With audio CDs, you can't just do a READ(10) call on most drives without
first setting the blocksize with a mode select command. The disklabel code
does a read of the first sector of the media to find a label if it exists.
This caused drives to return an error when an audio CD was in the drive,
due to the problem described above.
The solution is to read the table of contents on the CD, and only attempt
to read the disklabel if the first track is a data track.
This works on all the various CD and DVD media I have tried, but further
testing (especially with Video CDs and other mode 2 media) will be
needed to determine if this is a universal solution.
When this feature is enabled, mke2fs doesn't necessarily allocate a
super block and its associated descriptor blocks for every group.
The (non-)allocations are reflected in the block bitmap. Since the
filesystem code doesn't write to these blocks except for the first
superblock, all it has to do to support them is to not count them in
ext2_statfs() and not attempt to check them at mount time in
ext2_check_blocks_bitmap() (the check has never been enabled in
FreeBSD anyway).
and nobroadcast bits in the mode register and call it both from
pcn_init() and pcn_ioctl(). Sometimes we need to force the state
of the nobroadcast bit after switching out of promisc mode.
them. If we leave garbage in them, the dc_apply_fixup() routine may
try to follow bogus pointers when applying the reset fixup.
Noticed by: Andrew Gallatin
getnewvnode(). Otherwise routines called from VOP_INACTIVE() might
attempt to remove the vnode from a free list the vnode isn't on,
causing corruption.
PR: 18012
out of fashion. This particular case, unlike joy(8) and friends which
are just plain silly, did more than just load a kernel loadable module.
However, /etc/rc and the linux_base port were adjusted a while back to
cope with the absence of this script.
The only outstanding reason to hang on to it would have been for the
linux(8) manual page, which clued folks into the existence of the
Linuxulator. A new linux(4) was introduced a while back. It does
a much better job.
This script just isn't useful any more.
- Look for a hardwired interrupt in the routing table for this
bus/device/pin (we already did this).
- Look for another device with the same link byte which has a hardwired
interrupt.
- Look for a PCI device matching an entry with the same link byte
which has already been assigned an interrupt, and use that.
- Look for a routable interrupt listed in the "PCI only" interrupts
field and use that.
- Pick the first interrupt that's marked as routable and use that.
Xtal reference instead of the CLADI input.
In unframed E1 mode, tie SIGFRZ low so that the mysycc doesn't
get confused.
Don't mask errors with OOF. Don't ignore OOF errors.
Stop the channel before freeing mbufs in disconnect.
I still have no T1 devices to test with, so the T1 code is non-existent.
argument. These flags include INTR_FAST, INTR_MPSAFE, etc.
- Properly handle INTR_EXCL when it is passed in to allow an interrupt
handler to claim exclusive ownership of an interrupt thread.
- Add support for psuedo-fast interrupts on the alpha. For fast interrupts,
we don't allocate an interrupt thread; instead, during dispatching of an
interrupt, we run the handler directly instead of scheduling the thread
to run. Note that the handler is currently run without Giant and must be
MP safe. The only fast handler currently is for the sio driver.
Requested by: dfr
multicast filter on the Pegasus chip. Since IPv6 depends a lot
on multicasting, this caused several failures for people trying to
use IPv6 with Pegasus USB ethernet devices.
Submitted by: Jun Kuriyama <kuriyama@FreeBSD.org>
Return ENOTSUP for any opcode that is not supported by the XPT
device.
Add back a missing local declaration that seems to have been deleted
by my last commit.
Filter incoming transfer negotiation requests to ensure they
never exceed the settings specified by the user.
In restart sequencer attempt to deal with a bug in the aic7895.
If a third party reset occurs at just the right time, the
stack register can lock up. When restarting the sequencer
after handling the SCSI reset, poke SEQADDR1 before resting
the sequencers program counter.
When something strange happens, dump the card's transaction
state via ahc_dump_card_state(). This should aid in debugging.
Handle request sense transactions via the QINFIFO instead of
attaching them to the waiting queue directly. The waiting
queue consumes card SCB resources and, in the pathological case
of every target on the bus beating our selection attemps and
issuing a check condition, could have caused us to run out
of SCBs. I have never seen this happen, and only early
cards with 3 or 4 SCBs had any real chance of ever getting
into this state.
Add additional sequencer interrupt codes to support firmware
diagnostics. The diagnostic code is enabled with the
AHC_DEBUG_SEQUENCER kernel option.
Make it possible to switch into and out of target mode on
the fly. The card comes up by default as an initiator but
will switch into target mode as soon as an enable lun operation
is performed. As always, target mode behavior is gated
by the AHC_TMODE_ENABLE kernel option so most users will
not be affected by this change.
In ahc_update_target_msg_request(), also issue a new
request if the ppr_options have changed.
Never issue a PPR as a target. It is forbidden by the spec.
Correct a bug in ahc_parse_msg() that prevented us from
responding to PPR messages as a target.
Mark SCBs that are on the untagged queue with a flag instead
of checking several fields in the SCB to see if the SCB should
be on the queue. This makes it easier for things like automatic
request sense requests to be queued without touching the
untagged queues even though they are untagged requests.
When dealing with ignore wide residue messages that occur
in the middle of a transfer, reset HADDR, not SHADDR for
non-ultra2 chips. Although SHADDR is where the firmware
fetches the ending transfer address for a save data pointers
request, it is readonly. Setting HADDR has the side effect
of also updating SHADDR.
Cleanup the output of ahc_dump_card_state() by nulling out the
free scb list in the non-paging case. The free list is only
used if we must page SCBs.
Correct the transmission of cdbs > 12 bytes in length. When
swapping HSCBs prior to notifing the sequencer of the new
transaction, the bus address pointer for the cdb must also
be recalculated to reflect its new location. We now defer
the calculation of the cdb address until just before queing
it to the card.
When pulling transfer negotiation settings out of scratch
ram, convert 5MHz/clock doubled settings to 10MHz.
Add a new function ahc_qinfifo_requeue_tail() for use by
error recovery actions and auto-request sense operations.
These operations always occur when the sequencer is paused,
so we can avoid the extra expense incurred in the normal
SCB queue method.
Use the BMOV instruction for all single byte moves on
controllers that support it. The bmov instruction is
twice as fast as an AND with an immediate of 0xFF as
is used on older controllers.
Correct a few bugs in ahc_dump_card_state(). If we have
hardware assisted queue registers, use them to get the
sequencer's idea of the head of the queue. When enumerating
the untagged queue, it helps to use the correct index for
the queue.
aic7xxx.h:
Indicate via a feature flag, which controllers can take
on both the target and the initiator role at the same time.
Add the AHC_SEQUENCER_DEBUG flag.
Add the SCB_CDB32_PTR flag used for dealing with cdbs
with lengths between 13 and 32 bytes.
Add new prototypes.
aic7xxx.reg:
Allow the SCSIBUSL register to be written to. This is
required to fix a selection timeout problem on the 7892/99.
Cleanup the sequencer interrupt codes so that all debugging
codes are grouped at the end of the list.
Correct the definition of the ULTRA_ENB and DISC_DSB locations
in scratch ram. This prevented the driver from properly honoring
these settings when no serial eeprom was available.
Remove an unused sequencer flag.
aic7xxx.seq:
Just before a potential select-out, clear the SCSIBUSL
register. Occasionally, during a selection timeout, the
contents of the register may be presented on the bus,
causing much confusion.
Add sequencer diagnostic code to detect software and or
hardware bugs. The code attempts to verify most list
operations so any corruption is caught before it occurs.
We also track information about why a particular reconnection
request was rejected.
Don't clobber the digital REQ/ACK filter setting in SXFRCTL0
when clearing the channel.
Fix a target mode bug that would cause us to return busy
status instead of queue full in respnse to a tagged transaction.
Cleanup the overrun case. It turns out that by simply
butting the chip in bitbucket mode, it will ack any
bytes until the phase changes. This drasticaly simplifies
things.
Prior to leaving the data phase, make sure that the S/G
preload queue is empty.
Remove code to place a request sense request on the waiting
queue. This is all handled by the kernel now.
Change the semantics of "findSCB". In the past, findSCB
ensured that a freshly paged in SCB appeared on the disconnected
list. The problem with this is that there is no guarantee that
the paged in SCB is for a disconnected transation. We now
defer any list manipulation to the caller who usually discards
the SCB via the free list.
Inline some busy target table operations.
Add a critical section to protect adding an SCB to
the disconnected list.
aic7xxx_freebsd.c:
Handle changes in the transfer negotiation setting API
to filter incoming requests. No filtering is necessary
for "goal" requests from the XPT.
Set the SCB_CDB32_PTR flag when queing a transaction with
a large cdb.
In ahc_timeout, only take action if the active SCB is
the timedout SCB. This deals with the case of two
transactions to the same device with different timeout
values.
Use ahc_qinfifo_requeu_tail() instead of home grown
version.
aic7xxx_inline.h:
Honor SCB_CDB32_PTR when queuing a new request.
aic7xxx_pci.c:
Use the maximum data fifo threshold for all chips.
The XPT uses this to prevent tags from being used on parallel SCSI
interfaces immediately after a bus reset or BDR so that controllers
have an oportunity to renegotiate without tag messages in the way.
Somehow this got disabled... the functionality has been here for
quite some time.
Noticed by: my SCSI bus analyzer
This removes support for booting current kernels with very old bootblocks.
Device driver writers: Please remove initializations for the d_bmaj
field in your cdevsw{}.
terminated and the data_len field is no longer necessary.
Add ASCII2BINARY and BINARY2ASCII capabilities.
The old format is still understood and dealt with, but can't do
the ASCII2BINARY and BINARY2ASCII stuff.
Approved by: archie
current implementation, jail neither virtualizes the Sys V IPC namespace,
nor provides inter-jail protections on IPC objects.
o Support for System V IPC can be enabled by setting jail.sysvipc_allowed=1
using sysctl.
o This is not the "real fix" which involves virtualizing the System V
IPC namespace, but prevents processes within jail from influencing those
outside of jail when not approved by the administrator.
Reported by: Paulo Fragoso <paulo@nlink.com.br>
in the p_candebug() function. Synchronize with sef's CHECKIO()
macro from the old procfs, which seems to be a good source of security
checks.
Obtained from: TrustedBSD Project
PPTP links are no longer dropped by simple (and inappropriate in this
case) "inactivity timeout" procedure, only when requested through the
control connection.
It is now possible to have multiple PPTP servers running behind NAT.
Just redirect the incoming TCP traffic to port 1723, everything else
is done transparently.
Problems were reported and the fix was tested by:
Michael Adler <Michael.Adler@compaq.com>,
David Andersen <dga@lcs.mit.edu>
This is due to a bug that has been in there since Warneer did the
PCCARD stuff, the altioaddr is not offset 8 its offset 14 from
the base address.
Also only probe the master device, no known PCCARD ATA thingies
has a slave AFAIK..
- Add DRIVER_MODULE() declaration to make this driver a
child of cardbus
- Handle different width EEPROMs
The CIS parser still barfs when scanning this card, but it seems to
probe/attach correctly anyway. I can't do a traffic test just yet
since I don't have a proper crossover cable handy.
This allows writing to DVD-RAM, PD and similar drives that probe as CD
devices. Note that these are randomly writeable devices, not
sequential-only devices like CD-R drives, which are supported by cdrecord.
Add a new flag value for dsopen(), DSO_COMPATLABEL. The cd(4) driver now
uses this flag instead of the DSO_NOLABELS flag. The DSO_NOLABELS always
used a "fake" disklabel for the entire disk, provided by the caller.
With the DSO_COMPATLABEL flag, dsopen() will first search the media for a
label, and if it finds a label, it will use that label. Otherwise it will
use the fake disklabel provided by the caller. This provides backwards
compatibility, since we will still have labels for ISO9660 media.
It also provides new functionality, since you can now have a regular BSD
disklabel on read-only media, or on writeable media (e.g. DVD-RAM).
Bruce and I both think that we should eventually (in a few years) get
away from using disklabels for ISO9660 media, and just use the whole disk
device (/dev/cd0). At that point disklabel handling in the cd(4) driver
could follow the "normal" model, as used in the da(4) driver.
Also, clean up the path in a couple of places in cdregister(). (Thanks to
Nick Hibma for catching that bug.)
Reviewed by: bde
Otherwise, aio_read() and aio_write() on sockets are broken if a kevent is
registered. (The code after kevent registration for handling sockets assumes
that the struct file pointer "fp" still refers to the socket, not the kqueue.)
Don't check for a null pointer if malloc called with M_WAITOK.
Submitted by: josh@zipperup.org
Submitted by: Robert Drehmel <robd@gmx.net>
Approved by: bp
<sys/proc.h> to <sys/systm.h>.
Correctly document the #includes needed in the manpage.
Add one now needed #include of <sys/systm.h>.
Remove the consequent 48 unused #includes of <sys/proc.h>.
the offending inline function (BUF_KERNPROC) on it being #included
already.
I'm not sure BUF_KERNPROC() is even the right thing to do or in the
right place or implemented the right way (inline vs normal function).
Remove consequently unneeded #includes of <sys/proc.h>
used in lower layer (scsi_low.c).
The flag of ncv for KME KXLC004 was chaged from 0x1 to 0x100.
The flag of nsp for PIO mode was chaged from 0x1 to 0x100.
command register is too aggressive. Revert to the previous behaviour, but
leave the new behaviour available as an undocumented option. It's not
clear what the Right, Right Thing is to do here, but the more conservative
approach is safer.
a RealTek 8139 cardbus device. Unfortunately it doesn't quite work yet
because the CIS parser barfs on it.
Submitted by msmith, with some small tweaks by me.
return the last value returned by a nested method call. This violates
the ACPI spec, but is implemented by the Microsoft interpreter, and thus
vendors can (and do) get away with it.
Intel's stance is that this is illegal and should not be supported.
As they put it, however, we have to live in the real world. So go ahead
and implement it.
Submitted by: Mitsaru IWASAKI <iwasaki@jp.freebsd.org>
- Set debugger options for kernel build
- Define some missing functions
- Bring in GCC defines
- Disable the 'wbinvd' macro as it conflicts with our inline
- AcpiGetProcessorID (fetch the ACPI processor ID for a given ACPI_HANDLE)
- AcpiSetSystemSleepState (set the Sx sleeping state, proposed by Intel
but not actually implemented)
ACPICA. Most of these are still works in progress. Support exists for:
- Fixed feature and control method power, lid and sleep buttons.
- Detection of ISA PnP devices using ACPI namespace.
- Detection of PCI root busses using ACPI namespace.
- CPU throttling and sleep states (incomplete)
- Thermal monitoring and cooling control (incomplete)
- Interface to platform embedded controllers (mostly complete)
- ACPI timer (incomplete)
- Simple userland control of sleep states.
- Shutdown and poweroff.
seems to be that the nodes are bzero'd beforehand, but the submitter
found that this was not always the case, and in any event defensive
programming here costs epsilon squared.
PR: 22244
Submitted by: Dave Gillam <daveg@chiaro.com>
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernelcode should *never* include from /usr/include !
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.
Paritials reviews by: various.
Significant brucifications by: bde
to reinstall boot1 after a 'make world'.
Unfortunately this means that people who have already installed a new
boot1 from a 'make world' after 2000/09/18 *must* reinstall it after
their next build using something like:
# disklabel -B /dev/da0c
of IP datagram. This fixes the problem when firewall denied fragmented
packets whose last fragment was less than minimum protocol header size.
Found by: Harti Brandt <brandt@fokus.gmd.de>
PR: kern/22309
it can function before malloc(9) is up and running.
- Add two new options WITNESS_DDB and WITNESS_SKIPSPIN. If WITNESS_SKIPSPIN
is enabled, then spin mutexes are ignored by the WITNESS code. If
WITNESS_DDB is turned on and DDB is compiled into the kernel, then the
kernel will drop into DDB when either a lock hierarchy violation occurs
or mutexes are held when going to sleep.
- Add some new sysctls:
debug.witness_ddb is a read-write sysctl that corresponds to WITNESS_DDB.
The kernel option merely changes the default value to on at boot.
debug.witness_skipspin is a read-only sysctl that one can use to determine
if the kernel was compiled with WITNESS_SKIPSPIN.
- Wipe out the BSD/OS-specific lock order lists. We get to build our own
lists now as we add mutexes to the kernel.
the PCI latency timer value to 0x80. Davicom's Linux driver does this,
and it drastically reduces the number of TX underruns in my tests. (Note:
this is done only for the Davicom chips. I'm not sure it's a good idea to
do it for all of them.)
Again, still waiting on confirmation before merging to stable.
change_ruid() in kern_prot.c. This fixes an incorrect use
of chgproccnt().
Update both osf1_setuid() and osf1_setgid() to use setsugid() instead
of just frobbing the flag.
(mostly) submitted by: truckman
in the code enforces this. So, do not check for and attempt a
false reassembly if only IP_RF is set.
Also, removed the dead code, since we no longer use dtom() on
return from ip_reass().
DM9100/DM9102 chips. Do not set DC_TX_ONE. The DC_TX_USE_TX_INTR flag
causes dc_encap() to set the 'interrupt on TX completion' bit only
once every 64 packets. This is an attempt to reduce the number
of interrupts generated by the chip. You're supposed to get a 'no more
TX buffers left' interrupt once you hit the last packet whether you
ask for one or not, however it seems the Davicom chip doesn't generate
this interrupt, or at least it doesn't generate it under the same
circumstances. The result is that if you transmit n packets, where
n is less than 64, and then wait 5 seconds, you'll get a watchdog
timeout whether you want one or not. The DC_TX_INTR_ALWAYS causes
dc_encap() to request an interrupt for every frame.
I'm still waiting on confirmation from a couple of users to see if this
fixes their problems with the Davicom DM9102 before I merge this into
-stable, but this fixed the problem for me in my own testing so I'm
willing to make the change to -current right away.
expands beyond the limit we will extend the address space before trying
to zero the BSS. This should give us plenty of headroom for modest
expansion of the loader.
- Change the softintr() macro to do nothing on FreeBSD. Previously,
this macro would set a bit in spending and schedule the softinterrupt
thread to run. However, the bs driver never actually registers a
a software interrupt handler, so all this work achieved nothing. From
the code it is not clear what exactly the softintr() macro is actually
supposed to be doing. It looks like it is supposed to be possibly
running the hardware interrupt handler maybe? This handler is only
present in the #ifdef __NetBSD__ code however. I have no idea how this
driver handles interrupts at all, but at least it compiles now.
- Layout reorganisation to enhance portability. The driver now has
a relatively MI 'core' and a FreeBSD-specific layer over the top.
Since the NetBSD people have already done their own port, this is
largely just to help me with the BSD/OS port.
- Request ID allocation changed to improve performance (I'd been
considering switching to this approach after having failed to come
up with a better way to dynamically allocate request IDs, and seeing
Andy Doran use it in the NetBSD port of the driver convinced me
that I was wasting my time doing it any other way). Now we just
allocate all the requests up front.
- Maximum request count bumped back to 255 after characterisation
of a firmware issue (off-by-one causing it to crash with 256
outstanding commands).
- Control interface implemented. This allows 3ware's '3dm' utility to
talk to the controller. 3dm will be available from 3ware shortly.
- Controller soft-reset feature added; if the controller signals a
firmware or protocol error, the controller will be reset and all
outstanding commands will be retried.
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.
Submitted by: cp
- Close a small race condition. The sched_lock mutex protects
p->p_stat as well as the run queues. Another CPU could change p_stat
of the process while we are waiting for the lock, and we would end up
scheduling a process that isn't runnable.
usb_ethersubr.c. This module maintains two queues for packets which
are each protected with one mutex. These are all the changes I can
do for now. Removing the USBD_NO_TSLEEP flag doesn't work yet: when
I tried it, the system would usually freeze up after a NIC had been
operating for a while. The usb_ethersubr module itself ought to
go away; this is the next thing I need to test.
* Fixes to the signal delivery code. Not quite right yet.
I would have preferred to wait until I have signal delivery actually
working but the current kernel in CVS doesn't build.
mail:
The problem seems to originate with NFS's postop_attr
information that is returned with a read or write RPC.
Within a vm_fault context, the code cannot deal with
vnode_pager_setsize() shrinking a vnode.
The workaround in the patch below stops the nfsm_postop_attr()
macro from ever shrinking a vnode. If the new size in the
postop_attr information is smaller, then it just sets the
nfsnode n_attrstamp to 0 to stop the wrong size getting
used in the future. This change only affects postop_attr
attributes; the nfsm_loadattr() macro works as normal.
The change is implemented by adding a new argument to
nfs_loadattrcache() called 'dontshrink'. When this is
non-zero, nfs_loadattrcache() will never reduce the
vnode/nfsnode size; instead it zeros n_attrstamp.
There remain other was processes can get stuck in vmopar.
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Reviewed by: dillon
Tested by: Vadim Belman <voland@lflat.org>
more include file including <sys/proc.h>, but there still is this wonky
and (causes warnings on i386) reference in globals.h.
CURTHD is now defined in <machine/globals.h> as well. The correct thing
to do is provide a platform function for this.
will compile again. I can't quite see where this was a recursive inclusion.
We probably need to do something to fix the alpha, but let's not break it
in the interim- it's broken enough.
(a NetBSD port for NEC PC-98x1 machines). They are ncv for NCR 53C500,
nsp for Workbit Ninja SCSI-3, and stg for TMC 18C30 and 18C50.
I thank NetBSD/pc98 and bsd-nomads people.
Obtained from: NetBSD/pc98
in add_m6fc(), set interface list for all cases.
in response to a report from Hoerdt Mickael.
kame 1.31 -> 1.32
discard PIM register if the version of the inner packet is incorrect (i.e. IPv6)
(according to clarfication of recent discussion in the IETF pim ML)
Have if_ti stop "hiding" the softc pointer in the buffer region. Rather,
use the available void * passed to the free routine and pass the softc
pointer through there.
To note: in MEXTADD(), TI_JUMBO_FRAMELEN should probably be TI_JLEN. I left it
unchanged, because this way I'm sure to not damage anything in this respect...
Instead of:
foo = malloc(sizeof(foo), M_WAIT);
bzero(foo, sizeof(foo));
You can now (and please do) use:
foo = malloc(sizeof(foo), M_WAIT | M_ZERO);
In the future this will enable us to do idle-time pre-zeroing of
malloc-space.
reducues the maintenance load for the mutex code. The only MD portions
of the mutex code are in machine/mutex.h now, which include the assembly
macros for handling mutexes as well as optionally overriding the mutex
micro-operations. For example, we use optimized micro-ops on the x86
platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code,
mtx_assert() only depends on INVARIANTS, allowing other kernel developers
to have working mutex assertiions without having to include all of the
mutex debugging code. The SMP_DEBUG kernel option has been renamed to
MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate
seperate mtx_debug structures on the fly in mtx_init, except for mutexes
that are initiated very early in the boot process. These mutexes
are declared using a special MUTEX_DECLARE() macro, and use a new
flag MTX_COLD when calling mtx_init. This is still somewhat hackish,
but it is less evil than the mtx_f filler struct, and the mtx struct is
now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
operations on the mutex mtx_lock field to make it easier for other archs
to override/optimize mutex ops if needed. These new tiny ops also clean
up the code in some places by replacing long atomic operation function
calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
to obtain a sleep mutex. Calling mi_switch() would bogusly release
Giant before switching to the next process. Instead, inline most of the
code from mi_switch() in the mtx_enter_hard() function. Note that when
we finally kill Giant we can back this out and go back to calling
mi_switch().
in most of the atomic operations. Now for these operations, you can
use the normal atomic operation, you can use the operation with a read
barrier, or you can use the operation with a write barrier. The function
names follow the same semantics used in the ia64 instruction set. An
atomic operation with a read barrier has the extra suffix 'acq', due to
it having "acquire" semantics. An atomic operation with a write barrier
has the extra suffix 'rel'. These suffixes are inserted between the
name of the operation to perform and the typename. For example, the
atomic_add_int() function now has 3 variants:
- atomic_add_int() - this is the same as the previous function
- atomic_add_acq_int() - this function combines the add operation with a
read memory barrier
- atomic_add_rel_int() - this function combines the add operation with a
write memory barrier
- Add 'ptr' to the list of types that we can perform atomic operations
on. This allows one to do atomic operations on uintptr_t's. This is
useful in the mutex code, for example, because the actual mutex lock is
a pointer.
- Add two new operations for doing loads and stores with memory barriers.
The new load operations use a read barrier before the load, and the
new store operations use a write barrier after the load. For example,
atomic_load_acq_int() will atomically load an integer as well as
enforcing a read barrier.
length of the data properly. This should be moved into a tty_subr
function.
Also, disanle the setting of the CDC_CM_OVER_DATA flag. It breaks some
modems. I don't think that ther actually is a modem that needs this.
Submitted by: Brad Karp <bkarp@ICSI.Berkeley.EDU>
statistics on a per network address basis.
Teach the IPv4 and IPv6 input/output routines to log packets/bytes
against the network address connected to the flow.
Teach netstat to display the per-address stats for IP protocols
when 'netstat -i' is evoked, instead of displaying the per-interface
stats.
o Change name of bus
o Change the panic on resource allocation failure to just a message. We'll
work out why this fails later in the pcic/pccbb code merge.
This commit adds support for Xircom X3201 based cardbus cards.
Support for the TDK 78Q2120 MII is also added.
IBM Etherjet, Intel and Xircom cards uses these chips.
Note that as a result of this commit, some Intel/DEC 21143 based cardbus
cards will also attach, but not get link. That is being looked at.
"administrative" authorization checks. In most cases, the VADMIN test
checks to make sure the credential effective uid is the same as the file
owner.
o Modify vaccess() to set VADMIN as an available right if the uid is
appropriate.
o Modify references to uid-based access control operations such that they
now always invoke VOP_ACCESS() instead of using hard-coded policy checks.
o This allows alternative UFS policies to be implemented by replacing only
ufs_access() (such as mandatory system policies).
o VOP_ACCESS() requires the caller to hold an exclusive vnode lock on the
vnode: I believe that new invocations of VOP_ACCESS() are always called
with the lock held.
o Some direct checks of the uid remain, largely associated with the QUOTA
and SUIDDIR code.
Reviewed by: eivind
Obtained from: TrustedBSD Project
Fixes bugs in devfs when unloading and reloading
Syncs with NetBSD changes
Submitted by: Alexander Langer <alex@big.endian.de>
Submitted by: Thomas Klausner <wiz@netbsd.org>
Submitted by: Daniel O'Connor" <doconnor@gsoft.com.au>
priority "0" and without PCATCH, so it was uninterruptable. And
even when it did wake up after entropy arrived, it exited after the
wakeup without actually reading the freshly arrived entropy. I
sent this to Mark before but it seems he is in transit.
Mark: feel free to replace this if it gets in your way.
implementation.
Add bus_generic_rl_{get,set,delete,release,alloc}_resource() functions
which provide generic operations for devices using resource list style
resource management.
This should simplify a number of bus drivers. Further commits to follow.
Files:
dev/cardbus/cardbus.c
dev/cardbus/cardbusreg.h
dev/cardbus/cardbusvar.h
dev/cardbus/cardbus_cis.c
dev/cardbus/cardbus_cis.h
dev/pccbb/pccbb.c
dev/pccbb/pccbbreg.h
dev/pccbb/pccbbvar.h
dev/pccbb/pccbb_if.m
This should support:
- cardbus controllers:
* TI 113X
* TI 12XX
* TI 14XX
* Ricoh 47X
* Ricoh 46X
* ToPIC 95
* ToPIC 97
* ToPIC 100
* Cirrus Logic CLPD683x
- cardbus cards
* 3c575BT
* 3c575CT
* Xircom X3201 (includes IBM, Xircom and, Intel cards)
[ 3com support already in kernel, Xircom will be committed real soon now]
This doesn't work with 16bit pccards under NEWCARD.
Enable in your config by having "device pccbb" and "device cardbus".
(A "device pccard" will attach a pccard bus, but it means you system have
a high chance of panicing when a 16bit card is inserted)
It should be fairly simple to make a driver attach to cardbus under
NEWCARD -- simply add an entry for attaching to cardbus on a new
DRIVER_MODULE and add new device IDs as necessary. You should also make
sure the card can be detached nicely without the interrupt routine doing
something weird, like going into an infinite loop. Usually that should
entail adding an additional check when a pci register or the bus space is
read to check if it equals 0xffffffff.
Any problems, please let me know.
Reviewed by: imp
Files:
dev/cardbus/cardbus.c
dev/cardbus/cardbusreg.h
dev/cardbus/cardbusvar.h
dev/cardbus/cardbus_cis.c
dev/cardbus/cardbus_cis.h
dev/pccbb/pccbb.c
dev/pccbb/pccbbreg.h
dev/pccbb/pccbbvar.h
dev/pccbb/pccbb_if.m
This should support:
- cardbus controllers:
* TI 113X
* TI 12XX
* TI 14XX
* Ricoh 47X
* Ricoh 46X
* ToPIC 95
* ToPIC 97
* ToPIC 100
* Cirrus Logic CLPD683x
- cardbus cards
* 3c575BT
* 3c575CT
* Xircom X3201 (includes IBM, Xircom and, Intel cards)
[ 3com support already in kernel, Xircom will be committed real soon now]
This doesn't work with 16bit pccards under NEWCARD.
Enable in your config by having "device pccbb" and "device cardbus".
(A "device pccard" will attach a pccard bus, but it means you system have
a high chance of panicing when a 16bit card is inserted)
It should be fairly simple to make a driver attach to cardbus under
NEWCARD -- simply add an entry for attaching to cardbus on a new
DRIVER_MODULE and add new device IDs as necessary. You should also make
sure the card can be detached nicely without the interrupt routine doing
something weird, like going into an infinite loop. Usually that should
entail adding an additional check when a pci register or the bus space is
read to check if it equals 0xffffffff.
Any problems, please let me know.
Reviewed by: imp
o Report function number and config index on probe line
o Activate the resources (I hope) when RF_ACTIVE is set on those resources
I'm allocating on behalf of my children.
o Always enable interrupts on multifunction cards in the multifunction
register.
This was implemented by Shigeru YAMAMOTO-san and Jonathan Chen. I've
cleaned them up somewhat and they seem to work well enough to boot
current (but given current's state it can be hard to tell). Doug
Rabson also reviewed the design and signed off on it.
compile time will build in mutex locks, otherwise the old locking (splcam/splx
with a recursion counter) will be compiled in.
We still depend on config_intr_hook to tell us when it's okay to call
msleep instead of polling. It'd be real nice if we could do this early
enough to not hang up a machine struggling with a bad Fibre Channel loop,
but that's still to come.
write caching is disabled on both SCSI and IDE disks where large
memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in
seconds) how long it took to complete a dump, the following results
were obtained:
Before: After:
WCE TIME WCE TIME
------------------ ------------------
1 141.820972 1 15.600111
0 797.265072 0 65.480465
Obtained from: Yahoo!
Reviewed by: peter
o Remember the resources we allocate for the config entry.
o When we get the resource, do an resource_list_add and do a
resource_list_delete if we fail later in the resource list.
o In the pccard bus, we allocate the resources. When a child asks for
them, just return the resources that we allocated (thanks to Paul
Richards and Mike Smith for the idea).
stacks near the top of their address space. If their TOS is greater
than vm_maxsaddr, vm_map_growstack() will confuse the thread stack
with the process stack and deliver a SEGV if they attempt to grow the
thread stack past their current stacksize rlimit. To avoid this,
adjust vm_maxsaddr upwards to reflect the current stacksize rlimit
rather than the maximum possible stacksize. It would be better to
adjust the mmap'ed region, but some apps (again, IBM's JDK 1.3) do not
check mmap's return value..
This commit (in conjunction with setting MINSIGSTKSZ to 2048 &
rebuilding your kernel and modules) will get IBM's JDK 1.3 working
with FreeBSD at least well enough to run many of the example applets.
Reviewed by: marcel
Tested by: sto@stat.duke.edu, many others on freebsd-java@
and associated user-level signal trampoline glue.
Without this patch, an SA_SIGINFO style handler can be installed by a linux
app, but if the handler accesses its sip argument, it will get a garbage
pointer and likely segfault.
We currently supply a valid pointer, but its contents are mainly
garbage. Filling this in properly is future work.
This is the second of 3 commits that will get IBM's JDK 1.3 working with
FreeBSD ...
rather than all the flags. This prevents setting being read from ROM,
which is a problem. If this breaks anything, it will only break the
3C556B cards minipci cards, which mainly exist at rpi as far as rpi
has been able to tell.
Submitted by: Louis Gerbarg <gerbal@rpi.edu>
loads, prints the copyright, and either hangs or locks solid. The
PC tends to be in the data segment and the RA is in XentMM
Doug really came up with the fix, I'm just the monkey typing. Doug says:
The alpha can only support 64k of globals with $gp pointing at
base+32k so that the code can use 16bit signed offsets from $gp to
access it. .... it is possible to have multiple .got subsections
and the linker handles this with the relocations for 'ldgp' pseudo
instructions. [Without this patch] the code in exception.s has been
linked to use a different gp from locore.s (where pal_kgp is set).
Reviewed by: dfr
#ifdef away the offending code until somebody with more newbus fu than
me can figure out where to put a default function that returns 255
without touching each alpha chipset driver..
kernel backing store.
* Implement syscalls via break instructions.
* Fix backing store copying in cpu_fork() so that the child gets the right
register values.
This thing is actually starting to work now. This set of changes takes me
up to the second execve (the one which runs the first shell). Next stop
single-user mode :-).
before the attach. Things aren't completely working, but this is a good
checkpoint.
Also, initialize the dev member of the function as soon as we add it
to the parent.
o initialize ivars with bzero.
o remove interrupt function pointer. netbsd needs it, but we don't.
o add lots of comments about bogus things that I've been kludging to try
to make the simple cases work.
o add new ivar accessor for cis4 to match cis3. likely neither will be
needed, but it doesn't hurt to have it.
number of ext_buf counters that are possibly allocatable.
Do this because:
(i) It will make it easier to influence EXT_COUNTERS for if_sk,
if_ti (or similar) users where the driver allocates its own
ext_bufs and where it is important for the mbuf system to take
it into account when reserving necessary space for counters.
(ii) Facilitate some percentile calculation for netstat(1)
as inline functions, renaming them to __uint16_swap_uint32,
__uint8_swap_uint32 and __uint8_swap_uint16.
Doing it properly suggested by: msmith
Reviewed by: msmith
now in dirs called sys/*/random/ instead of sys/*/randomdev/*.
Introduce blocking, but only at startup; the random device will
block until the first reseed happens to prevent clients from
using untrustworthy output.
Provide a read_random() call for the rest of the kernel so that
the entropy device does not need to be present. This means that
things like IPX no longer need to have "device random" hardcoded
into thir kernel config. The downside is that read_random() will
provide very poor output until the entropy device is loaded and
reseeded. It is recommended that developers do NOT use the
read_random() call; instead, they should use arc4random() which
internally uses read_random().
Clean up the mutex and locking code a bit; this makes it possible
to unload the module again.
description:
How it works:
--
Basically ifs is a copy of ffs, overriding some vfs/vnops. (Yes, hack.)
I didn't see the need in duplicating all of sys/ufs/ffs to get this
off the ground.
File creation is done through a special file - 'newfile' . When newfile
is called, the system allocates and returns an inode. Note that newfile
is done in a cloning fashion:
fd = open("newfile", O_CREAT|O_RDWR, 0644);
fstat(fd, &st);
printf("new file is %d\n", (int)st.st_ino);
Once you have created a file, you can open() and unlink() it by its returned
inode number retrieved from the stat call, ie:
fd = open("5", O_RDWR);
The creation permissions depend entirely if you have write access to the
root directory of the filesystem.
To get the list of currently allocated inodes, VOP_READDIR has been added
which returns a directory listing of those currently allocated.
--
What this entails:
* patching conf/files and conf/options to include IFS as a new compile
option (and since ifs depends upon FFS, include the FFS routines)
* An entry in i386/conf/NOTES indicating IFS exists and where to go for
an explanation
* Unstaticize a couple of routines in src/sys/ufs/ffs/ which the IFS
routines require (ffs_mount() and ffs_reload())
* a new bunch of routines in src/sys/ufs/ifs/ which implement the IFS
routines. IFS replaces some of the vfsops, and a handful of vnops -
most notably are VFS_VGET(), VOP_LOOKUP(), VOP_UNLINK() and VOP_READDIR().
Any other directory operation is marked as invalid.
What this results in:
* an IFS partition's create permissions are controlled by the perm/ownership of
the root mount point, just like a normal directory
* Each inode has perm and ownership too
* IFS does *NOT* mean an FFS partition can be opened per inode. This is a
completely seperate filesystem here
* Softupdates doesn't work with IFS, and really I don't think it needs it.
Besides, fsck's are FAST. (Try it :-)
* Inodes 0 and 1 aren't allocatable because they are special (dump/swap IIRC).
Inode 2 isn't allocatable since UFS/FFS locks all inodes in the system against
this particular inode, and unravelling THAT code isn't trivial. Therefore,
useful inodes start at 3.
Enjoy, and feedback is definitely appreciated!