opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-02-25 19:05:20 -05:00

Author	SHA1	Message	Date
Alexander Leidinger	bb63fdde6d	By using the 32-bit Linux version of Sun's Java Development Kit 1.6 on FreeBSD (amd64), invocations of "javac" (or "java") eventually end with the output of "Killed" and exit code 137. This is caused by: 1. After calling exec() in multithreaded linux program threads are not destroyed and continue running. They get killed after program being executed finishes. 2. linux_exit_group doesn't return correct exit code when called not from group leader. Which happens regularly using sun jvm. The submitters fix this in a similar way to how NetBSD handles this. I took the PRs away from dchagin, who seems to be out of touch of this since a while (no response from him). The patches committed here are from [2], with some little modifications from me to the style. PR: 141439 [1], 144194 [2] Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk Reviewed by: rdivacky (in april 2010) MFC after: 5 days	2010-11-22 09:06:59 +00:00
Brooks Davis	3ef5ae2dde	Since all other comparisons involving ngroups_max use "ngroups_max + 1", use ">= ngroups_max+1" instead of the equivalent "> ngroups_max" to reduce confusion.	2010-01-15 07:05:00 +00:00
Brooks Davis	412f9500e2	Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications. MFC after: 1 month	2010-01-12 07:49:34 +00:00
Konstantin Belousov	b55ef216fe	kern_select(9) copies fd_set in and out of userspace in quantities of longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in or out 4 bytes more then the process expected. Calculate the amount of bytes to copy taking into account size of fd_set for the current process ABI. Diagnosed and tested by: Peter Jeremy <peterjeremy acm org> Reviewed by: jhb MFC after: 1 week	2009-09-09 20:59:01 +00:00
Brooks Davis	838d985825	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
Jamie Gritton	7455b100af	Add counterparts to getcredhostname: getcreddomainname, getcredhostuuid, getcredhostid Suggested by: rmacklem Approved by: bz	2009-06-13 00:12:02 +00:00
Robert Watson	bcf11e8d00	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
Jamie Gritton	76ca6f88da	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
Dmitry Chagin	8d30f381ef	Do not export AT_CLKTCK when emulating Linux kernel prior to 2.4.0, as it has appeared in the 2.4.0-rc7 first time. Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK), glibc falls back to the hard-coded CLK_TCK value when aux entry is not present. Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value. For older applications/libc's which depends on hard-coded CLK_TCK value user should set compat.linux.osrelease less than 2.4.0. Approved by: kib (mentor)	2009-05-10 18:43:43 +00:00
Dmitry Chagin	1ca16454b3	Rework r189362, r191883. The frequency of the statistics clock is given by stathz. Use stathz if it is available, otherwise use hz. Pointed out by: bde Approved by: kib (mentor)	2009-05-10 18:16:07 +00:00
Dmitry Chagin	c65b9bfa3c	Linux exports HZ value to user space via AT_CLKTCK auxiliary vector entry, which is available for Glibc as sysconf(_SC_CLK_TCK). If AT_CLKTCK entry is not exported, Glibc uses 100. linux_times() shall use the value that is exported to user space. Pointyhat to: dchagin PR: kern/134251 Approved by: kib (mentor) MFC after: 2 weeks	2009-05-07 14:24:50 +00:00
Dmitry Chagin	4d706dcc08	Change linux struct tms definition to match actual linux one. Approved by: kib (mentor) MFC after: 2 weeks	2009-05-07 12:55:58 +00:00
Dmitry Chagin	4d7c2e8a48	Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?" from some programs at start, among them are top and pkill. Do the assignment of the vector entries in elf_linux_fixup() as it is done in glibc. Fix some minor style issues. Submitted by: Marcin Cieslak <saper at SYSTEM PL> Approved by: kib (mentor) MFC after: 1 week	2009-03-04 12:14:33 +00:00
Ed Schouten	ddf9d24349	Push down Giant inside sysctl. Also add some more assertions to the code. In the existing code we didn't really enforce that callers hold Giant before calling userland_sysctl(), even though there is no guarantee it is safe. Fix this by just placing Giant locks around the call to the oid handler. This also means we only pick up Giant for a very short period of time. Maybe we should add MPSAFE flags to sysctl or phase it out all together. I've also added SYSCTL_LOCK_ASSERT(). We have to make sure sysctl_root() and name2oid() are called with the sysctl lock held. Reviewed by: Jille Timmermans <jille quis cx>	2008-12-29 12:58:45 +00:00
Ed Schouten	a1b5a8955e	Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4. Looking at our source code history, it seems the uname(), getdomainname() and setdomainname() system calls got deprecated somewhere after FreeBSD 1.1, but they have never been phased out properly. Because we don't have a COMPAT_FREEBSD1, just use COMPAT_FREEBSD4. Also fix the Linuxolator to build without the setdomainname() routine by just making it call userland_sysctl on kern.domainname. Also replace the setdomainname()'s implementation to use this approach, because we're duplicating code with sysctl_domainname(). I wasn't able to keep these three routines working in our COMPAT_FREEBSD32, because that would require yet another keyword for syscalls.master (COMPAT4+NOPROTO). Because this routine is probably unused already, this won't be a problem in practice. If it turns out to be a problem, we'll just restore this functionality. Reviewed by: rdivacky, kib	2008-11-09 10:45:13 +00:00
Konstantin Belousov	68da8b22d2	Current linux_fooaffinity() emulation fails, as the FreeBSD affinity syscalls expect the bitmap size in the range from 32 to 128. Old glibc always assumed size 1024, while newer glibc searches for approriate size, starting from 1024 and going up. For now, use FreeBSD size of cpuset_t for bitmap size parameter and return EINVAL if length of user space bitmap less than our size of cpuset_t. Submitted by: dchagin MFC after: 1 week [This requires MFC of the actual linux affinity syscalls]	2008-10-04 19:23:30 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Edward Tomasz Napierala	9545354ed5	Fix usage of mac_vnode_check_open() in linuxulator - last argument should be VREAD, not FREAD. Approved by: rwatson (mentor)	2008-09-22 18:59:24 +00:00
Roman Divacky	0d62170990	The ERESTART to EINTR conversion is already done in kern_select so there is no need to repeat it in linux_select(). Submitted by: Dmitry Chagin <dchagin@> MFC after: 1 week Approved by: kib (mentor)	2008-09-11 15:28:28 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Roman Divacky	0864e2a4f1	Fix linux_alarm, the linux behaviour is to limit the secs to INT_MAX when the passed in parameter is bigger than INT_MAX. Submitted by: Dmitry Chagin <chagin.dmitry gmail com> Approved by: kib (mentor)	2008-07-23 17:19:02 +00:00
Robert Watson	4f7d1876d5	Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks	2008-07-05 13:10:10 +00:00
Roman Divacky	4732e446fb	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly userspace thing which we cannot alter. Two syscalls maintain pointer to userspace list and when process exits a routine walks this list waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
Konstantin Belousov	48b05c3f82	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
Ruslan Ermilov	d7a38db650	Fix build. Reported by: ache, tinderbox	2008-03-25 13:20:52 +00:00
Roman Divacky	5dfb688191	Implement sched_setaffinity and get_setaffinity using real cpu affinity setting primitives. Reviewed by: jeff Approved by: kib (mentor)	2008-03-16 16:27:44 +00:00
Konstantin Belousov	cbd2c621f8	Sanitize arguments to linux_mremap(). Check that only MREMAP_FIXED and MREMAP_MAYMOVE flags are specified. Check for the page alignment of the addr argument. Submitted by: rdivacky MFC after: 1 week	2008-02-22 11:47:56 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Konstantin Belousov	b6e645c90f	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Attilio Rao	a1fe14bc33	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
Konstantin Belousov	75ee4e5462	No need to lock emul_lock in exit_group() because em->shared cannot change (because its referenced by curthread). This fixes a LOR caused by acquiring emul_shared_lock while holding emul_lock. Fix typo in comment. Submitted by: rdivacky	2007-02-01 13:33:33 +00:00
Alexander Leidinger	a849401985	MFp4 (112646): Now (ok it's been a while...) that FreeBSD has RLIMIT_AS too, we can use it in the linuxolator instead of ignoring it. This fixes a LTP test. Submitted by: rdivacky	2007-01-07 19:30:19 +00:00
Alexander Leidinger	0ed6f09c4e	MFp4 (112534): Dont lock em in a case of just using em->shared->group_pid because the group_pid never changes. Submitted by: rdivacky Reviewed by: kib Glanced at by: jhb	2007-01-07 19:14:06 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Alexander Leidinger	9ce8f9bcdd	MFp4 (111746+): Redo the checking for 2.6 emulation. We now cache the value of use26 and replace calls to linux_get_osrelease() + parsing with a call to linux_use26(). Typical path is lockless now. Pointed out by: kib This allows to ship RELENG_7_0 with a default osrelease of 2.4.2 and the possibility to enable 2.6.x emulation without the possible performance impact of the previous version of the check. Submitted by: rdivacky	2006-12-31 12:39:10 +00:00
Alexander Leidinger	ef95cfeab9	MFp4: - semi-automatic style fixes - spelling fixes in comments - add some comments	2006-12-31 11:56:16 +00:00
Jung-uk Kim	b34608fea5	MFP4: 109653 Linux mknod(2) can open any files, not just char/block or fifo files. This fixes Linux Test Project test cases mknod01, mknod07 and mknod09.	2006-12-04 22:46:09 +00:00
Alexander Leidinger	f6018b1434	MFP4 (108673, 110519, 110874): - Currently LINUX_MAX_COMM_LEN is smaller than MAXCOMLEN, but in case this will change we have a buffer overflow. Apply some defensive programming to DTRT when this should happen. - Use copyinstr() instead of copyin where appropriate. * Fallback to copyin() in case of ENAMETOOLONG. [1] * Use the right source and destination (it was wrong before). - Use strlcpy instead of strcpy. - Properly lock the read case (PR_GET_NAME) like the write case. Reviewed by: rwatson (except [1]) Suggested by: rwatson [1]	2006-12-02 14:56:25 +00:00
Konstantin Belousov	cce1514679	Sync struct sysinfo with real one from linux. Submitted by: rdivacky	2006-11-18 14:37:54 +00:00
Konstantin Belousov	d559d18183	Add debuging printfs to syscalls that do not contain it yet. In sethostname do not print the hostname because it would require to copyin the string. Sethostname is not very frequently used. Submitted by: rdivacky	2006-11-18 13:00:59 +00:00
Konstantin Belousov	f472c6e35a	Remove unecessary locking of process in linux_getpid. Suggested by: jhb Submitted by: rdivacky	2006-11-18 10:12:43 +00:00
Konstantin Belousov	0132096dfd	In rev 1.188 of linux_misc.c the added check for valid options ommited __WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies after exit. Submitted by: rdivacky Reported by: Bakul Shah (bakul at bitblocks com)	2006-11-15 10:01:06 +00:00
Tom Rhodes	6aeb05d7be	Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@	2006-11-11 16:26:58 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Alexander Leidinger	c4ce314b40	Fix style(9). Noticed by: rwatson	2006-10-28 16:47:38 +00:00
Alexander Leidinger	955d762aca	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Alexander Leidinger	7660ace19c	- Replace homegrown check for FIFO with S_ISFIFO. [1] - Check the status of the options before messing with it. Inspired by: NetBSD [1] Submitted by: rdivacky Tested with: LTP	2006-10-08 17:08:27 +00:00
Alexander Leidinger	18f81b3dfa	- don't reboot() when feed with wrong parameters (and enough permissions) [1] - add support to power off the system [2] - check the linux magic values [3] Submitted by: Marcin Cieslak <saper@SYSTEM.PL> [1,2] Modelled after: linux man page of the reboot() syscall [3] Found by: LTP testcase "reboot02" [1] Tested with: LTP testcase "reboot02" [1,3] MFC after: 1 week	2006-09-16 14:12:04 +00:00
Robert Watson	3e8df637c0	Don't call suser_cred() directly from linux_sethostname(), as it just wraps userland_sysctl(), which performs necessary privilege checks as part of its normal operation. MFC after: 1 week	2006-08-25 11:02:42 +00:00
Alexander Leidinger	1a28c0df09	Sync the MI parts for amd64 with i386 and remove the corresponding special handling for amd64 in the common code. The MD parts for amd64 are still outstanding, but at least this fixes some panics on amd64. Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: bsam	2006-08-20 13:50:27 +00:00
Alexander Leidinger	590e3a06e8	- disable some more code when osrelease=2.4.2 - protect td->td_proc->p_pid with the proc lock in linux_getpid in the amd64 (= non i386) case [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: netchild [1]	2006-08-17 21:21:30 +00:00
Alexander Leidinger	94cb2ecf79	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
Alexander Leidinger	a43eeaabe4	Disable some parts of the code on amd64 for now to prevent a panic. A better fix will come later. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 15:15:17 +00:00
Alexander Leidinger	9b44bfc556	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not build as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesnt work, issue with futexes - linux-skype - doesnt work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesnt work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16 to switch back use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
John Baldwin	b4c63329d5	- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional Giant VFS locking in that function. - Remove bogus code to handle the case where namei() returns success but a NULL vnode pointer. - Note that this code duplicates exec_check_permissions() and annotate where it differs. - Hold the vnode lock longer to protect the write to set VV_TEXT in v_vflag. - Mark linux_uselib() MPSAFE. Reviewed by: rwatson	2006-07-21 20:22:13 +00:00
Alexander Leidinger	555f86b8b6	The linux times syscall can be called with a NULL pointer, so keep cool and don't panic. This fix is different from the patch submitted as it not only prevents a NULL-pointer dereference, but also skips some work in this case. Noticed by: Dmitry Ganenko <dima@apk-inform.com> Reviewed by: rdivacky (the original version as in emulation@) MFC after: 1 week Security: This is a RELENG_x_y candidate (local DoS). Go ahead by: secteam (cperciva)	2006-06-23 18:49:38 +00:00
Alexander Leidinger	01e0ffbae8	Now that we don't have a linuxolator on alpha anymore: - unifdef __alpha__ - revert rev. 1.66 of linux_socket.c	2006-05-10 20:38:16 +00:00
Tai-hwa Liang	d9d46ed258	Unbreaking build by removing a now unused variable.	2006-03-27 23:27:11 +00:00
John Baldwin	b77619bd7f	Use td_ucred rather than p_ucred to avoid panics and general unhappiness. Pointy hat to: netchild	2006-03-27 19:16:31 +00:00
Ruslan Ermilov	aefce619cf	Unbreak COMPAT_LINUX32 option support on amd64. Broken by: netchild	2006-03-19 11:10:33 +00:00
Alexander Leidinger	d4a3f5ddb6	Fixup some problems in my previous commit (COMPAT_43). Pointyhat to: netchild	2006-03-18 20:47:36 +00:00
Alexander Leidinger	5c8919adf4	Get rid of the need of COMPAT_43 in the linuxolator. Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)	2006-03-18 18:20:17 +00:00
Tom Rhodes	0e36e11d57	Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code.	2005-12-28 07:08:54 +00:00
David Xu	9104847f21	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
John Baldwin	8d948cd1ec	Fix the computation of uptime for linux_sysinfo(). Before it was returning the uptime in seconds mod 60 which wasn't very useful. Approved by: re (scottl)	2005-07-07 19:17:55 +00:00
Maxim Sobolev	bc165ab0fe	Properly convert FreeBSD priority values into Linux values in the getpriority(2) syscall. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua>	2005-06-08 20:41:28 +00:00
Jeff Roberson	7625cbf3cc	- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.	2005-04-27 09:05:19 +00:00
John Baldwin	98df9218da	- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from. Reviewed by: alc, arch@	2005-04-01 20:00:11 +00:00
Maxim Sobolev	e3478fe000	Handle unimplemented syscall by instantly returning ENOSYS instead of sending signal first and only then returning ENOSYS to match what real linux does. PR: kern/74302 Submitted by: Travis Poppe <tlp@LiquidX.org>	2005-03-07 00:18:06 +00:00
John Baldwin	12dd959a7d	Use kern_setitimer() to implement linux_alarm() instead of fondling the real interval timer directly.	2005-02-07 18:36:21 +00:00
Maxim Sobolev	cfa0efe7ab	Split out kernel side of {get,set}itimer(2) into two parts: the first that pops data from the userland and pushes results back and the second which does actual processing. Use the latter to eliminate stackgap in the linux wrappers of those syscalls. MFC after: 2 weeks	2005-01-25 21:28:28 +00:00
David E. O'Brien	1997c537be	Match the LINUX32's style with existing style Submitted by: Jung-uk Kim <jkim@niksun.com> Use positive, not negative logic.	2005-01-14 04:44:56 +00:00
David E. O'Brien	9c0552ce3e	Fix Linux compat 'uname -m' on AMD64. Submitted by: Jung-uk Kim <jkim@niksun.com> (patch reworked by me)	2005-01-14 03:45:26 +00:00
John Baldwin	78c85e8dfc	Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month	2004-10-05 18:51:11 +00:00
David E. O'Brien	4a16b489ca	Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.	2004-08-16 11:12:57 +00:00
David E. O'Brien	3a2e3a4aa7	Fix the 'DEBUG' argument code to unbreak the LINT build.	2004-08-16 10:36:12 +00:00
Tim J. Robbins	4af2762336	Changes to MI Linux emulation code necessary to run 32-bit Linux binaries on AMD64, and the general case where the emulated platform has different size pointers than we use natively: - declare certain structure members as l_uintptr_t and use the new PTRIN and PTROUT macros to convert to and from native pointers. - declare some structures __packed on amd64 when the layout would differ from that used on i386. - include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h> if compiling with COMPAT_LINUX32. This will need to be revisited before 32-bit and 64-bit Linux emulation support can coexist in the same kernel. - other small scattered changes. This should be a no-op on i386 and Alpha.	2004-08-16 07:28:16 +00:00
Tim J. Robbins	ae8e14a6ac	Replace linux_getitimer() and linux_setitimer() with implementations based on those in freebsd32_misc.c, removing the assumption that Linux uses the same layout for struct itimerval as we use natively.	2004-08-15 12:34:15 +00:00
Tim J. Robbins	d1d6dbf120	Avoid assuming that l_timeval is the same as the native struct timeval in linux_select().	2004-08-15 12:24:05 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
Poul-Henning Kamp	1930e303cf	Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.	2004-06-11 11:16:26 +00:00
John Baldwin	b7e23e826c	- Replace wait1() with a kern_wait() function that accepts the pid, options, status pointer and rusage pointer as arguments. It is up to the caller to copyout the status and rusage to userland if needed. This lets us axe the 'compat' argument and hide all that functionality in owait(), by the way. This also cleans up some locking in kern_wait() since it no longer has to drop locks around copyout() since all the copyout()'s are deferred. - Convert owait(), wait4(), and the various ABI compat wait() syscalls to use kern_wait() rather than wait1() or wait4(). This removes a bit more stackgap usage. Tested on: i386 Compiled on: i386, alpha, amd64	2004-03-17 20:00:00 +00:00
John Baldwin	91d5354a2c	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
Alan Cox	277b62040d	Lock the traversal of the vm object list. Use TAILQ_FOREACH consistently.	2004-01-02 19:29:31 +00:00
Maxim Sobolev	d09c47acd9	Pull latest changes from OpenBSD: - improve sysinfo(2) syscall; - add dummy fadvise64(2) syscall; - add dummy *xattr(2) family of syscalls; - add protos for the syscalls 222-225, 238-249 and 253-267; - add exit_group(2) syscall, which is currently just wired to exit(2). Obtained from: OpenBSD MFC after: 2 weeks	2003-11-16 15:07:10 +00:00
Tim J. Robbins	1d2d5501f9	Reject negative ngrp arguments in linux_setgroups() and linux_setgroups16(); stops users being able to cause setgroups to clobber the kernel stack by copying in data past the end of the linux_gidset array.	2003-10-21 11:00:33 +00:00
Bruce Evans	34eec0a169	Restored a non-egregious cast so that this file compiles on i386's with 64-bit longs again. This was fixed in rev.1.42 but the fix rotted non-fatally in rev.1.105 and fatally in rev.1.137. Many more non-egregrious casts are strictly required for conversions from semi-opaque types to pointers, but we avoid most of them by using types that are almost certain to be compatible with uintptr_t for representing pointers (e.g., vm_offset_t). Here we don't really want the u_longs, but we have them because a.out.h and its support code doesn't use typedefs (it uses unsigned in V7 and unsigned long in FreeBSD) and is too obsolete to fix now.	2003-09-07 13:03:13 +00:00
Dag-Erling Smørgrav	7576b4b4c0	Try to make 'uname -a' look more like it does on Linux: - cut the version string at the newline, suppressing information about who built the kernel and in what directory. Most of this information was already lost to truncation. - on i386, return the precise CPU class (if known) rather than just "i386". Linux software which uses this information to select which binary to run often does not know what to make of "i386".	2003-07-29 10:03:15 +00:00
Poul-Henning Kamp	a8d43c90af	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file " since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/ For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
Poul-Henning Kamp	567104a148	Add a new function swap_pager_status() which reports the total size of the paging space and how much of it is in use (in pages). Use this interface from the Linuxolator instead of groping around in the internals of the swap_pager.	2003-07-18 10:26:09 +00:00
David E. O'Brien	16dbc7f228	Use __FBSDID().	2003-06-10 21:29:12 +00:00
Alexander Kabaev	104a9b7e3e	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
John Baldwin	8804bf6b03	Use local struct proc variables to reduce repeated td->td_proc dereferences and improve readability.	2003-04-17 22:02:47 +00:00
John Baldwin	20b04da89c	Explicitly cast a l_ulong to an unsigned long to make all arch's happy with the printf format.	2003-04-16 20:43:10 +00:00
John Baldwin	760eb2e033	Fix printf format in a debug printf.	2003-04-16 20:07:48 +00:00
John Baldwin	b62f75cf44	- Change the linux_[gs]et_os{name, release, s_version}() functions to take a thread instead of a proc for their first argument. - Add a mutex to protect the system-wide Linux osname, osrelease, and oss_version variables. - Change linux_get_prison() to take a thread instead of a proc for its first argument and to use td_ucred rather than p_ucred. This is ok because a thread's prison does not change even though it's ucred might. - Also, change linux_get_prison() to return a struct prison * instead of a struct linux_prison * since it returns with the struct prison locked and this makes it easier to safely unlock the prison when we are done messing with it.	2003-03-13 22:45:43 +00:00
Dag-Erling Smørgrav	1d062e2be8	Clean up whitespace and remove register keyword.	2003-03-03 09:17:12 +00:00
Dag-Erling Smørgrav	4b7ef73d71	More caddr_t removal, in conjunction with copy{in,out}(9) this time. Also clean up some egregious casts and incorrect use of sizeof.	2003-03-03 09:14:26 +00:00
Tim J. Robbins	96d7f8ef46	Use the proc lock to protect p_realtimer instead of Giant, and obtain sched_lock around accesses to p_stats->p_timer[] to avoid a potential race with hardclock. getitimer(), setitimer() and the realitexpire() callout are now Giant-free.	2003-02-17 10:03:02 +00:00
Tim J. Robbins	fb30aed1a5	Obtain proc lock around modification of p_siglist in linux_wait4().	2003-02-14 08:59:49 +00:00
Robert Drehmel	75e8f2dad8	- Use strlcpy() rather than strncpy() to copy NUL terminated strings. - Pass the correct buffer size to getcredhostname().	2002-10-17 22:00:30 +00:00
Juli Mallett	1d9c56964d	Back our kernel support for reliable signal queues. Requested by: rwatson, phk, and many others	2002-10-01 17:15:53 +00:00
Juli Mallett	1226f694e6	First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs. After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such. CODAFS is unfinished in this regard because the logic is unclear in some places. Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]	2002-09-30 20:20:22 +00:00
Jeff Roberson	0fa89fc7d9	- Hold the vn lock over vm_mmap().	2002-09-25 02:42:04 +00:00
Matthew N. Dodd	c5afa58784	Pass flags to msync() accounting for differences in the definition of MS_SYNC on FreeBSD and Linux. Submitted by: Christian Zander <zander@minion.de>	2002-09-19 19:02:54 +00:00
Bruce Evans	367797e031	Do not cast from a pointer to an integer of a possibly different size. This fixes a warning on i386's with 64-bit longs.	2002-09-05 12:30:54 +00:00
Bruce Evans	85422e62d3	Include <sys/malloc.h> instead of depending on namespace pollution 2 layers deep in <sys/proc.h> or <sys/vnode.h>. Removed unused includes. Sorted includes.	2002-09-05 08:13:20 +00:00
Ian Dowse	206a5d3a0c	Use the new kern_* functions to avoid the need to store arguments in the stack gap. This converts most VFS and signal related system calls, as well as select(). Discussed on: -arch Approved by: marcel	2002-09-01 22:30:27 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Robert Watson	eddc160e00	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC entry points for a number of VFS-related operations in the Linux ABI module. In particular, handle uselib in a manner similar to open() (more work is probably needed here), as well as handle statfs(), and linux readdir()-like calls. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 22:23:02 +00:00
Robert Watson	fa3b8ffb32	Add a comment about how we should use vn_open() here instead of directly invoking VOP_OPEN(). This would reduce code redundancy with the rest of the kernel, and also is required for MAC to work properly.	2002-06-14 07:24:01 +00:00
Jens Schweikhardt	21dc7d4f57	Fix typo in the BSD copyright: s/withough/without/ Spotted and suggested by: des MFC after: 3 weeks	2002-06-02 20:05:59 +00:00
Peter Wemm	4924b9dd80	Zap some stale unused headers, including one machine/psl.h (which is a stub on alpha). Compile tested on alpha and x86.	2002-05-01 02:17:33 +00:00
Robert Watson	b099af16dd	Add an XXX: linux_uselib() should be using vn_open() rather than invoking VOP_OPEN() and doing lots of manual checking. This would further centralize use of the name functions, and once the MAC code is integrated, meaning few extraneous MAC checks scattered all over the place. I don't have time to fix this now, but want to make sure it doesn't get forgotten. Anyone interested in fixing this should feel free. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-20 14:43:34 +00:00
John Baldwin	094a945562	Rework logic of syscalls that modify process credentials as described in rev 1.152 of sys/kern/kern_prot.c.	2002-04-13 23:11:23 +00:00
John Baldwin	0af24d5151	Use td_ucred in a few spots.	2002-04-11 21:00:05 +00:00
John Baldwin	44731cab3b	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
Jeff Roberson	851031501a	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 10:35:22 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Robert Drehmel	668ae58863	Use the updated getcredhostname() function.	2002-02-27 16:47:27 +00:00
Robert Drehmel	5597f0ccf2	Use the getcredhostname function to fill the hostname into the linux_newuname_args structure. This should fix the case of jailed linux processes not using the jail's hostname. PR: 35336 Reviewed by: phk	2002-02-27 15:06:33 +00:00
Andrew Gallatin	21e06996e4	Linux/alpha uses the same BSDish return mechanism we do for getpid, getuid, getgid and pipe, since they bootstrapped from OSF/1 and never cleaned up. Switch to the native syscalls on alpha so that the above functions work MFC after: 7 days	2002-01-23 22:46:14 +00:00
Robert Watson	011376308f	o Introduce pr_mtx into struct prison, providing protection for the mutable contents of struct prison (hostname, securelevel, refcount, pr_linux, ...) o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/ so as to enforce these protections, in particular, in kern_mib.c protection sysctl access to the hostname and securelevel, as well as kern_prot.c access to the securelevel for access control purposes. o Rewrite linux emulator abstractions for accessing per-jail linux mib entries (osname, osrelease, osversion) so that they don't return a pointer to the text in the struct linux_prison, rather, a copy to an array passed into the calls. Likewise, update linprocfs to use these primitives. o Update in_pcb.c to always use prison_getip() rather than directly accessing struct prison. Reviewed by: jhb	2001-12-03 16:12:27 +00:00
Dag-Erling Smørgrav	c798b36242	Revert incorrect KSEfication: realitexpire expects a struct proc , not a struct thread .	2001-11-24 14:09:50 +00:00
Paul Saab	cbc89bfbfe	Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable. Reviewed by: peter MFC after: 2 weeks	2001-10-10 23:06:54 +00:00
Marcel Moolenaar	9b130a99cf	Remove linux_getpgid(). We map the syscall natively now. PR: kern/21402	2001-09-28 01:40:51 +00:00
Michael Reifenberger	b8febfd1f2	Add a wrapper for linux_getsid -> getsid Syscall.	2001-09-15 09:57:30 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Marcel Moolenaar	5002a60f9b	Round of cleanups and enhancements. These include (in random order): o Introduce private types for use in linux syscalls for two reasons: 1. establish type independence for ease in porting and, 2. provide a visual queue as to which syscalls have proper prototypes to further cleanup the i386/alpha split. Linuxulator types are prefixed by 'l_'. void and char have not been "virtualized". o Provide dummy functions for all syscalls and remove dummy functions or implementations of truely obsolete syscalls. o Sanitize the shm, sem and msg* syscalls. o Make a first attempt to implement the linux_sysctl syscall. At this time it only returns one MIB (KERN_VERSION), but most importantly, it tells us when we need to add additional sysctls :-) o Bump the kenel version up to 2.4.2 (this is not the same as the KERN_VERSION MIB, BTW). o Implement new syscalls, of which most are specific to i386. Our syscall table is now up to date with Linux 2.4.2. Some highlights: - Implement the 32-bit uid_t and gid_t bases syscalls. - Implement a couple of 64-bit file size/offset bases syscalls. o Fix or improve numerous syscalls and prototypes. o Reduce style(9) violations while I'm here. Especially indentation inconsistencies within the same file are addressed. Re-indenting did not obfuscate actual changes to the extend that it could not be combined. NOTE: I spend some time testing these changes and found that if there were regressions, they were not caused by these changes AFAICT. It was observed that installing a RH 7.1 runtime environment did make matters worse. Hangs and/or reboots have been observed with and without these changes, so when it failed to make life better in cases it doesn't look like it made it worse.	2001-09-08 19:07:04 +00:00
Jim Pirzyk	814c95264f	Added the linux_sysinfo function to implement sysinfo(2). PR: kern/27759 Reviewed by: marcel Approved by: marcel MFC after: 1 week	2001-07-23 06:22:10 +00:00
Peter Wemm	2e17a05929	Fix warning: 413: warning: long unsigned int format, vm_offset_t arg (arg 2)	2001-06-15 07:46:18 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Robert Watson	c7e1887023	o Change a suser() call to a suser_xxx(..., PRISON_ROOT) call in the linuxulator so as to allow privileged processes within a jail() to invoke the Linux initgroups() system call. This allows the Linux "su" to work properly (better) when running a complete Linux environment under jail(). This problem was reported by Attila Nagy <bra@fsn.hu>. Reviewed by: marcel	2001-04-24 19:08:53 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
Alan Cox	21c8cdfb96	Add linux_sched_get_priority_max() and linux_sched_get_priority_min(): The policy parameter requires translation.	2001-04-01 06:37:40 +00:00
Andrew Gallatin	6d4aa00ac1	fix linux_times() to take into account linux's value of CLK_TCK on the alpha. Previously, results were off by a factor of 10 Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>	2001-03-23 19:22:21 +00:00
Jonathan Lemon	2459336973	Allow debugging output to be controlled on a per-syscall granularity. Also clean up debugging output in a slightly more uniform fashion. The default behavior remains the same (all debugging output is turned on)	2001-02-16 16:40:43 +00:00
Jonathan Lemon	705deb78a3	Add mount syscall to linux emulation. Also improve emulation of reboot.	2001-02-16 14:42:11 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00

1 2 3 4 5 ...

343 commits