opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-06-03 13:58:30 -04:00

Author	SHA1	Message	Date
Mateusz Guzik	2796c209b0	vfs: stop refing freed mount points in vop_stdgetwritemount The code used to blindly ref based on an unsafely read address and then would backpedal if necessary. This was safe in terms of memory access since mounts are type-stable, but made for a potential bug where the mount was reused and had the count reset to 0 before this code decreased it. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21411	2019-09-01 14:01:09 +00:00
Kyle Evans	32287ea72b	posixshm: switch to OBJT_SWAP in advance of other changes Future changes to posixshm will start tracking writeable mappings in order to support file sealing. Tracking writeable mappings for an OBJT_DEFAULT object is complicated as it may be swapped out and converted to an OBJT_SWAP. One may generically add this tracking for vm_object, but this is difficult to do without increasing memory footprint of vm_object and blowing up memory usage by a significant amount. On the other hand, the swap pager can be expanded to track writeable mappings without increasing vm_object size. This change is currently in D21456. Switch over to OBJT_SWAP in advance of the other changes to the swap pager and posixshm.	2019-09-01 00:33:16 +00:00
Mateusz Guzik	c2b600f98f	vfs: add a missing VNODE_REFCOUNT_FENCE_REL to v_incr_usecount_locked Sponsored by: The FreeBSD Foundation	2019-08-30 21:54:45 +00:00
Mateusz Guzik	3bb8d8d8c9	vfs: tidy up assertions in vfs_subr - assert unlocked vnode interlock in vref - assert right counts in vputx - print debug info for panic in vdrop Sponsored by: The FreeBSD Foundation	2019-08-30 00:45:53 +00:00
Konstantin Belousov	6470c8d3db	Rework v_object lifecycle for vnodes. The current implementation of vnode_create_vobject() and vnode_destroy_vobject() is written so that it is prepared to handle the vm object destruction for a live vnode. Practically, no filesystems use this, except for some remnants that were present in UFS till today. One of the consequences of that model is that each filesystem must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier; as a result, all of them get rid of the v_object in reclaim. Move the call to vnode_destroy_vobject() to vgonel() before VOP_RECLAIM(). This makes v_object stable: either the object is NULL, or it is a valid vm object till the vnode reclamation. Remove code from vnode_create_vobject() to handle races with the parallel destruction. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21412	2019-08-29 07:50:25 +00:00
Mateusz Guzik	1e2f0ceb2f	vfs: add VOP_NEED_INACTIVE vnode usecount drops to 0 all the time (e.g. for directories during path lookup). When that happens the kernel would always lock the exclusive lock for the vnode in order to call vinactive(). This blocks other threads that want to use the vnode for lookup. vinactive is very rarely needed and can be tested for without the vnode lock held. This patch gives filesystems an opportunity to do it. Sample total wait time for tmpfs over 500 minutes of poudriere -j 104: before: 557563641706 (lockmgr:tmpfs) after: 46309603301 (lockmgr:tmpfs) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21371	2019-08-28 20:34:24 +00:00
Mark Johnston	772dd133c6	Avoid direct accesses of the vm_page wire_count field. No functional change intended. Sponsored by: Netflix	2019-08-28 18:01:54 +00:00
Mateusz Guzik	88cc62e5a5	proc: eliminate the zombproc list It is not needed by anything in the kernel and it slightly drives up contention on both proctree and allproc locks. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21447	2019-08-28 16:18:23 +00:00
Mark Johnston	b5d239cb97	Wire pages in vm_page_grab() when appropriate. uiomove_object_page() and exec_map_first_page() would previously wire a page after having grabbed it. Ask vm_page_grab() to perform the wiring instead: this removes some redundant code, and is cheaper in the case where the requested page is not resident since the page allocator can be asked to initialize the page as wired, whereas a separate vm_page_wire() call requires the page lock. In vm_imgact_hold_page(), use vm_page_unwire_noq() instead of vm_page_unwire(PQ_NONE). The latter ensures that the page is dequeued before returning, but this is unnecessary since vm_page_free() will trigger a batched dequeue of the page. Reviewed by: alc, kib Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21440	2019-08-28 16:08:06 +00:00
Mateusz Guzik	2319489b6e	proc: remove zpfind It is not used by anything. If someone wants it back it should be reimplemented to use the proc hash. Sponsored by: The FreeBSD Foundation	2019-08-28 01:22:21 +00:00
John Baldwin	818d755318	Only define the 'tls' member of sfio if KERN_TLS is defined. This field was not initialized in the !KERN_TLS case, triggering an assertion failure when using sendfile(2). Reported by: pho, asomers Sponsored by: Netflix	2019-08-27 22:21:18 +00:00
Mateusz Guzik	368cabbcb5	vfs: stop passing LK_INTERLOCK to VOP_UNLOCK The plan is to drop the flags argument. There is also a temporary bug now that nullfs ignores the flag. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21252	2019-08-27 20:30:56 +00:00
Mark Johnston	44e4def73b	Remove an extraneous + 1 in _domainset_create(). DOMAINSET_FLS, like our fls(), is 1-indexed. Reported by: alc MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-08-27 15:42:08 +00:00
Mark Johnston	8e6975047e	Fix several logic issues in domainset_empty_vm(). - Don't add 1 to the result of DOMAINSET_FLS. - Do not modify domainsets containing only empty domains. - Always flatten a _PREFER policy to _ROUNDROBIN if the preferred domain is empty. Previously we were doing this only when ds_cnt > 1. These bugs could cause hangs during boot if a VM domain is empty. Tested by: hselasky Reviewed by: hselasky, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21420	2019-08-27 14:06:34 +00:00
Konstantin Belousov	95acb40caa	vn_vget_ino_gen(): relock the lower vnode on error. The function's interface assumes that the lower vnode is always passed and returned locked. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-08-27 08:28:38 +00:00
John Baldwin	b2e60773c6	Add kernel-side support for in-kernel TLS. KTLS adds support for in-kernel framing and encryption of Transport Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports offload of TLS for transmitted data. Key negotiation must still be performed in userland. Once completed, transmit session keys for a connection are provided to the kernel via a new TCP_TXTLS_ENABLE socket option. All subsequent data transmitted on the socket is placed into TLS frames and encrypted using the supplied keys. Any data written to a KTLS-enabled socket via write(2), aio_write(2), or sendfile(2) is assumed to be application data and is encoded in TLS frames with an application data type. Individual records can be sent with a custom type (e.g. handshake messages) via sendmsg(2) with a new control message (TLS_SET_RECORD_TYPE) specifying the record type. At present, rekeying is not supported though the in-kernel framework should support rekeying. KTLS makes use of the recently added unmapped mbufs to store TLS frames in the socket buffer. Each TLS frame is described by a single ext_pgs mbuf. The ext_pgs structure contains the header of the TLS record (and trailer for encrypted records) as well as references to the associated TLS session. KTLS supports two primary methods of encrypting TLS frames: software TLS and ifnet TLS. Software TLS marks mbufs holding socket data as not ready via M_NOTREADY similar to sendfile(2) when TLS framing information is added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then called to schedule TLS frames for encryption. In the case of sendfile(2), sendfile_iodone() calls ktls_enqueue() instead of pru_ready(), leaving the mbufs marked M_NOTREADY until encryption is completed. For other writes (vn_sendfile when pages are available, write(2), etc.), the PRUS_NOTREADY flag is set when invoking pru_send() along with invoking ktls_enqueue(). A pool of worker threads (the "KTLS" kernel process) encrypts TLS frames queued via ktls_enqueue().
Each TLS frame is temporarily mapped using the direct map and passed to a software encryption backend to perform the actual encryption. (Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if someone wished to make this work on architectures without a direct map.) KTLS supports pluggable software encryption backends. Internally, Netflix uses proprietary pure-software backends. This commit includes a simple backend in a new ktls_ocf.ko module that uses the kernel's OpenCrypto framework to provide AES-GCM encryption of TLS frames. As a result, software TLS is now a bit of a misnomer as it can make use of hardware crypto accelerators. Once software encryption has finished, the TLS frame mbufs are marked ready via pru_ready(). At this point, the encrypted data appears as regular payload to the TCP stack stored in unmapped mbufs. ifnet TLS permits a NIC to offload the TLS encryption and TCP segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS) is allocated on the interface a socket is routed over and associated with a TLS session. TLS records for a TLS session using ifnet TLS are not marked M_NOTREADY but are passed down the stack unencrypted. The ip_output_send() and ip6_output_send() helper functions that apply send tags to outbound IP packets verify that the send tag of the TLS record matches the outbound interface. If so, the packet is tagged with the TLS send tag and sent to the interface. The NIC device driver must recognize packets with the TLS send tag and schedule them for TLS encryption and TCP segmentation. If the outbound interface does not match the interface in the TLS send tag, the packet is dropped. In addition, a task is scheduled to refresh the TLS send tag for the TLS session. If a new TLS send tag cannot be allocated, the connection is dropped. If a new TLS send tag is allocated, however, subsequent packets will be tagged with the correct TLS send tag.
(This latter case has been tested by configuring both ports of a Chelsio T6 in a lagg and failing over from one port to another. As the connections migrated to the new port, new TLS send tags were allocated for the new port and connections resumed without being dropped.) ifnet TLS can be enabled and disabled on supported network interfaces via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported across both vlan devices and lagg interfaces using failover, lacp with flowid enabled, or lacp with flowid enabled. Applications may request the current KTLS mode of a connection via a new TCP_TXTLS_MODE socket option. They can also use this socket option to toggle between software and ifnet TLS modes. In addition, a testing tool is available in tools/tools/switch_tls. This is modeled on tcpdrop and uses similar syntax. However, instead of dropping connections, -s is used to force KTLS connections to switch to software TLS and -i is used to switch to ifnet TLS. Various sysctls and counters are available under the kern.ipc.tls sysctl node. The kern.ipc.tls.enable node must be set to true to enable KTLS (it is off by default). The use of unmapped mbufs must also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS. KTLS is enabled via the KERN_TLS kernel option. This patch is the culmination of years of work by several folks including Scott Long and Randall Stewart for the original design and implementation; Drew Gallatin for several optimizations including the use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records awaiting software encryption, and pluggable software crypto backends; and John Baldwin for modifications to support hardware TLS offload. Reviewed by: gallatin, hselasky, rrs Obtained from: Netflix Sponsored by: Netflix, Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21277	2019-08-27 00:01:56 +00:00
Xin LI	4e8671dd78	GZIO: Update to use zlib 1.2.11. PR: 229763 Submitted by: Yoshihiro Ota <ota j email ne jp> Differential Revision: https://reviews.freebsd.org/D21408	2019-08-25 07:50:44 +00:00
Mateusz Guzik	0256405e98	vfs: add vholdnz (for already held vnodes) Reviewed by: kib (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21358	2019-08-25 05:11:43 +00:00
Mateusz Guzik	5b596b9fa5	Remove the obsolete pcpu_zone_ptr zone. It was only used by flowtable (removed in r321618). Sponsored by: The FreeBSD Foundation	2019-08-24 00:01:19 +00:00
Konstantin Belousov	e671edac06	De-commission the MNTK_NOINSMNTQ kernel mount flag. After all the changes, its dynamic scope is the same as for MNTK_UNMOUNT, but to allow the syncer vnode to be re-installed on unmount failure. But the case of syncer was already handled by using the VV_FORCEINSMQ flag for quite some time. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-08-23 19:40:10 +00:00
Xin LI	a11bf9a49b	INVARIANTS: treat LA_LOCKED the same as LA_XLOCKED in mtx_assert. The Linux lockdep API assumes LA_LOCKED semantics in lockdep_assert_held(), meaning that either a shared lock or a write lock is OK. On the other hand, the timeout code uses lc_assert() with LA_XLOCKED, and we need both to work. For mutexes, because they cannot be shared (this is unique among all lock classes, and it is unlikely that we would add a new lock class anytime soon), it is easier to simply extend mtx_assert to handle LA_LOCKED there, although the change itself can be viewed as a slight abstraction violation. Reviewed by: mjg, cem, jhb MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D21362	2019-08-23 06:39:40 +00:00
Brooks Davis	075ac3b446	Reorganise conditionals to reduce duplication. No functional change. Obtained from: CheriBSD MFC after: 3 days Sponsored by: DARPA, AFRL	2019-08-22 10:21:07 +00:00
Rick Macklem	df9bc7df42	Map ENOTTY to EINVAL for lseek(SEEK_DATA/SEEK_HOLE). Without this patch, when an application performed lseek(SEEK_DATA/SEEK_HOLE) on a file in a file system that does not have its own VOP_IOCTL(), the lseek(2) fails with errno ENOTTY. This didn't seem appropriate, since ENOTTY is not listed as an error return by either the lseek(2) man page or the POSIX draft for lseek(2). This was discussed on freebsd-current@ here: http://docs.FreeBSD.org/cgi/mid.cgi?CAOtMX2iiQdv1+15e1N_r7V6aCx_VqAJCTP1AW+qs3Yg7sPg9wA This trivial patch maps ENOTTY to EINVAL for lseek(SEEK_DATA/SEEK_HOLE). Reviewed by: markj Relnotes: yes Differential Revision: https://reviews.freebsd.org/D21300	2019-08-22 01:15:06 +00:00
Mark Johnston	5b699f1614	Add lockmgr(9) probes to the lockstat DTrace provider. They follow the conventions set by rw and sx lock probes. There is an additional lockstat:::lockmgr-disown probe. Update lockstat(1) to report on contention and hold events for lockmgr locks. Document the new probes in dtrace_lockstat.4, and deduplicate some of the existing probe descriptions. Reviewed by: mjg MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21355	2019-08-21 23:43:58 +00:00
Mark Johnston	9fb7c918ef	Remove manual wire_count adjustments from the unmapped mbuf code. The original code came from a desire to minimize the number of updates to v_wire_count, which prior to r329187 was updated using atomics. However, there is no significant benefit to batching today, so simply allocate pages using VM_ALLOC_WIRED and rely on system accounting. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D21323	2019-08-21 20:01:52 +00:00
Mark Johnston	6bc13e042f	Modify pipe_poll() to properly check for pending direct writes. With r349546, it is a responsibility of the writer to clear PIPE_DIRECTW after pinned data has been read. In particular, once a reader has drained this data, there is a small window where the pipe is empty but PIPE_DIRECTW is set. pipe_poll() was using the presence of PIPE_DIRECTW to determine whether to return POLLIN, so in this window it would claim that data was available to read when this was not the case. Fix this by modifying several checks for PIPE_DIRECTW to instead look at the number of residual bytes in data pinned by a direct writer. In some cases we really do want to check for PIPE_DIRECTW, since the presence of this flag indicates that any attempt to write to the pipe will block on the existing direct writer. Bisected and test case provided by: mav Tested by: pho Reviewed by: kib MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21333	2019-08-21 19:35:04 +00:00
Ed Maste	f37192064a	mqueuefs: fix compat32 struct file leak In a compat32 error case we previously leaked a struct file. Submitted by: Karsten König, Secfault Security Security: CVE-2019-5603	2019-08-20 17:44:03 +00:00
Jeff Roberson	cf27e0d125	Use an atomic reference count for paging in progress so that callers do not require the object lock. Reviewed by: markj Tested by: pho (as part of a larger branch) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21311	2019-08-19 23:09:38 +00:00
Mateusz Guzik	4b3f767340	vfs: fix up r351193 ("stop always overwriting ->mnt_stat in VFS_STATFS") The fs-specific parts of vfs_statfs routines only fill in a small portion of the structure. Previous code was always copying everything at a higher layer to accommodate it and this patch does the same. 'df' (no arguments) worked fine because the caller uses mnt_stat itself as the target buffer, making all the copying a no-op for its own case. 'df /' and similar use a different consumer which passes its own buffer and this is where you can run into trouble. Reported by: cy Fixes: r351193 Sponsored by: The FreeBSD Foundation	2019-08-19 14:11:54 +00:00
Andrey V. Elsukov	75697b16b6	Use TAILQ_FOREACH_SAFE() macro to avoid use after free in soclose(). PR: 239893 MFC after: 1 week	2019-08-19 12:42:03 +00:00
Andriy Gapon	0db7afd0ae	assert that td_lk_slocks is not leaked upon return from kernel This is similar to checks for td_sx_slocks and td_rw_rlocks. Although td_lk_slocks is an implementation detail, it still makes sense to validate it. MFC after: 1 week Sponsored by: Panzura	2019-08-19 11:18:36 +00:00
Rick Macklem	2e1b32c0e3	Add a vop_stdioctl() that performs a trivial FIOSEEKDATA/FIOSEEKHOLE. Without this patch, when an application performed lseek(SEEK_DATA/SEEK_HOLE) on a file in a file system that does not have its own VOP_IOCTL(), the lseek(2) fails with errno ENOTTY. This didn't seem appropriate, since ENOTTY is not listed as an error return by either the lseek(2) man page or the POSIX draft for lseek(2). A discussion on freebsd-current@ seemed to indicate that implementing a trivial algorithm that returns the offset argument for FIOSEEKDATA and returns the file's size for FIOSEEKHOLE was the preferred fix. http://docs.FreeBSD.org/cgi/mid.cgi?CAOtMX2iiQdv1+15e1N_r7V6aCx_VqAJCTP1AW+qs3Yg7sPg9wA The Linux kernel appears to implement this trivial algorithm as well. This patch adds a vop_stdioctl() that implements this trivial algorithm. It returns errors consistent with vn_bmap_seekhole() and, as such, will still return ENOTTY for non-regular files. I have proposed a separate patch that maps errors not described by the lseek(2) man page or the POSIX draft to EINVAL. This patch is under separate review. Reviewed by: kib Relnotes: yes Differential Revision: https://reviews.freebsd.org/D21299	2019-08-19 00:29:05 +00:00
Konstantin Belousov	de4e1aeb21	Fix an issue with executing tmpfs binary. Suppose that a binary was executed from a tmpfs mount, and the text vnode was reclaimed while the binary was still running. This is possible even during normal operation, since the tmpfs vnode's vm_object has swap type and no references on the vnode are held. Also assume that the text vnode was revived for some reason. Then, on the process exit or exec, unmapping of the text mapping tries to remove the text reference from the vnode, but since it went through a recycle/instantiation cycle, there is no reference kept, and the assertion in VOP_UNSET_TEXT_CHECKED() triggers. Fix this by keeping a use reference on the tmpfs vnode for each exec reference. This prevents the vnode reclamation while an executable map entry is active. Do it by adding a per-mount flag MNTK_TEXT_REFS that directs vop_stdset_text() to add a use ref on first vnode text use, and a per-vnode VI_TEXT_REF flag, to record the need for an unref in vop_stdunset_text() on the last vnode text use going away. Set MNTK_TEXT_REFS for tmpfs mounts. Reported by: bdrewery Tested by: sbruno, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-08-18 20:36:11 +00:00
Konstantin Belousov	bb9e2184f0	Change locking requirements for VOP_UNSET_TEXT(). Require the vnode to be locked for the VOP_UNSET_TEXT() call. This will be used by the following bug fix for a tmpfs issue. Tested by: sbruno, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-08-18 20:24:52 +00:00
Mateusz Guzik	e7c1709aaf	vfs: stop always overwriting ->mnt_stat in VFS_STATFS The struct is already populated on each mount (and remount). Fields are either constant or not used by the filesystem in the first place. Some infrequently used functions use it to avoid having to allocate a new buffer and are left alone. The current code results in avoidable copying single-threaded and significant cache line bouncing multithreaded. While here, deduplicate initial filling of the struct. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21317	2019-08-18 18:40:12 +00:00
Jeff Roberson	33205c60e7	Add a blocking wait bit to refcount. This allows refs to be used as a simple barrier. Reviewed by: markj, kib Discussed with: jhb Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21254	2019-08-18 11:43:58 +00:00
Mateusz Guzik	50c7615fb0	fork: rework locking around do_fork - move allproc lock into the func, it is of no use prior to it - the code would lock p1 and p2 while holding allproc to partially construct it after it gets added to the list. instead we can do the work prior to adding anything. - protect lastpid with procid_lock As a side effect we do less work with allproc held. Sponsored by: The FreeBSD Foundation	2019-08-17 18:19:49 +00:00
Mateusz Guzik	60cdcb644d	fork: bump process count before checking for permission to cross the limit The limit is almost never reached. Do the check only on failure to see if we can override it. No change in user-visible behavior. Sponsored by: The FreeBSD Foundation	2019-08-17 17:56:43 +00:00
Mateusz Guzik	b05641b6bd	fork: stop skipping < 100 ids on wrap around Code doing this is commented with a claim that these IDs are occupied by daemons, but that's demonstrably false. To an extent the range is used by init and kernel processes (and on sufficiently big machines it indeed is fully populated). On a sample 40-way box the highest id in the range is 63. On a different one it is 23. Just use the range. Sponsored by: The FreeBSD Foundation	2019-08-17 17:42:01 +00:00
Alexander Motin	3a60f3dad0	Add support for 'j', 't' and 'z' flags to kernel sscanf(). MFC after: 2 weeks	2019-08-16 19:46:22 +00:00
Jeff Roberson	2194393787	Move phys_avail definition into MI code. It is consumed in the MI layer and doing so adds more flexibility with less redundant code. Reviewed by: jhb, markj, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21250	2019-08-16 00:45:14 +00:00
Rick Macklem	c61b14315f	Fix copy_file_range(2) so that unneeded blocks are not allocated to the output file. When the byte range for copy_file_range(2) doesn't go to EOF on the output file and there is a hole in the input file, a hole must be "punched" in the output file. This is done by writing a block of bytes all set to 0. Without this patch, the write is done unconditionally, which means that, if the output file already has a hole in that byte range, an unneeded data block of all 0 bytes would be allocated. This patch adds code to check for a hole in the output file, so that it can skip doing the write if there is already a hole in that byte range of the output file. This avoids unnecessary allocation of blocks to the output file. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D21155	2019-08-15 23:21:41 +00:00
Jeff Roberson	018ff6860f	Move scheduler state into the per-cpu area where it can be allocated on the correct NUMA domain. Reviewed by: markj, gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19315	2019-08-13 04:54:02 +00:00
Konstantin Belousov	7e097daa93	Only enable COMPAT_43 changes for syscalls ABI for a.out processes. Reviewed by: imp, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21200	2019-08-11 19:16:07 +00:00
Jonathan T. Looney	afd959f332	In m_pulldown(), before trying to prepend bytes to the subsequent mbuf, ensure that the subsequent mbuf contains the remainder of the bytes the caller sought. If this is not the case, fall through to the code which gathers the bytes in a new mbuf. This fixes a bug where m_pulldown() could fail to gather all the desired bytes into consecutive memory. PR: 238787 Reported by: A reddit user Discussed with: emaste Obtained from: NetBSD MFC after: 3 days	2019-08-09 05:18:59 +00:00
Rick Macklem	6b1bc6f7dd	Remove some harmless cruft from vn_generic_copy_file_range(). An earlier version of the patch had code that set "error" between line#s 2797-2799. When that code was moved, the second check for "error != 0" could never be true and the check became harmless cruft. This patch removes the cruft, mainly to make Coverity happy. Reported by: asomers, cem	2019-08-08 20:07:38 +00:00
Rick Macklem	614633146f	Fix copy_file_range(2) for an unlikely race during hole finding. Since the VOP_IOCTL(FIOSEEKDATA/FIOSEEKHOLE) calls are done with the vnode unlocked, it is possible for another thread to do: - truncate(), lseek(), write() between the two calls and create a hole where FIOSEEKDATA returned the start of data. For this case, VOP_IOCTL(FIOSEEKHOLE) will return the same offset for the hole location. This could result in an infinite loop in the copy code, since copylen is set to 0 and the copy doesn't advance. Usually, this race is avoided because of the use of rangelocks, but the NFS server does not do range locking and could do a sequence like the above to create the hole. This patch checks for this case and makes the hole search fail, to avoid the infinite loop. At this time, it is an open question as to whether or not the NFS server should do range locking to avoid this race.	2019-08-08 19:53:07 +00:00
Konstantin Belousov	b706be23b4	Update comment explaining create_init(). Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-08-08 16:42:53 +00:00
Xin LI	22bbc4b242	Convert DDB_CTF to use newer version of ZLIB. PR: 229763 Submitted by: Yoshihiro Ota <ota j email ne jp> Differential Revision: https://reviews.freebsd.org/D21176	2019-08-08 07:27:49 +00:00
Conrad Meyer	7d0658ad55	Fix !DDB kernel configurations after r350713 KDB is standard and the kdb_active variable is always available. So, de-conditionalize inclusion of sys/kdb.h in kern_sysctl.c. Reported by: Michael Butler <imb AT protected-networks.net> X-MFC-With: r350713 Sponsored by: Dell EMC Isilon	2019-08-08 01:37:41 +00:00
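
Illustrative note on commit 2e1b32c0e3 above: the "trivial algorithm" it describes (return the offset argument for FIOSEEKDATA, return the file's size for FIOSEEKHOLE) can be modelled in a few lines. This is only a sketch in Python, not the kernel's vop_stdioctl() code. The function name `trivial_seek`, the string values for `whence`, and the choice to fail with ENXIO for offsets outside the file are assumptions made for illustration; the commit message itself only states the two return values.

```python
import errno
import os


def trivial_seek(whence, offset, file_size):
    """Model a file with no hole information as one data region.

    Hypothetical helper, not a FreeBSD API. The whole file is treated as
    data, followed by a single virtual hole that starts at end of file.
    """
    if whence not in ("data", "hole"):
        raise ValueError("whence must be 'data' or 'hole'")
    # Assumption: offsets outside [0, file_size) fail with ENXIO, which is
    # the error lseek(2) documents for seeking data/holes past end of file.
    if offset < 0 or offset >= file_size:
        raise OSError(errno.ENXIO, os.strerror(errno.ENXIO))
    if whence == "data":
        # Every in-range offset is already inside data.
        return offset
    # The only hole is the implicit one at end of file.
    return file_size
```

Under these assumptions, `trivial_seek("data", 10, 100)` yields the offset unchanged and `trivial_seek("hole", 10, 100)` yields the file size, so a caller that loops over data/hole pairs sees one data extent covering the whole file.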


16781 commits