While we have mechanisms in place to protect ourselves against the read
behind the disk end, there is still one corner case. As the GPT
partition table has backup table at the end of the disk, and we yet
do not know the size of the disk (if the wrong size is provided by the
firmware/bios), we need to limit the reads to avoid read ahead in such case.
Note: this update does add constant into stand.h, so the incremental build
will need to get local stand.h updated first.
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D10187
As we provide the disk size verification and correction via disk_ioctl
and disk state provided by disk_open(), we can not share the partition
state in disk_devdesc structure. Also the sharing does make a lot of sense
with ufs, as only one partition is open at any given time, but zfs pools
do keep the disk devices open.
To make sure we do get the correct information about the open device,
just remove the cache.
Reviewed by: allanjude, smh
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9757
Need interface to extract information about disk abstraction,
to read disk or partition size depending on the provided argument
and adjust disk size based on information in partition table.
The disk handle from disk_open() has d_offset field to point to
partition start. So we can use this fact to return either whole disk
size or partition size. For this we only need to record partition size
we get from disk_open() anyhow.
In addition, this will also make it possible to adjust the disk media size
based on information from partition table. The problem with disk size is
about some BIOS systems reporting bogus disk size for 2+TB disks, but
since such disks are using GPT partitioning, and GPT does have information
about disk size (alternate LBA + 1), we can use this fact to record disk
size based on partition table.
This patch does exactly this: implements DIOCGSECTORSIZE and DIOCGMEDIASIZE
ioctl, and DIOCGMEDIASIZE will report either disk media size or partition size.
Adds ptable_getsize() call to read partition size in bytes from ptable pointer.
Updates disk_open() to use ptable_getsize() to update mediasize value.
Implements GPT detection function to update ptable size (used by
ptable_getsize()) according to alternate lba (which is location of backup copy
of GPT header table).
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D8594
The disk_* and part_* api is using 64bit values for media size and
offsets. However, the current api is using off_t type, which is signed
64-bit int.
In this context the signed media size does not make any sense, and
the offsets are used to mark absolute, not relative locations.
Also, the data from GPT partition table and some other sources is
already using uint64_t data type, so using signed off_t can cause sign
issues.
Reviewed by: imp
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D8710
Apparently the libstand dosfs optimization is a bit too optimistic
and did introduce possible memory corruption.
This patch is backing out the bad part and since this results in
dosfs reading full blocks now, we can also remove extra offset argument
from dv_strategy callback.
The analysis of the issue and the backout patch is provided by Mikhail Kupchik.
PR: 214423
Submitted by: Mikhail Kupchik
Reported by: Mikhail Kupchik
Reviewed by: bapt, allanjude
Approved by: allanjude (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8644
return value when it could return 1 (indicating we should stop).
Fix a few instances of pager_open() / pager_close() not being called.
Actually use these routines for the environment variable printing code
I just committed.
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
The initial IOCTL implementation supports reading disk physical
geometry.
Two additional functions were added. They allow reading/writing raw
data to the disk (default partition).
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: Juniper Networks Inc.
Differential Revision: https://reviews.freebsd.org/D4143
disk_open(). Very often this is called several times for one file.
This leads to reading partition table metadata for each call. To
reduce the number of disk I/O we have a simple block cache, but it
is very dumb and more than half of I/O operations related to reading
metadata, misses this cache.
Introduce new cache layer to resolve this problem. It is independent
and doesn't need initialization like bcache, and will work by default
for all loaders which use the new DISK API. A successful disk_open()
call to each new disk or partition produces new entry in the cache.
Even more, when disk was already open, now opening of any nested
partitions does not require reading top level partition table.
So, if without this cache, partition table metadata was read around
20-50 times during boot, now it reads only once. This affects the booting
from GPT and MBR from the UFS.
number is not exactly specified. When the disk has MBR, also try to read
BSD label after ptable_getpart() call. When the disk has GPT, also set
d_partition to 255. Mostly, this is how it worked before.
When we open the disk, check the type of partition table, that has
been detected. If this is BSD label, then we assume this is DD mode.
Reported by: dim@
It uses new API from the part.c to work with partition tables.
Update userboot's disk driver to use new API. Note that struct
loader_callbacks_v1 has changed.