Implement the read-only repository check by hashing pack and index
objects and comparing against the stored hashes, without writing to the
repository.
Report check progress separately for the index and for the packs, each
ending at 100%.
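A minimal sketch of the hash-and-compare idea, assuming each object is stored under the hex sha256 of its bytes. The `verify_objects` function and the in-memory `store` mapping are hypothetical stand-ins, not borg's repository API.

```python
import hashlib


def verify_objects(store):
    """Return the names of objects whose content hash does not match.

    `store` is a hypothetical mapping of hex sha256 name -> object bytes.
    It is only read, never modified, so the check stays read-only.
    """
    mismatched = []
    for name, data in store.items():
        if hashlib.sha256(data).hexdigest() != name:
            mismatched.append(name)
    return mismatched


good = b"pack contents"
store = {
    hashlib.sha256(good).hexdigest(): good,  # name matches content
    "00" * 32: b"tampered",                  # name does not match content
}
mismatched = verify_objects(store)
```

Only the entry whose name is not the hash of its bytes is reported; the store itself is left untouched.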
Port of borgbackup/borg#9790 to master.
Note: the PR's second commit (Archive.delete: don't reuse msgpack Unpacker
after an unpacking failure) does not apply here - on master Archive.delete
no longer unpacks item metadata (it just removes the archive and lets
"borg compact" reclaim space), so the reuse-after-failure code path does
not exist. The streaming RobustUnpacker already creates a fresh Unpacker
on resync.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Remove the dead BORG_TESTONLY_SHA256_PACK_ID tox env and CI job, fix the
packs.rst pack-id docs, and reword comments/tests to describe sha256
pack naming instead of the removed shortcut.
Drop the single-chunk shortcut that reused the chunk_id as the pack_id.
A pack is now always named by sha256 of its bytes, even when it holds a
single chunk, so no code can depend on pack_id == chunk_id.
The repo-create.rst examples were not updated after --encryption was
split into --encryption + --id-hash. Use the real mode name (aes256-ocb)
and show --id-hash as an orthogonal option.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The new borgstore-based Repository inherited put(..., wait=) / delete(..., wait=)
and async_response() from the legacy RemoteRepository's pipelined RPC protocol.
In the new architecture these are dead: borgstore's API is strictly synchronous,
so put/delete ignored wait and ran synchronously, and async_response() was an
empty stub always returning None.
Remove wait and async_response() so the synchronous behavior is explicit, and
clean up every caller that still threaded wait=False / drained async_response()
(cache.add_chunk, archive.py, transfer_cmd.py, and the archive_test mock).
The legacy repository/remote keep their real wait/async implementation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
NAME was only read by "borg repo-info" to show the crypto suite. Remove it from
all (current and legacy) key classes and let repo-info assemble the display from
the two real dimensions instead: "<mode>, <ENC_NAME>, <IDHASH_NAME>", e.g.
"Encrypted: Yes (repokey, aes256-ocb, sha256)" or
"Encrypted: No (repokey, authenticated, blake3)".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The "you need KEY AND PASSPHRASE" warning was gated on key.NAME != "plaintext",
a brittle dependency on a human-readable display string. Use the cipher dimension
instead: only plaintext has ENC_NAME == "none"; every key-bearing suite (including
the authenticated ones, which should warn) differs.
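The gate can be sketched as follows. The three key classes are hypothetical stand-ins that carry only the ENC_NAME attribute; `needs_key_warning` is an illustrative name, not a function from the code base.

```python
class PlaintextKey:
    ENC_NAME = "none"           # no key material at all


class AuthenticatedKey:
    ENC_NAME = "authenticated"  # has a key, but does not encrypt


class AesOcbKey:
    ENC_NAME = "aes256-ocb"


def needs_key_warning(key):
    """Warn for every suite that has a key, i.e. everything but plaintext."""
    return key.ENC_NAME != "none"
```

Testing the cipher dimension keeps the check independent of any display string.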
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The keyfile detection used `key.NAME.startswith("key file")`, but since the
keyfile/repokey unification no key class has such a NAME, so encryption.keyfile
was never emitted (even for keyfile repos). Decide it from the key storage
(KeyBlobStorage.KEYFILE), matching the text repo-info output. Add a test.
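A rough sketch of deciding the flag from the storage kind rather than from a display name. The enum below is a simplified stand-in for KeyBlobStorage, and the `storage` attribute and `uses_keyfile` helper are assumptions made for illustration.

```python
import enum


class KeyBlobStorage(enum.Enum):
    # Simplified stand-in; the real constants may be defined differently.
    KEYFILE = "keyfile"
    REPO = "repository"


class Key:
    def __init__(self, storage):
        self.storage = storage  # hypothetical attribute name


def uses_keyfile(key):
    """True only when the key blob is stored in a key file."""
    return key.storage is KeyBlobStorage.KEYFILE
```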
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ARG_NAME had no readers left after the JSON output switched to ENC_NAME/IDHASH_NAME,
so remove it from all (current and legacy) key classes.
Give the legacy (borg 1.x, read-only) key classes the two dimensions so that
repo-info/list --json reports a meaningful crypto suite for legacy repos instead
of null/null:
- AESCTRKey: encryption=aes256-ctr, id_hash=sha256
- Blake2AESCTRKey: encryption=aes256-ctr, id_hash=blake2
- Blake2AuthenticatedKey: encryption=authenticated, id_hash=blake2
IDHASH_NAME="blake2" is set on the ID_BLAKE2b_256 mix-in (parallel to the
ID_HMAC_SHA_256 / ID_BLAKE3_256 mix-ins). These legacy values never become CLI
choices: encryption_argument_names()/id_hash_argument_names() only iterate
AVAILABLE_KEY_TYPES, not LEGACY_KEY_TYPES.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stop using key.ARG_NAME (the combined crypto-suite name) for the JSON output.
The "encryption" object now mirrors the split CLI options: the "mode" field is
replaced by "encryption" (cipher / AE algorithm, key.ENC_NAME) and "id_hash"
(id hash function, key.IDHASH_NAME).
This is a breaking change to the documented JSON API.
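As an illustration of the new shape, assuming a key object that exposes ENC_NAME and IDHASH_NAME: the `encryption_info` helper and the `DemoKey` class are hypothetical, and the real "encryption" object may carry additional fields.

```python
import json


class DemoKey:
    # Hypothetical key class carrying the two dimensions.
    ENC_NAME = "aes256-ocb"
    IDHASH_NAME = "sha256"


def encryption_info(key):
    """Build the two-field object that replaces the old single "mode" field."""
    return {"encryption": key.ENC_NAME, "id_hash": key.IDHASH_NAME}


encoded = json.dumps({"encryption": encryption_info(DemoKey())}, sort_keys=True)
```

Consumers that previously read the "mode" field would need to read both fields instead.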
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The combined --encryption value packed two orthogonal dimensions (cipher / AE
algorithm and id hash function) into a single string, causing a combinatorial
explosion of mode names. Key location was already split out into --key-location.
Now:
- --encryption selects only the cipher / AE algorithm:
  none, authenticated, aes256-ocb, chacha20-poly1305
- --id-hash selects the id hash function: sha256 (default) or blake3
- --key-location (unchanged) selects key storage: repokey (default) or keyfile
The old combined names were removed (clean break): select a BLAKE3 suite via
--encryption ... --id-hash blake3 instead of blake3-*. aes-ocb was renamed to
aes256-ocb (key NAME shown by repo-info and ARG_NAME in JSON updated to match).
"none" has no key, so it only supports the sha256 id hash.
No on-disk format, key-type byte, or crypto behavior changes: the existing key
classes form a clean cross-product of {cipher} x {id-hash}, selected via the new
ENC_NAME / IDHASH_NAME class attributes.
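The three orthogonal options can be sketched with plain argparse. This is only an illustration under assumptions: borg's real parser setup is not shown in this message, and making --encryption required is a choice made for the sketch.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="repo-create-sketch")
    # Assumption for this sketch: the cipher must always be given explicitly.
    parser.add_argument(
        "--encryption",
        required=True,
        choices=["none", "authenticated", "aes256-ocb", "chacha20-poly1305"],
    )
    parser.add_argument("--id-hash", default="sha256", choices=["sha256", "blake3"])
    parser.add_argument("--key-location", default="repokey", choices=["repokey", "keyfile"])
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # "none" has no key, so only the sha256 id hash makes sense for it.
    if args.encryption == "none" and args.id_hash != "sha256":
        parser.error("--encryption none only supports --id-hash sha256")
    return args


args = parse(["--encryption", "chacha20-poly1305", "--id-hash", "blake3"])
```

Each option has its own small set of choices, so the suites are combined rather than enumerated as one name per combination.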
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the py38-modernize work, covering features available
since Python 3.7 (the minimum supported version is 3.11):
- replace collections.OrderedDict with plain dict where no OrderedDict-
  specific API is used: dict preserves insertion order since 3.7, so
  OrderedDict is only needed for move_to_end()/popitem(last=...):
  - archive.compare_archives: orphans_archive1/2 (annotation + value)
  - prune_cmd.PRUNING_PATTERNS (now a plain dict literal)
  - help_cmd.HelpMixIn.helptext
  - parseformat prepare_dump_dict.decode
  helpers.lrucache keeps OrderedDict (uses move_to_end/popitem(last)).
Most other Python 3.7 features are already in use (time_ns/monotonic_ns,
datetime.fromisoformat, namedtuple defaults), so this pass is small.
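A small example of the distinction, using made-up keys: a plain dict is enough when only insertion order matters, while the reordering calls exist only on OrderedDict.

```python
from collections import OrderedDict

# Plain dict: iteration follows insertion order (guaranteed since 3.7).
patterns = {}
patterns["weekly"] = 1
patterns["daily"] = 2
order = list(patterns)

# OrderedDict is still needed for reordering and for popping from the front.
lru = OrderedDict(a=1, b=2, c=3)
lru.move_to_end("a")               # "a" becomes the most recently used entry
oldest = lru.popitem(last=False)   # removes the entry at the front
```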
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the py39-modernize work, covering features available
since Python 3.8 (the minimum supported version is 3.11):
- use the walrus operator (PEP 572) for assignment expressions where
  it removes a separate assign-then-test line and reads more clearly:
  - version.parse_version: inline the re.match result into the test
  - nanorst.rst_to_text: while char := text.read(1)
  - cockpit.runner read_stream: while line := await stream.readline()
  - tar_cmds._import_tar: while tarinfo := tar.next()
Most other Python 3.8 features are already in use (shlex.join, bare
@lru_cache, typing.Protocol/Literal), so this pass is small.
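The two patterns look roughly like this. `read_chars` and `parse_major` are simplified, hypothetical functions, not the actual bodies of the functions touched here.

```python
import io
import re


def read_chars(text):
    """Read one character at a time until the stream is exhausted."""
    chars = []
    while char := text.read(1):  # an empty string ends the loop
        chars.append(char)
    return chars


def parse_major(version):
    """Return the leading major version number, or None if there is none."""
    if m := re.match(r"(\d+)\.", version):  # match result is bound inside the test
        return int(m.group(1))
    return None


chars = read_chars(io.StringIO("abc"))
```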
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the py310-modernize work, covering features available
since Python 3.9 (the minimum supported version is 3.11):
- functools.cache instead of lru_cache(maxsize=None) (posix_ug, windows_ug)
- str.removeprefix() instead of slicing by len() (serve_cmd)
- PEP 585 built-in generics (list/dict/tuple/set/type) in the .pyi stubs
- collections.abc instead of typing for Callable/Iterator/Mapping/...
in the .pyi stubs and runtime modules (chunkers, cockpit)
- PEP 604 union types (X | None) in crypto/low_level.pyi, which the
py310-modernize pass had missed
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Convert if/elif chains that dispatch on a single value to match
statements: repository.borg_permissions (permission preset),
CompressionSpec.__init__/compressor/__str__ (compression name),
calculate_relative_offset (relative ts unit) and the cockpit widget
file-status counter. Membership checks like 'in ("none", "lz4")'
become alternative patterns ('none' | 'lz4').
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the remaining typing.Optional[...] annotations with the
X | None syntax (PEP 604, Python 3.10) and drop the now-unused
Optional/List imports; List[str] -> list[str] in cockpit/runner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the hand-rolled changedir() context manager with stdlib
contextlib.chdir (kept under the changedir name for importers), and
use it in the archiver fixture so the cwd is restored even if the
test body raises.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Manifest.Operation now derives from enum.StrEnum, so its members are
real str instances; drop the .value indirection in the feature-flag
lookups.
Replace timezone.utc with the datetime.UTC alias (3.11) across the
non-test modules and drop the now-unused timezone imports.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
flush() now drops the entries that add() pre-marked when store.store()
raises, so a chunk that was never stored is not left in the index for
dedup. Also add rollback tests, bump the large-meta test to 5KB, and
drop the security wording from the read_data=False clamp comments.
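The rollback can be sketched like this. `PackWriter`, its attributes, and the `store` callable are hypothetical and much simpler than the real writer; the point is only that a failed store undoes the pre-marking.

```python
class PackWriter:
    """Hypothetical, simplified writer that batches chunks before storing."""

    def __init__(self, store):
        self.store = store   # callable that persists a batch and may raise
        self.index = {}      # chunk_id -> size, consulted for dedup
        self.pending = {}    # chunks added since the last flush

    def add(self, chunk_id, data):
        self.index[chunk_id] = len(data)  # pre-mark before the data is stored
        self.pending[chunk_id] = data

    def flush(self):
        pending, self.pending = self.pending, {}
        try:
            self.store(pending)
        except Exception:
            # Roll back: chunks that were never stored must not stay indexed.
            for chunk_id in pending:
                self.index.pop(chunk_id, None)
            raise


def failing_store(batch):
    raise OSError("backend unavailable")


writer = PackWriter(failing_store)
writer.add(b"chunk-1", b"payload")
try:
    writer.flush()
except OSError:
    pass
left_indexed = b"chunk-1" in writer.index
```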
Several check_cmd tests corrupted a repo object by overwriting a byte at
a fixed position with a fixed value, e.g.:
    manifest[:250] + b"x" + manifest[251:]
Manifests/chunks are stored as AEAD-encrypted repo objects, so their
bytes are ~random. When the target byte already happened to hold the
overwrite value (~1/256), the "corruption" was a no-op: the object
stayed valid, "check" returned 0 instead of 1, and the test failed
intermittently (observed in test_spoofed_archive).
Introduce a corrupt(data, position) helper that flips the byte (XOR
0xFF), so the result is guaranteed to differ, and use it in all the
byte-overwrite corruption sites: test_corrupted_manifest,
test_spoofed_manifest, test_manifest_rebuild_corrupted_chunk,
test_spoofed_archive, test_verify_data and test_corrupted_file_chunk.
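A minimal sketch of such a helper, following the description above; the actual helper in the test suite may differ in details.

```python
def corrupt(data: bytes, position: int) -> bytes:
    """Return a copy of data with the byte at position inverted (XOR 0xFF).

    Inverting every bit guarantees the new byte differs from the old one,
    unlike overwriting it with a fixed value.
    """
    flipped = data[position] ^ 0xFF
    return data[:position] + bytes([flipped]) + data[position + 1:]
```

Because b ^ 0xFF never equals b, the corruption cannot be a no-op, whatever the original byte was.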
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>