Commit graph

10 commits

Author SHA1 Message Date
Vault Automation
61b6ae14e5
[VAULT-40147] pipeline: add pipeline.hcl with changed_files (#12302) (#12408)
The `pipeline` utility started as collection of small CLI utilities that we found useful for the Vault CI/CD pipeline. Rather than engineering complex bash scripts in YAML blocks, instead, we could build small, reusable, testable actions and integrate the into a single binary. No more copying and pasting loads of bash from YAML, instead we can copy a single command and run the same thing locally that we can in CI.

As we've continued to invest in the utilities capability, it's become clear that other CI pipelines would benefit from the same functionality that we've been building. This change represents the first significant work to make the utility truly generic in a HashiCorp repo that utilizes CRT sense. Once all the Vault specifics have been extracted we hope to move the utility out of the repo and make it available everywhere.

The primary change here is to move our changed file grouping configuration out of the `changed` package entirely. Instead of checkers that are written as Go code, we have created a new configuration file for the `pipeline` utility called `pipeline.hcl` While there are certainly other things that will eventually be configurable here, the only thing we've added support for is `changed_files`, which allows configuring how to match a given changed files path to a group name.

The DSL is fairly simple:

```hcl
changed_files {
  // One or more groups can be defined
  group "group_name_label" {
    // Zero or more ignore blocks can be defined
    ignore {
      base_dir         = []
      base_name        = []
      base_name_prefix = []
      contains         = []
      extension        = []
      file             = []
    }

    // One or more match blocks can be defined
    match {
      base_dir         = []
      base_name        = []
      base_name_prefix = []
      contains         = []
      extension        = []
      file             = []
    }
  }
}
```

For example,
```hcl
// Create a changed_files block where we can define our changed files groups
changed_files {

  // Group blocks take one label which is the name of the group
  group "app" {

    // Groups can ignore based on some criteria.
    ignore {

      // In this instance, we'll ignore any file that begins with
      // tools/pipeline. All paths will be relative to the git repository
      // root directory. The joinpath() function is here to support paths
      // that are agnostic to the operating systems path separator. While
      // it's unlikely that you'll need them, several cty stdlib functions
      // are available.
      base_dir = [joinpath("tools", "pipeline")]
    }

    // Groups must define at least one match block.
    match {
      // This will match any file with the .go extension (except for
      // those that will be excluded with our ignore directive aboe
      extension = [".go"]
    }

    // Groups can contain more than one match block. If any of the match
    // blocks meet their criteria the group will be associated with the
    // changed file
    match {
      base_name = ["go.mod", "go.sum"]
    }

    // If groups have more than one attribute set, each attribute group
    // must match in order for the match.
    match {
      // Here we only match files that contain "raft_autopilot" in the
      // path with the .go extension
      extension = [".go"]
      contains  = ["raft_autopilot"]
    }
  }

  group "autopilot" {
    // Ignore blocks have the same attributes as match blocks
    match {
      // The base directory.
      base_dir = [
        "changelog",
        joinpath("tools", "codechecker"),
      ]
      // The base of the file
      base_name = ["README.md"]
      // A prefix string match on a files name.
      base_name_prefix = ["buf."]
      // Any string match in the files full path
      contains = [
        "-ce",
        "_ce",
        "-oss",
        "_oss",
      ]
      // The file's extension
      extension = [
        ".hcl",
        ".md",
        ".sh",
        ".yaml",
        ".yml",
      ]
      // An exact file match
      file = [
        # These exist on CE branches to please Github Actions.
        joinpath(".github", "workflows", "build-artifacts-ent.yml"),
        joinpath(".github", "workflows", "backport-automation-ent.yml"),
      ]
    }
  }
}
```

The default location of the config is `.release/pipeline.hcl`. All of our prior checks have been migrated to the DSL file present in this change.

  - We had several commands that used the changed files groups that were built into the library. This change requires us to instead load the configuration from the file and use the user defined groupings.

  - Several commands now take some part of that configuration in the request type. When possible we use the version parsed by the root command and verify in the request body rather than attempt to load the configuration.

  - We also refactor the loading and parsing of `.release/versions.hcl` in the same manner. Now we automatically parse the file in the default locations relative to the git repo root.

  - Our root command now has two new flags `--pipeline-config` and `--versions-config` which allow specifying a default location for each file. Commands which previously accepted flags or args to configure the versions file have been updated to use the global root flags instead. We've also removed the previous implementation that would recursively search backwards from the working directory to find the `versions.hcl` file. Instead we only support loading the file from the default location relative to the Git repo root.

  - All instances of changed `pipeline` command invocations have been update to support the new auto-loading of configuration.

  - A new configuration sub-command with validation exists to quickly validate a configuration file. `pipeline config validate`

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2026-02-23 10:51:31 -08:00
Vault Automation
34b5b5b2ff
[VAULT-39994] pipeline(changed-files): add support for listing and checking changed files (#12127) (#12215)
We've already deployed some changed file detection in the CI pipeline. It uses the Github API to fetch a list of all changed files on a PR and then run it through a simple groups categorization pass. It's been a useful strategy in the context of a Pull Request because it does not depend on the local state of the Git repo.

This commit introduces a local git-based file change detection and validation system for the pipeline tool, enabling developers to identify and validate changed files before pushing code. We intend to use the new tool in two primary ways:
  - As a Git pre-push hook when pushing new or updated branches. (Implemented here)
  - As part of the scheduled automated repository synchronization. (Up next, and it will use the same `git.CheckChangedFilesReq{}` implementation.

This will allow us to guard all pushes to `hashicorp/vault` and `ce/*` branches in `hashicorp/vault-enterprise`, whether run locally on a developer machine or in CI by our service user.

We introduce two new `pipeline` CLI commands:
  - `pipeline git list changed-files`
  - `pipeline git check changed-files`

Both support specifying what method of git inspection we want to use for the changed files list:
  - **`--branch <branch>`**: Lists all files added in the entire history of a specific branch. We use this when pushing a _new_ branch.
  - **`--range <range>`**: Lists all changed files within a commit range (e.g., `HEAD~5..HEAD`). We use this when updating an existing branch.
  - **`--commit <sha>`**: Lists all changed files in a specific commit (using `git show`). This isn't actually used at all in the pre-push hook but it useful if you wish to inspect a single commit on your branch.

The behavior when passing the `range` and `commit` is similar. We inspect the changed file list either for one or many commits (but with slightly different implementations for efficiency and accuracy.  The `branch` option is a bit different. We use it to inspect the branches entire history of changed files for enterprise files before pushing a new branch. We do this to ensure that our branch doesn't accidentally add and then subsequently remove enterprise files, leaving the contents in the history but nothing obvious in the diff.

Each command supports several different output formats. The default is the human readable text table, though `--format json` will write all of the details as valid JSON to STDOUT. When given the `--github-output` command each will write a more concise version of the JSON output to `$GITHUB_OUTPUT`. It differs from our standard JSON output as it has been formatted to be easier to use in Github Actions contexts without requiring complex filtering.

When run, changed files are automatically categorized into logical groups based on their file name, just like our existing changed file detection. A follow-up to this PR will introduce a configuration based system for classifying file groups. This will allow us to create generic support for changed file detection so that many repositories can adopt this pattern. 

The major difference in behavior between the two new commands is that the `list` command will always list the changed files for the given method/target, while the `check` command requires one-or-more changed file groups that we want to disallow to be included via the `-g` flag. If any changed files match the given group(s) then the command will fail. That allows us to specify the `enterprise` group and disallow the command to succeed if any of the changed files match the group.

The pre-push git hook now uses this system to prevent accidental pushes, however, it requires the local machine to have the `pipeline` tool in the `$PATH`. This ought not be much of a requirement as a working Go toolchain is required for any Vault developer. When it is not present we explain in our error messages how to resolve the problem and direct them to our slack channel if they need further assistance.

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2026-02-05 22:37:08 +00:00
Vault Automation
0a163f449e
[VAULT-40165] pipeline(github): add check go-mod-diff command (#10369) (#10377)
* [VAULT-40165] pipeline(github): add `check go-mod-diff` command

Add `pipeline github check go-mod-diff` command that is capable of
creating a Go module diff between one-or-more go.mod files in two
different Github branches. There are flags for the owner, repo, and
branch for both the A and B sides of the diff, as well as the `--path`
or `-p` flag that can be specified any number of times with relative
paths in the repository of go.mod files to compare. We assume that the
path is the same in both repositories.

This work will be followed up with another PR that removes the
enterprise only go.mod file and enables Go module diff checking on pull
requests to CE branches that change the go toolchain.

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-10-27 20:03:36 +00:00
Vault Automation
0c6c13dd38
license: update headers to IBM Corp. (#10229) (#10233)
* license: update headers to IBM Corp.
* `make proto`
* update offset because source file changed

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-10-21 15:20:20 -06:00
Vault Automation
0154b89fd3
pipeline(copy-pr): skip merge commits when copying PR (#9729) (#9735)
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-09-29 16:27:26 -06:00
Vault Automation
e1e1518141
[VAULT-39153] pipeline(backport): remove docs and pipeline from allowed ce inactive (#8819) (#8842)
Docs have been moved since the tool was written so that exclusion is no
longer needed. Since the defaults were added the `pipeline` group has
expanded to include all `.github`, which we don't want to always
backport. It seems unlike that `pipeline` tooling changes are likely to
be required often on inactive branches so we'll exclude all together for
now.

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-08-22 16:19:35 +00:00
Ryan Cragun
3611b8b709
[VAULT-32028] pipeline(github): add sync branches sub-command (#31252)
Add a new `pipeline github sync branches` command that can synchronize
two branches. We'll use this to synchronize the
`hashicorp/vault-enterprise/ce/*` branches with `hashicorp/vault/*`.

As the community repository is effectively a mirror of what is hosted in
Enterprise, a scheduled sync cadence is probably fine. Eventually we'll
hook the workflow and sync into the release pipeline to ensure that
`hashicorp/vault` branches are up-to-date when cutting community
releases.

As part of this I also fixed a few static analysis issues that popped up
when running `golangci-lint` and fixed a few smaller bugs.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-07-11 14:08:14 -06:00
Ryan Cragun
d595a95c01
[VAULT-37096] pipeline(github): add github copy pr command (#31095)
After the merge workflow has been reversed and branches hosted in
`hashicorp/vault` are downstream from community branches hosted in
`hashicorp/vault-enterprise`, most contributions to the source code
will originate in `hashicorp/vault-enterprise` and be backported to
a community branch in hosted in `hashicorp/vault-enterprise`. These
community branches will be considered the primary source of truth and
we'll automatically push changes from them to mirrors hosted in
`hashicorp/vault`.

This workflow ought to yield a massive efficiency boost for HashiCorp
contributors with access to `hashicorp/vault-enterprise`. Before the
workflow would look something like:
  - Develop a change in vault-enterprise
  - Manually extract relevant changes from your vault-enterprise branch
    into a new community vault branch.
  - Add any stubs that might be required so as to support any enterprise
    only changes.
  - Get the community change reviewed. If changes are necessary it often
    means changing and testing them on both the enteprise and community
    branches.
  - Merge the community change
  - Wait for it to sync to enterprise
  - *Hope you changes have not broken the build*. If they have, fix the
    build.
  - Update your enterprise branch
  - Get the enterprise branch reviewed again
  - Merge enterprise change
  - Deal with complicated backports.

After the workflow will look like:
  - Develop the change on enterprise
  - Get the change reviewed
  - Address feedback and test on the same branch
  - Merge the change
  - Automation will extract community changes and create a community
    backport PR for you depending on changes files and branch
    activeness.
  - Automation will create any enterprise backports for you.
  - Fix any backport as necessary
  - Merge the changes
  - The pipeline will automatically push the changes to the community
    branch mirror hosted in hashicorp/vault.

No more
 - Duplicative reviews
 - Risky merges
 - Waiting for changes to sync from community to enterprise
 - Manual decompistion of changes from enterprise and community
 - *Doing the same PR 3 times*
 - Dealing with a different backport process depending on which branches
   are active or not.

These changes do come at cost however. Since they always originate from
`vault-enterprise` only HashiCorp employees can take advatange of the
workflow. We need to be able to support community contributions that
originate from the mirrors but retain attribution.

That's what this PR is designed to do. The community will be able to
open a pull request as normal and have it reviewed as such, but rather
than merging it into the mirror we'll instead copy the PR and open it
against the corresponding enterprise base branch and have it merged it
from there. The change will automatically get backported to the
community branch if necessary, which eventually makes it back to the
mirror in hashicorp/vault.

To handle our squash merge workflow while retaining the correct
attribution, we'll automatically create merge commits in the copied PR
that include `Co-Authored-By:` trailers for all commit authors on the
original PR.

We also take care to ensure that the HashiCorp maitainers that approve
the PR and/or are assigned to it are also assigned to the copied PR.

This change is only the tooling to enable it. The workflow that drives
it will be implemented in VAULT-34827.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-06-25 15:20:57 -06:00
Ryan Cragun
025a6d5071
[VAULT-34829] pipeline(backport): add github create backport command (#30713)
Add a new `github create backport` sub-command that can create a
backport of a given pull request. The command has been designed around a
Github Actions workflow where it is triggered on a closed pull request
event with a guard that checks for merges:

```yaml
pull_request_target:
  types: closed

jobs:
  backport:
    if: github.even.pull_request.merged
    runs-on: "..."
```

Eventually this sub-command (or another similar one) can be used to
implemente backporting a CE pull request to the corresponding ce/*
branch in vault-enterprise. This functionality will be implemented in
VAULT-34827.

This backport runner has several new behaviors not present in the
existing backport assistant:
  - If the source PR was made against an enterprise branch we'll assume
    that we want create a CE backport.
  - Enterprise only files will be automatically _removed_ from the CE
    backport for you. This will not guarantee a working CE pull request
    but does quite a bit of the heavy lifting for you.
  - If the change only contains enterprise files we'll skip creating a
    CE backport.
  - If the corresponding CE branch is inactive (as defined in
    .release/versions.hcl) then we will skip creating a backport in most
    cases. The exceptions are changes that include docs, README, or
    pipeline changes as we assume that even active branches will want
    those changes.
  - Backport labels still work but _only_ to enterprise PR's. It is
    assumed that when the subsequent PRs are merged that their
    corresponding CE backports will be created.
  - Backport labels no longer include editions. They will now use the
    same schema as active versions defined .release/verions.hcl. E.g.
    `backport/1.19.x`. `main` is always assumed to be active.
  - The runner will always try and update the source PR with a Github
    comment regarding the status of each individual backport. Even if
    one attempt at backporting fails we'll continue until we've
    attempted all backports.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-05-23 14:02:24 -06:00
Ryan Cragun
cb321ce774
[VAULT-36201] pipeline(git): add Go git client (#30645)
As I was working on VAULT-34829 it became clear that we needed to solve
the problem of using Git from Go. Initially I tried to use the
go-git/go-git pure Go implementation of Git, but it lacked several
features that we needed.

The next best option seemed to be shelling out to Git. What started as a
simple way to execute Git commands with the requisite environment and
configuration led to the implementation.

A few notes:
- I did not add support for all flags for the implemented sub-commands
- I wanted it to handle automatic inject and configuration of PATs when
  operating against private remote repositories in Github.
- I did not want to rely on the local machine having been configured.
  The client that is built in is capable of configuring everything that
  we need via environment variables.
- I chose to go the environment variable route for configuration as it's
  the only way to not overwrite local configuration or set up our own
  cred store.

As the git client ended up being ~50% of the work for VAULT-34829, I
decided to extract it out into its own PR to reduce the review burden.

NOTE: While we don't use it in CI, there is .golanci.yml present in the
 the tree and it causes problems in editors that expect it to be V2. As
 such I migrated it.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-05-16 12:41:31 -06:00