Updating RootPKIPath but not IntermediatePKIPath would not update
leaf signing certs with the new root. Unsure if this happens in practice
but manual testing showed it is a bug that would break mesh and agent
connections once the old root is pruned.
* Rename Intermediate cert references to LeafSigningCert
Within the Consul CA subsystem, the term "Intermediate"
is confusing because the meaning changes depending on
provider and datacenter (primary vs secondary). For
example, when using the Consul CA the "ActiveIntermediate"
may return the root certificate in a primary datacenter.
At a high level, we are interested in knowing which
CA is responsible for signing leaf certs, regardless of
its position in a certificate chain. This rename makes
the intent clearer.
* Move provider state check earlier
* Remove calls to GenerateLeafSigningCert
GenerateLeafSigningCert (formerly known
as GenerateIntermediate) is vestigial in
non-Vault providers, as it simply returns
the root certificate in primary
datacenters.
By folding Vault's intermediate cert logic
into `GenerateRoot` we can encapsulate
the intermediate cert handling within
`newCARoot`.
* Move GenerateLeafSigningCert out of PrimaryProvidder
Now that the Vault Provider calls
GenerateLeafSigningCert within
GenerateRoot, we can remove the method
from all other providers that never
used it in a meaningful way.
* Add test for IntermediatePEM
* Rename GenerateRoot to GenerateCAChain
"Root" was being overloaded in the Consul CA
context, as different providers and configs
resulted in a single root certificate or
a chain originating from an external trusted
CA. Since the Vault provider also generates
intermediates, it seems more accurate to
call this a CAChain.
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.
This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.
Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.
Fixes: https://github.com/hashicorp/consul/issues/15753
The fix outlined and merged in #15253 fixed the issue as it occurs in the primary
DC. There is a similar issue that arises when vault is used as the Connect CA in a
secondary datacenter that is fixed by this PR.
Additionally: this PR adds support to run the existing suite of vault related integration
tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)
After fixing that bug I uncovered a couple more:
Fix an issue where we might try to cross sign a cert when we never had a valid root.
Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.
* Change CA Configure struct to pass Datacenter through
* Remove connect/ca/plugin as we don't have immediate plans to use it.
We still intend to one day but there are likely to be several changes to the CA provider interface before we do so it's better to rebuild from history when we do that work properly.
* Rename PrimaryDC; fix endpoint in secondary DCs
* Support Connect CAs that can't cross sign
* revert spurios mod changes from make tools
* Add log warning when forcing CA rotation
* Fixup SupportsCrossSigning to report errors and work with Plugin interface (fixes tests)
* Fix failing snake_case test
* Remove misleading comment
* Revert "Remove misleading comment"
This reverts commit bc4db9cabed8ad5d0e39b30e1fe79196d248349c.
* Remove misleading comment
* Regen proto files messed up by rebase
* pass logger through to provider
* test for proper operation of NeedsLogger
* remove public testServer function
* Ooops actually set the logger in all the places we need it - CA config set wasn't and causing segfault
* Fix all the other places in tests where we set the logger
* Allow CA Providers to persist some state
* Update CA provider plugin interface
* Fix plugin stubs to match provider changes
* Update agent/connect/ca/provider.go
Co-Authored-By: R.B. Boyer <rb@hashicorp.com>
* Cleanup review comments
This adds the `agent/connect/ca/plugin` library for consuming/serving Connect CA providers as [go-plugin](https://github.com/hashicorp/go-plugin) plugins. This **does not** wire this up in any way to Consul itself, so this will not enable using these plugins yet.
## Why?
We want to enable CA providers to be pluggable without modifying Consul so that any CA or PKI system can potentially back the Connect certificates. This CA system may also be used in the future for easier bootstrapping and internal cluster security.
### go-plugin
The benefit of `go-plugin` is that for the plugin consumer, the fact that the interface implementation is communicating over multi-process RPC is invisible. Internals of Consul will continue to just use `ca.Provider` interface implementations as if they're local. For plugin _authors_, they simply have to implement the interface. The network/transport/process management issues are handled by go-plugin itself.
The CA provider plugins support both `net/rpc` and gRPC transports. This enables easy authoring in any language. go-plugin handles the actual protocol handshake and connection. This is just a feature of go-plugin.
`go-plugin` is already in production use for years by Packer, Terraform, Nomad, Vault, and Sentinel. We've shown stability for both desktop and server-side software. It is very mature.
## Implementation Details
### `map[string]interface{}`
The `Configure` method passes a `map[string]interface{}`. This map contains only Go primitives and containers of primitives (no funcs, chans, etc.). For `net/rpc` we encode as-is using Gob. For gRPC we marshal to JSON and transmit as a `bytes` type. This is the same approach we take with Vault and other software.
Note that this is just the transport protocol, the end software views it fully decoded.
### `x509.Certificate` and `CertificateRequest`
We transmit the raw ASN.1 bytes and decode on the other side. Unit tests are verifying we get the same cert/csrs across the wire.
### Testing
`go-plugin` exposes test helpers that enable testing the full plugin RPC over real loopback network connections. We test all endpoints for success and error for both `net/rpc` and gRPC.
### Vendoring
This PR doesn't introduce vendoring for two reasons:
1. @banks's `f-envoy` branch introduces a lot of these and I didn't want conflict.
2. The library isn't actually used yet so it doesn't introduce compile-time errors (it does introduce test errors).
## Next Steps
With this in place, we need to figure out the proper way to actually hook these up to Consul, load them, etc. This discussion can happen elsewhere, since regardless of approach this plugin library implementation is the exact same.