package ca

import (
	"crypto/x509"
	"errors"

	"github.com/hashicorp/go-hclog"
)

//go:generate mockery -name Provider -inpkg

// ErrRateLimited is a sentinel error value Providers may return from any method
// to indicate that the operation can't complete due to a temporary rate limit.
// In the case of signing new certificates, Consul clients will respect this and
// intelligently back off to optimize rotation rollout time while reducing load
// on servers and CA provider.
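//
// A minimal sketch of how an in-process caller might react, assuming a
// Provider value p and a parsed request csr (both names are hypothetical):
//
//	if _, err := p.Sign(csr); errors.Is(err, ErrRateLimited) {
//		// Temporary condition: back off and retry later.
//	}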
var ErrRateLimited = errors.New("operation rate limited by CA provider")

// PrimaryIntermediateProviders is a list of CA providers that make use of an
// intermediate cert in the primary datacenter as well as the secondary. This is used
// when determining whether to run the intermediate renewal routine in the primary.
var PrimaryIntermediateProviders = map[string]struct{}{
	"vault": {},
}

// ProviderConfig encapsulates all the data Consul passes to `Configure` on a
// new provider instance. The provider must treat this as read-only and make
// copies of any map or slice if it might modify them internally.
type ProviderConfig struct {
	// ClusterID is the current Consul cluster ID.
	ClusterID string

	// Datacenter is the current Consul datacenter.
	Datacenter string

	// IsPrimary is true when the CA instance is in the primary DC. In that case
	// it may choose to act as a root, while secondaries are typically
	// intermediate CAs. In some cases the primary DC in Consul is itself an
	// intermediate signed by some external CA, configured along with that CA's
	// public cert, so the old name of `IsRoot` was misleading.
	IsPrimary bool

	// RawConfig is the user configuration for the provider and is
	// provider-specific to be interpreted as the provider wishes.
	RawConfig map[string]interface{}

	// State contains the State the same provider last persisted. It is provided
	// after a restart or reconfiguration, or on a leader election on a new server
	// to maintain operation. It MUST NOT be used for secret storage since it is
	// visible in the API to operators. Its intended use is to store small bits
	// of state like UUIDs of external resources that the provider has created and
	// needs to continue to manage.
	State map[string]string
}

// Provider is the interface for Consul to interact with
// an external CA that provides leaf certificate signing for
// given SpiffeIDServices.
type Provider interface {
	// Configure initializes the provider based on the given cluster ID, root
	// status and configuration values. cfg.RawConfig contains the user-provided
	// config. cfg.State contains the state the same provider last persisted
	// before a restart or reconfiguration. The provider must not modify the
	// RawConfig or State maps directly as they may be read from other
	// goroutines.
	Configure(cfg ProviderConfig) error

	// State returns the current provider state. If the provider doesn't need to
	// store anything other than what the user configured this can return nil. It
	// is called after any config change before the new active config is stored in
	// the state store and the most recent value returned by the provider is given
	// in subsequent `Configure` calls provided that the current provider is the
	// same type as the new provider instance being configured. This provides a
	// simple way for providers to persist information like UUIDs of resources
	// they manage. This state is visible to anyone with operator:read via the API
	// so it's not intended for storing secrets like root private keys. Only
	// strings are permitted since this has to pass through msgpack, where
	// interface values end up mangled in many cases, and it would be ugly for
	// all provider code to have to remember to reason about that.
	//
	// Note that the map returned will be accessed (read-only) in other goroutines
	// - for example passed to Configure in the Connect CA Config RPC endpoint -
	// so it must not just be a pointer to a map that may internally be modified.
	// If the Provider only writes to it during Configure it's safe to return
	// as-is, but otherwise it's assumed the map returned is a copy of the state
	// in the Provider struct so it won't change after being returned.
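	//
	// A minimal sketch of returning such a copy, assuming a hypothetical
	// provider type exampleProvider that guards its internal state map with a
	// sync.Mutex field named mu (neither exists in this package):
	//
	//	func (p *exampleProvider) State() (map[string]string, error) {
	//		p.mu.Lock()
	//		defer p.mu.Unlock()
	//		out := make(map[string]string, len(p.state))
	//		for k, v := range p.state {
	//			out[k] = v
	//		}
	//		return out, nil
	//	}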
	State() (map[string]string, error)

	// GenerateRoot causes the creation of a new root certificate for this provider.
	// This can also be a no-op if a root certificate already exists for the given
	// config. If IsPrimary is false, calling this method is an error.
	GenerateRoot() error

	// ActiveRoot returns the currently active root CA for this
	// provider. This should be a parent of the certificate returned by
	// ActiveIntermediate().
	ActiveRoot() (string, error)

	// GenerateIntermediateCSR generates a CSR for an intermediate CA
	// certificate, to be signed by the root of another datacenter. If IsPrimary was
	// set to true with Configure(), calling this is an error.
	GenerateIntermediateCSR() (string, error)

	// SetIntermediate sets the provider to use the given intermediate certificate
	// as well as the root it was signed by. This completes the initialization for
	// a provider where IsPrimary was set to false in Configure().
	SetIntermediate(intermediatePEM, rootPEM string) error

	// ActiveIntermediate returns the current signing cert used by this provider
	// for generating SPIFFE leaf certs. Note that this must not change except
	// when Consul requests the change via GenerateIntermediate. Changing the
	// signing cert will break Consul's assumptions about which validation paths
	// are active.
	ActiveIntermediate() (string, error)

	// GenerateIntermediate returns a new intermediate signing cert and sets it to
	// the active intermediate. If multiple intermediates are needed to complete
	// the chain from the signing certificate back to the active root, they should
	// all be bundled here.
	GenerateIntermediate() (string, error)

	// Sign signs a leaf certificate used by Connect proxies from a CSR. The PEM
	// returned should include only the leaf certificate as all intermediates
	// needed to validate it will be added by Consul based on the active
	// intermediate and any cross-signed intermediates managed by Consul. Note that
	// providers should return ErrRateLimited if they are unable to complete the
	// operation due to upstream rate limiting so that clients can intelligently
	// back off.
	Sign(*x509.CertificateRequest) (string, error)

	// SignIntermediate will validate the CSR to ensure the trust domain in the
	// URI SAN matches the local one and that basic constraints for a CA
	// certificate are met. It should return a signed CA certificate with a path
	// length constraint of 0 to ensure that the certificate cannot be used to
	// generate further CA certs. Note that providers should return ErrRateLimited
	// if they are unable to complete the operation due to upstream rate limiting
	// so that clients can intelligently back off.
	SignIntermediate(*x509.CertificateRequest) (string, error)

	// CrossSignCA must accept a CA certificate from another CA provider and cross
	// sign it exactly as it is such that it forms a chain back to the
	// CAProvider's current root. Specifically, the Distinguished Name, Subject
	// Alternative Name, SubjectKeyID and other relevant extensions must be kept.
	// The resulting certificate must have a distinct Serial Number and the
	// AuthorityKeyID set to the CAProvider's current signing key as well as the
	// Issuer related fields changed as necessary. The resulting certificate is
	// returned as a PEM formatted string.
	//
	// If the CA provider does not support this operation, it may return an error
	// provided `SupportsCrossSigning` also returns false. Note that
	// providers should return ErrRateLimited if they are unable to complete the
	// operation due to upstream rate limiting so that clients can intelligently
	// back off.
	CrossSignCA(*x509.Certificate) (string, error)

	// SupportsCrossSigning should indicate whether the CA provider supports
	// cross-signing an external root to provide a seamless rotation. If the CA
	// does not support this, the user will have to force an upgrade when that CA
	// provider is the current CA as the upgrade may cause interruptions to
	// connectivity during the rollout.
	SupportsCrossSigning() (bool, error)

	// Cleanup performs any necessary cleanup that should happen when the provider
	// is shut down permanently, such as removing a temporary PKI backend in Vault
	// created for an intermediate CA.
	Cleanup() error
}

// NeedsLogger is an optional interface that allows a CA provider to use the
// Consul logger to output diagnostic messages.
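//
// A minimal sketch of how a caller might detect the optional interface,
// assuming a Provider value p and an hclog.Logger named logger (both names
// are hypothetical):
//
//	if l, ok := p.(NeedsLogger); ok {
//		l.SetLogger(logger)
//	}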
type NeedsLogger interface {
	// SetLogger will pass a configured Logger to the provider.
	SetLogger(logger hclog.Logger)
}

// NeedsStop is an optional interface that allows a CA to define a function
// to be called when the CA instance is no longer in use. This is different
// from Cleanup(), as only the local provider instance is being shut down,
// such as in the case of a leader change.
type NeedsStop interface {
	Stop()
}