Commit Graph

249 Commits

Author SHA1 Message Date
R.B. Boyer 6adad71125
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Kim Ngo a8f4123d37
agent/txn_endpoint: configure max txn request length (#7388)
configure max transaction size separately from kv limit
2020-03-05 15:42:37 -06:00
Hans Hasselberg e05ac57e8f
tls: support tls 1.3 (#7325) 2020-02-19 23:22:31 +01:00
Hans Hasselberg 315d57bfb1
agent: sensible keyring error (#7272)
Fixes #7231. Before an agent would always emit a warning when there is
an encrypt key in the configuration and an existing keyring stored,
which is happening on restart.

Now it only emits that warning when the encrypt key from the
configuration is not part of the keyring.
2020-02-13 20:35:09 +01:00
Hans Hasselberg cb0f94487c
config: increase http_max_conns_per_client default to 200 (#7289) 2020-02-13 16:27:33 +01:00
Akshay Ganeshen 8beb716414
feat: support sending body in HTTP checks (#6602) 2020-02-10 09:27:12 -07:00
Matt Keeler 9e5fd7f925
OSS Changes for various config entry namespacing bugs (#7226) 2020-02-06 10:52:25 -05:00
Freddy cb77fc6d01
Add managed service provider token (#7218)
Stubs for enterprise-only ACL token to be used by managed service providers.
2020-02-04 13:58:56 -07:00
Hans Hasselberg 5531678e9e
Security fixes (#7182)
* Mitigate HTTP/RPC Services Allow Unbounded Resource Usage

Fixes #7159.

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2020-01-31 11:19:37 -05:00
Chris Piraino 401221de58
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
Anthony Scalisi beb928f8de fix spelling errors (#7135) 2020-01-27 07:00:33 -06:00
R.B. Boyer 0f44bcd3d8
agent: default the primary_datacenter to the datacenter if not configured (#7111)
Something similar already happens inside of the server
(agent/consul/server.go) but by doing it in the general config parsing
for the agent we can have agent-level code rely on the PrimaryDatacenter
field, too.
2020-01-23 09:59:31 -06:00
Hans Hasselberg 11a571de95
agent: setup grpc server with auto_encrypt certs and add -https-port (#7086)
* setup grpc server with TLS config used across consul.
* add -https-port flag
2020-01-22 11:32:17 +01:00
Hans Hasselberg 804eb17094
connect: check if intermediate cert needs to be renewed. (#6835)
Currently when using the built-in CA provider for Connect, root certificates are valid for 10 years, however secondary DCs get intermediates that are valid for only 1 year. There is no mechanism currently short of rotating the root in the primary that will cause the secondary DCs to renew their intermediates.
This PR adds a check that renews the cert if it is half way through its validity period.

In order to be able to test these changes, a new configuration option was added: IntermediateCertTTL which is set extremely low in the tests.
2020-01-17 23:27:13 +01:00
Hans Hasselberg 87f32c8ba6
auto_encrypt: set dns and ip san for k8s and provide configuration (#6944)
* Add CreateCSRWithSAN
* Use CreateCSRWithSAN in auto_encrypt and cache
* Copy DNSNames and IPAddresses to cert
* Verify auto_encrypt.sign returns cert with SAN
* provide configuration options for auto_encrypt dnssan and ipsan
* rename CreateCSRWithSAN to CreateCSR
2020-01-17 23:25:26 +01:00
Aestek ba8fd8296f Add support for dual stack IPv4/IPv6 network (#6640)
* Use consts for well known tagged adress keys

* Add ipv4 and ipv6 tagged addresses for node lan and wan

* Add ipv4 and ipv6 tagged addresses for service lan and wan

* Use IPv4 and IPv6 address in DNS
2020-01-17 09:54:17 -05:00
Matej Urbas ce023359fe agent: configurable MaxQueryTime and DefaultQueryTime. (#3777) 2020-01-17 14:20:57 +01:00
Matt Keeler 3faee222f2
OSS changes to allow for parsing the enterprise DNS config prop… (#6959) 2019-12-18 10:16:35 -05:00
Matt Keeler 5934f803bf
Sync of OSS changes to support namespaces (#6909) 2019-12-09 21:26:41 -05:00
Hans Hasselberg 9ff69194a2
tls: auto_encrypt and verify_incoming (#6811) (#6899)
* relax requirements for auto_encrypt on server
* better error message when auto_encrypt and verify_incoming on
* docs: explain verify_incoming on Consul clients.
2019-12-06 21:36:13 +01:00
Paul Banks cd1b613352
connect: Add AWS PCA provider (#6795)
* Update AWS SDK to use PCA features.

* Add AWS PCA provider

* Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user

* Unparallel the tests so we don't exhaust PCA limits

* Merge updates

* More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create

* Add AWS PCA docs

* Fix Vault doc typo too

* Doc typo

* Apply suggestions from code review

Co-Authored-By: R.B. Boyer <rb@hashicorp.com>
Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Doc fixes; tests for erroring if State is modified via API

* More review cleanup

* Uncomment tests!

* Minor suggested clean ups
2019-11-21 17:40:29 +00:00
Sarah Christoff 5e1c6e907b
Set MinQuorum variable in Autopilot (#6654)
* Add MinQuorum to Autopilot
2019-10-29 09:04:41 -05:00
rerorero 86c8e48dd9 fix: incorrect struct tag and WaitGroup usage (#6649)
* remove duplicated json tag

* fix: incorrect wait group usage
2019-10-18 13:59:29 -04:00
PHBourquin 039615641e Checks to passing/critical only after reaching a consecutive success/failure threshold (#5739)
A check may be set to become passing/critical only if a specified number of successive
checks return passing/critical in a row. Status will stay identical as before until
the threshold is reached.
This feature is available for HTTP, TCP, gRPC, Docker & Monitor checks.
2019-10-14 21:49:49 +01:00
Sarah Christoff 194f5740ce
ui_content_path config option fix (#6601)
* fix ui-content-path config option
2019-10-09 09:14:48 -05:00
Freddy fdd10dd8b8
Expose HTTP-based paths through Connect proxy (#6446)
Fixes: #5396

This PR adds a proxy configuration stanza called expose. These flags register
listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only
listening on the loopback interface, while still accepting traffic from non
Connect-enabled services.

Under expose there is a boolean checks flag that would automatically expose all
registered HTTP and gRPC check paths.

This stanza also accepts a paths list to expose individual paths. The primary
use case for this functionality would be to expose paths for third parties like
Prometheus or the kubelet.

Listeners for requests to exposed paths are be configured dynamically at run
time. Any time a proxy, or check can be registered, a listener can also be
created.

In this initial implementation requests to these paths are not
authenticated/encrypted.
2019-09-25 20:55:52 -06:00
Hans Hasselberg faa54ab989
auto_encrypt: verify_incoming_rpc is good enough for auto_encrypt.allow_tls (#6376)
Previously `verify_incoming` was required when turning on `auto_encrypt.allow_tls`, but that doesn't work together with HTTPS UI in some scenarios. Adding `verify_incoming_rpc` to the allowed configurations.
2019-08-27 14:36:36 +02:00
R.B. Boyer ae79cdab1b
connect: introduce ExternalSNI field on service-defaults (#6324)
Compiling this will set an optional SNI field on each DiscoveryTarget.
When set this value should be used for TLS connections to the instances
of the target. If not set the default should be used.

Setting ExternalSNI will disable mesh gateway use for that target. It also 
disables several service-resolver features that do not make sense for an 
external service.
2019-08-19 12:19:44 -05:00
Mike Morris 65be58703c
connect: remove managed proxies (#6220)
* connect: remove managed proxies implementation and all supporting config options and structs

* connect: remove deprecated ProxyDestination

* command: remove CONNECT_PROXY_TOKEN env var

* agent: remove entire proxyprocess proxy manager

* test: remove all managed proxy tests

* test: remove irrelevant managed proxy note from TestService_ServerTLSConfig

* test: update ContentHash to reflect managed proxy removal

* test: remove deprecated ProxyDestination test

* telemetry: remove managed proxy note

* http: remove /v1/agent/connect/proxy endpoint

* ci: remove deprecated test exclusion

* website: update managed proxies deprecation page to note removal

* website: remove managed proxy configuration API docs

* website: remove managed proxy note from built-in proxy config

* website: add note on removing proxy subdirectory of data_dir
2019-08-09 15:19:30 -04:00
Paul Banks e87cef2bb8 Revert "connect: support AWS PCA as a CA provider" (#6251)
This reverts commit 3497b7c00d.
2019-07-31 09:08:10 -04:00
Todd Radel 3497b7c00d
connect: support AWS PCA as a CA provider (#6189)
Port AWS PCA provider from consul-ent
2019-07-30 22:57:51 -04:00
Todd Radel 2552f4a11a
connect: Support RSA keys in addition to ECDSA (#6055)
Support RSA keys in addition to ECDSA
2019-07-30 17:47:39 -04:00
freddygv 1a14b94441 Update default gossip encryption key size to 32 bytes 2019-07-30 09:45:41 -06:00
R.B. Boyer c6c4a2251a Merge Consul OSS branch master at commit b3541c4f34 2019-07-26 10:34:24 -05:00
Jack Pearkes 4e0a16ab2d
config: correct limit to limits in config example (#6219)
This isn't yet documented on the website, but wanted to update this to add the missing s.
2019-07-25 12:38:57 -07:00
Jeff Mitchell 94c73d0c92 Chunking support (#6172)
* Initial chunk support

This uses the go-raft-middleware library to allow for chunked commits to the KV
2019-07-24 17:06:39 -04:00
Matt Keeler 3053342198
Envoy Mesh Gateway integration tests (#6187)
* Allow setting the mesh gateway mode for an upstream in config files

* Add envoy integration test for mesh gateways

This necessitated many supporting changes in most of the other test cases.

Add remote mode mesh gateways integration test
2019-07-24 17:01:42 -04:00
R.B. Boyer ad9e7b6ae9
connect: allow L7 routers to match on http methods (#6164)
Fixes #6158
2019-07-23 20:56:39 -05:00
R.B. Boyer 85cf2706e6
connect: change router syntax for matching query parameters to resemble the syntax for matching paths and headers for consistency. (#6163)
This is a breaking change, but only in the context of the beta series.
2019-07-23 20:55:26 -05:00
R.B. Boyer 1dbd92e091
connect: validate and test more of the L7 config entries (#6156) 2019-07-23 20:50:23 -05:00
Alvin Huang ef6b80bab2 resolve circleci config conflicts 2019-07-23 20:18:36 -04:00
Pierre Souchay b4590fb8e8 Display nicely Networks (CIDR) in runtime configuration (#6029)
* Display nicely Networks (CIDR) in runtime configuration

CIDR mask is displayed in binary in configuration.
This add support for nicely displaying CIDR in runtime configuration.

Currently, if a configuration contains the following lines:

  "http_config": {
    "allow_write_http_from": [
      "127.0.0.0/8",
      "::1/128"
    ]
  }

A call to `/v1/agent/self?pretty` would display

  "AllowWriteHTTPFrom": [
            {
                "IP": "127.0.0.0",
                "Mask": "/wAAAA=="
            },
            {
                "IP": "::1",
                "Mask": "/////////////////////w=="
            }
  ]

This PR fixes it and it will now display:

   "AllowWriteHTTPFrom": [ "127.0.0.0/8", "::1/128" ]

* Added test for cidr nice rendering in `TestSanitize()`.
2019-07-23 16:30:16 -04:00
Paul Banks f38da47c55
Allow raft TrailingLogs to be configured. (#6186)
This fixes pathological cases where the write throughput and snapshot size are both so large that more than 10k log entries are written in the time it takes to restore the snapshot from disk. In this case followers that restart can never catch up with leader replication again and enter a loop of constantly downloading a full snapshot and restoring it only to find that snapshot is already out of date and the leader has truncated its logs so a new snapshot is sent etc.

In general if you need to adjust this, you are probably abusing Consul for purposes outside its design envelope and should reconsider your usage to reduce data size and/or write volume.
2019-07-23 15:19:57 +01:00
hashicorp-ci a4431da1cc Merge Consul OSS branch 'master' at commit ef257b084d 2019-07-20 02:00:29 +00:00
javicrespo b006060d4c log rotation: limit count of rotated log files (#5831) 2019-07-19 15:36:34 -06:00
R.B. Boyer 67a36e3452
handle structs.ConfigEntry decoding similarly to api.ConfigEntry decoding (#6106)
Both 'consul config write' and server bootstrap config entries take a
decoding detour through mapstructure on the way from HCL to an actual
struct. They both may take in snake_case or CamelCase (for consistency)
so need very similar handling.

Unfortunately since they are operating on mirror universes of structs
(api.* vs structs.*) the code cannot be identitical, so try to share the
kind-configuration and duplicate the rest for now.
2019-07-12 12:20:30 -05:00
Jack Pearkes e6f1b78efb Make cluster names SNI always (#6081)
* Make cluster names SNI always

* Update some tests

* Ensure we check for prepared query types

* Use sni for route cluster names

* Proper mesh gateway mode defaulting when the discovery chain is used

* Ignore service splits from PatchSliceOfMaps

* Update some xds golden files for proper test output

* Allow for grpc/http listeners/cluster configs with the disco chain

* Update stats expectation
2019-07-08 12:48:48 +01:00
Matt Keeler 8d953f5840 Implement Mesh Gateways
This includes both ingress and egress functionality.
2019-07-01 16:28:30 -04:00
hashicorp-ci 43bda6fb76 Merge Consul OSS branch 'master' at commit e91f73f592 2019-06-30 02:00:31 +00:00
R.B. Boyer 38d76c624e
Allow for both snake_case and CamelCase for config entries written with 'consul config write'. (#6044)
This also has the added benefit of fixing an issue with passing
time.Duration fields through config entries.
2019-06-28 11:35:35 -05:00