Commit Graph

82 Commits

Author SHA1 Message Date
Mike Morris 708957a982
chore: update raft to v1.2.0 (#8822) 2020-10-08 15:07:10 -04:00
Matt Keeler 38f5ddce2a
Add per-agent reconnect timeouts (#8781)
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.

This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
2020-10-08 15:02:19 -04:00
Mike Morris 1ebc2fb006
chore(deps): update gopsutil to v2.20.9 (#8843)
* core(deps): bump golang.org/x/sys

To resolve /go/pkg/mod/github.com/shirou/gopsutil@v2.20.9+incompatible/host/host_bsd.go:20:13: undefined: unix.SysctlTimeval

* chore(deps): make update-vendor
2020-10-07 12:57:18 -04:00
Daniel Nephin 627449a870 Vendor gofuzz and google/go-cmp 2020-09-28 18:28:37 -04:00
Kyle Havlovitz b1b21139ca Merge branch 'master' into vault-ca-renew-token 2020-09-15 14:39:04 -07:00
Kyle Havlovitz 1cd7c43544 Update vault CA for latest api client 2020-09-15 13:33:55 -07:00
Kyle Havlovitz 74dc50a771 vendor: Update vault api package 2020-09-15 12:45:29 -07:00
Daniel Nephin 0c87cf468c Update go-metrics dependencies, to use metrics.Default() 2020-09-14 19:05:22 -04:00
Mike Morris 13017e0af9 vendor: bump consul/api to v1.7.0 2020-09-10 21:40:41 -04:00
R.B. Boyer 74d5df7c7a
xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569) 2020-08-27 12:20:58 -05:00
Hans Hasselberg a932aafc91
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
s-christoff 102b7e55da
Update Go-Metrics 0.3.4 (#8478) 2020-08-11 11:17:43 -05:00
Mike Morris 48e7c07cf9
api: bump consul/api to v1.6.0 and consul/sdk to v0.6.0 (#8460)
* api: bump consul/sdk dependency to v0.6.0

* api: bump dependency to v1.6.0
2020-08-07 17:26:05 -04:00
Kyle Havlovitz f4efd53d57 vendor: Update github.com/armon/go-metrics to v0.3.3 2020-07-23 11:37:33 -07:00
Matt Keeler a6a1a0e3d6
Update mapstructure to v1.3.3 (#8361)
This was done in preparation for another PR where I was running into https://github.com/mitchellh/mapstructure/issues/202 and implemented a fix for the library.
2020-07-22 15:13:21 -04:00
R.B. Boyer e853368c23
gossip: Avoid issue where two unique leave events for the same node could lead to infinite rebroadcast storms (#8343)
bump serf to v0.9.3 to include fix for https://github.com/hashicorp/serf/pull/606
2020-07-21 15:48:10 -05:00
Pierre Souchay 20d1ea7d2d
Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221)
Fixes #7527

I want to highlight this and explain what I think the implications are and make sure we are aware:

* `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block.
* `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case):
  1) `conn.SetDeadline(10*time.Millisecond)` so that
  2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 
  3) `conn.Close` can happen

The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.
2020-07-03 09:25:07 +02:00
Hans Hasselberg 95c027a3ea
Update gopsutil (#8208)
https://github.com/shirou/gopsutil/pull/895 is merged and fixes our
problem. Time to update. Since there is no new version just yet,
updating to the sha.
2020-07-01 14:47:56 +02:00
Matt Keeler e9835610f3
Add a test for go routine leaks
This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking.

This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability).

This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However its a start.
2020-06-24 17:09:50 -04:00
R.B. Boyer c63c994b04
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165) 2020-06-23 15:19:56 -05:00
Paul Banks f6ac08be04 state: track changes so that they may be used to produce change events 2020-06-16 13:04:29 -04:00
Daniel Nephin 98d271bee9 Update google.golang.org/api and stretchr/testify
To match the versions used in enterprise, should slightly reduce the
chances of getting a merge conflict when using `go.mod`.
2020-06-09 16:03:05 -04:00
Daniel Nephin db74f09b6b Update protobuf and golang.org/x/... vendor
Partially extracted from #7547

Updates protobuf to the most recent in the 1.3.x series, and updates
golang.org/x/sys to a7d97aace0b0 because of https://github.com/shirou/gopsutil/issues/853
prevents updating to a more recent version.

This breaking change in x/sys also prevents us from getting a newer
version of x/net. In the future, if gopsutil is not patched,  we may want to run a fork version of
gopsutil so that we can update both x/net and x/sys.
2020-06-09 14:46:41 -04:00
Daniel Nephin 99eb583ebc
Replace goe/verify.Values with testify/require.Equal (#7993)
* testing: replace most goe/verify.Values with require.Equal

One difference between these two comparisons is that go/verify considers
nil slices/maps to be equal to empty slices/maps, where as testify/require
does not, and does not appear to provide any way to enable that behaviour.

Because of this difference some expected values were changed from empty
slices to nil slices, and some calls to verify.Values were left.

* Remove github.com/pascaldekloe/goe/verify

Reduce the number of assertion packages we use from 2 to 1
2020-06-02 12:41:25 -04:00
R.B. Boyer 1efafd7523
acl: add auth method for JWTs (#7846) 2020-05-11 20:59:29 -05:00
Mike Morris 291e6af33a
vendor: revert golang.org/x/sys bump to avoid FreeBSD regression (#7780) 2020-05-05 09:26:17 +02:00
Hans Hasselberg c4093c87cc
agent: don't let left nodes hold onto their node-id (#7747) 2020-05-04 18:39:08 +02:00
Matt Keeler daec810e34
Merge pull request #7714 from hashicorp/oss-sync/msp-agent-token 2020-05-04 11:33:50 -04:00
Matt Keeler 55050beedb
Update go-discover dependency (#7731) 2020-05-04 10:59:48 -04:00
Matt Keeler 8c545b5206
Update mapstructure to v1.2.3
This release contains a fix to prevent duplicate keys in the Metadata after decoding where the output value contains pointer fields.
2020-04-28 09:33:16 -04:00
R.B. Boyer b989967791
cli: ensure that 'snapshot save' is fsync safe and also only writes to the requested file on success (#7698) 2020-04-24 17:34:47 -05:00
R.B. Boyer 5f1518c37c
cli: fix usage of gzip.Reader to better detect corrupt snapshots during save/restore (#7697) 2020-04-24 17:18:56 -05:00
Daniel Nephin 50d73c2674 Update github.com/joyent/triton-go to latest
There was an RSA private key used for testing included in the old
version. This commit updates it to a version that does not include the
key so that the key is not detected by tools which scan the Consul
binary for private keys.

Commands run:

go get github.com/joyent/triton-go@6801d15b779f042cfd821c8a41ef80fc33af9d47
make update-vendor
2020-04-16 12:34:29 -04:00
Daniel Nephin 5f44bbcac2
Merge pull request #7519 from hashicorp/dnephin/help-to-stdout
cli: send requested help text to stdout
2020-04-01 11:26:12 -04:00
Daniel Nephin ad7c78f134 Remove t.Name() from TestAgent.Name
And re-add the name to the logger so that log messages from different agents
in a single can be identified.
2020-03-30 16:47:24 -04:00
Daniel Nephin 5023a3b178 cli: send requested help text to stdout
This behaviour matches the GNU CLI standard:
http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html
2020-03-26 15:27:34 -04:00
R.B. Boyer 31943924f9
bump the expected go language version of the main module to 1.13 (#7429) 2020-03-10 14:46:09 -05:00
R.B. Boyer 6adad71125
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Matt Keeler 3c6d9516bc
Bump `api` and `sdk` module versions
api -> v1.4.0
sdk -> v0.4.0
2020-02-10 20:08:47 -05:00
R.B. Boyer 73ba5d9990
make the TestRPC_RPCMaxConnsPerClient test less flaky (#7255) 2020-02-10 15:13:53 -06:00
Matt Keeler 58befa754b
Update to miekg/dns v1.1.26 (#7252) 2020-02-10 15:14:27 -05:00
Hans Hasselberg 649ffcb66f
memberlist: vendor v0.1.6 to pull in new state: stateLeft (#7184) 2020-02-03 11:02:13 +01:00
Matt Keeler d1fcf1e950
Add replace directive to prevent contacting istio.io during the… (#7194)
They keep having TLS handshake timeouts. Its pointed at github instead.
2020-01-31 13:57:54 -05:00
Hans Hasselberg 5531678e9e
Security fixes (#7182)
* Mitigate HTTP/RPC Services Allow Unbounded Resource Usage

Fixes #7159.

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2020-01-31 11:19:37 -05:00
Chris Piraino 401221de58
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
Hans Hasselberg 9c1361c02b
raft: update raft to v1.1.2 (#7079)
* update raft
* use hclogger for raft.
2020-01-20 13:58:02 +01:00
Wim 4f5d5020b8 dns: fix memoryleak by upgrading outdated miekg/dns (#6748)
* Add updated github.com/miekg/dns to go modules
* Add updated github.com/miekg/dns to vendor
* Fix github.com/miekg/dns api breakage
* Decrease size when trimming UDP packets
Need more room for the header(?), if we don't decrease the size we get an
"overflow unpacking uint32" from the dns library
* Fix dns truncate tests with api changes
* Make windows build working again. Upgrade x/sys and x/crypto and vendor
This upgrade is needed because of API breakage in x/sys introduced
by the minimal x/sys dependency of miekg/dns

This API breakage has been fixed in commit
855e68c859
2019-12-16 22:31:27 +01:00
Worming 25da8d9cfb accept go get command with go 1.13 (#6938) 2019-12-12 10:15:05 -06:00
Mike Morris 66b8c20990
Bump go-discover to support EC2 Metadata Service v2 (#6865)
Refs https://github.com/hashicorp/go-discover/pull/128

* deps: add replace directive for gocheck

Transitive dep, source at https://launchpad.net/gocheck indicates
project moved. This also avoids a dependency on bzr when fetching
modules. Refs https://github.com/hashicorp/consul/pull/6818

* deps: make update-vendor

* test: update retry-join expected names from go-discover
2019-12-04 11:59:16 -05:00
Paul Banks cd1b613352
connect: Add AWS PCA provider (#6795)
* Update AWS SDK to use PCA features.

* Add AWS PCA provider

* Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user

* Unparallel the tests so we don't exhaust PCA limits

* Merge updates

* More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create

* Add AWS PCA docs

* Fix Vault doc typo too

* Doc typo

* Apply suggestions from code review

Co-Authored-By: R.B. Boyer <rb@hashicorp.com>
Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Doc fixes; tests for erroring if State is modified via API

* More review cleanup

* Uncomment tests!

* Minor suggested clean ups
2019-11-21 17:40:29 +00:00