Commit Graph

571 Commits

Author SHA1 Message Date
Matt Keeler 66fd23d67f
Refactor to call non-voting servers read replicas (#9191)
Co-authored-by: Kit Patella <kit@jepsen.io>
2020-11-17 10:53:57 -05:00
Freddy fe728855ed
Add DC and NS support for Envoy metrics (#9207)
This PR updates the tags that we generate for Envoy stats.

Several of these come with breaking changes, since we can't keep two stats prefixes for a filter.
2020-11-16 16:37:19 -07:00
Matt Keeler 7f87adbcf4
Remove this constant as it is soon to be changing and we want to prevent backwards compat issues (#9193) 2020-11-13 17:09:51 -05:00
R.B. Boyer 61eac21f1a
agent: return the default ACL policy to callers as a header (#9101)
Header is: X-Consul-Default-ACL-Policy=<allow|deny>

This is of particular utility when fetching matching intentions, as the
fallthrough for a request that doesn't match any intentions is to
enforce using the default acl policy.
2020-11-12 10:38:32 -06:00
Matt Keeler 7ef9b04f90
Add a CLI command for retrieving the autopilot configuration. (#9142) 2020-11-11 13:19:02 -05:00
Matt Keeler c048e86bb2
Switch to using the external autopilot module 2020-11-09 09:22:11 -05:00
R.B. Boyer 8baf158ea8
Revert "Add namespace support for metrics (OSS) (#9117)" (#9124)
This reverts commit 06b3b017d3.
2020-11-06 10:24:32 -06:00
Freddy 06b3b017d3
Add namespace support for metrics (OSS) (#9117) 2020-11-05 18:24:29 -07:00
R.B. Boyer e113dc0fe2
upstream some differences from enterprise (#8902) 2020-10-09 09:42:53 -05:00
Matt Keeler 38f5ddce2a
Add per-agent reconnect timeouts (#8781)
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.

This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
2020-10-08 15:02:19 -04:00
R.B. Boyer 0c9177f6a5
api: unflake some intention-related api tests (#8857) 2020-10-07 13:32:53 -05:00
R.B. Boyer 1b413b0444
connect: support defining intentions using layer 7 criteria (#8839)
Extend Consul’s intentions model to allow for request-based access control enforcement for HTTP-like protocols in addition to the existing connection-based enforcement for unspecified protocols (e.g. tcp).
2020-10-06 17:09:13 -05:00
R.B. Boyer a2a8e9c783
connect: intentions are now managed as a new config entry kind "service-intentions" (#8834)
- Upgrade the ConfigEntry.ListAll RPC to be kind-aware so that older
copies of consul will not see new config entries it doesn't understand
replicate down.

- Add shim conversion code so that the old API/CLI method of interacting
with intentions will continue to work so long as none of these are
edited via config entry endpoints. Almost all of the read-only APIs will
continue to function indefinitely.

- Add new APIs that operate on individual intentions without IDs so that
the UI doesn't need to implement CAS operations.

- Add a new serf feature flag indicating support for
intentions-as-config-entries.

- The old line-item intentions way of interacting with the state store
will transparently flip between the legacy memdb table and the config
entry representations so that readers will never see a hiccup during
migration where the results are incomplete. It uses a piece of system
metadata to control the flip.

- The primary datacenter will begin migrating intentions into config
entries on startup once all servers in the datacenter are on a version
of Consul with the intentions-as-config-entries feature flag. When it is
complete the old state store representations will be cleared. We also
record a piece of system metadata indicating this has occurred. We use
this metadata to skip ALL of this code the next time the leader starts
up.

- The secondary datacenters continue to run the old intentions
replicator until all servers in the secondary DC and primary DC support
intentions-as-config-entries (via serf flag). Once this condition it met
the old intentions replicator ceases.

- The secondary datacenters replicate the new config entries as they are
migrated in the primary. When they detect that the primary has zeroed
it's old state store table it waits until all config entries up to that
point are replicated and then zeroes its own copy of the old state store
table. We also record a piece of system metadata indicating this has
occurred. We use this metadata to skip ALL of this code the next time
the leader starts up.
2020-10-06 13:24:05 -05:00
R.B. Boyer d2eb27e0a3
api: support GetMeta() and GetNamespace() on all config entry kinds (#8764)
Fixes #8755

Since I was updating the interface, i also added the missing `GetNamespace()`.

Depending upon how you look at it, this is a breaking change since it adds methods to the exported interface `api.ConfigEntry`. Given that you cannot define your own config entry kinds, and all of the machinery of the `api.Client` acts like a factory to construct the canned ones from the rest of the module, this feels like it's not a problematic change as it would only break someone who had reimplemented the `ConfigEntry` interface themselves for no apparent utility?
2020-09-29 09:11:57 -05:00
freddygv 7b9d1b41d5 Resolve conflicts against master 2020-09-11 18:41:58 -06:00
freddygv 768dbaa68d Add session flag to cookie config 2020-09-11 18:34:03 -06:00
freddygv eab90ea9fa Revert EnvoyConfig nesting 2020-09-11 09:21:43 -06:00
Seth Hoenig 9fab3fe990
api: create fresh http client for unix sockets (#8602)
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2020-09-06 12:27:39 -04:00
Freddy c9c9e4face
Make LockDelay configurable in api locks (#8621) 2020-09-04 13:38:26 -06:00
freddygv eaa250cc80 Ensure resolver node with LB isn't considered default 2020-09-03 08:55:57 -06:00
freddygv f81fe6a1a1 Remove LB infix and move injection to xds 2020-09-02 15:13:50 -06:00
R.B. Boyer 119e945c3e
connect: all config entries pick up a meta field (#8596)
Fixes #8595
2020-09-02 14:10:25 -05:00
freddygv 63f79e5f9b Restructure structs and other PR comments 2020-09-02 09:10:50 -06:00
freddygv 0236e169bb Add documentation for resolver LB cfg 2020-08-28 14:46:13 -06:00
freddygv ff56a64b08 Add LB policy to service-resolver 2020-08-27 19:44:02 -06:00
Jack 9e1c6727f9
Add http2 and grpc support to ingress gateways (#8458) 2020-08-27 15:34:08 -06:00
Matt Keeler 7c3914d89e
Add helpers to the API client to help with getting information from `AgentMember` tags (#8575)
Lots of constants were added for various tags that would concern users and are not already parsed out.

Additionally two methods on the AgentMember type were added to ask a member what its ACL Mode is and whether its a server or not.
2020-08-27 11:00:48 -04:00
Hans Hasselberg a932aafc91
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
Daniel Nephin d68edcecf4 testing: Remove all the defer os.Removeall
Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage
the removal
2020-08-14 19:58:53 -04:00
Mike Morris 48e7c07cf9
api: bump consul/api to v1.6.0 and consul/sdk to v0.6.0 (#8460)
* api: bump consul/sdk dependency to v0.6.0

* api: bump dependency to v1.6.0
2020-08-07 17:26:05 -04:00
Daniel Nephin ed8210fe4d api: Use a Logger instead of an io.Writer in api.Watch
So that we can pass around only a Logger, not a LogOutput
2020-08-05 13:25:08 -04:00
Mike Morris 85ef7ba943
api: restore Leader() and Peers() to avoid breaking function signatures (#8395)
api: add TestAPI_StatusLeaderWithQueryOptions and TestAPI_StatusPeersWithQueryOptions
api: make TestAPI_Status* error messages more verbose
2020-07-29 12:09:15 -04:00
spooner c34b088583
Added QueryOptions for status api (#7818)
* Added QueryOptions & Tests for status api
2020-07-28 12:26:50 -04:00
R.B. Boyer e853368c23
gossip: Avoid issue where two unique leave events for the same node could lead to infinite rebroadcast storms (#8343)
bump serf to v0.9.3 to include fix for https://github.com/hashicorp/serf/pull/606
2020-07-21 15:48:10 -05:00
Daniel Nephin 51efba2c7d testutil: NewLogBuffer - buffer logs until a test fails
Replaces #7559

Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.

Attaching log output from background goroutines also cause data races.  If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race.

The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long.

This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.

All of the logs are printed from the test goroutine, so they should be associated with the correct test.

Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.

Related:
https://github.com/golang/go/issues/38458 (may be fixed in go1.15)
https://github.com/golang/go/issues/38382#issuecomment-612940030
2020-07-21 12:50:40 -04:00
Freddy e72af87918
Add api mod support for /catalog/gateway-services (#8278) 2020-07-10 13:01:45 -06:00
Seth Hoenig 95f46eb3ed
api/agent: enable setting SuccessBeforePassing and FailuresBeforeCritical in API (#7949)
Fixes #7764

Until now these two fields could only be set through on-disk agent configuration.
This change adds the fields to the agent API struct definition so that they can
be set using the agent HTTP API.
2020-06-29 14:52:35 +02:00
R.B. Boyer 462f0f37ed
connect: various changes to make namespaces for intentions work more like for other subsystems (#8194)
Highlights:

- add new endpoint to query for intentions by exact match

- using this endpoint from the CLI instead of the dump+filter approach

- enforcing that OSS can only read/write intentions with a SourceNS or
  DestinationNS field of "default".

- preexisting OSS intentions with now-invalid namespace fields will
  delete those intentions on initial election or for wildcard namespaces
  an attempt will be made to downgrade them to "default" unless one
  exists.

- also allow the '-namespace' CLI arg on all of the intention subcommands

- update lots of docs
2020-06-26 16:59:15 -05:00
freddygv c791fbc79c Update namespaces subject-verb agreement 2020-06-23 10:57:30 -06:00
s-christoff 818d00fda3
Add AgentMemberStatus const (#8110)
* Add AgentMemberStatus const
2020-06-22 12:18:45 -05:00
Daniel Nephin 5afcf5c1bc
Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4
ci: enable SA4006 staticcheck check and add ineffassign
2020-06-17 12:16:02 -04:00
Daniel Nephin 068b43df90 Enable gofmt simplify
Code changes done automatically with 'gofmt -s -w'
2020-06-16 13:21:11 -04:00
Daniel Nephin cb050b280c ci: enable SA4006 staticcheck check
And fix the 'value not used' issues.

Many of these are not bugs, but a few are tests not checking errors, and
one appears to be a missed error in non-test code.
2020-06-16 13:10:11 -04:00
Matt Keeler d3881dd754
ACL Node Identities (#7970)
A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy).

Half of this commit is for golden file based tests of the acl token and role cli output. Another big updates was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boiler plate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.
2020-06-16 12:54:27 -04:00
Chris Piraino 91ab89dd48
Move ingress param to a new endpoint (#8081)
In discussion with team, it was pointed out that query parameters tend
to be filter mechanism, and that semantically the "/v1/health/connect"
endpoint should return "all healthy connect-enabled endpoints (e.g.
could be side car proxies or native instances) for this service so I can
connect with mTLS".

That does not fit an ingress gateway, so we remove the query parameter
and add a new endpoint "/v1/health/ingress" that semantically means
"all the healthy ingress gateway instances that I can connect to
to access this connect-enabled service without mTLS"
2020-06-10 13:07:15 -05:00
Chris Piraino 496e683360
Merge pull request #8064 from hashicorp/ingress/health-query-param
Add API query parameter ?ingress to allow users to find ingress gateways associated to a service
2020-06-09 16:08:28 -05:00
Chris Piraino 5e0cd7ede5 Remove unnecessary defer from api.health_test.go
We do not need to deregister services because every test gets its own
instance of the client agent and the tmp directories are all deleted at
the end.
2020-06-09 14:45:57 -05:00
Daniel Nephin 08f1ed16b4
Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2
intentions: fix a bug in Intention.SetHash
2020-06-09 15:40:20 -04:00
Chris Piraino 4837069fe0 api: update api module with health.Ingress() method 2020-06-09 12:11:47 -05:00
Daniel Nephin caa692deea ci: Enabled SA2002 staticcheck check
And handle errors in the main test goroutine
2020-06-05 17:50:11 -04:00