consul

Commit Graph

Author	SHA1	Message	Date
Matt Keeler	6fe09aa23b	Update CHANGELOG.md	2019-01-11 16:06:17 -05:00
Matt Keeler	1ec5f2a27f	Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs (#5211) * Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs This ensures that future certificate signings will have a strictly greater ModifyIndex than any previous certs signed.	2019-01-11 16:04:57 -05:00
Matt Keeler	834e168f94	Update CHANGELOG.md	2019-01-11 09:31:49 -05:00
Aestek	4afbe792df	Improve blocking queries on services that do not exist (#4810) ## Background When making a blocking query on a missing service (was never registered, or is not registered anymore) the query returns as soon as any service is updated. On clusters with frequent updates (5~10 updates/s in our DCs) these queries virtually do not block, and clients with no protections against this waste resources on the agent and server side. Clients that do protect against this get updates later than they should because of the backoff time they implement between requests. ## Implementation While reducing the number of unnecessary updates we still want: * Clients to be notified as soon as the last instance of a service disappears. * Clients to be notified whenever there is an update for the service. * Clients to be notified as soon as the first instance of the requested service is added. To reduce the number of unnecessary updates we need to block when a request to a missing service is made. However in the following case: 1. Client `client1` makes a query for service `foo`, gets back a node and X-Consul-Index 42 2. `foo` is unregistered 3. `client1` makes a query for `foo` with `index=42` -> `foo` does not exist, the query blocks and `client1` is not notified of the change on `foo` We could store the last raft index when each service was last alive to know whether we should block on the incoming query or not, but that list could grow indefinitely. We instead store the last raft index when a service was unregistered and use it when a query targets a service that does not exist. When a service `srv` is unregistered this "missing service index" is always greater than any X-Consul-Index held by the clients while `srv` was up, allowing us to immediately notify them. 1. Client `client1` makes a query for service `foo`, gets back a node and `X-Consul-Index: 42` 2. `foo` is unregistered, we set the "missing service index" to 43 3. `client1` makes a blocking query for `foo` with `index=42` -> `foo` does not exist, we check against the "missing service index" and return immediately with `X-Consul-Index: 43` 4. `client1` makes a blocking query for `foo` with `index=43` -> we block 5. Other changes happen in the cluster, but foo still doesn't exist and "missing service index" hasn't changed, the query is still blocked 6. `foo` is registered again on index 62 -> `foo` exists and its index is greater than 43, we unblock the query	2019-01-11 09:26:14 -05:00
R.B. Boyer	4db60f8243	website: minor acl guide fixes (#5214)	2019-01-10 14:17:20 -06:00
Elghazal Ahmed	2e97a4858f	website: add autowire in Community Tools list (#5118) * add autowire in Community Tools list * put list in the right alphabetic order	2019-01-10 12:27:55 -06:00
Hans Hasselberg	f8112e7eaf	Update CHANGELOG.md	2019-01-10 17:44:40 +01:00
Hans Hasselberg	27738c77c0	Update CHANGELOG.md	2019-01-10 17:43:31 +01:00
Matt Keeler	052d68b5fb	Update CHANGELOG.md	2019-01-10 11:25:01 -05:00
Matt Keeler	baa8946ea6	cache: Pass through wait query param to the cache.Get (#5203) This adds a MaxQueryTime field to the connect ca leaf cache request type and populates it via the wait query param. The cache will then do the right thing and timeout the operation as expected if no new leaf cert is available within that time. Fixes #4462. The reproduction scenario in the original issue now times out appropriately.	2019-01-10 11:23:37 -05:00
Matt Keeler	27a9f51b24	Update CHANGELOG.md	2019-01-10 10:51:16 -05:00
Pierre Souchay	1618d79518	Allow `"disable_host_node_id": false` to work on Linux as non-root. (#4926) Bump `shirou/gopsutil` to include https://github.com/shirou/gopsutil/pull/603 This allows a consistent node-id even when the machine is reinstalled when using `"disable_host_node_id": false` It fixes https://github.com/hashicorp/consul/issues/4914 and allows having the same node-id even when reinstalling a node from scratch. However, it is only compatible with a single OS (installing to Windows will change the node-id, but it seems acceptable).	2019-01-10 10:50:14 -05:00
Matt Keeler	8969b8d3e4	Update CHANGELOG.md	2019-01-10 09:29:51 -05:00
Aestek	c043de5381	[Security] Allow blocking Write endpoints on Agent using Network Addresses (#4719) * Add -write-allowed-nets option * Add documentation for the new write_allowed_nets option	2019-01-10 09:27:26 -05:00
Matt Keeler	1048f3d5e7	acl: Prevent tokens from deleting themselves (#5210) Fixes #4897 Also apparently token deletion could segfault in secondary DCs when attempting to delete non-existent tokens. For that reason both checks are wrapped within the non-nil check.	2019-01-10 09:22:51 -05:00
Paul Banks	315cfef08c	Update CHANGELOG.md	2019-01-10 12:50:31 +00:00
Paul Banks	82487d8f68	Update CHANGELOG.md	2019-01-10 12:47:59 +00:00
Paul Banks	0638e09b6e	connect: agent leaf cert caching improvements (#5091) * Add State storage and LastResult argument into Cache so that cache.Types can safely store additional data that is eventually expired. * New Leaf cache type working and basic tests passing. TODO: more extensive testing for the Root change jitter across blocking requests, test concurrent fetches for different leaves interact nicely with rootsWatcher. * Add multi-client and delayed rotation tests. * Typos and cleanup error handling in roots watch * Add comment about how the FetchResult can be used and change ca leaf state to use a non-pointer state. * Plumb test override of root CA jitter through TestAgent so that tests are deterministic again! * Fix failing config test	2019-01-10 12:46:11 +00:00
kaitlincarter-hc	2dfc9ae989	Re-worked the ACL guide into two docs and an updated guide. (#5093) * Re-worked the ACL guide into two docs and an updated guide. Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com> * Updating syntax based on amayer5125's comments. * Missed one of amayer5125's comments * found a bad link in the acl system docs * fixing a link in the rules docs	2019-01-09 15:07:20 -06:00
Kyle Havlovitz	c07c5446a8	txn: clean up some state store/acl code	2019-01-09 11:59:23 -08:00
Erik R. Rygg	6580890f6e	Include information about multi-dc Connect	2019-01-08 14:30:36 -07:00
Matt Keeler	0e280b5a08	Update CHANGELOG.md	2019-01-08 11:44:36 -05:00
dawxy	238d430275	Fix data race (#5029) Fix #4357	2019-01-08 11:43:14 -05:00
Hans Hasselberg	067027230b	connect: add tls config for vault connect ca provider (#5125) * add tlsconfig for vault connect ca provider. * add options to the docs * add tests for new configuration	2019-01-08 17:09:22 +01:00
Hans Hasselberg	0fc1c203cc	snapshot: read meta.json correctly. (#5193) * snapshot: read meta.json correctly. Fixes #4452.	2019-01-08 17:06:28 +01:00
Alejandro Guirao Rodríguez	9f33353c14	agent/config: Fix typo in comment (#5202)	2019-01-08 16:27:22 +01:00
Paul Banks	a13e4b2090	Update CHANGELOG.md	2019-01-08 10:16:34 +00:00
Paul Banks	bb7145f27d	agent: add default weights to service in local state to prevent AE churn (#5126) * Add default weights when adding a service with no weights to local state to prevent constant AE re-sync. This fix was contributed by @42wim in https://github.com/hashicorp/consul/pull/5096 but was merged against the wrong base. This adds it to master and adds a test to cover the behaviour. * Fix tests that broke due to comparing internal state which now has default weights	2019-01-08 10:13:49 +00:00
Paul Banks	233ef4634b	Update CHANGELOG.md	2019-01-08 10:09:03 +00:00
Paul Banks	0589525ae9	agent: Don't leave old errors around in cache (#5094) * Fixes #4480. Don't leave old errors around in cache that can be hit in specific circumstances. * Move error reset to cover extreme edge case of nil Value, nil err Fetch	2019-01-08 10:06:38 +00:00
Jack Pearkes	96b877f79e	website: fixed ca provider references (#5185) Fixes https://github.com/hashicorp/consul/issues/5182.	2019-01-07 18:47:02 -08:00
Matt Keeler	62aa40872c	Update CHANGELOG.md	2019-01-07 16:55:14 -05:00
Pierre Souchay	ae7f88f995	Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918) * Avoid to have infinite recursion in DNS lookups when resolving CNAMEs This will avoid killing Consul when a Service.Address is using CNAME to a Consul CNAME that creates an infinite recursion. This will fix https://github.com/hashicorp/consul/issues/4907 * Use maxRecursionLevel = 3 to allow several recursions	2019-01-07 16:53:54 -05:00
Paul Banks	88d239818c	Update CHANGELOG.md	2019-01-07 21:32:20 +00:00
Paul Banks	b29bc906ee	bugfix: use ServiceTags to generate cache key hash (#4987) * bugfix: use ServiceTags to generate cache key hash * update unit test * update * remote print log * Update .gitignore * Completely deprecate ServiceTag field internally for clarity * Add explicit test for CacheInfo cases	2019-01-07 21:30:47 +00:00
R.B. Boyer	0f22706af5	Update CHANGELOG.md	2019-01-07 15:05:06 -06:00
R.B. Boyer	b96391ecff	update github.com/hashicorp/{serf,memberlist,go-sockaddr} (#5189) This activates large-cluster improvements in the gossip layer from https://github.com/hashicorp/memberlist/pull/167	2019-01-07 15:00:47 -06:00
Matt Keeler	d4f7a830a8	Update CHANGELOG.md	2019-01-07 13:56:08 -05:00
Aestek	8709213d6e	Prevent status flap when re-registering a check (#4904) Fixes point `#2` of: https://github.com/hashicorp/consul/issues/4903 When registering a service each healthcheck status is saved and restored (https://github.com/hashicorp/consul/blob/master/agent/agent.go#L1914) to avoid unnecessary flaps in health state. This change extends this feature to single check registration by moving this protection into `AddCheck()` so that both `PUT /v1/agent/service/register` and `PUT /v1/agent/check/register` behave in the same idempotent way. #### Steps to reproduce 1. Register a check: ``` curl -X PUT \ http://127.0.0.1:8500/v1/agent/check/register \ -H 'Content-Type: application/json' \ -d '{ "Name": "my_check", "ServiceID": "srv", "Interval": "10s", "Args": ["true"] }' ``` 2. The check will initialize and change to `passing` 3. Run the same request again 4. The check status will quickly go from `critical` to `passing` (the delay for this transition is determined by https://github.com/hashicorp/consul/blob/master/agent/checks/check.go#L95)	2019-01-07 13:53:03 -05:00
RJ Spiker	516ba47609	website: fix carousel bugs	2019-01-07 13:39:14 -05:00
Mitchell Hashimoto	f76022fa63	CA Provider Plugins (#4751) This adds the `agent/connect/ca/plugin` library for consuming/serving Connect CA providers as [go-plugin](https://github.com/hashicorp/go-plugin) plugins. This does not wire this up in any way to Consul itself, so this will not enable using these plugins yet. ## Why? We want to enable CA providers to be pluggable without modifying Consul so that any CA or PKI system can potentially back the Connect certificates. This CA system may also be used in the future for easier bootstrapping and internal cluster security. ### go-plugin The benefit of `go-plugin` is that for the plugin consumer, the fact that the interface implementation is communicating over multi-process RPC is invisible. Internals of Consul will continue to just use `ca.Provider` interface implementations as if they're local. For plugin _authors_, they simply have to implement the interface. The network/transport/process management issues are handled by go-plugin itself. The CA provider plugins support both `net/rpc` and gRPC transports. This enables easy authoring in any language. go-plugin handles the actual protocol handshake and connection. This is just a feature of go-plugin. `go-plugin` is already in production use for years by Packer, Terraform, Nomad, Vault, and Sentinel. We've shown stability for both desktop and server-side software. It is very mature. ## Implementation Details ### `map[string]interface{}` The `Configure` method passes a `map[string]interface{}`. This map contains only Go primitives and containers of primitives (no funcs, chans, etc.). For `net/rpc` we encode as-is using Gob. For gRPC we marshal to JSON and transmit as a `bytes` type. This is the same approach we take with Vault and other software. Note that this is just the transport protocol, the end software views it fully decoded. ### `x509.Certificate` and `CertificateRequest` We transmit the raw ASN.1 bytes and decode on the other side. Unit tests are verifying we get the same cert/csrs across the wire. ### Testing `go-plugin` exposes test helpers that enable testing the full plugin RPC over real loopback network connections. We test all endpoints for success and error for both `net/rpc` and gRPC. ### Vendoring This PR doesn't introduce vendoring for two reasons: 1. @banks's `f-envoy` branch introduces a lot of these and I didn't want conflict. 2. The library isn't actually used yet so it doesn't introduce compile-time errors (it does introduce test errors). ## Next Steps With this in place, we need to figure out the proper way to actually hook these up to Consul, load them, etc. This discussion can happen elsewhere, since regardless of approach this plugin library implementation is the exact same.	2019-01-07 12:48:44 -05:00
R.B. Boyer	3841e9e396	website: fix stray sentinel references using the old syntax (#5191) [skip ci]	2019-01-07 09:59:17 -06:00
Matt Keeler	7fd03b3ba4	Update CHANGELOG.md	2019-01-07 09:56:31 -05:00
Grégoire Seux	4f62a3b528	Implement /v1/agent/health/service/<service name> endpoint (#3551) This endpoint aggregates all checks related to <service id> on the agent and returns an appropriate HTTP code plus the string describing the worst check. This allows service status to be cleanly exposed to other components, hiding the complexity of multiple checks. This is especially useful when using consul to feed a load balancer which would delegate health checking to the consul agent. Exposing this endpoint on the agent is necessary to avoid a hit on consul servers and avoid decreasing resiliency (this endpoint will work even if there is no consul leader in the cluster).	2019-01-07 09:39:23 -05:00
Alvin Huang	067346e496	Merge pull request #5187 from hashicorp/add_ui_tests Add ui tests	2019-01-04 12:13:08 -05:00
kaitlincarter-hc	9a976a40f3	Added the new monitoring guide (#5117)	2019-01-04 10:26:07 -06:00
Matt Keeler	57b5a1de0c	Update CHANGELOG.md	2019-01-04 10:03:29 -05:00
Aestek	5960974db1	[Fix] Services sometimes not being synced with acl_enforce_version_8 = false (#4771) Fixes: https://github.com/hashicorp/consul/issues/3676 This fixes a bug where registering an agent with a non-existent ACL token can prevent other services registered with a good token from being synced to the server when using `acl_enforce_version_8 = false`. ## Background When `acl_enforce_version_8` is off the agent does not check the ACL token validity before storing the service in its state. When syncing a service registered with a missing ACL token we fall into the default error handling case (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1255) and stop the sync (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1082) without setting its Synced property to true like in the permission denied case. This means that the sync will always stop at the faulty service(s). The order in which the services are synced is random since we iterate on a map. So eventually all services with good ACL tokens will be synced; this can however take some time and is influenced by the cluster size (the bigger the slower, because retries are less frequent). Having a service in this state also prevents all further sync of checks, as they are done after the services. ## Changes This change modifies the sync process to continue even if there is an error. This fixes the issue described above and makes the sync more error tolerant: if the server repeatedly refuses a service (the ACL token could have been deleted by the time the service is synced, the servers were upgraded to a newer version that has more strict checks on the service definition...), then all services and checks that can be synced will be, and those that can't will be marked as errors in the logs instead of blocking the whole process.	2019-01-04 10:01:50 -05:00
Alvin Huang	fde5d75c68	Merge pull request #5186 from hashicorp/add_codeowners add codeowners for consul docs	2019-01-04 09:32:53 -05:00
Alvin Huang	5425a86058	add documentation on how to use ember-exam	2019-01-03 23:50:02 -05:00
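
The "missing service index" behaviour described in commit 4afbe792df can be illustrated with a small Go sketch. This is a simplified model written for this listing, not Consul's state store: the `serviceIndex` type and its `register`, `deregister`, and `query` methods are hypothetical names, and a real blocking query would wait on a watch rather than return a boolean.

```go
package main

import "fmt"

// serviceIndex is a hypothetical, simplified model of the indexes involved;
// it is not Consul's actual state store.
type serviceIndex struct {
	live           map[string]uint64 // service name -> raft index of its last update
	missingService uint64            // raft index of the most recent unregistration
}

func newServiceIndex() *serviceIndex {
	return &serviceIndex{live: make(map[string]uint64)}
}

// register records that a service was added or updated at raft index idx.
func (s *serviceIndex) register(name string, idx uint64) {
	s.live[name] = idx
}

// deregister removes the service and bumps the shared "missing service index".
func (s *serviceIndex) deregister(name string, idx uint64) {
	delete(s.live, name)
	s.missingService = idx
}

// query returns the index to report for the service and whether a blocking
// query holding clientIdx should keep blocking.
func (s *serviceIndex) query(name string, clientIdx uint64) (uint64, bool) {
	cur, ok := s.live[name]
	if !ok {
		// The service does not exist: fall back to the missing service index.
		cur = s.missingService
	}
	return cur, cur <= clientIdx
}

func main() {
	s := newServiceIndex()
	s.register("foo", 42)   // client1 sees X-Consul-Index 42
	s.deregister("foo", 43) // missing service index becomes 43

	fmt.Println(s.query("foo", 42)) // 43 false: return immediately
	fmt.Println(s.query("foo", 43)) // 43 true: block

	s.register("bar", 50)           // unrelated change elsewhere in the cluster
	fmt.Println(s.query("foo", 43)) // 43 true: still blocked

	s.register("foo", 62)           // foo comes back
	fmt.Println(s.query("foo", 43)) // 62 false: unblock
}
```

The sketch keeps a single index for all missing services, matching the commit message's point that tracking a per-service "last alive" index could grow without bound.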

9372 Commits