consul

Commit Graph

Author	SHA1	Message	Date
R.B. Boyer	64271289ec	server: partly fix config entry replication issue that prevents replication in some circumstances (#12307 ) There are some cross-config-entry relationships that are enforced during "graph validation" at persistence time that are required to be maintained. This means that config entries may form a digraph at times. Config entry replication procedes in a particular sorted order by kind and name. Occasionally there are some fixups to these digraphs that end up replicating in the wrong order and replicating the leaves (ingress-gateway) before the roots (service-defaults) leading to replication halting due to a graph validation error related to things like mismatched service protocol requirements. This PR changes replication to give each computed change (upsert/delete) a fair shot at being applied before deciding to terminate that round of replication in error. In the case where we've simply tried to do the operations in the wrong order at least ONE of the outstanding requests will complete in the right order, leading the subsequent round to have fewer operations to do, with a smaller likelihood of graph validation errors. This does not address all scenarios, but for scenarios where the edits are being applied in the wrong order this should avoid replication halting. Fixes #9319 The scenario that is NOT ADDRESSED by this PR is as follows: 1. create: service-defaults: name=new-web, protocol=http 2. create: service-defaults: name=old-web, protocol=http 3. create: service-resolver: name=old-web, redirect-to=new-web 4. delete: service-resolver: name=old-web 5. update: service-defaults: name=old-web, protocol=grpc 6. update: service-defaults: name=new-web, protocol=grpc 7. create: service-resolver: name=old-web, redirect-to=new-web If you shutdown dc2 just before (4) and turn it back on after (7) replication is impossible as there is no single edit you can make to make forward progress.	2022-02-23 17:27:48 -06:00
Chris S. Kim	ea47f066d7	Merge pull request #12430 from hashicorp/ci/main-assetfs-build auto-updated agent/uiserver/bindata_assetfs.go from commit `73b6687c5`	2022-02-23 18:19:30 -05:00
Daniel Nephin	771df290d7	Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary ca: add support for an external trusted CA	2022-02-22 13:14:52 -05:00
R.B. Boyer	8b987a4d59	configentry: make a new package to hold shared config entry structs that aren't used for RPC or the FSM (#12384 ) First two candidates are ConfigEntryKindName and DiscoveryChainConfigEntries.	2022-02-22 10:36:36 -06:00
Dhia Ayachi	cd9d8d44a5	file watcher to be used for configuration auto-reload feature (#12301 ) * add config watcher to the config package * add logging to watcher * add test and refactor to add WatcherEvent. * add all API calls and fix a bug with recreated files * add tests for watcher * remove the unnecessary use of context * Add debug log and a test for file rename * use inode to detect if the file is recreated/replaced and only listen to create events. * tidy ups (#1535) * tidy ups * Add tests for inode reconcile * fix linux vs windows syscall * fix linux vs windows syscall * fix windows compile error * increase timeout * use ctime ID * remove remove/creation test as it's a use case that fail in linux * fix linux/windows to use Ino/CreationTime * fix the watcher to only overwrite current file id * fix linter error * fix remove/create test * set reconcile loop to 200 Milliseconds * fix watcher to not trigger event on remove, add more tests * on a remove event try to add the file back to the watcher and trigger the handler if success * fix race condition * fix flaky test * fix race conditions * set level to info * fix when file is removed and get an event for it after * fix to trigger handler when we get a remove but re-add fail * fix error message * add tests for directory watch and fixes * detect if a file is a symlink and return an error on Add * rename Watcher to FileWatcher and remove symlink deref * add fsnotify@v1.5.1 * fix go mod * fix flaky test * Apply suggestions from code review Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> * fix a possible stack overflow * do not reset timer on errors, rename OS specific files * start the watcher when creating it * fix data race in tests * rename New func * do not call handler when a remove event happen * events trigger on write and rename * fix watcher tests * make handler async * remove recursive call * do not produce events for sub directories * trim "/" at the end of a directory when adding * add missing test * fix logging * add todo * fix failing test * fix flaking tests * fix flaky test * add logs * fix log text * increase timeout * reconcile when remove * check reconcile when removed * fix reconcile move test * fix logging * delete invalid file * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * fix review comments * fix is watched to properly catch a remove * change test timeout * fix test and rename id * fix test to create files with different mod time. * fix deadlock when stopping watcher * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * fix a deadlock when calling stop while emitting event is blocked * make sure to close the event channel after the event loop is done * add go doc * back date file instead of sleeping * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * check error Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-02-21 11:36:52 -05:00
hc-github-team-consul-core	ad14a2bffd	auto-updated agent/uiserver/bindata_assetfs.go from commit `73b6687c5`	2022-02-21 12:27:52 +00:00
Evan Culver	602e08ada7	checks: populate interval and timeout when registering services (#11138 )	2022-02-18 12:05:33 -08:00
Kyle Havlovitz	362753cad7	Merge pull request #12385 from hashicorp/tproxy-http-upstream-fix xds: respect chain protocol on default discovery chain	2022-02-18 10:08:59 -08:00
Daniel Nephin	dc484ee09e	rpc: set response to nil when not found Otherwise when the query times out we might incorrectly send a value for the reply, when we should send an empty reply. Also document errNotFound and how to handle the result in that case.	2022-02-18 12:26:06 -05:00
Daniel Nephin	6021105dfc	ca: test that original certs from secondary still verify There's a chance this could flake if the secondary hasn't received the update yet, but running this test many times doesn't show any flakes yet.	2022-02-17 18:45:16 -05:00
Daniel Nephin	6b679aa9d4	Update TODOs to reference an issue with more details And remove a no longer needed TODO	2022-02-17 18:21:30 -05:00
Daniel Nephin	1853a32df6	ca: add test cases for rotating external trusted CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	5e8ea2a039	ca: add a test for secondary with external CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	42ec34d101	ca: examine the full chain in newCARoot make TestNewCARoot much more strict compare the full result instead of only a few fields. add a test case with 2 and 3 certificates in the pem	2022-02-17 18:21:30 -05:00
Daniel Nephin	71f3ae04e2	ca: small docs improvements	2022-02-17 18:21:30 -05:00
Daniel Nephin	86994812ed	ca: cleanup validateSetIntermediate	2022-02-17 18:21:30 -05:00
Daniel Nephin	c1c1580bf8	ca: only return the leaf cert from Sign in vault provider The interface is documented as 'Sign will only return the leaf', and the other providers only return the leaf. It seems like this was added during the initial implementation, so is likely just something we missed. It doesn't break anything , but it does cause confusing cert chains in the API response which could break something in the future.	2022-02-17 18:21:30 -05:00
Daniel Nephin	85ecbaf109	Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found rpc: make blocking queries for non-existent items more efficient	2022-02-17 18:09:39 -05:00
Ashwin Venkatesh	6e6cd928a2	Parse datacenter from request (#12370 ) * Parse datacenter from request - Parse the value of the datacenter from the create/delete requests for AuthMethods and BindingRules so that they can be created in and deleted from the datacenters specified in the request.	2022-02-17 16:41:27 -05:00
Kyle Havlovitz	3fe358b831	xds: respect chain protocol on default discovery chain	2022-02-17 11:47:20 -08:00
Florian Apolloner	f01f00fc84	Support for connect native services in topology view. (#12098 )	2022-02-16 16:51:54 -05:00
Chris S. Kim	154b781bc8	Move IndexEntryName helpers to common files (#12365 )	2022-02-16 12:56:38 -05:00
Daniel Nephin	8a6e75ac81	rpc: add errNotFound to all Get queries Any query that returns a list of items is not part of this commit.	2022-02-15 18:24:34 -05:00
Daniel Nephin	4b33bdf396	Make blockingQuery efficient with 'not found' results. By using the query results as state. Blocking queries are efficient when the query matches some results, because the ModifyIndex of those results, returned as queryMeta.Mindex, will never change unless the items themselves change. Blocking queries for non-existent items are not efficient because the queryMeta.Index can (and often does) change when other entities are written. This commit reduces the churn of these queries by using a different comparison for "has changed". Instead of using the modified index, we use the existence of the results. If the previous result was "not found" and the new result is still "not found", we know we can ignore the modified index and continue to block. This is done by setting the minQueryIndex to the returned queryMeta.Index, which prevents the query from returning before a state change is observed.	2022-02-15 18:24:33 -05:00
Daniel Nephin	897b953f66	Add a test for blocking query on non-existent entry This test shows how blocking queries are not efficient when the query returns no results. The test fails with 100+ calls instead of the expected 2. This test is still a bit flaky because it depends on the timing of the writes. It can sometimes return 3 calls. A future commit should fix this and make blocking queries even more optimal for not-found results.	2022-02-15 18:23:17 -05:00
Daniel Nephin	3301f94004	rpc: improve docs for blockingQuery Follow the Go convention of accepting a small interface that documents the methods used by the function. Clarify the rules for implementing a query function passed to blockingQuery.	2022-02-15 14:20:14 -05:00
R.B. Boyer	115946da99	server: conditionally avoid writing a config entry to raft if it was already the same (#12321 ) This will both save on unnecessary raft operations as well as unnecessarily incrementing the raft modify index of config entries subject to no-op updates.	2022-02-14 14:39:12 -06:00
FFMMM	78264a8030	Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311 ) This commit syncs ENT changes to the OSS repo. Original commit details in ENT: ``` commit 569d25f7f4578981c3801e6e067295668210f748 Author: FFMMM <FFMMM@users.noreply.github.com> Date: Thu Feb 10 10:23:33 2022 -0800 Vendor fork net rpc (#1538) * replace net/rpc w consul-net-rpc/net/rpc Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * replace msgpackrpc and go-msgpack with fork from mono repo Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * gofmt all files touched Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> ``` Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-02-14 09:45:45 -08:00
R.B. Boyer	52009ae86a	missed this test adjustment (#12331 )	2022-02-14 11:39:00 -06:00
R.B. Boyer	fa4577d1a9	local: fixes a data race in anti-entropy sync (#12324 ) The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests. The two main changes here were: - ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields - ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.	2022-02-14 10:41:33 -06:00
Dao Thanh Tung	add15e12f7	URL-encode/decode resource names for HTTP API part 5 (#12297 )	2022-02-14 10:47:06 -05:00
Mark Anderson	1a16f7ee70	Refactor to make ACL errors more structured. (#12308 ) * First phase of refactoring PermissionDeniedError Add extended type PermissionDeniedByACLError that captures information about the accessor, particular permission type and the object and name of the thing being checked. It may be worth folding the test and error return into a single helper function, that can happen at a later date. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-02-11 12:53:23 -08:00
Freddy	9580f79f86	Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup	2022-02-10 17:35:51 -07:00
freddygv	ceb52d649a	Account for upstream targets in another DC. Transparent proxies typically cannot dial upstreams in remote datacenters. However, if their upstream configures a redirect to a remote DC then the upstream targets will be in another datacenter. In that sort of case we should use the WAN address for the passthrough.	2022-02-10 17:01:57 -07:00
freddygv	cbea3d203c	Fix race of upstreams with same passthrough ip Due to timing, a transparent proxy could have two upstreams to dial directly with the same address. For example: - The orders service can dial upstreams shipping and payment directly. - An instance of shipping at address 10.0.0.1 is deregistered. - Payments is scaled up and scheduled to have address 10.0.0.1. - The orders service receives the event for the new payments instance before seeing the deregistration for the shipping instance. At this point two upstreams have the same passthrough address and Envoy will reject the listener configuration. To disambiguate this commit considers the Raft index when storing passthrough addresses. In the example above, 10.0.0.1 would only be associated with the newer payments service instance.	2022-02-10 17:01:57 -07:00
freddygv	659ebc05a9	Ensure passthrough addresses get cleaned up Transparent proxies can set up filter chains that allow direct connections to upstream service instances. Services that can be dialed directly are stored in the PassthroughUpstreams map of the proxycfg snapshot. Previously these addresses were not being cleaned up based on new service health data. The list of addresses associated with an upstream service would only ever grow. As services scale up and down, eventually they will have instances assigned to an IP that was previously assigned to a different service. When IP addresses are duplicated across filter chain match rules the listener config will be rejected by Envoy. This commit updates the proxycfg snapshot management so that passthrough addresses can get cleaned up when no longer associated with a given upstream. There is still the possibility of a race condition here where due to timing an address is shared between multiple passthrough upstreams. That concern is mitigated by #12195, but will be further addressed in a follow-up.	2022-02-10 17:01:57 -07:00
Freddy	378a7258e3	Prevent xDS tight loop on cfg errors (#12195 )	2022-02-10 15:37:36 -07:00
Dhia Ayachi	4f0a71d7b4	fix race when starting a service while the agent `serviceManager` is … (#12302 ) * fix race when starting a service while the agent `serviceManager` is stopping * add changelog	2022-02-10 13:30:49 -05:00
Daniel Nephin	01784470f3	Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register catalog: initialize the refs map to prevent a nil panic	2022-02-09 19:48:22 -05:00
Daniel Nephin	82c264b2b3	config-entry: fix a panic when registering a service or ingress gateway	2022-02-09 18:49:48 -05:00
R.B. Boyer	89bd1f57b5	xds: allow only one outstanding delta request at a time (#12236 ) Fixes #11876 This enforces that multiple xDS mutations are not issued on the same ADS connection at once, so that we can 100% control the order that they are applied. The original code made assumptions about the way multiple in-flight mutations were applied on the Envoy side that was incorrect.	2022-02-08 10:36:48 -06:00
Daniel Nephin	7ec658b7ac	Merge pull request #12265 from hashicorp/dnephin/logging-in-tests sdk: add TestLogLevel for setting log level in tests	2022-02-07 16:11:23 -05:00
Daniel Nephin	437f769916	A test to reproduce the issue	2022-02-04 14:04:12 -05:00
Daniel Nephin	51b0f82d0e	Make test more readable And fix typo	2022-02-03 18:44:09 -05:00
Daniel Nephin	608597c7b6	ca: relax and move private key type/bit validation for vault This commit makes two changes to the validation. Previously we would call this validation in GenerateRoot, which happens both on initialization (when a follower becomes leader), and when a configuration is updated. We only want to do this validation during config update so the logic was moved to the UpdateConfiguration function. Previously we would compare the config values against the actual cert. This caused problems when the cert was created manually in Vault (not created by Consul). Now we compare the new config against the previous config. Using a already created CA cert should never error now. Adding the key bit and types to the config should only error when the previous values were not the defaults.	2022-02-03 17:21:20 -05:00
Daniel Nephin	d707173253	ca: small cleanup of TestConnectCAConfig_Vault_TriggerRotation_Fails Before adding more test cases	2022-02-03 17:21:20 -05:00
Daniel Nephin	3f590bb8a1	testing: fix test failures caused by new log level These two tests require debug logging enabled, because they look for log lines. Also switched to testify assertions because the previous errors were not clear.	2022-02-03 17:07:39 -05:00
Daniel Nephin	b058845110	sdk: add TestLogLevel for setting log level in tests And default log level to WARN.	2022-02-03 13:42:28 -05:00
Daniel Nephin	7839b2d7e0	ca: add a test that uses an intermediate CA as the primary CA This test found a bug in the secondary. We were appending the root cert to the PEM, but that cert was already appended. This was failing validation in Vault here: https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329 Previously this worked because self signed certs have the same SubjectKeyID and AuthorityKeyID. So having the same self-signed cert repeated doesn't fail that check. However with an intermediate that is not self-signed, those values are different, and so we fail the check. A test I added in a previous commit should show that this continues to work with self-signed root certs as well.	2022-02-02 13:41:35 -05:00
Daniel Nephin	ac732ce82b	acl: un-embed ACLIdentity This is safer than embedding two interface because there are a number of places where we check the concrete type. If we check the concrete type on the top-level interface it will fail. So instead expose the ACLIdentity from a method.	2022-02-02 12:07:31 -05:00

1 2 3 4 5 ...

4103 Commits