consul

Commit Graph

Author	SHA1	Message	Date
Hans Hasselberg	d4877f03e7	fix TestLeader_SecondaryCA_IntermediateRenew (#8702) * fix lessThanHalfTime * get lock for CAProvider() * make a var to relate both vars * rename to getCAProviderWithLock * move CertificateTimeDriftBuffer to agent/connect/ca	2020-09-18 10:13:29 +02:00
Mike Morris	6b62751921	test: update tags for database service registrations and queries (#8693)	2020-09-16 14:05:01 -04:00
Kyle Havlovitz	b1b21139ca	Merge branch 'master' into vault-ca-renew-token	2020-09-15 14:39:04 -07:00
Daniel Nephin	cdd392d77f	agent/consul: pass dependencies directly from agent In an upcoming change we will need to pass a grpc.ClientConnPool from BaseDeps into Server. While looking at that change I noticed all of the existing consulOption fields are already on BaseDeps. Instead of duplicating the fields, we can create a struct used by agent/consul, and use that struct in BaseDeps. This allows us to pass along dependencies without translating them into different representations. I also looked at moving all of BaseDeps in agent/consul, however that created some circular imports. Resolving those cycles wouldn't be too bad (it was only an error in agent/consul being imported from cache-types), however this change seems a little better by starting to introduce some structure to BaseDeps. This change is also a small step in reducing the scope of Agent. Also remove some constants that were only used by tests, and move the relevant comment to where the live configuration is set. Removed some validation from NewServer and NewClient, as these are not really runtime errors. They would be code errors, which will cause a panic anyway, so no reason to handle them specially here.	2020-09-15 17:29:32 -04:00
Daniel Nephin	3aa9bd4c23	agent/consul: make router required	2020-09-15 17:26:26 -04:00
Kyle Havlovitz	7ffef62ed7	Clean up CA shutdown logic and error	2020-09-15 12:28:58 -07:00
freddygv	7fd518ff1d	Merge master	2020-09-14 16:17:43 -06:00
Daniel Nephin	20aea3dbc9	Merge pull request #8587 from hashicorp/streaming/add-grpc-server streaming: add gRPC server for handling connections	2020-09-14 15:24:54 -04:00
freddygv	7b9d1b41d5	Resolve conflicts against master	2020-09-11 18:41:58 -06:00
Kyle Havlovitz	49056fe70f	Clean up Vault renew tests and shutdown	2020-09-11 08:41:05 -07:00
freddygv	eab90ea9fa	Revert EnvoyConfig nesting	2020-09-11 09:21:43 -06:00
Kyle Havlovitz	aa97366020	Add a stop function to make sure the renewer is shut down on leader change	2020-09-10 06:12:48 -07:00
Kyle Havlovitz	411b6537ef	Add a test for token renewal	2020-09-09 16:36:37 -07:00
Daniel Nephin	2257247095	server: add gRPC server for streaming events Includes a stats handler and stream interceptor for grpc metrics. Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-08 12:10:41 -04:00
Hans Hasselberg	436a7032d1	secondaryIntermediateCertRenewalWatch abort on success (#8588) secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to renew the intermediate certificate. Once it entered the inner loop and started `retryLoopBackoff` it would never leave that. `retryLoopBackoffAbortOnSuccess` will return when renewing is successful, like it was intended originally.	2020-09-04 11:47:16 +02:00
Daniel Nephin	e573e64d58	state: handle terminating gateways in service health events	2020-09-03 16:58:05 -04:00
Daniel Nephin	3775392fb5	state: improve comments in catalog_events.go Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:05 -04:00
Daniel Nephin	417c5c93a8	state: use changeType in serviceChanges To be a little more explicit, instead of nil implying an indirect change	2020-09-03 16:58:05 -04:00
Daniel Nephin	01424ba146	don't over allocate slice	2020-09-03 16:58:04 -04:00
Daniel Nephin	d210242875	state: fix a bug in building service health events The nodeCheck slice was being used as the first arg in append, which in some cases will modify the array backing the slice. This would lead to service checks for other services in the wrong event. Also refactor some things to reduce the arguments to functions.	2020-09-03 16:58:04 -04:00
Daniel Nephin	7581305523	state: Remove unused args and return values Also rename some functions to identify them as constructors for events	2020-09-03 16:58:04 -04:00
Daniel Nephin	27b02d391c	state: use an enum for tracking node changes	2020-09-03 16:58:04 -04:00
Daniel Nephin	09329b542d	state: serviceHealthSnapshot refactored to remove unused return value and remove duplication	2020-09-03 16:58:04 -04:00
Daniel Nephin	bf523420ee	state: Add Change processor and snapshotter for service health Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:04 -04:00
Daniel Nephin	e03e911144	state: fix bug in changeTrackerDB.publish Creating a new readTxn does not work because it will not see the newly created objects that are about to be committed. Instead use the active write Txn.	2020-09-03 16:58:01 -04:00
Daniel Nephin	5de4d5bbe3	stream: have SnapshotFunc accept a non-pointer SubscribeRequest The value is not expected to be modified. Passing a value makes that explicit.	2020-09-03 16:54:02 -04:00
freddygv	cd4cf5161f	Update resolver defaulting	2020-09-03 13:08:44 -06:00
freddygv	eaa250cc80	Ensure resolver node with LB isn't considered default	2020-09-03 08:55:57 -06:00
freddygv	f81fe6a1a1	Remove LB infix and move injection to xds	2020-09-02 15:13:50 -06:00
Chris Piraino	28f163c2d2	Merge pull request #8603 from hashicorp/feature/usage-metrics Track node and service counts in the state store and emit them periodically as metrics	2020-09-02 13:23:39 -05:00
R.B. Boyer	d0f74cd1e8	connect: fix bug in preventing some namespaced config entry modifications (#8601) Whenever an upsert/deletion of a config entry happens, within the open state store transaction we speculatively test compile all discovery chains that may be affected by the pending modification to verify that the write would not create an erroneous scenario (such as splitting traffic to a subset that did not exist). If a single discovery chain evaluation references two config entries with the same kind and name in different namespaces then sometimes the upsert/deletion would be falsely rejected. It does not appear as though this bug would've let invalid writes through to the state store so the correction does not require a cleanup phase.	2020-09-02 10:47:19 -05:00
Chris Piraino	bcb586bee2	Set metrics reporting interval to 9 seconds This is below the 10 second interval that lib/telemetry.go implements as its aggregation interval, ensuring that we always report these metrics.	2020-09-02 10:24:23 -05:00
Chris Piraino	a3028cad89	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	d301145e62	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	086a8ea8eb	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	69dbc926ad	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	3feae7f77b	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	04705e90f9	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	63f79e5f9b	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Matt Keeler	91d680b830	Merge of auto-config and auto-encrypt code (#8523) auto-encrypt is now handled as a special case of auto-config. This also moves all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	81115b6eaa	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	6956477be5	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
R.B. Boyer	74d5df7c7a	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569)	2020-08-27 12:20:58 -05:00
Matt Keeler	f97cc0445a	Move RPC router from Client/Server and into BaseDeps (#8559) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
André Cruz	9a0792139c	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	aa212423e3	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Hans Hasselberg	a932aafc91	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
Daniel Nephin	d68edcecf4	testing: Remove all the defer os.RemoveAll Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	d677706625	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	912aae8624	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	5b37efd91b	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
R.B. Boyer	e3cd4a8539	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration for any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above, while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Hans Hasselberg	aacf0fd777	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
Freddy	875816d0d3	Internal endpoint to query intentions associated with a gateway (#8400)	2020-08-11 17:20:41 -06:00
Kyle Havlovitz	635952681e	Fix a state store comment about version	2020-08-11 13:46:12 -07:00
Kyle Havlovitz	c39a275666	fsm: Fix snapshot bug with restoring node/service/check indexes	2020-08-11 11:49:52 -07:00
Hans Hasselberg	aff02198d7	Refactor keyring ops: * changes some functions to return data instead of modifying pointer arguments * renames globalRPC() to keyringRPCs() to make its purpose more clear * restructures KeyringOperation() to make it more understandable	2020-08-11 13:42:03 +02:00
freddygv	de0b574a26	Update error handling	2020-08-10 17:48:22 -06:00
Daniel Nephin	80e99cb3e6	testing: remove unnecessary defers in tests The data directory is now removed by the test helper that created it.	2020-08-07 17:28:16 -04:00
Daniel Nephin	7dbacf297c	testing: Remove NotifyShutdown NotifyShutdown was only used for testing. Now that t.Cleanup exists, we can use that instead of attaching cleanup to the Server shutdown. The Autopilot test which used NotifyShutdown doesn't need this notification because Shutdown is synchronous. Waiting for the function to return is equivalent.	2020-08-07 17:14:44 -04:00
Hans Hasselberg	d316cd06c1	auto_config implies connect (#8433)	2020-08-07 12:02:02 +02:00
freddygv	15c3cfce5e	PR comments and addtl tests	2020-08-05 16:07:11 -06:00
Daniel Nephin	3b82ad0955	Rename NewClient/NewServer Now that duplicate constructors have been removed we can use the shorter names for the single constructor.	2020-08-05 14:00:55 -04:00
Daniel Nephin	5acf01ceeb	Remove LogOutput from Server	2020-08-05 14:00:44 -04:00
Daniel Nephin	0c5428eea8	Remove LogOutput from Client	2020-08-05 14:00:42 -04:00
Daniel Nephin	e8ee2cf2f7	Pass a logger to ConnPool and yamux, instead of an io.Writer Allowing us to remove the LogOutput field from config.	2020-08-05 13:25:08 -04:00
Daniel Nephin	1e17a0c3e1	config: Remove unused field	2020-08-05 13:25:08 -04:00
freddygv	0956624e39	collect GatewayServices from iter in a function	2020-07-31 13:30:40 -06:00
Freddy	f1e8addbdf	Avoid panics during shutdown routine (#8412)	2020-07-30 11:11:10 -06:00
freddygv	aa6c59dbfc	end to end changes to pass gatewayservices to /ui/services/	2020-07-30 10:21:11 -06:00
Matt Keeler	dbb461a5d3	Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394) Ensure that enabling AutoConfig sets the tls configurator properly This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.	2020-07-30 10:15:12 -04:00
Matt Keeler	34034b76f5	Agent Auto Config: Implement Certificate Generation (#8360) Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server. This also refactors the auto-config package a bit to split things out into multiple files.	2020-07-28 15:31:48 -04:00
Matt Keeler	5c2c762106	Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364) The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.	2020-07-24 10:00:51 -04:00
Matt Keeler	2ee9fe0a4d	Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363) This allows this to be reused elsewhere.	2020-07-23 16:05:28 -04:00
Daniel Nephin	3d115a62fd	Merge pull request #8323 from hashicorp/dnephin/add-event-publisher-2 stream: close subscriptions on shutdown	2020-07-23 13:12:50 -04:00
Matt Keeler	2713c0e682	Refactor the agentpb package (#8362) First move the whole thing to the top-level proto package name. Secondly change some things around internally to have sub-packages.	2020-07-23 11:24:20 -04:00
Daniel Nephin	ed69feca6d	stream: close all subs when EventProcessor is shutdown.	2020-07-22 19:04:10 -04:00
Daniel Nephin	a99a4103bd	stream: fix overallocation in filter And add tests	2020-07-22 19:04:10 -04:00
Daniel Nephin	9ed61fd160	state: speed up TestStateStore_ServicesByNodeMeta Make watchLimit a var so that we can patch it in tests and reduce the time spent creating state.	2020-07-22 16:57:06 -04:00
Daniel Nephin	0402dd7ac5	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	3570ce6566	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	3c09482864	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	51efba2c7d	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also causes data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely to be the result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	9da8c51ac5	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	9482961b1c	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	ecccb30690	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	2008884241	state: un-method funcs that don't use their receiver This change was mostly automated with the following. First generate a list of functions with: git grep -o 'Store) \([^(]\+\)(tx \*txn' ./agent/consul/state | awk '{print $2}' | grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state; file=${1-funcnames}; while read fn; do echo "$fn"; sed -i -e "s/(s \*Store) $fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.$fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go; done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	63b153df8c	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
Daniel Nephin	c07acbeb6b	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc. Only unwrap itemBuffer.Err when necessary	2020-07-14 15:57:47 -04:00
Daniel Nephin	f19f8e99bb	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	fc1c2ae412	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	6fa36e3aee	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcessors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	48c766d2c6	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	889c57fd2d	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	4fa0fdc0e0	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	489876c86b	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	5f9db94956	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	a709ed1ab5	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	17b833b4c9	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	2c8342f115	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	1622bb3a45	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	effab15131	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	a5cf933fe8	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	bbe7272d8e	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2a8a8f7b8d	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f5ecd5de5f	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	555cfe52d9	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	4e0bc8013b	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	aacd514dca	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	c0b0109e80	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Chris Piraino	4d857d117f	Set enterprise metadata after resolving the token (#8302) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	d98a4c1317	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Matt Keeler	4fb535ba48	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	f2f32735ce	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	d2e4869c7c	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Daniel Nephin	5247ef4c70	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	5d36f98710	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	f8e8f48125	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	10361dd210	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Matt Keeler	2ddcba00c6	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	19040f1166	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	39b567a55a	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
R.B. Boyer	462f0f37ed	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Daniel Nephin	a891ee8428	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	29d0cfdd7d	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	25a4f3c83b	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	010a609912	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	15e7b3940c	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	d6e05482ab	Allow cancelling startup when performing auto-config (#8157 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	b0ba546a1f	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	a00f007c5e	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	3dbbd2d37d	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	8b7d669a27	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	51c3a605ad	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79a862d019	Remove ACLEnforceVersion8 from tests (#8138 ) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	5afcf5c1bc	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	9b01f9423c	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	d31691dc87	gossip: Ensure that metadata of Consul Service is updated (#7903) While upgrading servers to a new version, I saw that the metadata of existing servers is not upgraded, so the version and raft meta are not up to date in the catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate That's because the optimization to avoid updating the catalog does not take metadata into account, so no update on the catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	d345cd8d30	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00
Daniel Nephin	a9851e1812	Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify ci: Enable gofmt simplify	2020-06-16 17:18:38 -04:00
Matt Keeler	9f7b22a5eb	Agent Auto Configuration: Configuration Syntax Updates (#8003 )	2020-06-16 15:03:22 -04:00
Daniel Nephin	068b43df90	Enable gofmt simplify Code changes done automatically with 'gofmt -s -w'	2020-06-16 13:21:11 -04:00
Daniel Nephin	cb050b280c	ci: enable SA4006 staticcheck check And fix the 'value not used' issues. Many of these are not bugs, but a few are tests not checking errors, and one appears to be a missed error in non-test code.	2020-06-16 13:10:11 -04:00
Daniel Nephin	f7c84ad802	Rename txnWrapper to txn	2020-06-16 13:06:02 -04:00
Daniel Nephin	32aa3ada35	Rename db	2020-06-16 13:04:31 -04:00
Daniel Nephin	deef6fcc32	Handle return value from txn.Commit	2020-06-16 13:04:31 -04:00
Daniel Nephin	59bac0f99d	state: Update docstrings for changeTrackerDB and txn And un-embed memdb.DB to prevent accidental access to underlying methods.	2020-06-16 13:04:31 -04:00
Paul Banks	f6ac08be04	state: track changes so that they may be used to produce change events	2020-06-16 13:04:29 -04:00
Matt Keeler	d3881dd754	ACL Node Identities (#7970) A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy). Half of this commit is for golden file based tests of the acl token and role cli output. Another big update was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boilerplate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.	2020-06-16 12:54:27 -04:00
freddygv	0f97b7d63d	Fixup stray sid references	2020-06-12 13:47:43 -06:00
freddygv	19e3954603	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
freddygv	15c74d6943	Move GatewayServices out of Internal	2020-06-12 13:46:47 -06:00
Daniel Nephin	08f1ed16b4	Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2 intentions: fix a bug in Intention.SetHash	2020-06-09 15:40:20 -04:00
Daniel Nephin	62a1125c7b	Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5 ci: Enabled SA2002 staticcheck check	2020-06-09 15:31:24 -04:00
Hans Hasselberg	242994a016	acl: do not resolve local tokens from remote dcs (#8068 )	2020-06-09 21:13:09 +02:00
Hans Hasselberg	98eea08d3b	Tokens converted from legacy ACLs get their Hash computed (#8047 ) * Fixes #5606: Tokens converted from legacy ACLs get their Hash computed This allows new style token replication to work for legacy tokens as well when they change. * tests: fix timestamp comparison Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2020-06-08 21:44:06 +02:00
Daniel Nephin	caa692deea	ci: Enabled SA2002 staticcheck check And handle errors in the main test goroutine	2020-06-05 17:50:11 -04:00
Daniel Nephin	ce6cc094a1	intentions: fix a bug in Intention.SetHash Found using staticcheck. binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash. Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used. Removing these would break backwards compatibility, so they are left in place for now. Co-authored-by: Hans Hasselberg <me@hans.io>	2020-06-05 14:51:43 -04:00
R.B. Boyer	b88bd6660e	server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014 )	2020-06-04 16:05:27 -05:00
Hans Hasselberg	0f343332da	Merge pull request #7966 from hashicorp/pool_improvements Agent connection pool cleanup	2020-06-04 08:56:26 +02:00
Matt Keeler	771c613dae	Fix legacy management tokens in unupgraded secondary dcs (#7908 ) The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.	2020-06-03 11:22:22 -04:00
Matt Keeler	0e4c65d422	Fix segfault due to race condition for checking server versions (#7957 ) The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.	2020-06-03 10:36:32 -04:00
Daniel Nephin	99eb583ebc	Replace goe/verify.Values with testify/require.Equal (#7993) * testing: replace most goe/verify.Values with require.Equal One difference between these two comparisons is that goe/verify considers nil slices/maps to be equal to empty slices/maps, whereas testify/require does not, and does not appear to provide any way to enable that behaviour. Because of this difference some expected values were changed from empty slices to nil slices, and some calls to verify.Values were left. * Remove github.com/pascaldekloe/goe/verify Reduce the number of assertion packages we use from 2 to 1	2020-06-02 12:41:25 -04:00
R.B. Boyer	833211c14c	acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899 )	2020-06-01 11:44:47 -05:00
R.B. Boyer	ffb9c7d6f7	acl: remove the deprecated `acl_enforce_version_8` option (#7991 ) Fixes #7292	2020-05-29 16:16:03 -05:00
Jono Sosulska	c554ba9e10	Replace whitelist/blacklist terminology with allowlist/denylist (#7971 ) * Replace whitelist/blacklist terminology with allowlist/denylist	2020-05-29 14:19:16 -04:00
Hans Hasselberg	1fbc1d4777	pool: remove timeout parameter Timeout was never used in a meaningful way by callers, which is why it is now entirely internal to the pool.	2020-05-29 08:21:28 +02:00
Hans Hasselberg	ad03f863ff	pool: remove useTLS and ForceTLS In the past TLS usage was enforced with these variables, but these days this decision is made by TLSConfigurator and there is no reason to keep using the variables.	2020-05-29 08:21:24 +02:00
Hans Hasselberg	c45432014b	pool: remove version The version field has been used to decide which multiplexing to use. It was introduced in `2457293dce`. But that was 6 years ago and there is no need for this differentiation anymore.	2020-05-28 23:06:01 +02:00
Daniel Nephin	c88fae0aac	ci: Add staticcheck and fix most errors Three of the checks are temporarily disabled to limit the size of the diff, and allow us to enable all the other checks in CI. In a follow up we can fix the issues reported by the other checks one at a time, and enable them.	2020-05-28 11:59:58 -04:00
R.B. Boyer	77f2e54618	create lib/stringslice package (#7934 )	2020-05-27 11:47:32 -05:00
R.B. Boyer	ddd0a13e27	agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931 ) The main fix here is to always union the `primary-gateways` list with the list of mesh gateways in the primary returned from the replicated federation states list. This will allow any replicated (incorrect) state to be supplemented with user-configured (correct) state in the config file. Eventually the game of random selection whack-a-mole will pick a winning entry and re-replicate the latest federation states from the primary. If the user-configured state is actually the incorrect one, then the same eventual correct selection process will work in that case, too. The secondary fix is actually to finish making wanfed-via-mgws actually work as originally designed. Once a secondary datacenter has replicated federation states for the primary AND managed to stand up its own local mesh gateways then all of the RPCs from a secondary to the primary SHOULD go through two sets of mesh gateways to arrive in the consul servers in the primary (one hop for the secondary datacenter's mesh gateway, and one hop through the primary datacenter's mesh gateway). This was neglected in the initial implementation. While everything works, ideally we should treat communications that go around the mesh gateways as just provided for bootstrapping purposes. Now we heuristically use the success/failure history of the federation state replicator goroutine loop to determine if our current mesh gateway route is working as intended. If it is, we try using the local gateways, and if those don't work we fall back on trying the primary via the union of the replicated state and the go-discover configuration flags. This can be improved slightly in the future by possibly initializing the gateway choice to local on startup if we already have replicated state. This PR does not address that improvement. Fixes #7339	2020-05-27 11:31:10 -05:00
R.B. Boyer	1b5023cb69	connect: ensure proxy-defaults protocol is used for upstreams (#7938 )	2020-05-21 16:08:39 -05:00
Daniel Nephin	04bf0f3490	Update agent/consul/state/catalog.go Co-authored-by: Hans Hasselberg <me@hans.io>	2020-05-20 16:34:14 -04:00
Daniel Nephin	3f607d9ef0	state: use an error to indicate compare failed Errors are values. We can use the error value to identify the 'comparison failed' case, which makes the function easier to use and should make it harder to mishandle the error case	2020-05-20 12:43:33 -04:00
Pierre Souchay	e9d176db2a	Allow to restrict servers that can join a given Serf Consul cluster. (#7628) Based on work done in https://github.com/hashicorp/memberlist/pull/196 this allows restricting the IP ranges that can join a given Serf cluster and be a member of the cluster. Restrictions on IPs can be set separately using 2 new flags and config options to restrict IPs for LAN and WAN Serf.	2020-05-20 11:31:19 +02:00
Daniel Nephin	1bbea2751f	consul/state: refactor tnxService to avoid missed cases Handling errors at the end of a log switch/case block is somewhat brittle. This block included a couple cases where errors were ignored, but it was not obvious the way it was written. This change moves all error handling into each case block. There is still potentially one case where err is ignored, which will be handled in a follow up.	2020-05-19 16:50:14 -04:00
Daniel Nephin	c662f0f0de	Fix a number of problems found by staticcheck Some of these problems are minor (unused vars), but others are real bugs (ignored errors). Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-05-19 16:50:14 -04:00
Daniel Nephin	5c99109dd9	Remove unused var The usage of this var was removed in `b92f895c23`. Found by using staticcheck	2020-05-19 16:50:14 -04:00
Chris Piraino	79468793f6	Do not return an error if requested service is not a gateway This commit converts the previous error into just a Warn-level log message. By returning an error when the requested service was not a gateway, we did not appropriately update envoy because the cache Fetch returned an error and thus did not propagate the update through proxycfg and xds packages.	2020-05-18 09:08:04 -05:00
Aleksandr Zagaevskiy	a75e3d9051	Preserve ModifyIndex for unchanged entry in KVS TXN (#7832 )	2020-05-14 13:25:04 -06:00
Matt Keeler	acccdbe45c	Fix identity resolution on clients and in secondary dcs (#7862 ) Previously this happened to be using the method on the Server/Client that was meant to allow the ACLResolver to locally resolve tokens. On Servers that had tokens (primary or secondary dc + token replication) this function would lookup the token from raft and return the ACLIdentity. On clients this was always a noop. We inadvertently used this function instead of creating a new one when we added logging accessor ids for permission denied RPC requests. With this commit, a new method is used for resolving the identity properly via the ACLResolver which may still resolve locally in the case of being on a server with tokens but also supports remote token resolution.	2020-05-13 13:00:08 -04:00
Chris Piraino	7a7760bfd5	Make new gateway tests compatible with enterprise (#7856 )	2020-05-12 13:48:20 -05:00
Daniel Nephin	600645b5f9	Add unconvert linter To find unnecessary type conversions	2020-05-12 13:47:25 -04:00
Drew Bailey	c9d0b83277	Value is already an int, remove type cast	2020-05-12 13:13:09 -04:00
R.B. Boyer	1efafd7523	acl: add auth method for JWTs (#7846 )	2020-05-11 20:59:29 -05:00
Chris Piraino	c21052457b	Return early from updateGatewayServices if nothing to update (#7838 ) * Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table * PR comment and better formatting	2020-05-11 14:46:48 -05:00
Chris Piraino	4d6751bf16	Fix TestInternal_GatewayServiceDump_Ingress (#7840 ) Protocol was added as a field on GatewayServices after GatewayServiceDump PR branch was created.	2020-05-11 14:46:31 -05:00
R.B. Boyer	7414a3fa53	cli: ensure 'acl auth-method update' doesn't deep merge the Config field (#7839 )	2020-05-11 14:21:17 -05:00
Chris Piraino	74c0543ef2	PR comment and better formatting	2020-05-11 14:04:59 -05:00
Chris Piraino	fb9ee9d892	Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table	2020-05-11 12:38:04 -05:00
Freddy	b3ec383d04	Gateway Services Nodes UI Endpoint (#7685 ) The endpoint supports queries for both Ingress Gateways and Terminating Gateways. Used to display a gateway's linked services in the UI.	2020-05-11 11:35:17 -06:00
Chris Piraino	429d0cedd2	Restoring config entries updates the gateway-services table (#7811 ) - Adds a new validateConfigEntryEnterprise function - Also fixes some state store tests that were failing in enterprise	2020-05-08 13:24:33 -05:00
Freddy	c32a4f1ece	Fix up enterprise compatibility for gateways (#7813 )	2020-05-08 09:44:34 -06:00
Jono Sosulska	9b363e9f23	Fix spelling of deregister (#7804 )	2020-05-08 10:03:45 -04:00
Chris Piraino	5105bf3d67	Require individual services in ingress entry to match protocols (#7774 ) We require any non-wildcard services to match the protocol defined in the listener on write, so that we can maintain a consistent experience through ingress gateways. This also helps guard against accidental misconfiguration by a user. - Update tests that require an updated protocol for ingress gateways	2020-05-06 16:09:24 -05:00
Chris Piraino	905279f5d1	A proxy-default config entry only exists in the default namespace	2020-05-06 15:06:14 -05:00
Chris Piraino	114a18e890	Remove outdated comment	2020-05-06 15:06:14 -05:00
Kyle Havlovitz	89e6b16815	Filter wildcard gateway services to match listener protocol This now requires some type of protocol setting in ingress gateway tests to ensure the services are not filtered out. - small refactor to add a max(x, y) function - Use internal configEntryTxn function and add MaxUint64 to lib	2020-05-06 15:06:13 -05:00
Chris Piraino	f40833d094	Allow Hosts field to be set on an ingress config entry - Validate that this cannot be set on a 'tcp' listener nor on a wildcard service. - Add Hosts field to api and test in consul config write CLI - xds: Configure envoy with user-provided hosts from ingress gateways	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	711d1389aa	Support multiple listeners referencing the same service in gateway definitions	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	247f9eaf13	Allow ingress gateways to route traffic based on Host header This commit adds the necessary changes to allow an ingress gateway to route traffic from a single defined port to multiple different upstream services in the Consul mesh. To do this, we now require all HTTP requests coming into the ingress gateway to specify a Host header that matches "<service-name>.*" in order to correctly route traffic to the correct service. - Differentiate multiple listener's route names by port - Adds a case in xds for allowing default discovery chains to create a route configuration when on an ingress gateway. This allows default services to easily use host header routing - ingress-gateways have a single route config for each listener that utilizes domain matching to route to different services.	2020-05-06 15:06:13 -05:00
R.B. Boyer	a854e4d9c5	acl: oss plumbing to support auth method namespace rules in enterprise (#7794 ) This includes website docs updates.	2020-05-06 13:48:04 -05:00
R.B. Boyer	3242d0816d	test: make the kube auth method test helper use freeport (#7788 )	2020-05-05 16:55:21 -05:00
Hans Hasselberg	096a2f2f02	network_segments: stop advertising segment tags	2020-05-05 21:32:05 +02:00
Hans Hasselberg	995a24b8e4	agent: refactor to use a single addrFn	2020-05-05 21:08:10 +02:00
Hans Hasselberg	6994c0d47f	agent: rename local/global to src/dst	2020-05-05 21:07:34 +02:00
Chris Piraino	69b44fb942	Construct a default destination if one does not exist for service-router (#7783 )	2020-05-05 10:49:50 -05:00
R.B. Boyer	22eb016153	acl: add MaxTokenTTL field to auth methods (#7779 ) When set to a non zero value it will limit the ExpirationTime of all tokens created via the auth method.	2020-05-04 17:02:57 -05:00
R.B. Boyer	ca52ba7068	acl: add DisplayName field to auth methods (#7769 ) Also add a few missing acl fields in the api.	2020-05-04 15:18:25 -05:00
Hans Hasselberg	c4093c87cc	agent: don't let left nodes hold onto their node-id (#7747 )	2020-05-04 18:39:08 +02:00
Matt Keeler	daec810e34	Merge pull request #7714 from hashicorp/oss-sync/msp-agent-token	2020-05-04 11:33:50 -04:00
R.B. Boyer	9533451a63	acl: refactor the authmethod.Validator interface (#7760 ) This is a collection of refactors that make upcoming PRs easier to digest. The main change is the introduction of the authmethod.Identity struct. In the one and only current auth method (type=kubernetes) all of the trusted identity attributes are both selectable and projectable, so they were just passed around as a map[string]string. When namespaces were added, this was slightly changed so that the enterprise metadata can also come back from the login operation, so login now returned two fields. Now with some upcoming auth methods it won't be true that all identity attributes will be both selectable and projectable, so rather than update the login function to return 3 pieces of data it seemed worth it to wrap those fields up and give them a proper name.	2020-05-01 17:35:28 -05:00
R.B. Boyer	54ba8e3868	acl: change authmethod.Validator to take a logger (#7758 )	2020-05-01 15:55:26 -05:00
R.B. Boyer	b282268408	sdk: extracting testutil.RequireErrorContains from various places it was duplicated (#7753 )	2020-05-01 11:56:34 -05:00
Hans Hasselberg	51549bd232	rpc: oss changes for network area connection pooling (#7735 )	2020-04-30 22:12:17 +02:00
Freddy	021f0ee36e	Watch fallback channel for gateways that do not exist (#7715 ) Also ensure that WatchSets in tests are reset between calls to watchFired. Any time a watch fires, subsequent calls to watchFired on the same WatchSet will also return true even if there were no changes.	2020-04-29 16:52:27 -06:00
Matt Keeler	bec3fb7c18	Some boilerplate to allow for ACL Bootstrap disabling configurability	2020-04-28 09:42:46 -04:00
Freddy	137a2c32c6	TLS Origination for Terminating Gateways (#7671 )	2020-04-27 16:25:37 -06:00
freddygv	3afe816a94	Clean up dead code, issue addressed by passing ws to serviceGatewayNodes	2020-04-27 11:08:41 -06:00
freddygv	77bb2f1002	Fix internal endpoint test	2020-04-27 11:08:41 -06:00
freddygv	915db10903	Avoid deleting mappings for services linked to other gateways on dereg	2020-04-27 11:08:41 -06:00
freddygv	cd28d4125d	Re-fix bug in CheckConnectServiceNodes	2020-04-27 11:08:41 -06:00
freddygv	9f233dece2	Fix ConnectQueryBlocking test	2020-04-27 11:08:40 -06:00
freddygv	86342e4bca	Fix bug in CheckConnectServiceNodes Previously, if a blocking query called CheckConnectServiceNodes before the gateway-services memdb table had any entries, a nil watchCh would be returned when calling serviceTerminatingGatewayNodes. This means that the blocking query would not fire if a gateway config entry was added after the watch started. In cases where the blocking query started on proxy registration, the proxy could potentially never become aware of an upstream endpoint if that upstream was going to be represented by a gateway.	2020-04-27 11:08:40 -06:00
Matt Keeler	a1648c61ae	A couple testing helper updates (#7694 )	2020-04-27 12:17:38 -04:00
Kit Patella	e2467f4b2c	Merge pull request #7656 from hashicorp/feature/audit/oss-merge agent: stub out auditing functionality in OSS	2020-04-17 13:33:06 -07:00
Chris Piraino	6ef8ae9965	Fix bug where non-typical services are associated with gateways (#7662) On every service registration, we check to see if a service should be associated to a wildcard gateway-service. This fixes an issue where we did not correctly check to see if the service being registered was a "typical" service or not.	2020-04-17 11:24:34 -05:00
Kit Patella	927f584761	agent: stub out auditing functionality in OSS	2020-04-16 15:07:52 -07:00
Kyle Havlovitz	e9e8c0e730	Ingress Gateways for TCP services (#7509 ) * Implements a simple, tcp ingress gateway workflow This adds a new type of gateway for allowing Ingress traffic into Connect from external services. Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-04-16 14:00:48 -07:00
Matt Keeler	6a78c24d67	Update the Client code to use the common version checking infra… (#7558 ) Also reduce the log level of some version checking messages on the server as they can be pretty noisy during upgrades and really are more for debugging purposes.	2020-04-14 11:54:27 -04:00
Matt Keeler	da893c36a1	Allow the bootstrap endpoint to be disabled in enterprise. (#7614 )	2020-04-14 11:45:39 -04:00
Pierre Souchay	1b4218a068	fix flaky TestReplication_FederationStates test due to race conditions (#7612 ) The test had two racy bugs related to memdb references. The first was when we initially populated data and retained the FederationState objects in a slice. Due to how the `inmemCodec` works these were actually the identical objects passed into memdb. The second was that the `checkSame` assertion function was reading from memdb and setting the RaftIndexes to zeros to aid in equality checks. This was mutating the contents of memdb which is a no-no. With this fix, the command: ``` i=0; while /usr/local/bin/go test -count=1 -timeout 30s github.com/hashicorp/consul/agent/consul -run '^(TestReplication_FederationStates)$'; do i=$((i + 1)); printf "$i "; done ``` That used to break on my machine in less than 20 runs is now running 150+ times without any issue. Might also fix #7575	2020-04-09 15:42:41 -05:00
Freddy	9eb1867fbb	Terminating gateway discovery (#7571 ) * Enable discovering terminating gateways * Add TerminatingGatewayServices to state store * Use GatewayServices RPC endpoint for ingress/terminating	2020-04-08 12:37:24 -06:00
Matt Keeler	0e7d3d93b3	Enable filtering language support for the v1/connect/intentions… (#7593 ) * Enable filtering language support for the v1/connect/intentions listing API * Update website for filtering of Intentions * Update website/source/api/connect/intentions.html.md	2020-04-07 11:48:44 -04:00
Matt Keeler	8aec09aa8f	Ensure that token clone copies the roles (#7577 )	2020-04-02 12:09:35 -04:00
Emre Savcı	2083b7b04d	agent: add len, cap while initializing arrays	2020-04-01 10:54:51 +02:00
Freddy	90576060bc	Add config entry for terminating gateways (#7545 ) This config entry will be used to configure terminating gateways. It accepts the name of the gateway and a list of services the gateway will represent. For each service users will be able to specify: its name, namespace, and additional options for TLS origination. Co-authored-by: Kyle Havlovitz <kylehav@gmail.com> Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-03-31 13:27:32 -06:00
Kyle Havlovitz	c911174327	Add config entry/state for Ingress Gateways (#7483 ) * Add Ingress gateway config entry and other relevant structs * Add api package tests for ingress gateways * Embed EnterpriseMeta into ingress service struct * Add namespace fields to api module and test consul config write decoding * Don't require a port for ingress gateways * Add snakeJSON and camelJSON cases in command test * Run Normalize on service's ent metadata Sadly cannot think of a way to test this in OSS. * Every protocol requires at least 1 service * Validate ingress protocols * Update agent/structs/config_entry_gateways.go Co-authored-by: Chris Piraino <cpiraino@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2020-03-31 11:59:10 -05:00
Matt Keeler	028654410c	Ensure server requirements checks are done against ALL known se… (#7491 ) Co-authored-by: Paul Banks <banks@banksco.de>	2020-03-27 12:31:43 -04:00
Daniel Nephin	b5c7d292e4	Merge pull request #7516 from hashicorp/dnephin/remove-unused-method agent: Remove unused method Encrypted from delegate interface	2020-03-26 14:17:58 -04:00
Daniel Nephin	bb8833a2d5	agent: Remove unused Encrypted from interface It appears to be unused. It looks like it has been around a while, I guess at some point we stopped using this method.	2020-03-26 12:34:31 -04:00
Freddy	18d356899c	Enable CLI to register terminating gateways (#7500 ) * Enable CLI to register terminating gateways * Centralize gateway proxy configuration	2020-03-26 10:20:56 -06:00
Alejandro Baez	bafa69bb69	Add PolicyReadByName for API (#6615 )	2020-03-25 10:34:24 -04:00
Matt Keeler	80db61193c	Fix ACL mode advertisement and detection (#7451 ) These changes are necessary to ensure advertisement happens correctly even when datacenters are connected via network areas in Consul enterprise. This also changes how we check if ACLs can be upgraded within the local datacenter. Previously we would iterate through all LAN members. Now we just use the ServerLookup type to iterate through all known servers in the DC.	2020-03-16 12:54:45 -04:00
Freddy	709932f088	Update MSP token and filtering (#7431 )	2020-03-11 12:08:49 -06:00
R.B. Boyer	85a08bf8ed	server: strip local ACL tokens from RPCs during forwarding if crossing datacenters (#7419 ) Fixes #7414	2020-03-10 11:15:22 -05:00
Kyle Havlovitz	955ee64b95	Merge pull request #7373 from hashicorp/acl-segments-fix Add stub methods for ACL/segment bug fix from enterprise	2020-03-09 14:25:49 -07:00
R.B. Boyer	6adad71125	wan federation via mesh gateways (#6884 ) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Matt Keeler	7584dfe8c8	Fix session backwards incompatibility with 1.6.x and earlier.	2020-03-05 15:34:55 -05:00
Kyle Havlovitz	7c57837908	Add stub methods for ACL/segment bug fix from enterprise	2020-03-02 10:30:23 -08:00
rerorero	2630a949f7	fix: Destroying a session that doesn't exist returns status cod… (#6905 ) fix #6840	2020-02-18 11:13:15 -05:00
Matt Keeler	b137060630	Allow the PolicyResolve and RoleResolve endpoints to process na… (#7296 )	2020-02-13 14:55:27 -05:00
R.B. Boyer	80b1165976	fix use of hclog logger (#7264 )	2020-02-12 09:37:16 -06:00
ShimmerGlass	68e0f6bf84	agent: add server raft.{last,applied}_index gauges (#6694 ) These metrics are useful for : * Tracking the rate of update to the db * Allow to have a rough idea of when an index originated	2020-02-11 10:50:18 +01:00
Hans Hasselberg	6739fe6e83	connect: add validations around intermediate cert ttl (#7213 )	2020-02-11 00:05:49 +01:00
R.B. Boyer	73ba5d9990	make the TestRPC_RPCMaxConnsPerClient test less flaky (#7255 )	2020-02-10 15:13:53 -06:00
Sarah Christoff	6678c8898a	Fix flaky TestAutopilot_BootstrapExpect (#7242 )	2020-02-10 14:52:58 -06:00
Kit Patella	55f19a9eb2	rpc: measure blocking queries (#7224 ) * agent: measure blocking queries * agent.rpc: update docs to mention we only record blocking queries * agent.rpc: make go fmt happy * agent.rpc: fix non-atomic read and decrement with bitwise xor of uint64 0 * agent.rpc: clarify review question * agent.rpc: today I learned that one must declare all variables before interacting with goto labels * Update agent/consul/server.go agent.rpc: more precise comment on `Server.queriesBlocking` Co-Authored-By: Paul Banks <banks@banksco.de> * Update website/source/docs/agent/telemetry.html.md agent.rpc: improve queries_blocking description Co-Authored-By: Paul Banks <banks@banksco.de> * agent.rpc: fix some bugs found in review * add a note about the updated counter behavior to telemetry.md * docs: add upgrade-specific note on consul.rpc.quer{y,ies_blocking} behavior Co-authored-by: Paul Banks <banks@banksco.de>	2020-02-10 10:01:15 -08:00
Matt Keeler	d0cd092e3b	Catalog + Namespace OSS changes. (#7219 ) * Various Prepared Query + Namespace things * Last round of OSS changes for a namespaced catalog	2020-02-10 10:40:44 -05:00
R.B. Boyer	8c596953b0	agent: ensure that we always use the same settings for msgpack (#7245 ) We set RawToString=true so that []uint8 => string when decoding an interface{}. We set the MapType so that map[interface{}]interface{} decodes to map[string]interface{}. Add tests to ensure that this doesn't break existing usages. Fixes #7223	2020-02-07 15:50:24 -06:00
Freddy	01855d8579	Remove outdated TODO (#7244 )	2020-02-07 13:14:48 -07:00
Matt Keeler	444517080b	Fix a bug with ACL enforcement of reads on namespaced config entries. (#7239 )	2020-02-07 08:30:40 -05:00
Kit Patella	9a220f3010	agent/consul server: fix LeaderTest_ChangeNodeID (#7236 ) * fix LeaderTest_ChangeNodeID to use StatusLeft and add waitForAnyLANLeave * unextract the waitFor... fn, simplify, and provide a more descriptive error	2020-02-06 16:37:53 -08:00
Matt Keeler	9e5fd7f925	OSS Changes for various config entry namespacing bugs (#7226 )	2020-02-06 10:52:25 -05:00
R.B. Boyer	0ecb4538c1	agent: differentiate wan vs lan loggers in memberlist and serf (#7205 ) This should be a helpful change until memberlist and serf can be properly switched to native hclog.	2020-02-05 09:52:43 -06:00
Matt Keeler	dceb107325	Fix disco chain graph validation for namespaces (#7217 ) Previously this happened to be validating only the chains in the default namespace. Now it will validate all chains in all namespaces when the global proxy-defaults is changed.	2020-02-05 10:06:27 -05:00
Matt Keeler	228da48f5d	Minor Non-Functional Updates (#7215 ) * Cleanup the discovery chain compilation route handling Nothing functionally should be different here. The real difference is that when creating new targets or handling route destinations we use the router config entries name and namespace instead of that of the top level request. Today they SHOULD always be the same but that may not always be the case. This hopefully also makes it easier to understand how the router entries are handled. * Refactor a small bit of the service manager tests in oss We used to use the stringHash function to compute part of the filename where things would get persisted to. This has been changed in the core code to calling the StringHash method on the ServiceID type. It just so happens that the new method will output the same value for anything in the default namespace (by design actually). However, logically this filename computation in the test should do the same thing as the core code itself so I updated it here. Also of note is that newer enterprise-only tests for the service manager cannot use the old stringHash function at all because it will produce incorrect results for non-default namespaces.	2020-02-05 10:06:11 -05:00
Freddy	cb77fc6d01	Add managed service provider token (#7218 ) Stubs for enterprise-only ACL token to be used by managed service providers.	2020-02-04 13:58:56 -07:00
Hans Hasselberg	f6ec8ed92b	agent: increase watchLimit to 8192. (#7200 ) The previous value was too conservative and users with many instances were having problems because of it. This change increases the limit to 8192 which reportedly fixed most of the issues with that. Related: #4984, #4986, #5050.	2020-02-04 13:11:30 +01:00
Davor Kapsa	3cb4def563	auto_encrypt: check previously ignored error (#6604 )	2020-02-03 10:35:11 +01:00
Hans Hasselberg	5531678e9e	Security fixes (#7182 ) * Mitigate HTTP/RPC Services Allow Unbounded Resource Usage Fixes #7159. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>	2020-01-31 11:19:37 -05:00
Matt Keeler	6855a778c2	Updates to the Txn API for namespaces (#7172 ) * Updates to the Txn API for namespaces * Update agent/consul/txn_endpoint.go Co-Authored-By: R.B. Boyer <rb@hashicorp.com> Co-authored-by: R.B. Boyer <public@richardboyer.net>	2020-01-30 13:12:26 -05:00
Matt Keeler	61d8778210	Sync some feature flag support from enterprise (#7167 )	2020-01-29 13:21:38 -05:00
R.B. Boyer	d78b5008ce	various tweaks on top of the hclog work (#7165 )	2020-01-29 11:16:08 -06:00
Chris Piraino	401221de58	Allow users to configure either unstructured or JSON logging (#7130 ) * hclog Allow users to choose between unstructured and JSON logging	2020-01-28 17:50:41 -06:00
Kit Patella	0d336edb65	Add accessorID of token when ops are denied by ACL system (#7117 ) * agent: add and edit doc comments * agent: add ACL token accessorID to debugging traces * agent: polish acl debugging * agent: minor fix + string fmt over value interp * agent: undo export & fix logging field names * agent: remove note and migrate up to code review * Update agent/consul/acl.go Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com> * agent: incorporate review feedback * Update agent/acl.go Co-Authored-By: R.B. Boyer <public@richardboyer.net> Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: R.B. Boyer <public@richardboyer.net>	2020-01-27 11:54:32 -08:00
Matt Keeler	c09693e545	Updates to Config Entries and Connect for Namespaces (#7116 )	2020-01-24 10:04:58 -05:00
Matt Keeler	bbc2eb1951	Add the v1/catalog/node-services/:node endpoint (#7115 ) The backing RPC already existed but the endpoint will be useful for other service syncing processes such as consul-k8s as this endpoint can return all services registered with a node regardless of namespacing.	2020-01-24 09:27:25 -05:00
Hans Hasselberg	7d6ea82527	raft: increase raft notify buffer. (#6863 ) * Increase raft notify buffer. Fixes https://github.com/hashicorp/consul/issues/6852. Increasing the buffer helps recovering from leader flapping. It lowers the chances of the flapping leader to get into a deadlock situation like described in #6852.	2020-01-22 16:15:59 +01:00
Hans Hasselberg	f0fc9aea7f	tests: fix autopilot test (#7092 )	2020-01-21 14:09:51 +01:00
Hans Hasselberg	9c1361c02b	raft: update raft to v1.1.2 (#7079 ) * update raft * use hclogger for raft.	2020-01-20 13:58:02 +01:00
Hans Hasselberg	804eb17094	connect: check if intermediate cert needs to be renewed. (#6835 ) Currently when using the built-in CA provider for Connect, root certificates are valid for 10 years, however secondary DCs get intermediates that are valid for only 1 year. There is no mechanism currently short of rotating the root in the primary that will cause the secondary DCs to renew their intermediates. This PR adds a check that renews the cert if it is half way through its validity period. In order to be able to test these changes, a new configuration option was added: IntermediateCertTTL which is set extremely low in the tests.	2020-01-17 23:27:13 +01:00
Hans Hasselberg	87f32c8ba6	auto_encrypt: set dns and ip san for k8s and provide configuration (#6944 ) * Add CreateCSRWithSAN * Use CreateCSRWithSAN in auto_encrypt and cache * Copy DNSNames and IPAddresses to cert * Verify auto_encrypt.sign returns cert with SAN * provide configuration options for auto_encrypt dnssan and ipsan * rename CreateCSRWithSAN to CreateCSR	2020-01-17 23:25:26 +01:00
Matej Urbas	ce023359fe	agent: configurable MaxQueryTime and DefaultQueryTime. (#3777 )	2020-01-17 14:20:57 +01:00
Matt Keeler	663cf1e9a8	AuthMethod updates to support alternate namespace logins (#7029 )	2020-01-14 10:09:29 -05:00
Matt Keeler	8bd34e126f	Intentions ACL enforcement updates (#7028 ) * Renamed structs.IntentionWildcard to structs.WildcardSpecifier * Refactor ACL Config Get rid of remnants of enterprise only renaming. Add a WildcardName field for specifying what string should be used to indicate a wildcard. * Add wildcard support in the ACL package For read operations they can call anyAllowed to determine if any read access to the given resource would be granted. For write operations they can call allAllowed to ensure that write access is granted to everything. * Make v1/agent/connect/authorize namespace aware * Update intention ACL enforcement This also changes how intention:read is granted. Before the Intention.List RPC would allow viewing an intention if the token had intention:read on the destination. However Intention.Match allowed viewing if access was allowed for either the source or dest side. Now Intention.List and Intention.Get fall in line with Intention.Matches previous behavior. Due to this being done a few different places ACL enforcement for a singular intention is now done with the CanRead and CanWrite methods on the intention itself. * Refactor Intention.Apply to make things easier to follow.	2020-01-13 15:51:40 -05:00
Pierre Souchay	3bf2e640c7	rpc: log method when a server/server RPC call fails (#4548 ) Sometimes, we have lots of errors in cross calls between DCs (several hundreds / sec) Enrich the log in order to help diagnose the root cause of issue.	2020-01-13 19:55:29 +01:00
R.B. Boyer	10f04a8c4a	connect: derive connect certificate serial numbers from a memdb index instead of the provider table max index (#7011 )	2020-01-09 16:32:19 +01:00
R.B. Boyer	50c879923c	connect: ensure that updates to the secondary root CA configuration use the correct signing key ID values for comparison (#7012 ) Fixes #6886	2020-01-09 16:28:16 +01:00
R.B. Boyer	abb1603a86	Restore a few more service-kind index updates so blocking in ServiceDump works in more cases (#6948 ) Restore a few more service-kind index updates so blocking in ServiceDump works in more cases Namely one omission was that check updates for dumped services were not unblocking. Also adds a ServiceDump state store test and also fix a watch bug with the normal dump. Follow-on from #6916	2019-12-19 10:15:37 -06:00
Matt Keeler	a78f7d7a34	OSS changes for implementing token based namespace inferencing remove debug log	2019-12-18 14:07:08 -05:00
Matt Keeler	be9ba707ba	Unflake the TestACLEndpoint_TokenList test In order to do this I added a waitForLeaderEstablishment helper which does the right thing to ensure that leader establishment has finished. fixup	2019-12-18 14:07:07 -05:00
Matt Keeler	80d13d500b	Miscellaneous acl package cleanup • Renamed EnterpriseACLConfig to just Config • Removed chained_authorizer_oss.go as it was empty • Renamed acl.go to errors.go to more closely describe its contents	2019-12-18 13:44:32 -05:00
Matt Keeler	0b346616e9	Rename EnterpriseAuthorizerContext -> AuthorizerContext	2019-12-18 13:43:24 -05:00
Preetha	c47dbffe1c	autopilot: fix dead server removal condition to use correct failure tolerance (#4017 ) * Make dead server removal condition in autopilot use correct failure tolerance rules * Introduce func with explanation	2019-12-16 23:35:13 +01:00
Matt Keeler	e81e338260	Fix blocking for ServiceDumping by kind (#6919 )	2019-12-10 13:58:30 -05:00
Matt Keeler	5934f803bf	Sync of OSS changes to support namespaces (#6909 )	2019-12-09 21:26:41 -05:00
Hans Hasselberg	2ad0831b34	agent: fewer file local differences between enterprise and oss (#6820 ) (#6898 ) * Increase number to test ignore. Consul Enterprise has more flags and since we are trying to reduce the differences between both code bases, we are increasing the number in oss. The semantics don't change, it is just a cosmetic thing. * Introduce agent.initEnterprise for enterprise related hooks. * Sync test with ent version. * Fix import order. * revert error wording.	2019-12-06 21:35:58 +01:00
Matt Keeler	8f0ab0129e	Miscellaneous Fixes (#6896 ) Ensure we close the Sentinel Evaluator so as not to leak go routines Fix a bunch of test logging so that various warnings when starting a test agent go to the ltest logger and not straight to stdout. Various canned ent meta types always return a valid pointer (no more nils). This allows us to blindly deref + assign in various places. Update ACL index tracking to ensure oss -> ent upgrades will work as expected. Update ent meta parsing to include function to disallow wildcarding.	2019-12-06 14:01:34 -05:00
Matt Keeler	deb91f3d3c	[Feature] API: Add a internal endpoint to query for ACL authori… (#6888 ) * Implement endpoint to query whether the given token is authorized for a set of operations * Updates to allow for remote ACL authorization via RPC This is only used when making an authorization request to a different datacenter.	2019-12-06 09:25:26 -05:00
Matt Keeler	71b1c9cc3c	Fix the TestLeader_SecondaryCA_IntermediateRefresh test flakiness	2019-12-04 19:19:55 -05:00
Matt Keeler	b069d6777b	OSS KV Modifications to Support Namespaces	2019-11-25 12:57:35 -05:00
Matt Keeler	7b471f6bf8	OSS Modifications necessary for sessions namespacing	2019-11-25 12:07:04 -05:00
Paul Banks	cd1b613352	connect: Add AWS PCA provider (#6795 ) * Update AWS SDK to use PCA features. * Add AWS PCA provider * Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user * Unparallel the tests so we don't exhaust PCA limits * Merge updates * More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create * Add AWS PCA docs * Fix Vault doc typo too * Doc typo * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com> Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com> * Doc fixes; tests for erroring if State is modified via API * More review cleanup * Uncomment tests! * Minor suggested clean ups	2019-11-21 17:40:29 +00:00
Paul Banks	d7329097b2	Change CA Configure struct to pass Datacenter through (#6775 ) * Change CA Configure struct to pass Datacenter through * Remove connect/ca/plugin as we don't have immediate plans to use it. We still intend to one day but there are likely to be several changes to the CA provider interface before we do so it's better to rebuild from history when we do that work properly. * Rename PrimaryDC; fix endpoint in secondary DCs	2019-11-18 14:22:19 +00:00
Paul Banks	b621910618	Support Connect CAs that can't cross sign (#6726 ) * Support Connect CAs that can't cross sign * revert spurious mod changes from make tools * Add log warning when forcing CA rotation * Fixup SupportsCrossSigning to report errors and work with Plugin interface (fixes tests) * Fix failing snake_case test * Remove misleading comment * Revert "Remove misleading comment" This reverts commit bc4db9cabed8ad5d0e39b30e1fe79196d248349c. * Remove misleading comment * Regen proto files messed up by rebase	2019-11-11 21:36:22 +00:00
Paul Banks	45d57ca601	connect: Allow CA Providers to store small amount of state (#6751 ) * pass logger through to provider * test for proper operation of NeedsLogger * remove public testServer function * Ooops actually set the logger in all the places we need it - CA config set wasn't and causing segfault * Fix all the other places in tests where we set the logger * Allow CA Providers to persist some state * Update CA provider plugin interface * Fix plugin stubs to match provider changes * Update agent/connect/ca/provider.go Co-Authored-By: R.B. Boyer <rb@hashicorp.com> * Cleanup review comments	2019-11-11 20:57:16 +00:00
Todd Radel	29b5253154	connect: Implement NeedsLogger interface for CA providers (#6556 ) * add NeedsLogger to Provider interface * implements NeedsLogger in default provider * pass logger through to provider * test for proper operation of NeedsLogger * remove public testServer function * Switch test to actually assert on logging output rather than reflection. --amend * Ooops actually set the logger in all the places we need it - CA config set wasn't and causing segfault * Fix all the other places in tests where we set the logger * Add TODO comment	2019-11-11 20:30:01 +00:00
Todd Radel	54f92e2924	Make all Connect Cert Common Names valid FQDNs (#6423 )	2019-11-11 17:11:54 +00:00
Matt Keeler	ff8157fb51	Fill the Authz Context with a Sentinel Scope (#6729 )	2019-11-01 17:05:22 -04:00
Matt Keeler	d491a3a9d5	Miscellaneous fixes (#6727 )	2019-11-01 16:11:44 -04:00
Paul Banks	87699eca2f	Fix support for RSA CA keys in Connect. (#6638 ) * Allow RSA CA certs for consul and vault providers to correctly sign EC leaf certs. * Ensure key type and bits are populated from CA cert and clean up tests * Add integration test and fix error when initializing secondary CA with RSA key. * Add more tests, fix review feedback * Update docs with key type config and output * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com>	2019-11-01 13:20:26 +00:00
Matt Keeler	d554f77d0d	Add hook for validating the enterprise meta attached to a reque… (#6695 )	2019-10-30 12:42:39 -04:00
Matt Keeler	8ac79d0b8b	PreVerify acl:read access for listing endpoints (#6696 ) We still will need to filter results based on the authorizer too but this helps to give an early 403.	2019-10-30 09:10:11 -04:00
Sarah Christoff	5e1c6e907b	Set MinQuorum variable in Autopilot (#6654 ) * Add MinQuorum to Autopilot	2019-10-29 09:04:41 -05:00
Matt Keeler	66d138f35e	More Replication Abstractions (#6689 ) Also updated ACL replication to use a function to fill in the desired enterprise meta for all remote listing RPCs.	2019-10-28 13:49:57 -04:00
Matt Keeler	440f6ea17a	Ensure that cache entries for tokens are prefixed “token-secret… (#6688 ) This will be necessary once we store other types of identities in here.	2019-10-25 13:05:43 -04:00
Matt Keeler	79f78632e1	Update the ACL Resolver to allow for Consul Enterprise specific hooks. (#6687 )	2019-10-25 11:06:16 -04:00
Matt Keeler	e4ea9b0a96	Updates to allow for Namespacing ACL resources in Consul Enterp… (#6675 ) Main Changes: • method signature updates everywhere to account for passing around enterprise meta. • populate the EnterpriseAuthorizerContext for all ACL related authorizations. • ACL resource listings now operate like the catalog or kv listings in that the returned entries are filtered down to what the token is allowed to see. With Namespaces its no longer all or nothing. • Modified the acl.Policy parsing to abstract away basic decoding so that enterprise can do it slightly differently. Also updated method signatures so that when parsing a policy it can take extra ent metadata to use during rules validation and policy creation. Secondary Changes: • Moved protobuf encoding functions out of the agentpb package to eliminate circular dependencies. • Added custom JSON unmarshalers for a few ACL resource types (to support snake case and to get rid of mapstructure) • AuthMethod validator cache is now an interface as these will be cached per-namespace for Consul Enterprise. • Added checks for policy/role link existence at the RPC API so we don’t push the request through raft to have it fail internally. • Forward ACL token delete request to the primary datacenter when the secondary DC doesn’t have the token. • Added a bunch of ACL test helpers for inserting ACL resource test data.	2019-10-24 14:38:09 -04:00
Freddy	60f6ec0c2f	Store check type in catalog (#6561 )	2019-10-17 20:33:11 +02:00
R.B. Boyer	de6ce5b1d9	server: ensure the primary dc and ACL dc match (#6634 ) This is mostly a sanity check for server tests that skip the normal config builder equivalent fixup.	2019-10-17 10:57:17 -05:00
R.B. Boyer	3aeb740430	unflake TestLeader_SecondaryCA_Initialize (#6631 )	2019-10-16 16:49:01 -05:00
R.B. Boyer	e6bfcb0ca8	fix flaky multidc acl tests that failed to wait for token replication (#6628 ) If acls have not yet replicated to the secondary then authz requests will be remotely resolved by the primary. Now these tests explicitly wait until replication has caught up first.	2019-10-16 12:24:29 -05:00
R.B. Boyer	040f47c46e	appease the retry linter (#6629 )	2019-10-16 11:39:22 -05:00
Paul Banks	d7aa425339	Allow time for secondary CA to initialize (#6627 )	2019-10-16 17:03:31 +01:00
Matt Keeler	973341a592	ACL Authorizer overhaul (#6620 ) * ACL Authorizer overhaul To account for upcoming features every Authorization function can now take an extra acl.EnterpriseAuthorizerContext. These are unused in OSS and will always be nil. Additionally the acl package has received some thorough refactoring to enable all of the extra Consul Enterprise specific authorizations including moving sentinel enforcement into the stubbed structs. The Authorizer funcs now return an acl.EnforcementDecision instead of a boolean. This improves the overall interface as it makes multiple Authorizers easily chainable as they now indicate whether they had an authoritative decision or should use some other defaults. A ChainedAuthorizer was added to handle this Authorizer enforcement chain and will never itself return a non-authoritative decision. Include stub for extra enterprise rules in the global management policy * Allow for an upgrade of the global-management policy	2019-10-15 16:58:50 -04:00
R.B. Boyer	6439af86eb	agent: clients should only attempt to remove pruned nodes once per call (#6591 )	2019-10-07 16:15:23 -05:00
Sarah Christoff	5e26971864	Prune Unhealthy Agents (#6571 ) * Add -prune flag to ForceLeave	2019-10-04 16:10:02 -05:00
Matt Keeler	d65bbbfd4e	Implement Leader Routine Management (#6580 ) * Implement leader routine manager Switch over the following to use it for go routine management: • Config entry Replication • ACL replication - tokens, policies, roles and legacy tokens • ACL legacy token upgrade • ACL token reaping • Intention Replication • Secondary CA Roots Watching • CA Root Pruning Also added the StopAll call into the Server Shutdown method to ensure all leader routines get killed off when shutting down. This should be mostly unnecessary as `revokeLeadership` should manually stop each one but just in case we really want these to go away (eventually).	2019-10-04 13:08:45 -04:00
Matt Keeler	fc4bcfd81f	Add EnterpriseConfig stubs (#6566 )	2019-10-01 14:34:55 -04:00
R.B. Boyer	c4b92d5534	connect: connect CA Roots in secondary datacenters should use a SigningKeyID derived from their local intermediate (#6513 ) This fixes an issue where leaf certificates issued in secondary datacenters would be reissued very frequently (every ~20 seconds) because the logic meant to detect root rotation was errantly triggering because a hash of the ultimate root (in the primary) was being compared against a hash of the local intermediate root (in the secondary) and always failing.	2019-09-26 11:54:14 -05:00
Matt Keeler	76cf54068b	Expand the QueryOptions and QueryMeta interfaces (#6545 ) In a previous PR I made it so that we had interfaces that would work enough to allow blockingQueries to work. However to complete this we need all fields to be settable and gettable. Notes: • If Go ever gets contracts/generics then we could get rid of all the Getters/Setters • protoc / protoc-gen-gogo are going to generate all the getters for us. • I copied all the getters/setters from the protobuf funcs into agent/structs/protobuf_compat.go • Also added JSON marshaling funcs that use jsonpb for protobuf types.	2019-09-26 09:55:02 -04:00
Freddy	fdd10dd8b8	Expose HTTP-based paths through Connect proxy (#6446 ) Fixes: #5396 This PR adds a proxy configuration stanza called expose. These flags register listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only listening on the loopback interface, while still accepting traffic from non Connect-enabled services. Under expose there is a boolean checks flag that would automatically expose all registered HTTP and gRPC check paths. This stanza also accepts a paths list to expose individual paths. The primary use case for this functionality would be to expose paths for third parties like Prometheus or the kubelet. Listeners for requests to exposed paths are configured dynamically at run time. Any time a proxy, or check can be registered, a listener can also be created. In this initial implementation requests to these paths are not authenticated/encrypted.	2019-09-25 20:55:52 -06:00
Matt Keeler	100ebd63f9	Allow for enterprise only leader routines (#6533 ) Eventually I am thinking we may need a way to register these at different priority levels but for now sticking this here is fine	2019-09-23 20:09:56 -04:00
R.B. Boyer	af01d397a5	connect: don't colon-hex-encode the AuthorityKeyId and SubjectKeyId fields in connect certs (#6492 ) The fields in the certs are meant to hold the original binary representation of this data, not some ascii-encoded version. The only time we should be colon-hex-encoding fields is for display purposes or marshaling through non-TLS mediums (like RPC).	2019-09-23 12:52:35 -05:00
Matt Keeler	51dcd126b7	Add support for implementing new requests with protobufs instea… (#6502 ) * Add build system support for protobuf generation This is done generically so that we don’t have to keep updating the makefile to add another proto generation. Note: anything not in the vendor directory and with a .proto extension will be run through protoc if the corresponding namespace.pb.go file is not up to date. If you want to rebuild just a single proto file you can do so with: make proto-rebuild PROTOFILES=<list of proto files to rebuild> Providing the PROTOFILES var will override the default behavior of finding all the .proto files. * Start adding types to the agent/proto package These will be needed for some other work and are by no means comprehensive. * Add ability to resolve/fixup the agentpb.ACLLinks structure in the state store. * Use protobuf marshalling of raft requests instead of msgpack for protoc generated types. This does not change any encoding of existing types. * Removed structs package automatically encoding with protobuf marshalling Instead the caller of raftApply that wants to opt-in to protobuf encoding will have to call `raftApplyProtobuf` * Run update-vendor to fixup modules.txt Nothing changed as far as dependencies go but the ordering of modules in that file depends on the time they are first seen and its not alphabetical. * Rename some things and implement the structs.RPCInfo interface bits agentpb.QueryOptions and agentpb.WriteRequest implement 3 of the 4 RPCInfo funcs and the new TargetDatacenter message type implements the fourth. * Use the right encoding function. * Renamed agent/proto package to agent/agentpb to prevent package name conflicts * Update modules.txt to fix ordering * Change blockingQuery to take in interfaces for the query options and meta * Add %T to error output. * Add/Update some comments	2019-09-20 14:37:22 -04:00
R.B. Boyer	f9496dc627	sdk: add freelist tracking and ephemeral port range skipping to freeport This should cut down on test flakiness. Problems handled: - If you had enough parallel test cases running, the former circular approach to handling the port block could hand out the same port to multiple cases before they each had a chance to bind them, leading to one of the two tests to fail. - The freeport library would allocate out of the ephemeral port range. This has been corrected for Linux (which should cover CI). - The library now waits until a formerly-in-use port is verified to be free before putting it back into circulation.	2019-09-17 14:30:43 -05:00
R.B. Boyer	7ccaa13514	fix typo of 'unknown' in log messages	2019-09-13 15:59:49 -05:00
Hans Hasselberg	4a20efda9b	agent: handleEnterpriseLeave (#6453 )	2019-09-11 11:01:37 +02:00
Pierre Souchay	be50400c62	Distinguish between DC not existing and not being available (#6399 )	2019-09-03 09:46:24 -06:00
Matt Keeler	42d608587f	Store primaries root in secondary after intermediate signature (#6333 ) * Store primaries root in secondary after intermediate signature This ensures that the intermediate exists within the CA root stored in raft and not just in the CA provider state. This has the very nice benefit of actually outputting the intermediate cert within the ca roots HTTP/RPC endpoints. This change means that if signing the intermediate fails it will not set the root within raft. So far I have not come up with a reason why that is bad. The secondary CA roots watch will pull the root again and go through all the motions. So as soon as getting an intermediate CA works the root will get set. * Make TestAgentAntiEntropy_Check_DeferSync less flaky I am not sure this is the full fix but it seems to help for me.	2019-08-30 11:38:46 -04:00
Pierre Souchay	58f04815d5	Display IPs of machines when node names conflict to ease troubleshooting When there is a node name conflict, messages such as the following are displayed within Consul: `consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "e1d456bc-f72d-98e5-ebb3-26ae80d785cf": Node name node001 is reserved by node 05f10209-1b9c-b90c-e3e2-059e64556d4a with name node001` While it is easy to find the node that has reserved the name, it is hard to find the node trying to acquire the name, since it is not registered and is not part of the `consul members` output. This PR displays the IP of the offender, which makes these issues far easier to solve.	2019-08-28 15:57:05 -04:00
Alvin Huang	c516fabfac	revert commits on master (#6413 )	2019-08-27 17:45:58 -04:00
tradel	5a22b77340	update tests to match new method signatures	2019-08-27 14:16:39 -07:00
tradel	1ff46f3f0a	configure providers with DC and domain	2019-08-27 14:16:25 -07:00
tradel	5ba28a6a7b	create a common name for autoTLS agent certs	2019-08-27 14:15:53 -07:00
Alvin Huang	0be1531d80	add nil pointer check for pointer to ACLToken struct (#6407 )	2019-08-27 11:23:28 -04:00
Hans Hasselberg	f3def8c0d0	make sure auto_encrypt has private key type and bits	2019-08-26 13:09:50 +02:00
R.B. Boyer	cc9a6f7993	Merge pull request #6388 from hashicorp/release/1-6 merging release/1-6 into master	2019-08-23 13:44:46 -05:00

1269 Commits