consul

Commit Graph

Author	SHA1	Message	Date
freddygv	ef877449ce	Move valid policies to pkg level	2020-09-02 15:49:03 -06:00
freddygv	f81fe6a1a1	Remove LB infix and move injection to xds	2020-09-02 15:13:50 -06:00
R.B. Boyer	119e945c3e	connect: all config entries pick up a meta field (#8596) Fixes #8595	2020-09-02 14:10:25 -05:00
Chris Piraino	28f163c2d2	Merge pull request #8603 from hashicorp/feature/usage-metrics Track node and service counts in the state store and emit them periodically as metrics	2020-09-02 13:23:39 -05:00
R.B. Boyer	d0f74cd1e8	connect: fix bug in preventing some namespaced config entry modifications (#8601 ) Whenever an upsert/deletion of a config entry happens, within the open state store transaction we speculatively test compile all discovery chains that may be affected by the pending modification to verify that the write would not create an erroneous scenario (such as splitting traffic to a subset that did not exist). If a single discovery chain evaluation references two config entries with the same kind and name in different namespaces then sometimes the upsert/deletion would be falsely rejected. It does not appear as though this bug would've let invalid writes through to the state store so the correction does not require a cleanup phase.	2020-09-02 10:47:19 -05:00
Chris Piraino	bcb586bee2	Set metrics reporting interval to 9 seconds This is below the 10 second interval that lib/telemetry.go implements as its aggregation interval, ensuring that we always report these metrics.	2020-09-02 10:24:23 -05:00
Chris Piraino	a3028cad89	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	d301145e62	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	086a8ea8eb	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	69dbc926ad	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	3feae7f77b	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	04705e90f9	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	63f79e5f9b	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Daniel Nephin	f1a41318d7	token: OSS support for enterprise tokens	2020-08-31 15:10:15 -04:00
Daniel Nephin	629e4aaa65	config: use token.Config for ACLToken config Using the target Config struct reduces the amount of copying and translating of configuration structs.	2020-08-31 15:10:15 -04:00
Daniel Nephin	330be5b740	agent/token: Move token persistence out of agent And into token.Store. This change isolates any awareness of token persistence in a single place. It is a small step in allowing Agent.New to accept its dependencies.	2020-08-31 15:00:34 -04:00
Daniel Nephin	a80de898ea	fix TestStore_RegularTokens This test was only passing because t.Parallel was causing every subtest to run with the last value in the iteration, which sets a value for all tokens. The test started to fail once t.Parallel was removed, but the same failure could have been produced by adding 'tt := tt' to the t.Run() func. These tests run in under 10ms, so there is no reason to use t.Parallel.	2020-08-31 14:59:14 -04:00
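The closure-capture problem described in a80de898ea can be sketched outside of the testing package. This is a minimal illustration, not the code from the commit: the function names and token names are hypothetical. Go releases from 1.22 on give each iteration its own variable when it is declared with `:=`, so the sketch assigns to a variable declared outside the loop to reproduce the shared-variable behaviour on any Go version.

```go
package main

import "fmt"

// collectShared builds one closure per name while reusing a single
// variable across iterations, so every closure observes the final value.
func collectShared(names []string) []string {
	var fns []func() string
	var name string
	for _, name = range names { // assigns to the one outer variable
		fns = append(fns, func() string { return name })
	}
	return run(fns)
}

// collectCopied applies the 'tt := tt' style fix: each closure captures
// its own per-iteration copy.
func collectCopied(names []string) []string {
	var fns []func() string
	var name string
	for _, name = range names {
		name := name // per-iteration copy
		fns = append(fns, func() string { return name })
	}
	return run(fns)
}

// run calls the closures only after the loop has finished, which is what
// happens to parallel subtests that are paused until the parent returns.
func run(fns []func() string) []string {
	out := make([]string, 0, len(fns))
	for _, fn := range fns {
		out = append(out, fn())
	}
	return out
}

func main() {
	names := []string{"default", "agent", "replication"}
	fmt.Println(collectShared(names)) // every entry is the last name
	fmt.Println(collectCopied(names)) // each entry keeps its own name
}
```

With the shared variable, a test that only passes for the last table entry can look green for all of them, which matches the failure mode the commit message describes.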
Matt Keeler	91d680b830	Merge of auto-config and auto-encrypt code (#8523 ) auto-encrypt is now handled as a special case of auto-config. This also is moving all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	0236e169bb	Add documentation for resolver LB cfg	2020-08-28 14:46:13 -06:00
freddygv	28d0602fc1	Pass LB config to Envoy via xDS	2020-08-28 14:27:40 -06:00
freddygv	2bbbd9e1da	Log error as error	2020-08-28 13:11:55 -06:00
freddygv	81115b6eaa	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	6956477be5	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
Daniel Nephin	72bf350069	Merge pull request #8552 from pierresouchay/reload_cache_throttling_config Ensure that Cache options are reloaded when `consul reload` is performed	2020-08-28 15:04:42 -04:00
Pierre Souchay	d5974b1d17	Added Unit test for cache reloading	2020-08-28 13:03:58 +02:00
freddygv	ff56a64b08	Add LB policy to service-resolver	2020-08-27 19:44:02 -06:00
Jack	9e1c6727f9	Add http2 and grpc support to ingress gateways (#8458)	2020-08-27 15:34:08 -06:00
R.B. Boyer	74d5df7c7a	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569)	2020-08-27 12:20:58 -05:00
R.B. Boyer	d1843456d2	agent: ensure that we normalize bootstrapped config entries (#8547)	2020-08-27 11:37:25 -05:00
Pierre Souchay	9a64d3e5fe	Also test reload of EntryFetchMaxBurst	2020-08-27 18:14:05 +02:00
Matt Keeler	f97cc0445a	Move RPC router from Client/Server and into BaseDeps (#8559 ) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
Pierre Souchay	5842a902df	Tests that changes in rate limit are taken into account by agent	2020-08-27 16:41:20 +02:00
Pierre Souchay	879d087f65	Added `options.Equals()` and minor indentation fixes	2020-08-27 13:44:45 +02:00
R.B. Boyer	fead4fc2a5	agent: expose the list of supported envoy versions on /v1/agent/self (#8545)	2020-08-26 10:04:11 -05:00
Kyle Havlovitz	97f1f341d6	Automatically renew the token used by the Vault CA provider	2020-08-25 10:34:49 -07:00
Pierre Souchay	d2be9d38da	Ensure that Cache options are reloaded when `consul reload` is performed. This ensures that cache throttling parameters are properly applied: * cache.EntryFetchMaxBurst * cache.EntryFetchRate When values are updated, a message is logged at info level.	2020-08-24 23:33:10 +02:00
André Cruz	9a0792139c	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	aa212423e3	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Daniel Nephin	01745feec0	Merge pull request #8537 from hashicorp/dnephin/fix-panic-on-connect-nil Fix panic when decoding 'Connect: null'	2020-08-20 18:00:25 -04:00
Daniel Nephin	07ad662131	Fix panic when decoding 'Connect: null' Surprisingly the json Unmarshal updates the aux pointer to a nil.	2020-08-20 17:52:14 -04:00
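The behaviour noted in 07ad662131 follows from how `encoding/json` treats `null`: unmarshalling `null` into a pointer sets that pointer to nil, even if it was populated beforehand. The sketch below shows this with hypothetical `service` and `connect` types; it is a simplification and does not reproduce the aux struct used in the actual decoding code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connect and service are hypothetical stand-ins for the real types.
type connect struct {
	Native bool
}

type service struct {
	Connect *connect
}

// decode starts from a struct whose pointer is already populated and
// returns whatever the pointer holds after json.Unmarshal has run.
func decode(input string) (*connect, error) {
	svc := service{Connect: &connect{}}
	if err := json.Unmarshal([]byte(input), &svc); err != nil {
		return nil, err
	}
	return svc.Connect, nil
}

func main() {
	c, err := decode(`{"Connect": null}`)
	fmt.Println(c == nil, err) // the pre-populated pointer was reset to nil

	c, err = decode(`{"Connect": {"Native": true}}`)
	fmt.Println(c != nil && c.Native, err)
}
```

Code that dereferences the pointer after decoding therefore needs a nil check, even when it allocated the pointer itself before calling `json.Unmarshal`.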
Daniel Nephin	e16375216d	config: use logging.Config in RuntimeConfig To add structure to RuntimeConfig, and remove the need to translate into a third type.	2020-08-19 13:21:00 -04:00
Daniel Nephin	f2373a5575	logging: move init of grpclog This line initializes global state. Moving it out of the constructor and closer to where logging is setup helps keep related things together.	2020-08-19 13:21:00 -04:00
Daniel Nephin	33c401a16e	logging: Setup accept io.Writer instead of []io.Writer Also accept a non-pointer Config, since the config is not modified	2020-08-19 13:20:41 -04:00
Daniel Nephin	63bad36de7	testing: disable global metrics sink in tests This might be better handled by allowing configuration for the InMemSink interval and retain, and disabling the global. For now this is a smaller change to remove the goroutine leak caused by tests because go-metrics does not provide any way of shutting down the global goroutine.	2020-08-18 19:04:57 -04:00
Daniel Nephin	5d4df54296	agent: extract dependency creation from New With this change, Agent.New() accepts many of the dependencies instead of creating them in New. Accepting fully constructed dependencies from a constructor makes the type easier to test, and easier to change. There are still a number of dependencies created in Start() which can be addressed in a follow up.	2020-08-18 19:04:55 -04:00
Daniel Nephin	51b08c645b	Merge pull request #8514 from hashicorp/dnephin/testing-improvements-1 testing: small improvements to TestSessionCreate and testutil.retry	2020-08-18 18:26:05 -04:00
Daniel Nephin	ab2157bbc9	Merge pull request #8528 from hashicorp/dnephin/move-node-name-validation config: Move some config validation from Agent.Start to config.Builder.Validate	2020-08-18 18:25:41 -04:00
Hans Hasselberg	a932aafc91	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
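A client could consume the extended keyring response from a932aafc91 roughly as follows. This is a sketch under stated assumptions: the JSON field names come from the sample in the commit message, while the `keyringEntry` type, the helper names, the placeholder key names, and the reading that a pool is settled when its single primary key count equals `NumNodes` are all illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyringEntry mirrors the fields shown in the sample response.
// The type name is hypothetical; only the field names come from the commit.
type keyringEntry struct {
	WAN         bool
	Datacenter  string
	Segment     string
	Keys        map[string]int
	PrimaryKeys map[string]int
	NumNodes    int
}

// parseKeyring decodes a response body into a list of entries.
func parseKeyring(body []byte) ([]keyringEntry, error) {
	var entries []keyringEntry
	err := json.Unmarshal(body, &entries)
	return entries, err
}

// singlePrimary reports whether the pool has exactly one primary key and
// every node reports it. This interpretation of the counts is an assumption.
func singlePrimary(e keyringEntry) bool {
	if len(e.PrimaryKeys) != 1 {
		return false
	}
	for _, nodes := range e.PrimaryKeys {
		return nodes == e.NumNodes
	}
	return false
}

// sample uses placeholder key names rather than real gossip keys.
const sample = `[
  {"WAN": true, "Datacenter": "dc2", "Segment": "",
   "Keys": {"key-a": 6, "key-b": 6}, "PrimaryKeys": {"key-b": 6}, "NumNodes": 6},
  {"WAN": false, "Datacenter": "dc1", "Segment": "",
   "Keys": {"key-a": 8, "key-b": 8}, "PrimaryKeys": {"key-a": 3, "key-b": 5}, "NumNodes": 8}
]`

func main() {
	entries, err := parseKeyring([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s wan=%t single-primary=%t\n", e.Datacenter, e.WAN, singlePrimary(e))
	}
}
```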
Daniel Nephin	35f1ecee0b	config: Move remote-script-checks warning to config Previously it was done in Agent.Start, but it can be done much earlier	2020-08-17 17:39:49 -04:00
Daniel Nephin	27b36bfc4e	config: move NodeName validation to config validation Previously it was done in Agent.Start, which is much later than it needs to be. The new 'dns' package was required, because otherwise there would be an import cycle. In the future we should move more of the dns server into the dns package.	2020-08-17 17:25:02 -04:00
Daniel Nephin	b4015969c9	Merge pull request #8515 from hashicorp/dnephin/unexport-testing-shims config: unexport fields and resolve TODOs in config.Builder	2020-08-17 16:03:07 -04:00
Daniel Nephin	16217fe9b9	testing: use t.Cleanup in testutil.TempFile So that it has the same behaviour as TempDir. Also remove the now unnecessary 'defer os.Remove'	2020-08-14 20:06:01 -04:00
Daniel Nephin	d68edcecf4	testing: Remove all the defer os.Removeall Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	8a4d292c8e	config: unexport and resolve TODOs in config.Builder - unexport testing shims, and document their purpose - resolve a TODO by moving validation to NewBuilder and storing the one field that is used instead of all of Options - create a slice with the correct size to avoid extra allocations	2020-08-14 19:23:32 -04:00
Daniel Nephin	a4b201af36	testing: Improve session_endpoint_test While working on another change I caused a bunch of these tests to fail. Unfortunately the failure messages were not super helpful at first. One problem was that the request and response were created outside of the retry. This meant that when the second attempt happened, the request body was empty (because the buffer had been consumed), and so the request was not actually being retried. This was fixed by moving more of the request creation into the retry block. Another problem was that these functions can return errors in two ways, and are not consistent about which way they use. Some errors are returned to the response writer, but the tests were not checking those errors, which was causing a panic later on. This was fixed by adding a check for the response code. Also adds some missing t.Helper(), and has assertIndex use checkIndex so that it is clear these are the same implementation.	2020-08-14 18:55:52 -04:00
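The first problem described in a4b201af36, a request body that is empty on the second attempt, comes from reading a buffer that has already been drained. The sketch below reproduces the effect with a plain `bytes.Buffer`; the function names are hypothetical and no HTTP handler or retry helper from the repository is involved.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// attemptsShared creates the body once, outside the retry loop.
// The first read consumes the buffer, so later attempts see nothing.
func attemptsShared(payload string, attempts int) []string {
	body := bytes.NewBufferString(payload)
	var seen []string
	for i := 0; i < attempts; i++ {
		b, _ := io.ReadAll(body) // reading a bytes.Buffer does not fail
		seen = append(seen, string(b))
	}
	return seen
}

// attemptsFresh builds the body inside the loop, which corresponds to
// moving request creation into the retry block.
func attemptsFresh(payload string, attempts int) []string {
	var seen []string
	for i := 0; i < attempts; i++ {
		body := bytes.NewBufferString(payload)
		b, _ := io.ReadAll(body)
		seen = append(seen, string(b))
	}
	return seen
}

func main() {
	fmt.Printf("%q\n", attemptsShared(`{"Name":"s1"}`, 2))
	fmt.Printf("%q\n", attemptsFresh(`{"Name":"s1"}`, 2))
}
```

In the shared case the second attempt sends an empty body, so a retry cannot succeed where the first attempt failed for a transient reason.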
Daniel Nephin	070e843113	testutil: Add t.Cleanup to TempDir TempDir registers a Cleanup so that the directory is always removed. To disable the cleanup, set the TEST_NOCLEANUP env var.	2020-08-14 13:19:10 -04:00
Daniel Nephin	2b920ad199	testing: fix flaky test TestDNS_NonExistentDC_RPC I saw this test flake locally, and it was easy to reproduce with -count=10. The failure was: 'TestAgent.dns: rpc error: error=No known Consul servers'. Waiting for the agent seems to fix it.	2020-08-13 18:03:04 -04:00
Daniel Nephin	1912c5ad89	testing: wait until monitor has started before shutdown This commit fixes a test that I saw flake locally while running tests. The test output from the monitor started immediately after the line the test was looking for. To fix the problem a channel is closed when the goroutine starts. Shutdown is not called until this channel is closed, which seems to greatly reduce the chance of a flake.	2020-08-13 17:53:29 -04:00
Daniel Nephin	3a4e62836b	testing: Remove TestAgent.Key and change TestAgent.DataDir TestAgent.Key was only used by 3 tests. Extracting it from the common helper that is used in hundreds of tests helps keep the shared part small and more focused. This required a second change (which I was planning on making anyway), which was to change the behaviour of DataDir. Now in all cases the TestAgent will use the DataDir, and clean it up once the test is complete.	2020-08-13 17:53:24 -04:00
Daniel Nephin	b1679508d4	testing: use t.Cleanup in TestAgent for returnPorts	2020-08-13 17:09:37 -04:00
Daniel Nephin	4e8e0de8f0	testing: remove unused fields from TestACLAgent	2020-08-13 17:03:55 -04:00
Daniel Nephin	399c77dfb6	agent: rename vars in newConsulConfig 'base' is a bit misleading, since it is the return value. Renamed to cfg.	2020-08-13 11:58:21 -04:00
Daniel Nephin	7b5b170a0d	agent: Move setupKeyring functions to keyring.go There are a couple reasons for this change: 1. agent.go is way too big. Smaller files make code easier to read because tools that show usage also include filename which can give a lot more context to someone trying to understand which functions call other functions. 2. these two functions call into a large number of functions already in keyring.go.	2020-08-13 11:58:21 -04:00
Daniel Nephin	9919e5dfa5	agent: unmethod consulConfig To allow us to move newConsulConfig out of Agent.	2020-08-13 11:58:21 -04:00
Daniel Nephin	8f596f5551	Fix conflict in merged PRs One PR renamed the var from config->cfg, and another used the old name config, which caused the build to fail on master.	2020-08-13 11:28:26 -04:00
Daniel Nephin	d677706625	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	190fcc14a3	Merge pull request #8463 from hashicorp/dnephin/unmethod-make-node-id agent: convert NodeID methods to functions	2020-08-13 11:18:11 -04:00
Daniel Nephin	912aae8624	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	5b37efd91b	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
Daniel Nephin	37eacf8192	auto-config: reduce awareness of config This is a small step to allowing Agent to accept its dependencies instead of creating them in New. There were two fields in autoconfig.Config that were used exclusively to load config. These were replaced with a single function, allowing us to move LoadConfig back to the config package. Also removed the WithX functions for building a Config. Since these were simple assignment, it appeared we were not getting much value from them.	2020-08-12 13:23:23 -04:00
Daniel Nephin	e07554500e	Remove check that hostID is a uuid. Immediately afterward we hash the ID, so it does not need to be a uuid anymore.	2020-08-12 13:05:10 -04:00
Daniel Nephin	875d8bde42	agent: convert NodeID methods to functions Making these functions allows us to cleanup how an agent is initialized. They only make use of a config and a logger, so they do not need to be agent methods. Also cleanup the testing to use t.Run and require.	2020-08-12 13:05:10 -04:00
Daniel Nephin	0738eb8596	Extract nodeID functions to a different file In preparation for turning them into functions. To reduce the scope of Agent, and refactor how Agent is created and started.	2020-08-12 13:05:10 -04:00
R.B. Boyer	e3cd4a8539	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration for any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Freddy	d72f72dcd5	Notify alias checks when aliased service is [de]registered (#8456)	2020-08-12 09:47:41 -06:00
Daniel Nephin	3d96c5b651	Merge pull request #8469 from hashicorp/dnephin/config-source config: make Source an interface to avoid the marshal/unmarshal cycle in auto-config	2020-08-12 11:17:15 -04:00
Hans Hasselberg	aacf0fd777	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
Freddy	875816d0d3	Internal endpoint to query intentions associated with a gateway (#8400)	2020-08-11 17:20:41 -06:00
Kyle Havlovitz	635952681e	Fix a state store comment about version	2020-08-11 13:46:12 -07:00
Kyle Havlovitz	c39a275666	fsm: Fix snapshot bug with restoring node/service/check indexes	2020-08-11 11:49:52 -07:00
Hans Hasselberg	aff02198d7	Refactor keyring ops: * changes some functions to return data instead of modifying pointer arguments * renames globalRPC() to keyringRPCs() to make its purpose more clear * restructures KeyringOperation() to make it more understandable	2020-08-11 13:42:03 +02:00
Hans Hasselberg	07261db64d	thread local-only through the layers $ consul keyring -list -local-only ==> Gathering installed encryption keys... dc1 (LAN): aUlAW4ST3+vwseI61so24CoORkyjZofcmHk+j7QPSYQ= [1/1]	2020-08-11 13:41:53 +02:00
Daniel Nephin	4297a8ba07	auto-config: Avoid the marshal/unmarshal cycle in auto-config Use a LiteralConfig and return a config.Config from translate.	2020-08-10 20:07:52 -04:00
freddygv	de0b574a26	Update error handling	2020-08-10 17:48:22 -06:00
Daniel Nephin	38980ebb4c	config: Make Source an interface This will allow us to accept config from auto-config without needing to go through a serialization cycle.	2020-08-10 12:46:28 -04:00
Mike Morris	ff37af1129	changelog: Update for 1.8.2, 1.7.6, 1.7.5 and 1.6.7 (#8462 ) * update bindata_assetfs.go * Release v1.8.2 * Putting source back into Dev Mode * changelog: add entries for 1.7.6, 1.7.5 and 1.6.7 Co-authored-by: hashicorp-ci <hashicorp-ci@users.noreply.github.com>	2020-08-07 18:58:09 -04:00
Daniel Nephin	80e99cb3e6	testing: remove unnecessary defers in tests The data directory is now removed by the test helper that created it.	2020-08-07 17:28:16 -04:00
Daniel Nephin	7dbacf297c	testing: Remove NotifyShutdown NotifyShutdown was only used for testing. Now that t.Cleanup exists, we can use that instead of attaching cleanup to the Server shutdown. The Autopilot test which used NotifyShutdown doesn't need this notification because Shutdown is synchronous. Waiting for the function to return is equivalent.	2020-08-07 17:14:44 -04:00
Matt Keeler	67dec3b609	Require token replication to be enabled in secondary dcs when ACLs are enabled with AutoConfig (#8451 ) AutoConfig will generate local tokens for clients and the ability to use local tokens is gated off of token replication being enabled and being configured with a replication token. Therefore we already have a hard requirement on having token replication enabled, this commit just makes sure to surface that to the operator instead of having to discern what the issue is from RPC errors.	2020-08-07 10:20:27 -04:00
Hans Hasselberg	d316cd06c1	auto_config implies connect (#8433)	2020-08-07 12:02:02 +02:00
freddygv	c8f5215e9d	Fix test build	2020-08-06 11:31:56 -06:00
Hans Hasselberg	51a8e15cf8	Mark its own cluster as healthy when rebalancing. (#8406 ) This code started as an optimization to avoid doing an RPC Ping to itself. But in a single server cluster the rebalancing was led to believe that there were no healthy servers because foundHealthyServer was not set. Now this is being set properly. Fixes #8401 and #8403.	2020-08-06 10:42:09 +02:00
freddygv	15c3cfce5e	PR comments and addtl tests	2020-08-05 16:07:11 -06:00
Daniel Nephin	ae382805bd	Merge pull request #8404 from hashicorp/dnephin/remove-log-output-field Use Logger consistently, instead of LogOutput	2020-08-05 14:31:43 -04:00
Daniel Nephin	3b82ad0955	Rename NewClient/NewServer Now that duplicate constructors have been removed we can use the shorter names for the single constructor.	2020-08-05 14:00:55 -04:00
Daniel Nephin	0420d91cdd	Remove LogOutput from Agent Now that it is no longer used, we can remove this unnecessary field. This is a pre-step in cleaning up RuntimeConfig->Consul.Config, which is a pre-step to adding a gRPCHandler component to Server for streaming. Removing this field also allows us to remove one of the return values from logging.Setup.	2020-08-05 14:00:44 -04:00
Daniel Nephin	5acf01ceeb	Remove LogOutput from Server	2020-08-05 14:00:44 -04:00
Daniel Nephin	0c5428eea8	Remove LogOutput from Client	2020-08-05 14:00:42 -04:00
Daniel Nephin	e8ee2cf2f7	Pass a logger to ConnPool and yamux, instead of an io.Writer Allowing us to remove the LogOutput field from config.	2020-08-05 13:25:08 -04:00
Daniel Nephin	ed8210fe4d	api: Use a Logger instead of an io.Writer in api.Watch So that we can pass around only a Logger, not a LogOutput	2020-08-05 13:25:08 -04:00
Daniel Nephin	1e17a0c3e1	config: Remove unused field	2020-08-05 13:25:08 -04:00
Daniel Nephin	ba3ace1219	Return nil value on error. The main bug was fixed in `cb050b280c`, but the return value of 'result' is still misleading. Change the return value to nil to make the code more clear.	2020-08-05 13:10:17 -04:00
R.B. Boyer	c599a2f5f4	xds: add support for envoy 1.15.0 and drop support for 1.11.x (#8424 ) Related changes: - hard-fail the xDS connection attempt if the envoy version is known to be too old to be supported - remove the RouterMatchSafeRegex proxy feature since all supported envoy versions have it - stop using --max-obj-name-len (due to: envoyproxy/envoy#11740)	2020-07-31 15:52:49 -05:00
freddygv	0956624e39	collect GatewayServices from iter in a function	2020-07-31 13:30:40 -06:00
Freddy	f1e8addbdf	Avoid panics during shutdown routine (#8412)	2020-07-30 11:11:10 -06:00
freddygv	aa6c59dbfc	end to end changes to pass gatewayservices to /ui/services/	2020-07-30 10:21:11 -06:00
Matt Keeler	1a78cf9b4c	Ensure certificates retrieved through the cache get persisted with auto-config (#8409)	2020-07-30 11:37:18 -04:00
freddygv	51255eede7	Support ConnectedWithProxy	2020-07-30 09:32:12 -06:00
Matt Keeler	dbb461a5d3	Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394 ) Ensure that enabling AutoConfig sets the tls configurator properly This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.	2020-07-30 10:15:12 -04:00
Hans Hasselberg	054595b1f8	agent/cache test for cache throttling. (#8396)	2020-07-30 14:41:13 +02:00
Matt Keeler	34034b76f5	Agent Auto Config: Implement Certificate Generation (#8360 ) Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server. This also refactors the auto-config package a bit to split things out into multiple files.	2020-07-28 15:31:48 -04:00
Matt Keeler	be01c4241d	Default Cache rate limiting options in New Also get rid of the TestCache helper which was where these defaults were happening previously.	2020-07-28 12:34:35 -04:00
Matt Keeler	83d09de230	Fix some broken code in master There were several PRs that, while all passed CI independently, caused compilation errors in test code when they all got merged into the same branch. The main change that caused issues was changing agent/cache.Cache.New to require a concrete options struct instead of a pointer. This broke the cert monitor tests and the catalog_list_services_test.go. Another change was made to unembed the http.Server from the agent.HTTPServer struct. That coupled with another change to add a test to ensure cache rate limiting coming from HTTP requests was working as expected caused compilation failures.	2020-07-28 09:50:10 -04:00
Pierre Souchay	505de6dc29	Added ratelimit to handle throttling cache (#8226) This implements a solution for #7863 It does: Add a new config cache.entry_fetch_rate to limit the number of calls/s for a given cache entry, default value = rate.Inf Add cache.entry_fetch_max_burst size of rate limit (default value = 2) The new configuration now supports the following syntax for instance to allow 1 query every 3s: command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}' in JSON { "cache": { "entry_fetch_rate": 0.333 } }	2020-07-27 23:11:11 +02:00
Matt Keeler	5c2c762106	Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364 ) The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.	2020-07-24 10:00:51 -04:00
Matt Keeler	2ee9fe0a4d	Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363 ) This allows this to be reused elsewhere.	2020-07-23 16:05:28 -04:00
Daniel Nephin	3d115a62fd	Merge pull request #8323 from hashicorp/dnephin/add-event-publisher-2 stream: close subscriptions on shutdown	2020-07-23 13:12:50 -04:00
Matt Keeler	2713c0e682	Refactor the agentpb package (#8362 ) First move the whole thing to the top-level proto package name. Secondly change some things around internally to have sub-packages.	2020-07-23 11:24:20 -04:00
Paul Coignet	1d75a8fb50	Fix tests	2020-07-23 11:04:10 +02:00
Daniel Nephin	ed69feca6d	stream: close all subs when EventProcessor is shutdown.	2020-07-22 19:04:10 -04:00
Daniel Nephin	a99a4103bd	stream: fix overallocation in filter And add tests	2020-07-22 19:04:10 -04:00
Daniel Nephin	9ed61fd160	state: speed up TestStateStore_ServicesByNodeMeta Make watchLimit a var so that we can patch it in tests and reduce the time spent creating state.	2020-07-22 16:57:06 -04:00
Daniel Nephin	0402dd7ac5	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	3570ce6566	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	3c09482864	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	a33a7a6fe2	Merge pull request #8344 from hashicorp/dnephin/fix-flakes-in-stream stream: handle empty event in TestEventSnapshot	2020-07-21 13:14:35 -04:00
Daniel Nephin	51efba2c7d	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also cause data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	12acdd7481	Disable background cache refresh for Connect Leaf Certs The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself. In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API which just like normal cache notification mechanism will cause the blocking fetch to be initiated and to get leaf certs as soon as needed. If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond. This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries.	2020-07-21 12:19:25 -04:00
Matt Keeler	9da8c51ac5	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	1910e2a246	checks: wait for goroutine to complete CheckAlias already had a waitGroup, but the Add() call was happening too late, which was causing a race in tests. The add must happen before the goroutine is started. CheckHTTP did not have a waitGroup, so I added it to match CheckAlias. It looks like a lot of the implementation could be shared, and may not need all of channel, waitgroup and bool, but I will leave that refactor for another time.	2020-07-20 18:55:39 -04:00
Daniel Nephin	9482961b1c	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	57e00b0fd5	Merge pull request #8245 from hashicorp/dnephin/use-not-modified-in-cache agent/cache: Use AllowNotModified in CatalogListServices	2020-07-20 15:30:52 -04:00
Daniel Nephin	29571465e1	Merge pull request #8290 from hashicorp/dnephin/watch-decode watch: fix script watches with single arg	2020-07-20 14:41:17 -04:00
Paul Coignet	a4e39c840b	Add default prefix_filter	2020-07-20 10:39:58 +02:00
Daniel Nephin	ecccb30690	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	2008884241	state: un-method funcs that don't use their receiver This change was mostly automated with the following. First generate a list of functions with: git grep -o 'Store) \([^(]\+\)(tx \*txn' ./agent/consul/state | awk '{print $2}' | grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state file=${1-funcnames} while read fn; do echo "$fn" sed -i -e "s/(s \*Store) $fn(/$fn(/" $dir/*.go sed -i -e "s/s\.$fn(/$fn(/" $dir/*.go sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	63b153df8c	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
André	3bc27df844	minor: fix docstring of DNSOnlyPassing (#8318 ) In runtime.go it had "duration" but it is actually a boolean.	2020-07-16 09:47:33 -04:00
Daniel Nephin	2ec3760b70	agent/cache: Use AllowNotModifiedResponse in CatalogListServices Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-14 18:58:20 -04:00
Daniel Nephin	dbb8d14728	agent/cache: Update some docstrings	2020-07-14 18:58:20 -04:00
Daniel Nephin	c07acbeb6b	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc Only unwrap itemBuffer.Err when necessary	2020-07-14 15:57:47 -04:00
Daniel Nephin	f19f8e99bb	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	fc1c2ae412	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	6fa36e3aee	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcesors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	48c766d2c6	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	889c57fd2d	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	4fa0fdc0e0	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	489876c86b	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	5f9db94956	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	a709ed1ab5	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	17b833b4c9	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	2c8342f115	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	1622bb3a45	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	effab15131	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	a5cf933fe8	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	bbe7272d8e	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2a8a8f7b8d	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f5ecd5de5f	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	555cfe52d9	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	4e0bc8013b	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	aacd514dca	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	c0b0109e80	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Matt Keeler	8beca1cb8d	Add ability for notifications when one of the agent tokens is updated (#8301 ) Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-07-14 09:53:55 -04:00
Hans Hasselberg	496fb5fc5b	add support for envoy 1.14.4, 1.13.4, 1.12.6 (#8216 )	2020-07-13 15:44:44 -05:00
Chris Piraino	4d857d117f	Set enterprise metadata after resolving the token (#8302 ) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	7b3f26e61d	watch: Allow args from different types Fixes a bug where specifying a slice of args with a single item was being converted to a string when config was loaded, causing an error.	2020-07-10 17:18:32 -04:00
Freddy	e72af87918	Add api mod support for /catalog/gateway-services (#8278 )	2020-07-10 13:01:45 -06:00
Daniel Nephin	653c938edc	watch: extract makeWatchPlan to facilitate testing There is a bug in here now that slices in opaque config are unsliced. But to test that bug fix we need a function that can be easily tested.	2020-07-10 13:33:45 -04:00
Paul Coignet	b8b939b98a	Keep both metrics	2020-07-10 11:27:22 +02:00
R.B. Boyer	1eef096dfe	xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8222 ) - cut down on extra node metadata transmission - split the golden file generation to compare all envoy version	2020-07-09 17:04:51 -05:00
Daniel Nephin	f22f3d300d	Merge pull request #8231 from hashicorp/dnephin/unembed-HTTPServer-Server agent/http: un-embed the http.Server	2020-07-09 17:42:33 -04:00
Daniel Nephin	df4088291c	agent/http: Update TestSetupHTTPServer_HTTP2 To remove the need to store the http.Server. This will allow us to remove the http.Server field from the HTTPServer struct.	2020-07-09 16:42:19 -04:00
Daniel Nephin	d98a4c1317	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Paul Coignet	cf68577bef	Use method and path as labels	2020-07-09 10:31:27 +02:00
Matt Keeler	4fb535ba48	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	f2f32735ce	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	d2e4869c7c	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Chris Piraino	735337b170	Append port number to ingress host domain (#8190 ) A port can be sent in the Host header as defined in the HTTP RFC, so we take any hosts that we want to match traffic to and also add another host with the listener port added. Also fix an issue with envoy integration tests not running the case-ingress-gateway-tls test.	2020-07-07 10:43:04 -05:00
Daniel Nephin	5247ef4c70	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	a7f69b615a	Merge pull request #8215 from hashicorp/dnephin/support-not-modified-response-server agent/consul: Add support for NotModified to two endpoints	2020-07-03 16:15:31 -04:00
Pierre Souchay	20d1ea7d2d	Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221 ) Fixes #7527 I want to highlight this and explain what I think the implications are and make sure we are aware: * `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block. * `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case): 1) `conn.SetDeadline(10*time.Millisecond)` so that 2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 3) `conn.Close` can happen The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.	2020-07-03 09:25:07 +02:00
Daniel Nephin	a5e45defb1	agent/http: un-embed the HTTPServer The embedded HTTPServer struct is not used by the large HTTPServer struct. It is used by tests and the agent. This change is a small first step in the process of removing that field. The eventual goal is to reduce the scope of HTTPServer making it easier to test, and split into separate packages.	2020-07-02 17:21:12 -04:00
Daniel Nephin	5d36f98710	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	f8e8f48125	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	10361dd210	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Yury Evtikhov	8d18422f19	DNS: fix agent returning SERVFAIL where NXDOMAIN should be returned	2020-07-01 01:51:21 +01:00
Yury Evtikhov	3b4ddaaab5	DNS: add test to verify NXDOMAIN is returned when a non-existent domain is queried over RPC	2020-07-01 01:51:16 +01:00
Matt Keeler	6e7acfa618	Add an AutoEncrypt “integration” test Also fix a bug where Consul could segfault if TLS was enabled but no client certificate was provided. How no one has reported this as a problem I am not sure.	2020-06-30 15:23:29 -04:00
Matt Keeler	2ddcba00c6	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	19040f1166	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	a5a9560bbd	Initialize the agent leaf cert cache result with a state to prevent unnecessary second certificate signing	2020-06-30 09:59:07 -04:00
Matt Keeler	39b567a55a	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
Matt Keeler	85fd8c552f	Merge pull request #8193 from hashicorp/feature/auto-config/suppress-config-warnings	2020-06-27 10:06:52 -04:00
R.B. Boyer	462f0f37ed	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194 ) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Matt Keeler	be576c9737	Use the DNS and IP SANs from the auto config stanza when set	2020-06-26 16:01:30 -04:00
Matt Keeler	e8b39dd255	Overhaul the auto-config translation This fixes some issues around spurious warnings about using enterprise configuration in OSS.	2020-06-26 15:25:21 -04:00
Freddy	10d6e9c458	Split up unused key validation for oss/ent (#8189 ) Split up unused key validation in config entry decode for oss/ent. This is needed so that we can return an informative error in OSS if namespaces are provided.	2020-06-25 13:58:29 -06:00
Daniel Nephin	a891ee8428	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	7041f69892	Merge pull request #8184 from hashicorp/bugfix/goroutine-leaks	2020-06-25 09:22:19 -04:00
Chris Piraino	df48db0abd	Merge pull request #7932 from hashicorp/ingress/internal-ui-endpoint-multiple-ports Update gateway-services-nodes API endpoint to allow multiple addresses	2020-06-24 17:11:01 -05:00
Chris Piraino	f213d3592a	remove obsolete comments about test parallelization	2020-06-24 16:36:13 -05:00
Chris Piraino	b3db907bdf	Update gateway-services-nodes API endpoint to allow multiple addresses Previously, we were only returning a single ListenerPort for a single service. However, we actually allow a single service to be serviced over multiple ports, as well as allow users to define what hostnames they expect their services to be contacted over. When no hosts are defined, we return the default ingress domain for any configured DNS domain. To show this in the UI, we modify the gateway-services-nodes API to return a GatewayConfig.Addresses field, which is a list of addresses over which the specific service can be contacted.	2020-06-24 16:35:23 -05:00
Matt Keeler	e9835610f3	Add a test for go routine leaks This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking. This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability). This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However it's a start.	2020-06-24 17:09:50 -04:00
Matt Keeler	29d0cfdd7d	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	25a4f3c83b	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	0279bf6fe5	Update TestAgent_GetCoordinate The old test case was a very specific regression test for a case that is no longer possible. Replaced with a new test that checks the default coordinate is returned.	2020-06-24 13:00:15 -04:00
Daniel Nephin	f65e21e6dc	Remove unused return values	2020-06-24 13:00:15 -04:00
Daniel Nephin	010a609912	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	15e7b3940c	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	e2cfa93f02	Don’t leak metrics go routines in tests (#8182 )	2020-06-24 10:15:25 -04:00
gitforbit	808f632346	agent-http: cleanup: return nil instead of err (#8043 ) Since err is already checked, it should return `nil`	2020-06-24 14:29:21 +02:00
R.B. Boyer	c63c994b04	connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165 )	2020-06-23 15:19:56 -05:00
freddygv	c791fbc79c	Update namespaces subject-verb agreement	2020-06-23 10:57:30 -06:00
freddygv	044d027ff8	Remove break	2020-06-22 19:59:04 -06:00
freddygv	70810b0602	Let users know namespaces are ent only in config entry decode	2020-06-22 19:59:04 -06:00
Pierre Souchay	35d852fd9a	Returns DNS Error NXDOMAIN when DC does not exist (#8103 ) This will allow to increase cache value when DC is not valid (aka return SOA to avoid too many consecutive requests) and will distinguish DC being temporarily not available from DC not existing. Implements https://github.com/hashicorp/consul/issues/8102	2020-06-22 09:01:48 -04:00
Matt Keeler	4a5b352c18	Require enabling TLS to enable Auto Config (#8159 ) On the servers they must have a certificate. On the clients they just have to set verify_outgoing to true to attempt TLS connections for RPCs. Eventually we may relax these restrictions but right now all of the settings we push down (acl tokens, acl related settings, certificates, gossip key) are sensitive and shouldn’t be transmitted over an unencrypted connection. Our guides and docs should recommend verify_server_hostname on the clients as well. Another reason to do this is weird things happen when making an insecure RPC when TLS is not enabled. Basically it tries TLS anyways. We should probably fix that to make it clearer what is going on.	2020-06-19 16:38:14 -04:00
Freddy	5baa7b1b04	Always return a gateway cluster (#8158 )	2020-06-19 13:31:39 -06:00
Matt Keeler	d6e05482ab	Allow cancelling startup when performing auto-config (#8157 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	4f17350928	Merge pull request #8147 from hashicorp/dnephin/remove-private-ip-2 Remove some dead code from agent/consul/util.go	2020-06-18 15:51:09 -04:00
Matt Keeler	b0fcf86140	Change auto config authorizer to allow for future extension The envisioned changes would allow extra settings to enable dynamically defined auth methods to be used instead of or in addition to the statically defined one in the configuration.	2020-06-18 15:22:24 -04:00
Daniel Nephin	b0ba546a1f	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	a00f007c5e	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	3dbbd2d37d	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	8b7d669a27	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	51c3a605ad	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79a862d019	Remove ACLEnforceVersion8 from tests (#8138 ) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	692a4a8fc8	Merge pull request #7762 from hashicorp/dnephin/warn-on-unknown-service-file config: warn if a config file is being skipped because of its file extension	2020-06-17 15:14:40 -04:00
Daniel Nephin	be29d6bf75	config: warn when a config file is skipped All commands which read config (agent, services, and validate) will now print warnings when one of the config files is skipped because it did not match an expected format. Also ensures that config validate prints all warnings.	2020-06-17 13:08:54 -04:00
Daniel Nephin	5afcf5c1bc	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	9b01f9423c	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	d31691dc87	gossip: Ensure that metadata of Consul Service is updated (#7903 ) While upgrading servers to a new version, I saw that metadata of existing servers are not upgraded, so the version and raft meta is not up to date in catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate That's because the optimization to avoid updating catalog does not take into account metadata, so no update on catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	d345cd8d30	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00
Daniel Nephin	a9851e1812	Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify ci: Enable gofmt simplify	2020-06-16 17:18:38 -04:00
Matt Keeler	9f7b22a5eb	Agent Auto Configuration: Configuration Syntax Updates (#8003 )	2020-06-16 15:03:22 -04:00
Daniel Nephin	068b43df90	Enable gofmt simplify Code changes done automatically with 'gofmt -s -w'	2020-06-16 13:21:11 -04:00
Daniel Nephin	cb050b280c	ci: enable SA4006 staticcheck check And fix the 'value not used' issues. Many of these are not bugs, but a few are tests not checking errors, and one appears to be a missed error in non-test code.	2020-06-16 13:10:11 -04:00
Daniel Nephin	f7c84ad802	Rename txnWrapper to txn	2020-06-16 13:06:02 -04:00
Daniel Nephin	32aa3ada35	Rename db	2020-06-16 13:04:31 -04:00
Daniel Nephin	deef6fcc32	Handle return value from txn.Commit	2020-06-16 13:04:31 -04:00
Daniel Nephin	59bac0f99d	state: Update docstrings for changeTrackerDB and txn And un-embed memdb.DB to prevent accidental access to underlying methods.	2020-06-16 13:04:31 -04:00
Paul Banks	f6ac08be04	state: track changes so that they may be used to produce change events	2020-06-16 13:04:29 -04:00
Matt Keeler	d3881dd754	ACL Node Identities (#7970 ) A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy). Half of this commit is for golden file based tests of the acl token and role cli output. Another big update was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boiler plate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.	2020-06-16 12:54:27 -04:00
Daniel Nephin	476b57fe22	config: refactor to consolidate all File->Source loading Previously the logic for reading ConfigFiles and producing Sources was split between NewBuilder and Build. This commit moves all of the logic into NewBuilder so that Build() can operate entirely on Sources. This change is in preparation for logging warnings when files have an unsupported extension. It also reduces the scope of BuilderOpts, and gets us very close to removing Builder.options.	2020-06-16 12:52:23 -04:00
Daniel Nephin	219790ca49	config: Make ConfigFormat not a pointer The nil value was never used. We can avoid a bunch of complications by making the field a string value instead of a pointer. This change is in preparation for fixing a silent config failure.	2020-06-16 12:52:22 -04:00
Daniel Nephin	77101eee82	config: rename Flags to BuilderOpts Flags is an overloaded term in this context. It generally is used to refer to command line flags. This struct, however, is a data object used as input to the construction. It happens to be partially populated by command line flags, but otherwise has very little to do with them. Renaming this struct should make the actual responsibility of this struct more obvious, and remove the possibility that it is confused with command line flags. This change is in preparation for adding additional fields to BuilderOpts.	2020-06-16 12:51:19 -04:00
Daniel Nephin	85e0338136	config: remove Args field from Flags This field was populated for one reason, to test that it was empty. Of all the callers, only a single one used this functionality. The rest constructed a `Flags{}` struct which did not set Args. I think this shows that the logic was in the wrong place. Only the agent command needs to care about validating the args. This commit removes the field, and moves the logic to the one caller that cares. Also fix some comments.	2020-06-16 12:49:53 -04:00
Daniel Nephin	73cd0b6fac	agent/service_manager: remove 'updateCh' field from serviceConfigWatch Passing the channel to the function which uses it significantly reduces the scope of the variable, and makes its usage more explicit. It also moves the initialization of the channel closer to where it is used. Also includes a couple very small cleanups to remove a local var and read the error from `ctx.Err()` directly instead of creating a channel to check for an error.	2020-06-16 12:15:57 -04:00
Daniel Nephin	26291a8482	agent/service_manager: remove 'defaults' field from serviceConfigWatch This field was always read by the same function that populated the field, so it does not need to be a field. Passing the value as an argument to functions makes it more obvious where the value comes from, and also reduces the scope of the variable significantly.	2020-06-16 12:15:52 -04:00
Daniel Nephin	93235da253	agent/service_manager: Pass ctx around [The documentation for context](https://golang.org/pkg/context/) recommends not storing context in a struct field: > Do not store Contexts inside a struct type; instead, pass a Context > explicitly to each function that needs it. The Context should be the > first parameter, typically named ctx... Sometimes there are good reasons to not follow this recommendation, but in this case it seems easy enough to follow. Also moved the ctx argument to be the first in one of the function calls to follow the same recommendation.	2020-06-16 12:14:00 -04:00
Daniel Nephin	2eac5b8023	Merge pull request #8074 from hashicorp/dnephin/remove-references-to-PatchSliceOfMaps Update comments that reference PatchSliceOfMaps	2020-06-15 14:33:10 -04:00
Matt Keeler	8837907de4	Make the Agent Cache more Context aware (#8092 ) Blocking queries issues will still be uncancellable (that cannot be helped until we get rid of net/rpc). However this makes it so that if calling getWithIndex (like during a cache Notify go routine) we can cancell the outer routine. Previously it would keep issuing more blocking queries until the result state actually changed.	2020-06-15 11:01:25 -04:00
freddygv	d97cff0966	Update telemetry for gateway-services endpoint	2020-06-12 14:44:36 -06:00
freddygv	cd927eed5e	Remove unused method and fixup docs ref	2020-06-12 13:47:43 -06:00
freddygv	0f97b7d63d	Fixup stray sid references	2020-06-12 13:47:43 -06:00
freddygv	19e3954603	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
freddygv	e7b52d35d4	Create HTTP endpoint	2020-06-12 13:46:47 -06:00
freddygv	15c74d6943	Move GatewayServices out of Internal	2020-06-12 13:46:47 -06:00
Freddy	166a8b2a58	Only pass one hostname via EDS and prefer healthy ones (#8084 ) Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Currently when passing hostname clusters to Envoy, we set each service instance registered with Consul as an LbEndpoint for the cluster. However, Envoy can only handle one per cluster: [2020-06-04 18:32:34.094][1][warning][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Error adding/updating cluster(s) dc2.internal.ddd90499-9b47-91c5-4616-c0cbf0fc358a.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint, server.dc2.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint Envoy is currently handling this gracefully by only picking one of the endpoints. However, we should avoid passing multiple to avoid these warning logs. This PR: * Ensures we only pass one endpoint, which is tied to one service instance. * We prefer sending an endpoint which is marked as Healthy by Consul. * If no endpoints are healthy we emit a warning and skip the cluster. * If multiple unique hostnames are spread across service instances we emit a warning and let the user know which will be resolved.	2020-06-12 13:46:17 -06:00
Chris Piraino	6fa48c9512	Allow users to set hosts to the wildcard specifier when TLS is disabled (#8083 ) This allows easier demoing/testing of ingress gateways, while still preserving the validation we have for DNSSANs	2020-06-11 10:03:06 -05:00
Chris Piraino	91ab89dd48	Move ingress param to a new endpoint (#8081 ) In discussion with team, it was pointed out that query parameters tend to be filter mechanism, and that semantically the "/v1/health/connect" endpoint should return "all healthy connect-enabled endpoints (e.g. could be side car proxies or native instances) for this service so I can connect with mTLS". That does not fit an ingress gateway, so we remove the query parameter and add a new endpoint "/v1/health/ingress" that semantically means "all the healthy ingress gateway instances that I can connect to to access this connect-enabled service without mTLS"	2020-06-10 13:07:15 -05:00
Daniel Nephin	8ec029ae6a	Update comments that reference PatchSliceOfMaps To reference decode.HookWeakDecodeFromSlice instead. Also removes a step from the adding config fields checklist which is no longer necessary.	2020-06-09 17:43:05 -04:00
Chris Piraino	496e683360	Merge pull request #8064 from hashicorp/ingress/health-query-param Add API query parameter ?ingress to allow users to find ingress gateways associated to a service	2020-06-09 16:08:28 -05:00
Chris Piraino	c1d329c5dd	Remove TODO note about ingress API, it is done!	2020-06-09 14:58:30 -05:00
Chris Piraino	ca41f80493	Set connect or ingress boolean after checking for query param	2020-06-09 14:45:21 -05:00
Daniel Nephin	08f1ed16b4	Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2 intentions: fix a bug in Intention.SetHash	2020-06-09 15:40:20 -04:00
Daniel Nephin	62a1125c7b	Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5 ci: Enabled SA2002 staticcheck check	2020-06-09 15:31:24 -04:00
Hans Hasselberg	242994a016	acl: do not resolve local tokens from remote dcs (#8068 )	2020-06-09 21:13:09 +02:00
Kyle Havlovitz	0c8966220f	Merge pull request #8040 from hashicorp/ingress/expose-cli Ingress expose CLI command	2020-06-09 12:11:23 -07:00
Chris Piraino	3c037d9b96	Add ?ingress query parameter on /v1/health/connect Refactor boolean query parameter logic from ?passing value to re-use with ingress	2020-06-09 11:44:31 -05:00
Daniel Nephin	c66c533d73	Merge pull request #7964 from hashicorp/dnephin/remove-patch-slice-of-maps-forward-compat config: Use HookWeakDecodeFromSlice in place of PatchSliceOfMaps	2020-06-08 19:53:04 -04:00
Daniel Nephin	75cbbe2702	config: add HookWeakDecodeFromSlice Currently opaque config blocks (config entries, and CA provider config) are modified by PatchSliceOfMaps, making it impossible for these opaque config sections to contain slices of maps. In order to fix this problem, any lazy-decoding of these blocks needs to support weak decoding of []map[string]interface{} to a struct type before PatchSliceOfMaps is replaces. This is necessary because these config blobs are persisted, and during an upgrade an older version of Consul could read one of the new configuration values, which would cause an error. To support the upgrade path, this commit first introduces the new hooks for weak decoding of []map[string]interface{} and uses them only in the lazy-decode paths. That way, in a future release, new style configuration will be supported by the older version of Consul. This decode hook has a number of advantages: 1. It no longer panics. It allows mapstructure to report the error 2. It no longer requires the user to declare which fields are slices of structs. It can deduce that information from the 'to' value. 3. It will make it possible to preserve opaque configuration, allowing for structured opaque config.	2020-06-08 17:05:09 -04:00
Hans Hasselberg	98eea08d3b	Tokens converted from legacy ACLs get their Hash computed (#8047 ) * Fixes #5606: Tokens converted from legacy ACLs get their Hash computed This allows new style token replication to work for legacy tokens as well when they change. * tests: fix timestamp comparison Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2020-06-08 21:44:06 +02:00
Chris Piraino	1a853fc954	Always require Host header values for http services (#7990 ) Previously, we did not require the 'service-name.' host header value when on a single http service was exposed. However, this allows a user to get into a situation where, if they add another service to the listener, suddenly the previous service's traffic might not be routed correctly. Thus, we always require the Host header, even if there is only 1 service. Also, we add the make the default domain matching more restrictive by matching "service-name.ingress." by default. This lines up better with the namespace case and more accurately matches the Consul DNS value we expect people to use in this case.	2020-06-08 13:16:24 -05:00
Hans Hasselberg	c7e6c9ebec	http: use default minsize for gzip handler. (#7354 ) Fixes #6306	2020-06-08 10:10:08 +02:00
Hans Hasselberg	72f92ae7ca	agent: add option to disable agent cache for HTTP endpoints (#8023 ) This allows the operator to disable agent caching for the http endpoint. It is on by default for backwards compatibility and if disabled will ignore the url parameter `cached`.	2020-06-08 10:08:12 +02:00
Kyle Havlovitz	b874c8ef0c	Add connect expose CLI command	2020-06-05 14:54:29 -07:00
Daniel Nephin	caa692deea	ci: Enabled SA2002 staticcheck check And handle errors in the main test goroutine	2020-06-05 17:50:11 -04:00
Hans Hasselberg	5281cb74db	Setup intermediate_pki_path on secondary when using vault (#8001 ) Make sure to mount vault backend for intermediate_pki_path on secondary dc.	2020-06-05 21:36:22 +02:00
Daniel Nephin	ce6cc094a1	intentions: fix a bug in Intention.SetHash Found using staticcheck. binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash. Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used. Removing these would break backwards compatibility, so they are left in place for now. Co-authored-by: Hans Hasselberg <me@hans.io>	2020-06-05 14:51:43 -04:00
R.B. Boyer	9cfa4a3fc9	tests: ensure that the ServiceExists helper function normalizes entmeta (#8025 ) This fixes a unit test failure over in enterprise due to https://github.com/hashicorp/consul/pull/7384	2020-06-05 10:41:39 +02:00
R.B. Boyer	b88bd6660e	server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014 )	2020-06-04 16:05:27 -05:00
Hans Hasselberg	dfcf45c6cf	tests: use constructor instead init (#8024 )	2020-06-04 22:59:06 +02:00
Pierre Souchay	9813ae512b	checks: when a service does not exists in an alias, consider it failing (#7384 ) In current implementation of Consul, check alias cannot determine if a service exists or not. Because a service without any check is semantically considered as passing, so when no healthchecks are found for an agent, the check was considered as passing. But this make little sense as the current implementation does not make any difference between: * a non-existing service (passing) * a service without any check (passing as well) In order to make it work, we have to ensure that when a check did not find any healthcheck, the service does indeed exists. If it does not, lets consider the check as failing.	2020-06-04 14:50:52 +02:00
Hans Hasselberg	0f343332da	Merge pull request #7966 from hashicorp/pool_improvements Agent connection pool cleanup	2020-06-04 08:56:26 +02:00
Freddy	9ed325ba8b	Enable gateways to resolve hostnames to IPv4 addresses (#7999 ) The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option. If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS. Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.	2020-06-03 15:28:45 -06:00
Matt Keeler	771c613dae	Fix legacy management tokens in unupgraded secondary dcs (#7908 ) The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.	2020-06-03 11:22:22 -04:00
Matt Keeler	0e4c65d422	Fix segfault due to race condition for checking server versions (#7957 ) The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.	2020-06-03 10:36:32 -04:00
Daniel Nephin	99eb583ebc	Replace goe/verify.Values with testify/require.Equal (#7993 ) * testing: replace most goe/verify.Values with require.Equal One difference between these two comparisons is that go/verify considers nil slices/maps to be equal to empty slices/maps, where as testify/require does not, and does not appear to provide any way to enable that behaviour. Because of this difference some expected values were changed from empty slices to nil slices, and some calls to verify.Values were left. * Remove github.com/pascaldekloe/goe/verify Reduce the number of assertion packages we use from 2 to 1	2020-06-02 12:41:25 -04:00
Alvin Huang	80c34f0461	Merge pull request #7956 from hashicorp/update-master-to-1.8.0-beta2 Update master to 1.8.0 beta2	2020-06-01 16:52:19 -04:00
R.B. Boyer	833211c14c	acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899 )	2020-06-01 11:44:47 -05:00
R.B. Boyer	ffb9c7d6f7	acl: remove the deprecated `acl_enforce_version_8` option (#7991 ) Fixes #7292	2020-05-29 16:16:03 -05:00
Jono Sosulska	c554ba9e10	Replace whitelist/blacklist terminology with allowlist/denylist (#7971 ) * Replace whitelist/blacklist terminology with allowlist/denylist	2020-05-29 14:19:16 -04:00
Hans Hasselberg	1fbc1d4777	pool: remove timeout parameter Timeout was never used in a meaningful way by callers, which is why it is now entirely internal to the pool.	2020-05-29 08:21:28 +02:00
Hans Hasselberg	ad03f863ff	pool: remove useTLS and ForceTLS In the past TLS usage was enforced with these variables, but these days this decision is made by TLSConfigurator and there is no reason to keep using the variables.	2020-05-29 08:21:24 +02:00
Hans Hasselberg	c45432014b	pool: remove version The version field has been used to decide which multiplexing to use. It was introduced in `2457293dce`. But this is 6y ago and there is no need for this differentiation anymore.	2020-05-28 23:06:01 +02:00
hashicorp-ci	cd617dbfa9	update bindata_assetfs.go	2020-05-28 14:39:37 -04:00
hashicorp-ci	b3ca11fb0b	update bindata_assetfs.go	2020-05-28 14:39:28 -04:00
Daniel Nephin	c88fae0aac	ci: Add staticcheck and fix most errors Three of the checks are temporarily disabled to limit the size of the diff, and allow us to enable all the other checks in CI. In a follow up we can fix the issues reported by the other checks one at a time, and enable them.	2020-05-28 11:59:58 -04:00
Daniel Nephin	4f2bff174d	Merge pull request #7963 from hashicorp/dnephin/replace-lib-translate-keys Replace lib.TranslateKeys with a mapstructure decode hook	2020-05-27 16:51:26 -04:00
Daniel Nephin	6a2d7d77c0	config: use the new HookTranslateKeys instead of lib.TranslateKeys With the exception of CA provider config, which will be migrated at some later time.	2020-05-27 16:24:47 -04:00
Daniel Nephin	8ced4300c8	Add alias struct tags for new decode hook	2020-05-27 16:24:47 -04:00
R.B. Boyer	77f2e54618	create lib/stringslice package (#7934 )	2020-05-27 11:47:32 -05:00
R.B. Boyer	ddd0a13e27	agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931 ) The main fix here is to always union the `primary-gateways` list with the list of mesh gateways in the primary returned from the replicated federation states list. This will allow any replicated (incorrect) state to be supplemented with user-configured (correct) state in the config file. Eventually the game of random selection whack-a-mole will pick a winning entry and re-replicate the latest federation states from the primary. If the user-configured state is actually the incorrect one, then the same eventual correct selection process will work in that case, too. The secondary fix is actually to finish making wanfed-via-mgws actually work as originally designed. Once a secondary datacenter has replicated federation states for the primary AND managed to stand up its own local mesh gateways then all of the RPCs from a secondary to the primary SHOULD go through two sets of mesh gateways to arrive in the consul servers in the primary (one hop for the secondary datacenter's mesh gateway, and one hop through the primary datacenter's mesh gateway). This was neglected in the initial implementation. While everything works, ideally we should treat communications that go around the mesh gateways as just provided for bootstrapping purposes. Now we heuristically use the success/failure history of the federation state replicator goroutine loop to determine if our current mesh gateway route is working as intended. If it is, we try using the local gateways, and if those don't work we fall back on trying the primary via the union of the replicated state and the go-discover configuration flags. This can be improved slightly in the future by possibly initializing the gateway choice to local on startup if we already have replicated state. This PR does not address that improvement. Fixes #7339	2020-05-27 11:31:10 -05:00
Raphaël Rondeau	0d2f178b7b	connect: fix endpoints clusterName when using cluster escape hatch (#7319 ) ```changelog * fix(connect): fix endpoints clusterName when using cluster escape hatch ```	2020-05-26 10:57:22 +02:00
Pierre Souchay	d6649e42af	Stop all watches before shuting down anything dring shutdown. (#7526 ) This will prevent watches from being triggered. ```changelog * fix(agent): stop all watches before shuting down ```	2020-05-26 10:01:49 +02:00
R.B. Boyer	1b5023cb69	connect: ensure proxy-defaults protocol is used for upstreams (#7938 )	2020-05-21 16:08:39 -05:00
Kyle Havlovitz	b14696e32a	Standardize support for Tagged and BindAddresses in Ingress Gateways (#7924 ) * Standardize support for Tagged and BindAddresses in Ingress Gateways This updates the TaggedAddresses and BindAddresses behavior for Ingress to match Mesh/Terminating gateways. The `consul connect envoy` command now also allows passing an address without a port for tagged/bind addresses. * Update command/connect/envoy/envoy.go Co-authored-by: Freddy <freddygv@users.noreply.github.com> * PR comments * Check to see if address is an actual IP address * Update agent/xds/listeners.go Co-authored-by: Freddy <freddygv@users.noreply.github.com> * fix whitespace Co-authored-by: Chris Piraino <cpiraino@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2020-05-21 09:08:12 -05:00
Daniel Nephin	03291943e1	Merge pull request #7933 from hashicorp/dnephin/state-txn-missing-errors state: fix unhandled error	2020-05-20 17:00:20 -04:00
Daniel Nephin	04bf0f3490	Update agent/consul/state/catalog.go Co-authored-by: Hans Hasselberg <me@hans.io>	2020-05-20 16:34:14 -04:00
Seth Hoenig	44ee818d46	grpc: use default resolver scheme for grpc dialing (#7617 ) Currently checks of type gRPC will emit log messages such as, 2020/02/12 13:48:22 [INFO] parsed scheme: "" 2020/02/12 13:48:22 [INFO] scheme "" not registered, fallback to default scheme Without adding full support for using custom gRPC schemes (maybe that's right long-term path) we can just supply the default scheme as provided by the grpc library. Fixes https://github.com/hashicorp/consul/issues/7274 and https://github.com/hashicorp/nomad/issues/7415	2020-05-20 22:26:26 +02:00
Daniel Nephin	3f607d9ef0	state: use an error to indicate compare failed Errors are values. We can use the error value to identify the 'comparison failed' case which makes the function easier to use and should make it harder to miss handle the error case	2020-05-20 12:43:33 -04:00
Pierre Souchay	5c7af90154	tests: added unit test to ensure watches are not re-triggered on consul reload (#7449 ) This ensures no regression about https://github.com/hashicorp/consul/issues/7318 And ensure that https://github.com/hashicorp/consul/issues/7446 cannot happen anymore	2020-05-20 12:38:29 +02:00
Pierre Souchay	e9d176db2a	Allow to restrict servers that can join a given Serf Consul cluster. (#7628 ) Based on work done in https://github.com/hashicorp/memberlist/pull/196 this allows to restrict the IP ranges that can join a given Serf cluster and be a member of the cluster. Restrictions on IPs can be done separatly using 2 new differents flags and config options to restrict IPs for LAN and WAN Serf.	2020-05-20 11:31:19 +02:00
Daniel Nephin	1bbea2751f	consul/state: refactor tnxService to avoid missed cases Handling errors at the end of a log switch/case block is somewhat brittle. This block included a couple cases where errors were ignored, but it was not obvious the way it was written. This change moves all error handling into each case block. There is still potentially one case where err is ignored, which will be handled in a follow up.	2020-05-19 16:50:14 -04:00
Daniel Nephin	9f27d61bee	Remove unused var The usage was removed in `8e22d80e35`, however it seems there may be a bug here because the cluster name is not updated when the target changes.	2020-05-19 16:50:14 -04:00
Daniel Nephin	4f738b462b	Handle error from template.Execute Refactored the function to make the problem more obvious, by using a guard.	2020-05-19 16:50:14 -04:00
Daniel Nephin	c662f0f0de	Fix a number of problems found by staticcheck Some of these problems are minor (unused vars), but others are real bugs (ignored errors). Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-05-19 16:50:14 -04:00
Daniel Nephin	5c99109dd9	Remove unused var The usage of this var was removed in `b92f895c23`. Found by using staticcheck	2020-05-19 16:50:14 -04:00
Chris Piraino	9d9e23cc44	Add service id context to the proxycfg logger This is especially useful when multiple proxies are all querying the same Consul agent.	2020-05-18 09:08:05 -05:00
Chris Piraino	79468793f6	Do not return an error if requested service is not a gateway This commit converts the previous error into just a Warn-level log message. By returning an error when the requested service was not a gateway, we did not appropriately update envoy because the cache Fetch returned an error and thus did not propagate the update through proxycfg and xds packages.	2020-05-18 09:08:04 -05:00
Aleksandr Zagaevskiy	a75e3d9051	Preserve ModifyIndex for unchanged entry in KVS TXN (#7832 )	2020-05-14 13:25:04 -06:00
Pierre Souchay	cf55e81c06	tests: fix unstable test `TestAgentAntiEntropy_Checks`. (#7594 ) Example of failure: https://circleci.com/gh/hashicorp/consul/153932#tests/containers/2	2020-05-14 09:54:49 +02:00
Kit Patella	ad1d4d4d07	http: migrate from instrumentation in s.wrap() to an s.enterpriseHandler()	2020-05-13 15:47:05 -07:00
Matt Keeler	acccdbe45c	Fix identity resolution on clients and in secondary dcs (#7862 ) Previously this happened to be using the method on the Server/Client that was meant to allow the ACLResolver to locally resolve tokens. On Servers that had tokens (primary or secondary dc + token replication) this function would lookup the token from raft and return the ACLIdentity. On clients this was always a noop. We inadvertently used this function instead of creating a new one when we added logging accessor ids for permission denied RPC requests. With this commit, a new method is used for resolving the identity properly via the ACLResolver which may still resolve locally in the case of being on a server with tokens but also supports remote token resolution.	2020-05-13 13:00:08 -04:00
Chris Piraino	7a7760bfd5	Make new gateway tests compatible with enterprise (#7856 )	2020-05-12 13:48:20 -05:00
Daniel Nephin	600645b5f9	Add unconvert linter To find unnecessary type convertions	2020-05-12 13:47:25 -04:00
Drew Bailey	c9d0b83277	Value is already an int, remove type cast	2020-05-12 13:13:09 -04:00
Daniel Nephin	9d5ab443a7	Merge pull request #7689 from hashicorp/dnephin/remove-deadcode-1 Remove some dead code	2020-05-12 12:33:59 -04:00
Daniel Nephin	47238a693d	Merge pull request #7819 from hashicorp/dnephin/remove-t.Parallel-1 test: Remove t.Parallel() from agent/structs tests	2020-05-12 12:11:57 -04:00
R.B. Boyer	1efafd7523	acl: add auth method for JWTs (#7846 )	2020-05-11 20:59:29 -05:00
Kit Patella	58ee349a83	Merge pull request #7843 from hashicorp/oss-sync/auditing-config agent/config: Fix tests & include Audit struct as a pointer on Config	2020-05-11 14:23:44 -07:00
Kit Patella	10b3478a4d	agent/config: include Audit struct as a pointer on Config, fix tests	2020-05-11 14:13:05 -07:00
Kit Patella	b5564751bf	Merge pull request #7841 from hashicorp/oss-sync/auditing-config OSS sync - Auditing config	2020-05-11 13:44:38 -07:00
Kit Patella	f5030957d0	agent/config: add auditing config to OSS and add to enterpriseConfigMap exclusions	2020-05-11 13:27:35 -07:00
Chris Piraino	c21052457b	Return early from updateGatewayServices if nothing to update (#7838 ) * Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table * PR comment and better formatting	2020-05-11 14:46:48 -05:00
Chris Piraino	4d6751bf16	Fix TestInternal_GatewayServiceDump_Ingress (#7840 ) Protocol was added as a field on GatewayServices after GatewayServiceDump PR branch was created.	2020-05-11 14:46:31 -05:00
R.B. Boyer	7414a3fa53	cli: ensure 'acl auth-method update' doesn't deep merge the Config field (#7839 )	2020-05-11 14:21:17 -05:00
Chris Piraino	74c0543ef2	PR comment and better formatting	2020-05-11 14:04:59 -05:00
Chris Piraino	fb9ee9d892	Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table	2020-05-11 12:38:04 -05:00
Freddy	b3ec383d04	Gateway Services Nodes UI Endpoint (#7685 ) The endpoint supports queries for both Ingress Gateways and Terminating Gateways. Used to display a gateway's linked services in the UI.	2020-05-11 11:35:17 -06:00
Kyle Havlovitz	136549205c	Merge pull request #7759 from hashicorp/ingress/tls-hosts Add TLS option for Ingress Gateway listeners	2020-05-11 09:18:43 -07:00
Kyle Havlovitz	8d140ce9af	Disallow the blanket wildcard prefix from being used as custom host	2020-05-08 20:24:18 -07:00
Chris Piraino	a0e1f57ac2	Remove development log line	2020-05-08 20:24:18 -07:00
Chris Piraino	26f92e74f6	Compute all valid DNSSANs for ingress gateways For DNSSANs we take into account the following and compute the appropriate wildcard values: - source datacenter - namespaces - alt domains	2020-05-08 20:23:17 -07:00
Daniel Nephin	5655d7f34e	Add outlier_detection check to integration test Fix decoding of time.Duration types.	2020-05-08 14:56:57 -04:00
Daniel Nephin	eaa05d623a	xds: Add passive health check config for upstreams	2020-05-08 14:56:57 -04:00
Chris Piraino	429d0cedd2	Restoring config entries updates the gateway-services table (#7811 ) - Adds a new validateConfigEntryEnterprise function - Also fixes some state store tests that were failing in enterprise	2020-05-08 13:24:33 -05:00
Daniel Nephin	e60bb9f102	test: Remove t.Parallel() from agent/structs tests go test will only run tests in parallel within a single package. In this case the package test run time is exactly the same with or without t.Parallel() (~0.7s). In generally we should avoid t.Parallel() as it causes a number of problems with `go test` not reporting failure messages correctly. I encountered one of these problems, which is what prompted this change. Since `t.Parallel` is not providing any benefit in this package, this commit removes it. The change was automated with: git grep -l 't.Parallel' | xargs sed -i -e '/t.Parallel/d'	2020-05-08 14:06:10 -04:00
Freddy	c32a4f1ece	Fix up enterprise compatibility for gateways (#7813 )	2020-05-08 09:44:34 -06:00


2629 Commits