consul

Commit Graph

Author	SHA1	Message	Date
Daniel Nephin	0bb9c318b7	http: fix tests incorrectly using HTTPAddr to get the address of the https server. In #8234 I changed a few tests to use TestAgent.HTTPAddr() to find the addr used in the test. Due to the way HTTPAddr() was implemented these tests were passing, but I think the pass was incidental. HTTPAddr() was not matching any servers, and was instead returning the last server, which happened to be the one these tests wanted. This commit fixes the implementation of HTTPAddr to panic if no match was found. The tests which require an HTTPS server are changed to use a new firstAddr() to look up the correct address.	2020-09-04 15:29:17 -04:00
freddygv	403a180430	Set tgw filter router config name to cluster name	2020-09-04 12:45:05 -06:00
Hans Hasselberg	436a7032d1	secondaryIntermediateCertRenewalWatch abort on success (#8588) secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to renew the intermediate certificate. Once it entered the inner loop and started `retryLoopBackoff` it would never leave that. `retryLoopBackoffAbortOnSuccess` will return when renewing is successful, like it was intended originally.	2020-09-04 11:47:16 +02:00
freddygv	959d9913b8	Add server receiver to routes and log tgw err	2020-09-03 16:19:58 -06:00
Daniel Nephin	ed4b51f1ae	Merge pull request #8357 from hashicorp/streaming/add-service-health-events streaming: add ServiceHealth events	2020-09-03 17:53:56 -04:00
Daniel Nephin	4c9ed41eab	Merge pull request #8554 from hashicorp/dnephin/agent-setup-persisted-tokens agent: move token persistence from agent into token.Store	2020-09-03 17:29:21 -04:00
Daniel Nephin	e573e64d58	state: handle terminating gateways in service health events	2020-09-03 16:58:05 -04:00
Daniel Nephin	3775392fb5	state: improve comments in catalog_events.go Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:05 -04:00
Daniel Nephin	417c5c93a8	state: use changeType in serviceChanges To be a little more explicit, instead of nil implying an indirect change	2020-09-03 16:58:05 -04:00
Daniel Nephin	01424ba146	don't over allocate slice	2020-09-03 16:58:04 -04:00
Daniel Nephin	d210242875	state: fix a bug in building service health events The nodeCheck slice was being used as the first arg in append, which in some cases will modify the array backing the slice. This would lead to service checks for other services in the wrong event. Also refactor some things to reduce the arguments to functions.	2020-09-03 16:58:04 -04:00
Daniel Nephin	7581305523	state: Remove unused args and return values Also rename some functions to identify them as constructors for events	2020-09-03 16:58:04 -04:00
Daniel Nephin	27b02d391c	state: use an enum for tracking node changes	2020-09-03 16:58:04 -04:00
Daniel Nephin	09329b542d	state: serviceHealthSnapshot refactored to remove unused return value and remove duplication	2020-09-03 16:58:04 -04:00
Daniel Nephin	bf523420ee	state: Add Change processor and snapshotter for service health Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:04 -04:00
Daniel Nephin	e03e911144	state: fix bug in changeTrackerDB.publish Creating a new readTxn does not work because it will not see the newly created objects that are about to be committed. Instead use the active write Txn.	2020-09-03 16:58:01 -04:00
Daniel Nephin	5de4d5bbe3	stream: have SnapshotFunc accept a non-pointer SubscribeRequest The value is not expected to be modified. Passing a value makes that explicit.	2020-09-03 16:54:02 -04:00
freddygv	cd4cf5161f	Update resolver defaulting	2020-09-03 13:08:44 -06:00
freddygv	00f2794bfa	Update golden files after default route fix for tgw	2020-09-03 12:35:11 -06:00
Daniel Nephin	6ca45e1a61	agent: add apiServers type for managing HTTP servers Remove Server field from HTTPServer. The field is no longer used.	2020-09-03 13:40:12 -04:00
freddygv	318aa094fd	Fix http assertion in route creation	2020-09-03 10:21:20 -06:00
freddygv	30ba080d25	Add explicit protocol overrides in tgw xds test cases	2020-09-03 08:57:48 -06:00
freddygv	eaa250cc80	Ensure resolver node with LB isn't considered default	2020-09-03 08:55:57 -06:00
freddygv	ef877449ce	Move valid policies to pkg level	2020-09-02 15:49:03 -06:00
freddygv	f81fe6a1a1	Remove LB infix and move injection to xds	2020-09-02 15:13:50 -06:00
R.B. Boyer	119e945c3e	connect: all config entries pick up a meta field (#8596) Fixes #8595	2020-09-02 14:10:25 -05:00
Chris Piraino	28f163c2d2	Merge pull request #8603 from hashicorp/feature/usage-metrics Track node and service counts in the state store and emit them periodically as metrics	2020-09-02 13:23:39 -05:00
R.B. Boyer	d0f74cd1e8	connect: fix bug in preventing some namespaced config entry modifications (#8601) Whenever an upsert/deletion of a config entry happens, within the open state store transaction we speculatively test compile all discovery chains that may be affected by the pending modification to verify that the write would not create an erroneous scenario (such as splitting traffic to a subset that did not exist). If a single discovery chain evaluation references two config entries with the same kind and name in different namespaces then sometimes the upsert/deletion would be falsely rejected. It does not appear as though this bug would've let invalid writes through to the state store so the correction does not require a cleanup phase.	2020-09-02 10:47:19 -05:00
Chris Piraino	bcb586bee2	Set metrics reporting interval to 9 seconds This is below the 10 second interval that lib/telemetry.go implements as its aggregation interval, ensuring that we always report these metrics.	2020-09-02 10:24:23 -05:00
Chris Piraino	a3028cad89	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	d301145e62	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	086a8ea8eb	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	69dbc926ad	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	3feae7f77b	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	04705e90f9	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	63f79e5f9b	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Daniel Nephin	f1a41318d7	token: OSS support for enterprise tokens	2020-08-31 15:10:15 -04:00
Daniel Nephin	629e4aaa65	config: use token.Config for ACLToken config Using the target Config struct reduces the amount of copying and translating of configuration structs.	2020-08-31 15:10:15 -04:00
Daniel Nephin	330be5b740	agent/token: Move token persistence out of agent And into token.Store. This change isolates any awareness of token persistence in a single place. It is a small step in allowing Agent.New to accept its dependencies.	2020-08-31 15:00:34 -04:00
Daniel Nephin	a80de898ea	fix TestStore_RegularTokens This test was only passing because t.Parallel was causing every subtest to run with the last value in the iteration, which sets a value for all tokens. The test started to fail once t.Parallel was removed, but the same failure could have been produced by adding 'tt := tt' to the t.Run() func. These tests run in under 10ms, so there is no reason to use t.Parallel.	2020-08-31 14:59:14 -04:00
Matt Keeler	91d680b830	Merge of auto-config and auto-encrypt code (#8523) auto-encrypt is now handled as a special case of auto-config. This also is moving all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	0236e169bb	Add documentation for resolver LB cfg	2020-08-28 14:46:13 -06:00
freddygv	28d0602fc1	Pass LB config to Envoy via xDS	2020-08-28 14:27:40 -06:00
freddygv	2bbbd9e1da	Log error as error	2020-08-28 13:11:55 -06:00
freddygv	81115b6eaa	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	6956477be5	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
Daniel Nephin	72bf350069	Merge pull request #8552 from pierresouchay/reload_cache_throttling_config Ensure that Cache options are reloaded when `consul reload` is performed	2020-08-28 15:04:42 -04:00
Pierre Souchay	d5974b1d17	Added Unit test for cache reloading	2020-08-28 13:03:58 +02:00
freddygv	ff56a64b08	Add LB policy to service-resolver	2020-08-27 19:44:02 -06:00
Jack	9e1c6727f9	Add http2 and grpc support to ingress gateways (#8458)	2020-08-27 15:34:08 -06:00
R.B. Boyer	74d5df7c7a	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569)	2020-08-27 12:20:58 -05:00
R.B. Boyer	d1843456d2	agent: ensure that we normalize bootstrapped config entries (#8547)	2020-08-27 11:37:25 -05:00
Pierre Souchay	9a64d3e5fe	Also test reload of EntryFetchMaxBurst	2020-08-27 18:14:05 +02:00
Matt Keeler	f97cc0445a	Move RPC router from Client/Server and into BaseDeps (#8559) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
Pierre Souchay	5842a902df	Tests that changes in rate limit are taken into account by agent	2020-08-27 16:41:20 +02:00
Pierre Souchay	879d087f65	Added `options.Equals()` and minor indentation fixes	2020-08-27 13:44:45 +02:00
R.B. Boyer	fead4fc2a5	agent: expose the list of supported envoy versions on /v1/agent/self (#8545)	2020-08-26 10:04:11 -05:00
Kyle Havlovitz	97f1f341d6	Automatically renew the token used by the Vault CA provider	2020-08-25 10:34:49 -07:00
Pierre Souchay	d2be9d38da	Ensure that Cache options are reloaded when `consul reload` is performed. This ensures that the cache throttling parameters are properly applied: * cache.EntryFetchMaxBurst * cache.EntryFetchRate When values are updated, a message is logged at info level.	2020-08-24 23:33:10 +02:00
André Cruz	9a0792139c	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	aa212423e3	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Daniel Nephin	01745feec0	Merge pull request #8537 from hashicorp/dnephin/fix-panic-on-connect-nil Fix panic when decoding 'Connect: null'	2020-08-20 18:00:25 -04:00
Daniel Nephin	07ad662131	Fix panic when decoding 'Connect: null' Surprisingly the json Unmarshal updates the aux pointer to a nil.	2020-08-20 17:52:14 -04:00
Daniel Nephin	e16375216d	config: use logging.Config in RuntimeConfig To add structure to RuntimeConfig, and remove the need to translate into a third type.	2020-08-19 13:21:00 -04:00
Daniel Nephin	f2373a5575	logging: move init of grpclog This line initializes global state. Moving it out of the constructor and closer to where logging is setup helps keep related things together.	2020-08-19 13:21:00 -04:00
Daniel Nephin	33c401a16e	logging: Setup accept io.Writer instead of []io.Writer Also accept a non-pointer Config, since the config is not modified	2020-08-19 13:20:41 -04:00
Daniel Nephin	63bad36de7	testing: disable global metrics sink in tests This might be better handled by allowing configuration for the InMemSink interval and retain, and disabling the global. For now this is a smaller change to remove the goroutine leak caused by tests because go-metrics does not provide any way of shutting down the global goroutine.	2020-08-18 19:04:57 -04:00
Daniel Nephin	5d4df54296	agent: extract dependency creation from New With this change, Agent.New() accepts many of the dependencies instead of creating them in New. Accepting fully constructed dependencies from a constructor makes the type easier to test, and easier to change. There are still a number of dependencies created in Start() which can be addressed in a follow up.	2020-08-18 19:04:55 -04:00
Daniel Nephin	51b08c645b	Merge pull request #8514 from hashicorp/dnephin/testing-improvements-1 testing: small improvements to TestSessionCreate and testutil.retry	2020-08-18 18:26:05 -04:00
Daniel Nephin	ab2157bbc9	Merge pull request #8528 from hashicorp/dnephin/move-node-name-validation config: Move some config validation from Agent.Start to config.Builder.Validate	2020-08-18 18:25:41 -04:00
Hans Hasselberg	a932aafc91	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
Daniel Nephin	35f1ecee0b	config: Move remote-script-checks warning to config Previously it was done in Agent.Start, but it can be done much earlier	2020-08-17 17:39:49 -04:00
Daniel Nephin	27b36bfc4e	config: move NodeName validation to config validation Previously it was done in Agent.Start, which is much later than it needs to be. The new 'dns' package was required, because otherwise there would be an import cycle. In the future we should move more of the dns server into the dns package.	2020-08-17 17:25:02 -04:00
Daniel Nephin	b4015969c9	Merge pull request #8515 from hashicorp/dnephin/unexport-testing-shims config: unexport fields and resolve TODOs in config.Builder	2020-08-17 16:03:07 -04:00
Daniel Nephin	16217fe9b9	testing: use t.Cleanup in testutil.TempFile So that it has the same behaviour as TempDir. Also remove the now unnecessary 'defer os.Remove'	2020-08-14 20:06:01 -04:00
Daniel Nephin	d68edcecf4	testing: Remove all the defer os.Removeall Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	8a4d292c8e	config: unexport and resolve TODOs in config.Builder - unexport testing shims, and document their purpose - resolve a TODO by moving validation to NewBuilder and storing the one field that is used instead of all of Options - create a slice with the correct size to avoid extra allocations	2020-08-14 19:23:32 -04:00
Daniel Nephin	a4b201af36	testing: Improve session_endpoint_test While working on another change I caused a bunch of these tests to fail. Unfortunately the failure messages were not super helpful at first. One problem was that the request and response were created outside of the retry. This meant that when the second attempt happened, the request body was empty (because the buffer had been consumed), and so the request was not actually being retried. This was fixed by moving more of the request creation into the retry block. Another problem was that these functions can return errors in two ways, and are not consistent about which way they use. Some errors are returned to the response writer, but the tests were not checking those errors, which was causing a panic later on. This was fixed by adding a check for the response code. Also adds some missing t.Helper(), and has assertIndex use checkIndex so that it is clear these are the same implementation.	2020-08-14 18:55:52 -04:00
Daniel Nephin	070e843113	testutil: Add t.Cleanup to TempDir TempDir registers a Cleanup so that the directory is always removed. To disable the cleanup, set the TEST_NOCLEANUP env var.	2020-08-14 13:19:10 -04:00
Daniel Nephin	2b920ad199	testing: fix flaky test TestDNS_NonExistentDC_RPC I saw this test flake locally, and it was easy to reproduce with -count=10. The failure was: 'TestAgent.dns: rpc error: error=No known Consul servers'. Waiting for the agent seems to fix it.	2020-08-13 18:03:04 -04:00
Daniel Nephin	1912c5ad89	testing: wait until monitor has started before shutdown This commit fixes a test that I saw flake locally while running tests. The test output from the monitor started immediately after the line the test was looking for. To fix the problem a channel is closed when the goroutine starts. Shutdown is not called until this channel is closed, which seems to greatly reduce the chance of a flake.	2020-08-13 17:53:29 -04:00
Daniel Nephin	3a4e62836b	testing: Remove TestAgent.Key and change TestAgent.DataDir TestAgent.Key was only used by 3 tests. Extracting it from the common helper that is used in hundreds of tests helps keep the shared part small and more focused. This required a second change (which I was planning on making anyway), which was to change the behaviour of DataDir. Now in all cases the TestAgent will use the DataDir, and clean it up once the test is complete.	2020-08-13 17:53:24 -04:00
Daniel Nephin	b1679508d4	testing: use t.Cleanup in TestAgent for returnPorts	2020-08-13 17:09:37 -04:00
Daniel Nephin	4e8e0de8f0	testing: remove unused fields from TestACLAgent	2020-08-13 17:03:55 -04:00
Daniel Nephin	399c77dfb6	agent: rename vars in newConsulConfig 'base' is a bit misleading, since it is the return value. Renamed to cfg.	2020-08-13 11:58:21 -04:00
Daniel Nephin	7b5b170a0d	agent: Move setupKeyring functions to keyring.go There are a couple reasons for this change: 1. agent.go is way too big. Smaller files make code easier to read because tools that show usage also include filename which can give a lot more context to someone trying to understand which functions call other functions. 2. these two functions call into a large number of functions already in keyring.go.	2020-08-13 11:58:21 -04:00
Daniel Nephin	9919e5dfa5	agent: unmethod consulConfig To allow us to move newConsulConfig out of Agent.	2020-08-13 11:58:21 -04:00
Daniel Nephin	8f596f5551	Fix conflict in merged PRs One PR renamed the var from config->cfg, and another used the old name config, which caused the build to fail on master.	2020-08-13 11:28:26 -04:00
Daniel Nephin	d677706625	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	190fcc14a3	Merge pull request #8463 from hashicorp/dnephin/unmethod-make-node-id agent: convert NodeID methods to functions	2020-08-13 11:18:11 -04:00
Daniel Nephin	912aae8624	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	5b37efd91b	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
Daniel Nephin	37eacf8192	auto-config: reduce awareness of config This is a small step to allowing Agent to accept its dependencies instead of creating them in New. There were two fields in autoconfig.Config that were used exclusively to load config. These were replaced with a single function, allowing us to move LoadConfig back to the config package. Also removed the WithX functions for building a Config. Since these were simple assignment, it appeared we were not getting much value from them.	2020-08-12 13:23:23 -04:00
Daniel Nephin	e07554500e	Remove check that hostID is a uuid. Immediately afterward we hash the ID, so it does not need to be a uuid anymore.	2020-08-12 13:05:10 -04:00
Daniel Nephin	875d8bde42	agent: convert NodeID methods to functions Making these functions allows us to cleanup how an agent is initialized. They only make use of a config and a logger, so they do not need to be agent methods. Also cleanup the testing to use t.Run and require.	2020-08-12 13:05:10 -04:00
Daniel Nephin	0738eb8596	Extract nodeID functions to a different file In preparation for turning them into functions. To reduce the scope of Agent, and refactor how Agent is created and started.	2020-08-12 13:05:10 -04:00
R.B. Boyer	e3cd4a8539	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration for any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Freddy	d72f72dcd5	Notify alias checks when aliased service is [de]registered (#8456)	2020-08-12 09:47:41 -06:00
Daniel Nephin	3d96c5b651	Merge pull request #8469 from hashicorp/dnephin/config-source config: make Source an interface to avoid the marshal/unmarshal cycle in auto-config	2020-08-12 11:17:15 -04:00
Hans Hasselberg	aacf0fd777	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
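
Commit d210242875 above describes a bug where a shared slice was passed as the first argument to append, which can modify the slice's backing array. The following is a minimal sketch of that class of bug, not the Consul code itself; the function names and check names are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// buggyChecks (hypothetical) appends directly to the shared nodeChecks slice.
// If nodeChecks has spare capacity, every call writes into the same backing
// array, so a later call overwrites the element added by an earlier one.
func buggyChecks(nodeChecks []string, serviceCheck string) []string {
	return append(nodeChecks, serviceCheck)
}

// safeChecks (hypothetical) copies nodeChecks into a new slice sized exactly
// for the result, so the returned slices never share a backing array.
func safeChecks(nodeChecks []string, serviceCheck string) []string {
	out := make([]string, 0, len(nodeChecks)+1)
	out = append(out, nodeChecks...)
	return append(out, serviceCheck)
}

func main() {
	// One node-level check, with spare capacity in the backing array.
	nodeChecks := make([]string, 1, 4)
	nodeChecks[0] = "serf-health"

	web := buggyChecks(nodeChecks, "web-check")
	db := buggyChecks(nodeChecks, "db-check")
	fmt.Println(web[1], db[1]) // db-check db-check

	web = safeChecks(nodeChecks, "web-check")
	db = safeChecks(nodeChecks, "db-check")
	fmt.Println(web[1], db[1]) // web-check db-check
}
```

With the buggy version, the check for one service ends up in the result built for another, which matches the symptom the commit message reports.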


2392 Commits
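
Commit 07ad662131 in the list above notes that json Unmarshal sets a pointer to nil when the input is 'Connect: null'. The sketch below shows that behaviour of Go's encoding/json in isolation; the connect and aux types and the decodeConnect function are hypothetical stand-ins, not the structs Consul uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connect and aux are hypothetical types used only for this illustration.
type connect struct {
	Native bool
}

type aux struct {
	Connect *connect
}

// decodeConnect pre-populates the pointer before decoding, then returns
// whatever the pointer holds afterwards. A JSON null resets it to nil,
// so callers must check for nil before dereferencing the result.
func decodeConnect(data []byte) (*connect, error) {
	a := aux{Connect: &connect{}}
	if err := json.Unmarshal(data, &a); err != nil {
		return nil, err
	}
	return a.Connect, nil
}

func main() {
	c, err := decodeConnect([]byte(`{"Connect": null}`))
	fmt.Println(c == nil, err) // true <nil>

	c, err = decodeConnect([]byte(`{"Connect": {"Native": true}}`))
	fmt.Println(c != nil && c.Native, err) // true <nil>
}
```

Any code that dereferences the pointer after decoding without a nil check would panic on the null input, which is the kind of failure the commit addresses.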