Commit Graph

12306 Commits

Author SHA1 Message Date
Daniel Nephin 875d8bde42 agent: convert NodeID methods to functions
Making these functions allows us to cleanup how an agent is initialized. They only make use of a config and a logger, so they do not need to be agent methods.

Also cleanup the testing to use t.Run and require.
2020-08-12 13:05:10 -04:00
Daniel Nephin 0738eb8596 Extract nodeID functions to a different file
In preparation for turning them into functions.
To reduce the scope of Agent, and refactor how Agent is created and started.
2020-08-12 13:05:10 -04:00
R.B. Boyer 839ca03b7c update changelog snippet 2020-08-12 11:21:54 -05:00
R.B. Boyer e3cd4a8539
connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470)
Fixes #8466

Since Consul 1.8.0 there was a bug in how ingress gateway protocol
compatibility was enforced. At the point in time that an ingress-gateway
config entry was modified the discovery chain for each upstream was
checked to ensure the ingress gateway protocol matched. Unfortunately
future modifications of other config entries were not validated against
existing ingress-gateway definitions, such as:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. create service-defaults for 'api' setting protocol=http (worked, but not ok)
3. create service-splitter or service-router for 'api' (worked, but caused an agent panic)

If you were to do these in a different order, it would fail without a
crash:

1. create service-defaults for 'api' setting protocol=http (ok)
2. create service-splitter or service-router for 'api' (ok)
3. create tcp ingress-gateway pointing to 'api' (fail with message about
   protocol mismatch)

This PR introduces the missing validation. The two new behaviors are:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat)
3. (NEW) create service-splitter or service-router for 'api' (fail with
   message about protocol mismatch)

In consideration for any existing users that may be inadvertently be
falling into item (2) above, that is now officiall a valid configuration
to be in. For anyone falling into item (3) above while you cannot use
the API to manufacture that scenario anymore, anyone that has old (now
bad) data will still be able to have the agent use them just enough to
generate a new agent/proxycfg error message rather than a panic.
Unfortunately we just don't have enough information to properly fix the
config entries.
2020-08-12 11:19:20 -05:00
Freddy d72f72dcd5
Notify alias checks when aliased service is [de]registered (#8456) 2020-08-12 09:47:41 -06:00
Daniel Nephin 3d96c5b651
Merge pull request #8469 from hashicorp/dnephin/config-source
config: make Source an interface to avoid the marshal/unmarshal cycle in auto-config
2020-08-12 11:17:15 -04:00
Mike Morris ebc9b27cfa
ci: bump Go to v1.14.7 (#8449) 2020-08-12 10:43:19 -04:00
Hans Hasselberg aacf0fd777
Merge pull request #8471 from hashicorp/local_only
thread local-only through the layers
2020-08-12 08:54:51 +02:00
Freddy 875816d0d3
Internal endpoint to query intentions associated with a gateway (#8400) 2020-08-11 17:20:41 -06:00
Iryna Shustava ed0fa4b3b1
docs: update helm chart ref (#8483)
No longer require servers to be running on k8s when
manageSystemACLs is true
2020-08-11 14:39:44 -07:00
Daniel Nephin 36202a12dd
Merge pull request #8453 from hashicorp/dnephin/fix-test-server-timeout
sdk: mitigate api test timeout
2020-08-11 16:48:29 -04:00
Kyle Havlovitz 635952681e Fix a state store comment about version 2020-08-11 13:46:12 -07:00
Kyle Havlovitz 9bc9d3014d
Merge pull request #8474 from hashicorp/snapshot-index-fix
fsm: Fix snapshot bug with restoring node/service/check indexes
2020-08-11 12:35:08 -07:00
Kyle Havlovitz c39a275666 fsm: Fix snapshot bug with restoring node/service/check indexes 2020-08-11 11:49:52 -07:00
Freddy 58a2788578
Update CHANGELOG.md 2020-08-11 12:15:53 -06:00
John Cowen 43ec04f073
ui: Reduce reconnection attempts on disconnection (#8481)
* ui: Reduce reconnection attempts on disconnection

The UI will attempt to reconnect/retry a blocking query to Consul after
a disconnection in certain circumstances.

1. On receipt of a 5xx error (used for keeping blocking queries running
through reverse proxies that have lowertimeouts than consul itself)
2. When a user switches to a different tab and back again)
3. When the connection to Consul is dropped entirely (when Consul itself
has exited)

In the last case the retry attempts where not using a 3 second interval
between attempts like the first case is.

This commit changes the last case to use the same 3 second pause as the
last case.
2020-08-11 18:47:15 +01:00
John Cowen a686de0414
ui: Add Optgroups and selectedItems to multiple select dropdown and use (#8476)
* ui: Switch selects to use more HTML-like approach for optgroups

* Add KV comparator

* Use new option/optgroup approach for sort/select

* Fix up tests for new order of menu items
2020-08-11 18:02:51 +01:00
John Cowen 7f711bb68f
ui: Passthrough any error from a route:application refresh (#8480) 2020-08-11 17:57:22 +01:00
Kenia 2d30d864ce
ui: Add unique slug key id to proxy (#8479) 2020-08-11 12:53:45 -04:00
s-christoff 102b7e55da
Update Go-Metrics 0.3.4 (#8478) 2020-08-11 11:17:43 -05:00
Daniel Nephin fe2f80c3a1 Use SIGABRT to get a stack trace when the timeout is hit 2020-08-11 12:12:55 -04:00
Hans Hasselberg aff02198d7 Refactor keyring ops:
* changes some functions to return data instead of modifying pointer
  arguments
* renames globalRPC() to keyringRPCs() to make its purpose more clear
* restructures KeyringOperation() to make it more understandable
2020-08-11 13:42:03 +02:00
Hans Hasselberg 07261db64d thread local-only through the layers
$ consul keyring -list -local-only
==> Gathering installed encryption keys...

dc1 (LAN):
  aUlAW4ST3+vwseI61so24CoORkyjZofcmHk+j7QPSYQ= [1/1]
2020-08-11 13:41:53 +02:00
Daniel Nephin 4297a8ba07 auto-config: Avoid the marshal/unmarshal cycle in auto-config
Use a LiteralConfig and return a config.Config from translate.
2020-08-10 20:07:52 -04:00
Daniel Nephin 38980ebb4c config: Make Source an interface
This will allow us to accept config from auto-config without needing to
go through a serialziation cycle.
2020-08-10 12:46:28 -04:00
John Cowen 8bea00d974
ui: Dropdown/select improvements (#8468)
* ui: Better org of split-button/sort-button ready for design change

* ui: Improve keyboard accessibility of dropdown menu
2020-08-10 16:00:05 +01:00
Kenia 31bcf71d63
ui: Add sorting to namespaces (#8405)
* Add sorting to namespaces

* Add sorting to namespaces

* ui: Fix up default namespace no delete test (#8467)

Co-authored-by: John Cowen <johncowen@users.noreply.github.com>
2020-08-10 10:54:51 -04:00
John Cowen d1c879e06c
ui: Rework popover-menu auto closing (#8340)
* ui: Move more menu subcomponents deeper down into popovermenu

* ui: Simplify aria-menu component+remove auto menu close on route change

* Add ember-string-fns

* Use new PopoverMenu sub components and fix up tests

* Fix up wrong closing let

* Remove dcs from the service show page now we have it in the navigation
2020-08-10 09:26:02 +01:00
Mike Morris ff37af1129
changelog: Update for 1.8.2, 1.7.6, 1.7.5 and 1.6.7 (#8462)
* update bindata_assetfs.go

* Release v1.8.2

* Putting source back into Dev Mode

* changelog: add entries for 1.7.6, 1.7.5 and 1.6.7

Co-authored-by: hashicorp-ci <hashicorp-ci@users.noreply.github.com>
2020-08-07 18:58:09 -04:00
Daniel Nephin 80e99cb3e6 testing: remove unnecessary defers in tests
The data directory is now removed by the test helper that created it.
2020-08-07 17:28:16 -04:00
Mike Morris 48e7c07cf9
api: bump consul/api to v1.6.0 and consul/sdk to v0.6.0 (#8460)
* api: bump consul/sdk dependency to v0.6.0

* api: bump dependency to v1.6.0
2020-08-07 17:26:05 -04:00
Daniel Nephin 7dbacf297c testing: Remove NotifyShutdown
NotifyShutdown was only used for testing. Now that t.Cleanup exists, we
can use that instead of attaching cleanup to the Server shutdown.

The Autopilot test which used NotifyShutdown doesn't need this
notification because Shutdown is synchronous. Waiting for the function
to return is equivalent.
2020-08-07 17:14:44 -04:00
Jack 77d0c33fc8
Specify allowed ingress gateway protocols in docs (#8454)
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2020-08-07 13:25:23 -06:00
Mike Morris 0ff4f46c52
Update CHANGELOG.md 2020-08-07 13:15:20 -04:00
Matt Keeler 2e92bec149
Update CHANGELOG.md 2020-08-07 10:21:44 -04:00
Matt Keeler 67dec3b609
Require token replication to be enabled in secondary dcs when ACLs are enabled with AutoConfig (#8451)
AutoConfig will generate local tokens for clients and the ability to use local tokens is gated off of token replication being enabled and being configured with a replication token. Therefore we already have a hard requirement on having token replication enabled, this commit just makes sure to surface that to the operator instead of having to discern what the issue is from RPC errors.
2020-08-07 10:20:27 -04:00
Hans Hasselberg 3f9d089a1a
Update CHANGELOG.md 2020-08-07 12:07:12 +02:00
Hans Hasselberg d316cd06c1
auto_config implies connect (#8433) 2020-08-07 12:02:02 +02:00
Rebecca Zanzig 39b62e5d8a
Merge pull request #8426 from hashicorp/docs/k8s-resources
Add lifecycle sidecar and init container resource settings docs
2020-08-06 15:28:11 -07:00
Rebecca Zanzig 18e9f925b8 Add lifecycle sidecar and init container resource settings docs 2020-08-06 15:11:54 -07:00
Hans Hasselberg 586ee2566f
Introducing changelog-gen (#8387)
* add templates for changelog-gen
* add entry files for currently unreleased PRs on master
2020-08-06 23:15:29 +02:00
Daniel Nephin 8e67d8eaeb sdk: mitigate api test timeout
Occasionally we are seeing the go-test-api job timeout at 10 minutes.
Looking at the stack trace I saw the following:

1. Lots of tests blocked on server.Stop in NewTestServerConfigT. This
   suggests that SIGINT is being sent to the server, but the server is
   not properly shutting down.

2. Over 20k goroutines that look like this:

goroutine 16355 [select, 8 minutes]:
net/http.(*persistConn).readLoop(0xc004270240)
    /usr/local/go/src/net/http/transport.go:2099 +0x99e
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1647 +0xc56

Issue 1 seems to be the main problem, but debugging that directly is not
possible because our buffered logs do not get sent when the tests
timeout. To mitigate this problem I've added a timeout to the cmd.Wait()
to force kill the process and return an error.

Unfortunately because we retry this operation, we still may not see the
cause because the next attempt will likely pass. I'm tempted to remove
the retry around NewTestServerConfigT.

Issue 2 seems to be caused by not closing the response body. Since the
request is performed many times in a loop, many goroutines are created
and are not closed until the response body is closed.
2020-08-06 17:00:20 -04:00
Hans Hasselberg d4217cc165
Update CHANGELOG.md 2020-08-06 21:31:18 +02:00
Blake Covarrubias c81610c5f9 website: Redirect /mesh to new URL
Redirect service mesh use case page to point to new URL.
2020-08-06 09:25:08 -07:00
Hans Hasselberg 51a8e15cf8
Mark its own cluster as healthy when rebalancing. (#8406)
This code started as an optimization to avoid doing an RPC Ping to
itself. But in a single server cluster the rebalancing was led to
believe that there were no healthy servers because foundHealthyServer
was not set. Now this is being set properly.

Fixes #8401 and #8403.
2020-08-06 10:42:09 +02:00
Mike Morris 7fd4471b80
Update version.js to 1.8.1 (#8439) 2020-08-05 16:56:38 -04:00
R.B. Boyer d405a095a2 update changelog 2020-08-05 15:02:35 -05:00
R.B. Boyer 397019d970
xds: revert setting set_node_on_first_message_only to true when generating envoy bootstrap config (#8440)
When consul is restarted and an envoy that had already sent
DiscoveryRequests to the previous consul process sends a request to the
new process it doesn't respect the setting and never populates
DiscoveryRequest.Node for the life of the new consul process due to this
bug: https://github.com/envoyproxy/envoy/issues/9682

Fixes #8430
2020-08-05 15:00:24 -05:00
Daniel Nephin ae382805bd
Merge pull request #8404 from hashicorp/dnephin/remove-log-output-field
Use Logger consistently, instead of LogOutput
2020-08-05 14:31:43 -04:00
Daniel Nephin 3b82ad0955 Rename NewClient/NewServer
Now that duplicate constructors have been removed we can use the shorter names for the single constructor.
2020-08-05 14:00:55 -04:00