Commit Graph

649 Commits

Author SHA1 Message Date
R.B. Boyer 64271289ec
server: partly fix config entry replication issue that prevents replication in some circumstances (#12307)
There are some cross-config-entry relationships that are enforced during
"graph validation" at persistence time that are required to be
maintained. This means that config entries may form a digraph at times.

Config entry replication procedes in a particular sorted order by kind
and name.

Occasionally there are some fixups to these digraphs that end up
replicating in the wrong order and replicating the leaves
(ingress-gateway) before the roots (service-defaults) leading to
replication halting due to a graph validation error related to things
like mismatched service protocol requirements.

This PR changes replication to give each computed change (upsert/delete)
a fair shot at being applied before deciding to terminate that round of
replication in error. In the case where we've simply tried to do the
operations in the wrong order at least ONE of the outstanding requests
will complete in the right order, leading the subsequent round to have
fewer operations to do, with a smaller likelihood of graph validation
errors.

This does not address all scenarios, but for scenarios where the edits
are being applied in the wrong order this should avoid replication
halting.

Fixes #9319

The scenario that is NOT ADDRESSED by this PR is as follows:

1. create: service-defaults: name=new-web, protocol=http
2. create: service-defaults: name=old-web, protocol=http
3. create: service-resolver: name=old-web, redirect-to=new-web
4. delete: service-resolver: name=old-web
5. update: service-defaults: name=old-web, protocol=grpc
6. update: service-defaults: name=new-web, protocol=grpc
7. create: service-resolver: name=old-web, redirect-to=new-web

If you shutdown dc2 just before (4) and turn it back on after (7)
replication is impossible as there is no single edit you can make to
make forward progress.
2022-02-23 17:27:48 -06:00
R.B. Boyer c7e7daa7b7
add changelog entry for enterprise only change (#12425) 2022-02-23 14:23:48 -06:00
Daniel Nephin 771df290d7
Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary
ca: add support for an external trusted CA
2022-02-22 13:14:52 -05:00
John Cowen 73b6687c5b
ui: Transition App Chrome to use new Disclosure Menus (#12334)
* Add %panel CSS component

* Deprecate old menu-panel component

* Various smallish tweaks to disclosure-menu

* Move all menus in the app chrome to use new DisclosureMenu

* Follow up CSS to move all app chrome menus to new components

* Don't prevent default any events from anchors

* Add a tick to click steps
2022-02-21 12:22:59 +00:00
Evan Culver 602e08ada7
checks: populate interval and timeout when registering services (#11138) 2022-02-18 12:05:33 -08:00
Kyle Havlovitz 362753cad7
Merge pull request #12385 from hashicorp/tproxy-http-upstream-fix
xds: respect chain protocol on default discovery chain
2022-02-18 10:08:59 -08:00
John Cowen 360cf89ea5
ui: Fixup displaying a Nspace default policy when expanding the preview pane (#12316) 2022-02-18 17:22:05 +00:00
John Cowen 717621698d
ui: Replace CollapsibleNotices with more a11y focussed Disclosure component (#12305)
* Delete collapsible notices component and related helper

* Add relative t action/helper to our Route component

* Replace single use CollapsibleNotices with multi-use Disclosure
2022-02-18 17:16:03 +00:00
Evan Culver 8a301fb915
ci: combine 'enhancement' entry type with 'improvement' (#12376) 2022-02-17 19:21:47 -08:00
Daniel Nephin 1853a32df6 ca: add test cases for rotating external trusted CA 2022-02-17 18:21:30 -05:00
Daniel Nephin 85ecbaf109
Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found
rpc: make blocking queries for non-existent items more efficient
2022-02-17 18:09:39 -05:00
Ashwin Venkatesh 6e6cd928a2
Parse datacenter from request (#12370)
* Parse datacenter from request
- Parse the value of the datacenter from the create/delete requests for AuthMethods and BindingRules so that they can be created in and deleted from the datacenters specified in the request.
2022-02-17 16:41:27 -05:00
Kyle Havlovitz 1edc69f50b Add changelog note 2022-02-17 12:17:12 -08:00
Florian Apolloner f01f00fc84
Support for connect native services in topology view. (#12098) 2022-02-16 16:51:54 -05:00
Daniel Nephin 4b33bdf396 Make blockingQuery efficient with 'not found' results.
By using the query results as state.

Blocking queries are efficient when the query matches some results,
because the ModifyIndex of those results, returned as queryMeta.Mindex,
will never change unless the items themselves change.

Blocking queries for non-existent items are not efficient because the
queryMeta.Index can (and often does) change when other entities are
written.

This commit reduces the churn of these queries by using a different
comparison for "has changed". Instead of using the modified index, we
use the existence of the results. If the previous result was "not found"
and the new result is still "not found", we know we can ignore the
modified index and continue to block.

This is done by setting the minQueryIndex to the returned
queryMeta.Index, which prevents the query from returning before a state
change is observed.
2022-02-15 18:24:33 -05:00
Daniel Nephin cc2c005fad debug: limit the size of the trace
We've noticed that a trace that is captured over the full duration is
too large to open on most machines. A trace.out captured over just the
interval period (30s by default) should be a more than enough time to
capture trace data.
2022-02-15 14:15:34 -05:00
R.B. Boyer 115946da99
server: conditionally avoid writing a config entry to raft if it was already the same (#12321)
This will both save on unnecessary raft operations as well as
unnecessarily incrementing the raft modify index of config entries
subject to no-op updates.
2022-02-14 14:39:12 -06:00
R.B. Boyer 80dfcb1bcd
raft: update to v1.3.5 (#12325)
This includes closing some leadership transfer gaps and adding snapshot
restore progress logging.
2022-02-14 13:48:52 -06:00
R.B. Boyer fa4577d1a9
local: fixes a data race in anti-entropy sync (#12324)
The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests.

The two main changes here were:

- ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields
- ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.
2022-02-14 10:41:33 -06:00
Mark Anderson 1a16f7ee70 Refactor to make ACL errors more structured. (#12308)
* First phase of refactoring PermissionDeniedError

Add extended type PermissionDeniedByACLError that captures information
about the accessor, particular permission type and the object and name
of the thing being checked.

It may be worth folding the test and error return into a single helper
function, that can happen at a later date.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-02-11 12:53:23 -08:00
John Cowen 0ac32122f1
ui: Make sure saving intentions from topology includes the partition (#12317) 2022-02-11 13:58:01 +00:00
John Cowen cf98691e85
ui: Stop ember-data overwriting SyncTimes (#12315) 2022-02-11 13:54:46 +00:00
John Cowen c1b4e33dee
ui: Exclude Service Health from Node listing page (#12248)
This commit excludes the health of any service instances from the Node Listing page. This means that if you are viewing the Node listing page you will only see failing nodes if there are any Node Checks failing, Service Instance Health checks are no longer taken into account.

Co-authored-by: Jamie White <jamie@jgwhite.co.uk>
2022-02-11 09:52:27 +00:00
Freddy 9580f79f86
Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup 2022-02-10 17:35:51 -07:00
freddygv 5d882badcb Add changelog entry 2022-02-10 17:21:34 -07:00
Freddy 378a7258e3
Prevent xDS tight loop on cfg errors (#12195) 2022-02-10 15:37:36 -07:00
Dhia Ayachi 4f0a71d7b4
fix race when starting a service while the agent `serviceManager` is … (#12302)
* fix race when starting a service while the agent `serviceManager` is stopping

* add changelog
2022-02-10 13:30:49 -05:00
John Cowen d49ee8e355
ui: Ensure proxy instance health is taken into account in Service Instance Listings (#12279)
We noticed that the Service Instance listing on both Node and Service views where not taking into account proxy instance health. This fixes that up so that the small health check information in each Service Instance row includes the proxy instances health checks when displaying Service Instance health (afterall if the proxy instance is unhealthy then so is the service instance that it should be proxying)

* Refactor Consul::InstanceChecks with docs

* Add to-hash helper, which will return an object keyed by a prop

* Stop using/relying on ember-data type things, just use a hash lookup

* For the moment add an equivalent "just give me proxies" model prop

* Start stitching things together, this one requires an extra HTTP request

..previously we weren't even requesting proxies instances here

* Finish up the stitching

* Document Consul::ServiceInstance::List while I'm here

* Fix up navigation mocks Name > Service
2022-02-10 15:28:26 +00:00
Daniel Nephin 01784470f3
Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register
catalog: initialize the refs map to prevent a nil panic
2022-02-09 19:48:22 -05:00
Daniel Nephin 82c264b2b3 config-entry: fix a panic when registering a service or ingress gateway 2022-02-09 18:49:48 -05:00
R.B. Boyer 89bd1f57b5
xds: allow only one outstanding delta request at a time (#12236)
Fixes #11876

This enforces that multiple xDS mutations are not issued on the same ADS connection at once, so that we can 100% control the order that they are applied. The original code made assumptions about the way multiple in-flight mutations were applied on the Envoy side that was incorrect.
2022-02-08 10:36:48 -06:00
claire labry dc2a95e465
Merge pull request #11956 from hashicorp/enable-security-scan
Enable Security Scan for CRT
2022-02-04 13:13:24 -05:00
Daniel Nephin 81a977ce1d add changelog 2022-02-03 17:39:36 -05:00
Evan Culver f98f37ba55
Merge branch 'enable-security-scan' of github.com:hashicorp/consul into enable-security-scan 2022-02-02 17:32:17 -08:00
Evan Culver d12e0ceddf
Add changelog entry 2022-02-02 17:31:08 -08:00
John Cowen 0f94ce3964
ui: Alias all our Structure Icons to Flight Icons (#12209) 2022-02-02 13:24:47 +00:00
Daniel Nephin 997bf1e5a4
Merge pull request #12166 from hashicorp/dnephin/acl-resolve-token-2
acl: remove ResolveTokenToIdentity
2022-01-31 19:19:21 -05:00
Daniel Nephin d363cc0f07 acl: remove unused methods on fakes, and add changelog
Also document the metric that was removed in a previous commit.
2022-01-31 17:53:53 -05:00
Dan Upton fdfe079674
streaming: split event buffer by key (#12080) 2022-01-28 12:27:00 +00:00
R.B. Boyer d2c0945f52
xds: fix for delta xDS reconnect bug in LDS/CDS (#12174)
When a wildcard xDS type (LDS/CDS/SRDS) reconnects from a delta xDS stream,
prior to envoy `1.19.0` it would populate the `ResourceNamesSubscribe` field
with the full list of currently subscribed items, instead of simply omitting it
to infer that it wanted everything (which is what wildcard mode means).

This upstream issue was filed in envoyproxy/envoy#16063 and fixed in
envoyproxy/envoy#16153 which went out in Envoy `1.19.0` and is fixed in later
versions (later refactored in envoyproxy/envoy#16855).

This PR conditionally forces LDS/CDS to be wildcard-only even when the
connected Envoy requests a non-wildcard subscription, but only does so on
versions prior to `1.19.0`, as we should not need to do this on later versions.

This fixes the failure case as described here: #11833 (comment)

Co-authored-by: Huan Wang <fredwanghuan@gmail.com>
2022-01-25 11:24:27 -06:00
Michele Degges 81a79a3595 Adding changelog entry 2022-01-24 12:32:22 -08:00
Ashwin Venkatesh 7568f3a102
Add support for 'Partition' and 'RetryJoin' (#12126)
- Adding a 'Partition' and 'RetryJoin' command allows test cases where
  one would like to spin up a Consul Agent in a non-default partition to
test use-cases that are common when enabling Admin Partition on
Kubernetes.
2022-01-20 16:49:36 -05:00
Dan Upton ca3aca92c4
[OSS] Remove remaining references to master (#11827) 2022-01-20 12:47:50 +00:00
R.B. Boyer 850ca7e12d
update changelog (#12128) 2022-01-19 17:28:53 -06:00
John Cowen cdb8a35501
ui: Fixup KV folder creation then further creation within that folder (#12081)
The fix here is two fold:

- We shouldn't be providing the DataSource (which loads the data) with an id when we are creating from within a folder (in the buggy code we are providing the parentKey of the new KV you are creating)
- Being able to provide an empty id to the DataSource/KV repository and that repository responding with a newly created object is more towards the "new way of doing forms", therefore the corresponding code to return a newly created ember-data object. As we changed the actual bug in point 1 here, we need to make sure the repository responds with an empty object when the request id is empty.
2022-01-19 10:09:25 +00:00
Evan Culver e35dd08a63
connect: Upgrade Envoy 1.20 to 1.20.1 (#11895) 2022-01-18 14:35:27 -05:00
Dhia Ayachi 25c36af222
update serf to v0.9.7 (#12057)
* update serf to v0.9.7

* add change log

* update changelog
2022-01-18 13:03:22 -05:00
Kyle Havlovitz c589d05ddf Add changelog note 2022-01-12 12:31:28 -08:00
Chris S. Kim ea0df29afb
Update memberlist to 0.3.1 (#12042) 2022-01-12 12:00:18 -05:00
John Cowen 9192452885
ui: Adds a notice for non-primary intention creation (#11985) 2022-01-12 11:50:09 +00:00