Commit Graph

149 Commits

Author SHA1 Message Date
Matt Keeler 118adbb123
ACL Token Persistence and Reloading (#5328)
This PR adds two features which will be useful for operators when ACLs are in use.

1. Tokens set in configuration files are now reloadable.
2. If `acl.enable_token_persistence` is set to `true` in the configuration, tokens set via the `v1/agent/token` endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload)

Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change.

Some other secondary changes:

* Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly.
* Allowed better paths in the `v1/agent/token/` API. Instead of paths like: `v1/agent/token/acl_replication_token` the path can now be just `v1/agent/token/replication`. The old paths remain to be valid. 
* Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, its no longer used only for ACL token/policy replication. The new functions will detect 404s and fallback to using the older token paths when talking to pre-1.4.3 agents.
* Docs updated to reflect the API additions and to show using the new endpoints.
* Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs.
2019-02-27 14:28:31 -05:00
Alvin Huang c23eb91262 fix TestAgent_CheckCriticalTime and better error output 2019-02-22 17:34:45 -05:00
Matt Keeler 766d771017
Pass a testing.T into NewTestAgent and TestAgent.Start (#5342)
This way we can avoid unnecessary panics which cause other tests not to run.

This doesn't remove all the possibilities for panics causing other tests not to run, it just fixes the TestAgent
2019-02-14 10:59:14 -05:00
Aestek 5960974db1 [Fix] Services sometimes not being synced with acl_enforce_version_8 = false (#4771)
Fixes: https://github.com/hashicorp/consul/issues/3676

This fixes a bug were registering an agent with a non-existent ACL token can prevent other 
services registered with a good token from being synced to the server when using 
`acl_enforce_version_8 = false`.

## Background

When `acl_enforce_version_8` is off the agent does not check the ACL token validity before 
storing the service in its state.
When syncing a service registered with a missing ACL token we fall into the default error 
handling case (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1255)
and stop the sync (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1082)
without setting its Synced property to true like in the permission denied case.
This means that the sync will always stop at the faulty service(s).
The order in which the services are synced is random since we iterate on a map. So eventually
all services with good ACL tokens will be synced, this can however take some time and is influenced 
by the cluster size, the bigger the slower because retries are less frequent.
Having a service in this state also prevent all further sync of checks as they are done after
the services.

## Changes 

This change modify the sync process to continue even if there is an error. 
This fixes the issue described above as well as making the sync more error tolerant: if the server repeatedly refuses
a service (the ACL token could have been deleted by the time the service is synced, the servers 
were upgraded to a newer version that has more strict checks on the service definition...). 
Then all services and check that can be synced will, and those that don't will be marked as errors in 
the logs instead of blocking the whole process.
2019-01-04 10:01:50 -05:00
Matt Keeler 18b29c45c4
New ACLs (#4791)
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface and with a new cluster the ACL system will look very similar to that of Nomads. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers running in legacy mode needs to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.

So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
Paul Banks 0f27ffd163 Proxy Config Manager (#4729)
* Proxy Config Manager

This component watches for local state changes on the agent and ensures that each service registered locally with Kind == connect-proxy has it's state being actively populated in the cache.

This serves two purposes:
 1. For the built-in proxy, it ensures that the state needed to accept connections is available in RAM shortly after registration and likely before the proxy actually starts accepting traffic.
 2. For (future - next PR) xDS server and other possible future proxies that require _push_ based config discovery, this provides a mechanism to subscribe and be notified about updates to a proxy instance's config including upstream service discovery results.

* Address review comments

* Better comments; Better delivery of latest snapshot for slow watchers; Embed Config

* Comment typos

* Add upstream Stringer for funsies
2018-10-10 16:55:34 +01:00
Paul Banks e812f5516a Add -sidecar-for and new /agent/service/:service_id endpoint (#4691)
- A new endpoint `/v1/agent/service/:service_id` which is a generic way to look up the service for a single instance. The primary value here is that it:
   - **supports hash-based blocking** and so;
   - **replaces `/agent/connect/proxy/:proxy_id`** as the mechanism the built-in proxy uses to read its config.
   - It's not proxy specific and so works for any service.
   - It has a temporary shim to call through to the existing endpoint to preserve current managed proxy config defaulting behaviour until that is removed entirely (tested).
 - The built-in proxy now uses the new endpoint exclusively for it's config
 - The built-in proxy now has a `-sidecar-for` flag that allows the service ID of the _target_ service to be specified, on the condition that there is exactly one "sidecar" proxy (that is one that has `Proxy.DestinationServiceID` set) for the service registered.
 - Several fixes for edge cases for SidecarService
 - A fix for `Alias` checks - when running locally they didn't update their state until some external thing updated the target. If the target service has no checks registered as below, then the alias never made it past critical.
2018-10-10 16:55:34 +01:00
Paul Banks b83bbf248c Add Proxy Upstreams to Service Definition (#4639)
* Refactor Service Definition ProxyDestination.

This includes:
 - Refactoring all internal structs used
 - Updated tests for both deprecated and new input for:
   - Agent Services endpoint response
   - Agent Service endpoint response
   - Agent Register endpoint
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Register
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Services endpoint response
   - Catalog Node endpoint response
   - Catalog Service endpoint response
 - Updated API tests for all of the above too (both deprecated and new forms of register)

TODO:
 - config package changes for on-disk service definitions
 - proxy config endpoint
 - built-in proxy support for new fields

* Agent proxy config endpoint updated with upstreams

* Config file changes for upstreams.

* Add upstream opaque config and update all tests to ensure it works everywhere.

* Built in proxy working with new Upstreams config

* Command fixes and deprecations

* Fix key translation, upstream type defaults and a spate of other subtele bugs found with ned to end test scripts...

TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct.

* Fix translated keys in API registration.
≈

* Fixes from docs
 - omit some empty undocumented fields in API
 - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally.

* Documentation updates for Upstreams in service definition

* Fixes for tests broken by many refactors.

* Enable travis on f-connect branch in this branch too.

* Add consistent Deprecation comments to ProxyDestination uses

* Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs
2018-10-10 16:55:34 +01:00
Pierre Souchay eddcf228ea Implementation of Weights Data structures (#4468)
* Implementation of Weights Data structures

Adding this datastructure will allow us to resolve the
issues #1088 and #4198

This new structure defaults to values:
```
   { Passing: 1, Warning: 0 }
```

Which means, use weight of 0 for a Service in Warning State
while use Weight 1 for a Healthy Service.
Thus it remains compatible with previous Consul versions.

* Implemented weights for DNS SRV Records

* DNS properly support agents with weight support while server does not (backwards compatibility)

* Use Warning value of Weights of 1 by default

When using DNS interface with only_passing = false, all nodes
with non-Critical healthcheck used to have a weight value of 1.
While having weight.Warning = 0 as default value, this is probably
a bad idea as it breaks ascending compatibility.

Thus, we put a default value of 1 to be consistent with existing behaviour.

* Added documentation for new weight field in service description

* Better documentation about weights as suggested by @banks

* Return weight = 1 for unknown Check states as suggested by @banks

* Fixed typo (of -> or) in error message as requested by @mkeeler

* Fixed unstable unit test TestRetryJoin

* Fixed unstable tests

* Fixed wrong Fatalf format in `testrpc/wait.go`

* Added notes regarding DNS SRV lookup limitations regarding number of instances

* Documentation fixes and clarification regarding SRV records with weights as requested by @banks

* Rephrase docs
2018-09-07 15:30:47 +01:00
Martin feb3ce4ee0 Use target service name instead of ID as connect proxy service name (#4620) 2018-09-05 20:33:17 +01:00
Siva Prasad b1a34f899f
TestAgentAntiEntropy: Wait until Consul service is up on the agent. (#4591)
* Anti-Entropy test wait for Consul service added

* Reverted some tests back to using WaitForLeader
2018-08-28 09:52:11 -04:00
Pierre Souchay cec5d72396 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additionall retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00
Mitchell Hashimoto b3854fdd28
agent/local: silly spacing on select statements 2018-07-19 14:21:30 -05:00
Mitchell Hashimoto 8c72bb0cdf
agent/local: address remaining test feedback 2018-07-19 14:20:50 -05:00
Mitchell Hashimoto 9f128e40d6
agent/local: don't use time.After in test since notify is instant 2018-07-18 16:16:28 -05:00
Mitchell Hashimoto f97bfd5be8
agent: address some basic feedback 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 7543d270e2
agent/local: support local alias checks 2018-07-12 09:36:10 -07:00
Pierre Souchay ff53648df2 Merge remote-tracking branch 'origin/master' into ACL_additional_info 2018-07-07 14:09:18 +02:00
Paul Banks df2cb30b01 Make tests pass and clean proxy persistence. No detached child changes yet.
This is a good state for persistence stuff to re-start the detached child work that got mixed up last time.
2018-06-25 12:24:10 -07:00
Paul Banks cdc7cfaa36 Abandon daemonize for simpler solution (preserving history):
Reverts:
  - bdb274852ae469c89092d6050697c0ff97178465
  - 2c689179c4f61c11f0016214c0fc127a0b813bfe
  - d62e25c4a7ab753914b6baccd66f88ffd10949a3
  - c727ffbcc98e3e0bf41e1a7bdd40169bd2d22191
  - 31b4d18933fd0acbe157e28d03ad59c2abf9a1fb
  - 85c3f8df3eabc00f490cd392213c3b928a85aa44
2018-06-25 12:24:10 -07:00
Paul Banks ba0fb58a72 Make daemoinze an option on test binary without hacks. Misc fixes for racey or broken tests. Still failing on several though. 2018-06-25 12:24:09 -07:00
Paul Banks e21723a891 Persist proxy state through agent restart 2018-06-25 12:24:08 -07:00
Mitchell Hashimoto 3d3eee2f6e
agent: resolve some conflicts and fix tests 2018-06-14 09:42:10 -07:00
Mitchell Hashimoto d9bd4ffebd
agent/local: clarify the non-risk of a full buffer 2018-06-14 09:42:10 -07:00
Mitchell Hashimoto 437689e83c
agent/local: remove outdated comment 2018-06-14 09:42:10 -07:00
Mitchell Hashimoto fcd2ab2338
agent/proxy: manager and basic tests, not great coverage yet coming soon 2018-06-14 09:42:08 -07:00
Mitchell Hashimoto 2bd39a84a6
agent/local: add Notify mechanism for proxy changes 2018-06-14 09:42:08 -07:00
Mitchell Hashimoto 476ea7b04a
agent: start/stop proxies 2018-06-14 09:42:08 -07:00
Mitchell Hashimoto 7355a614fe
agent/local: store proxy on local state, wip, not working yet 2018-06-14 09:42:08 -07:00
Paul Banks e0e12e165b
TLS watching integrated into Service with some basic tests.
There are also a lot of small bug fixes found when testing lots of things end-to-end for the first time and some cleanup now it's integrated with real CA code.
2018-06-14 09:42:07 -07:00
Paul Banks 730da74369
Fix various test failures and vet warnings.
Intention de-duplication in previously merged PR actualy failed some tests that were not caught be me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.
2018-06-14 09:41:58 -07:00
Paul Banks 2a69663448
Agent Connect Proxy config endpoint with hash-based blocking 2018-06-14 09:41:57 -07:00
Paul Banks e6071051cf
Added connect proxy config and local agent state setup on boot. 2018-06-14 09:41:57 -07:00
Mitchell Hashimoto 9781cb1ace
agent/local: anti-entropy for connect proxy services 2018-06-14 09:41:48 -07:00
Pierre Souchay c83124a94c Removed labels from new ACL denied metrics 2018-06-08 11:56:46 +02:00
Pierre Souchay 064f8ad170 Removed consul prefix from metrics as requested by @kyhavlov 2018-06-08 11:51:50 +02:00
Pierre Souchay 65d3a2b26e Fixed import 2018-04-18 17:09:25 +02:00
Pierre Souchay f13aa5ba9b Added labels to improve new metric 2018-04-18 16:51:22 +02:00
Pierre Souchay d9a23bb2fa Track calls blocked by ACLs using metrics 2018-04-17 10:17:16 +02:00
Guido Iaquinti 8cd11d5888 Add package name to log output 2018-03-21 15:56:14 +00:00
Josh Soref 94835a2715 Spelling (#3958)
* spelling: another

* spelling: autopilot

* spelling: beginning

* spelling: circonus

* spelling: default

* spelling: definition

* spelling: distance

* spelling: encountered

* spelling: enterprise

* spelling: expands

* spelling: exits

* spelling: formatting

* spelling: health

* spelling: hierarchy

* spelling: imposed

* spelling: independence

* spelling: inspect

* spelling: last

* spelling: latest

* spelling: client

* spelling: message

* spelling: minimum

* spelling: notify

* spelling: nonexistent

* spelling: operator

* spelling: payload

* spelling: preceded

* spelling: prepared

* spelling: programmatically

* spelling: required

* spelling: reconcile

* spelling: responses

* spelling: request

* spelling: response

* spelling: results

* spelling: retrieve

* spelling: service

* spelling: significantly

* spelling: specifies

* spelling: supported

* spelling: synchronization

* spelling: synchronous

* spelling: themselves

* spelling: unexpected

* spelling: validations

* spelling: value
2018-03-19 16:56:00 +00:00
James Phillips 2937656f8e
Adds a longer retry period for the AE deferred output test.
There's some justification in the comments about this and a TODO to
improve this later.

Fixes #3668
2017-11-08 18:10:13 -08:00
Frank Schroeder 1cb8b0ffe3
local state: fix go vet issue 2017-10-23 10:56:05 +02:00
Frank Schroeder 7335c34c32
local state: remove stale comment 2017-10-23 10:56:05 +02:00
Frank Schroeder 3d547e30c7
local state: make test more robust 2017-10-23 10:56:05 +02:00
Frank Schroeder 52e73301f6
local state: clone check to avoid side effect 2017-10-23 10:56:05 +02:00
Frank Schroeder 6bc9d66192
local state: use synchronized access to internal maps 2017-10-23 10:56:05 +02:00
Frank Schroeder 58d52ac580
local state: rename Add{Check,Service}State to Set{Check,Service}State 2017-10-23 10:56:04 +02:00
Frank Schroeder e144f51b29
local state: move Metadata methods together 2017-10-23 10:56:04 +02:00
Frank Schroeder 4f9e05f634
local state: update documentation of updateSyncState 2017-10-23 10:56:04 +02:00
Frank Schroeder 41c7b0927e
local state: update comments 2017-10-23 10:56:04 +02:00
Frank Schroeder de57b16d99
local state: address review comments
* move non-blocking notification mechanism into ae.Trigger
* move Pause/Resume into separate type
2017-10-23 10:56:04 +02:00
Frank Schroeder 5c77c59501
local state: refactor TestAgentAntiEntropy_EnableTagOverride
Make intent clearer by being more explicit and adding some comments.
Use verify.Values to compare service entries.
2017-10-23 10:56:04 +02:00
Frank Schroeder 524981f367
local state: fix TestAgentAntiEntropy_EnableTagOverride
The test had a race condition where it relied on the first service to be
synced to the remote catalog which sometimes failed.
2017-10-23 10:56:04 +02:00
Frank Schroeder 1181aaee6a
local state: rename tests 2017-10-23 10:56:04 +02:00
Frank Schroeder 1602ac56b5
local state: drop retry loops from tests
Since the tests are now using synchronous calls for state syncing
we no longer need to use retry loops to wait for the changes to
propagate.
2017-10-23 10:56:04 +02:00
Frank Schroeder 71c74e62c7
local state: fix anti-entropy state tests
The anti-entropy tests relied on the side-effect of the StartSync()
method to perform a full sync instead of a partial sync. This lead to
multiple anti-entropy go routines being started unnecessary retry loops.

This change changes the behavior to perform synchronous full syncs when
necessary removing the need for all of the time.Sleep and most of the
retry loops.
2017-10-23 10:56:04 +02:00
Frank Schroeder 37b95ef98e
local state: fix test with updated error message 2017-10-23 10:56:04 +02:00
Frank Schroeder b803bf3091
local state: tests compile 2017-10-23 10:56:03 +02:00
Frank Schroeder 0a9ac9749e
local state: replace multi-map state with structs
The state of the service and health check records was spread out over
multiple maps guarded by a single lock. Access to the maps has to happen
in a coordinated effort and the tests often violated this which made
them brittle and racy.

This patch replaces the multiple maps with a single one for both checks
and services to make the code less fragile.

This is also necessary since moving the local state into its own package
creates circular dependencies for the tests. To avoid this the tests can
no longer access internal data structures which they should not be doing
in the first place.

The tests still don't compile but this is a ncessary step in that
direction.
2017-10-23 10:56:03 +02:00
Frank Schroeder 6027a9e2a5
local state: move to separate package
This patch moves the local state to a separate package to further
decouple it from the agent code.

The code compiles but the tests do not yet.
2017-10-23 10:56:03 +02:00
Frank Schroeder 67a0689f71
Revert "local state: move to separate package"
This reverts commit d447e823c6.
2017-10-23 10:08:34 +02:00
Frank Schroeder 623e07760a
Revert "local state: replace multi-map state with structs"
This reverts commit ccbae7da5b.
2017-10-23 10:08:34 +02:00
Frank Schroeder 9ed4b2d631
Revert "local state: tests compile"
This reverts commit 1af52bf7be.
2017-10-23 10:08:34 +02:00
Frank Schroeder 5ae4c52ca6
Revert "local state: fix test with updated error message"
This reverts commit e9149f64d9.
2017-10-23 10:08:34 +02:00
Frank Schroeder 67cdfc038e
Revert "local state: fix anti-entropy state tests"
This reverts commit f8e20cd996.
2017-10-23 10:08:34 +02:00
Frank Schroeder 556bf3f85d
Revert "local state: drop retry loops from tests"
This reverts commit 2bdba8ab06.
2017-10-23 10:08:34 +02:00
Frank Schroeder a3aa864d5b
Revert "local state: rename tests"
This reverts commit ff62eaf063.
2017-10-23 10:08:34 +02:00
Frank Schroeder 39615cb57b
Revert "local state: fix TestAgentAntiEntropy_EnableTagOverride"
This reverts commit 86f7ea6013.
2017-10-23 10:08:34 +02:00
Frank Schroeder 9fdea75d99
Revert "local state: refactor TestAgentAntiEntropy_EnableTagOverride"
This reverts commit c28e23eac8.
2017-10-23 10:08:33 +02:00
Frank Schroeder 46641e44d9
Revert "local state: address review comments"
This reverts commit 1d315075b1.
2017-10-23 10:08:33 +02:00
Frank Schroeder abd83f2d28
Revert "local state: update comments"
This reverts commit 42188164f8.
2017-10-23 10:08:33 +02:00
Frank Schroeder c6e441dd60
Revert "local state: update documentation of updateSyncState"
This reverts commit e86521e637.
2017-10-23 10:08:33 +02:00
Frank Schroeder 648b4da4f8
Revert "local state: move Metadata methods together"
This reverts commit 9bc8127728.
2017-10-23 10:08:33 +02:00
Frank Schroeder fa45a6a547
Revert "local state: rename Add{Check,Service}State to Set{Check,Service}State"
This reverts commit 9280841a80.
2017-10-23 10:08:33 +02:00
Frank Schroeder d6f52d9ed1
Revert "local state: use synchronized access to internal maps"
This reverts commit 39a2d8d25e.
2017-10-23 10:08:32 +02:00
Frank Schroeder 4e862d126d
Revert "local state: clone check to avoid side effect"
This reverts commit af1243c725.
2017-10-23 10:08:32 +02:00
Frank Schroeder 42af4cdc70
Revert "local state: make test more robust"
This reverts commit f9267380db.
2017-10-23 10:08:32 +02:00
Frank Schroeder 4ae8317fbf
Revert "local state: remove stale comment"
This reverts commit 35f4acdddc.
2017-10-23 10:08:32 +02:00
Frank Schroeder 223fae7d92
Revert "fix go vet issue"
This reverts commit 078e50b635.
2017-10-23 10:00:26 +02:00
Frank Schroeder 078e50b635
fix go vet issue 2017-10-23 08:13:52 +02:00
Frank Schroeder 35f4acdddc local state: remove stale comment 2017-10-23 08:03:18 +02:00
Frank Schroeder f9267380db local state: make test more robust 2017-10-23 08:03:18 +02:00
Frank Schroeder af1243c725 local state: clone check to avoid side effect 2017-10-23 08:03:18 +02:00
Frank Schroeder 39a2d8d25e local state: use synchronized access to internal maps 2017-10-23 08:03:18 +02:00
Frank Schroeder 9280841a80 local state: rename Add{Check,Service}State to Set{Check,Service}State 2017-10-23 08:03:18 +02:00
Frank Schroeder 9bc8127728 local state: move Metadata methods together 2017-10-23 08:03:18 +02:00
Frank Schroeder e86521e637 local state: update documentation of updateSyncState 2017-10-23 08:03:18 +02:00
Frank Schroeder 42188164f8 local state: update comments 2017-10-23 08:03:18 +02:00
Frank Schroeder 1d315075b1 local state: address review comments
* move non-blocking notification mechanism into ae.Trigger
* move Pause/Resume into separate type
2017-10-23 08:03:18 +02:00
Frank Schroeder c28e23eac8 local state: refactor TestAgentAntiEntropy_EnableTagOverride
Make intent clearer by being more explicit and adding some comments.
Use verify.Values to compare service entries.
2017-10-23 08:03:18 +02:00
Frank Schroeder 86f7ea6013 local state: fix TestAgentAntiEntropy_EnableTagOverride
The test had a race condition where it relied on the first service to be
synced to the remote catalog which sometimes failed.
2017-10-23 08:03:18 +02:00
Frank Schroeder ff62eaf063 local state: rename tests 2017-10-23 08:03:18 +02:00
Frank Schroeder 2bdba8ab06 local state: drop retry loops from tests
Since the tests are now using synchronous calls for state syncing
we no longer need to use retry loops to wait for the changes to
propagate.
2017-10-23 08:03:18 +02:00
Frank Schroeder f8e20cd996 local state: fix anti-entropy state tests
The anti-entropy tests relied on the side-effect of the StartSync()
method to perform a full sync instead of a partial sync. This lead to
multiple anti-entropy go routines being started unnecessary retry loops.

This change changes the behavior to perform synchronous full syncs when
necessary removing the need for all of the time.Sleep and most of the
retry loops.
2017-10-23 08:03:18 +02:00
Frank Schroeder e9149f64d9 local state: fix test with updated error message 2017-10-23 08:03:18 +02:00
Frank Schroeder 1af52bf7be local state: tests compile 2017-10-23 08:03:18 +02:00
Frank Schroeder ccbae7da5b local state: replace multi-map state with structs
The state of the service and health check records was spread out over
multiple maps guarded by a single lock. Access to the maps has to happen
in a coordinated effort and the tests often violated this which made
them brittle and racy.

This patch replaces the multiple maps with a single one for both checks
and services to make the code less fragile.

This is also necessary since moving the local state into its own package
creates circular dependencies for the tests. To avoid this the tests can
no longer access internal data structures which they should not be doing
in the first place.

The tests still don't compile but this is a ncessary step in that
direction.
2017-10-23 08:03:18 +02:00
Frank Schroeder d447e823c6 local state: move to separate package
This patch moves the local state to a separate package to further
decouple it from the agent code.

The code compiles but the tests do not yet.
2017-10-23 08:03:18 +02:00