Commit Graph

1279 Commits

Author SHA1 Message Date
R.B. Boyer 20eefeea11
acl: a role binding rule for a role that does not exist should be ignored (#5778)
I wrote the docs under this assumption but completely forgot to actually
enforce it.
2019-05-03 14:22:44 -05:00
R.B. Boyer b4371bcccd
acl: enforce that you cannot persist tokens and roles with missing links except during replication (#5779) 2019-05-02 15:02:21 -05:00
Matt Keeler 42d32db817
Fix ConfigEntryResponse binary marshaller and ensure we watch the chan in ConfigEntry.Get even when no entry exists. (#5773) 2019-05-02 15:25:29 -04:00
Paul Banks 6a58527cd8
Fix previous accidental master push 🤦 (#5771)
* Fix previous accidental master push 🤦

* Fix ACL test
2019-05-02 15:49:37 +01:00
Paul Banks 6c81f9da0d
Fix panic in Resolving service config when proxy-defaults isn't defined yet (#5769) 2019-05-02 14:12:21 +01:00
Paul Banks 8f5b16ebaf
Fix uint8 conversion issues for service config response maps. 2019-05-02 14:11:33 +01:00
Paul Banks 0cfb6051ea Add integration test for central config; fix central config WIP (#5752)
* Add integration test for central config; fix central config WIP

* Add integration test for central config; fix central config WIP

* Set proxy protocol correctly and begin adding upstream support

* Add upstreams to service config cache key and start new notify watcher if they change.

This doesn't update the tests to pass though.

* Fix some merging logic get things working manually with a hack (TODO fix properly)

* Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway

* Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does

* Fix up service manageer and API test failures

* Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally

* Remove version.go hack - will make integration test fail until release

* Remove unused code from commands and upstream merge

* Re-bump version to 1.5.0
2019-05-01 16:39:31 -07:00
Matt Keeler 69f902608c
Update to use a consulent build tag instead of just ent (#5759) 2019-05-01 11:11:27 -04:00
Matt Keeler 3145bf5230 Centralized Config CLI (#5731)
* Add HTTP endpoints for config entry management

* Finish implementing decoding in the HTTP Config entry apply endpoint

* Add CAS operation to the config entry apply endpoint

Also use this for the bootstrapping and move the config entry decoding function into the structs package.

* First pass at the API client for the config entries

* Fixup some of the ConfigEntry APIs

Return a singular response object instead of a list for the ConfigEntry.Get RPC. This gets plumbed through the HTTP API as well.

Dont return QueryMeta in the JSON response for the config entry listing HTTP API. Instead just return a list of config entries.

* Minor API client fixes

* Attempt at some ConfigEntry api client tests

These don’t currently work due to weak typing in JSON

* Get some of the api client tests passing

* Implement reflectwalk magic to correct JSON encoding a ProxyConfigEntry

Also added a test for the HTTP endpoint that exposes the problem. However, since the test doesn’t actually do the JSON encode/decode its still failing.

* Move MapWalk magic into a binary marshaller instead of JSON.

* Add a MapWalk test

* Get rid of unused func

* Get rid of unused imports

* Fixup some tests now that the decoding from msgpack coerces things into json compat types

* Stub out most of the central config cli

Fully implement the config read command.

* Basic config delete command implementation

* Implement config write command

* Implement config list subcommand

Not entirely sure about the output here. Its basically the read output indented with a line specifying the kind/name of each type which is also duplicated in the indented output.

* Update command usage

* Update some help usage formatting

* Add the connect enable helper cli command

* Update list command output

* Rename the config entry API client methods.

* Use renamed apis

* Implement config write tests

Stub the others with the noTabs tests.

* Change list output format

Now just simply output 1 line per named config

* Add config read tests

* Add invalid args write test.

* Add config delete tests

* Add config list tests

* Add connect enable tests

* Update some CLI commands to use CAS ops

This also modifies the HTTP API for a write op to return a boolean indicating whether the value was written or not.

* Fix up the HTTP API CAS tests as I realized they weren’t testing what they should.

* Update config entry rpc tests to properly test CAS

* Fix up a few more tests

* Fix some tests that using ConfigEntries.Apply

* Update config_write_test.go

* Get rid of unused import
2019-04-30 16:27:16 -07:00
Matt Keeler d0f410cd84
Make a few config entry endpoints return 404s and allow for snake_case and lowercase key names. (#5748) 2019-04-30 18:19:19 -04:00
Matt Keeler 4daa1585b0
ACL Token ID Initialization (#5307) 2019-04-30 11:45:36 -04:00
Kyle Havlovitz aba54cec55 Add HTTP endpoints for config entry management (#5718) 2019-04-29 18:08:09 -04:00
R.B. Boyer c6722fc43d
Merge pull request #5617 from hashicorp/f-acl-ux
Secure ACL Introduction for Kubernetes
2019-04-26 15:34:26 -05:00
Aestek 21a776e202 Fix: fail messages after a node rename replace the new node definition (#5520)
When receiving a serf faild message for a node which is not in the
catalog, do not perform a register request to set is serf heath to
critical as it could overwrite the node information and services if it
was renamed.

Fixes : #5518
2019-04-26 21:33:41 +01:00
R.B. Boyer e47d7eeddb acl: adding support for kubernetes auth provider login (#5600)
* auth providers
* binding rules
* auth provider for kubernetes
* login/logout
2019-04-26 14:49:25 -05:00
R.B. Boyer cc1aa3f973 acl: adding Roles to Tokens (#5514)
Roles are named and can express the same bundle of permissions that can
currently be assigned to a Token (lists of Policies and Service
Identities). The difference with a Role is that it not itself a bearer
token, but just another entity that can be tied to a Token.

This lets an operator potentially curate a set of smaller reusable
Policies and compose them together into reusable Roles, rather than
always exploding that same list of Policies on any Token that needs
similar permissions.

This also refactors the acl replication code to be semi-generic to avoid
3x copypasta.
2019-04-26 14:49:12 -05:00
R.B. Boyer 7928305279 making ACLToken.ExpirationTime a *time.Time value instead of time.Time (#5663)
This is mainly to avoid having the API return "0001-01-01T00:00:00Z" as
a value for the ExpirationTime field when it is not set. Unfortunately
time.Time doesn't respect the json marshalling "omitempty" directive.
2019-04-26 14:48:16 -05:00
R.B. Boyer db43fc3a20 acl: ACL Tokens can now be assigned an optional set of service identities (#5390)
These act like a special cased version of a Policy Template for granting
a token the privileges necessary to register a service and its connect
proxy, and read upstreams from the catalog.
2019-04-26 14:48:04 -05:00
R.B. Boyer 2144bd7fbd acl: tokens can be created with an optional expiration time (#5353) 2019-04-26 14:47:51 -05:00
Matt Keeler 15e80e4e76
Implement bootstrapping proxy defaults from the config file (#5714) 2019-04-26 14:25:03 -04:00
Matt Keeler 5befe0f5d5
Implement config entry replication (#5706) 2019-04-26 13:38:39 -04:00
Alvin Huang 8ceca2ace3
Add fmt and vet (#5671)
* add go fmt and vet

* go fmt fixes
2019-04-25 12:26:33 -04:00
Kyle Havlovitz 88e1d8ce03 Fill out the service manager functionality and fix tests 2019-04-23 00:17:28 -07:00
Kyle Havlovitz b186c3020c
Merge pull request #5615 from hashicorp/config-entry-rpc
Add RPC endpoints for config entry operations
2019-04-23 00:16:54 -07:00
Kyle Havlovitz fed7595d45 Rename config entry ACL methods 2019-04-22 23:55:11 -07:00
kaitlincarter-hc 7dcc727b4d
[docs] Server Performance (#5627)
* Moving server performance guide to docs.

* fixing broken links

* updating broken link

* fixing broken links
2019-04-17 13:17:12 -05:00
Matt Keeler afa1cc98d1
Implement data filtering of some endpoints (#5579)
Fixes: #4222 

# Data Filtering

This PR will implement filtering for the following endpoints:

## Supported HTTP Endpoints

- `/agent/checks`
- `/agent/services`
- `/catalog/nodes`
- `/catalog/service/:service`
- `/catalog/connect/:service`
- `/catalog/node/:node`
- `/health/node/:node`
- `/health/checks/:service`
- `/health/service/:service`
- `/health/connect/:service`
- `/health/state/:state`
- `/internal/ui/nodes`
- `/internal/ui/services`

More can be added going forward and any endpoint which is used to list some data is a good candidate.

## Usage

When using the HTTP API a `filter` query parameter can be used to pass a filter expression to Consul. Filter Expressions take the general form of:

```
<selector> == <value>
<selector> != <value>
<value> in <selector>
<value> not in <selector>
<selector> contains <value>
<selector> not contains <value>
<selector> is empty
<selector> is not empty
not <other expression>
<expression 1> and <expression 2>
<expression 1> or <expression 2>
```

Normal boolean logic and precedence is supported. All of the actual filtering and evaluation logic is coming from the [go-bexpr](https://github.com/hashicorp/go-bexpr) library

## Other changes

Adding the `Internal.ServiceDump` RPC endpoint. This will allow the UI to filter services better.
2019-04-16 12:00:15 -04:00
Kyle Havlovitz 690e9dd2c0 Move the ACL logic into the ConfigEntry interface 2019-04-10 14:27:28 -07:00
Kyle Havlovitz f2ed482680 Add RPC endpoints for config entry operations 2019-04-06 23:38:08 -07:00
Alvin Huang f45e495e38
Merge pull request #5376 from hashicorp/fix-tests
Fix tests in prep for CircleCI Migration
2019-04-04 17:09:32 -04:00
Kyle Havlovitz 5f569fb2ac
Merge pull request #5539 from hashicorp/service-config
Service config state model
2019-04-02 16:34:58 -07:00
Kyle Havlovitz a2fa9a0019 Cleaned up some error handling/comments around config entries 2019-04-02 15:42:12 -07:00
Kyle Havlovitz d16be2e269 Encode config entry FSM messages in a generic type 2019-03-28 00:06:56 -07:00
Kyle Havlovitz f6df5c9b3b Clean up service config state store methods 2019-03-27 16:52:38 -07:00
R.B. Boyer 0d1b496a52
acl: memdb filter of tokens-by-policy was inverted (#5575)
The inversion wasn't noticed because the parallel execution of TokenList
tests was operating incorrectly due to variable shadowing.
2019-03-27 15:24:44 -05:00
Jeff Mitchell 4243c3ae42
Move internal/ to sdk/ (#5568)
* Move internal/ to sdk/

* Add a readme to the SDK folder
2019-03-27 08:54:56 -04:00
Jeff Mitchell 47c390025b
Convert to Go Modules (#5517)
* First conversion

* Use serf 0.8.2 tag and associated updated deps

* * Move freeport and testutil into internal/

* Make internal/ its own module

* Update imports

* Add replace statements so API and normal Consul code are
self-referencing for ease of development

* Adapt to newer goe/values

* Bump to new cleanhttp

* Fix ban nonprintable chars test

* Update lock bad args test

The error message when the duration cannot be parsed changed in Go 1.12
(ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test.

* Update another test as well

* Bump travis

* Bump circleci

* Bump go-discover and godo to get rid of launchpad dep

* Bump dockerfile go version

* fix tar command

* Bump go-cleanhttp
2019-03-26 17:04:58 -04:00
Paul Banks d2e68a900a
Connect: Make Connect health queries unblock correctly (#5508)
* Make Connect health queryies unblock correctly in all cases and use optimal number of watch chans. Fixes #5506.

* Node check test cases and clearer bug test doc

* Comment update
2019-03-21 16:01:56 +00:00
Kyle Havlovitz d92577c16b Fix fsm serialization and add snapshot/restore 2019-03-20 16:13:13 -07:00
Kyle Havlovitz 17aa6a5a34 Fill out state store/FSM functions and add tests 2019-03-19 15:56:17 -07:00
Kyle Havlovitz 9d07add047 Add config types and state store table 2019-03-19 10:06:46 -07:00
Kyle Havlovitz aa4e26d102 Condense some test logic and add a comment about renaming 2019-03-18 16:15:36 -07:00
Paul Banks 0b5a078b95
Optimize health watching to single chan/goroutine. (#5449)
Refs #4984.

Watching chans for every node we touch in a health query is wasteful. In #4984 it shows that if there are more than 682 service instances we always fallback to watching all services which kills performance.

We already have a record in MemDB that is reliably update whenever the service health result should change thanks to per-service watch indexes.

So in general, provided there is at least one service instances and we actually have a service index for it (we always do now) we only ever need to watch a single channel.

This saves us from ever falling back to the general index and causing the performance cliff in #4984, but it also means fewer goroutines and work done for every blocking health query.

It also saves some allocations made during the query because we no longer have to populate a WatchSet with 3 chans per service instance which saves the internal map allocation.

This passes all state store tests except the one that explicitly checked for the fallback behaviour we've now optimized away and in general seems safe.
2019-03-15 20:18:48 +00:00
R.B. Boyer cd96af4fc0
acl: reduce complexity of token resolution process with alternative singleflighting (#5480)
acl: reduce complexity of token resolution process with alternative singleflighting

Switches acl resolution to use golang.org/x/sync/singleflight. For the
identity/legacy lookups this is a drop-in replacement with the same
overall approach to request coalescing.

For policies this is technically a change in behavior, but when
considered holistically is approximately performance neutral (with the
benefit of less code).

There are two goals with this blob of code (speaking specifically of
policy resolution here):

  1) Minimize cross-DC requests.
  2) Minimize client-to-server LAN requests.

The previous iteration of this code was optimizing for the case of many
possibly different tokens being resolved concurrently that have a
significant overlap in linked policies such that deduplication would be
worth the complexity. While this is laudable there are some things to
consider that can help to adjust expectations:

  1) For v1.4+ policies are always replicated, and once a single policy
  shows up in a secondary DC the replicated data is considered
  authoritative for requests made in that DC. This means that our
  earlier concerns about minimizing cross-DC requests are irrelevant
  because there will be no cross-DC policy reads that occur.

  2) For Server nodes the in-memory ACL policy cache is capped at zero,
  meaning it has no caching. Only Client nodes run with a cache. This
  means that instead of having an entire DC's worth of tokens (what a
  Server might see) that can have policy resolutions coalesced these
  nodes will only ever be seeing node-local token resolutions. In a
  reasonable worst-case scenario where a scheduler like Kubernetes has
  "filled" a node with Connect services, even that will only schedule
  ~100 connect services per node. If every service has a unique token
  there will only be 100 tokens to coalesce and even then those requests
  have to occur concurrently AND be hitting an empty consul cache.

Instead of seeing a great coalescing opportunity for cutting down on
redundant Policy resolutions, in practice it's far more likely given
node densities that you'd see requests for the same token concurrently
than you would for two tokens sharing a policy concurrently (to a degree
that would warrant the overhead of the current variation of
singleflighting.

Given that, this patch switches the Policy resolution process to only
singleflight by requesting token (but keeps the cache as by-policy).
2019-03-14 09:35:34 -05:00
Kyle Havlovitz 8ae6547934 Update state store test for changing node ID 2019-03-13 17:05:31 -07:00
Kyle Havlovitz 6932b06f97 Add a test for changing a failed node's ID 2019-03-13 15:39:07 -07:00
Hans Hasselberg 7e11dd82aa
agent: enable reloading of tls config (#5419)
This PR introduces reloading tls configuration. Consul will now be able to reload the TLS configuration which previously required a restart. It is not yet possible to turn TLS ON or OFF with these changes. Only when TLS is already turned on, the configuration can be reloaded. Most importantly the certificates and CAs.
2019-03-13 10:29:06 +01:00
R.B. Boyer 2e175be41b
acl: correctly extend the cache for acl identities during resolution (#5475) 2019-03-12 10:23:43 -05:00
Aestek 4bea29f15a [catalog] Update the node's services indexes on update (#5458)
Node updates were not updating the service indexes, which are used for
service related queries. This caused the X-Consul-Index to stay the same
after a node update as seen from a service query even though the node
data is returned in heath queries. If that happened in between queries
the client would miss this change.
We now update the indexes of the services on the node when it is
updated.

Fixes: #5450
2019-03-11 14:48:19 +00:00
Kyle Havlovitz 1e4523f55e Add logic to allow changing a failed node's ID 2019-03-07 22:42:54 -08:00
Alvin Huang 8cb8108b1b fix typos 2019-03-06 14:47:33 -05:00
R.B. Boyer f4a3b9d518
fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
R.B. Boyer 2ffbea41c8 improve flaky LANReap tests by expliciting configuring the tombstone timeout
In TestServer_LANReap autopilot is running, so the alternate flow
through the serf reaping function is possible. In that situation the
ReconnectTimeout is not relevant so for parity also override the
TombstoneTimeout value as well.

For additional parity update the TestServer_WANReap and
TestClient_LANReap versions of this test in the same way even though
autopilot is irrelevant here .
2019-03-05 14:34:03 -06:00
Matt Keeler 90040f8bff Fixes for CVE-2019-8336
Fix error in detecting raft replication errors.

Detect redacted token secrets and prevent attempting to insert.

Add a Redacted field to the TokenBatchRead and TokenRead RPC endpoints

This will indicate whether token secrets have been redacted.

Ensure any token with a redacted secret in secondary datacenters is removed.

Test that redacted tokens cannot be replicated.
2019-03-04 19:13:24 +00:00
Matt Keeler 6e6910ea11
Dont modify memdb owned token data for get/list requests of tokens (#5412)
Previously we were fixing up the token links directly on the *ACLToken returned by memdb. This invalidated some assumptions that a snapshot is immutable as well as potentially being able to cause a crash.

The fix here is to give the policy link fixing function copy on write semantics. When no fixes are necessary we can return the memdb object directly, otherwise we copy it and create a new list of links.

Eventually we might find a better way to keep those policy links in sync but for now this fixes the issue.
2019-03-04 09:28:46 -05:00
Matt Keeler 200c0fb3e9
Call RemoveServer for reap events (#5317)
This ensures that servers are removed from RPC routing when they are reaped.
2019-03-04 09:19:35 -05:00
R.B. Boyer 6955186239 fix ignored errors in state store internals as reported by errcheck 2019-03-01 14:18:00 -06:00
Matt Keeler 118adbb123
ACL Token Persistence and Reloading (#5328)
This PR adds two features which will be useful for operators when ACLs are in use.

1. Tokens set in configuration files are now reloadable.
2. If `acl.enable_token_persistence` is set to `true` in the configuration, tokens set via the `v1/agent/token` endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload)

Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change.

Some other secondary changes:

* Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly.
* Allowed better paths in the `v1/agent/token/` API. Instead of paths like: `v1/agent/token/acl_replication_token` the path can now be just `v1/agent/token/replication`. The old paths remain to be valid. 
* Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, its no longer used only for ACL token/policy replication. The new functions will detect 404s and fallback to using the older token paths when talking to pre-1.4.3 agents.
* Docs updated to reflect the API additions and to show using the new endpoints.
* Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs.
2019-02-27 14:28:31 -05:00
Hans Hasselberg 786b3b1095
Centralise tls configuration part 1 (#5366)
In order to be able to reload the TLS configuration, we need one way to generate the different configurations.

This PR introduces a `tlsutil.Configurator` which holds a `tlsutil.Config`. Afterwards it is responsible for rendering every `tls.Config`. In this particular PR I moved `IncomingHTTPSConfig`, `IncomingTLSConfig`, and `OutgoingTLSWrapper` into `tlsutil.Configurator`.

This PR is a pure refactoring - not a single feature added. And not a single test added. I only slightly modified existing tests as necessary.
2019-02-26 16:52:07 +01:00
Alvin Huang 77eecf1046 add wait to TestClient_JoinLAN 2019-02-22 17:34:45 -05:00
Alvin Huang 136df63e2c add retry to TestResetSessionTimerLocked 2019-02-22 17:34:45 -05:00
R.B. Boyer c2a30c5fdd fix incorrect body of TestACLEndpoint_PolicyBatchRead
Lifted from PR #5307 as it was an unrelated drive-by fix on that PR anyway.

s/token/policy/
2019-02-22 09:32:51 -06:00
R.B. Boyer ef8258cd4e test: switch test file from assert -> require for consistency
Also in acl_endpoint_test.go:

* convert logical blocks in some token tests to subtests
* remove use of require.New

This removes a lot of noise in a later PR.
2019-02-14 14:21:19 -06:00
R.B. Boyer adbe8ed370 correct some typos 2019-02-13 13:02:12 -06:00
R.B. Boyer 88bb53d001 ensure that we plumb our configured logger into all parts of the raft library 2019-02-13 13:02:09 -06:00
R.B. Boyer 2c983902be reduce the local scope of variable 2019-02-13 11:54:28 -06:00
R.B. Boyer f2ed3a3777
clarify the ACL.PolicyDelete endpoint (#5337)
There was an errant early-return in PolicyDelete() that bypassed the
rest of the function.  This was ok because the only caller of this
function ignores the results.

This removes the early-return making it structurally behave like
TokenDelete() and for both PolicyDelete and TokenDelete clarify the lone
callers to indicate that the return values are ignored.

We may wish to avoid the entire return value as well, but this patch
doesn't go that far.
2019-02-13 09:16:30 -06:00
R.B. Boyer 324ba5df17
update TestStateStore_ACLBootstrap to not rely upon request mutation (#5335) 2019-02-12 16:09:26 -06:00
Matt Keeler 7073ba4ed2
Move autopilot initialization to prevent race (#5322)
`establishLeadership` invoked during leadership monitoring may use autopilot to do promotions etc. There was a race with doing that and having autopilot initialized and this fixes it.
2019-02-11 11:12:24 -05:00
Matt Keeler acfd87c673
Improve Connect with Prepared Queries (#5291)
Given a query like:

```
{
   "Name": "tagged-connect-query",
   "Service": {
      "Service": "foo",
      "Tags": ["tag"],
      "Connect": true
   }
}
```

And a Consul configuration like:

```
{
   "services": [
      "name": "foo",
      "port": 8080,
      "connect": { "sidecar_service": {} },
      "tags": ["tag"]
   ]
}
```

If you executed the query it would always turn up with 0 results. This was because the sidecar service was being created without any tags. You could instead make your config look like:

```
{
   "services": [
      "name": "foo",
      "port": 8080,
      "connect": { "sidecar_service": {
         "tags": ["tag"]
      } },
      "tags": ["tag"]
   ]
}
```

However that is a bit redundant for most cases. This PR ensures that the tags and service meta of the parent service get copied to the sidecar service. If there are any tags or service meta set in the sidecar service definition then this copying does not take place. After the changes, the query will now return the expected results.

A second change was made to prepared queries in this PR which is to allow filtering on ServiceMeta just like we allow for filtering on NodeMeta.
2019-02-04 09:36:51 -05:00
R.B. Boyer e1e4249e90
testutil: redirect some test agent logs to testing.T.Logf (#5304)
When tests fail, only the logs for the failing run are dumped to the
console which helps in diagnosis. This is easily added to other test
scenarios as they come up.
2019-02-01 09:21:54 -06:00
Kyle Havlovitz 88c044759f
connect: Forward intention RPCs if this isn't the primary 2019-01-22 11:29:21 -08:00
Kyle Havlovitz 6b28434f8a
Merge pull request #5249 from hashicorp/ca-fixes-oss
Minor CA fixes
2019-01-22 11:25:09 -08:00
Kyle Havlovitz 5bdf130767
Merge pull request #4869 from hashicorp/txn-checks
Add node/service/check operations to transaction api
2019-01-22 11:16:09 -08:00
Matt Keeler 579a8b32ed
Fix several ACL token/policy resolution issues. (#5246)
* Fix 2 remote ACL policy resolution issues

1 - Use the right method to fire async not found errors when the ACL.PolicyResolve RPC returns that error. This was previously accidentally firing a token result instead of a policy result which would have effectively done nothing (unless there happened to be a token with a secret id == the policy id being resolved.

2. When concurrent policy resolution is being done we single flight the requests. The bug before was that for the policy resolution that was going to piggy back on anothers RPC results it wasn’t waiting long enough for the results to come back due to looping with the wrong variable.

* Fix a handful of other edge case ACL scenarios

The main issue was that token specific issues (not able to access a particular policy or the token being deleted after initial fetching) were poisoning the policy cache.

A second issue was that for concurrent token resolutions, the first resolution to get started would go fetch all the policies. If before the policies were retrieved a second resolution request came in, the new request would register watchers for those policies but then never block waiting for them to complete. This resulted in using the default policy when it shouldn't have.
2019-01-22 13:14:43 -05:00
Paul Banks ef9f27cbc8
connect: tame thundering herd of CSRs on CA rotation (#5228)
* Support rate limiting and concurrency limiting CSR requests on servers; handle CA rotations gracefully with jitter and backoff-on-rate-limit in client

* Add CSR rate limiting docs

* Fix config naming and add tests for new CA configs
2019-01-22 17:19:36 +00:00
Kyle Havlovitz ddc4a8d848
oss: add the enterprise server stub for intention replication check 2019-01-18 17:32:10 -08:00
Matt Keeler 1ec5f2a27f
Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs (#5211)
* Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs

This ensures that future certificate signings will have a strictly greater ModifyIndex than any previous certs signed.
2019-01-11 16:04:57 -05:00
Aestek 4afbe792df Improve blocking queries on services that do not exist (#4810)
## Background

When making a blocking query on a missing service (was never registered, or is not registered anymore) the query returns as soon as any service is updated.
On clusters with frequent updates (5~10 updates/s in our DCs) these queries virtually do not block, and clients with no protections againt this waste ressources on the agent and server side. Clients that do protect against this get updates later than they should because of the backoff time they implement between requests.

## Implementation

While reducing the number of unnecessary updates we still want :
* Clients to be notified as soon as when the last instance of a service disapears.
* Clients to be notified whenever there's there is an update for the service.
* Clients to be notified as soon as the first instance of the requested service is added.

To reduce the number of unnecessary updates we need to block when a request to a missing service is made. However in the following case :

1. Client `client1` makes a query for service `foo`, gets back a node and X-Consul-Index 42
2. `foo` is unregistered 
3. `client1`  makes a query for `foo` with `index=42` -> `foo` does not exist, the query blocks and `client1` is not notified of the change on `foo` 

We could store the last raft index when each service was last alive to know wether we should block on the incoming query or not, but that list could grow indefinetly. 
We instead store the last raft index when a service was unregistered and use it when a query targets a service that does not exist. 
When a service `srv` is unregistered this "missing service index" is always greater than any X-Consul-Index held by the clients while `srv` was up, allowing us to immediatly notify them.

1. Client `client1` makes a query for service `foo`, gets back a node and `X-Consul-Index: 42`
2. `foo` is unregistered, we set the "missing service index" to 43 
3. `client1` makes a blocking query for `foo` with `index=42` -> `foo` does not exist, we check against the "missing service index" and return immediatly with `X-Consul-Index: 43`
4. `client1` makes a blocking query for `foo` with `index=43` -> we block
5. Other changes happen in the cluster, but foo still doesn't exist and "missing service index" hasn't changed, the query is still blocked
6. `foo` is registered again on index 62 -> `foo` exists and its index is greater than 43, we unblock the query
2019-01-11 09:26:14 -05:00
Matt Keeler 1048f3d5e7
acl: Prevent tokens from deleting themselves (#5210)
Fixes #4897 

Also apparently token deletion could segfault in secondary DCs when attempting to delete non-existant tokens. For that reason both checks are wrapped within the non-nil check.
2019-01-10 09:22:51 -05:00
Kyle Havlovitz c07c5446a8 txn: clean up some state store/acl code 2019-01-09 11:59:23 -08:00
Pierre Souchay ae7f88f995 Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918)
* Avoid to have infinite recursion in DNS lookups when resolving CNAMEs

This will avoid killing Consul when a Service.Address is using CNAME
to a Consul CNAME that creates an infinite recursion.

This will fix https://github.com/hashicorp/consul/issues/4907

* Use maxRecursionLevel = 3 to allow several recursions
2019-01-07 16:53:54 -05:00
Paul Banks b29bc906ee
bugfix: use ServiceTags to generate cache key hash (#4987)
* bugfix: use ServiceTags to generate cahce key hash

* update unit test

* update

* remote print log

* Update .gitignore

* Completely deprecate ServiceTag field internally for clarity

* Add explicit test for CacheInfo cases
2019-01-07 21:30:47 +00:00
Kyle Havlovitz 995e728ea0 txn: fix an issue with querying nodes by name instead of ID 2018-12-12 12:46:33 -08:00
Pierre Souchay f4dc8b42e0 [Travis][UnstableTests] Fixed unstable tests in travis (#5013)
* [Travis][UnstableTests] Fixed unstable tests in travis as seen in https://travis-ci.org/hashicorp/consul/jobs/460824602

* Fixed unstable tests in https://travis-ci.org/hashicorp/consul/jobs/460857687
2018-12-12 12:09:42 -08:00
Kyle Havlovitz 67bac7a815 api: add support for new txn operations 2018-12-12 10:54:09 -08:00
Kyle Havlovitz de4dbf583e txn: add tests for RPC endpoint 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 6a512e5c0f txn: add ACL enforcement/validation to new txn ops 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 9467067432 state: add tests for new txn ops 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 7759e9ea8b txn: add service operations 2018-12-12 10:04:10 -08:00
Kyle Havlovitz ab58986ac3 txn: add node operations 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 01e1b5b1df txn: add pre-check operations to txn endpoint 2018-12-12 10:04:10 -08:00
Kyle Havlovitz b371ea8783 Add check operations to transaction api 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 4f2715d4e2 connect/ca: prevent blank CA config in snapshot
This PR both prevents a blank CA config from being written out to
a snapshot and allows Consul to gracefully recover from a snapshot
with an invalid CA config.

Fixes #4954.
2018-12-06 17:40:53 -08:00
R.B. Boyer c1eccfd1db
agent: remove some stray fmt.Print* calls (#5015) 2018-11-29 09:45:51 -06:00
Pierre Souchay c5ae9caa28 Fixed another list of unstable unit tests in travis (#4915)
* Fixed another list of unstable unit tests in travis

Fixed failing tests in https://travis-ci.org/hashicorp/consul/jobs/451357061

* Fixed another list of unstable unit tests in travis.

Fixed failing tests in https://travis-ci.org/hashicorp/consul/jobs/451357061
2018-11-20 11:27:26 +00:00
Kyle Havlovitz 76f102a1e0
Merge pull request #4952 from hashicorp/test-version
tests: Bump test server version to 1.4.0
2018-11-13 13:37:10 -08:00
R.B. Boyer 934fae659f
acl: add stub hooks to support some plumbing in enterprise (#4951) 2018-11-13 15:35:54 -06:00
Kyle Havlovitz 269354c61d
oss: bump test server version to 1.4.0 2018-11-13 13:13:26 -08:00
Aestek 4942e66440 Fix catalog tag filter backward compat (#4944)
Fix catalog service node filtering (ex /v1/catalog/service/srv?tag=tag1)
between agent version <=v1.2.3 and server >=v1.3.0.
New server version did not account for the old field when filtering
hence request made from old agent were not tag-filtered.
2018-11-13 14:44:36 +00:00
Kyle Havlovitz 4a73a59d70
Merge pull request #4917 from hashicorp/replication-token-cleanup
Use acl replication_token for connect
2018-11-12 09:12:54 -08:00
Kyle Havlovitz 972177071d update non-voting server test to fix enterprise diff 2018-11-09 12:50:24 -08:00
Kyle Havlovitz 643bd13aed oss: do a proper check-and-set on the CA roots/config fsm operation 2018-11-09 12:36:23 -08:00
R.B. Boyer 2afc2a3c3b
acl: fixes ACL replication for legacy tokens without AccessorIDs (#4885) 2018-11-07 07:59:44 -08:00
Kyle Havlovitz e8dd89359a
agent: fix formatting 2018-11-07 02:16:03 -08:00
R.B. Boyer 9211d2701d
fix comment typos (#4890) 2018-11-02 12:00:39 -05:00
Kyle Havlovitz 8337e3d8c0
Merge pull request #4872 from hashicorp/node-snapshot-fix
Node ID/datacenter snapshot fix
2018-10-31 15:51:07 -07:00
Matt Keeler db2cf01406 Adds documentation for the new ACL APIs (#4851)
* Update the ACL API docs

* Add a CreateTime to the anon token

Also require acl:read permissions at least to perform rule translation. Don’t want someone DoSing the system with an open endpoint that actually does a bit of work.

* Fix one place where I was referring to id instead of AccessorID

* Add godocs for the API package additions.

* Minor updates: removed some extra commas and updated the acl intro paragraph

* minor tweaks

* Updated the language to be clearer

* Updated the language to be clearer for policy page

* I was also confused by that! Your updates are much clearer.

Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Sounds much better.

Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Updated sidebar layout and deprecated warning
2018-10-31 15:11:51 -07:00
Matt Keeler f9cf0eb36e Remaining ACL Unit Tests (#4852)
* Add leader token upgrade test and fix various ACL enablement bugs

* Update the leader ACL initialization tests.

* Add a StateStore ACL tests for ACLTokenSet and ACLTokenGetBy* functions

* Advertise the agents acl support status with the agent/self endpoint.

* Make batch token upsert CAS’able to prevent consistency issues with token auto-upgrade

* Finish up the ACL state store token tests

* Finish the ACL state store unit tests

Also rename some things to make them more consistent.

* Do as much ACL replication testing as I can.
2018-10-31 13:00:46 -07:00
Kyle Havlovitz bd6d0e598f fsm: update snapshot/restore test to include ID and datacenter 2018-10-30 15:53:14 -07:00
Kyle Havlovitz 6483356329 fsm: add missing ID/datacenter to persistNodes 2018-10-30 15:52:54 -07:00
Matt Keeler 790cf90ee5
Fix the NonVoter Bootstrap test (#4786) 2018-10-24 10:23:50 -04:00
Kyle Havlovitz 819566f6b7 fsm: add Intention operations to transactions for internal use 2018-10-19 10:02:28 -07:00
Matt Keeler 34b53e7099 A few misc fixes found by go vet 2018-10-19 12:28:36 -04:00
Matt Keeler 18b29c45c4
New ACLs (#4791)
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface and with a new cluster the ACL system will look very similar to that of Nomads. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers running in legacy mode needs to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.

So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
Pierre Souchay fab55bee2b dns: implements prefix lookups for DNS TTL (#4605)
This will fix https://github.com/hashicorp/consul/issues/4509 and allow forinstance lb-* to match services lb-001 or lb-service-007.
2018-10-19 08:41:04 -07:00
Kyle Havlovitz c617326470 re-add Connect multi-dc config changes
This reverts commit 8bcfbaffb6.
2018-10-19 08:41:03 -07:00
Jack Pearkes 8bcfbaffb6 Revert "Connect multi-dc config" (#4784) 2018-10-11 17:32:45 +01:00
Aestek 25f04fbd21 [Security] Add finer control over script checks (#4715)
* Add -enable-local-script-checks options

These options allow for a finer control over when script checks are enabled by
giving the option to only allow them when they are declared from the local
file system.

* Add documentation for the new option

* Nitpick doc wording
2018-10-11 13:22:11 +01:00
Rebecca Zanzig 34e5516834 Support multiple tags for health and catalog http api endpoints (#4717)
* Support multiple tags for health and catalog api endpoints

Fixes #1781.

Adds a `ServiceTags` field to the ServiceSpecificRequest to support
multiple tags, updates the filter logic in the catalog store, and
propagates these change through to the health and catalog endpoints.

Note: Leaves `ServiceTag` in the struct, since it is being used as
part of the DNS lookup, which in turn uses the health check.

* Update the api package to support multiple tags

Includes additional tests.

* Update new tests to use the `require` library

* Update HealthConnect check after a bad merge
2018-10-11 12:50:05 +01:00
Pierre Souchay 51b33ef015 [Performance On Large clusters] Reduce updates on large services (#4720)
* [Performance On Large clusters] Checks do update services/nodes only when really modified to avoid too many updates on very large clusters

In a large cluster, when having a few thousands of nodes, the anti-entropy
mechanism performs lots of changes (several per seconds) while
there is no real change. This patch wants to improve this in order
to increase Consul scalability when using many blocking requests on
health for instance.

* [Performance for large clusters] Only updates index of service if service is really modified

* [Performance for large clusters] Only updates index of nodes if node is really modified

* Added comments / ensure IsSame() has clear semantics

* Avoid having modified boolean, return nil directly if stutures are Same

* Fixed unstable unit tests TestLeader_ChangeServerID

* Rewrite TestNode_IsSame() for better readability as suggested by @banks

* Rename ServiceNode.IsSame() into IsSameService() + added unit tests

* Do not duplicate TestStructs_ServiceNode_Conversions() and increase test coverage of IsSameService

* Clearer documentation in IsSameService

* Take into account ServiceProxy into ServiceNode.IsSameService()

* Fixed IsSameService() with all new structures
2018-10-11 12:42:39 +01:00
Pierre Souchay 251156eb68 Added SOA configuration for DNS settings. (#4714)
This will allow to fine TUNE SOA settings sent by Consul in DNS responses,
for instance to be able to control negative ttl.

Will fix: https://github.com/hashicorp/consul/issues/4713

# Example

Override all settings:

* min_ttl: 0 => 60s
* retry: 600 (10m) => 300s (5 minutes),
* expire: 86400 (24h) => 43200 (12h)
* refresh: 3600 (1h) => 1800 (30 minutes)

```
consul agent -dev -hcl 'dns_config={soa={min_ttl=60,retry=300,expire=43200,refresh=1800}}'
```

Result:
```
dig +multiline @localhost -p 8600 service.consul

; <<>> DiG 9.12.1 <<>> +multiline @localhost -p 8600 service.consul
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 36557
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;service.consul.		IN A

;; AUTHORITY SECTION:
consul.			0 IN SOA ns.consul. hostmaster.consul. (
				1537959133 ; serial
				1800       ; refresh (30 minutes)
				300        ; retry (5 minutes)
				43200      ; expire (12 hours)
				60         ; minimum (1 minute)
				)

;; Query time: 4 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Wed Sep 26 12:52:13 CEST 2018
;; MSG SIZE  rcvd: 93
```
2018-10-10 15:50:56 -04:00
Kyle Havlovitz e4349c5710 connect/ca: more OSS split for multi-dc 2018-10-10 12:17:59 -07:00
Kyle Havlovitz 0da4f2b2e8 connect/ca: split CA initialization logic between oss/enterprise 2018-10-10 12:17:59 -07:00
Kyle Havlovitz 56dc426227 agent: add primary_datacenter and connect replication config options 2018-10-10 12:17:59 -07:00
Kyle Havlovitz 98d95cfa80 connect: add ExternalTrustDomain to CARoot fields 2018-10-10 12:16:47 -07:00
Kyle Havlovitz 46c829b879 docs: deprecate acl_datacenter and replace it with primary_datacenter 2018-10-10 12:16:47 -07:00
Paul Banks b83bbf248c Add Proxy Upstreams to Service Definition (#4639)
* Refactor Service Definition ProxyDestination.

This includes:
 - Refactoring all internal structs used
 - Updated tests for both deprecated and new input for:
   - Agent Services endpoint response
   - Agent Service endpoint response
   - Agent Register endpoint
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Register
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Services endpoint response
   - Catalog Node endpoint response
   - Catalog Service endpoint response
 - Updated API tests for all of the above too (both deprecated and new forms of register)

TODO:
 - config package changes for on-disk service definitions
 - proxy config endpoint
 - built-in proxy support for new fields

* Agent proxy config endpoint updated with upstreams

* Config file changes for upstreams.

* Add upstream opaque config and update all tests to ensure it works everywhere.

* Built in proxy working with new Upstreams config

* Command fixes and deprecations

* Fix key translation, upstream type defaults and a spate of other subtele bugs found with ned to end test scripts...

TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct.

* Fix translated keys in API registration.
≈

* Fixes from docs
 - omit some empty undocumented fields in API
 - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally.

* Documentation updates for Upstreams in service definition

* Fixes for tests broken by many refactors.

* Enable travis on f-connect branch in this branch too.

* Add consistent Deprecation comments to ProxyDestination uses

* Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs
2018-10-10 16:55:34 +01:00
Alex Dadgar 43d0f96c42 do not bootstrap with non voters 2018-09-19 17:41:36 -07:00
Kyle Havlovitz d515d25856
Merge pull request #4644 from hashicorp/ca-refactor
connect/ca: rework initialization/root generation in providers
2018-09-13 13:08:34 -07:00
Paul Banks 74f2a80a42
Fix CA pruning when CA config uses string durations. (#4669)
* Fix CA pruning when CA config uses string durations.

The tl;dr here is:

 - Configuring LeafCertTTL with a string like "72h" is how we do it by default and should be supported
 - Most of our tests managed to escape this by defining them as time.Duration directly
 - Out actual default value is a string
 - Since this is stored in a map[string]interface{} config, when it is written to Raft it goes through a msgpack encode/decode cycle (even though it's written from server not over RPC).
 - msgpack decode leaves the string as a `[]uint8`
 - Some of our parsers required string and failed
 - So after 1 hour, a default configured server would throw an error about pruning old CAs
 - If a new CA was configured that set LeafCertTTL as a time.Duration, things might be OK after that, but if a new CA was just configured from config file, intialization would cause same issue but always fail still so would never prune the old CA.
 - Mostly this is just a janky error that got passed tests due to many levels of complicated encoding/decoding.

tl;dr of the tl;dr: Yay for type safety. Map[string]interface{} combined with msgpack always goes wrong but we somehow get bitten every time in a new way :D

We already fixed this once! The main CA config had the same problem so @kyhavlov already wrote the mapstructure DecodeHook that fixes it. It wasn't used in several places it needed to be and one of those is notw in `structs` which caused a dependency cycle so I've moved them.

This adds a whole new test thta explicitly tests the case that broke here. It also adds tests that would have failed in other places before (Consul and Vaul provider parsing functions). I'm not sure if they would ever be affected as it is now as we've not seen things broken with them but it seems better to explicitly test that and support it to not be bitten a third time!

* Typo fix

* Fix bad Uint8 usage
2018-09-13 15:43:00 +01:00
Pierre Souchay 1a906ef34e Fix more unstable tests in agent and command 2018-09-12 14:49:27 +01:00
Kyle Havlovitz c112a72880
connect/ca: some cleanup and reorganizing of the new methods 2018-09-11 16:43:04 -07:00
Pierre Souchay 22500f242e Fix unstable tests in agent, api, and command/watch 2018-09-10 16:58:53 +01:00
Pierre Souchay eddcf228ea Implementation of Weights Data structures (#4468)
* Implementation of Weights Data structures

Adding this datastructure will allow us to resolve the
issues #1088 and #4198

This new structure defaults to values:
```
   { Passing: 1, Warning: 0 }
```

Which means, use weight of 0 for a Service in Warning State
while use Weight 1 for a Healthy Service.
Thus it remains compatible with previous Consul versions.

* Implemented weights for DNS SRV Records

* DNS properly support agents with weight support while server does not (backwards compatibility)

* Use Warning value of Weights of 1 by default

When using DNS interface with only_passing = false, all nodes
with non-Critical healthcheck used to have a weight value of 1.
While having weight.Warning = 0 as default value, this is probably
a bad idea as it breaks ascending compatibility.

Thus, we put a default value of 1 to be consistent with existing behaviour.

* Added documentation for new weight field in service description

* Better documentation about weights as suggested by @banks

* Return weight = 1 for unknown Check states as suggested by @banks

* Fixed typo (of -> or) in error message as requested by @mkeeler

* Fixed unstable unit test TestRetryJoin

* Fixed unstable tests

* Fixed wrong Fatalf format in `testrpc/wait.go`

* Added notes regarding DNS SRV lookup limitations regarding number of instances

* Documentation fixes and clarification regarding SRV records with weights as requested by @banks

* Rephrase docs
2018-09-07 15:30:47 +01:00
Kyle Havlovitz 546bdf8663
connect/ca: add Configure/GenerateRoot to provider interface 2018-09-06 19:18:59 -07:00
Pierre Souchay 9a2ae6e8eb Fixed more flaky tests in ./agent/consul (#4617) 2018-09-04 14:02:47 +01:00
Freddy d7a404f2ee
Bugfix: Use "%#v" when formatting structs (#4600) 2018-08-28 12:37:34 -04:00
Pierre Souchay b898131723 [BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554)
Ensure that DB is properly initialized when performing stale queries

Addresses:
- https://github.com/hashicorp/consul-replicate/issues/82
- https://github.com/hashicorp/consul/issues/3975
- https://github.com/hashicorp/consul-template/issues/1131
2018-08-23 12:06:39 -04:00
Kyle Havlovitz e5e1f867e5
Merge branch 'master' into ca-snapshot-fix 2018-08-16 13:00:54 -07:00
Kyle Havlovitz f186edc42c
fsm: add connect service config to snapshot/restore test 2018-08-16 12:58:54 -07:00
nickmy9729 beddf03b26 Added code to allow snapshot inclusion of NodeMeta (#4527) 2018-08-16 15:33:35 -04:00
Kyle Havlovitz b51d76f469
fsm: add missing CA config to snapshot/restore logic 2018-08-16 11:58:50 -07:00
Kyle Havlovitz 4b35d877ca
autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
Kyle Havlovitz ea14482376
Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
Pierre Souchay 0d6de257a2 Display more information about check being not properly added when it fails (#4405)
* Display more information about check being not properly added when it fails

It follows an incident where we add lots of error messages:

  [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration

That seems related to Consul failing to restart on respective agents.

Having Node information as well as service information would help diagnose the issue.

* Renamed ensureCheckIfNodeMatches() as requested by @banks
2018-08-14 17:45:33 +01:00
Pierre Souchay ef3b81ab13 Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415)
* Allow to rename nodes with IDs, will fix #3974 and #4413

This change allow to rename any well behaving recent agent with an
ID to be renamed safely, ie: without taking the name of another one
with case insensitive comparison.

Deprecated behaviour warning
----------------------------

Due to asceding compatibility, it is still possible however to
"take" the name of another name by not providing any ID.

Note that when not providing any ID, it is possible to have 2 nodes
having similar names with case differences, ie: myNode and mynode
which might lead to DB corruption on Consul server side and
lead to server not properly restarting.

See #3983 and #4399 for Context about this change.

Disabling registration of nodes without IDs as specified in #4414
should probably be the way to go eventually.

* Removed the case-insensitive search when adding a node within the else
block since it breaks the test TestAgentAntiEntropy_Services

While the else case is probably legit, it will be fixed with #4414 in
a later release.

* Added again the test in the else to avoid duplicated names, but
enforce this test only for nodes having IDs.

Thus most tests without any ID will work, and allows us fixing

* Added more tests regarding request with/without IDs.

`TestStateStore_EnsureNode` now test registration and renaming with IDs

`TestStateStore_EnsureNodeDeprecated` tests registration without IDs
and tests removing an ID from a node as well as updated a node
without its ID (deprecated behaviour kept for backwards compatibility)

* Do not allow renaming in case of conflict, including when other node has no ID

* Fixed function GetNodeID that was not working due to wrong type when searching node from its ID

Thus, all tests about renaming were not working properly.

Added the full test cas that allowed me to detect it.

* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID()

* Added separate TestStateStore_GetNodeID to test GetNodeID.

More complete test coverage for GetNodeID

* Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn`

Also fixed comments to be clearer after remarks from @banks

* Fixed error message in unit test to match test case

* Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler
2018-08-10 11:30:45 -04:00
Siva Prasad c88900aaa9
PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512)
* Fixes TestAgent_IndexChurn

* Fixes TestPreparedQuery_Wrapper

* Increased sleep in agent_test for IndexChurn to 500ms

* Made the comment about joinWAN operation much less of a cliffhanger
2018-08-09 12:40:07 -04:00
Armon Dadgar 4f1fd34e9e consul: Update buffer sizes 2018-08-08 10:26:58 -07:00
Siva Prasad 288d350a73
Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497)
* Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)"

This reverts commit cec5d72396.

* Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)"

This reverts commit 589b589b53.
2018-08-07 08:29:48 -04:00
Pierre Souchay cec5d72396 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additionall retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00
Siva Prasad 589b589b53
CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)
* connect: fix an issue with Consul CA bootstrapping being interrupted

* streamline change server id test
2018-08-06 16:15:24 -04:00
Kyle Havlovitz fa0d8aff33
fix inconsistency in TestConnectCAConfig_GetSet 2018-07-26 07:46:47 -07:00
Kyle Havlovitz ed87949385
Merge pull request #4400 from hashicorp/leaf-cert-ttl
Add configurable leaf cert TTL to Connect CA
2018-07-25 17:53:25 -07:00
Paul Banks 8cbeb29e73
Fixes #4421: General solution to stop blocking queries with index 0 (#4437)
* Fix theoretical cache collision bug if/when we use more cache types with same result type

* Generalized fix for blocking query handling when state store methods return zero index

* Refactor test retry to only affect CI

* Undo make file merge

* Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert

* Explicit error for Roots endpoint if connect is disabled

* Fix tests that were asserting old behaviour
2018-07-25 20:26:27 +01:00
Kyle Havlovitz ce10de036e
connect/ca: check LeafCertTTL when rotating expired roots 2018-07-20 16:04:04 -07:00
Kyle Havlovitz d6ca015a42
connect/ca: add configurable leaf cert TTL 2018-07-16 13:33:37 -07:00
Matt Keeler 63d5c069fc
Merge pull request #4379 from hashicorp/persist-intermediates
connect: persist intermediate CAs on leader change
2018-07-12 12:09:13 -04:00
Matt Keeler 0e83059d1f
Revert "Allow changing Node names since Node now have IDs" 2018-07-12 11:19:21 -04:00
Matt Keeler 91150cca59 Fixup formatting 2018-07-12 10:14:26 -04:00
Matt Keeler 3807e04de9 Revert PR 4294 - Catalog Register: Generate UUID for services registered without one
UUID auto-generation here causes trouble in a few cases. The biggest being older
nodes reregistering will fail when the UUIDs are different and the names match

This reverts commit 0f70034082.
This reverts commit d1a8f9cb3f.
This reverts commit cf69ec42a4.
2018-07-12 10:06:50 -04:00
Kyle Havlovitz f95c6807e7
connect: use reflect.DeepEqual instead for test 2018-07-11 13:10:58 -07:00
Matt Keeler 98ead2a8f8
Merge pull request #3983 from pierresouchay/node_renaming
Allow changing Node names since Node now have IDs
2018-07-11 16:03:02 -04:00
Kyle Havlovitz 4e5fb6bc19
connect: add provider state to snapshots 2018-07-11 11:34:49 -07:00
Kyle Havlovitz 462ace4867
connect: update leader initializeCA comment 2018-07-11 10:00:42 -07:00
Kyle Havlovitz 1d3f4b5099
connect: persist intermediate CAs on leader change 2018-07-11 09:44:30 -07:00
Pierre Souchay fecae3de21 When renaming a node, ensure the name is not taken by another node.
Since DNS is case insensitive and DB as issues when similar names with different
cases are added, check for unicity based on case insensitivity.

Following another big incident we had in our cluster, we also validate
that adding/renaming a not does not conflicts with case insensitive
matches.

We had the following error once:

 - one node called: mymachine.MYDC.mydomain was shut off
 - another node (different ID) was added with name: mymachine.mydc.mydomain before
   72 hours

When restarting the consul server of domain, the consul server restarted failed
to start since it detected an issue in RAFT database because
mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names.

Checking at registration time with case insensitivity should definitly fix
those issues and avoid Consul DB corruption.
2018-07-11 14:42:54 +02:00
Matt Keeler d19c7d8882
Merge pull request #4303 from pierresouchay/non_blocking_acl
Only send one single ACL cache refresh across network when TTL is over
2018-07-10 08:57:33 -04:00
MagnumOpus21 300330e24b Agent/Proxy: Formatting and test cases fix 2018-07-09 12:46:10 -04:00
Kyle Havlovitz 401b206a2e
Store the time CARoot is rotated out instead of when to prune 2018-07-06 16:05:25 -07:00
Kyle Havlovitz 1492243e0a
connect/ca: add logic for pruning old stale RootCA entries 2018-07-02 10:35:05 -07:00
Pierre Souchay bd023f352e Updated swith case to use same branch for async-cache and extend-cache 2018-07-02 17:39:34 +02:00
Pierre Souchay 1e7665c0d5 Updated documentation and adding more test case for async-cache 2018-07-01 23:50:30 +02:00
Pierre Souchay abde81a3e7 Added async-cache with similar behaviour as extend-cache but asynchronously 2018-07-01 23:50:30 +02:00
Pierre Souchay 9406ca1c95 Only send one single ACL cache refresh across network when TTL is over
It will allow the following:

 * when connectivity is limited (saturated linnks between DCs), only one
   single request to refresh an ACL will be sent to ACL master DC instead
   of statcking ACL refresh queries
 * when extend-cache is used for ACL, do not wait for result, but refresh
   the ACL asynchronously, so no delay is not impacting slave DC
 * When extend-cache is not used, keep the existing blocking mechanism,
   but only send a single refresh request.

This will fix https://github.com/hashicorp/consul/issues/3524
2018-07-01 23:50:30 +02:00
Matt Keeler 22b7b688a3
Move starting enterprise functionality 2018-06-29 17:38:29 -04:00
Matt Keeler 0f70034082 Move default uuid test into the consul package 2018-06-27 09:21:58 -04:00
Matt Keeler d1a8f9cb3f go fmt changes 2018-06-27 09:07:22 -04:00
Matt Keeler cf69ec42a4 Make sure to generate UUIDs when services are registered without one
This makes the behavior line up with the docs and expected behavior
2018-06-26 17:04:08 -04:00
mkeeler 6813a99081 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Kyle Havlovitz 3baa67cdef connect/ca: pull the cluster ID from config during a rotation 2018-06-25 12:25:42 -07:00
Kyle Havlovitz b4ef7bb64d connect/ca: leave blank root key/cert out of the default config (unnecessary) 2018-06-25 12:25:42 -07:00
Kyle Havlovitz 050da22473 connect/ca: undo the interface changes and use sign-self-issued in Vault 2018-06-25 12:25:42 -07:00
Kyle Havlovitz bc997688e3 connect/ca: update Consul provider to use new cross-sign CSR method 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 226a59215d connect/ca: fix vault provider URI SANs and test 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 1a8ac686b2 connect/ca: add the Vault CA provider 2018-06-25 12:25:41 -07:00
Paul Banks e33bfe249e Note leadership issues in comments 2018-06-25 12:25:41 -07:00
Paul Banks e514570dfa Actually return Intermediate certificates bundled with a leaf! 2018-06-25 12:25:40 -07:00
Paul Banks 2e223ea2b7 Fix hot loop in cache for RPC returning zero index. 2018-06-25 12:25:37 -07:00
Paul Banks 05a8097c5d Fix misc test failures (some from other PRs) 2018-06-25 12:25:13 -07:00
Paul Banks 382ce8f98a Only set precedence on write path 2018-06-25 12:25:13 -07:00
Paul Banks 4a54f8f7e3 Fix some tests failures caused by the sorting change and some cuased by previous UpdatePrecedence() change 2018-06-25 12:25:13 -07:00
Paul Banks bf7a62e0e0 Sort intention list by precedence 2018-06-25 12:25:13 -07:00
Kyle Havlovitz edbeeeb23c agent: update accepted CA config fields and defaults 2018-06-25 12:25:09 -07:00
Mitchell Hashimoto 028aa78e83 agent/consul: set precedence value on struct itself 2018-06-25 12:24:16 -07:00
Mitchell Hashimoto daf46c9cfa agent/consul: support a Connect option on prepared query request 2018-06-25 12:24:12 -07:00
Mitchell Hashimoto 440b1b2d97 agent/consul: prepared query supports "Connect" field 2018-06-25 12:24:11 -07:00
Mitchell Hashimoto 1830c6b308 agent: switch ConnectNative to an embedded struct 2018-06-25 12:24:10 -07:00
Mitchell Hashimoto eb3fcb39b3 agent/consul/state: support querying by Connect native 2018-06-25 12:24:08 -07:00
Mitchell Hashimoto d6a823ad0d agent/consul: support catalog registration with Connect native 2018-06-25 12:24:07 -07:00
Matt Keeler af910bda39
Merge pull request #4216 from hashicorp/rpc-limiting
Make RPC limits reloadable
2018-06-20 09:05:28 -04:00
Mitchell Hashimoto 1906fe1c0d
agent: address feedback 2018-06-14 09:42:20 -07:00
Mitchell Hashimoto 0accfc1628
agent: rename test to check 2018-06-14 09:42:18 -07:00
Mitchell Hashimoto 2a29679e9d
agent/consul: forward request if necessary 2018-06-14 09:42:17 -07:00
Mitchell Hashimoto 54ac5adb08
agent: comments to point to differing logic 2018-06-14 09:42:17 -07:00
Mitchell Hashimoto d68462fca6
agent/consul: implement Intention.Test endpoint 2018-06-14 09:42:17 -07:00
Paul Banks f4b8e8c96d
Add default CA config back - I didn't add it and causes nil panics 2018-06-14 09:42:17 -07:00
Paul Banks 1228a5839a
Ooops remove the CA stuff from actual server defaults and make it test server only 2018-06-14 09:42:16 -07:00
Paul Banks 4aeab3897c
Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes. 2018-06-14 09:42:16 -07:00
Paul Banks b4803eca59
Generate CSR using real trust-domain 2018-06-14 09:42:16 -07:00
Paul Banks 622a475eb1
Add CSR signing verification of service ACL, trust domain and datacenter. 2018-06-14 09:42:16 -07:00
Paul Banks c1f2025d96
Return TrustDomain from CARoots RPC 2018-06-14 09:42:15 -07:00
Kyle Havlovitz e00088e8ee
Rename some of the CA structs/files 2018-06-14 09:42:15 -07:00
Kyle Havlovitz 6e9f1f8acb
Add more metadata to structs.CARoot 2018-06-14 09:42:15 -07:00
Kyle Havlovitz 627aa80d5a
Use provider state table for a global serial index 2018-06-14 09:42:15 -07:00
Kyle Havlovitz de72834b8c
Move connect CA provider to separate package 2018-06-14 09:42:15 -07:00
Mitchell Hashimoto bc605a1576
agent/consul: change provider wait from goto to a loop 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto c8b65217c3
agent/consul: check nil on getCAProvider result 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto 9b3495dddb
agent/consul: retry reading provider a few times 2018-06-14 09:42:14 -07:00
Paul Banks 90c574ebaa
Wire up agent leaf endpoint to cache framework to support blocking. 2018-06-14 09:42:07 -07:00
Kyle Havlovitz a4d18f0eaa
Fill out connect CA rpc endpoint tests 2018-06-14 09:42:06 -07:00
Kyle Havlovitz cce7f1cca1
Add tests for the built in CA's state store table 2018-06-14 09:42:06 -07:00
Kyle Havlovitz 15fbc2fd97
Add more tests for built-in provider 2018-06-14 09:42:06 -07:00
Kyle Havlovitz edcfdb37af
Fix some inconsistencies around the CA provider code 2018-06-14 09:42:06 -07:00
Kyle Havlovitz daa8dd1779
Add CA config to connect section of agent config 2018-06-14 09:42:05 -07:00
Kyle Havlovitz 32d1eae28b
Move ConsulCAProviderConfig into structs package 2018-06-14 09:42:04 -07:00
Kyle Havlovitz 315b8bf594
Simplify the CAProvider.Sign method 2018-06-14 09:42:04 -07:00
Kyle Havlovitz c6e1b72ccb
Simplify the CA provider interface by moving some logic out 2018-06-14 09:42:04 -07:00
Kyle Havlovitz a325388939
Clarify some comments and names around CA bootstrapping 2018-06-14 09:42:04 -07:00
Kyle Havlovitz 33418afd3c
Add cross-signing mechanism to root rotation 2018-06-14 09:42:00 -07:00
Kyle Havlovitz d83fbfc766
Add the root rotation mechanism to the CA config endpoint 2018-06-14 09:41:59 -07:00
Kyle Havlovitz f9d92d795e
Have the built in CA store its state in raft 2018-06-14 09:41:59 -07:00
Kyle Havlovitz 30c1973e8b
Fix the testing endpoint's root set op 2018-06-14 09:41:59 -07:00
Kyle Havlovitz ab737ef0f8
Hook the CA RPC endpoint into the provider interface 2018-06-14 09:41:59 -07:00
Kyle Havlovitz 1f6501895f
Add CA bootstrapping on establishing leadership 2018-06-14 09:41:59 -07:00
Kyle Havlovitz 682f105c7c
Add the bootstrap config for the CA 2018-06-14 09:41:59 -07:00
Kyle Havlovitz 1787f88618
Add CA config set to fsm operations 2018-06-14 09:41:58 -07:00
Kyle Havlovitz 6b3416e480
Add the Connect CA config to the state store 2018-06-14 09:41:58 -07:00
Paul Banks 730da74369
Fix various test failures and vet warnings.
Intention de-duplication in previously merged PR actualy failed some tests that were not caught be me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.
2018-06-14 09:41:58 -07:00
Paul Banks 88541bba17
Add tests all the way up through the endpoints to ensure duplicate src/destination is supported and so ultimately deny/allow nesting works.
Also adds a sanity check test for `api.Agent().ConnectAuthorize()` and a fix for a trivial bug in it.
2018-06-14 09:41:57 -07:00
Paul Banks ed9f07c361
Allow duplicate source or destination, but enforce uniqueness across all four. 2018-06-14 09:41:57 -07:00
Mitchell Hashimoto 845f7cd8ad
agent/consul/state: ensure exactly one active CA exists when setting 2018-06-14 09:41:54 -07:00
Mitchell Hashimoto 17ca8ad083
agent/connect: rename SpiffeID to CertURI 2018-06-14 09:41:53 -07:00
Mitchell Hashimoto 0cbcb07d61
agent/connect: use proper keyusage fields for CA and leaf 2018-06-14 09:41:53 -07:00
Mitchell Hashimoto a54d1af421
agent/consul: encode issued cert serial number as hex encoded 2018-06-14 09:41:53 -07:00
Mitchell Hashimoto 63d674d07d
agent: /v1/connect/ca/configuration PUT for setting configuration 2018-06-14 09:41:52 -07:00
Mitchell Hashimoto 1c3dbc83ff
agent/consul/fsm,state: snapshot/restore for CA roots 2018-06-14 09:41:52 -07:00
Mitchell Hashimoto 90f423fd02
agent/consul/fsm,state: tests for CA root related changes 2018-06-14 09:41:52 -07:00
Mitchell Hashimoto 1c72639d60
agent/consul: set more fields on the issued cert 2018-06-14 09:41:52 -07:00
Mitchell Hashimoto c2588262b7
agent: /v1/connect/ca/leaf/:service_id 2018-06-14 09:41:52 -07:00
Mitchell Hashimoto e40afd6a73
agent/consul: CAS operations for setting the CA root 2018-06-14 09:41:51 -07:00
Mitchell Hashimoto 578db06600
agent/consul: tests for CA endpoints 2018-06-14 09:41:51 -07:00
Mitchell Hashimoto 891cd22ad9
agent/consul: key the public key of the CSR, verify in test 2018-06-14 09:41:51 -07:00
Mitchell Hashimoto d768d5e9a7
agent/consul: test for ConnectCA.Sign 2018-06-14 09:41:51 -07:00
Mitchell Hashimoto f4ec28bfe3
agent/consul: basic sign endpoint not tested yet 2018-06-14 09:41:51 -07:00
Mitchell Hashimoto 5a950190f3
agent/consul: RPC endpoints to list roots 2018-06-14 09:41:50 -07:00
Mitchell Hashimoto 130098b7b5
agent/consul/state: CARoot structs and initial state store 2018-06-14 09:41:49 -07:00
Mitchell Hashimoto 4d852e62a3
agent: address PR feedback 2018-06-14 09:41:49 -07:00
Mitchell Hashimoto 6313bc5615
agent: clarified a number of comments per PR feedback 2018-06-14 09:41:49 -07:00
Mitchell Hashimoto 353953fcd2
agent/consul: Health.ServiceNodes ACL check for Connect 2018-06-14 09:41:49 -07:00
Mitchell Hashimoto b6c0cb7115
agent/consul: Catalog endpoint ACL requirements for Connect proxies 2018-06-14 09:41:49 -07:00
Mitchell Hashimoto 2feef5f7a3
agent/consul: require name for proxies 2018-06-14 09:41:48 -07:00
Mitchell Hashimoto 44ec8d94d2
agent: clean up connect/non-connect duplication by using shared methods 2018-06-14 09:41:48 -07:00
Mitchell Hashimoto 7d79f9c46f
agent/consul: implement Health.ServiceNodes for Connect, DNS works 2018-06-14 09:41:47 -07:00
Mitchell Hashimoto e01914a025
agent/consul: Catalog.ServiceNodes supports Connect filtering 2018-06-14 09:41:47 -07:00
Mitchell Hashimoto 2062e37270
agent/consul/state: ConnectServiceNodes 2018-06-14 09:41:47 -07:00
Mitchell Hashimoto 7ed26e2c64
agent/consul: enforce ACL on ProxyDestination 2018-06-14 09:41:47 -07:00
Mitchell Hashimoto 0c0c0a58e7
agent/consul: proxy registration and tests 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto 4d4a8443e8
agent: test /v1/catalog/node/:node to list connect proxies 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto 6e257ea51c
agent: /v1/catalog/service/:service works with proxies 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto 63e4a35827
agent/consul/state: convert proxy test to testify/assert 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto 21c6fc623a
agent/consul/state: service registration with proxy works 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto a621afe72c
agent/consul: convert intention ACLs to testify/assert 2018-06-14 09:41:46 -07:00
Mitchell Hashimoto 9dc8aa0fb3
agent/consul,structs: add tests for ACL filter and prefix for intentions 2018-06-14 09:41:45 -07:00
Mitchell Hashimoto 5ac649af7f
agent/consul: Intention.Match ACLs 2018-06-14 09:41:45 -07:00
Mitchell Hashimoto 4d87601bf4
agent/consul: Intention.Get ACLs 2018-06-14 09:41:45 -07:00
Mitchell Hashimoto 9bbbb73734
agent/consul: Intention.Apply ACL on rename 2018-06-14 09:41:45 -07:00
Mitchell Hashimoto 01b644e213
agent/consul: tests for ACLs on Intention.Apply update/delete 2018-06-14 09:41:45 -07:00
Mitchell Hashimoto a67ff1c0dc
agent/consul: Basic ACL on Intention.Apply 2018-06-14 09:41:44 -07:00
Mitchell Hashimoto 0719ff6905
agent: convert all intention tests to testify/assert 2018-06-14 09:41:44 -07:00
Mitchell Hashimoto 454ef7d106
agent/consul/fsm,state: snapshot/restore for intentions 2018-06-14 09:41:44 -07:00
Mitchell Hashimoto 80d068aaa4
agent: use UTC time for intention times, move empty list check to
agent/consul
2018-06-14 09:41:43 -07:00
Mitchell Hashimoto 370b2599a1
agent/consul/fsm: switch tests to use structs.TestIntention 2018-06-14 09:41:43 -07:00
Mitchell Hashimoto 97e2a73145
agent/consul/state: need to set Meta for intentions for tests 2018-06-14 09:41:43 -07:00
Mitchell Hashimoto ad42f42a17
agent/consul/state: remove TODO 2018-06-14 09:41:43 -07:00
Mitchell Hashimoto 70858598e4
agent: use testing intention to get valid intentions 2018-06-14 09:41:43 -07:00
Mitchell Hashimoto ab4ea3efb4
agent/consul: set default intention SourceType, validate it 2018-06-14 09:41:43 -07:00
Mitchell Hashimoto d92993f75b
agent/structs: Intention validation 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto 82a50245e0
agent/consul: support intention description, meta is non-nil 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto c12690b837
agent/consul/fsm: add tests for intention requests 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto a9743f4f15
agent,agent/consul: set default namespaces 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto 10c370c0fb
agent/consul: set CreatedAt, UpdatedAt on intentions 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto 93de03fe8b
agent/consul: RPC endpoint for Intention.Match 2018-06-14 09:41:42 -07:00
Mitchell Hashimoto f93edadbbe
agent/consul/state: IntentionMatch for performing match resolution 2018-06-14 09:41:41 -07:00
Mitchell Hashimoto fb02e53536
agent/consul: test that Apply works to delete an intention 2018-06-14 09:41:41 -07:00
Mitchell Hashimoto 4417f37ede
agent/consul/state,fsm: support for deleting intentions 2018-06-14 09:41:41 -07:00
Mitchell Hashimoto 1b44c1befa
agent/consul: creating intention must not have ID set 2018-06-14 09:41:40 -07:00
Mitchell Hashimoto 771b1737e3
agent/consul: support updating intentions 2018-06-14 09:41:40 -07:00
Mitchell Hashimoto 0d96cdc0a5
agent: GET /v1/connect/intentions/:id 2018-06-14 09:41:40 -07:00
Mitchell Hashimoto e8c4156f07
agent/consul: Intention.Get endpoint 2018-06-14 09:41:40 -07:00
Mitchell Hashimoto 9e307e178e
agent/consul: Intention.Apply, FSM methods, very little validation 2018-06-14 09:41:39 -07:00
Mitchell Hashimoto 212a272989
agent/consul: start Intention RPC endpoints, starting with List 2018-06-14 09:41:39 -07:00
Mitchell Hashimoto 9639bfb1be
agent/consul/state: list intentions 2018-06-14 09:41:39 -07:00
Mitchell Hashimoto cc8a6f7f15
agent/consul/state: initial work on intentions memdb table 2018-06-14 09:41:39 -07:00
Guido Iaquinti f7fe6c2a87 Attach server.Name label to client.rpc.failed 2018-06-13 14:56:14 +01:00
Guido Iaquinti 3d230dee80 Attach server.ID label to client.rpc.failed 2018-06-13 14:53:44 +01:00
Guido Iaquinti e85e63c18c Client: add metric for failed RPC calls to server 2018-06-13 12:35:45 +01:00
Matt Keeler 0df7cd22aa Add a Client ReloadConfig test 2018-06-11 16:23:51 -04:00
Matt Keeler 08e26d10b8 Merge branch 'master' of github.com:hashicorp/consul into rpc-limiting
# Conflicts:
#	agent/agent.go
#	agent/consul/client.go
2018-06-11 16:11:36 -04:00
Matt Keeler 65746b2f8f Apply the limits to the clients rpcLimiter 2018-06-11 15:51:17 -04:00
Matt Keeler b6e9abe926 Allow for easy enterprise/oss coexistence
Uses struct/interface embedding with the embedded structs/interfaces being empty for oss. Also methods on the server/client types are defaulted to do nothing for OSS
2018-05-24 10:36:42 -04:00
Wim 5c04864b28 Add support for reverse lookup of services 2018-05-19 19:39:02 +02:00
Preetha Appan ca67094619
Change default raft threshold config values and add a section to upgrade notes 2018-05-11 10:45:41 -05:00
Preetha Appan d721da7b67
Also make snapshot interval configurable 2018-05-11 10:43:24 -05:00
Preetha Appan 66f31cd25a
Make raft snapshot commit threshold configurable 2018-05-11 10:43:24 -05:00
Jack Pearkes 291e8b83ae
Merge pull request #4097 from hashicorp/remove-deprecated
Remove deprecated check/service fields and metric names
2018-05-10 15:45:49 -07:00
Kyle Havlovitz ba3971d2c1
Remove deprecated metric names 2018-05-08 16:23:15 -07:00
Paul Banks b7fa3358d1
Merge pull request #3970 from pierresouchay/node_health_should_change_service_index
[BUGFIX] When a node level check is removed, ensure all services of node are notified
2018-05-08 16:44:50 +01:00
Pierre Souchay c152cb7bdf Added Missing Service Meta synchronization and field 2018-04-21 17:34:29 +02:00
Pierre Souchay 89ab642928 Allow renaming nodes when ID is unchanged 2018-04-18 15:39:38 +02:00
Kyle Havlovitz af4be34a2a
Update make static-assets goal and run format 2018-04-13 09:57:25 -07:00
Matt Keeler d926679278
Merge pull request #4023 from hashicorp/f-near-ip
Add near=_ip support for prepared queries
2018-04-12 12:10:48 -04:00
Matt Keeler 136efeb3be GH-3798: A couple more PR updates
Test HTTP/DNS source IP without header/extra EDNS data.
Add WARN log for when prepared query with near=_ip is executed without specifying the source ip
2018-04-12 10:10:37 -04:00
Matt Keeler cec8d5145b GH-3798: A few more PR updates 2018-04-11 20:32:35 -04:00
Matt Keeler d065d3a6db GH-3798: Updates for PR
Allow DNS peer IP as the source IP.
Break early when the right node was found for executing the preapred query.
Update docs
2018-04-11 17:02:04 -04:00
Matt Keeler 45a537def9 GH-3798: Add near=_ip support for prepared queries 2018-04-10 14:50:50 -04:00
Paul Banks 0d8993e338
Allow ignoring checks by ID when defining a PreparedQuery. Fixes #3727. 2018-04-10 14:04:16 +01:00
Preetha Appan c7581d68c6
Renames agent API layer for service metadata to "meta" for consistency 2018-03-28 09:04:50 -05:00
Preetha daa61c5803
Merge pull request #3881 from pierresouchay/service_metadata
Feature Request: Support key-value attributes for services
2018-03-27 16:33:57 -05:00
Pierre Souchay 980189a33f Added validation of ServiceMeta in Catalog
Fixed Error Message when ServiceMeta is not valid

Added Unit test for adding a Service with badly formatted ServiceMeta
2018-03-27 22:22:42 +02:00
Preetha Appan 226cb2e95c
fix typo and remove comment 2018-03-27 14:28:05 -05:00
Preetha Appan 010a459365
Remove unnecessary nil checks 2018-03-27 10:59:42 -05:00
Preetha Appan 6c0bb5a810
Fix test and remove unused method 2018-03-27 09:44:41 -05:00
Preetha Appan d77ab91123
Allows disabling WAN federation by setting serf WAN port to -1 2018-03-26 14:21:06 -05:00
Pierre Souchay a9868ae956 Added support for renaming nodes when their IP does not change 2018-03-26 16:44:13 +02:00
Pierre Souchay 18baff80ae Merge remote-tracking branch 'origin/master' into node_health_should_change_service_index 2018-03-22 13:07:11 +01:00
Pierre Souchay 5fb1b18073 More test cases 2018-03-22 12:41:06 +01:00
Pierre Souchay 39a7b5c20d Added new test regarding checks index 2018-03-22 12:20:25 +01:00
Pierre Souchay dd9efb755a Fixed minor typo in comments
Might fix unstable travis build
2018-03-22 10:30:10 +01:00
Josh Soref 94835a2715 Spelling (#3958)
* spelling: another

* spelling: autopilot

* spelling: beginning

* spelling: circonus

* spelling: default

* spelling: definition

* spelling: distance

* spelling: encountered

* spelling: enterprise

* spelling: expands

* spelling: exits

* spelling: formatting

* spelling: health

* spelling: hierarchy

* spelling: imposed

* spelling: independence

* spelling: inspect

* spelling: last

* spelling: latest

* spelling: client

* spelling: message

* spelling: minimum

* spelling: notify

* spelling: nonexistent

* spelling: operator

* spelling: payload

* spelling: preceded

* spelling: prepared

* spelling: programmatically

* spelling: required

* spelling: reconcile

* spelling: responses

* spelling: request

* spelling: response

* spelling: results

* spelling: retrieve

* spelling: service

* spelling: significantly

* spelling: specifies

* spelling: supported

* spelling: synchronization

* spelling: synchronous

* spelling: themselves

* spelling: unexpected

* spelling: validations

* spelling: value
2018-03-19 16:56:00 +00:00
Pierre Souchay b6914617d9 Fixed typo in comments 2018-03-19 17:12:08 +01:00
Pierre Souchay 5e974843f1 Refactoring to have clearer code without weird bool 2018-03-19 16:12:54 +01:00
Pierre Souchay a44b9e84b1 [BUGFIX] When a node level check is removed, ensure all services of node are notified
Bugfix for https://github.com/hashicorp/consul/pull/3899

When a node level check is removed (example: maintenance),
some watchers on services might have to recompute their state.

If those nodes are performing blocking queries, they have to be notified.
While their state was updated when node-level state did change or was added
this was not the case when the check was removed. This fixes it.
2018-03-19 14:14:03 +01:00
Devin Canterberry a61abcd931
🐛 Formatting changes only; add missing trailing commas 2018-03-15 10:19:46 -07:00
Mitchell Hashimoto 8217564c48
agent/consul/fsm: begin using testify/assert 2018-03-06 09:48:15 -08:00
Paul Banks 9a47449c6d
Merge pull request #3899 from pierresouchay/fix_blocking_queries_index
Services Indexes modified per service instead of using a global Index
2018-03-02 16:24:43 +00:00
Pierre Souchay 360dc1dd8d Simplified error handling for maxIndexForService
* added unit tests to ensure service index is properly garbage collected
* added Upgrade from Version 1.0.6 to higher section in documentation
2018-03-01 14:09:36 +01:00
Preetha Appan 80791d5b21
Remove extra newline 2018-02-21 13:21:47 -06:00
Preetha Appan 907b97b7f2
Unit test that calls revokeLeadership twice to make sure its idempotent 2018-02-21 12:48:53 -06:00
Preetha Appan f59abcc394
Make sure revokeLeadership is called if establishLeadership errors 2018-02-21 12:33:22 -06:00
Alex Dadgar 18bf9647d5 Test autopilots start/stop idempotency 2018-02-21 10:19:30 -08:00
Alex Dadgar 33c5afdb31 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Pierre Souchay a8d3745104 Fixed comments for function maxIndexForService 2018-02-20 23:57:28 +01:00
Pierre Souchay 09351ba9a6 [Revert] Only update services if tags are different
This patch did give some better results, but break watches on
the services of a node.

It is possible to apply the same optimization for nodes than
to services (one index per instance), but it would complicate
further the patch.

Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay 60454b570a Only update services if tags are different 2018-02-20 23:08:04 +01:00
Pierre Souchay a05d38737c Enable Raft index optimization per service name on health endpoint
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Pierre Souchay 4f10fae3c3 Get only first service to test whether we have to cleanup index of a service 2018-02-19 22:44:49 +01:00
Pierre Souchay bac8fb046f Fixed comment about raftIndex + use test.Helper() 2018-02-19 19:30:25 +01:00
Pierre Souchay 73127ef407 Services Indexes modified per service instead of using a global Index
This patch improves the watches for services on large cluster:
each service has now its own index, such watches on a specific service
are not modified by changes in the global catalog.

It should improve a lot the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
Veselkov Konstantin 7de57ba4de remove golint warnings 2018-01-28 22:40:13 +04:00
Kyle Havlovitz bfeb09983b
Reset clusterHealth when autopilot starts 2018-01-23 12:52:28 -08:00
Kyle Havlovitz 17805e4634
Move autopilot health loop into leader operations 2018-01-23 11:17:41 -08:00
James Phillips da6a4635b0
Fixes a `go fmt` cleanup. 2017-12-20 13:43:38 -08:00
Kyle Havlovitz 11a0c9cc58
Fix vet error 2017-12-18 18:04:42 -08:00
Kyle Havlovitz 77dc52f430
Move autopilot initializing to oss file 2017-12-18 18:02:44 -08:00
Kyle Havlovitz 039e7f1880
Move autopilot setup to a separate file 2017-12-18 16:55:51 -08:00
Kyle Havlovitz d08ab9fd19
Make some final tweaks to autopilot package 2017-12-18 12:26:47 -08:00
Kyle Havlovitz a86d11ec0a
Merge pull request #3737 from hashicorp/autopilot-refactor
Move autopilot to a standalone package
2017-12-15 14:09:40 -08:00
James Phillips 06f980061e
Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak
fix globalRPC goroutine leak
2017-12-14 17:54:19 -08:00
Kyle Havlovitz 324c2ecb53
Expose IsPotentialVoter for advanced autopilot logic 2017-12-13 17:53:51 -08:00
Kyle Havlovitz 12bf61c851
Merge branch 'master' into autopilot-refactor 2017-12-13 11:54:32 -08:00
Kyle Havlovitz d6b266c045
A few last autopilot adjustments 2017-12-13 11:19:17 -08:00
Kyle Havlovitz 2310687c1d
More autopilot reorganizing 2017-12-13 10:57:37 -08:00
James Phillips 46742a5041
Adds TODOs referencing #3744. 2017-12-13 10:52:06 -08:00
Kyle Havlovitz b92f895c23
More refactoring to make autopilot consul-agnostic 2017-12-12 17:46:28 -08:00
Kyle Havlovitz de28555671
Move autopilot to a standalone package 2017-12-11 16:45:33 -08:00
James Phillips d12e81860f
Moves Serf helper into lib to fix import cycle in consul-enterprise. 2017-12-07 16:57:58 -08:00
James Phillips 5065f3d82e
Turns of intent queue warnings and enables dynamic queue sizing. 2017-12-07 16:27:06 -08:00
Wei Wei cc9648c957 fix globalRPC goroutine leak
Signed-off-by: Wei Wei <weiwei.inf@gmail.com>
2017-12-05 11:53:30 +08:00
James Phillips 3e46544085
Creates a registration mechanism for snapshot and restore. 2017-11-29 18:36:53 -08:00
James Phillips f53f521072
Begins split out of snapshots from the main FSM class. 2017-11-29 18:36:53 -08:00
James Phillips c8e763667f
Creates a registration mechanism for FSM commands. 2017-11-29 18:36:53 -08:00
James Phillips 78292662d7
Moves the FSM into its own package.
This will help make it clearer what happens when we add some registration
plumbing for the different operations and snapshots.
2017-11-29 18:36:53 -08:00
James Phillips e810697e06
Resolves an FSM snapshot TODO.
This adds checks for sink write calls before we continue the refactor, which
will resolve the other TODO comment we deleted as part of this change.
2017-11-29 18:36:53 -08:00
James Phillips aa61159b74
Creates a registration mechanism for schemas.
This also splits out the registration into the table-specific source
files.
2017-11-29 18:36:52 -08:00
James Phillips 93ff33b1be
Creates a registration mechanism for RPC endpoints. 2017-11-29 18:36:52 -08:00
James Phillips 8bf1f57737
Renames stubs to be more consistent. 2017-11-29 18:36:52 -08:00
James Phillips 8abd2050fa
Sheds monotonic time info so tombstone GC bins work properly. 2017-11-29 10:34:24 -08:00
James Phillips de57a9ef51
Gives back the lock before writing to the expire channel.
The lock isn't needed after we clean up the expire bin, and as seen
in #3700 we can get into a deadlock waiting to place the expire index
into the channel while holding this lock.

Fixes #3700
2017-11-19 16:24:16 -08:00
James Phillips f19ba41144
Moves the LAN event handler after the router is created.
Fixes #3680
2017-11-10 12:26:48 -08:00
James Phillips 17737ee030
Revert "Adds a small sleep to make sure we are in the next GC bucket." 2017-11-08 22:18:37 -08:00
James Phillips 24475048e2
Adds a sleep to make sure we are in the next GC bucket, ups time.
Fixes #3670
2017-11-08 22:02:40 -08:00
James Phillips c57884fffe
Skips the tombstone GC test in Travis for now.
Related to #3670
2017-11-08 20:14:20 -08:00
James Phillips f6b7dcbcf6
Removes bogus getPort() in favor of freeport. 2017-11-08 19:55:50 -08:00
James Phillips 7b966e2d26
Tightens timing up and reorders GC test to be less flaky. 2017-11-08 15:09:29 -08:00
James Phillips 7c6ab5e783
Doubles the GC timing. 2017-11-08 15:01:11 -08:00
James Phillips 8de7c77482
Opens up test timing a little more. 2017-11-08 14:01:19 -08:00
James Phillips c46612f691
Shifts off a gran boundary to help make test less flaky. 2017-11-08 13:57:17 -08:00
James Phillips f31856c1b7
Opens up the tombstone GC test timing. 2017-11-08 13:43:39 -08:00
Kyle Havlovitz d3dd2b1402
Move check definition to a sub-struct 2017-11-01 14:54:46 -07:00
Kyle Havlovitz dbab3cd5f6
Merge branch 'master' into esm-changes 2017-11-01 11:37:48 -07:00
Kyle Havlovitz c4375d5a47
Merge pull request #3622 from hashicorp/coordinate-node-endpoint
agent: add /v1/coordianate/node/:node endpoint
2017-11-01 11:35:50 -07:00
Kyle Havlovitz b0536a96cc
Fill out the tests around coordinate/node functionality 2017-10-31 15:36:44 -07:00
Kyle Havlovitz 1e3b0d441b
Factor out registerNodes function 2017-10-31 13:34:49 -07:00
James Phillips 6bf55d16a2
Relaxes Autopilot promotion logic. (#3623)
* Relaxes Autopilot promotion logic.

When we defaulted the Raft protocol version to 3 in #3477 we made
the numPeers() routine more strict to only count voters (this is
more conservative and more correct). This had the side effect of
breaking rolling updates because it's at odds with the Autopilot
non-voter promotion logic.

That logic used to wait to only promote to maintain an odd quorum
of servers. During a rolling update (add one new server, wait, and
then kill an old server) the dead server cleanup would still count
the old server as a peer, which is conservative and the right thing
to do, and no longer count the non-voter. This would wait to promote,
so you could get into a stalemate. It is safer to promote early than
remove early, so by promoting as soon as possible we have chosen
that as the solution here.

Fixes #3611

* Gets rid of unnecessary extra not-a-voter check.
2017-10-31 15:16:56 -05:00
Kyle Havlovitz 2392545adc
Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes 2017-10-26 19:20:24 -07:00
Kyle Havlovitz 5589eadcf5
Added Coordinate.Node rpc endpoint and client api method 2017-10-26 19:16:40 -07:00
Kyle Havlovitz a7c42a6c2a
Expose SkipNodeUpdate field and some health check info in the http api 2017-10-25 19:37:30 +02:00
Frank Schroeder c94751ad43 test: replace porter tool with freeport lib
This patch removes the porter tool which hands out free ports from a
given range with a library which does the same thing. The challenge for
acquiring free ports in concurrent go test runs is that go packages are
tested concurrently and run in separate processes. There has to be some
inter-process synchronization in preventing processes allocating the
same ports.

freeport allocates blocks of ports from a range expected to be not in
heavy use and implements a system-wide mutex by binding to the first
port of that block for the lifetime of the application. Ports are then
provided sequentially from that block and are tested on localhost before
being returned as available.
2017-10-21 22:01:09 +02:00
Ryan Slade 85e4aea9d1 Replace time.Now().Sub(x) with time.Since(x) 2017-10-17 20:38:24 +02:00
James Phillips 575d70aaa7
Cleans up some drift between the OSS and Enterprise trees. 2017-10-11 15:53:07 -07:00
James Phillips bb12368eac Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
James Phillips 4dab70cb93 Fixes handling of stop channel and failed barrier attempts. (#3546)
* Fixes handling of stop channel and failed barrier attempts.

There were two issues here. First, we needed to not exit when there
was a timeout trying to write the barrier, because Raft might not
step down, so we'd be left as the leader but having run all the step
down actions.

Second, we didn't close over the stopCh correctly, so it was possible
to nil that out and have the leaderLoop never exit. We close over it
properly AND sequence the nil-ing of it AFTER the leaderLoop exits for
good measure, so the code is more robust.

Fixes #3545

* Cleans up based on code review feedback.

* Tweaks comments.

* Renames variables and removes comments.
2017-10-06 07:54:49 -07:00
Kyle Havlovitz c728564994
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Preetha Appan 8dcd7e700c Remove extra newline 2017-10-03 15:19:31 -05:00
Preetha Appan 26accb3b8a Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment. 2017-10-03 15:15:56 -05:00
Preetha Appan 51a04ec87d Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in. 2017-10-02 17:10:21 -05:00
James Phillips 0190c4a081
Gets rid of flaky clause in stats fetcher unit test.
Given how the rutine is coded we can still get data so this wasn't
a reliable thing to check.
2017-09-26 20:53:06 -07:00
preetapan 4d9fc638b4 Issue 3452 (#3500)
* Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf

* Get address from node table in the state store rather than from service address

* Fix incorrect lookup by checkname instead of node name

* Make sure that serverlookup is called with the right address format, added unit test.

* Address code review comments

* Tweaks style stuff.
2017-09-26 20:49:41 -07:00
James Phillips 5fa2322e0b
Cleans up some edge cases in TestSnapshot_Forward_Leader.
These could cause the tests to hang.
2017-09-26 14:07:28 -07:00
Preetha Appan 3c4a108769 Move Raft protocol version for list peers end point to server side, fix unit tests. This fixes #3449 2017-09-26 09:35:39 -05:00
James Phillips 45646ac3f4 Bumps default Raft protocol to version 3. (#3477)
* Changes default Raft protocol to 3.

* Changes numPeers() to report only voters.

This should have been there before, but it's more obvious that this
is incorrect now that we default the Raft protocol to 3, which puts
new servers in a read-only state while Autopilot waits for them to
become healthy.

* Fixes TestLeader_RollRaftServer.

* Fixes TestOperator_RaftRemovePeerByAddress.

* Fixes TestServer_*.

Relaxed the check for a given number of voter peers and instead do
a thorough check that all servers see each other in their Raft
configurations.

* Fixes TestACL_*.

These now just check for Raft replication to be set up, and don't
care about the number of voter peers.

* Fixes TestOperator_Raft_ListPeers.

* Fixes TestAutopilot_CleanupDeadServerPeriodic.

* Fixes TestCatalog_ListNodes_ConsistentRead_Fail.

* Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away
sockets when it sees io.EOF.

* Changes version to 1.0.0 in the options doc.

* Makes metrics test more deterministic with autopilot metrics possible.
2017-09-25 15:27:04 -07:00
Preetha Appan d7e27e67c1 Introduce Code Policy validation via sentinel, with a noop implementation 2017-09-25 13:44:55 -05:00
Frank Schröder 12216583a1 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
James Phillips d84c0b1a01
Robustifies check in TestCatalog_ListNodes_ConsistentRead_Fail test.
Fixes #3469
2017-09-13 21:22:53 -07:00
James Phillips 828be5771a
Revert "Manages segments list via a pointer."
This reverts commit c277a42504.
2017-09-07 16:37:11 -07:00
James Phillips c277a42504
Manages segments list via a pointer. 2017-09-07 16:21:07 -07:00
James Phillips 96a89a3381
Cleans up formatting. 2017-09-07 12:26:58 -07:00
James Phillips 00605c0214
Shows the segment name in the keyring API and command output. 2017-09-07 12:17:39 -07:00
James Phillips 88a150cee1
Moves reconcile loop into segment stub. 2017-09-06 18:01:53 -07:00
James Phillips 5c03cb571d
Takes the skip out of the client check.
Without this the merge delegate won't check the segment for non-servers
a little below here.
2017-09-06 17:05:40 -07:00
James Phillips 3418c7ff93 Merge pull request #3447 from hashicorp/issue-3070
Skips unique node ID check for old versions of Consul.
2017-09-06 13:24:15 -07:00
James Phillips 520060e138
Fixes incorrect comment. 2017-09-06 13:23:19 -07:00
James Phillips 084679ab65
Pulls down some code for the check loop. 2017-09-06 13:07:42 -07:00
James Phillips 3535652595
Uses the Raft configuration for the self-add skip check. 2017-09-06 13:05:51 -07:00
Preetha Appan 5f2e1c9b07 Change member join reconcile step to process joining itself, to handle node IP address changes correctly when number of servers < 3 2017-09-06 13:53:01 -05:00
James Phillips 1333fa57a1
Skips unique node ID check for old versions of Consul.
Fixes #3070.
2017-09-05 22:57:29 -07:00
James Phillips 1a117ba0a8
Makes the all segments query explict, and the default for `consul members`. 2017-09-05 12:22:20 -07:00
James Phillips 9258506dab Adds simple rate limiting for client agent RPC calls to Consul servers. (#3440)
* Added rate limiting for agent RPC calls.
* Initializes the rate limiter based on the config.
* Adds the rate limiter into the snapshot RPC path.
* Adds unit tests for the RPC rate limiter.
* Groups the RPC limit parameters under "limits" in the config.
* Adds some documentation about the RPC limiter.
* Sends a 429 response when the rate limiter kicks in.
* Adds docs for new telemetry.
* Makes snapshot telemetry look like RPC telemetry and cleans up comments.
2017-09-01 15:02:50 -07:00
Kyle Havlovitz 220db48aa7 Merge pull request #3431 from hashicorp/network-segments-oss 2017-09-01 10:24:58 -07:00
Kyle Havlovitz 0e33e2ecab
Pass listeners into setupSegments 2017-08-31 17:56:43 -07:00
Kyle Havlovitz 62102a537e
Organize segments for a cleaner split between enterprise and OSS 2017-08-31 17:39:46 -07:00
Kyle Havlovitz 7e565d7338
Fix some inconsistencies with segment logic and comments 2017-08-30 17:43:46 -07:00
Preetha Appan 2386214655 Wire server provider for raft layer only on protocol version 3 and above, and update changelog 2017-08-30 14:36:47 -05:00
Kyle Havlovitz 14b027a3c2
Add segment addr field to tags for LAN flood joiner 2017-08-30 11:58:29 -07:00
Kyle Havlovitz d129767657
Add agent.segment interpolation to prepared queries 2017-08-30 11:58:29 -07:00
Kyle Havlovitz 2ada0439d4
Add rpc_listener option to segment config 2017-08-30 11:58:29 -07:00
James Phillips b1a15e0c3d
Adds open source side of network segments (feature is Enterprise-only). 2017-08-30 11:58:29 -07:00
Preetha Appan a231eea0e7 More cleanup from code review 2017-08-30 12:31:36 -05:00
Preetha Appan c6ee9bfa69 Remove copy pasted duplicate line, update documentation. 2017-08-30 10:02:10 -05:00
Preetha Appan 0f4e24f72c Consolidate server lookup into one place and replace usages of localConsuls. 2017-08-30 09:30:33 -05:00
Preetha Appan e639154abd Remove stray commented line 2017-08-30 09:30:33 -05:00
Preetha Appan 00836a6aab Remove server address tracking logic from manager/router and maintain it as part of lan event listener instead. Used sync.Map to track this, and added unit tests 2017-08-30 09:30:33 -05:00
Preetha Appan 830aca958a ServerAddressProvider interface also returns an error now 2017-08-30 09:30:33 -05:00
Preetha Appan c68fce89b5 Use config struct to create NetworkTransport layer when setting up raft 2017-08-30 09:30:33 -05:00
Preetha Appan 393ce1581b Implement AddressProvider and wire that up to raft transport layer to support server nodes changing their IP addresses in containerized environments 2017-08-30 09:30:33 -05:00
Frank Schroeder 831d84c940 build: make tests independent of build tags
When the metadata server is scanning the agents for potential servers
it is parsing the version number which the agent provided when it
joined. This version number has to conform to a certain format, i.e.
'n.n.n'. Without this version number properly set some tests fail with
error messages that disguise the root cause.

The default version number is currently set to 'unknown' in
version/version.go which does not parse and triggers the tests to fail.
The work around is to use a build tag 'consul' which will use the
version number set in version_base.go instead which has the correct
format and is set to the current release version.

In addition, some parts of the code also require the version number to
be of a certain value. Setting it to '0.0.0' for example makes some
tests pass and others fail since they don't pass the semantic check.

When using go build/install/test one has to remember to use '-tags
consul' or tests will fail with non-obvious error messages.

Using build tags makes the build process more complex and error prone
since it prevents the use of the plain go toolchain and - at least in
its current form - introduces subtle build and test issues. We should
try to eliminate build tags for anything else but platform specific
code.

This patch removes all references to specific version numbers in the
code and tests and sets the default version to '9.9.9' which is
syntactically correct and passes the semantic check. This solves the
issue of running go build/install/test without tags for the OSS build.
2017-08-30 13:40:18 +02:00
Frank Schröder a3934c263c acl: consolidate error handling (#3401)
The error handling of the ACL code relies on the presence of certain
magic error messages. Since the error values are sent via RPC between
older and newer consul agents we cannot just replace the magic values
with typed errors and switch to type checks since this would break
compatibility with older clients.

Therefore, this patch moves all magic ACL error messages into the acl
package and provides default error values and helper functions which
determine the type of error.
2017-08-23 16:52:48 +02:00
Frank Schroeder 16c58da27d agent: drop unused code
This code from http://github.com/hashicorp/consul/pull/3353 is no longer
required.
2017-08-22 00:02:46 +02:00
James Phillips e8a83bb463 Revert "Return 403 rather than a 404 when acls cause all results to be filter…" 2017-08-09 15:06:57 -07:00
James Phillips 02a87df044 Revert "Ensure that we return a permission denied only if the list of keys/en…" 2017-08-09 15:06:20 -07:00
Preetha Appan 42fb49c00b Added unit test case to kvs_endpointtest 2017-08-09 15:50:22 -05:00
Preetha Appan 3276891142 Ensure that we return a permission denied only if the list of keys/entries prior to filtering by ACL is non empty 2017-08-09 15:32:18 -05:00
Frank Schroeder 7cff50a4df
agent: move agent/consul/agent to agent/metadata 2017-08-09 14:36:52 +02:00
Frank Schroeder c395599cea
agent: move agent/consul/servers to agent/router 2017-08-09 14:36:37 +02:00
Frank Schroeder 1acff3533e
agent: move agent/consul/structs to agent/structs 2017-08-09 14:32:12 +02:00
Kyle Havlovitz cf02e3bc22 Merge pull request #3369 from hashicorp/metrics-enhancements
Add support for labels/filters from go-metrics
2017-08-08 13:55:30 -07:00
Kyle Havlovitz d5634fe2a8
Add support for labels/filters from go-metrics 2017-08-08 01:45:10 -07:00
Preetha Appan 37f75a393e
Use sanitized version of node name of server in NS record, and start with "server" rather than "ns" 2017-08-07 11:11:55 +02:00
Preetha Appan 794d1afe44
Removed a copy pasted irrelevant comment, and other code review feedback 2017-08-07 11:11:54 +02:00
Preetha Appan f9db387097
Add NS records and A records for each server. Constructs ns host names using the advertise address of the server. 2017-08-07 11:11:54 +02:00
James Phillips 4bee2e49f5 Adds secure introduction for the ACL replication token. (#3357)
Adds secure introduction for the ACL replication token, as well as a separate enable config for ACL replication.
2017-08-03 15:39:31 -07:00
James Phillips c0a5ad7903 Adds a new /v1/acl/bootstrap API (#3349) 2017-08-02 17:05:18 -07:00
Preetha Appan 4076c0d741 Return nil instead of empty list when returning a PermissionDenied error, updated unit test 2017-07-31 17:23:20 -05:00
Preetha Appan 6336014a86 Return 403 rather than a 404 when acls cause all results to be filtered out. This fixes #2637 2017-07-31 13:50:29 -05:00
James Phillips 10b660d77a Adds missing autopilot snapshot test and avoids snapshotting nil. (#3333) 2017-07-28 15:48:42 -07:00
James Phillips 6250cd70f5 Adds option to prepared queries to remove empty tags. (#3330) 2017-07-26 22:46:43 -07:00
James Phillips 496b0bcf07 Adds support for agent-side ACL token management via API instead of config files. (#3324)
* Adds token store and removes all runtime use of config for ACL tokens.
* Adds a new API for changing agent tokens on the fly.
2017-07-26 11:03:43 -07:00
Preetha Appan b94617b281 Add extra test case for deleting entire tree with empty prefix 2017-07-26 09:42:07 -05:00
Preetha Appan 4498814843 Don't insert tombstone for empty prefix delete. Other minor unit test fixes 2017-07-25 21:54:11 -05:00
Preetha Appan fee418d378 Removed redundant comments and unit test 2017-07-25 20:39:33 -05:00
Preetha Appan b772c477c2 Removed redundant call to reap tombstone from unit test 2017-07-25 19:39:05 -05:00
Preetha Appan ae443e21d6 Improved unit test per code review 2017-07-25 19:17:40 -05:00
Preetha Appan 36acf8d6a4 Use new DeletePrefixMethod for implementing KVSDeleteTree operation. This makes deletes on sub trees larger than one million nodes about 100 times faster. Added unit tests. 2017-07-25 17:21:18 -05:00
Frank Schroeder 0047b7d3f0
fix spelling in filenames
Fixes #3301
2017-07-19 13:16:38 +02:00
Kyle Havlovitz 19eae3d14b
Add UpgradeVersionTag to autopilot config 2017-07-18 13:35:41 -07:00
James Phillips 1791d99a10 Adds new config to make script checks opt-in, updates documentation. (#3284) 2017-07-17 11:20:35 -07:00
Kyle Havlovitz 78c3a86405
Add TLS setting to router areas 2017-07-14 17:38:08 -07:00
James Phillips 0881e46111 Cleans up version 8 ACLs in the agent and the docs. (#3248)
* Moves magic check and service constants into shared structs package.

* Removes the "consul" service from local state.

Since this service is added by the leader, it doesn't really make sense to
also keep it in local state (which requires special ACLs to configure), and
requires a bunch of special cases in the local state logic. This requires
fewer special cases and makes ACL bootstrapping cleaner.

* Makes coordinate update ACL log message a warning, similar to other AE warnings.

* Adds much more detailed examples for bootstrapping ACLs.

This can hopefully replace https://gist.github.com/slackpad/d89ce0e1cc0802c3c4f2d84932fa3234.
2017-07-13 22:33:47 -07:00
Frank Schroeder 1781fd311f address review comments 2017-07-07 09:22:34 +02:00
Frank Schroeder e4b40acc7e agent: remove unused code 2017-07-07 09:22:34 +02:00
Frank Schroeder 8c792ad57d agent: make TestClient_RPC_ConsulServerPing more robust 2017-07-07 09:22:34 +02:00
James Phillips a855d31f84 Adds a comment about flood joining. 2017-07-07 09:22:34 +02:00
James Phillips 5b5217528a Simplifies Serf dynamic port selection code.
This isn't racy, it's just a little dirty. The listen will happen and a port
will be selected and injected into the config once the Serf instance is
created, so we don't need the retry loop here.
2017-07-07 09:22:34 +02:00
James Phillips d8db4bc086 test: Changes WAN/LAN join confirmer to use port number vs. address.
This fixes TestServer_JoinSeparateLanAndWanAddresses which sets bogus
advertise addresses as part of the test. Port numbers uniquely identify
members since everything is running on localhost.
2017-07-07 09:22:34 +02:00
Frank Schroeder d92f70f313 test: make joinLAN/WAN reliable
only return if the members can see each other
2017-07-07 09:22:34 +02:00
Frank Schroeder 112bc19cd5 rpc: make TestServer_JoinSeparateLanAndWanAddresses more robust 2017-07-07 09:22:34 +02:00
Frank Schroeder ffd45f5da5 rpc: make TestClient_SnapshotRPC_TLS more robust 2017-07-07 09:22:34 +02:00
Frank Schroeder 2159d499e3 rpc: try shutting down leader first to avoid hang in TestLeader_LeftServer 2017-07-07 09:22:34 +02:00
Frank Schroeder f12fac278e rpc: fix logging and try quicker timing of TestServer_JoinSeparateLanAndWanAddresses 2017-07-07 09:22:34 +02:00
Frank Schroeder bae4b1d045 rpc: less agressive raft timeouts
Allowing more time for raft to consolidate should
drop the number of leader elections.
2017-07-07 09:22:34 +02:00
Frank Schroeder 457b98a099 rpc: run agent/consul tests in parallel 2017-07-07 09:22:34 +02:00
Frank Schroeder 13eeeb720d rpc: refactor sessionTimers and fix racy tests
The sessionTimers map was secured by a lock which wasn't used
properly in the tests. This lead to data races and failing tests
when accessing the length or the members of the map.

This patch adds a separate SessionTimers struct which is safe
for concurrent use and which ecapsulates the behavior of the
sessionTimers map.
2017-07-07 09:22:34 +02:00
Frank Schroeder 05f756853e rpc: fix TestServer_Leave
wait for the leader election.
2017-07-07 09:22:34 +02:00
Frank Schroeder 583959392b rpc: fix TestSession_Renew
make the timing less tight
2017-07-07 09:22:34 +02:00
Frank Schroeder ff2c29c0be rpc: fix TestReadyForConsistentRead
timing was too tight. Standardized name.
2017-07-07 09:22:34 +02:00
Frank Schroeder fcab525053 rpc: fix for 'no leader' in TLS tests
Ensure both servers know about each other before looking
for a leader.
2017-07-07 09:22:34 +02:00
Frank Schroeder b2a71fd8b0 rpc: fix TestServer_JoinWAN_Flood
The second server in the first data center should not be
in bootstrap mode.
2017-07-07 09:22:34 +02:00
Frank Schroeder 8369b6cb9d rpc: provide unique node names for server and client 2017-07-07 09:22:34 +02:00
Frank Schroeder 534977239b rpc: prefix log output with test name 2017-07-07 09:22:34 +02:00
Frank Schroeder c8ef588d8d rpc: discover serf wan port before starting serf lan
When using dynamic ports for the serf clusters then
the actual bind port of the serf WAN cluster needs to
be discovered before the serf LAN cluster is started
since the serf LAN cluster announces the port of the WAN
cluster.
2017-07-07 09:22:34 +02:00
Frank Schroeder 53eab7e970 rpc: bind rpc test server to port 0 2017-07-07 09:22:34 +02:00
Frank Schroeder e9e2c599db rpc: refactor: unify test server setup 2017-07-07 09:22:34 +02:00
Frank Schroeder c803146550 rpc: fix typos 2017-07-07 09:22:34 +02:00
Frank Schroeder a0368e3827 agent: refactor: log to stderr during tests 2017-07-07 09:22:34 +02:00
Preetha Appan f549c06764 Rename to raftNotifyCh, fix typo 2017-07-06 09:10:36 -05:00
Preetha Appan f2171a6720 Fixes deadlock between barrier write and leader notify channel read . Fixes #3230 2017-07-05 17:09:18 -05:00
James Phillips e4b11682bc Fixes broken HTTP header and method for health checks. (#3178)
* Fixes broken HTTP header and method for health checks.
* Adds a fuzz utility and test to make sure copy is complete.
2017-06-23 01:15:48 -07:00
Frank Schroeder 2b41f2e3a3 agent: make the RPC endpoint overwrite mechanism more transparent
This patch hides the RPC handler overwrite mechanism from the
rest of the code so that it works in all cases and that there
is no cooperation required from the tested code, i.e. we can
drop a.getEndpoint().
2017-06-21 05:42:39 +02:00
Frank Schroeder c49a15d0f3 agent: move structs into consul/structs pkg
* CheckDefinition
 * ServiceDefinition
 * CheckType
2017-06-21 05:42:39 +02:00
Frank Schroeder 4273fb8444 agent: move NotifyGroup into the agent pkg 2017-06-21 05:42:39 +02:00
Frank Schroeder 82a132da60 agent: move conn pool for muxed connections into separate pkg 2017-06-21 05:42:39 +02:00
Frank Schroeder 80971c8a85 agent: move the SnapshotReplyFn out of the way
When splitting up the consul package into server and client
the SnapshotReplyFn needs to be in a separate package to avoid
a circular dependency.
2017-06-21 05:42:39 +02:00
Frank Schroeder 04b9392b00 agent: use the delegate interface for local state 2017-06-21 05:42:39 +02:00
Preetha Appan f658231ab9 Minor fixes per code review 2017-06-20 19:43:07 -05:00
Preetha Appan b3b2e9dcb4 Added unit test to verify consistentRead method behavior 2017-06-16 11:58:12 -05:00
Preetha Appan 44f5086873 Code review feedback, fixed major logic bug 2017-06-16 10:49:54 -05:00
Preetha Appan 72af7b9bc4 Redo bug fix for stale reads on server startup, leveraging RPCHOldtimeout instead of maxQueryTime, plus tests 2017-06-15 22:41:30 -05:00
Frank Schroeder 1c75cf1af5 pkg refactor
command/agent/*                  -> agent/*
    command/consul/*                 -> agent/consul/*
    command/agent/command{,_test}.go -> command/agent{,_test}.go
    command/base/command.go          -> command/base.go
    command/base/*                   -> command/*
    commands.go                      -> command/commands.go

The script which did the refactor is:

(
	cd $GOPATH/src/github.com/hashicorp/consul
	git mv command/agent/command.go command/agent.go
	git mv command/agent/command_test.go command/agent_test.go
	git mv command/agent/flag_slice_value{,_test}.go command/
	git mv command/agent .
	git mv command/base/command.go command/base.go
	git mv command/base/config_util{,_test}.go command/
	git mv commands.go command/
	git mv consul agent
	rmdir command/base/

	gsed -i -e 's|package agent|package command|' command/agent{,_test}.go
	gsed -i -e 's|package agent|package command|' command/flag_slice_value{,_test}.go
	gsed -i -e 's|package base|package command|' command/base.go command/config_util{,_test}.go
	gsed -i -e 's|package main|package command|' command/commands.go

	gsed -i -e 's|base.Command|BaseCommand|' command/commands.go
	gsed -i -e 's|agent.Command|AgentCommand|' command/commands.go
	gsed -i -e 's|\tCommand:|\tBaseCommand:|' command/commands.go
	gsed -i -e 's|base\.||' command/commands.go
	gsed -i -e 's|command\.||' command/commands.go

	gsed -i -e 's|command|c|' main.go
	gsed -i -e 's|range Commands|range command.Commands|' main.go
	gsed -i -e 's|Commands: Commands|Commands: command.Commands|' main.go

	gsed -i -e 's|base\.BoolValue|BoolValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.DurationValue|DurationValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.StringValue|StringValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.UintValue|UintValue|' command/operator_autopilot_set.go

	gsed -i -e 's|\bCommand\b|BaseCommand|' command/base.go
	gsed -i -e 's|BaseCommand Options|Command Options|' command/base.go
	gsed -i -e 's|base.Command|BaseCommand|' command/*.go
	gsed -i -e 's|c\.Command|c.BaseCommand|g' command/*.go
	gsed -i -e 's|\tCommand:|\tBaseCommand:|' command/*_test.go
	gsed -i -e 's|base\.||' command/*_test.go

	gsed -i -e 's|\bCommand\b|AgentCommand|' command/agent{,_test}.go
	gsed -i -e 's|cmd.AgentCommand|cmd.BaseCommand|' command/agent.go

	gsed -i -e 's|cli.AgentCommand = new(Command)|cli.Command = new(AgentCommand)|' command/agent_test.go
	gsed -i -e 's|exec.AgentCommand|exec.Command|' command/agent_test.go
	gsed -i -e 's|exec.BaseCommand|exec.Command|' command/agent_test.go
	gsed -i -e 's|NewTestAgent|agent.NewTestAgent|' command/agent_test.go
	gsed -i -e 's|= TestConfig|= agent.TestConfig|' command/agent_test.go
	gsed -i -e 's|: RetryJoin|: agent.RetryJoin|' command/agent_test.go

	gsed -i -e 's|\.\./\.\./|../|' command/config_util_test.go

	gsed -i -e 's|\bverifyUniqueListeners|VerifyUniqueListeners|' agent/config{,_test}.go command/agent.go
	gsed -i -e 's|\bserfLANKeyring\b|SerfLANKeyring|g' agent/{agent,keyring,testagent}.go command/agent.go
	gsed -i -e 's|\bserfWANKeyring\b|SerfWANKeyring|g' agent/{agent,keyring,testagent}.go command/agent.go
	gsed -i -e 's|\bNewAgent\b|agent.New|g' command/agent{,_test}.go
	gsed -i -e 's|\bNewAgent|New|' agent/{acl_test,agent,testagent}.go

	gsed -i -e 's|\bAgent\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bBool\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bDefaultConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bDevConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bMergeConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bReadConfigPaths\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bParseMetaPair\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bSerfLANKeyring\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bSerfWANKeyring\b|agent.&|g' command/agent{,_test}.go

	gsed -i -e 's|circonus\.agent|circonus|g' command/agent{,_test}.go
	gsed -i -e 's|logger\.agent|logger|g' command/agent{,_test}.go
	gsed -i -e 's|metrics\.agent|metrics|g' command/agent{,_test}.go
	gsed -i -e 's|// agent.Agent|// agent|' command/agent{,_test}.go
	gsed -i -e 's|a\.agent\.Config|a.Config|' command/agent{,_test}.go

	gsed -i -e 's|agent\.AppendSliceValue|AppendSliceValue|' command/{configtest,validate}.go

	gsed -i -e 's|consul/consul|agent/consul|' GNUmakefile

	gsed -i -e 's|\.\./test|../../test|' agent/consul/server_test.go

	# fix imports
	f=$(grep -rl 'github.com/hashicorp/consul/command/agent' * | grep '\.go')
	gsed -i -e 's|github.com/hashicorp/consul/command/agent|github.com/hashicorp/consul/agent|' $f
	goimports -w $f

	f=$(grep -rl 'github.com/hashicorp/consul/consul' * | grep '\.go')
	gsed -i -e 's|github.com/hashicorp/consul/consul|github.com/hashicorp/consul/agent/consul|' $f
	goimports -w $f

	goimports -w command/*.go main.go
)
2017-06-10 18:52:45 +02:00