1188 Commits

Author SHA1 Message Date
James Phillips
9c9195746f Changes ?near=self to a safer ?near=_agent, which is also clearer about what it does. 2015-10-23 15:23:01 -07:00
James Phillips
9caa5b3653 Adds distance sorting to health endpoint. Cleans up unit tests. 2015-10-23 15:23:01 -07:00
James Phillips
e47eea3f3a Adds a magic "self" node name to distance queries. 2015-10-23 15:23:01 -07:00
James Phillips
54ef97b268 Adds tests for HTTP interface. Removes a stray mark. 2015-10-23 15:23:01 -07:00
James Phillips
89c7203f31 Adds coordinate sorting support to catalog queries for nodes and service nodes. 2015-10-23 15:23:01 -07:00
James Phillips
d734697820 Turns down the coordinate sync rate a bit more. 2015-10-23 15:23:01 -07:00
James Phillips
ad65d953f6 Scales coordinate sends to hit a fixed aggregate rate across the cluster. 2015-10-23 15:23:01 -07:00
James Phillips
66a3d29743 Simplifies the batching function and adds some comments. 2015-10-23 15:23:01 -07:00
James Phillips
5f754c4a87 Does some small cleanups based on PR feedback.
* Holds coordinate updates in map and gets rid of the update channel.
* Cleans up config variables a bit.
2015-10-23 15:23:01 -07:00
James Phillips
7e6d52109b Hardens Consul from bad coordinates from other nodes. 2015-10-23 15:23:01 -07:00
James Phillips
d12aa2ffab Moves batching down into the state store and changes it to fail-fast.
* A batch of updates is done all in a single transaction.
* We no longer need to get an update to kick things, there's a periodic flush.
* If incoming updates overwhelm the configured flush rate they will be dumped with an error.
2015-10-23 15:23:01 -07:00
James Phillips
b9d5fb0f90 Flips the sense of the coordinate enable option. 2015-10-23 15:23:01 -07:00
James Phillips
92567841d6 Removes one more WAN leftover. 2015-10-23 15:23:01 -07:00
James Phillips
86b112fe31 Does a clean up pass on the Consul side. 2015-10-23 15:23:01 -07:00
James Phillips
01d2452ea3 Merges config changes after rebase. 2015-10-23 15:23:01 -07:00
Derek Chiang
f144d17b1c Address comments 2015-10-23 15:23:01 -07:00
Derek Chiang
ab9262c656 Add a test case 2015-10-23 15:23:01 -07:00
Derek Chiang
bf5cb7522f Use IndexedCoordinate instead 2015-10-23 15:23:01 -07:00
Derek Chiang
a1854a7614 Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang
a338e5efe8 Fix a comment 2015-10-23 15:23:01 -07:00
Derek Chiang
98d87b5dd5 Complete logic for sending coordinates 2015-10-23 15:23:01 -07:00
Derek Chiang
9113b26286 Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang
a9ea503c69 Adding tests and stuff 2015-10-23 15:23:01 -07:00
Armon Dadgar
6a350d5d19 Merge pull request #1318 from daveadams/f-http-header-token
Allow specifying Consul token in an HTTP request header
2015-10-22 13:33:47 -07:00
Jeff Mitchell
1e3840b044 Update cleanhttp repo location 2015-10-22 14:14:22 -04:00
Jeff Mitchell
9a5fd5424a Use cleanhttp to get rid of DefaultTransport 2015-10-22 10:47:50 -04:00
David Adams
b7bcb2a414 Add HTTP request header X-Consul-Token
Add support for an X-Consul-Token HTTP request header to specify the
token with which this request should be fulfilled. The header would have
precedence over the responding Agent's default token, but would have
lower precedence than a token specified in the query string.
2015-10-19 11:26:01 -05:00
James Phillips
4ee43e90b7 Deletes the old state store and all its accoutrements. 2015-10-15 14:59:09 -07:00
James Phillips
0c90bdc61a Knocks out the Raft indexes before doing compare. 2015-10-15 14:59:09 -07:00
James Phillips
d57431e300 Gets new structs changes to compile, adds some corner case handling and extra unit tests. 2015-10-15 14:59:09 -07:00
Ryan Uber
d6af59cded Merge pull request #1309 from hashicorp/f-remove-migrate
Removes consul-migrate for 0.6
2015-10-15 14:50:19 -07:00
Jeff Mitchell
f49fc095ef Don't use http.DefaultClient
Two of the changes are in tests; the one of consequence is in the API.
As explained in #1308 this can cause conflicts with downstream programs.

Fixes #1308.
2015-10-15 17:49:35 -04:00
Ryan Uber
de287e3efb agent: consolidates data dir checker 2015-10-15 14:21:35 -07:00
Ryan Uber
10b971df21 agent: test mdb dir protection 2015-10-15 14:15:41 -07:00
Ryan Uber
d901fa6389 agent: remove migrator, refuse to start if mdb dir found 2015-10-15 14:15:08 -07:00
Armon Dadgar
d137a5fa82 agent: remove an N^2 check. See #1265 2015-10-12 20:30:11 -07:00
Michael Puncel
a98d25b541 Add http method to log output 2015-10-02 18:33:06 -07:00
James Phillips
0b05dbeb21 Merge pull request #1235 from wuub/master
fix conflict between handleReload and antiEntropy critical sections
2015-09-17 07:28:39 -07:00
Wojciech Bederski
c4537ed26f panic when unbalanced localState.Resume() is detected 2015-09-17 11:32:08 +02:00
Dale Wijnand
5a28ebcaa3 Fix a bunch of typos. 2015-09-15 13:22:08 +01:00
James Phillips
2f9ebdb135 Merge pull request #1187 from sfncook/enable_tag_drift_03
Enable tag drift 03
2015-09-11 15:35:32 -07:00
Anthony Scalisi
10e028d599 remove various typos 2015-09-11 12:29:54 -07:00
Wojciech Bederski
b014c0f91b make Pause()/Resume()/isPaused() behave more like a semaphore
see: https://github.com/hashicorp/consul/issues/1173 #1173

Reasoning: somewhere during consul development Pause()/Resume() and
PauseSync()/ResumeSync() were added to protect larger changes to
agent's localState.  A few of the places that it tries to protect are:

- (a *Agent) AddService(...)      # part of the method
- (c *Command) handleReload(...)  # almost the whole method
- (l *localState) antiEntropy(...)# isPaused() prevents syncChanges()

The main problem is, that in the middle of handleReload(...)'s
critical section it indirectly (loadServices()) calls  AddService(...).
AddService() in turn calls Pause() to protect itself against
syncChanges(). At the end of AddService() a defered call to Resume() is
made.

With the current implementation, this releases
isPaused() "lock" in the middle of handleReload() allowing antiEntropy
to kick in while configuration reload is still in progress.
Specifically almost all services and probably all check are unloaded
when syncChanges() is allowed to run.

This in turn can causes massive service/check de-/re-registration,
and since checks are by default registered in the critical state,
a majority of services on a node can be marked as failing.
It's made worse with automation, often calling `consul reload` in close
proximity on many nodes in the cluster.

This change basically turns Pause()/Resume() into P()/V() of
a garden-variety semaphore. Allowing Pause() to be called multiple times,
and releasing isPaused() only after all matching/defered Resumes() are
called as well.

TODO/NOTE: as with many semaphore implementations, it might be reasonable
to panic() if l.paused ever becomes negative.
2015-09-11 18:28:06 +02:00
Wojciech Bederski
24bc17eaa1 failing test showing that nested Pause()/Resume() release too early
see: #1173 / https://github.com/hashicorp/consul/issues/1173
2015-09-11 17:52:57 +02:00
Shawn Cook
66fd8fb2a0 Rename EnableTagOverride and update formatting 2015-09-11 08:35:29 -07:00
Shawn Cook
d7ce0b3c6b Remove debug lines 2015-09-11 08:32:59 -07:00
Shawn Cook
0b3faf6e4a Merge remote-tracking branch 'hashicorp/master' into enable_tag_drift_03 2015-09-10 14:55:30 -07:00
Shawn Cook
35f276f25d Add test cases TestAgentAntiEntropy_EnableTagDrift 2015-09-10 14:08:16 -07:00
Ryan Uber
1908c16f53 Merge pull request #1230 from hashicorp/f-maintfix
Respect tokens in maintenance mode
2015-09-10 12:30:07 -07:00
Ryan Uber
039938a7e0 agent: testing node/service maintenance using tokens 2015-09-10 12:08:08 -07:00