3715 Commits

Author SHA1 Message Date
Sean Chittenden
58246fcc0b Initialize the rebalancce to clientRPCMinReuseDuration
In an earlier version there was a channel to notify when a new server was added, however this has long since been removed.  Just default to the sane value of 2min before the first rebalance calc takes place.

Pointed out by: slackpad
2016-03-25 13:46:18 -07:00
Sean Chittenden
4fec6a9608 Guard against very small or negative rates
Pointed out by: slackpad
2016-03-25 13:31:55 -07:00
Sean Chittenden
d9251e30c8 Use range vs for
Returning a new array vs mutating an array in place so we can use range now.
2016-03-25 13:08:08 -07:00
Sean Chittenden
9c18bb5f1c Comment updates 2016-03-25 13:06:59 -07:00
Sean Chittenden
0b3f6932df Only rotate server list with more than one server
Fantastic observation by slackpad.  This was left over from when there was a boolean for health in the server struct (vs current strategy where we use server position in the list and rely on serf to cleanup the stale members).

Pointed out by: slackpad
2016-03-25 12:54:36 -07:00
Sean Chittenden
24eb274860 Relocate saveServerConfig next to getServerConfig
Requested by: slackpad
2016-03-25 12:41:22 -07:00
Sean Chittenden
b00db393e7 Clarify that ConsulClusterInfo is an interface over serf
An interface was used to break a cyclic import dependency.
2016-03-25 12:38:40 -07:00
Sean Chittenden
b8bbdb9e7a Reword comment after moving code into new packages 2016-03-25 12:34:46 -07:00
Sean Chittenden
68183c4378 Change initialReblaanaceTimeout to a time.Duration
Pointed out by: @slackpad
2016-03-25 12:34:12 -07:00
Sean Chittenden
3433feb93b Negative check: test an invalid condition 2016-03-25 12:22:33 -07:00
Sean Chittenden
0f3ad9c120 Test to make sure bootstrap is missing 2016-03-25 12:20:12 -07:00
Sean Chittenden
7a2d30d1cf Be more Go idiomatic w/ variable names: s/valid/ok/g
Cargo culting is bad, m'kay?

Pointy Hat: sean-
2016-03-25 12:14:24 -07:00
Sean Chittenden
9daccb8b41 Fix stale comment
Pointed out by: @slackpad
2016-03-25 12:00:40 -07:00
Sean Chittenden
db72041063 Add a comment for Client serverMgr 2016-03-25 11:59:27 -07:00
Sean Chittenden
4d4806ab02 Add CHANGELOG entry re: agent rebalancing 2016-03-23 22:36:12 -07:00
Sean Chittenden
828606232e Correct a bogus goimport rewrite for tests 2016-03-23 22:35:49 -07:00
Sean Chittenden
da872fee63 Test ServerManager.refreshServerRebalanceTimer
Change the signature so it returns a value so that this can be tested externally with mock data.  See the sample table in TestServerManagerInternal_refreshServerRebalanceTimer() for the rate at which it will back off.  This function is mostly used to not cripple large clusters in the event of a partition.
2016-03-23 22:10:50 -07:00
Sean Chittenden
a63d5ab963 Add a handful more unit tests to the public interface 2016-03-23 22:10:50 -07:00
Sean Chittenden
8e3c83a258 Rename GetNumServers to NumServers()
Matches the style of the rest of the repo
2016-03-23 22:10:50 -07:00
Sean Chittenden
18f7befba9 Rename NewServerManger to just New
Follow go style recommendations now that this has been refactored out of the consul package and doesn't need the qualifier in the name.
2016-03-23 22:10:50 -07:00
Sean Chittenden
e932e9a435 Rename FindHealthyServer() to FindServer()
There is no guarantee the server coming back is healthy.  It's apt to be healthy by virtue of its place in the server list, but it's not guaranteed.
2016-03-23 22:10:50 -07:00
Sean Chittenden
49a5a1ab84 cycleServer is a pure function, save the result 2016-03-23 22:10:50 -07:00
Sean Chittenden
94f79d2c3d Missed unit test cruft 2016-03-23 22:10:50 -07:00
Sean Chittenden
bc62de541c Update comments to reflect reality 2016-03-23 22:10:50 -07:00
Sean Chittenden
d2d55f4bb0 Remove additional cruft from ServerManager's channels
No longer needed code.
2016-03-23 22:10:50 -07:00
Sean Chittenden
d13e3c18c9 Emulate a TryLock using atomic.CompareAndSwap
Prevent possible queueing behind serverConfigLock in the event that a server fails on a busy host.
2016-03-23 22:10:50 -07:00
Sean Chittenden
295af01680 Make use of interfaces
Use an interface instead of serf.Serf as arg to NewServerManager.  Bonus points for improved testability.

Pointed out by: @slackpad
2016-03-23 22:10:50 -07:00
Sean Chittenden
fdbb142c3f Simplify error handling
Rely on Serf for liveliness.  In the event of a failure, simply cycle the server to the end of the list.  If the server is unhealthy, Serf will reap the dead server.

Additional simplifications:

*) Only rebalance servers based on timers, not when a new server is readded to the cluster.
*) Back out the failure count in server_details.ServerDetails
2016-03-23 22:10:50 -07:00
Sean Chittenden
c2c73bfeab Unbreak client tests by reverting to original test
Debugging code crept into the actual test and hung out for much longer than it should have.
2016-03-23 22:10:50 -07:00
Sean Chittenden
0c87463b7e Introduce asynchronous management of consul server lists
Instead of blocking the RPC call path and performing a potentially expensive calculation (including a call to `c.LANMembers()`), introduce a channel to request a rebalance.  Some events don't force a reshuffle, instead the extend the duration of the current rebalance window because the environment thrashed enough to redistribute a client's load.
2016-03-23 22:10:50 -07:00
Sean Chittenden
bad6cb8897 Comment nits 2016-03-23 22:10:50 -07:00
Sean Chittenden
bf8c860663 Update Serf to include serf.NumNodes() 2016-03-23 22:10:50 -07:00
Sean Chittenden
74bcbc63f8 Use saveServerConfig vs atomic.Value.Store(config) 2016-03-23 22:10:50 -07:00
Sean Chittenden
7f55931d02 Commit a handful of refactoring && copy/paste-o fixes 2016-03-23 22:10:50 -07:00
Sean Chittenden
e53704b032 Mutate copies of serverCfg.servers, not original
Removing any ambiguity re: ownership of the mutated server lists is a win for maintenance and debugging.
2016-03-23 22:10:50 -07:00
Sean Chittenden
e6c27325d9 rebalanceTimer may be nil during initialization
When first starting the server manager, it's possible that the rebalanceTimer in serverConfig will be nil, test accordingly.
2016-03-23 22:10:50 -07:00
Sean Chittenden
a7091b0837 Properly retain a pointer to the rebalanceTimer 2016-03-23 22:10:50 -07:00
Sean Chittenden
00ff8e5307 Cosmetic and various other wordsmithing cleanups 2016-03-23 22:10:50 -07:00
Sean Chittenden
b4db49a62e Document the various functions and their locking 2016-03-23 22:10:50 -07:00
Sean Chittenden
9eb6481d73 Use config convenience method to get config
'cause ELETTHECOMPILERSDOTHEWORK.  I don't need that cluttering up the subconscious with more complexity.
2016-03-23 22:10:50 -07:00
Sean Chittenden
579e536f58 Move consul.serverConfig out of the consul package
Relocated to its own package, server_manager.  This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary.  More work is needed to be done here
2016-03-23 22:10:50 -07:00
Sean Chittenden
c7c551dbe0 Rename serverConfigMtx to serverConfigLock
Pointed out by: @slackpad
2016-03-23 22:10:50 -07:00
Sean Chittenden
e48b910f87 Refactor out the management of Consul servers
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.

This commit brings in a background task that proactively manages the server list and:

*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed

This is a WIP, more work in testing needs to be completed.
2016-03-23 22:10:50 -07:00
Sean Chittenden
01b637114c Move consul.serverConfig out of the consul package
Relocated to its own package, server_manager.  This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary.  More work is needed to be done here
2016-03-23 22:10:50 -07:00
Sean Chittenden
117c65dc55 Rename serverConfigMtx to serverConfigLock
Pointed out by: @slackpad
2016-03-23 22:10:32 -07:00
Sean Chittenden
0eac826573 Refactor out the management of Consul servers
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.

This commit brings in a background task that proactively manages the server list and:

*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed

This is a WIP, more work in testing needs to be completed.
2016-03-23 22:09:46 -07:00
Sean Chittenden
b9e5588620 Move consul.serverConfig out of the consul package
Relocated to its own package, server_manager.  This now greatly simplifies the RPC() call path and appropriately hides the locking behind the package boundary.  More work is needed to be done here
2016-03-23 22:05:29 -07:00
Sean Chittenden
a482eaef70 Rename serverConfigMtx to serverConfigLock
Pointed out by: @slackpad
2016-03-23 22:05:05 -07:00
Sean Chittenden
9b8767aa67 Refactor out the management of Consul servers
Move the management of c.consulServers (fka c.consuls) into consul/server_manager.go.

This commit brings in a background task that proactively manages the server list and:

*) reshuffles the list
*) manages the timer out of the RPC() path
*) uses atomics to detect a server has failed

This is a WIP, more work in testing needs to be completed.
2016-03-23 22:03:20 -07:00
Sean Chittenden
075d1b628f Commit miss re: consuls variable rename 2016-03-23 16:24:29 -07:00