Sean Chittenden
e327630523
Reduce the error level from Fatal when unit testing
2016-03-26 22:07:09 -07:00
Sean Chittenden
92c2e8e668
Start server rebalance task after init'ing Serf
...
Now that there is no longer an event loop driven directly by Serf, start the ServerManager task after Serf has been set up. When testing with timers and timeouts adjusted to unreasonably low values, it's possible to tickle a race condition where Serf's NumNodes() would fail because Serf had not been initialized.
2016-03-26 22:04:41 -07:00
Sean Chittenden
ee95c55d88
Catch up to a few renames
2016-03-26 19:32:11 -07:00
Sean Chittenden
dc291e4027
Use empty string for addr in ServerDetails.String()
2016-03-26 19:30:04 -07:00
Sean Chittenden
970938c2dd
Guard against a nil ServerDetails.Addr
...
It's not clear how or why this would ever be nil, but some of the unit tests produce a nil addr. Be defensive.
2016-03-26 19:29:31 -07:00
Sean Chittenden
ca5950a538
Proactively ping server before rotation
...
Before shuffling the server list, proactively ping the next server in the list to establish the connection and verify the remote endpoint is healthy.
2016-03-26 19:28:13 -07:00
Sean Chittenden
18270bbd04
Factor out the shuffle server
2016-03-26 19:19:04 -07:00
Sean Chittenden
90d626b5d9
Revise comments re: cycleServer
...
Improve the comments to discuss what happens presently. Add a note to consider possibly calling to TestConsulServer proactively.
2016-03-26 18:53:13 -07:00
Sean Chittenden
cf59a860e7
Comment why the interface is needed: cyclic import
2016-03-26 18:38:35 -07:00
Sean Chittenden
3ecd72f3b5
Add a struct key type for server_details
2016-03-26 17:58:12 -07:00
Sean Chittenden
ae32a3ceae
Merge pull request #1873 from hashicorp/f-rebalance-worker-0.7
...
Periodically rebalance the servers that agents talk to
2016-03-25 15:03:18 -07:00
Sean Chittenden
5893bd5a35
Add additional checks
2016-03-25 14:40:46 -07:00
Sean Chittenden
d18e4d7455
Delete the right tag
...
"role" != "consul"
2016-03-25 14:31:48 -07:00
Sean Chittenden
b1194e83cb
Don't pass in sm, server manager is already in scope
...
Go closures are implicitly capturing lambdas.
2016-03-25 14:10:09 -07:00
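A minimal sketch of the point made in this commit message, assuming a hypothetical `serverManager` type (the real type and its fields are not shown in this log). A Go function literal can read a variable from the enclosing scope without receiving it as a parameter:

```go
package main

import "fmt"

// serverManager is a hypothetical stand-in for the real server manager type.
type serverManager struct {
	name string
}

func main() {
	sm := &serverManager{name: "primary"}

	// No parameter is declared: the function literal captures sm
	// from the enclosing scope.
	describe := func() string {
		return "manager: " + sm.name
	}

	fmt.Println(describe())

	// The variable itself is captured, so later changes are visible
	// inside the closure.
	sm.name = "renamed"
	fmt.Println(describe())
}
```

Because the closure captures the variable rather than a copy of its value, passing `sm` in as an argument would be redundant here.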
Sean Chittenden
a3a0eeeadd
Trim residual complexity from server join notifications
...
Now that serf node join events are decoupled from rebalancing activities completely, remove the complexity of draining the channel and ensuring only one goroutine was rebalancing the server list.
Now that we're no longer initializing a notification channel, we can remove the config load/save from `Start()`.
2016-03-25 14:06:35 -07:00
Sean Chittenden
a71fbe57e3
Only log in FindServers
...
In FindServer this is a useful warning hinting why its call failed. RPC returns error and leaves it to the higher level caller to do whatever it wants. As an operator, I'd have the detail necessary to know why the RPC call(s) failed.
2016-03-25 13:58:50 -07:00
Sean Chittenden
58246fcc0b
Initialize the rebalance to clientRPCMinReuseDuration
...
In an earlier version there was a channel to notify when a new server was added; however, this has long since been removed. Just default to the sane value of 2min before the first rebalance calc takes place.
Pointed out by: slackpad
2016-03-25 13:46:18 -07:00
Sean Chittenden
4fec6a9608
Guard against very small or negative rates
...
Pointed out by: slackpad
2016-03-25 13:31:55 -07:00
Sean Chittenden
d9251e30c8
Use range vs for
...
We now return a new array instead of mutating an array in place, so we can use range.
2016-03-25 13:08:08 -07:00
Sean Chittenden
9c18bb5f1c
Comment updates
2016-03-25 13:06:59 -07:00
Sean Chittenden
0b3f6932df
Only rotate server list with more than one server
...
Fantastic observation by slackpad. This was left over from when there was a boolean for health in the server struct (vs the current strategy where we use server position in the list and rely on serf to clean up the stale members).
Pointed out by: slackpad
2016-03-25 12:54:36 -07:00
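A sketch of the rotation described in this entry, under stated assumptions: the function name `cycleServers` and the use of plain strings for servers are illustrative and not taken from the actual code. It moves the first server to the end of the list, returns a new slice rather than mutating its input, and leaves lists with fewer than two servers unchanged:

```go
package main

import "fmt"

// cycleServers (hypothetical name) returns a new slice with the first
// server moved to the end. The input slice is never modified. Lists with
// fewer than two entries have nothing to rotate and are returned as a copy.
func cycleServers(servers []string) []string {
	n := len(servers)
	out := make([]string, 0, n)
	if n < 2 {
		return append(out, servers...)
	}
	out = append(out, servers[1:]...)
	return append(out, servers[0])
}

func main() {
	servers := []string{"s1", "s2", "s3"}
	rotated := cycleServers(servers)
	fmt.Println(rotated) // [s2 s3 s1]
	fmt.Println(servers) // [s1 s2 s3]
}
```

Since the function has no side effects, the caller must store the returned slice for the rotation to take effect.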
Sean Chittenden
24eb274860
Relocate saveServerConfig next to getServerConfig
...
Requested by: slackpad
2016-03-25 12:41:22 -07:00
Sean Chittenden
b00db393e7
Clarify that ConsulClusterInfo is an interface over serf
...
An interface was used to break a cyclic import dependency.
2016-03-25 12:38:40 -07:00
Sean Chittenden
b8bbdb9e7a
Reword comment after moving code into new packages
2016-03-25 12:34:46 -07:00
Sean Chittenden
68183c4378
Change initialRebalanceTimeout to a time.Duration
...
Pointed out by: @slackpad
2016-03-25 12:34:12 -07:00
Sean Chittenden
3433feb93b
Negative check: test an invalid condition
2016-03-25 12:22:33 -07:00
Sean Chittenden
0f3ad9c120
Test to make sure bootstrap is missing
2016-03-25 12:20:12 -07:00
Sean Chittenden
7a2d30d1cf
Be more Go idiomatic w/ variable names: s/valid/ok/g
...
Cargo culting is bad, m'kay?
Pointy Hat: sean-
2016-03-25 12:14:24 -07:00
Sean Chittenden
9daccb8b41
Fix stale comment
...
Pointed out by: @slackpad
2016-03-25 12:00:40 -07:00
Sean Chittenden
db72041063
Add a comment for Client serverMgr
2016-03-25 11:59:27 -07:00
James Phillips
3340d7ccd7
Merge pull request #1876 from hashicorp/f-tls-helper
...
Adds TLS config helper to API client.
2016-03-24 11:34:24 -07:00
James Phillips
0eb6279c5e
Improves the comment for the Address field.
2016-03-24 11:33:44 -07:00
James Phillips
e3f6c6a798
Merge pull request #1877 from hashicorp/api-constants
...
Added some constants in the api for check health statuses
2016-03-24 11:29:11 -07:00
Diptanu Choudhury
4811a72d80
Added some constants in the api for check health statuses
2016-03-24 11:26:07 -07:00
James Phillips
1cfed981b3
Adds TLS config helper to API client.
2016-03-24 11:24:18 -07:00
Sean Chittenden
4d4806ab02
Add CHANGELOG entry re: agent rebalancing
2016-03-23 22:36:12 -07:00
Sean Chittenden
828606232e
Correct a bogus goimport rewrite for tests
2016-03-23 22:35:49 -07:00
Sean Chittenden
da872fee63
Test ServerManager.refreshServerRebalanceTimer
...
Change the signature so it returns a value so that this can be tested externally with mock data. See the sample table in TestServerManagerInternal_refreshServerRebalanceTimer() for the rate at which it will back off. This function is mostly used to not cripple large clusters in the event of a partition.
2016-03-23 22:10:50 -07:00
Sean Chittenden
a63d5ab963
Add a handful more unit tests to the public interface
2016-03-23 22:10:50 -07:00
Sean Chittenden
8e3c83a258
Rename GetNumServers to NumServers()
...
Matches the style of the rest of the repo
2016-03-23 22:10:50 -07:00
Sean Chittenden
18f7befba9
Rename NewServerManager to just New
...
Follow Go style recommendations now that this has been refactored out of the consul package and doesn't need the qualifier in the name.
2016-03-23 22:10:50 -07:00
Sean Chittenden
e932e9a435
Rename FindHealthyServer() to FindServer()
...
There is no guarantee the server coming back is healthy. It's apt to be healthy by virtue of its place in the server list, but it's not guaranteed.
2016-03-23 22:10:50 -07:00
Sean Chittenden
49a5a1ab84
cycleServer is a pure function, save the result
2016-03-23 22:10:50 -07:00
Sean Chittenden
94f79d2c3d
Missed unit test cruft
2016-03-23 22:10:50 -07:00
Sean Chittenden
bc62de541c
Update comments to reflect reality
2016-03-23 22:10:50 -07:00
Sean Chittenden
d2d55f4bb0
Remove additional cruft from ServerManager's channels
...
No longer needed code.
2016-03-23 22:10:50 -07:00
Sean Chittenden
d13e3c18c9
Emulate a TryLock using atomic.CompareAndSwap
...
Prevent possible queueing behind serverConfigLock in the event that a server fails on a busy host.
2016-03-23 22:10:50 -07:00
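A minimal sketch of the technique named in this entry, assuming a hypothetical `tryLock` type (the actual field and function names are not shown in this log). `atomic.CompareAndSwapInt32` flips a flag from 0 to 1 only if it is currently 0, so a caller that loses the race returns immediately instead of queueing behind a mutex:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tryLock is a hypothetical helper type: held is 0 when free, 1 when taken.
type tryLock struct {
	held int32
}

// TryAcquire returns true if the caller obtained the lock. It returns
// false right away, without blocking, if the lock is already held.
func (l *tryLock) TryAcquire() bool {
	return atomic.CompareAndSwapInt32(&l.held, 0, 1)
}

// Release marks the lock as free again.
func (l *tryLock) Release() {
	atomic.StoreInt32(&l.held, 0)
}

func main() {
	var l tryLock
	fmt.Println(l.TryAcquire()) // true
	fmt.Println(l.TryAcquire()) // false
	l.Release()
	fmt.Println(l.TryAcquire()) // true
}
```

A caller that gets `false` can skip the work on the assumption that another goroutine is already handling it, which is what avoids the queueing the commit message describes.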
Sean Chittenden
295af01680
Make use of interfaces
...
Use an interface instead of serf.Serf as arg to NewServerManager. Bonus points for improved testability.
Pointed out by: @slackpad
2016-03-23 22:10:50 -07:00
Sean Chittenden
fdbb142c3f
Simplify error handling
...
Rely on Serf for liveness. In the event of a failure, simply cycle the server to the end of the list. If the server is unhealthy, Serf will reap the dead server.
Additional simplifications:
*) Only rebalance servers based on timers, not when a new server is re-added to the cluster.
*) Back out the failure count in server_details.ServerDetails
2016-03-23 22:10:50 -07:00
Sean Chittenden
c2c73bfeab
Unbreak client tests by reverting to original test
...
Debugging code crept into the actual test and hung out for much longer than it should have.
2016-03-23 22:10:50 -07:00