James Phillips
0b05dbeb21
Merge pull request #1235 from wuub/master
...
fix conflict between handleReload and antiEntropy critical sections
2015-09-17 07:28:39 -07:00
Wojciech Bederski
c4537ed26f
panic when unbalanced localState.Resume() is detected
2015-09-17 11:32:08 +02:00
Dale Wijnand
5a28ebcaa3
Fix a bunch of typos.
2015-09-15 13:22:08 +01:00
James Phillips
2f9ebdb135
Merge pull request #1187 from sfncook/enable_tag_drift_03
...
Enable tag drift 03
2015-09-11 15:35:32 -07:00
Anthony Scalisi
10e028d599
remove various typos
2015-09-11 12:29:54 -07:00
Wojciech Bederski
b014c0f91b
make Pause()/Resume()/isPaused() behave more like a semaphore
...
see: https://github.com/hashicorp/consul/issues/1173 #1173
Reasoning: somewhere during consul development Pause()/Resume() and
PauseSync()/ResumeSync() were added to protect larger changes to
agent's localState. A few of the places that it tries to protect are:
- (a *Agent) AddService(...) # part of the method
- (c *Command) handleReload(...) # almost the whole method
- (l *localState) antiEntropy(...)# isPaused() prevents syncChanges()
The main problem is, that in the middle of handleReload(...)'s
critical section it indirectly (loadServices()) calls AddService(...).
AddService() in turn calls Pause() to protect itself against
syncChanges(). At the end of AddService() a defered call to Resume() is
made.
With the current implementation, this releases
isPaused() "lock" in the middle of handleReload() allowing antiEntropy
to kick in while configuration reload is still in progress.
Specifically almost all services and probably all check are unloaded
when syncChanges() is allowed to run.
This in turn can causes massive service/check de-/re-registration,
and since checks are by default registered in the critical state,
a majority of services on a node can be marked as failing.
It's made worse with automation, often calling `consul reload` in close
proximity on many nodes in the cluster.
This change basically turns Pause()/Resume() into P()/V() of
a garden-variety semaphore. Allowing Pause() to be called multiple times,
and releasing isPaused() only after all matching/defered Resumes() are
called as well.
TODO/NOTE: as with many semaphore implementations, it might be reasonable
to panic() if l.paused ever becomes negative.
2015-09-11 18:28:06 +02:00
Wojciech Bederski
24bc17eaa1
failing test showing that nested Pause()/Resume() release too early
...
see: #1173 / https://github.com/hashicorp/consul/issues/1173
2015-09-11 17:52:57 +02:00
Shawn Cook
66fd8fb2a0
Rename EnableTagOverride and update formatting
2015-09-11 08:35:29 -07:00
Shawn Cook
d7ce0b3c6b
Remove debug lines
2015-09-11 08:32:59 -07:00
Shawn Cook
0b3faf6e4a
Merge remote-tracking branch 'hashicorp/master' into enable_tag_drift_03
2015-09-10 14:55:30 -07:00
James Phillips
27c59f7b30
Adds missing token to maint unit test.
2015-09-10 14:53:00 -07:00
Shawn Cook
35f276f25d
Add test cases TestAgentAntiEntropy_EnableTagDrift
2015-09-10 14:08:16 -07:00
Ryan Uber
1908c16f53
Merge pull request #1230 from hashicorp/f-maintfix
...
Respect tokens in maintenance mode
2015-09-10 12:30:07 -07:00
Ryan Uber
039938a7e0
agent: testing node/service maintenance using tokens
2015-09-10 12:08:08 -07:00
Ryan Uber
125d7fd4ee
agent: thread tokens through for maintenance mode
2015-09-10 11:43:59 -07:00
Wim
0bc4d9322e
Allow AAAA queries for nodeLookup
2015-09-08 16:54:36 +02:00
Ryan Breen
446c640c17
Merge pull request #1217 from 42wim/fix-rfc2308-part3
...
No NXDOMAIN when the answer is empty
2015-09-04 10:42:38 -04:00
Armon Dadgar
5cb6ab625e
Merge pull request #1214 from zendesk/fix_lock_race_2
...
lock.go: fix another race condition
2015-09-02 16:04:55 -07:00
Wim
2701bb5cc2
No NXDOMAIN when the answer is empty
2015-09-02 16:12:22 +02:00
Ryan Breen
80d26f9156
Merge pull request #1167 from railsguru/master
...
Add -http-port option to change the HTTP API port
2015-09-02 01:15:55 -04:00
Armon Dadgar
52a8a95af9
agent: Always enable the UI endpoints
2015-09-01 18:28:32 -07:00
Michael S. Fischer
43ab372a18
lock.go: fix another race condition
...
The previous fix to `consul lock` (commit 6875e8d
) didn't completely
eliminate the race that could occur if the lock was acquired around the
same time SIGTERM was received: It was still possible for
Run() to spawn the process via startChild() after killChild() had
released the shared mutex.
Now, when SIGTERM is received, we acquire a mutex that prevents
spawning a new process and never release it.
We've tested this fix pretty thoroughly and believe it completely
resolves the issue.
2015-09-01 14:27:23 -07:00
Wim
4a1dc90cba
Limit the DNS responses after getting the NodeRecords
2015-09-01 23:23:05 +02:00
Ryan Breen
f41b79eff2
Merge pull request #1195 from 42wim/fix-rfc2308-part2
...
Return SOA/NXDOMAIN when the answer is empty
2015-09-01 17:08:31 -04:00
Wim
369982270d
Return SOA/not found when the answer is empty
2015-09-01 22:28:12 +02:00
James Phillips
26ce9d16be
Merge pull request #1200 from ryotarai/lock-pass-stdin
...
command/lock: Pass stdin to child process when -pass-stdin passed.
2015-08-31 21:14:45 -07:00
Ryan Uber
11e4cfd72b
agent: reload SCADA client if endpoint changes
2015-08-27 13:29:07 -07:00
Ryan Uber
c468acf222
command: atlas endpoint can be passed
2015-08-27 11:11:05 -07:00
Ryan Uber
1cc2429364
agent: atlas_endpoint is configurable
2015-08-27 11:08:01 -07:00
Ryota Arai
33a6cde7dd
command/lock: Pass stdin to child process when -pass-stdin passed.
2015-08-26 16:27:21 +09:00
Ryan Uber
5ad8bfbd41
agent: log a message when making a new scada connection
2015-08-25 21:03:16 -07:00
Ryan Uber
4b715a7d2c
agent: don't reload scada client if there is no config change
2015-08-25 20:43:57 -07:00
Ryan Uber
ed70720d55
agent: testing scada client creation in command
2015-08-25 20:22:22 -07:00
Ryan Uber
52a7206ff3
agent: test scada HTTP server creation
2015-08-25 18:51:04 -07:00
Ryan Uber
eb8974160f
agent: clean up scada connection manager
2015-08-25 18:27:07 -07:00
Ryan Uber
87c1e4fcd3
agent: document the scada http creation func
2015-08-25 17:19:11 -07:00
Ryan Uber
2e6ccded2c
agent: scada client and HTTP server are tracked separately
2015-08-25 16:59:53 -07:00
Andy Lo-A-Foe
85321301e1
Remove duplicate code
2015-08-20 20:46:20 +02:00
Andy Lo-A-Foe
3e046d3efc
Use Ports.HTTP directly
2015-08-20 20:27:20 +02:00
Andy Lo-A-Foe
4e2c3373bc
Add documentation for http-port option
2015-08-20 20:19:35 +02:00
Shawn Cook
ab91590bcb
Update tests - NodeService init needs bool
2015-08-20 09:09:26 -07:00
Shawn Cook
96785edd9a
Add EnableTagDrift logic to command/agent/local.go
2015-08-18 14:03:48 -07:00
Shawn Cook
a0f8c0a2a0
Remove from command/agent/config_test.go
2015-08-18 10:42:25 -07:00
Shawn Cook
6a835939b8
EnableTagDrift in NodeService struct
2015-08-18 10:34:55 -07:00
Ryan Uber
134db62937
Merge pull request #1166 from hashicorp/f-dns-log
...
Log network address of DNS clients
2015-08-13 18:32:32 -07:00
Ryan Uber
05216d3cc4
agent: log network address of DNS clients
2015-08-11 10:33:27 -07:00
Andy Lo-A-Foe
7b5da2a240
Add -http-port option to change the HTTP API port
...
This is useful when pushing consul to PaaS like
Cloud Foundry making the HTTP API easily routable.
2015-08-11 14:14:21 +02:00
Armon Dadgar
066e772536
Merge pull request #1158 from mfischer-zd/fix_1155
...
lock.go: fix race condition
2015-08-05 14:56:13 -07:00
Michael S. Fischer
6875e8d6b4
lock.go: fix race condition
...
Fix a race condition between startChild() and killChild() that could
lead to an orphaned managed process.
Fixes #1155
2015-08-05 09:06:51 -07:00
J.R. Garcia
4cb6f3e943
Remove trailing slash from lock
...
Lock command will remove trailing slash from path (as it is invalid).
Fixes #1136 .
2015-07-30 12:14:17 -05:00
Ryan Breen
018fd69aa2
Merge pull request #1143 from hashicorp/GH-1142
...
Check NXDOMAIN after filtering nodes
2015-07-29 18:56:08 -04:00
Ryan Breen
0a7dc85076
Test for GH-1142.
2015-07-29 18:21:16 -04:00
Armon Dadgar
0363d4b54b
Merge pull request #1137 from 42wim/fix-1124
...
Recurse when PTR answer is empty
2015-07-29 14:39:04 -07:00
Ryan Breen
42648438a0
Check NXDOMAIN after filtering nodes
...
Move the check for NXDOMAIN below the service health filter.
2015-07-29 17:16:48 -04:00
Ryan Uber
93c9c87f7a
Merge pull request #1141 from hashicorp/f-travis
...
Try moving to newer Travis-CI infrastructure
2015-07-28 10:42:56 -07:00
Ryan Uber
40f3e3fae7
travis-ci: skip syslog tests for container-based travis infra
2015-07-28 09:58:43 -07:00
Wim
5647a37ffe
Recurse when PTR answer is empty
2015-07-27 23:22:36 +02:00
Armon Dadgar
4a9b91f2a2
Merge pull request #1130 from pdf/check_socket
...
Add Socket check type
2015-07-27 14:21:24 -07:00
Ryan Uber
a6317f2fb2
Merge pull request #1090 from hashicorp/f-keyring-acl
...
Keyring ACLs
2015-07-24 10:23:18 -07:00
Peter Fern
b023904298
Add TCP check type
...
Adds the ability to simply check whether a TCP socket accepts
connections to determine if it is healthy. This is a light-weight -
though less comprehensive than scripting - method of checking network
service health.
The check parameter `tcp` should be set to the `address:port`
combination for the service to be tested. Supports both IPv6 and IPv4,
in the case of a hostname that resolves to both, connections will be
attempted via both protocol versions, with the first successful
connection returning a successful check result.
Example check:
```json
{
"check": {
"id": "ssh",
"name": "SSH (TCP)",
"tcp": "example.com:22",
"interval": "10s"
}
}
```
2015-07-24 14:06:05 +10:00
Ryan Uber
7aa8539c10
agent: disable ACLs for RPC client tests
2015-07-23 17:09:33 -07:00
Armon Dadgar
981c62ccba
command/lock: Check for shutdown during lock acquisition. Fixes #800
2015-07-22 16:07:44 -07:00
Benjamin Abbott-Scott
f877b9ecc4
Return every time lock acquisition fails
2015-07-22 10:44:47 -07:00
Ryan Uber
1bbdf3b03b
agent: vet fixes
2015-07-14 11:42:51 -07:00
Ryan Uber
5682b715c4
Merge pull request #995 from 42wim/rfc2308-soa-ttl
...
Send SOA with negative responses (RFC2308)
2015-07-13 08:49:25 -07:00
Ryan Uber
79ac4f3512
agent: testing keyring ACLs
2015-07-07 15:14:06 -06:00
Ryan Uber
5c65bc7df2
agent: write-level keyring ACLs work
2015-07-07 10:36:51 -06:00
Ryan Uber
bffc0861cc
agent: read-level keyring ACLs work
2015-07-07 10:30:34 -06:00
Ryan Uber
e37b5ecb69
Merge pull request #1046 from hashicorp/f-event-acl
...
Event ACLs
2015-07-02 07:02:07 -07:00
Ryan Uber
d0348d1291
Merge pull request #1004 from i0rek/advertise_addrs
...
Add advertise_addrs.
2015-06-23 12:32:07 -07:00
Hans Hasselberg
267e0caf81
Implement advertise_addrs for SerfLan, SerfWan and RPC.
...
Fixes #550 .
This will make it possible to configure the advertised adresses for
SerfLan, SerfWan and RPC. It will enable multiple consul clients on a
single host which is very useful in a container environment.
This option might override advertise_addr and advertise_addr_wan
depending on the configuration.
It will be configureable with advertise_addrs. Example:
{
"advertise_addrs": {
"serf_lan": "10.0.120.91:4424",
"serf_wan": "201.20.10.61:4423",
"rpc": "10.20.10.61:4424"
}
}
2015-06-23 21:23:45 +02:00
Ryan Uber
320ab1448d
command: remote exec takes -token parameter
2015-06-22 17:16:28 -07:00
Ryan Uber
71aae88f3f
command: event command supports -token arg
2015-06-22 16:59:41 -07:00
Ryan Uber
5bde81bcdc
agent: avoid masking errors when ACLs deny a request
2015-06-18 18:13:29 -07:00
Ryan Uber
beb27fb3ef
agent: testing user event endpoint ACLs
2015-06-18 18:13:29 -07:00
Ryan Uber
6e084f6897
consul: always fire events from server nodes
2015-06-18 18:13:29 -07:00
Ryan Uber
6f309c355f
agent: enforce event policy during event fire
2015-06-18 18:13:29 -07:00
Wim
3b1bcaea98
Send SOA with negative responses
2015-06-14 00:03:44 +02:00
Ryan Uber
8ffa0ea8b7
Merge pull request #1028 from sebastianmarkow/master
...
Remove unreachable error handling in AgentRPC.listen()
2015-06-12 22:28:10 -07:00
Ryan Uber
f7f7c4695e
agent: testing dns when acls are in use
2015-06-12 16:01:57 -07:00
Ryan Uber
fb3938d88e
agent: dns uses the configured token during queries
2015-06-12 16:01:57 -07:00
Sebastian Klatt
6ef6e43418
consul: Remove unreachable error handling
2015-06-12 20:21:32 +02:00
Ryan Uber
1b4167699f
agent: don't replace empty tokens in the logs, fixes #1020
2015-06-12 00:11:37 -07:00
Ryan Uber
f5f7e401d5
agent: fix failing test
2015-06-11 15:13:10 -07:00
Ryan Uber
69921808ee
agent: use persist/load/purge convention for function names
2015-06-08 09:35:10 -07:00
Ryan Uber
2d1b873e4b
agent: test check state restoration from AddCheck
2015-06-05 17:33:34 -07:00
Ryan Uber
1636a35289
agent: check state is purged if expired
2015-06-05 16:59:41 -07:00
Ryan Uber
2ee8fa8e15
agent: purge check state when checks are deregistered
2015-06-05 16:57:14 -07:00
Ryan Uber
7e6e861394
agent: testing state persistence, recovery, and expiration
2015-06-05 16:45:05 -07:00
Ryan Uber
7597d3d798
agent: first stab at persisting check state
2015-06-05 16:17:07 -07:00
Ryan Uber
ebe57a1f65
agent: refactor loadChecks/loadServices, fixes a few minor bugs
2015-06-04 14:33:30 -07:00
Ryan Uber
5226e29a69
agent: don't replace config on SIGHUP if parsing fails
2015-05-30 22:50:24 -07:00
Emil Hessman
3bfc6dfe49
command/agent: skip unix file permissions test on windows
2015-05-29 21:12:45 +02:00
Adam Renberg
7d0959b34e
Sort members in by name for consul members
2015-05-22 10:37:54 +02:00
Adam Renberg
6aa21152d1
Sort tags in consul members -detailed output
2015-05-22 10:27:47 +02:00
Ryan Uber
78a80f3a57
agent: flush progress info to console during migrations
2015-05-19 18:47:44 -07:00
Anton Lindström
ce93fdd76b
Set the User Agent for HTTP health checks
2015-05-18 19:12:10 +02:00
Adam Renberg
ed3b0dd9ee
Include DC in the members command output
2015-05-15 23:26:34 +02:00
Ryan Uber
2b98ebca78
agent: log a message when data migrations start
2015-05-12 12:58:44 -07:00
Ryan Uber
72ee584df3
Fix tests after merge
2015-05-11 18:53:09 -07:00
Armon Dadgar
ebf961ef8b
Merge pull request #927 from hashicorp/f-tls
...
Add new `verify_server_hostname` to mitigate possibility of MITM
2015-05-11 18:15:16 -07:00
Armon Dadgar
8d86290ebf
Fixing merge conflict
2015-05-11 16:48:10 -07:00
Armon Dadgar
a485eb8447
agent: copy config into consul config
2015-05-11 15:16:13 -07:00
Armon Dadgar
59d5992355
agent: Adding new VerifyHostname config
2015-05-11 15:13:58 -07:00
Cameron Ruatta
9271d94532
Adding documentation about specifying multiple configuration directories
2015-05-11 10:19:04 -07:00
Ryan Uber
ecf43cf14e
command: fix configtest help format
2015-05-11 09:42:26 -07:00
Ryan Uber
3ad94bd81d
Merge pull request #904 from josephholsten/configtest-clean
...
add minimal configtest command
2015-05-11 09:01:54 -07:00
Joseph Anthony Pasquale Holsten
afbf68878c
command/configtest: add
2015-05-08 13:09:50 -07:00
Ryan Uber
3ed0146e44
agent: use service ID field to determine associated health checks during deregister
2015-05-07 15:30:01 -07:00
Ryan Uber
204c11ec01
agent: restore check status when re-registering (updating) services
2015-05-06 12:28:42 -07:00
Armon Dadgar
f3a8f907fb
Merge pull request #909 from hashicorp/f-create
...
Support ACL upsert behavior
2015-05-06 11:22:11 -07:00
Ryan Uber
8ef01236e1
agent: allow persisted services to be updated on disk
2015-05-05 22:36:45 -07:00
Ryan Uber
739d1fdf03
Merge pull request #891 from hashicorp/f-token
...
ACL tokens for service/check registration
2015-05-05 22:17:31 -07:00
Armon Dadgar
532c06ac43
agent: Support ACL upserting
2015-05-05 19:25:10 -07:00
Armon Dadgar
27a820d611
agent: Adding test for DNS enable_truncate
2015-05-05 14:14:41 -07:00
Armon Dadgar
ea577fbf70
command/agent: Lowercase DC. Fixes #761
2015-05-05 13:56:37 -07:00
Ryan Uber
2b62f2f172
agent: use an additional parameter for passing tokens
2015-05-04 17:48:05 -07:00
Armon Dadgar
a86f31517b
Merge pull request #816 from pepov/master
...
Support different advertise address for WAN gossip
2015-05-04 15:40:25 -07:00
Armon Dadgar
b381cca304
Merge pull request #902 from hashicorp/f-stats-prefix
...
Allow configuring the stats prefix
2015-05-04 15:19:47 -07:00
Armon Dadgar
0dc58140f3
Merge pull request #862 from hashicorp/f-recurse-cname
...
Return all CNAME's during service DNS resolution
2015-05-04 15:19:13 -07:00
Ryan Uber
72524e911d
agent: allow configuring the stats prefix
2015-05-03 16:46:20 -07:00
Ryan Uber
35f5a65fb7
agent: more tests
2015-04-28 13:06:02 -07:00
Ryan Uber
18356328c4
agent: restore tokens for services and checks in config
2015-04-28 12:44:46 -07:00
Ryan Uber
663a86f9b9
agent: backwards compat for persisted services from pre-0.5.1
2015-04-28 12:18:41 -07:00
Ryan Uber
442933650e
agent: safer read methods for tokens
2015-04-28 11:53:53 -07:00
Ryan Uber
1557f7f19c
agent: test coverage loading service/check tokens from persisted files
2015-04-27 22:46:01 -07:00
Ryan Uber
1264f7edf3
agent: fix deadlock reading tokens from state
2015-04-27 22:26:03 -07:00
Ryan Uber
bebb5d9641
agent: add service/check token methods to reduce invasiveness
2015-04-27 22:01:01 -07:00
Ryan Uber
92add18e1e
agent: persist tokens from API registrations
2015-04-27 19:01:02 -07:00
Ryan Uber
bfb27d18cd
agent: initial pass threading through tokens for services/checks
2015-04-27 18:33:46 -07:00
artushin
cc07734d6e
remove config
2015-04-24 09:51:40 -05:00
artushin
7b4720a957
use existing randomStagger
2015-04-23 17:08:17 -05:00
artushin
fc0331ddfc
add CheckUpdateStagger to MergeConfig
2015-04-23 16:56:20 -05:00
artushin
8decf5d394
adding check_update_stagger
2015-04-23 16:27:42 -05:00
Ryan Uber
c9fd3eb469
agent: re-work DNS tests to not rely on the external network
2015-04-14 12:52:26 -07:00
Ryan Uber
116f8b9131
agent: pass through CNAME types for service resolution
2015-04-14 12:52:26 -07:00
Ryan Uber
6f0b1a3b46
agent: Add test for CNAME recursion
2015-04-14 12:52:26 -07:00
Ryan Uber
507917748a
agent: parse raw query URL to avoid closing the request body early
2015-04-13 17:31:53 -07:00
Ryan Uber
ee5659858a
agent: hide tokens from logs and monitor
2015-04-12 11:17:31 -07:00
Ryan Mills
275af975e8
Allow specifying a status field in the agent/service/register and agent/check/register endpoints.
...
This status must be one of the valid check statuses: 'passing', 'warning', 'critical', 'unknown'.
If the status field is not present or the empty string, the default of 'critical' is used.
2015-04-12 02:00:31 +00:00
Ryan Uber
9c85ea0c47
agent: Don't attempt migration on new server
2015-04-10 19:41:09 -07:00
Ryan Uber
6cc0eefa76
Merge pull request #857 from hashicorp/f-boltdb
...
Raft uses BoltDB
2015-04-10 18:30:07 -07:00
Ryan Uber
60a6da213f
agent: handle nil node services in anti-entropy
2015-04-10 11:15:31 -07:00
Ryan Uber
ac0f66a91e
command: automatically migrate raft data on start
2015-04-09 23:00:20 -07:00
Ryan Uber
7e170b047e
agent: fix anti-entropy check sync
2015-04-09 10:40:05 -07:00
Ryan Uber
f417279761
agent: test anti-entropy sync
2015-04-08 12:36:53 -07:00
Ryan Uber
a60f4adf95
agent: anti-entropy sync services/checks if they don't exist in the catalog
2015-04-08 12:21:01 -07:00
Ryan Uber
deec3bef9e
agent: fix dns test
2015-04-01 10:58:05 -07:00
Matt Good
062e4f94c0
Remove unnecessary DNS test entry
...
By using the startup callbacks, the DNS test entry is not needed to check that
the server is alive.
2015-03-31 16:50:44 -07:00
Matt Good
65ada1a62d
Use DNS server startup callbacks
...
Simplify waiting for the DNS server to start with the newer "NotifyStartedFunc"
callback.
2015-03-31 16:48:48 -07:00