8912 Commits

Author SHA1 Message Date
Freddy
d7a404f2ee
Bugfix: Use "%#v" when formatting structs (#4600) 2018-08-28 12:37:34 -04:00
Jack Pearkes
82f9e48a18
github: some minor changes to issue templates (#4521)
Just cleaning these up a tiny bit based on seeing how they're
being used.
2018-08-28 09:07:28 -07:00
Jack Pearkes
c1bf14be30
website: use 127.0.0.1 instead of consul.rocks (#4523)
By default, the Consul agent listens on the local interface
at port 8500 for API requests. This change makes the API examples
using `curl` copy-pasteable for this default configuration.
2018-08-28 09:07:15 -07:00
Siva Prasad
b1a34f899f
TestAgentAntiEntropy: Wait until Consul service is up on the agent. (#4591)
* Anti-Entropy test wait for Consul service added

* Reverted some tests back to using WaitForLeader
2018-08-28 09:52:11 -04:00
Pierre Souchay
8e7b8bb524 Fixed unit test TestCatalogListServicesCommand (#4592) 2018-08-27 13:53:46 -04:00
Pierre Souchay
5e0218ccf4 Fix unit test TestOperatorAutopilotGetConfigCommand (#4594) 2018-08-27 13:29:25 -04:00
Pierre Souchay
aea31d3c5d Fixed unstable test TestUiNodeInfo (#4586) 2018-08-27 11:49:14 -04:00
Pierre Souchay
3101086a27 Fixed unstable test TestProxy_public (#4587) 2018-08-27 11:45:07 -04:00
Pierre Souchay
af90c88f6a Fixed unstable test TestRTTCommand_LAN in command/rtt (#4585) 2018-08-27 11:37:13 -04:00
Pierre Souchay
3f9d1370b7 Fix unstable test TestRegisterMonitor_heartbeat (#4568) 2018-08-24 13:33:58 -04:00
Kyle Havlovitz
2eefa74710
Merge pull request #4577 from remijouannet/patch-1
Update monitoring-telegraf.html.md
2018-08-24 09:13:53 -07:00
Rémi Jouannet
2767ae860b
Update monitoring-telegraf.html.md 2018-08-24 16:48:02 +02:00
John Cowen
4ebd70e6cd
UI: Fixes healthy node listing resize on large portrait screens (#4564)
1. Split the resizing functionality off into a separate mixin to be
shared across components
2. Add basic integration tests to prove that everything is getting
called throughout the lifetime of the app. I decided against unit
testing as there isn't really any isolated logic to be tested, more
checking that things are being called in the correct order etc., i.e. the
integration is correct.

Adds assertion to with-resizing so it's obvious to override `resize`
2018-08-24 12:35:52 +02:00
Freddy
a00a2dc148
Update CHANGELOG.md (#4562) 2018-08-23 15:20:26 -04:00
Pierre Souchay
b898131723 [BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554)
Ensure that DB is properly initialized when performing stale queries

Addresses:
- https://github.com/hashicorp/consul-replicate/issues/82
- https://github.com/hashicorp/consul/issues/3975
- https://github.com/hashicorp/consul-template/issues/1131
2018-08-23 12:06:39 -04:00
Paul Banks
4d658f34cf
Intention ACL API clarification (#4547) 2018-08-20 20:33:15 +01:00
jjshanks
657b8d27ac Update intentions documentation to clarify ACL behavior (#4546)
* Update intentions documentation to clarify ACL behavior

* Incorporate @banks suggestions into docs

* Fix my own typos!
2018-08-20 20:03:53 +01:00
Matt Keeler
542cace9a2
Update CHANGELOG.md 2018-08-17 16:20:34 -04:00
Miroslav Bagljas
3c23979afd Fixes #4483: Add support for Authorization: Bearer token Header (#4502)
Added Authorization Bearer token support as per RFC6750

* appended Authorization header token parsing after X-Consul-Token
* added test cases
* updated website documentation to mention Authorization header

* improve tests, improve Bearer parsing
2018-08-17 16:18:42 -04:00
Matt Keeler
1b1a9c6fc6
Update CHANGELOG.md 2018-08-17 14:45:52 -04:00
Matt Keeler
e81c85c051
Fix #4515: Segfault when serf_wan port was -1 but reconnect_time_wan was set (#4531)
Fixes #4515 

This just slightly refactors the logic to only attempt to set the serf wan reconnect timeout when the rest of the serf wan settings are configured - thus avoiding a segfault.
2018-08-17 14:44:25 -04:00
Siva Prasad
f8cc241b28
Fixed a make build issue with Windows Binaries. (#4538)
* Fixed an issue where Windows binary had trouble being copied correctly

* Enclosed binname inside angle brackets
2018-08-17 09:31:57 -04:00
Kyle Havlovitz
9257300f7e
Merge pull request #4535 from hashicorp/ca-snapshot-fix
fsm: add missing CA config to snapshot/restore logic
2018-08-16 13:05:47 -07:00
Kyle Havlovitz
e5e1f867e5
Merge branch 'master' into ca-snapshot-fix 2018-08-16 13:00:54 -07:00
Kyle Havlovitz
f186edc42c
fsm: add connect service config to snapshot/restore test 2018-08-16 12:58:54 -07:00
Matt Keeler
efe14620ea
Update CHANGELOG.md 2018-08-16 15:35:02 -04:00
nickmy9729
beddf03b26 Added code to allow snapshot inclusion of NodeMeta (#4527) 2018-08-16 15:33:35 -04:00
Kyle Havlovitz
b51d76f469
fsm: add missing CA config to snapshot/restore logic 2018-08-16 11:58:50 -07:00
Kyle Havlovitz
fa8990c5a2
Merge pull request #4528 from hashicorp/autopilot-fixes
Fix inconsistency caused by the autopilot StatsFetcher
2018-08-15 11:35:52 -07:00
Siva Prasad
771e0bafd2
Addresses the flakiness of CatalogNodes (#4530) 2018-08-15 11:16:05 -04:00
sandstrom
14f19f75a6 Clarify port usage for agents (#4510) 2018-08-14 16:10:01 -07:00
Kyle Havlovitz
4b35d877ca
autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
Kyle Havlovitz
ea14482376
Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
Shubheksha
fc3997f266 replace old fork of text package (#4501) 2018-08-14 12:23:18 -07:00
Pierre Souchay
0d6de257a2 Display more information about check being not properly added when it fails (#4405)
* Display more information about check being not properly added when it fails

It follows an incident where we had lots of error messages:

  [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration

That seems related to Consul failing to restart on respective agents.

Having Node information as well as service information would help diagnose the issue.

* Renamed ensureCheckIfNodeMatches() as requested by @banks
2018-08-14 17:45:33 +01:00
Freddy
6d43d24edb
Improve reliability of tests with TestAgent (#4525)
- Add WaitForTestAgent to tests flaky due to missing serfHealth registration

- Fix bug in retries calling Fatalf with *testing.T

- Convert TestLockCommand_ChildExitCode to table driven test
2018-08-14 12:08:33 -04:00
Paul Banks
e34acd275f
Update intentions.html.md 2018-08-14 15:09:45 +01:00
Freddy
e305443db4
Address flakiness in command/exec tests (#4517)
* Add fn to wait for TestAgent node and check registration

* Add waits for TestAgent and retries before timeouts in exec_test
2018-08-10 15:04:07 -04:00
Geoffrey Grosenbach
a03512496f
Consul Production Deployment Guide
Renames guide to "Production Deployment"
Adds link in sidebar menu.
Implements edits suggested by Consul engineering team.
2018-08-10 11:51:05 -07:00
Matt Keeler
daacbaa520
Update CHANGELOG.md 2018-08-10 11:33:22 -04:00
Pierre Souchay
ef3b81ab13 Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415)
* Allow to rename nodes with IDs, will fix #3974 and #4413

This change allows any well-behaved recent agent with an
ID to be renamed safely, i.e. without taking the name of another one
under case-insensitive comparison.

Deprecated behaviour warning
----------------------------

For backward compatibility, it is still possible, however, to
"take" the name of another node by not providing any ID.

Note that when no ID is provided, it is possible to have 2 nodes
whose names differ only by case, e.g. myNode and mynode,
which might lead to DB corruption on the Consul server side and
prevent the server from restarting properly.

See #3983 and #4399 for context about this change.

Disabling registration of nodes without IDs as specified in #4414
should probably be the way to go eventually.

* Removed the case-insensitive search when adding a node within the else
block since it breaks the test TestAgentAntiEntropy_Services

While the else case is probably legit, it will be fixed with #4414 in
a later release.

* Added again the test in the else to avoid duplicated names, but
enforce this test only for nodes having IDs.

Thus most tests without any ID will work, and allows us fixing

* Added more tests regarding request with/without IDs.

`TestStateStore_EnsureNode` now tests registration and renaming with IDs

`TestStateStore_EnsureNodeDeprecated` tests registration without IDs
and tests removing an ID from a node as well as updating a node
without its ID (deprecated behaviour kept for backwards compatibility)

* Do not allow renaming in case of conflict, including when other node has no ID

* Fixed function GetNodeID that was not working due to wrong type when searching node from its ID

Thus, all tests about renaming were not working properly.

Added the full test case that allowed me to detect it.

* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID()

* Added separate TestStateStore_GetNodeID to test GetNodeID.

More complete test coverage for GetNodeID

* Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn`

Also fixed comments to be clearer after remarks from @banks

* Fixed error message in unit test to match test case

* Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler
2018-08-10 11:30:45 -04:00
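The renaming rule described in #4415 above (reject a name that matches another node case-insensitively, but enforce this only when the registering node has an ID) could look roughly like the sketch below. The `Node` type and `checkNameConflict` function are hypothetical simplifications for illustration, not the state store code from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, simplified stand-in for a catalog node.
type Node struct {
	ID   string // may be empty for legacy registrations
	Name string
}

// checkNameConflict returns an error if candidate would take the name of a
// different node under case-insensitive comparison. Candidates without an ID
// skip the check, mirroring the deprecated behaviour kept for compatibility.
func checkNameConflict(existing []Node, candidate Node) error {
	if candidate.ID == "" {
		return nil
	}
	for _, n := range existing {
		if n.ID == candidate.ID {
			continue // same node: an update or a rename of itself
		}
		if strings.EqualFold(n.Name, candidate.Name) {
			return fmt.Errorf("node name %q conflicts with existing node %q",
				candidate.Name, n.Name)
		}
	}
	return nil
}

func main() {
	existing := []Node{{ID: "a1", Name: "myNode"}, {ID: "", Name: "legacy"}}

	// A different ID with a name differing only by case is rejected.
	fmt.Println(checkNameConflict(existing, Node{ID: "b2", Name: "mynode"}) != nil)
	// Renaming node a1 to an unused name is allowed.
	fmt.Println(checkNameConflict(existing, Node{ID: "a1", Name: "renamed"}) != nil)
	// A conflict with a node that has no ID is also rejected.
	fmt.Println(checkNameConflict(existing, Node{ID: "b2", Name: "LEGACY"}) != nil)
}
```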
Paul Banks
9ce10769ce Update Serf and memberlist (#4511)
This includes fixes that improve gossip scalability on very large (> 10k node) clusters.

The Serf changes:
 - take snapshot disk IO out of the critical path for handling messages hashicorp/serf#524
 - make snapshot compaction much less aggressive - the old fixed threshold caused snapshots to be constantly compacted (synchronously with request handling) on clusters larger than about 2000 nodes! hashicorp/serf#525

Memberlist changes:
 - prioritize handling alive messages over suspect/dead to improve stability, and handle queue in LIFO order to avoid acting on info that's already stale in the queue by the time we handle it. hashicorp/memberlist#159
 - limit the number of concurrent pushPull requests being handled at once to 128. In one test scenario with 10s of thousands of servers we saw channel and lock blocking cause over 3000 pushPulls at once which ballooned the memory of the server because each push pull contained a de-serialised list of all known 10k+ nodes and their tags for a total of about 60 million objects and 7GB of memory stuck. While the rest of the fixes here should prevent the same root cause from blocking in the same way, this prevents any other bug or source of contention from allowing pushPull messages to stack up and eat resources. hashicorp/memberlist#158
2018-08-09 13:16:13 -04:00
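The cap of 128 concurrent pushPull requests described in #4511 above can be pictured as a simple admission limiter. The sketch below uses a buffered channel as a semaphore; the `limiter` type and its methods are assumptions made for illustration, and memberlist's actual implementation may use a different mechanism.

```go
package main

import "fmt"

// maxPushPull mirrors the limit of 128 quoted in the commit message.
const maxPushPull = 128

// limiter is a hypothetical admission limiter backed by a buffered channel.
type limiter struct {
	slots chan struct{}
}

func newLimiter(n int) *limiter {
	return &limiter{slots: make(chan struct{}, n)}
}

// tryAcquire takes a slot without blocking; it reports false when the
// limit is reached so the caller can drop the request instead of queueing it.
func (l *limiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot previously taken by tryAcquire.
func (l *limiter) release() {
	<-l.slots
}

func main() {
	l := newLimiter(maxPushPull)
	accepted := 0
	for i := 0; i < 200; i++ {
		if l.tryAcquire() {
			accepted++
		}
	}
	fmt.Println(accepted) // prints 128: the remaining 72 requests were rejected
}
```

Rejecting excess requests outright, rather than letting them wait, is what keeps memory bounded when handlers stall, which is the failure mode the commit message describes.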
Siva Prasad
c88900aaa9
PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512)
* Fixes TestAgent_IndexChurn

* Fixes TestPreparedQuery_Wrapper

* Increased sleep in agent_test for IndexChurn to 500ms

* Made the comment about joinWAN operation much less of a cliffhanger
2018-08-09 12:40:07 -04:00
Armon Dadgar
8a1f42e190
Merge pull request #4505 from hashicorp/f-channel-size
consul: Update buffer sizes
2018-08-08 12:26:23 -07:00
Freddy
e21f554923
Improve flaky connect/proxy Listener tests (#4498)
Improve flaky connect/proxy Listener tests

- Add sleep to TestEchoConn to allow for Read/Write to finish before fetching data in reportStats

- Account for flakiness around interval for Gauge

- Improve debug output when dumping metrics
2018-08-08 14:56:03 -04:00
Armon Dadgar
4f1fd34e9e consul: Update buffer sizes 2018-08-08 10:26:58 -07:00
Geoffrey Grosenbach
3e6313bebb
Merge pull request #4485 from hashicorp/doc-remove-atlas
Remove all mention of Atlas, even in deprecated changelogs
2018-08-07 10:45:50 -07:00
Geoffrey Grosenbach
78134fa368
Merge pull request #4318 from hashicorp/doc-discovery-code-snippet
Improve styling of discovery snippet
2018-08-07 10:44:58 -07:00
Siva Prasad
288d350a73
Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497)
* Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)"

This reverts commit cec5d7239621e0732b3f70158addb1899442acb3.

* Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)"

This reverts commit 589b589b53e56af38de25db9b56967bdf1f2c069.
2018-08-07 08:29:48 -04:00
Pierre Souchay
cec5d72396 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additional retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00