707 Commits

Author SHA1 Message Date
Pierre Souchay
aebfcb6767 Fixed minor typo (+ travis tests is unstable) 2018-03-09 18:42:13 +01:00
Pierre Souchay
93fa1f6f49 Optimize size for SRV records, should improve performance a bit
Stricter Unit tests that checks if truncation was OK.
2018-03-09 18:25:29 +01:00
Preetha
210cfe5ef9
Merge pull request #3940 from pierresouchay/dns_max_size
Allow to control the number of A/AAAA Record returned by DNS
2018-03-09 07:35:32 -06:00
Pierre Souchay
d0e45f22df Fixed wrong format of debug msg in unit test 2018-03-08 00:36:17 +01:00
Pierre Souchay
ce3f47a75d Performance optimization for services having more than 2k records 2018-03-08 00:26:41 +01:00
Pierre Souchay
7d59249d96 Avoid issue with compression of DNS messages causing overflow 2018-03-07 23:33:41 +01:00
Pierre Souchay
419bf29041 Cleaner Unit tests from suggestions from @preetapan 2018-03-07 18:24:41 +01:00
Pierre Souchay
b77fd5ce9d 64000 max limit to DNS messages since there is overhead
Added debug log to give information about truncation.
2018-03-07 16:14:41 +01:00
Pierre Souchay
be39fb20cc [BUGFIX] do not break when TCP DNS answer exceeds 64k
It will avoid having discovery broken when having large number
of instances of a service (works with SRV and A* records).

Fixes https://github.com/hashicorp/consul/issues/3850
2018-03-07 10:08:06 +01:00
Mitchell Hashimoto
8217564c48
agent/consul/fsm: begin using testify/assert 2018-03-06 09:48:15 -08:00
Pierre Souchay
0b7f620dc6 Allow to control the number of A/AAAA Record returned by DNS
This allows to have randomized resource records (i.e. each
answer contains only one IP, but the IP changes every request) for
A, AAAA records.

It will fix https://github.com/hashicorp/consul/issues/3355 and
https://github.com/hashicorp/consul/issues/3937

See https://github.com/hashicorp/consul/issues/3937#issuecomment-370610509
for details.

It basically add a new option called `a_record_limit` and will not
return more than a_record_limit when performing A, AAAA or ANY DNS
requests.

The existing `udp_answer_limit` option is still working but should
be considered as deprecated since it works only with DNS clients
not supporting EDNS.
2018-03-06 02:07:42 +01:00
Edd Steel
41b1d45cc7
Re-use defined endpoints for tests 2018-03-03 11:19:18 -08:00
Paul Banks
9a47449c6d
Merge pull request #3899 from pierresouchay/fix_blocking_queries_index
Services Indexes modified per service instead of using a global Index
2018-03-02 16:24:43 +00:00
Pierre Souchay
360dc1dd8d Simplified error handling for maxIndexForService
* added unit tests to ensure service index is properly garbage collected
* added Upgrade from Version 1.0.6 to higher section in documentation
2018-03-01 14:09:36 +01:00
Paul Banks
dbaabb1dbc
Fix test running in non-bash shells 2018-02-22 14:06:06 +00:00
Paul Banks
6da6e086ef
Merge pull request #3900 from hashicorp/fix-monitor-sigint-3891
Fixes #3891: agent monitor no longer unresponsive before logs stream.
2018-02-21 21:28:33 +00:00
Preetha Appan
80791d5b21
Remove extra newline 2018-02-21 13:21:47 -06:00
Preetha Appan
907b97b7f2
Unit test that calls revokeLeadership twice to make sure its idempotent 2018-02-21 12:48:53 -06:00
Preetha Appan
f59abcc394
Make sure revokeLeadership is called if establishLeadership errors 2018-02-21 12:33:22 -06:00
Alex Dadgar
18bf9647d5 Test autopilots start/stop idempotency 2018-02-21 10:19:30 -08:00
Alex Dadgar
33c5afdb31 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Pierre Souchay
a8d3745104 Fixed comments for function maxIndexForService 2018-02-20 23:57:28 +01:00
Pierre Souchay
09351ba9a6 [Revert] Only update services if tags are different
This patch did give some better results, but break watches on
the services of a node.

It is possible to apply the same optimization for nodes than
to services (one index per instance), but it would complicate
further the patch.

Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay
60454b570a Only update services if tags are different 2018-02-20 23:08:04 +01:00
Pierre Souchay
a05d38737c Enable Raft index optimization per service name on health endpoint
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Paul Banks
de58eb1820
Fixes #3891: agent monitor no longer unresponsive before logs stream.
The root cause is actually that the agent's streaming HTTP API didn't flush until the first log line was found which commonly was pretty soon since the default level is INFO. In cases where there were no logs immediately due to level for instance, the client gets stuck in the HTTP code waiting on a response packet from the server before we enter the loop that checks the shutdown channel from the signal handler.

This fix flushes the initial status immediately on the streaming endpoint which lets the client code get into it's expected state where it's listening for shutdown or log lines.
2018-02-19 21:53:10 +00:00
Pierre Souchay
4f10fae3c3 Get only first service to test whether we have to cleanup index of a service 2018-02-19 22:44:49 +01:00
Pierre Souchay
bac8fb046f Fixed comment about raftIndex + use test.Helper() 2018-02-19 19:30:25 +01:00
Pierre Souchay
73127ef407 Services Indexes modified per service instead of using a global Index
This patch improves the watches for services on large cluster:
each service has now its own index, such watches on a specific service
are not modified by changes in the global catalog.

It should improve a lot the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
Edd Steel
d0f0d67b4a
Clarify comments 2018-02-17 17:46:11 -08:00
Edd Steel
f770f360e9 Test every endpoint for OPTIONS/MethodNotFound 2018-02-17 17:34:13 -08:00
Edd Steel
c5f0bb3711 Allow endpoints to handle OPTIONS/MethodNotFound themselves 2018-02-17 17:34:03 -08:00
Edd Steel
f5af8b0f03
Initialise allowedMethods in init() 2018-02-17 17:31:24 -08:00
Kyle Havlovitz
139b98a427
Fix the coordinate update endpoint not passing the ACL token 2018-02-15 11:58:02 -08:00
Edd Steel
77f19f7505
Support OPTIONS requests
- register endpoints with supported methods
- support OPTIONS requests, indicating supported methods
- extract method validation (error 405) from individual endpoints
- on 405 where multiple methods are allowed, create a single Allow
  header with comma-separated values, not multiple Allow headers.
2018-02-12 10:15:31 -08:00
Andrei Burd
b608091014 adding human readability for dns requests debug log (#3751) 2018-02-11 09:02:28 -06:00
Pierre Souchay
b259b1609c Merge remote-tracking branch 'origin/master' into service_metadata 2018-02-11 13:20:49 +01:00
Pierre Souchay
9a57dfd68a Fixed TestSanitize unit test 2018-02-11 12:11:11 +01:00
James Phillips
3724e49ddf
Fixes a panic on TCP-based DNS lookups.
This came in via the monkey patch in #3861.

Fixes #3877
2018-02-08 17:57:41 -08:00
Pierre Souchay
66fdf445e8 Added unit tests for structs and fixed PartialClone() 2018-02-09 01:37:45 +01:00
James Phillips
c2a59f1e6c
Addresses additional state mutations.
Did a sweep of 84d6ac2d51
and checked them all.
2018-02-07 07:02:10 -08:00
James Phillips
1c6de1d623
Fixes all the racy output-side updates to tags. 2018-02-06 20:35:55 -08:00
James Phillips
11f6961e47
Adds a more robust unit test for index churn. 2018-02-06 20:35:38 -08:00
Pierre Souchay
80dde5465b Added support for Service Metadata 2018-02-07 01:54:42 +01:00
James Phillips
d9a6e2a901
Makes server manager shift away from failed servers from Serf events.
Because this code was doing pointer equality checks, it would work for
the case of a failed attempted RPC because the objects are from the
manager itself:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/consul/rpc.go#L283-L302

But the pointer check would always fail for events coming in from the
Serf path because the server object is newly-created:

https://github.com/hashicorp/consul/blob/v1.0.3/agent/router/serf_adapter.go#L14-L40

This means that we didn't proactively shift RPC traffic away from a
failed server, we'd have to wait for an RPC to fail, which exposes
the error to the calling client.

By switching over to a name check vs. a pointer check we get the correct
behavior. We added a DEBUG log as well to help observe this behavior during
integrated testing.

Related to #3863 since the fix here needed the same logic duplicated, owing
to the complicated atomic stuff.

/cc @dadgar for a heads up in case this also affects Nomad.
2018-02-05 17:56:00 -08:00
James Phillips
fc155dac19
Adds a before/after test for #3845. 2018-02-05 16:18:29 -08:00
James Phillips
533f65b7a6
Merge pull request #3845 from 42wim/tagfix
Fix service tags not added to health check. Part two
2018-02-05 16:18:00 -08:00
Kyle Havlovitz
f6ecaa4a1c
Add enterprise default config section 2018-02-05 13:33:59 -08:00
James Phillips
e748c63fff
Merge pull request #3855 from hashicorp/pr-3782-slackpad
Adds support for gRPC health checks.
2018-02-02 17:57:27 -08:00
James Phillips
5f31c8d8d3
Changes "TLS" to "GRPCUseTLS" since it only applies to GRPC checks. 2018-02-02 17:29:34 -08:00