Pierre Souchay
b77fd5ce9d
Limit DNS messages to 64000 bytes max since there is overhead
...
Added debug log to give information about truncation.
2018-03-07 16:14:41 +01:00
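A minimal sketch of the kind of truncation this commit and be39fb20cc describe, assuming a fixed per-record size. The `record` type, the `trimToLimit` helper and the 12-byte header size are illustrative assumptions, not the code from the commit; only the 64000 figure comes from the subject line.

```go
package main

import "fmt"

// maxDNSMessageSize is the cap named in the commit subject. It sits below the
// 65535-byte ceiling of a TCP DNS message to leave room for overhead.
const maxDNSMessageSize = 64000

// record is a hypothetical stand-in for a DNS resource record; only its
// encoded size matters for this sketch.
type record struct {
	name string
	size int
}

// trimToLimit keeps records in order until adding the next one would push
// headerSize plus the record sizes past limit. It reports whether any record
// was dropped.
func trimToLimit(records []record, headerSize, limit int) ([]record, bool) {
	total := headerSize
	for i, r := range records {
		if total+r.size > limit {
			return records[:i], true
		}
		total += r.size
	}
	return records, false
}

func main() {
	records := make([]record, 0, 3000)
	for i := 0; i < 3000; i++ {
		records = append(records, record{name: fmt.Sprintf("node-%d", i), size: 40})
	}
	kept, truncated := trimToLimit(records, 12, maxDNSMessageSize)
	fmt.Println("kept:", len(kept), "truncated:", truncated)
}
```

A real server would measure the encoded message instead of summing assumed sizes, and would log the truncation as the commit body mentions.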
Pierre Souchay
be39fb20cc
[BUGFIX] do not break when TCP DNS answer exceeds 64k
...
This prevents service discovery from breaking when a service has a large
number of instances (works with SRV and A* records).
Fixes https://github.com/hashicorp/consul/issues/3850
2018-03-07 10:08:06 +01:00
Mitchell Hashimoto
8217564c48
agent/consul/fsm: begin using testify/assert
2018-03-06 09:48:15 -08:00
Pierre Souchay
0b7f620dc6
Allow controlling the number of A/AAAA records returned by DNS
...
This makes randomized resource records possible (i.e. each
answer contains only one IP, but the IP changes on every request) for
A and AAAA records.
It will fix https://github.com/hashicorp/consul/issues/3355 and
https://github.com/hashicorp/consul/issues/3937
See https://github.com/hashicorp/consul/issues/3937#issuecomment-370610509
for details.
It adds a new option called `a_record_limit` and will not
return more than `a_record_limit` records when performing A, AAAA or ANY DNS
requests.
The existing `udp_answer_limit` option still works but should
be considered deprecated since it only works with DNS clients
that do not support EDNS.
2018-03-06 02:07:42 +01:00
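The effect of `a_record_limit` described above can be sketched as "shuffle, then cap". The function name `limitARecords` and the convention that a limit of zero means "no limit" are assumptions of this sketch, not taken from the commit.

```go
package main

import (
	"fmt"
	"math/rand"
)

// limitARecords returns a shuffled copy of ips holding at most limit entries.
// A limit of zero or less is treated as "no limit" (an assumption of this
// sketch). The input slice is left untouched.
func limitARecords(ips []string, limit int) []string {
	out := make([]string, len(ips))
	copy(out, ips)
	rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	if limit > 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	ips := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	// With a limit of 1, each answer carries a single IP that may differ
	// from one request to the next.
	for i := 0; i < 3; i++ {
		fmt.Println(limitARecords(ips, 1))
	}
}
```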
Edd Steel
41b1d45cc7
Re-use defined endpoints for tests
2018-03-03 11:19:18 -08:00
Paul Banks
9a47449c6d
Merge pull request #3899 from pierresouchay/fix_blocking_queries_index
...
Services Indexes modified per service instead of using a global Index
2018-03-02 16:24:43 +00:00
Pierre Souchay
360dc1dd8d
Simplified error handling for maxIndexForService
...
* added unit tests to ensure service index is properly garbage collected
* added Upgrade from Version 1.0.6 to higher section in documentation
2018-03-01 14:09:36 +01:00
Paul Banks
dbaabb1dbc
Fix test running in non-bash shells
2018-02-22 14:06:06 +00:00
Paul Banks
6da6e086ef
Merge pull request #3900 from hashicorp/fix-monitor-sigint-3891
...
Fixes #3891: agent monitor no longer unresponsive before logs stream.
2018-02-21 21:28:33 +00:00
Preetha Appan
80791d5b21
Remove extra newline
2018-02-21 13:21:47 -06:00
Preetha Appan
907b97b7f2
Unit test that calls revokeLeadership twice to make sure it's idempotent
2018-02-21 12:48:53 -06:00
Preetha Appan
f59abcc394
Make sure revokeLeadership is called if establishLeadership errors
2018-02-21 12:33:22 -06:00
Alex Dadgar
18bf9647d5
Test autopilots start/stop idempotency
2018-02-21 10:19:30 -08:00
Alex Dadgar
33c5afdb31
Improve autopilot shutdown to be idempotent
2018-02-20 15:51:59 -08:00
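One common way to make start and stop idempotent, as the two autopilot commits above aim for, is to guard a running flag with a mutex. The `loop` type below is a hypothetical illustration and not the autopilot code itself.

```go
package main

import (
	"fmt"
	"sync"
)

// loop is a hypothetical background worker whose Start and Stop may be
// called any number of times, in any order.
type loop struct {
	mu      sync.Mutex
	running bool
	stopCh  chan struct{}
	doneCh  chan struct{}
}

// Start launches the worker unless it is already running.
func (l *loop) Start() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running {
		return
	}
	l.stopCh = make(chan struct{})
	l.doneCh = make(chan struct{})
	l.running = true
	go l.run(l.stopCh, l.doneCh)
}

// run blocks until stop is closed, then signals completion on done.
func (l *loop) run(stop <-chan struct{}, done chan<- struct{}) {
	defer close(done)
	<-stop
}

// Stop shuts the worker down and waits for it; extra calls are no-ops.
func (l *loop) Stop() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if !l.running {
		return
	}
	close(l.stopCh)
	<-l.doneCh
	l.running = false
}

// Running reports whether the worker is currently active.
func (l *loop) Running() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.running
}

func main() {
	l := &loop{}
	l.Start()
	l.Start() // second call is ignored
	l.Stop()
	l.Stop() // second call is ignored instead of closing a closed channel
	fmt.Println("running:", l.Running())
}
```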
Pierre Souchay
a8d3745104
Fixed comments for function maxIndexForService
2018-02-20 23:57:28 +01:00
Pierre Souchay
09351ba9a6
[Revert] Only update services if tags are different
...
This patch gave somewhat better results, but it breaks watches on
the services of a node.
It is possible to apply the same optimization to nodes as
to services (one index per instance), but it would further
complicate the patch.
Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay
60454b570a
Only update services if tags are different
2018-02-20 23:08:04 +01:00
Pierre Souchay
a05d38737c
Enable Raft index optimization per service name on health endpoint
...
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Paul Banks
de58eb1820
Fixes #3891: agent monitor no longer unresponsive before logs stream.
...
The root cause is that the agent's streaming HTTP API didn't flush until the first log line was found, which was commonly pretty soon since the default level is INFO. In cases where there were no logs immediately (due to the log level, for instance), the client got stuck in the HTTP code waiting on a response packet from the server before entering the loop that checks the shutdown channel from the signal handler.
This fix flushes the initial status immediately on the streaming endpoint, which lets the client code get into its expected state where it's listening for shutdown or log lines.
2018-02-19 21:53:10 +00:00
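The fix described above, flushing the status before any log line exists, can be sketched with the standard `http.Flusher` interface. The `streamLogs` handler and the `headersArriveBeforeLogs` helper are hypothetical names for illustration, not the agent's actual endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// streamLogs is a hypothetical streaming handler: it sends the status line
// and headers right away, then forwards log lines as they arrive.
func streamLogs(logs <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming not supported", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
		flusher.Flush() // the client gets a response before any log line exists

		for {
			select {
			case line, open := <-logs:
				if !open {
					return
				}
				fmt.Fprintln(w, line)
				flusher.Flush()
			case <-r.Context().Done():
				return
			}
		}
	}
}

// headersArriveBeforeLogs starts a test server whose log channel never
// produces a line and reports whether the client still gets a 200 response.
// Without the early Flush the GET would block waiting for headers.
func headersArriveBeforeLogs() bool {
	logs := make(chan string)
	srv := httptest.NewServer(streamLogs(logs))
	defer srv.Close()
	defer close(logs) // runs before srv.Close so the handler can return

	resp, err := http.Get(srv.URL)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	fmt.Println("headers arrived before any log line:", headersArriveBeforeLogs())
}
```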
Pierre Souchay
4f10fae3c3
Get only first service to test whether we have to cleanup index of a service
2018-02-19 22:44:49 +01:00
Pierre Souchay
bac8fb046f
Fixed comment about raftIndex + use test.Helper()
2018-02-19 19:30:25 +01:00
Pierre Souchay
73127ef407
Services Indexes modified per service instead of using a global Index
...
This patch improves watches on services in large clusters:
each service now has its own index, so watches on a specific service
are not triggered by unrelated changes in the global catalog.
It should greatly improve the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
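The idea of one index per service can be sketched with a toy catalog. The `catalog` type and its methods are hypothetical, and the fallback to the global index for unknown services is an assumption of this sketch.

```go
package main

import "fmt"

// catalog is a toy stand-in for the state store. Every write bumps the
// global index; each service also remembers the index of its last change.
type catalog struct {
	globalIndex  uint64
	serviceIndex map[string]uint64
}

func newCatalog() *catalog {
	return &catalog{serviceIndex: make(map[string]uint64)}
}

// register records a change to service and returns the new global index.
func (c *catalog) register(service string) uint64 {
	c.globalIndex++
	c.serviceIndex[service] = c.globalIndex
	return c.globalIndex
}

// indexFor returns the index a watch on service should compare against.
// Unknown services fall back to the global index (assumed behaviour here).
func (c *catalog) indexFor(service string) uint64 {
	if idx, ok := c.serviceIndex[service]; ok {
		return idx
	}
	return c.globalIndex
}

func main() {
	c := newCatalog()
	c.register("web")
	before := c.indexFor("web")
	c.register("db") // unrelated change in the catalog
	after := c.indexFor("web")
	fmt.Println("web watch woken by db change:", before != after)
}
```

With a single global index, the `db` registration would have changed the value every `web` watcher compares against and woken all of them.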
Edd Steel
d0f0d67b4a
Clarify comments
2018-02-17 17:46:11 -08:00
Edd Steel
f770f360e9
Test every endpoint for OPTIONS/MethodNotFound
2018-02-17 17:34:13 -08:00
Edd Steel
c5f0bb3711
Allow endpoints to handle OPTIONS/MethodNotFound themselves
2018-02-17 17:34:03 -08:00
Edd Steel
f5af8b0f03
Initialise `allowedMethods` in init()
2018-02-17 17:31:24 -08:00
Kyle Havlovitz
139b98a427
Fix the coordinate update endpoint not passing the ACL token
2018-02-15 11:58:02 -08:00
Edd Steel
77f19f7505
Support OPTIONS requests
...
- register endpoints with supported methods
- support OPTIONS requests, indicating supported methods
- extract method validation (error 405) from individual endpoints
- on 405 where multiple methods are allowed, create a single Allow
header with comma-separated values, not multiple Allow headers.
2018-02-12 10:15:31 -08:00
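The behaviour listed in this commit (central method validation, OPTIONS support, one comma-separated Allow header) can be sketched as a wrapper around a handler. `withMethods`, `demoHandler` and `probe` are hypothetical names, and including OPTIONS in the Allow value is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// withMethods answers OPTIONS and rejects disallowed methods with 405 before
// the wrapped handler runs. All allowed methods go into one Allow header.
func withMethods(allowed []string, next http.HandlerFunc) http.HandlerFunc {
	allow := strings.Join(append([]string{http.MethodOptions}, allowed...), ",")
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodOptions {
			w.Header().Set("Allow", allow)
			w.WriteHeader(http.StatusOK)
			return
		}
		for _, m := range allowed {
			if r.Method == m {
				next(w, r)
				return
			}
		}
		w.Header().Set("Allow", allow)
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

// demoHandler is an example endpoint that accepts GET and PUT.
func demoHandler() http.Handler {
	return withMethods([]string{http.MethodGet, http.MethodPut},
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		})
}

// probe sends one request through h and returns the status code together
// with every Allow header value that was set.
func probe(h http.Handler, method string) (int, []string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, "/v1/example", nil))
	return rec.Code, rec.Header().Values("Allow")
}

func main() {
	for _, m := range []string{"GET", "OPTIONS", "DELETE"} {
		code, allow := probe(demoHandler(), m)
		fmt.Println(m, code, allow)
	}
}
```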
Andrei Burd
b608091014
adding human readability for dns requests debug log (#3751)
2018-02-11 09:02:28 -06:00
Pierre Souchay
b259b1609c
Merge remote-tracking branch 'origin/master' into service_metadata
2018-02-11 13:20:49 +01:00
Pierre Souchay
9a57dfd68a
Fixed TestSanitize unit test
2018-02-11 12:11:11 +01:00
James Phillips
3724e49ddf
Fixes a panic on TCP-based DNS lookups.
...
This came in via the monkey patch in #3861.
Fixes #3877
2018-02-08 17:57:41 -08:00
Pierre Souchay
66fdf445e8
Added unit tests for structs and fixed PartialClone()
2018-02-09 01:37:45 +01:00
James Phillips
c2a59f1e6c
Addresses additional state mutations.
...
Did a sweep of 84d6ac2d51 and checked them all.
2018-02-07 07:02:10 -08:00
James Phillips
1c6de1d623
Fixes all the racy output-side updates to tags.
2018-02-06 20:35:55 -08:00
James Phillips
11f6961e47
Adds a more robust unit test for index churn.
2018-02-06 20:35:38 -08:00
Pierre Souchay
80dde5465b
Added support for Service Metadata
2018-02-07 01:54:42 +01:00
James Phillips
d9a6e2a901
Makes server manager shift away from failed servers from Serf events.
...
Because this code was doing pointer equality checks, it would work for
the case of a failed attempted RPC because the objects are from the
manager itself:
https://github.com/hashicorp/consul/blob/v1.0.3/agent/consul/rpc.go#L283-L302
But the pointer check would always fail for events coming in from the
Serf path because the server object is newly-created:
https://github.com/hashicorp/consul/blob/v1.0.3/agent/router/serf_adapter.go#L14-L40
This means that we didn't proactively shift RPC traffic away from a
failed server, we'd have to wait for an RPC to fail, which exposes
the error to the calling client.
By switching over to a name check vs. a pointer check we get the correct
behavior. We added a DEBUG log as well to help observe this behavior during
integrated testing.
Related to #3863 since the fix here needed the same logic duplicated, owing
to the complicated atomic stuff.
/cc @dadgar for a heads up in case this also affects Nomad.
2018-02-05 17:56:00 -08:00
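The difference between the pointer check and the name check described above can be shown in a few lines. The `server` type and both lookup helpers are simplified stand-ins, not the manager's real types.

```go
package main

import "fmt"

// server is a simplified stand-in for the server metadata the manager holds.
type server struct {
	Name string
	Addr string
}

// indexByPointer finds s in list by pointer identity. It only matches when
// the caller holds the very object stored in the list.
func indexByPointer(list []*server, s *server) int {
	for i, candidate := range list {
		if candidate == s {
			return i
		}
	}
	return -1
}

// indexByName finds s in list by comparing names, so a freshly built object
// describing the same server still matches.
func indexByName(list []*server, s *server) int {
	for i, candidate := range list {
		if candidate.Name == s.Name {
			return i
		}
	}
	return -1
}

func main() {
	managed := []*server{
		{Name: "consul-1", Addr: "10.0.0.1:8300"},
		{Name: "consul-2", Addr: "10.0.0.2:8300"},
	}
	// An event handler builds a new object for the failed server.
	fromEvent := &server{Name: "consul-2", Addr: "10.0.0.2:8300"}

	fmt.Println("pointer check:", indexByPointer(managed, fromEvent))
	fmt.Println("name check:", indexByName(managed, fromEvent))
}
```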
James Phillips
fc155dac19
Adds a before/after test for #3845.
2018-02-05 16:18:29 -08:00
James Phillips
533f65b7a6
Merge pull request #3845 from 42wim/tagfix
...
Fix service tags not added to health check. Part two
2018-02-05 16:18:00 -08:00
Kyle Havlovitz
f6ecaa4a1c
Add enterprise default config section
2018-02-05 13:33:59 -08:00
James Phillips
e748c63fff
Merge pull request #3855 from hashicorp/pr-3782-slackpad
...
Adds support for gRPC health checks.
2018-02-02 17:57:27 -08:00
James Phillips
5f31c8d8d3
Changes "TLS" to "GRPCUseTLS" since it only applies to GRPC checks.
2018-02-02 17:29:34 -08:00
Wim
ce771f1fb3
Fix service tags not added to health check. Part two
2018-01-29 20:32:44 +01:00
Veselkov Konstantin
5f38e1148a
fix refactoring
2018-01-28 22:53:30 +04:00
Veselkov Konstantin
8e16bd7d77
fix refactoring
2018-01-28 22:48:21 +04:00
Veselkov Konstantin
7de57ba4de
remove golint warnings
2018-01-28 22:40:13 +04:00
James Phillips
9cd602de06
Improves user lookup error message.
...
Closes #3188
Closes #3184
2018-01-26 07:56:44 -08:00
Kyle Havlovitz
144e6e7d31
Remove nonvoter from metadata.Server
2018-01-25 17:08:03 -08:00
James Phillips
64acd0ade0
Gets rid of named return parameters.
...
This wasn't wrong before but we don't generally use this style in
Consul.
2018-01-25 14:29:50 -08:00
James Phillips
b443bd1438
Moves non-stdlib includes into their own section.
2018-01-25 14:26:15 -08:00
Kyle Havlovitz
bfeb09983b
Reset clusterHealth when autopilot starts
2018-01-23 12:52:28 -08:00
Kyle Havlovitz
17805e4634
Move autopilot health loop into leader operations
2018-01-23 11:17:41 -08:00
James Phillips
c190b35b0e
Updates web assets to latest.
2018-01-22 14:46:07 -08:00
Kyle Havlovitz
cde1e7ceb6
Merge pull request #3821 from hashicorp/persist-file-handling
...
Add graceful handling of malformed persisted service/check files.
2018-01-22 12:31:33 -08:00
Kyle Havlovitz
f156b12b22
Merge pull request #3820 from hashicorp/serfwan-port-fix
...
Enforce a valid port for the Serf WAN since it can't be disabled.
2018-01-19 15:40:56 -08:00
James Phillips
93fd6bfeb4
Moves the coordinate fetch after the ACL check.
2018-01-19 15:25:22 -08:00
Kyle Havlovitz
68ae92cb8c
Don't remove the files, just log an error
2018-01-19 14:25:51 -08:00
Kyle Havlovitz
8c5be2dd97
Enforce a valid port for the Serf WAN since it can't be disabled.
...
Fixes #3817
2018-01-19 14:22:23 -08:00
Kyle Havlovitz
4e325a6b8f
Add graceful handling of malformed persisted service/check files.
...
Previously a change was made to make the file writing atomic,
but that wasn't enough to cover something like an OS crash so we
needed something here to handle the situation more gracefully.
Fixes #1221.
2018-01-19 14:07:36 -08:00
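A minimal sketch of the graceful handling described above and in 68ae92cb8c: a malformed persisted file is logged and skipped instead of aborting the load. The `persistedService` type, the in-memory `files` map and the log wording are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// persistedService is a hypothetical shape for a persisted service file.
type persistedService struct {
	ID   string `json:"id"`
	Port int    `json:"port"`
}

// loadPersisted decodes every file it is given. A file that fails to decode
// (for example one left empty by an OS crash) is logged and skipped, and the
// remaining files are still loaded. Nothing is deleted.
func loadPersisted(files map[string][]byte) (map[string]persistedService, []string) {
	loaded := make(map[string]persistedService)
	var skipped []string
	for name, raw := range files {
		var svc persistedService
		if err := json.Unmarshal(raw, &svc); err != nil {
			log.Printf("[ERR] failed decoding persisted service file %q: %v", name, err)
			skipped = append(skipped, name)
			continue
		}
		loaded[svc.ID] = svc
	}
	return loaded, skipped
}

func main() {
	files := map[string][]byte{
		"web":    []byte(`{"id":"web","port":80}`),
		"broken": []byte(""), // truncated by a crash
	}
	loaded, skipped := loadPersisted(files)
	fmt.Println("loaded:", len(loaded), "skipped:", skipped)
}
```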
James Hartig
aedab91a66
Resolve symlinks in config directory
...
Docker/Openshift/Kubernetes mount the config file as a symbolic link and
IsDir returns true if the file is a symlink. Before calling IsDir, the
symlink should be resolved to determine if it points at a file or
directory.
Fixes #3753
2018-01-12 15:43:38 -05:00
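Resolving the symlink before asking whether a path is a directory, as this commit describes, might look like the following. `isDirResolved` and `demo` are hypothetical helpers; the sketch uses `filepath.EvalSymlinks` and `os.Stat` from the standard library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isDirResolved follows any symlinks in path and then reports whether the
// final target is a directory.
func isDirResolved(path string) (bool, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return false, err
	}
	info, err := os.Stat(resolved)
	if err != nil {
		return false, err
	}
	return info.IsDir(), nil
}

// demo builds a temporary tree with one symlink to a file and one symlink to
// a directory, and reports what isDirResolved says about each.
func demo() (bool, bool, error) {
	root, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	file := filepath.Join(root, "config.json")
	if err := os.WriteFile(file, []byte("{}"), 0600); err != nil {
		return false, false, err
	}
	fileLink := filepath.Join(root, "file-link")
	if err := os.Symlink(file, fileLink); err != nil {
		return false, false, err
	}
	dirLink := filepath.Join(root, "dir-link")
	if err := os.Symlink(root, dirLink); err != nil {
		return false, false, err
	}

	fileLinkIsDir, err := isDirResolved(fileLink)
	if err != nil {
		return false, false, err
	}
	dirLinkIsDir, err := isDirResolved(dirLink)
	if err != nil {
		return false, false, err
	}
	return fileLinkIsDir, dirLinkIsDir, nil
}

func main() {
	fileLinkIsDir, dirLinkIsDir, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("link to file is dir:", fileLinkIsDir)
	fmt.Println("link to dir is dir:", dirLinkIsDir)
}
```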
James Phillips
9509aa6c4b
Adds the NodeID field back to the /v1/agent/self Config block.
...
Fixes #3778
2018-01-10 15:17:54 -08:00
James Phillips
ebcd1787db
Adds more info about how to fix the private IP error.
...
Closes #3790
2018-01-10 09:53:41 -08:00
James Phillips
48cfe6ff5f
Fixes crash where body was optional for PQ endpoint (it is not).
...
Fixes #3791
2018-01-10 09:33:49 -08:00
Dmytro Kostiuchenko
1a10b08e82
Add gRPC health-check #3073
2018-01-04 16:42:30 -05:00
Diptanu Choudhury
294151c1ad
Using labels
2017-12-21 20:30:29 -08:00
Diptanu Choudhury
006eab2394
Added telemetry around Catalog APIs
2017-12-21 16:35:12 -08:00
James Phillips
5b88b8df38
Updates the checked in web assets.
2017-12-20 19:51:04 -08:00
James Phillips
6412d8d9aa
Updates the built-in web assets.
2017-12-20 17:48:51 -08:00
James Phillips
7a46d9c1e3
Wraps HTTP mux to ban all non-printable characters from paths.
2017-12-20 15:47:53 -08:00
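Wrapping a mux so that paths with non-printable characters are rejected can be sketched as ordinary middleware. The `banNonPrintable` name, the 400 status and the error text are assumptions of this sketch; `statusFor` exists only to exercise it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"unicode"
)

// banNonPrintable rejects any request whose decoded URL path contains a
// non-printable character and passes everything else to next.
func banNonPrintable(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, c := range r.URL.Path {
			if !unicode.IsPrint(c) {
				http.Error(w, "invalid URL path: non-printable character", http.StatusBadRequest)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends a GET for target through the wrapped handler and returns
// the resulting status code.
func statusFor(target string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	banNonPrintable(ok).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/v1/kv/ok"))
	// %0A decodes to a newline in the path.
	fmt.Println(statusFor("/v1/kv/a%0Ab"))
}
```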
James Phillips
2edc11b44c
Updates the built-in web UI assets.
2017-12-20 13:43:52 -08:00
James Phillips
da6a4635b0
Fixes a `go fmt` cleanup.
2017-12-20 13:43:38 -08:00
Kyle Havlovitz
11a0c9cc58
Fix vet error
2017-12-18 18:04:42 -08:00
Kyle Havlovitz
77dc52f430
Move autopilot initializing to oss file
2017-12-18 18:02:44 -08:00
Kyle Havlovitz
039e7f1880
Move autopilot setup to a separate file
2017-12-18 16:55:51 -08:00
Kyle Havlovitz
d08ab9fd19
Make some final tweaks to autopilot package
2017-12-18 12:26:47 -08:00
Kyle Havlovitz
a86d11ec0a
Merge pull request #3737 from hashicorp/autopilot-refactor
...
Move autopilot to a standalone package
2017-12-15 14:09:40 -08:00
James Phillips
06f980061e
Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak
...
fix globalRPC goroutine leak
2017-12-14 17:54:19 -08:00
James Phillips
f491a55e47
Merge pull request #3642 from yfouquet/master
...
[Fix] Service tags not added to health checks
2017-12-14 13:59:39 -08:00
James Phillips
ca3f9024ac
Works around mapstructure behavior to enable sessions with no checks.
...
Fixes #3732
2017-12-14 09:07:56 -08:00
Kyle Havlovitz
324c2ecb53
Expose IsPotentialVoter for advanced autopilot logic
2017-12-13 17:53:51 -08:00
James Phillips
98e837167e
Changes maps to merge vs. overwrite when processing configs.
...
Fixes #3716
2017-12-13 16:06:01 -08:00
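The difference between merging and overwriting a map-valued config field can be illustrated as follows. `mergeMaps` is a hypothetical helper; the rule that a later source wins on conflicting keys is an assumption of this sketch.

```go
package main

import "fmt"

// mergeMaps returns a new map holding every key from base and override.
// When both define a key, the value from override wins. Neither input is
// modified.
func mergeMaps(base, override map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(override))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range override {
		merged[k] = v
	}
	return merged
}

func main() {
	fromFile1 := map[string]string{"rack": "r1", "zone": "a"}
	fromFile2 := map[string]string{"zone": "b", "role": "db"}

	// Overwriting would keep only fromFile2 and lose "rack".
	fmt.Println("overwrite:", fromFile2)
	fmt.Println("merge:", mergeMaps(fromFile1, fromFile2))
}
```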
Kyle Havlovitz
12bf61c851
Merge branch 'master' into autopilot-refactor
2017-12-13 11:54:32 -08:00
Kyle Havlovitz
d6b266c045
A few last autopilot adjustments
2017-12-13 11:19:17 -08:00
Kyle Havlovitz
2310687c1d
More autopilot reorganizing
2017-12-13 10:57:37 -08:00
James Phillips
46742a5041
Adds TODOs referencing #3744.
2017-12-13 10:52:06 -08:00
James Phillips
2892f91d0b
Copies the autopilot settings from the runtime config.
...
Fixes #3730
2017-12-13 10:32:05 -08:00
Kyle Havlovitz
b92f895c23
More refactoring to make autopilot consul-agnostic
2017-12-12 17:46:28 -08:00
Yoann Fouquet
986148cfe5
[Fix] Service tags not added to health checks
...
Since commit 9685bdcd0b, service tags are added to the health checks.
Otherwise, when adding a service, tags are not added to its check.
In updateSyncState, we compare the checks of the local agent with the checks of the catalog.
It appears that the service tags are different (missing in one case), and so the check is synchronized.
That increases the ModifyIndex periodically even when nothing changes.
Fixed it by adding serviceTags to the check.
Note that the issue appeared in version 0.8.2.
Looks related to #3259.
2017-12-12 13:39:37 +01:00
Kyle Havlovitz
de28555671
Move autopilot to a standalone package
2017-12-11 16:45:33 -08:00
James Phillips
d12e81860f
Moves Serf helper into lib to fix import cycle in consul-enterprise.
2017-12-07 16:57:58 -08:00
James Phillips
5065f3d82e
Turns off intent queue warnings and enables dynamic queue sizing.
2017-12-07 16:27:06 -08:00
Wei Wei
cc9648c957
fix globalRPC goroutine leak
...
Signed-off-by: Wei Wei <weiwei.inf@gmail.com>
2017-12-05 11:53:30 +08:00
James Phillips
3e46544085
Creates a registration mechanism for snapshot and restore.
2017-11-29 18:36:53 -08:00
James Phillips
f53f521072
Begins split out of snapshots from the main FSM class.
2017-11-29 18:36:53 -08:00
James Phillips
c8e763667f
Creates a registration mechanism for FSM commands.
2017-11-29 18:36:53 -08:00
James Phillips
78292662d7
Moves the FSM into its own package.
...
This will help make it clearer what happens when we add some registration
plumbing for the different operations and snapshots.
2017-11-29 18:36:53 -08:00
James Phillips
e810697e06
Resolves an FSM snapshot TODO.
...
This adds checks for sink write calls before we continue the refactor, which
will resolve the other TODO comment we deleted as part of this change.
2017-11-29 18:36:53 -08:00
James Phillips
aa61159b74
Creates a registration mechanism for schemas.
...
This also splits out the registration into the table-specific source
files.
2017-11-29 18:36:52 -08:00
James Phillips
93ff33b1be
Creates a registration mechanism for RPC endpoints.
2017-11-29 18:36:52 -08:00