Commit Graph

47 Commits

Author SHA1 Message Date
Freddy c9349e353b Avoid panic on concurrent writes to cached service config map (#10647)
If multiple instances of a service are co-located on the same node then
their proxies will all share a cache entry for their resolved service
configuration. This is because the cache key contains the name of the
watched service but does not take into account the ID of the watching
proxies.

This means that there will be multiple agent service manager watches
that can wake up on the same cache update. These watchers then
concurrently modify the value in the cache when merging the resolved
config into the local proxy definitions.

To avoid this concurrent map write we will only delete the key from
opaque config in the local proxy definition after the merge, rather
than from the cached value before the merge.
2021-07-20 16:10:37 +00:00
Freddy 168073c4dc Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 20:39:37 +00:00
freddygv 7bd51ff536 Replace TransparentProxy bool with ProxyMode
This PR replaces the original boolean used to configure transparent
proxy mode. It was replaced with a string mode that can be set to:

- "": Empty string is the default for when the setting should be
defaulted from other configuration like config entries.
- "direct": Direct mode is how applications originally opted into the
mesh. Proxy listeners need to be dialed directly.
- "transparent": Transparent mode enables configuring Envoy as a
transparent proxy. Traffic must be captured and redirected to the
inbound and outbound listeners.

This PR also adds a struct for transparent proxy specific configuration.
Initially this is not stored as a pointer. Will revisit that decision
before GA.
2021-04-12 09:35:14 -06:00
freddygv a6388c7e2f Revert "Avoid accumulating synthetic upstreams"
This reverts commit 86672df4fa.
2021-04-07 14:30:30 -06:00
freddygv 86672df4fa Avoid accumulating synthetic upstreams
Synthetic upstreams from service-defaults config are stored locally in
the Upstreams list. Since these come from service-defaults they should
be cleaned up locally when no longer present in the service config
response.
2021-04-07 09:32:48 -06:00
freddygv 986bcccbea Pass down upstream defaults to client proxies
This is needed in case the client proxy is in TransparentProxy mode.
Typically they won't have explicit configuration for every upstream, so
this ensures the settings can be applied to all of them when generating
xDS config.
2021-04-07 09:32:47 -06:00
freddygv 2b49cc39ed Fixup doc phrasing 2021-04-07 09:32:47 -06:00
freddygv 52bf00de8b Split up normalizing from defaulting values for upstream cfg 2021-03-17 21:37:55 -06:00
Freddy 8207b832df
Add TransparentProxy option to proxy definitions 2021-03-17 17:01:45 -06:00
freddygv 770c5552d6 Update service manager to pass MeshGateway with config req 2021-03-15 16:08:03 -06:00
freddygv 6090cfcf68 PR comments 2021-03-15 16:02:03 -06:00
freddygv df1f3995f8 Update service manager to store centrally configured upstreams 2021-03-11 11:37:21 -07:00
freddygv 6fd30d0384 Add TransparentProxy opt to proxy definition 2021-03-11 11:37:21 -07:00
freddygv acec711a6a Update server-side config resolution and client-side merging 2021-03-10 21:05:11 -07:00
Daniel Nephin 8c45f2c1fe agent: use the new lib/mutex for stateLock
Previously the ServiceManager had to run a separate goroutine so that it could block on a channel
send/receive instead of a lock. Using this mutex with TryLock allows us to cancel the lock when
the serviceConfigWatch is stopped.

Without this change removing the ServiceManager.Start goroutine would not be possible because
when AddService is called it acquires the stateLock. While that lock is held, if there are
existing watches for the service, the old watch will be stopped, and the goroutine holding the
lock will attempt to wait for that watcher goroutine to exit.

If the goroutine is handling an update (serviceConfigWatch.handleUpdate) then it can block on
acquiring the stateLock and deadlock the agent.  With this change the context is cancelled as
and the goroutine will exit instead of waiting on the stateLock.
2021-01-25 18:01:47 -05:00
Daniel Nephin 4e42b8f1d4 agent: remove ServiceManager goroutine
The ServiceManager.Start goroutine was used to serialize calls to
agent.addServiceInternal.

All the goroutines which sent events to the channel would block waiting
for a response from that same goroutine, which is effectively the same
as a synchronous call without any channels.

This commit removes the goroutine and channels, and instead calls
addServiceInternal directly. Since all of these goroutines will need to
take the agent.stateLock, the mutex handles the serializing of calls.
2021-01-25 18:01:47 -05:00
Daniel Nephin 45b315060a agent: Minor cosmetic changes in ServiceManager
Also use the non-deprecated func in a test
2021-01-25 18:01:47 -05:00
Daniel Nephin 8b67231e8c agent: update godoc for AddServiceRequest 2021-01-25 18:01:03 -05:00
Daniel Nephin f34703fc63 agent: move checkStateSnapshot
Move the field into the struct for addServiceLocked. Also don't require
setting a default value, so that the callers can leave it as nil if they
don't already have a snapshot.
2021-01-25 18:01:03 -05:00
Daniel Nephin 1495291054 agent: move two fields off of AddServiceRequest 2021-01-25 18:01:03 -05:00
Daniel Nephin 60c6a1c220 agent: Replace two fields on AddServiceRequest with a func field
The two previous fields were mutually exclusive. They can be represented
with a single function which provides the value.
2021-01-25 18:01:03 -05:00
Daniel Nephin d0386ae025 agent: remove serviceRegiration type
Replace with the existing AddServiceRequest struct. These structs are
almost identical. Additionally, the only reason the serviceRegistration
struct existed was to recreate an AddServiceRequest.

By storing and re-using the AddServiceRequest we remove the need to
translate into one type and back to the original type.

We also remove the extra parameters to a function, because those values
are already available from the AddServiceRequest field.

Also a minor optimization to only call tokens.AgentToken() when
necessary. Previous it was being called every time, but the value was
being ignored if the AddServiceRequest had a token.
2021-01-25 18:01:03 -05:00
Daniel Nephin 5f9930a9be agent: remove an a branch in the AddService flow
Handle the decision to use ServiceManager in a single place. Instead of
calling ServiceManager.AddService, then calling back into
addServiceInternal, only call ServiceManager.AddService if we are going
to use it.

This change removes some small duplication and removes a branch from the
AddService flow.
2021-01-25 18:01:03 -05:00
Daniel Nephin 4da0718c57 agent: use fields directly, not temp variables
The temprorary variables make it much harder to trace where and how struct
fields are used. If a field is only used a small number of times than
refer to the field directly.
2021-01-25 17:25:04 -05:00
Daniel Nephin 5b6f806f4f agent: addServiceIternalRequest
Move fields that are only relevant for addServiceInternal onto a new
struct and embed AddServiceRequest.
2021-01-25 17:25:04 -05:00
Daniel Nephin 3d39359bcb agent: move deprecated AddServiceFromSource to a test file
The method is only used in tests, and only exists for legacy calls.

There was one other package which used this method in tests. Export
the AddServiceRequest and a couple of its fields so the new function can
be used in those tests.
2021-01-25 17:25:03 -05:00
R.B. Boyer 7eef25daf5
agent: when enable_central_service_config is enabled ensure agent reload doesn't revert check state to critical (#8747)
Likely introduced when #7345 landed.
2020-09-24 16:24:04 -05:00
Daniel Nephin 73cd0b6fac agent/service_manager: remove 'updateCh' field from serviceConfigWatch
Passing the channel to the function which uses it significantly
reduces the scope of the variable, and makes its usage more explicit. It
also moves the initialization of the channel closer to where it is used.

Also includes a couple very small cleanups to remove a local var and
read the error from `ctx.Err()` directly instead of creating a channel
to check for an error.
2020-06-16 12:15:57 -04:00
Daniel Nephin 26291a8482 agent/service_manager: remove 'defaults' field from serviceConfigWatch
This field was always read by the same function that populated the field,
so it does not need to be a field. Passing the value as an argument to
functions makes it more obvious where the value comes from, and also reduces
the scope of the variable significantly.
2020-06-16 12:15:52 -04:00
Daniel Nephin 93235da253 agent/service_manager: Pass ctx around
[The documentation for context](https://golang.org/pkg/context/)
recommends not storing context in a struct field:

> Do not store Contexts inside a struct type; instead, pass a Context
> explicitly to each function that needs it. The Context should be the
> first parameter, typically named ctx...

Sometimes there are good reasons to not follow this recommendation, but
in this case it seems easy enough to follow.

Also moved the ctx argument to be the first in one of the function calls
to follow the same recommendation.
2020-06-16 12:14:00 -04:00
Matt Keeler 8837907de4
Make the Agent Cache more Context aware (#8092)
Blocking queries issues will still be uncancellable (that cannot be helped until we get rid of net/rpc). However this makes it so that if calling getWithIndex (like during a cache Notify go routine) we can cancell the outer routine. Previously it would keep issuing more blocking queries until the result state actually changed.
2020-06-15 11:01:25 -04:00
Freddy 18d356899c
Enable CLI to register terminating gateways (#7500)
* Enable CLI to register terminating gateways

* Centralize gateway proxy configuration
2020-03-26 10:20:56 -06:00
Pierre Souchay 864f7efffa
agent: configuration reload preserves check's statuses for services (#7345)
This fixes issue #7318

Between versions 1.5.2 and 1.5.3, a regression has been introduced regarding health
of services. A patch #6144 had been issued for HealthChecks of nodes, but not for healthchecks
of services.

What happened when a reload was:

1. save all healthcheck statuses
2. cleanup everything
3. add new services with healthchecks

In step 3, the state of healthchecks was taken into account locally,
so at step 3, but since we cleaned up at step 2, state was lost.

This PR introduces the snap parameter, so step 3 can use information from step 1
2020-03-09 12:59:41 +01:00
Chris Piraino 401221de58
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
Matt Keeler c09693e545
Updates to Config Entries and Connect for Namespaces (#7116) 2020-01-24 10:04:58 -05:00
Matt Keeler 5934f803bf
Sync of OSS changes to support namespaces (#6909) 2019-12-09 21:26:41 -05:00
Freddy fdd10dd8b8
Expose HTTP-based paths through Connect proxy (#6446)
Fixes: #5396

This PR adds a proxy configuration stanza called expose. These flags register
listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only
listening on the loopback interface, while still accepting traffic from non
Connect-enabled services.

Under expose there is a boolean checks flag that would automatically expose all
registered HTTP and gRPC check paths.

This stanza also accepts a paths list to expose individual paths. The primary
use case for this functionality would be to expose paths for third parties like
Prometheus or the kubelet.

Listeners for requests to exposed paths are be configured dynamically at run
time. Any time a proxy, or check can be registered, a listener can also be
created.

In this initial implementation requests to these paths are not
authenticated/encrypted.
2019-09-25 20:55:52 -06:00
R.B. Boyer 5882e21b2b
agent: tolerate more failure scenarios during service registration with central config enabled (#6472)
Also:

* Finished threading replaceExistingChecks setting (from GH-4905)
  through service manager.

* Respected the original configSource value that was used to register a
  service or a check when restoring persisted data.

* Run several existing tests with and without central config enabled
  (not exhaustive yet).

* Switch to ioutil.ReadFile for all types of agent persistence.
2019-09-24 10:04:48 -05:00
Aestek 7c7b7f24fd Add option to register services and their checks idempotently (#4905) 2019-09-02 09:38:29 -06:00
R.B. Boyer 913d85ea5b
connect: allow mesh gateways to use central config (#6302) 2019-08-09 15:07:01 -05:00
Matt Keeler 8d953f5840 Implement Mesh Gateways
This includes both ingress and egress functionality.
2019-07-01 16:28:30 -04:00
Paul Banks 0cfb6051ea Add integration test for central config; fix central config WIP (#5752)
* Add integration test for central config; fix central config WIP

* Add integration test for central config; fix central config WIP

* Set proxy protocol correctly and begin adding upstream support

* Add upstreams to service config cache key and start new notify watcher if they change.

This doesn't update the tests to pass though.

* Fix some merging logic get things working manually with a hack (TODO fix properly)

* Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway

* Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does

* Fix up service manageer and API test failures

* Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally

* Remove version.go hack - will make integration test fail until release

* Remove unused code from commands and upstream merge

* Re-bump version to 1.5.0
2019-05-01 16:39:31 -07:00
Kyle Havlovitz cba47aa0ca Fix a race in the ready logic 2019-04-24 06:48:11 -07:00
Kyle Havlovitz c269369760 Make central service config opt-in and rework the initial registration 2019-04-24 06:11:08 -07:00
Kyle Havlovitz b58572afbd Fix a race in the service updates 2019-04-23 03:31:24 -07:00
Kyle Havlovitz 88e1d8ce03 Fill out the service manager functionality and fix tests 2019-04-23 00:17:28 -07:00
Kyle Havlovitz 7c25869e67 Add the service registration manager to the agent 2019-04-23 00:17:27 -07:00