* Add function to get update channel for watching HCP Link
* Add MonitorHCPLink function
This function can be called in a goroutine to manage the lifecycle
of the HCP manager.
* Update HCP Manager config in link monitor before starting
This updates HCPMonitorLink so it updates the HCP manager
with an HCP client and management token when a Link is upserted.
* Let MonitorHCPManager handle lifecycle instead of link controller
* Remove cleanup from Link controller and move it to MonitorHCPLink
Previously, the Link Controller was responsible for cleaning up the
HCP-related files on the file system. This change makes it so
MonitorHCPLink handles this cleanup. As a result, we are able to remove
the PlacementEachServer placement strategy for the Link controller
because it no longer needs to do this per-node cleanup.
* Remove HCP Manager dependency from Link Controller
The Link controller does not need to have HCP Manager
as a dependency anymore, so this removes that dependency
in order to simplify the design.
* Add Linked prefix to Linked status variables
This is in preparation for adding a new status type to the
Link resource.
* Add new "validated" status type to link resource
The link resource controller will now set a "validated" status
in addition to the "linked" status. This is needed so that other
components (eg the HCP manager) know when the Link is ready to link
with HCP.
* Fix tests
* Handle new 'EndOfSnapshot' WatchList event
* Fix watch test
* Remove unnecessary config from TestAgent_scadaProvider
Since the Scada provider is now started on agent startup
regardless of whether a cloud config is provided, this removes
the cloud config override from the relevant test.
This change is not exactly related to the changes from this PR,
but rather is something small and sort of related that was noticed
while working on this PR.
* Simplify link watch test and remove sleep from link watch
This updates the link watch test so that it uses more mocks
and does not require setting up the infrastructure for the HCP Link
controller.
This also removes the time.Sleep delay in the link watcher loop in favor
of an error counter. When we receive 10 consecutive errors, we shut down
the link watcher loop.
* Add better logging for link validation. Remove EndOfSnapshot test.
* Refactor link monitor test into a table test
* Add some clarifying comments to link monitor
* Simplify link watch test
* Test a bunch more errors cases in link monitor test
* Use exponential backoff instead of errorCounter in LinkWatch
* Move link watch and link monitor into a single goroutine called from server.go
* Refactor HCP link watcher to use single go-routine.
Previously, if the WatchClient errored, we would've never recovered
because we never retry to create the stream. With this change,
we have a single goroutine that runs for the life of the server agent
and if the WatchClient stream ever errors, we retry the creation
of the stream with an exponential backoff.
Creates a new controller to create ComputedImplicitDestinations resources by
composing ComputedRoutes, Services, and ComputedTrafficPermissions to
infer all ParentRef services that could possibly send some portion of traffic to a
Service that has at least one accessible Workload Identity. A followup PR will
rewire the sidecar controller to make use of this new resource.
As this is a performance optimization, rather than a security feature the following
aspects of traffic permissions have been ignored:
- DENY rules
- port rules (all ports are allowed)
Also:
- Add some v2 TestController machinery to help test complex dependency mappers.
Previously calling `index.New` would return an object with the index information such as the Indexer, whether it was required, and the name of the index as well as a radix tree to store indexed data.
Now the main `Index` type doesn’t contain the radix tree for indexed data. Instead the `IndexedData` method can be used to combine the main `Index` with a radix tree in the `IndexedData` structure.
The cache still only allows configuring the `Index` type and will invoke the `IndexedData` method on the provided indexes to get the structure that the cache can use for actual data management.
All of this makes it now safe to reuse the `index.Index` types.
Decouple xds capacity controller and autopilot
This prevents a potential bug where autopilot deadlocks while attempting
to execute `AutopilotDelegate.NotifyState()` on an xdscapacity controller
that stopped consuming messages.
* Update cache section for certain API calls
Providing more detail to the cache section to address behavior of API calls using streaming backend. This will help users understand that when '?index' is used and '?cached' is not, caching to servers will be bypassed, causing entry_fetch_max_burst and entry_fetch_rate to not be used in this scenario.
* Update website/content/docs/agent/config/config-files.mdx
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
---------
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* Skip collecting data directory stats in dev mode
In dev mode, the data directory is not set, so this metrics collection would
always fail and logs errors.
* Log collection errors at DEBUG level
There isn't much action a user can take to fix these errors, so
logging them as DEBUG rather than ERROR.
* Panic when controllers attempt to make invalid requests to the resource service
This will help to catch bugs in tests that could cause infinite errors to be emitted.
* Disable the API GW v2 controller
With the previous commit, this would cause a server to panic due to watching a type which has not yet been created/registered.
* Ensure that a test server gets the full type registry instead of constructing its own
* Skip TestServer_ControllerDependencies
* Fix peering tests so that they use the full resource registry.
* NET-7630 - Fix TXT record creation on node queries
* NET-7631 - Fix Node records that point to external/ non-IP addresses
* NET-7630 - Fix TXT record creation on node queries
Fix issue with persisting proxy-defaults
This resolves an issue introduced in hashicorp/consul#19829
where the proxy-defaults configuration entry with an HTTP protocol
cannot be updated after it has been persisted once and a router
exists. This occurs because the protocol field is not properly
pre-computed before being passed into validation functions.
* DNS V2 - Revise discovery result to have service and node name and address fields.
* NET-7488 - dns v2 add support for prepared queries in catalog v1 data model (#20470)
NET-7488 - dns v2 add support for prepared queries in catalog v1 data model.
The endpoints controller currently encodes the list of unique workload identities
referenced by all workload matched by a Service into a special data-bearing
status condition on that Service. This allows a downstream controller to avoid an
expensive watch on the ServiceEndpoints type just to get this data.
The current encoding does not lend itself well to machine parsing, which is what
the field is meant for, so this PR simplifies the encoding from:
"blah blah: " + strings.Join(ids, ",") + "."
to
strings.Join(ids, ",")
It also provides an exported utility function to easily extract this data.
The new controller caches are initialized before the DependencyMappers or the
Reconciler run, but importantly they are not populated. The expectation is that
when the WatchList call is made to the resource service it will send an initial
snapshot of all resources matching a single type, and then perpetually send
UPSERT/DELETE events afterward. This initial snapshot will cycle through the
caching layer and will catch it up to reflect the stored data.
Critically the dependency mappers and reconcilers will race against the restoration
of the caches on server startup or leader election. During this time it is possible a
mapper or reconciler will use the cache to lookup a specific relationship and
not find it. That very same reconciler may choose to then recompute some
persisted resource and in effect rewind it to a prior computed state.
Change
- Since we are updating the behavior of the WatchList RPC, it was aligned to
match that of pbsubscribe and pbpeerstream using a protobuf oneof instead of the enum+fields option.
- The WatchList rpc now has 3 alternating response events: Upsert, Delete,
EndOfSnapshot. When set the initial batch of "snapshot" Upserts sent on a new
watch, those operations will be followed by an EndOfSnapshot event before beginning
the never-ending sequence of Upsert/Delete events.
- Within the Controller startup code we will launch N+1 goroutines to execute WatchList
queries for the watched types. The UPSERTs will be applied to the nascent cache
only (no mappers will execute).
- Upon witnessing the END operation, those goroutines will terminate.
- When all cache priming routines complete, then the normal set of N+1 long lived
watch routines will launch to officially witness all events in the system using the
primed cached.