* Increase timeouts for flakey peering test.
* Various test fixes.
* Fix race condition in reconcilePeering.
This resolves an issue where a peering object in the state store was
incorrectly mutated by a function, resulting in the test being flagged as
failing when the -race flag was used.
* Implement In-Process gRPC for use by controller caching/indexing
This replaces the pipe base listener implementation we were previously using. The new style CAN avoid cloning resources which our controller caching/indexing is taking advantage of to not duplicate resource objects in memory.
To maintain safety for controllers and for them to be able to modify data they get back from the cache and the resource service, the client they are presented in their runtime will be wrapped with an autogenerated client which clones request and response messages as they pass through the client.
Another sizable change in this PR is to consolidate how server specific gRPC services get registered and managed. Before this was in a bunch of different methods and it was difficult to track down how gRPC services were registered. Now its all in one place.
* Fix race in tests
* Ensure the resource service is registered to the multiplexed handler for forwarding from client agents
* Expose peer streaming on the internal handler
* Add a make target to run lint-consul-retry on all the modules
* Cleanup sdk/testutil/retry
* Fix a bunch of retry.Run* usage to not use the outer testing.T
* Fix some more recent retry lint issues and pin to v1.4.0 of lint-consul-retry
* Fix codegen copywrite lint issues
* Don’t perform cleanup after each retry attempt by default.
* Use the common testutil.TestingTB interface in test-integ/tenancy
* Fix retry tests
* Update otel access logging extension test to perform requests within the retry block
* Adding explicit MPL license for sub-package
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at <Blog URL>, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
* Update copyright file headers to BUSL-1.1
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
* Fix namespaced peer service updates / deletes.
This change fixes a function so that namespaced services are
correctly queried when handling updates / deletes. Prior to this
change, some peered services would not correctly be un-exported.
* Add changelog.
Fix issue with peer stream node cleanup.
This commit encompasses a few problems that are closely related due to their
proximity in the code.
1. The peerstream utilizes node IDs in several locations to determine which
nodes / services / checks should be cleaned up or created. While VM deployments
with agents will likely always have a node ID, agentless uses synthetic nodes
and does not populate the field. This means that for consul-k8s deployments, all
services were likely bundled together into the same synthetic node in some code
paths (but not all), resulting in strange behavior. The Node.Node field should
be used instead as a unique identifier, as it should always be populated.
2. The peerstream cleanup process for unused nodes uses an incorrect query for
node deregistration. This query is NOT namespace aware and results in the node
(and corresponding services) being deregistered prematurely whenever it has zero
default-namespace services and 1+ non-default-namespace services registered on
it. This issue is tricky to find due to the incorrect logic mentioned in #1,
combined with the fact that the affected services must be co-located on the same
node as the currently deregistering service for this to be encountered.
3. The stream tracker did not understand differences between services in
different namespaces and could therefore report incorrect numbers. It was
updated to utilize the full service name to avoid conflicts and return proper
results.
Prior to this commit, all peer services were transmitted as connect-enabled
as long as a one or more mesh-gateways were healthy. With this change, there
is now a difference between typical services and connect services transmitted
via peering.
A service will be reported as "connect-enabled" as long as any of these
conditions are met:
1. a connect-proxy sidecar is registered for the service name.
2. a connect-native instance of the service is registered.
3. a service resolver / splitter / router is registered for the service name.
4. a terminating gateway has registered the service.
Protobuf Refactoring for Multi-Module Cleanliness
This commit includes the following:
Moves all packages that were within proto/ to proto/private
Rewrites imports to account for the packages being moved
Adds in buf.work.yaml to enable buf workspaces
Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml
Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes)
Why:
In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage.
There were some recent changes to have our own ratelimiting annotations.
The two combined were not working when I was trying to use them together (attempting to rebase another branch)
Buf workspaces should be the solution to the problem
Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root.
This resulted in proto file name conflicts in the Go global protobuf type registry.
The solution to that was to add in a private/ directory into the path within the proto/ directory.
That then required rewriting all the imports.
Is this safe?
AFAICT yes
The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc)
Other than imports, there were no changes to any generated code as a result of this.
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.
This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.
Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.
Fixes: https://github.com/hashicorp/consul/issues/15753
* Protobuf Modernization
Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf
Marshallers (protobuf and json) needed some changes to account for different APIs.
Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.
This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.
* Fix go-mod-tidy make target to work on all modules
During peer stream replication we flatten checks from the source cluster and build one thin overall check to hide the irrelevant details from the consuming cluster. This flattening logic did correctly flip to non-passing if there were any non-passing checks, but WHICH status it got during that was random (warn/error).
Also it didn't represent "maintenance" operations. There is an api package call AggregatedStatus which more correctly flattened check statuses.
This PR replicated the more complete logic into the peer stream package.
Re-add ServerExternalAddresses parameter in GenerateToken endpoint
This reverts commit 5e156772f6
and adds extra functionality to support newer peering behaviors.
* Backport agent tests.
Original commit: 0710b2d12fb51a29cedd1119b5fb086e5c71f632
Original commit: aaedb3c28bfe247266f21013d500147d8decb7cd (partial)
* Backport test fix and reduce flaky failures.
* Regenerate golden files.
* Backport from ENT: "Avoid race"
Original commit: 5006c8c858b0e332be95271ef9ba35122453315b
Original author: freddygv
* Backport from ENT: "chore: fix flake peerstream test"
Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8
Original author: DanStough
* peering: skip register duplicate node and check from the peer
* Prebuilt the nodes map and checks map to avoid repeated for loop
* use key type to struct: node id, service id, and check id
This commit adds a monotonically increasing nonce to include in peering
replication response messages. Every ack/nack from the peer handling a
response will include this nonce, allowing to correlate the ack/nack
with a specific resource.
At the moment nothing is done with the nonce when it is received. In the
future we may want to add functionality such as retries on NACKs,
depending on the class of error.
This commit adds handling so that the replication stream considers
whether the user intends to peer through mesh gateways.
The subscription will return server or mesh gateway addresses depending
on the mesh configuration setting. These watches can be updated at
runtime by modifying the mesh config entry.