20920 Commits

Author SHA1 Message Date
Ashesh Vidyut c5cce63777
NET 6761 (#19837)
NET-6761 explicit destinations tests updated
2023-12-12 10:38:00 +05:30
Valeriia Ruban a6d6164ba0
fix: remove test to unblock CI (#19908) 2023-12-11 20:11:36 -08:00
Ronald e13fbc743e
Remove warning for consul 1.17 deprecation (#19897) 2023-12-11 23:28:04 +00:00
Jeff Boruszak 659868ee73
docs: Updates to required ports (#19755)
* improvements

* Anchor link fixes

* Apply suggestions from code review

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Explicit list of six ports

* Apply suggestions from code review

---------

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2023-12-11 14:42:57 -08:00
Derek Menteer ccb2bf6170
Add documentation for proxy-config-map and xds_fetch_timeout_ms. (#19893)
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2023-12-11 15:53:35 -06:00
Ronald 195e3aab8c
[NET-6842] splitting go version on different lines (#19887) 2023-12-11 11:15:32 -05:00
Derek Menteer dfab5ade50
Fix ClusterLoadAssignment timeouts dropping endpoints. (#19871)
When a large number of upstreams are configured on a single envoy
proxy, there was a chance that it would timeout when waiting for
ClusterLoadAssignments. While this doesn't always immediately cause
issues, consul-dataplane instances appear to consistently drop
endpoints from their configurations after an xDS connection is
re-established (the server dies, random disconnect, etc).

This commit adds an `xds_fetch_timeout_ms` config to service registrations
so that users can set the value higher for large instances that have
many upstreams. The timeout can be disabled by setting a value of `0`.

This configuration was introduced to reduce the risk of causing a
breaking change for users if there is ever a scenario where endpoints
would never be received. Rather than just always blocking indefinitely
or for a significantly longer period of time, this config will affect
only the service instance associated with it.
2023-12-11 09:25:11 -06:00
John Murret 5ec84dbfd8
security: update supported envoy version 1.28.0 in addition to 1.25.11, 1.26.6, 1.27.2, 1.28.0 to address CVE-2023-44487 (#19879)
* update to support envoy 1.28.0

* add changelog

* update docs
2023-12-08 14:42:04 -07:00
Michael Zalimeni 1d9234a87a
ci: sanitize commit message for Slack failure alerts (#19876)
To ensure that shell code cannot be injected, capture the commit message
in an env var, then format it as needed.

Also fix several other issues with formatting and JSON escaping by
wrapping the entire message in a `toJSON` expression.
2023-12-08 16:04:45 -05:00
Derek Menteer 0ac958f27b
Fix xDS missing endpoint race condition. (#19866)
This fixes the following race condition:
- Send update endpoints
- Send update cluster
- Recv ACK endpoints
- Recv ACK cluster

Prior to this fix, it would have resulted in the endpoints NOT existing in
Envoy. This occurred because the cluster update implicitly clears the endpoints
in Envoy, but we would never re-send the endpoint data to compensate for the
loss, because we would incorrectly ACK the invalid old endpoint hash. Since the
endpoint's hash did not actually change, they would not be resent.

The fix for this is to effectively clear out the invalid pending ACKs for child
resources whenever the parent changes. This ensures that we do not store the
child's hash as accepted when the race occurs.

An escape-hatch environment variable `XDS_PROTOCOL_LEGACY_CHILD_RESEND` was
added so that users can revert back to the old legacy behavior in the event
that this produces unknown side-effects. Visit the following thread for some
extra context on why certainty around these race conditions is difficult:
https://github.com/envoyproxy/envoy/issues/13009

This bug report and fix was mostly implemented by @ksmiley with some minor
tweaks.

Co-authored-by: Keith Smiley <ksmiley@salesforce.com>
2023-12-08 11:37:12 -06:00
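The race and fix described in 0ac958f27b above can be modelled with a small hash tracker. This is a simplified sketch and not Consul's xDS implementation: the `tracker` type, its method names, and the resource names are all hypothetical. It shows only the idea that changing a parent drops the pending ACKs of its children, so a late child ACK cannot mark a stale hash as accepted.

```go
package main

import "fmt"

// tracker is a hypothetical, simplified model of per-resource hash tracking.
type tracker struct {
	pending  map[string]string   // resource -> hash sent, awaiting ACK
	accepted map[string]string   // resource -> hash the proxy has ACKed
	children map[string][]string // parent resource -> child resources
}

func newTracker(children map[string][]string) *tracker {
	return &tracker{
		pending:  map[string]string{},
		accepted: map[string]string{},
		children: children,
	}
}

// send records an update. Sending a parent invalidates the pending ACKs of
// its children, because the parent update implicitly clears them in the proxy.
func (t *tracker) send(name, hash string) {
	t.pending[name] = hash
	for _, child := range t.children[name] {
		delete(t.pending, child)
	}
}

// ack promotes a pending hash to accepted, if one is still pending.
func (t *tracker) ack(name string) {
	if hash, ok := t.pending[name]; ok {
		t.accepted[name] = hash
		delete(t.pending, name)
	}
}

// needsSend reports whether the resource must be sent (or sent again).
func (t *tracker) needsSend(name, hash string) bool {
	return t.accepted[name] != hash && t.pending[name] != hash
}

func main() {
	tr := newTracker(map[string][]string{"cluster/web": {"endpoints/web"}})
	tr.send("endpoints/web", "e1") // Send update endpoints
	tr.send("cluster/web", "c1")   // Send update cluster
	tr.ack("endpoints/web")        // Recv ACK endpoints (ignored: no longer pending)
	tr.ack("cluster/web")          // Recv ACK cluster
	fmt.Println("resend endpoints:", tr.needsSend("endpoints/web", "e1"))
}
```

In this model the final check reports that the endpoints must be sent again, which is the behaviour the fix aims for. Without the loop in `send`, the endpoint ACK would be stored as accepted and the unchanged hash would suppress the resend.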
cskh 0ca070b301
upgrade test(LTS): add segments to version 1.10 (#19861) 2023-12-08 12:22:16 -05:00
Matt Keeler d4fda945bb
Fix a test flake where a retry timer was being reused causing tests after the first to exit early (#19864)
Fix a test flake where a retry timer was being reused causing tests after the first to exit too early.
2023-12-08 11:31:59 -05:00
Thomas Eckert 8125a32a4e
Add CE version of Gateway Upstream Disambiguation (#19860)
* Add CE version of gateway-upstream-disambiguation

* Use NamespaceOrDefault and PartitionOrDefault

* Add Changelog entry

* Remove the unneeded reassignment

* Use c.ID()
2023-12-07 17:56:14 -05:00
Dhia Ayachi d93f7f730d
parse config protocol on write to optimize disco-chain compilation (#19829)
* parse config protocol on write to optimize disco-chain compilation

* add changelog
2023-12-07 13:46:46 -05:00
Matt Keeler bfad6a4e07
Ensure that the default namespace always exists even prior to resource creation (#19852) 2023-12-07 13:23:06 -05:00
Poonam Jadhav 06b3038643
Net-6730/namespace intg test (#19798)
test: add intg test for namespace lifecycle
2023-12-07 13:12:45 -05:00
Michael Zalimeni 645cbf9098
chore: update changelog for patch releases (#19855)
* 1.16.3
* 1.15.7
* 1.14.11
2023-12-07 12:43:33 -05:00
Tauhid Anjum ab68ddff91
NET-6784: Adding cli command to list exported services to a peer (#19821)
* Adding cli command to list exported services to a peer

* Changelog added

* Addressing docs comments

* Adding test case for no exported services scenario
2023-12-07 12:55:15 +05:30
Michael Zalimeni 3a78446114
ci: fix escaping for Slack failure notifications (#19838)
Allow '()', '#', and other bash-interpretable special characters by
properly quoting the commit message when shortening.
2023-12-06 21:00:30 +00:00
cskh 04d4412afd
NET-6643: upgrade test from 1.10 to 1.15 (lts) of a single cluster (#19847)
* NET-6643: upgrade test from 1.10 to 1.15 (lts) of a single cluster

* license header
2023-12-06 19:45:37 +00:00
Ronald 053367a3b2
[NET-6650] Bump go version to 1.20.12 (#19840) 2023-12-06 13:22:00 -05:00
Jared Kirschner d3e658b0e7
improve client RPC metrics consistency (#19721)
The client.rpc metric now excludes internal retries for consistency
with client.rpc.exceeded and client.rpc.failed. All of these metrics
now increment at most once per RPC method call, allowing for
accurate calculation of failure / rate limit application occurrence.

Additionally, if an RPC fails because no servers are present,
client.rpc.failed is now incremented.
2023-12-06 13:21:08 -05:00
Matt Keeler efe279f802
Retry lint fixes (#19151)
* Add a make target to run lint-consul-retry on all the modules
* Cleanup sdk/testutil/retry
* Fix a bunch of retry.Run* usage to not use the outer testing.T
* Fix some more recent retry lint issues and pin to v1.4.0 of lint-consul-retry
* Fix codegen copywrite lint issues
* Don’t perform cleanup after each retry attempt by default.
* Use the common testutil.TestingTB interface in test-integ/tenancy
* Fix retry tests
* Update otel access logging extension test to perform requests within the retry block
2023-12-06 12:11:32 -05:00
Ronald dc02fa695f
[NET-6251] Nomad client templated policy (#19827) 2023-12-06 10:32:12 -05:00
aahel 334de1460c
update l7explicit dest test to test cross tenancy (#19834) 2023-12-06 06:42:19 +00:00
Ashesh Vidyut 6c88122fdb
NET-3860 - [Supportability] consul troubleshoot CLI for verifying ports (#18329)
* init

* udp

* added support for custom port

* removed grpc

* rename constants

* removed udp

* added change log

* fix synopsis

* pr comment changes

* make private

* added tests

* added one more test case

* defer close results channel

* removed unwanted comment

* licence update

* updated docs

* fix indent

* fix path

* example update

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update command/troubleshoot/ports/troubleshoot_ports.go

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/index.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update command/troubleshoot/ports/troubleshoot_ports.go

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update command/troubleshoot/ports/troubleshoot_ports.go

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/commands/troubleshoot/ports.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* pr comment resolved

---------

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2023-12-06 11:12:15 +05:30
cskh b5edf5cd10
doc: clarify the portNames used in trafficpermission V2 (#19807)
* doc: clarify the portNames used in trafficpermission V2 and fix broken links and examples
2023-12-05 19:21:52 +00:00
Semir Patel c1bbda8128
resource: block default namespace deletion + test refactorings (#19822) 2023-12-05 14:00:06 -05:00
Michael Zalimeni aca8a185ca
ci: fix test failure Slack notifications (#19766)
- Skip notifications for cancelled workflows. Cancellation can be
manual or caused by branch concurrency limits.
- Fix multi-line JSON parsing error by only printing the summary line
of the commit message. We do not need more than this in Slack.
- Update Slack webhook name to match purpose.
2023-12-05 10:24:04 -05:00
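The second point in aca8a185ca above, printing only the summary line of a commit message, amounts to cutting the message at its first line break. A minimal Go sketch follows. The function name `summaryLine` is hypothetical, and the actual workflow does this in CI scripting rather than in Go.

```go
package main

import (
	"fmt"
	"strings"
)

// summaryLine returns the first line of a commit message with surrounding
// whitespace removed (including a trailing carriage return from CRLF input).
func summaryLine(msg string) string {
	line, _, _ := strings.Cut(msg, "\n")
	return strings.TrimSpace(line)
}

func main() {
	msg := "ci: fix test failure Slack notifications (#19766)\n\n- Skip notifications for cancelled workflows."
	fmt.Println(summaryLine(msg))
}
```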
aahel 649aa5655f
skip TestCatalogUpgrade for consul versions < 1.18.0 (#19811)
skip TestCatalogUpgrade for consul versions < 1.18.0
2023-12-04 18:27:36 +00:00
lornasong edf4610ed9
[Cloud][CC-6925] Updates to pushing server state (#19682)
* Upgrade hcp-sdk-go to latest version v0.73

Changes:
- go get github.com/hashicorp/hcp-sdk-go
- go mod tidy

* From upgrade: regenerate protobufs for upgrade from 1.30 to 1.31

Ran: `make proto`

Slack: https://hashicorp.slack.com/archives/C0253EQ5B40/p1701105418579429

* From upgrade: fix mock interface implementation

After upgrading, there is the following compile error:

cannot use &mockHCPCfg{} (value of type *mockHCPCfg) as "github.com/hashicorp/hcp-sdk-go/config".HCPConfig value in return statement: *mockHCPCfg does not implement "github.com/hashicorp/hcp-sdk-go/config".HCPConfig (missing method Logout)

Solution: update the mock to have the missing Logout method

* From upgrade: Lint: remove usage of deprecated req.ServerState.TLS

Due to upgrade, linting is erroring due to usage of a newly deprecated field

22:47:56 [consul]: make lint
--> Running golangci-lint (.)
agent/hcp/testing.go:157:24: SA1019: req.ServerState.TLS is deprecated: use server_tls.internal_rpc instead. (staticcheck)
                time.Until(time.Time(req.ServerState.TLS.CertExpiry)).Hours()/24,
                                     ^

* From upgrade: adjust oidc error message

From the upgrade, this test started failing:

=== FAIL: internal/go-sso/oidcauth TestOIDC_ClaimsFromAuthCode/failed_code_exchange (re-run 2) (0.01s)
    oidc_test.go:393: unexpected error: Provider login failed: Error exchanging oidc code: oauth2: "invalid_grant" "unexpected auth code"

Prior to the upgrade, the error returned was:
```
Provider login failed: Error exchanging oidc code: oauth2: cannot fetch token: 401 Unauthorized\nResponse: {\"error\":\"invalid_grant\",\"error_description\":\"unexpected auth code\"}\n
```

Now the error returned is as below and does not contain "cannot fetch token"
```
Provider login failed: Error exchanging oidc code: oauth2: "invalid_grant" "unexpected auth code"

```

* Update AgentPushServerState structs with new fields

HCP-side changes for the new fields are in:
https://github.com/hashicorp/cloud-global-network-manager-service/pull/1195/files

* Minor refactor for hcpServerStatus to abstract tlsInfo into struct

This will make it easier to set the same tls-info information to both
 - status.TLS (deprecated field)
 - status.ServerTLSMetadata (new field to use instead)

* Update hcpServerStatus to parse out information for new fields

Changes:
 - Improve error message and handling (encountered some issues and was confused)
 - Set new field TLSInfo.CertIssuer
 - Collect certificate authority metadata and set on TLSInfo.CertificateAuthorities
 - Set TLSInfo on both server.TLS and server.ServerTLSMetadata.InternalRPC

* Update serverStatusToHCP to convert new fields to GNM rpc

* Add changelog

* Feedback: connect.ParseCert, caCerts

* Feedback: refactor and unit test server status

* Feedback: test to use expected struct

* Feedback: certificate with intermediate

* Feedback: catch no leaf, remove expectedErr

* Feedback: update todos with jira ticket

* Feedback: mock tlsConfigurator
2023-12-04 10:25:18 -05:00
aahel 7936e55807
added node health resource (#19803) 2023-12-02 11:14:03 +05:30
Jeff Boruszak 65c06f67e6
docs: improvements to v2 catalog explanation (#19678)
* commit

* Addresses comments from review
2023-12-01 14:35:44 -08:00
Ashesh Vidyut 82f6a8d7f3
Net 6585 (#19797)
Add multi tenancy to sidecar proxy controller
2023-12-01 21:28:57 +05:30
aahel ac9261ac3e
made node partition scoped (#19794)
* made node partition scoped

* removed namespace from node testdata
2023-12-01 07:42:29 +00:00
Manoj Srinivasamurthy c9f85eb925
NET-6692: Ensure 'upload test results' step is always run (#19783) 2023-12-01 09:23:25 +05:30
emily neil 2eebdb22ba
Remove Duplicate UBI Tags (#19737)
- Amalgamate UBI with Dockerhub and Redhat tags into one step
- Avoids a production incident that errors on duplicate tags:
https://github.com/hashicorp/releng-support/issues/123
2023-11-30 14:49:40 -08:00
Semir Patel 2d1f308138
resource: add v2tenancy feature flag to deployer tests (#19774) 2023-11-30 11:41:30 -06:00
Matt Keeler 8f7f15e430
Pin lint-consul-retry to v1.3.0 (#19781)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2023-11-29 22:44:22 +00:00
Jeff Apple 790cb30173
Docs: FIPS - add cluster peering info (#19768)
* Docs: FIPS - add cluster peering info

* Update website/content/docs/enterprise/fips.mdx

Co-authored-by: David Yu <dyu@hashicorp.com>

* Update website/content/docs/enterprise/fips.mdx

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* Update website/content/docs/enterprise/fips.mdx

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: David Yu <dyu@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2023-11-29 13:08:47 -08:00
John Maguire 69b1d2072b
[V2] Move resource field on gateway class config from repeated map to single map (#19773)
Move resource field on gateway class config from repeated map to single
map
2023-11-29 18:12:42 +00:00
Michael Zalimeni 54f13ebaa5
docs: Rename locality docs observe section to verification (#19769)
* docs: Rename locality docs observe section to verification

Follow-up to #19605 review.

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2023-11-29 17:16:51 +00:00
Michael Zalimeni d1f2fa1841
[NET-6725] test: Address occasional flakes in sidecarproxy/controller_test.go (#19760)
test: Address occasional flakes in sidecarproxy/controller_test.go

We've observed an occasional flake in this test where some state check
fails. Adding in some wait wrappers to these state checks will hopefully
address the issue, assuming it is a simple flake.
2023-11-29 16:56:14 +00:00
John Maguire a0240e3794
[NET-5688] APIGateway UI Topology Fixes (#19657)
* Update catalog and ui endpoints to show APIGateway in gateway service
topology view

* Added initial implementation for service view

* updated ui

* Fix topology view for gateways

* Adding tests for gw controller

* remove unused args

* Undo formatting changes

* Fix call sites for upstream/downstream gw changes

* Add config entry tests

* Fix function calls again

* Move from ServiceKey to ServiceName, cleanup from PR review

* Add additional check for length of services in bound apigateway for
IsSame comparison

* fix formatting for proto

* gofmt

* Add DeepCopy for retrieved BoundAPIGateway

* gofmt

* gofmt

* Rename function to be more consistent
2023-11-28 21:27:14 +00:00
sarahalsmiller fd1d97c334
Add Kubebuilder tags to Gatewayclassconfig proto messages (#19725)
* add build tags/import k8s specific proto packages

* fix generated import paths

* fix gomod linting issue

* mod tidy every go mod file

* revert protobuf version, take care of in different pr

* cleaned up new lines

* added newline to end of file
2023-11-28 14:46:11 -06:00
hc-github-team-es-release-engineering 39136f46fe
license file updates (#19750) 2023-11-28 11:59:45 -08:00
Michael Zalimeni 66306a8ac2
[NET-5916] docs: Add locality examples and troubleshooting (#19605)
docs: Add locality examples and troubleshooting

Add further examples and tips for locality-aware routing configuration,
observability, and troubleshooting.
2023-11-28 19:15:24 +00:00
wangxinyi7 9dc24448ae
grpc client default in plaintext mode (#19412)
* grpc client default in plaintext mode

* renaming and fix linter

* update the description and remove the context

* trim tests
2023-11-28 10:58:57 -08:00
Thomas Eckert 419677cc9e
[NET-6420] Add MeshConfiguration Controller stub (#19745)
* Add meshconfiguration/controller

* Add MeshConfiguration Registration function

* Fix the TODOs on the RegisterMeshGateway function

* Call RegisterMeshConfiguration

* Add comment to MeshConfigurationRegistration

* Add a test for Reconcile and some comments
2023-11-28 18:56:07 +00:00
Chris S. Kim 5107764115
Move test setup out of subtest (#19753) 2023-11-28 18:39:37 +00:00