consul/test/integration/connect/envoy/windows-troubleshooting.md
Ashesh Vidyut 47d445d680
Envoy Integration Test Windows (#18007)
* [CONSUL-395] Update check_hostport and Usage (#40)

* [CONSUL-397] Copy envoy binary from Image (#41)

* [CONSUL-382] Support openssl in unique test dockerfile (#43)

* [CONSUL-405] Add bats to single container (#44)

* [CONSUL-414] Run Prometheus Test Cases and Validate Changes (#46)

* [CONSUL-410] Run Jaeger in Single container (#45)

* [CONSUL-412] Run test-sds-server in single container (#48)

* [CONSUL-408] Clean containers (#47)

* [CONSUL-384] Rebase and sync fork (#50)

* [CONSUL-415] Create Scenarios Troubleshooting Docs (#49)

* [CONSUL-417] Update Docs Single Container (#51)

* [CONSUL-428] Add Socat to single container (#54)

* [CONSUL-424] Replace pkill in kill_envoy function (#52)

* [CONSUL-434] Modify Docker run functions in Helper script (#53)

* [CONSUL-435] Replace docker run in set_ttl_check_state & wait_for_agent_service_register functions (#55)

* [CONSUL-438] Add netcat (nc) in the Single container Dockerfile (#56)

* [CONSUL-429] Replace Docker run with Docker exec (#57)

* [CONSUL-436] Curl timeout and run tests (#58)

* [CONSUL-443] Create dogstatsd Function (#59)

* [CONSUL-431] Update Docs Netcat (#60)

* [CONSUL-439] Parse nc Command in function (#61)

* [CONSUL-463] Review curl Exec and get_ca_root Func (#63)

* [CONSUL-453] Docker hostname in Helper functions (#64)

* [CONSUL-461] Test wipe volumes without extra cont (#66)

* [CONSUL-454] Check ports in the Server and Agent containers (#65)

* [CONSUL-441] Update windows dockerfile with version (#62)

* [CONSUL-466] Review case-grpc Failing Test (#67)

* [CONSUL-494] Review case-cfg-resolver-svc-failover (#68)

* [CONSUL-496] Replace docker_wget & docker_curl (#69)

* [CONSUL-499] Cleanup Scripts - Remove nanoserver (#70)

* [CONSUL-500] Update Troubleshooting Docs (#72)

* [CONSUL-502] Pull & Tag Envoy Windows Image (#73)

* [CONSUL-504] Replace docker run in docker_consul (#76)

* [CONSUL-505] Change admin_bind

* [CONSUL-399] Update envoy to 1.23.1 (#78)

* [CONSUL-510] Support case-wanfed-gw on Windows (#79)

* [CONSUL-506] Update troubleshooting Documentation (#80)

* [CONSUL-512] Review debug_dump_volumes Function (#81)

* [CONSUL-514] Add zipkin to Docker Image (#82)

* [CONSUL-515] Update Documentation (#83)

* [CONSUL-529] Support case-consul-exec (#86)

* [CONSUL-530] Update Documentation (#87)

* [CONSUL-530] Update default consul version 1.13.3

* [CONSUL-539] Cleanup (#91)

* [CONSUL-546] Scripts Clean-up (#92)

* [CONSUL-491] Support admin_access_log_path value for Windows (#71)

* [CONSUL-519] Implement mkfifo Alternative (#84)

* [CONSUL-542] Create OS Specific Files for Envoy Package (#88)

* [CONSUL-543] Create exec_supported.go (#89)

* [CONSUL-544] Test and Build Changes (#90)

* Implement os.DevNull

* using mmap instead of disk files

* fix import in exec-unix

* fix nmap open too many arguemtn

* go fmt on file

* changelog file

* fix go mod

* Update .changelog/17694.txt

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>

* different mmap library

* fix bootstrap json

* some fixes

* chocolatey version fix and image fix

* using different library

* fix Map funciton call

* fix mmap call

* fix tcp dump

* fix tcp dump

* windows tcp dump

* Fix docker run

* fix tests

* fix go mod

* fix version 16.0

* fix version

* fix version dev

* sleep to debug

* fix sleep

* fix permission issue

* fix permission issue

* fix permission issue

* fix command

* fix command

* fix funciton

* fix assert config entry status command not found

* fix command not found assert_cert_has_cn

* fix command not found assert_upstream_missing

* fix command not found assert_upstream_missing_once

* fix command not found get_upstream_endpoint

* fix command not found get_envoy_public_listener_once

* fix command not found

* fix test cases

* windows integration test workflow github

* made code similar to unix using npipe

* fix go.mod

* fix dialing of npipe

* dont wait

* check size of written json

* fix undefined n

* running

* fix dep

* fix syntax error

* fix workflow file

* windows runner

* fix runner

* fix from json

* fix runs on

* merge connect envoy

* fix cin path

* build

* fix file name

* fix file name

* fix dev build

* remove unwanted code

* fix upload

* fix bin name

* fix path

* checkout current branch

* fix path

* fix tests

* fix shell bash for windows sh files

* fix permission of run-test.sh

* removed docker dev

* added shell bash for tests

* fix tag

* fix win=true

* fix cd

* added dev

* fix variable undefined

* removed failing tests

* fix tcp dump image

* fix curl

* fix curl

* tcp dump path

* fix tcpdump path

* fix curl

* fix curl install

* stop removing intermediate containers

* fix tcpdump docker image

* revert -rm

* --rm=false

* makeing docker image before

* fix tcpdump

* removed case consul exec

* removed terminating gateway simple

* comment case wasm

* removed data dog

* comment out upload coverage

* uncomment case-consul-exec

* comment case consul exec

* if always

* logs

* using consul 1.17.0

* fix quotes

* revert quotes

* redirect to dev null

* Revert version

* revert consul connect

* fix version

* removed envoy connect

* not using function

* change log

* docker logs

* fix logs

* restructure bad authz

* rmeoved dev null

* output

* fix file descriptor

* fix cacert

* fix cacert

* fix ca cert

* cacert does not work in windows curl

* fix func

* removed docker logs

* added sleep

* fix tls

* commented case-consul-exec

* removed echo

* retry docker consul

* fix upload bin

* uncomment consul exec

* copying consul.exe to docker image

* copy fix

* fix paths

* fix path

* github workspace path

* latest version

* Revert "latest version"

This reverts commit 5a7d7b82d9e7553bcb01b02557ec8969f9deba1d.

* commented consul exec

* added ssl revoke best effort

* revert best effort

* removed unused files

* rename var name and change dir

* windows runner

* permission

* needs setup fix

* swtich to github runner

* fix file path

* fix path

* fix path

* fix path

* fix path

* fix path

* fix build paths

* fix tag

* nightly runs

* added matrix in github workflow, renamed files

* fix job

* fix matrix

* removed brackes

* from json

* without using job matrix

* fix quotes

* revert job matrix

* fix workflow

* fix comment

* added comment

* nightly runs

* removed datadog ci as it is already measured in linux one

* running test

* Revert "running test"

This reverts commit 7013d15a23732179d18ec5d17336e16b26fab5d4.

* pr comment fixes

* running test now

* running subset of test

* running subset of test

* job matrix

* shell bash

* removed bash shell

* linux machine for job matrix

* fix output

* added cat to debug

* using ubuntu latest

* fix job matrix

* fix win true

* fix go test

* revert job matrix

---------

Co-authored-by: Jose Ignacio Lorenzo <74208929+joselo85@users.noreply.github.com>
Co-authored-by: Franco Bruno Lavayen <cocolavayen@gmail.com>
Co-authored-by: Ivan K Berlot <ivanberlot@gmail.com>
Co-authored-by: Ezequiel Fernández Ponce <20102608+ezfepo@users.noreply.github.com>
Co-authored-by: joselo85 <joseignaciolorenzo85@gmail.com>
Co-authored-by: Ezequiel Fernández Ponce <ezequiel.fernandez@southworks.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2023-07-21 20:26:00 +05:30

6.6 KiB
Raw Blame History

Envoy Integration Tests on Windows

Index

About this Guide

On this guide you will find all the information required to run the Envoy integration tests on Windows.

Prerequisites

To run the integration tests yo will need to have the following installed on your System:

  • GO v1.18(or later).
  • Gotestsum library installation.
  • Docker.

Before running the tests, you will need to build the required Docker images, to do so, you can use the script provided here:

  • Build Images Script Execution
    • From a Bash console (GitBash or WSL) execute: ./build-images.sh

Running the Tests

To execute the tests you need to run the following command depending on the shell you are using:
On Powershell:
go test -v -timeout=30m -tags integration ./test/integration/connect/envoy -run="TestEnvoy/<TEST CASE>" -win=true
Where TEST CASE is the individual test case we want to execute (e.g. case-badauthz).

On Git Bash:
ENVOY_VERSION=<ENVOY VERSION> go test -v -timeout=30m -tags integration ./test/integration/connect/envoy -run="TestEnvoy/<TEST CASE>" -win=true
Where TEST CASE is the individual test case we want to execute (e.g. case-badauthz), and ENVOY VERSION is the version which you are currently testing.

Tip

When executing the integration tests using Powershell you may need to set the ENVOY_VERSION value manually in line 20 of the run-tests.windows.sh file.

Warning

When executing the integration tests for Windows environments, the End of Line Sequence of every related file and/or script will be changed from LF to CRLF.

About Envoy Integration Tests on Windows

Integration tests on Linux run a multi-container architecture that take advantage of the Host Network Docker feature, using this feature means that the container's network stack is not isolated from the Docker host (the container shares the hosts networking namespace), and the container does not get its own IP-address allocated (read more about this here). This feature is only available for Linux, which made migrating the tests to Windows challenging, since replicating the same architecture created more issues, that's why a single container architecture was chosen to run the Envoy integration tests.
Using a single container architecture meant that we could use the same tests as on linux, moreover we were able to speed-up their execution by replacing docker run commands which started utility containers, for docker exec commands.

Common errors

If the tests are executed without docker running, the following error will be seen:

error during connect: This error may indicate that the docker daemon is not running.: Post "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile-bats-windows&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&shmsize=0&t=bats-verify&target=&ulimits=null&version=1": open //./pipe/docker_engine: The system cannot find the file specified.

If any of the docker images does not exist or is mistagged, an error similar to the following will be displayed:

Error response from daemon: No such container: envoy_workdir_1

If you run the Windows tests from WSL you will get the following error message:

main_test.go:34: command failed: exec: "cmd": executable file not found in $PATH

Windows Scripts Changes

  • The "http-addr", "grpc-addr" and "admin-access-log-path" flags were added to the creation of the Envoy Bootstrap files.
  • To execute commands sh was replaced by bash on our Windows container.
  • All paths were updated to use Windows format.
  • Created stop_and_copy_files function to copy files into the shared volume (see volume issues).
  • Changed the -admin-bind value from 0.0.0.0 to 127.0.0.1 when generating the Envoy Bootstrap files.
  • Removed the && from the common_run_container_service's docker exec command and replaced it with **.
  • Removed docker_wget and docker_curl functions from helpers.windows.bash file and replaced them with docker_consul_exec, this way we avoid starting intermediate containers when capturing logs.
  • The function wipe_volumes uses a docker exec command instead of the original docker run, this way we speed up test execution by avoiding to start a new container just to delete volume content before each test run.
  • For case-grpc we increased the envoy_stats_flush_interval value from 1s to 5s, on Windows, the original value caused the test to pass or fail randomly.
  • For case-wanfed-gw a new script was created: global-setup-windows.sh, this file replaces global-setup.sh when running this test in Windows. The new script uses the windows/consul:local Docker image to generate the required TLS files and copies them into host's workdir directory.
  • To use the debug_dump_volumes function, you need to use it via Powershell and execute the following command: bash run-tests.windows.sh debug_dump_volumes Make sure to be positioned with your terminal in the correct directory.
  • For case-consul-exec this case can only be run when using the consul-dev Docker image on this repository, since it relies on features implemented only here. These features are: Windows valid default value for "-admin-access-log-path" and consul connect envoy command starts Envoy. This features have also been submitted in PR#15114.

Volume Issues

Another difference that arose when migrating the tests from Linux to Windows, is that file system operations can't be executed while Windows containers are running. Currently, when running the tests a named volume is created and all of the required files are copied into that volume. Because of the constraint mentioned before, the workaround we implemented was creating a function (stop_and_copy_files) that stops the kubernetes/pause container and executes a script to copy the required files and finally starts the container again.