From 8afcf9f1526f94b60c1879c86c532baae02ed059 Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 16:45:19 -0500
Subject: [PATCH 1/6] Language touch-ups for the checks docs.

---
 .../source/docs/agent/checks.html.markdown | 61 ++++++++++---------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index 0f84c32e5f..2ec67e973b 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -3,39 +3,39 @@ layout: "docs"
 page_title: "Check Definition"
 sidebar_current: "docs-agent-checks"
 description: |-
-  One of the primary roles of the agent is the management of system and application level health checks. A health check is considered to be application level if it associated with a service. A check is defined in a configuration file, or added at runtime over the HTTP interface.
+  One of the primary roles of the agent is management of system- and application-level health checks. A health check is considered to be application-level if it is associated with a service. A check is defined in a configuration file or added at runtime over the HTTP interface.
 ---

 # Checks

-One of the primary roles of the agent is the management of system and
-application level health checks. A health check is considered to be application
-level if it associated with a service. A check is defined in a configuration file,
-or added at runtime over the HTTP interface.
+One of the primary roles of the agent is management of system- and application-level health
+checks. A health check is considered to be application-level if it is associated with a
+service. A check is defined in a configuration file or added at runtime over the HTTP interface.

 There are three different kinds of checks:

  * Script + Interval - These checks depend on invoking an external application
-   that does the health check and exits with an appropriate exit code, potentially
-   generating some output. A script is paired with an invocation interval (e.g.
+   that performs the health check, exits with an appropriate exit code, and potentially
+   generates some output. A script is paired with an invocation interval (e.g.
    every 30 seconds). This is similar to the Nagios plugin system.

- * HTTP + Interval - These checks make an `HTTP GET` request every Interval (e.g.
-   every 30 seconds) to the specified URL. The status of the service depends on the HTTP Response Code.
-   any `2xx` code is passing, `429 Too Many Requests` is warning and anything else is failing.
-   This type of check should be preferred over a script that for example uses `curl`.
+ * HTTP + Interval - These checks make an HTTP `GET` request every Interval (e.g.
+   every 30 seconds) to the specified URL. The status of the service depends on the HTTP response code:
+   any `2xx` code is considered passing, a `429 Too Many Requests` is a warning, and anything else is a failure.
+   This type of check should be preferred over a script that uses `curl` or another external process
+   to check a simple HTTP operation.

  * Time to Live (TTL) - These checks retain their last known state for a given TTL.
    The state of the check must be updated periodically over the HTTP interface.
    If an external system fails to update the status within a given TTL, the check is
    set to the failed state. This mechanism is used to allow an application to
-   directly report its health. For example, a web app can periodically curl the
-   endpoint, and if the app fails, then the TTL will expire and the health check
+   directly report its health. For example, a healthy web app can periodically `PUT` a status
+   update to the HTTP endpoint; if the app fails, the TTL will expire and the health check
    enters a critical state. This is conceptually similar to a dead man's switch.

 ## Check Definition

-A check definition that is a script looks like:
+A script check:

 ```javascript
 {
@@ -48,7 +48,7 @@ A check definition that is a script looks like:
 }
 ```

-An HTTP based check looks like:
+A HTTP check:

 ```javascript
 {
@@ -61,7 +61,7 @@ An HTTP based check looks like:
 }
 ```

-A TTL based check is very similar:
+A TTL check:

 ```javascript
 {
@@ -74,18 +74,19 @@ A TTL based check is very similar:
 }
 ```

-Each type of definitions must include a `name`, and may optionally
+Each type of definition must include a `name` and may optionally
 provide an `id` and `notes` field. The `id` is set to the `name` if not
-provided. It is required that all checks have a unique ID per node, so if names
-might conflict then unique ID's should be provided.
+provided. It is required that all checks have a unique ID per node: if names
+might conflict, unique IDs should be provided.

-The `notes` field is opaque to Consul, but may be used for human
-readable descriptions. The field is set to any output that a script
-generates, and similarly the TTL update hooks can update the `notes`
-as well.
+
+The `notes` field is opaque to Consul but can be used to provide a human-readable
+descriptions. With a script check, the field is set to any output generated by the
+script. Similarly, an external process updating a TTL check via the HTTP interface
+can set the `notes` value.

 To configure a check, either provide it as a `-config-file` option to the
-agent, or place it inside the `-config-dir` of the agent. The file must
+agent or place it inside the `-config-dir` of the agent. The file must
 end in the ".json" extension to be loaded by Consul. Check definitions can
 also be updated by sending a `SIGHUP` to the agent. Alternatively, the
 check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
@@ -93,8 +94,8 @@ check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
 ## Check Scripts

 A check script is generally free to do anything to determine the status
-of the check. The only limitations placed are that the exit codes must convey
-a specific meaning. Specifically:
+of the check. The only limitations placed are that the exit codes must obey
+this convention:

  * Exit code 0 - Check is passing
  * Exit code 1 - Check is warning
@@ -106,7 +107,7 @@ by human operators.

 ## Service-bound checks

-Health checks may also be optionally bound to a specific service. This ensures
+Health checks may optionally be bound to a specific service. This ensures
 that the status of the health check will only affect the health status of the
 given service instead of the entire node. Service-bound health checks may be
 provided by adding a `service_id` field to a check configuration:
@@ -123,12 +124,12 @@ provided by adding a `service_id` field to a check configuration:
 ```

 In the above configuration, if the web-app health check begins failing, it will
-only affect the availability of the web-app service and no other services
-provided by the node.
+only affect the availability of the web-app service. All other services
+provided by the node will remain unchanged.

 ## Multiple Check Definitions

-Multiple check definitions can be provided at once using the `checks` (plural)
+Multiple check definitions can be defined using the `checks` (plural)
 key in your configuration file.

 ```javascript

From eea0029f2dd591cd1f9227ef1337adb88b678b8e Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 16:54:36 -0500
Subject: [PATCH 2/6] Add a bit more detail around checks and clarify some language.

---
 website/source/docs/agent/checks.html.markdown | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index 2ec67e973b..c3bf9449c6 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -8,9 +8,12 @@ description: |-

 # Checks

-One of the primary roles of the agent is management of system- and application-level health
+One of the primary roles of the agent is management of system-level and application-level health
 checks. A health check is considered to be application-level if it is associated with a
-service. A check is defined in a configuration file or added at runtime over the HTTP interface.
+service. If not associated with a service, the check monitors the health of the entire node.
+
+A check is defined in a configuration file or added at runtime over the HTTP interface. Checks
+created via the HTTP interface persist across runs of the Consul agent on that node.

 There are three different kinds of checks:


From cef8305bd0faf4fa3530228e5039a1e06604c9d0 Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 17:10:15 -0500
Subject: [PATCH 3/6] Make it clear that checks persist with the node, period, not just across runs of the agent but across reboots as well.

---
 website/source/docs/agent/checks.html.markdown | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index c3bf9449c6..f87702f031 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -13,7 +13,7 @@ checks. A health check is considered to be application-level if it is associated
 service. If not associated with a service, the check monitors the health of the entire node.

 A check is defined in a configuration file or added at runtime over the HTTP interface. Checks
-created via the HTTP interface persist across runs of the Consul agent on that node.
+created via the HTTP interface persist with that node.

 There are three different kinds of checks:


From a8470991cd872bb77e369191913ee067065c5515 Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 17:12:20 -0500
Subject: [PATCH 4/6] Some reorg of the TTL description.

---
 website/source/docs/agent/checks.html.markdown | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index f87702f031..85b48048a9 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -31,10 +31,10 @@ There are three different kinds of checks:
  * Time to Live (TTL) - These checks retain their last known state for a given TTL.
    The state of the check must be updated periodically over the HTTP interface.
    If an external system fails to update the status within a given TTL, the check is
-   set to the failed state. This mechanism is used to allow an application to
-   directly report its health. For example, a healthy web app can periodically `PUT` a status
-   update to the HTTP endpoint; if the app fails, the TTL will expire and the health check
-   enters a critical state. This is conceptually similar to a dead man's switch.
+   set to the failed state. This mechanism, conceptually similar to a dead man's switch,
+   relies on the application to directly report its health. For example, a healthy web app
+   can periodically `PUT` a status update to the HTTP endpoint; if the app fails, the TTL will
+   expire and the health check enters a critical state.

 ## Check Definition


From cf033f448dd91893553d77bdcdce89ced99bee5d Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 17:14:19 -0500
Subject: [PATCH 5/6] No need to confine the example to a web app.

---
 website/source/docs/agent/checks.html.markdown | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index 85b48048a9..93bd783ba3 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -32,7 +32,7 @@ There are three different kinds of checks:
    The state of the check must be updated periodically over the HTTP interface.
    If an external system fails to update the status within a given TTL, the check is
    set to the failed state. This mechanism, conceptually similar to a dead man's switch,
-   relies on the application to directly report its health. For example, a healthy web app
+   relies on the application to directly report its health. For example, a healthy app
    can periodically `PUT` a status update to the HTTP endpoint; if the app fails, the TTL will
    expire and the health check enters a critical state.


From 7fd0b148ac215ac809e6d7fa9dcc310cfebef94d Mon Sep 17 00:00:00 2001
From: Ryan Breen
Date: Thu, 29 Jan 2015 17:17:02 -0500
Subject: [PATCH 6/6] A bit more language cleanup to checks.

---
 website/source/docs/agent/checks.html.markdown | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/website/source/docs/agent/checks.html.markdown b/website/source/docs/agent/checks.html.markdown
index 93bd783ba3..0bd8f02f35 100644
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -82,11 +82,10 @@ provide an `id` and `notes` field. The `id` is set to the `name` if not
 provided. It is required that all checks have a unique ID per node: if names
 might conflict, unique IDs should be provided.

-
 The `notes` field is opaque to Consul but can be used to provide a human-readable
-descriptions. With a script check, the field is set to any output generated by the
-script. Similarly, an external process updating a TTL check via the HTTP interface
-can set the `notes` value.
+description of the current state of the check. With a script check, the field is
+set to any output generated by the script. Similarly, an external process updating
+a TTL check via the HTTP interface can set the `notes` value.

 To configure a check, either provide it as a `-config-file` option to the
 agent or place it inside the `-config-dir` of the agent. The file must