consul/website/source/docs/agent/checks.html.markdown

---
layout: "docs"
page_title: "Check Definition"
sidebar_current: "docs-agent-checks"
---

# Checks

One of the primary roles of the agent is the management of system and
application level health checks. A health check is considered to be application
level if it is associated with a service. A check is defined in a configuration
file, or added at runtime over the HTTP interface.

There are two different kinds of checks:

 * Script + Interval - These checks depend on invoking an external application
 which does the health check and exits with an appropriate exit code, potentially
 generating some output. A script is paired with an invocation interval (e.g.
 every 30 seconds). This is similar to the Nagios plugin system.

 * TTL - These checks retain their last known state for a given TTL. The state
 of the check must be updated periodically over the HTTP interface. If an
 external system fails to update the status within a given TTL, the check is
 set to the failed state. This mechanism is used to allow an application to
 directly report its health. For example, a web app can periodically curl the
 endpoint, and if the app fails, then the TTL will expire and the health check
 enters a critical state. This is conceptually similar to a dead man's switch.
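
The TTL behavior described above can be modeled in a few lines. This is only
an illustrative sketch, not Consul's implementation: the `TTLCheck` class, its
method names, and the choice to report `critical` before the first update are
all assumptions made for the example.

```python
import time


class TTLCheck:
    """Toy model of a TTL check (hypothetical, for illustration only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so the model can be tested
        self.state = "critical"     # assumption: critical until first update
        self.last_update = None

    def update(self, state):
        """Record a status report from the application and restart the TTL."""
        self.state = state
        self.last_update = self.clock()

    def current_state(self):
        """Return the last reported state, or critical once the TTL has lapsed."""
        if self.last_update is None:
            return "critical"
        if self.clock() - self.last_update > self.ttl_seconds:
            return "critical"
        return self.state
```

As long as the application keeps calling `update` more often than the TTL, the
last reported state is retained. Once the updates stop, the check falls back to
critical on its own, which is the dead man's switch behavior.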

## Check Definition

A check definition that is a script looks like:

    {
        "check": {
            "id": "mem-util",
            "name": "Memory utilization",
            "script": "/usr/local/bin/check_mem.py",
            "interval": "10s"
        }
    }

A TTL based check is very similar:

    {
        "check": {
            "id": "web-app",
            "name": "Web App Status",
            "notes": "Web app does a curl internally every 10 seconds",
            "ttl": "30s"
        }
    }

Both types of definitions must include a `name`, and may optionally
provide an `id` and `notes` field. The `id` is set to the `name` if not
provided. All checks must have a unique ID, so if names might conflict,
unique IDs should be provided.
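
The ID rule can be expressed as a small helper. This is a sketch of the rule
as stated here, not the agent's code; the function name `assign_check_ids` and
its error messages are hypothetical.

```python
def assign_check_ids(definitions):
    """Fill in a missing "id" from "name" and reject duplicate IDs.

    `definitions` is a list of dicts shaped like the body of a "check" block.
    """
    seen = set()
    resolved = []
    for definition in definitions:
        check = dict(definition)  # copy so the caller's data is untouched
        if "name" not in check:
            raise ValueError("a check definition must include a name")
        check.setdefault("id", check["name"])  # id defaults to the name
        if check["id"] in seen:
            raise ValueError("duplicate check id: %s" % check["id"])
        seen.add(check["id"])
        resolved.append(check)
    return resolved
```

Two checks that share a name only pass this validation if at least one of them
supplies its own `id`.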

The `notes` field is opaque to Consul, but may be used for human
readable descriptions. The field is set to any output that a script
generates, and similarly the TTL update hooks can update the `notes`
as well.

To configure a check, either provide it as a `-config-file` option to the
agent, or place it inside the `-config-dir` of the agent. The file must
end in the ".json" extension to be loaded by Consul. Check definitions can
also be updated by sending a `SIGHUP` to the agent. Alternatively, the
check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
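
As a rough illustration of the file-based route, a deployment tool could write
each definition into the configuration directory. The helper below is a
hypothetical sketch: the function name and the choice to name the file after
the check ID are assumptions, and it assumes the ID is safe to use as a file
name.

```python
import json
import os


def write_check_definition(config_dir, check):
    """Write one check definition to <config_dir>/<id>.json and return the path.

    The ".json" extension matters: other files in the directory are not loaded.
    """
    check_id = check.get("id", check["name"])
    path = os.path.join(config_dir, check_id + ".json")
    with open(path, "w") as handle:
        json.dump({"check": check}, handle, indent=4)
        handle.write("\n")
    return path
```

After writing the file, the agent still has to be told to reload, for example
by sending it a `SIGHUP`.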

## Check Scripts

A check script is generally free to do anything to determine the status
of the check. The only limitation is that the exit codes must convey
a specific meaning. Specifically:

 * Exit code 0 - Check is passing
 * Exit code 1 - Check is warning
 * Any other code - Check is failing

This is the only convention that Consul depends on. Any output of the script
will be captured and stored in the `notes` field so that it can be viewed
by human operators.
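
A minimal sketch of a script that follows this convention is shown below. The
thresholds, function names, and output format are assumptions made for the
example, and how the utilization figure is measured is left out.

```python
import sys

# Exit codes as interpreted by Consul, per the list above.
PASSING, WARNING, FAILING = 0, 1, 2


def memory_exit_code(used_percent, warn_at=80.0, fail_at=95.0):
    """Map a memory utilization percentage to a check exit code."""
    if used_percent >= fail_at:
        return FAILING
    if used_percent >= warn_at:
        return WARNING
    return PASSING


def run_check(used_percent, out=sys.stdout):
    """Print a one-line summary for operators and return the exit code."""
    code = memory_exit_code(used_percent)
    label = {PASSING: "passing", WARNING: "warning", FAILING: "failing"}[code]
    out.write("memory utilization %.1f%% (%s)\n" % (used_percent, label))
    return code


# A real script would measure utilization itself and finish with:
#     sys.exit(run_check(measured_percent))
```

The printed line is what an operator would see as the script's output, while
the exit code alone decides the state of the check.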