Merge pull request #10281 from hashicorp/docs-pairing
New structure for contributing docs
|
@ -30,47 +30,12 @@ The components in this section are shared between Consul agents in client and se
|
|||
| [agent/router](https://github.com/hashicorp/consul/tree/main/agent/router), [agent/pool](https://github.com/hashicorp/consul/tree/main/agent/pool) | These are used for routing RPC queries to Consul servers and for connection pooling. |
|
||||
| [agent/structs](https://github.com/hashicorp/consul/tree/main/agent/structs) | This has definitions of all the internal RPC protocol request and response structures. |
|
||||
|
||||
### Server Components
|
||||
|
||||
The components in this section are only used by Consul servers.
|
||||
|
||||
| Directory | Contents |
|
||||
| --------- | -------- |
|
||||
| [agent/consul](https://github.com/hashicorp/consul/tree/main/agent/consul) | This is where the Consul server object is defined, and the top-level `consul` package has all of the functionality that's used by server agents. This includes things like the internal RPC endpoints. |
|
||||
| [agent/consul/fsm](https://github.com/hashicorp/consul/tree/main/agent/consul/fsm), [agent/consul/state](https://github.com/hashicorp/consul/tree/main/agent/consul/state) | These components make up Consul's finite state machine (updated by the Raft consensus algorithm), which is backed by the state store (based on immutable radix trees). All updates to Consul's consistent state are handled by the finite state machine, and all read queries to the Consul servers are serviced by the state store's data structures. |
|
||||
| [agent/consul/autopilot](https://github.com/hashicorp/consul/tree/main/agent/consul/autopilot) | This contains a package of functions that provide Consul's [Autopilot](https://www.consul.io/docs/guides/autopilot.html) features. |
|
||||
|
||||
### Other Components
|
||||
|
||||
There are several other top-level packages used internally by Consul as well as externally by other applications.
|
||||
|
||||
| Directory | Contents |
|
||||
| --------- | -------- |
|
||||
| [acl](https://github.com/hashicorp/consul/tree/main/acl) | This supports the underlying policy engine for Consul's [ACL](https://www.consul.io/docs/guides/acl.html) system. |
|
||||
| [api](https://github.com/hashicorp/consul/tree/main/api) | This `api` package provides an official Go API client for Consul, which is also used by Consul's [CLI](https://www.consul.io/docs/commands/index.html) commands to communicate with the local Consul agent. |
|
||||
| [command](https://github.com/hashicorp/consul/tree/main/command) | This contains a sub-package for each of Consul's [CLI](https://www.consul.io/docs/commands/index.html) command implementations. |
|
||||
| [snapshot](https://github.com/hashicorp/consul/tree/main/snapshot) | This has implementation details for Consul's [snapshot archives](https://www.consul.io/api/snapshot.html). |
|
||||
| [api/watch](https://github.com/hashicorp/consul/tree/main/api/watch) | This has implementation details for Consul's [watches](https://www.consul.io/docs/agent/watches.html), used both internally to Consul and by the [watch CLI command](https://www.consul.io/docs/commands/watch.html). |
|
||||
| [website](https://github.com/hashicorp/consul/tree/main/website) | This has the full source code for [consul.io](https://www.consul.io/). Pull requests can update the source code and Consul's documentation all together. |
|
||||
|
||||
## FAQ
|
||||
|
||||
This section addresses some frequently asked questions about Consul's architecture.
|
||||
|
||||
### How does eventually-consistent gossip relate to the Raft consensus protocol?
|
||||
|
||||
When you query Consul for information about a service, such as via the [DNS interface](https://www.consul.io/docs/discovery/dns), the agent will always make an internal RPC request to a Consul server that will query the consistent state store. Even though an agent might learn that another agent is down via gossip, that won't be reflected in service discovery until the current Raft leader server perceives that through gossip and updates the catalog using Raft. You can see an example of where these layers are plumbed together here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/consul/leader.go#L559-L602.
|
||||
|
||||
### Why does a blocking query sometimes return with identical results?
|
||||
|
||||
Consul's [blocking queries](https://www.consul.io/api/index.html#blocking-queries) make a best-effort attempt to wait for new information, but they may return the same results as the initial query under some circumstances. First, queries are limited to 10 minutes max, so if they time out they will return. Second, due to Consul's prefix-based internal immutable radix tree indexing, there may be modifications to higher-level nodes in the radix tree that cause spurious wakeups. In particular, waiting on things that do not exist is not very efficient, but not very expensive for Consul to serve, so we opted to keep the code complexity low and not try to optimize for that case. You can see the common handler that implements the blocking query logic here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/consul/rpc.go#L361-L439. For more on the immutable radix tree implementation, see https://github.com/hashicorp/go-immutable-radix/ and https://github.com/hashicorp/go-memdb, and the general support for "watches".
|
||||
|
||||
### Do the client agents store any key/value entries?
|
||||
|
||||
No. These are always fetched via an internal RPC request to a Consul server. The agent doesn't do any caching, and if you want to be able to fetch these values even if there's no cluster leader, then you can use a more relaxed [consistency mode](https://www.consul.io/api/index.html#consistency-modes). You can see an example where the `/v1/kv/<key>` HTTP endpoint on the agent makes an internal RPC call here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/kvs_endpoint.go#L56-L90.
|
||||
|
||||
### I don't want to run a Consul agent on every node, can I just run servers with a load balancer in front?
|
||||
|
||||
We strongly recommend running the Consul agent on each node in a cluster. Even the key/value store benefits from having agents on each node. For example, when you lock a key it's done through a [session](https://www.consul.io/docs/internals/sessions.html), which has a lifetime that's by default tied to the health of the agent as determined by Consul's gossip-based distributed failure detector. If the agent dies, the session will be released automatically, allowing some other process to quickly see that and obtain the lock without having to wait for an open-ended TTL to expire. If you are using Consul's service discovery features, the local agent runs the health checks for each service registered on that node and only needs to send edge-triggered updates to the Consul servers (because gossip will determine if the agent itself dies). Most attempts to avoid running an agent on each node will face solving issues that are already solved by Consul's design if the agent is deployed as intended.
|
||||
|
||||
For cases where you really cannot run an agent alongside a service, such as for monitoring an [external service](https://www.consul.io/docs/guides/external.html), there's a companion project called the [Consul External Service Monitor](https://github.com/hashicorp/consul-esm) that may help.
|
||||
|
|
|
@ -5,10 +5,48 @@ See [our contributing guide](../.github/CONTRIBUTING.md) to get started.
|
|||
This directory contains documentation intended for anyone interested in
|
||||
understanding, and contributing changes to, the Consul codebase.
|
||||
|
||||
## Contents
|
||||
## Overview
|
||||
|
||||
This documentation is organized into the following categories. Each category is
|
||||
either a significant architectural layer or a major functional area of Consul.
|
||||
These documents assume a basic understanding of Consul's feature set, which can
|
||||
be found in the public [user documentation].
|
||||
|
||||
[user documentation]: https://www.consul.io/docs
|
||||
|
||||
![Overview](./overview.svg)
|
||||
|
||||
<sup>[source](./overview.mmd)</sup>
|
||||
|
||||
## Contents
|
||||
|
||||
1. [Overview](./INTERNALS.md)
|
||||
2. [Configuration](./checklist-adding-config-fields.md)
|
||||
3. [Streaming](./streaming)
|
||||
4. [Network Areas](./network-areas)
|
||||
5. [Service Discovery](./service-discovery)
|
||||
1. [Command-Line Interface (CLI)](./cli)
|
||||
1. [HTTP API](./http-api)
|
||||
1. [Agent Configuration](./config)
|
||||
1. [RPC](./rpc)
|
||||
1. [Cluster Persistence](./persistence)
|
||||
1. [Client Agent](./client-agent)
|
||||
1. [Service Discovery](./service-discovery)
|
||||
1. [Service Mesh (Connect)](./service-mesh)
|
||||
1. [Cluster Membership](./cluster-membership)
|
||||
1. [Key/Value Store](./kv)
|
||||
1. [ACL](./acl)
|
||||
1. [Multi-Cluster Federation](./cluster-federation)
|
||||
|
||||
Also see the [FAQ](./faq.md).
|
||||
|
||||
## Contributing to these docs
|
||||
|
||||
This section is meta documentation about contributing to these docs.
|
||||
|
||||
### Diagrams
|
||||
|
||||
The diagrams in these documents are created using the [mermaid-js live editor].
|
||||
The [mermaid-js docs] provide a complete reference for how to create and edit
|
||||
the diagrams. Use the [consul-mermaid-theme.json] (paste it into the Config tab
|
||||
in the editor) to maintain a consistent Consul style for the diagrams.
|
||||
|
||||
[mermaid-js live editor]: https://mermaid-js.github.io/mermaid-live-editor/edit/
|
||||
[mermaid-js docs]: https://mermaid-js.github.io/mermaid/
|
||||
[consul-mermaid-theme.json]: ./consul-mermaid-theme.json
|
||||
|
|
|
@ -0,0 +1,35 @@
|
|||
# ACL
|
||||
|
||||
This section is a work in progress.
|
||||
|
||||
The ACL subsystem is responsible for authenticating and authorizing access to Consul
|
||||
operations ([HTTP API] and [RPC]).
|
||||
|
||||
[HTTP API]: ../http-api
|
||||
[RPC]: ../rpc
|
||||
|
||||
## ACL Entities
|
||||
|
||||
There are many entities in the ACL subsystem. The diagram below shows the relationship
|
||||
between the entities.
|
||||
|
||||
![Entity Relationship Diagram](./erd.svg)
|
||||
|
||||
<sup>[source](./erd.mmd)</sup>
|
||||
|
||||
ACL Tokens are at the center of the ACL system. Tokens are associated with a set of
|
||||
Policies and Roles.
|
||||
|
||||
AuthMethods, which consist of BindingRules, are a mechanism for creating ACL Tokens from
|
||||
identities verified by external systems (ex: Kubernetes, JWT, or OIDC).
|
||||
|
||||
A Role is a named set of policies. ServiceIdentity and
|
||||
NodeIdentity are policy templates that are associated with a specific service or node and
|
||||
can be rendered into a full policy.
|
||||
|
||||
Each Policy contains a set of rules. Each rule relates to a specific resource, and
|
||||
includes an AccessLevel (read, write, list or deny).
|
||||
|
||||
An ACL Token can be resolved into an Authorizer. The Authorizer is used by the
|
||||
[HTTP API] and [RPC] endpoints to determine whether an operation is allowed or forbidden (the
|
||||
enforcement decision).
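A hedged sketch of the Token/Policy relationship using the external Go
[api](https://github.com/hashicorp/consul/tree/main/api) client (the policy name and rule
are invented for illustration; this shows the client-facing API, not the internal
representation used by this subsystem):

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	// Assumes a local agent reachable with a token that can manage ACLs.
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// A Policy is a named set of rules; each rule grants an access level on a resource.
	policy, _, err := client.ACL().PolicyCreate(&api.ACLPolicy{
		Name:  "kv-read-example",
		Rules: `key_prefix "" { policy = "read" }`,
	}, nil)
	if err != nil {
		log.Fatal(err)
	}

	// A Token is associated with a set of Policies (and/or Roles).
	token, _, err := client.ACL().TokenCreate(&api.ACLToken{
		Description: "example token",
		Policies:    []*api.ACLTokenPolicyLink{{ID: policy.ID}},
	}, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("created token:", token.AccessorID)
}
```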
|
|
@ -0,0 +1,33 @@
|
|||
erDiagram
|
||||
|
||||
Token
|
||||
Policy
|
||||
Role
|
||||
ServiceIdentity
|
||||
NodeIdentity
|
||||
AuthMethod
|
||||
BindingRule
|
||||
Rule {
|
||||
string Resource
|
||||
enum AccessLevel
|
||||
}
|
||||
|
||||
Policy ||--|{ Rule: grants
|
||||
Role ||--|{ Policy: includes
|
||||
Role }|--|{ ServiceIdentity: includes
|
||||
Role }|--|{ NodeIdentity: includes
|
||||
|
||||
Token }|--|{ Policy: includes
|
||||
Token }|--|{ Role: includes
|
||||
Token }|--|{ ServiceIdentity: includes
|
||||
Token }|--|{ NodeIdentity: includes
|
||||
|
||||
AuthMethod ||--|{ BindingRule: defines
|
||||
AuthMethod ||--|{ Token: creates
|
||||
|
||||
ServiceIdentity ||--|{ Rule: implies
|
||||
NodeIdentity ||--|{ Rule: implies
|
||||
|
||||
Token ||--|| Authorizer: "resolves to"
|
||||
Authorizer ||--|{ EnforcementDecision: produces
|
||||
|
After Width: | Height: | Size: 21 KiB |
|
@ -0,0 +1,36 @@
|
|||
# Command-Line Interface (CLI)
|
||||
|
||||
This section is a work in progress.
|
||||
|
||||
The `consul` binary provides a CLI for interacting with the [HTTP API]. Some commands may
|
||||
also exec other processes or generate data used by Consul (ex: TLS certificates). The
|
||||
`agent` command is responsible for starting the Consul agent.
|
||||
|
||||
The [cli reference] in the Consul user documentation provides a complete reference for all available
|
||||
commands.
|
||||
|
||||
[HTTP API]: ../http-api
|
||||
[cli reference]: https://www.consul.io/commands
|
||||
|
||||
## Code
|
||||
|
||||
The CLI entrypoint is [main.go] and the majority of the source for the CLI is under the
|
||||
[command] directory. Each subcommand is a separate package under [command]. The CLI uses
|
||||
[github.com/mitchellh/cli] as a framework, and uses the [flag] package from the stdlib for
|
||||
command line flags.
|
||||
|
||||
|
||||
[command]: https://github.com/hashicorp/consul/tree/main/command
|
||||
[main.go]: https://github.com/hashicorp/consul/blob/main/main.go
|
||||
[flag]: https://pkg.go.dev/flag
|
||||
[github.com/mitchellh/cli]: https://github.com/mitchellh/cli
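For orientation, a hypothetical subcommand built on [github.com/mitchellh/cli] and [flag]
looks roughly like the sketch below; the `hello` command and its flag are invented for
illustration and do not exist in Consul.

```go
package main

import (
	"flag"
	"os"

	"github.com/mitchellh/cli"
)

// helloCommand is a made-up subcommand; real Consul commands live in their own
// packages under command/ but follow the same Command interface.
type helloCommand struct {
	ui cli.Ui
}

func (c *helloCommand) Run(args []string) int {
	flags := flag.NewFlagSet("hello", flag.ContinueOnError)
	name := flags.String("name", "world", "who to greet")
	if err := flags.Parse(args); err != nil {
		return 1
	}
	c.ui.Output("hello " + *name)
	return 0
}

func (c *helloCommand) Help() string     { return "Usage: consul hello [-name=<name>]" }
func (c *helloCommand) Synopsis() string { return "Prints a greeting" }

func main() {
	ui := &cli.BasicUi{Writer: os.Stdout, ErrorWriter: os.Stderr}
	c := cli.NewCLI("consul", "dev")
	c.Args = os.Args[1:]
	c.Commands = map[string]cli.CommandFactory{
		"hello": func() (cli.Command, error) { return &helloCommand{ui: ui}, nil },
	}
	exitCode, err := c.Run()
	if err != nil {
		ui.Error(err.Error())
	}
	os.Exit(exitCode)
}
```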
|
||||
|
||||
## Important notes
|
||||
|
||||
The [cli.Ui] wraps an `io.Writer` for both stdout and stderr. At the time of writing both
|
||||
`Info` and `Output` go to stdout. Writing `Info` to stdout has been the source of a couple of
|
||||
bugs. To prevent such bugs in the future, `Info` should no longer
|
||||
be used. Instead, send all information messages to stderr by using `Warn`.
|
||||
|
||||
|
||||
[cli.Ui]: https://pkg.go.dev/github.com/mitchellh/cli#Ui
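A minimal sketch of that convention, assuming the plain `cli.BasicUi` implementation from
[github.com/mitchellh/cli]: informational messages go to stderr via `Warn`, and the
command's real output goes to stdout via `Output`, so piping stdout captures only the
result.

```go
package main

import (
	"os"

	"github.com/mitchellh/cli"
)

func main() {
	ui := &cli.BasicUi{Writer: os.Stdout, ErrorWriter: os.Stderr}

	// Progress and informational messages: stderr (Warn). Info would go to stdout.
	ui.Warn("connecting to agent...")

	// The actual result of the command: stdout (Output), safe to pipe or redirect.
	ui.Output(`{"result": "ok"}`)
}
```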
|
|
@ -0,0 +1,5 @@
|
|||
# Client Agent
|
||||
|
||||
- agent/cache
|
||||
- agent/local (local state)
|
||||
- anti-entropy sync
|
|
@ -0,0 +1,4 @@
|
|||
# Multi-Cluster Federation
|
||||
|
||||
1. [Network Areas](./network-areas)
|
||||
|
|
@ -0,0 +1,7 @@
|
|||
# Cluster Membership
|
||||
- hashicorp/serf
|
||||
- hashicorp/memberlist
|
||||
- network coordinates
|
||||
- consul events
|
||||
- consul exec
|
||||
|
|
@ -0,0 +1,3 @@
|
|||
# Agent Configuration
|
||||
|
||||
- [Checklist for adding a new field](./checklist-adding-config-fields.md)
|
|
@ -0,0 +1,4 @@
|
|||
{
|
||||
"theme": "default",
|
||||
"themeCSS": ".node rect, .er.entityBox { fill: rgb(220, 71, 125); stroke-width: 1; stroke: black; } .node .label { color: white; }; .cluster rect { fill: #f0f0f0; stroke-width: 1px; stroke: #333}; .edgeLabel { background-color: #f0f0f0; }; .er.entityBox + .er.entityLabel { fill: white }; .er.attributeBoxEven, .er.attributeBoxOdd { fill: #fff; stroke: #777 }"
|
||||
}
|
|
@ -0,0 +1,21 @@
|
|||
# FAQ
|
||||
|
||||
This section addresses some frequently asked questions about Consul's architecture.
|
||||
|
||||
### How does eventually-consistent gossip relate to the Raft consensus protocol?
|
||||
|
||||
When you query Consul for information about a service, such as via the [DNS interface](https://www.consul.io/docs/discovery/dns), the agent will always make an internal RPC request to a Consul server that will query the consistent state store. Even though an agent might learn that another agent is down via gossip, that won't be reflected in service discovery until the current Raft leader server perceives that through gossip and updates the catalog using Raft. You can see an example of where these layers are plumbed together here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/consul/leader.go#L559-L602.
|
||||
|
||||
### Why does a blocking query sometimes return with identical results?
|
||||
|
||||
Consul's [blocking queries](https://www.consul.io/api/index.html#blocking-queries) make a best-effort attempt to wait for new information, but they may return the same results as the initial query under some circumstances. First, queries are limited to 10 minutes max, so if they time out they will return. Second, due to Consul's prefix-based internal immutable radix tree indexing, there may be modifications to higher-level nodes in the radix tree that cause spurious wakeups. In particular, waiting on things that do not exist is not very efficient, but not very expensive for Consul to serve, so we opted to keep the code complexity low and not try to optimize for that case. You can see the common handler that implements the blocking query logic here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/consul/rpc.go#L361-L439. For more on the immutable radix tree implementation, see https://github.com/hashicorp/go-immutable-radix/ and https://github.com/hashicorp/go-memdb, and the general support for "watches".
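For reference, a minimal sketch of how a client typically consumes blocking queries using
the official Go `api` package (the key name is arbitrary): pass the last `X-Consul-Index`
back as `WaitIndex`, and be prepared for the call to return with unchanged data after a
timeout or a spurious wakeup.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	var lastIndex uint64
	for {
		// Blocks until the index changes or WaitTime elapses; either way the
		// returned data may be identical to what we already have.
		pair, meta, err := client.KV().Get("config/app", &api.QueryOptions{
			WaitIndex: lastIndex,
			WaitTime:  5 * time.Minute,
		})
		if err != nil {
			log.Fatal(err)
		}
		if meta.LastIndex == lastIndex {
			continue // timed out or spurious wakeup with no real change
		}
		lastIndex = meta.LastIndex
		if pair != nil {
			fmt.Printf("updated value: %s\n", pair.Value)
		}
	}
}
```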
|
||||
|
||||
### Do the client agents store any key/value entries?
|
||||
|
||||
No. These are always fetched via an internal RPC request to a Consul server. The agent doesn't do any caching, and if you want to be able to fetch these values even if there's no cluster leader, then you can use a more relaxed [consistency mode](https://www.consul.io/api/index.html#consistency-modes). You can see an example where the `/v1/kv/<key>` HTTP endpoint on the agent makes an internal RPC call here - https://github.com/hashicorp/consul/blob/v1.0.5/agent/kvs_endpoint.go#L56-L90.
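A small sketch of the relaxed consistency option mentioned above, again using the Go
`api` package (the key name is arbitrary): setting `AllowStale` lets any server answer
the read from its local state store, at the cost of potentially stale data.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// AllowStale permits any server (not just the leader) to service the read,
	// so the value can still be fetched when there is no cluster leader.
	pair, meta, err := client.KV().Get("config/app", &api.QueryOptions{AllowStale: true})
	if err != nil {
		log.Fatal(err)
	}
	if pair != nil {
		fmt.Printf("value=%s (last contact with leader: %s)\n", pair.Value, meta.LastContact)
	}
}
```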
|
||||
|
||||
### I don't want to run a Consul agent on every node, can I just run servers with a load balancer in front?
|
||||
|
||||
We strongly recommend running the Consul agent on each node in a cluster. Even the key/value store benefits from having agents on each node. For example, when you lock a key it's done through a [session](https://www.consul.io/docs/internals/sessions.html), which has a lifetime that's by default tied to the health of the agent as determined by Consul's gossip-based distributed failure detector. If the agent dies, the session will be released automatically, allowing some other process to quickly see that and obtain the lock without having to wait for an open-ended TTL to expire. If you are using Consul's service discovery features, the local agent runs the health checks for each service registered on that node and only needs to send edge-triggered updates to the Consul servers (because gossip will determine if the agent itself dies). Most attempts to avoid running an agent on each node end up having to re-solve problems that Consul's design already solves when the agent is deployed as intended.
|
||||
|
||||
For cases where you really cannot run an agent alongside a service, such as for monitoring an [external service](https://www.consul.io/docs/guides/external.html), there's a companion project called the [Consul External Service Monitor](https://github.com/hashicorp/consul-esm) that may help.
|
|
@ -0,0 +1,3 @@
|
|||
# HTTP API
|
||||
|
||||
Work in progress.
|
|
@ -0,0 +1,30 @@
|
|||
graph TD
|
||||
|
||||
ServiceMesh[Service Mesh]
|
||||
ServiceDiscovery[Service Discovery]
|
||||
ClusterMembership[Cluster Membership]
|
||||
KV[Key/Value Store]
|
||||
MultiClusterFederation[Multi-Cluster Federation]
|
||||
|
||||
ACL
|
||||
AgentConfiguration[Agent Configuration]
|
||||
ClientAgent[Client Agent]
|
||||
RPC
|
||||
ClusterPersistence[Cluster Persistence]
|
||||
CLI
|
||||
HTTPAPI[HTTP API]
|
||||
|
||||
CLI --> HTTPAPI
|
||||
HTTPAPI --> ClientAgent
|
||||
HTTPAPI --> ACL
|
||||
|
||||
AgentConfiguration --> ClientAgent
|
||||
ClientAgent --> RPC
|
||||
ClientAgent --> ACL
|
||||
RPC --> ClusterPersistence
|
||||
RPC --> ACL
|
||||
|
||||
MultiClusterFederation --> ClusterMembership
|
||||
MultiClusterFederation --> RPC
|
||||
ServiceMesh --> ServiceDiscovery
|
||||
|
After Width: | Height: | Size: 21 KiB |
|
@ -0,0 +1,108 @@
|
|||
# Cluster Persistence
|
||||
|
||||
The cluster persistence subsystem runs entirely in Server Agents. It handles both read and
|
||||
write requests from the [RPC] subsystem. See the [Consul Architecture Guide] for an
|
||||
introduction to the Consul deployment architecture and the [Consensus Protocol] used by
|
||||
the cluster persistence subsystem.
|
||||
|
||||
[RPC]: ../rpc
|
||||
[Consul Architecture Guide]: https://www.consul.io/docs/architecture
|
||||
[Consensus Protocol]: https://www.consul.io/docs/architecture/consensus
|
||||
|
||||
|
||||
![Overview](./overview.svg)
|
||||
|
||||
<sup>[source](./overview.mmd)</sup>
|
||||
|
||||
|
||||
## Raft and FSM
|
||||
|
||||
[hashicorp/raft] is at the core of cluster persistence. Raft requires an [FSM], a
|
||||
finite-state machine implementation, to persist state changes. The Consul FSM is
|
||||
implemented in [agent/consul/fsm] as a set of commands.
|
||||
|
||||
[FSM]: https://pkg.go.dev/github.com/hashicorp/raft#FSM
|
||||
[hashicorp/raft]: https://github.com/hashicorp/raft
|
||||
[agent/consul/fsm]: https://github.com/hashicorp/consul/tree/main/agent/consul/fsm
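To make the [FSM] contract concrete, here is a toy in-memory FSM (not Consul's; the
key/value command format is invented for illustration). Raft calls `Apply` for each
committed log entry, and `Snapshot`/`Restore` support log compaction and follower
bootstrap, which are discussed further below.

```go
package fsmexample

import (
	"encoding/json"
	"io"
	"sync"

	"github.com/hashicorp/raft"
)

type kvCommand struct {
	Key, Value string
}

// kvFSM is a toy FSM: each raft log entry is a JSON-encoded kvCommand.
type kvFSM struct {
	mu   sync.Mutex
	data map[string]string
}

// Apply is invoked once a log entry has been committed by the raft cluster.
func (f *kvFSM) Apply(l *raft.Log) interface{} {
	var cmd kvCommand
	if err := json.Unmarshal(l.Data, &cmd); err != nil {
		return err
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	f.data[cmd.Key] = cmd.Value
	return nil
}

// Snapshot returns a point-in-time copy of the state, used for log compaction.
func (f *kvFSM) Snapshot() (raft.FSMSnapshot, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	copied := make(map[string]string, len(f.data))
	for k, v := range f.data {
		copied[k] = v
	}
	return &kvSnapshot{data: copied}, nil
}

// Restore replaces the FSM state from a snapshot (e.g. when bootstrapping a follower).
func (f *kvFSM) Restore(rc io.ReadCloser) error {
	defer rc.Close()
	newData := map[string]string{}
	if err := json.NewDecoder(rc).Decode(&newData); err != nil {
		return err
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	f.data = newData
	return nil
}

type kvSnapshot struct {
	data map[string]string
}

// Persist writes the snapshot to the sink provided by raft.
func (s *kvSnapshot) Persist(sink raft.SnapshotSink) error {
	if err := json.NewEncoder(sink).Encode(s.data); err != nil {
		sink.Cancel()
		return err
	}
	return sink.Close()
}

func (s *kvSnapshot) Release() {}
```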
|
||||
|
||||
Raft also requires a [LogStore] to persist logs to disk. Consul uses [hashicorp/raft-boltdb],
|
||||
which implements [LogStore] using [boltdb]. In the near future we should be updating to
|
||||
use [bbolt].
|
||||
|
||||
|
||||
[LogStore]: https://pkg.go.dev/github.com/hashicorp/raft#LogStore
|
||||
[hashicorp/raft-boltdb]: https://github.com/hashicorp/raft-boltdb
|
||||
[boltdb]: https://github.com/boltdb/bolt
|
||||
[bbolt]: https://github.com/etcd-io/bbolt
|
||||
|
||||
|
||||
## State Store
|
||||
|
||||
Consul stores the full state of the cluster in memory using the state store. The state store is
|
||||
implemented in [agent/consul/state] and uses [hashicorp/go-memdb] to maintain indexes of
|
||||
data stored in a set of tables. The main entrypoint to the state store is [NewStateStore].
|
||||
|
||||
[agent/consul/state]: https://github.com/hashicorp/consul/tree/main/agent/consul/state
|
||||
[hashicorp/go-memdb]: https://github.com/hashicorp/go-memdb
|
||||
[NewStateStore]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/state_store.go
|
||||
|
||||
### Tables, Schemas, and Indexes
|
||||
|
||||
The state store is organized as a set of tables, and each table has a set of indexes.
|
||||
`newDBSchema` in [schema.go] shows the full list of tables, and each schema function shows
|
||||
the full list of indexes.
|
||||
|
||||
[schema.go]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/schema.go
|
||||
|
||||
There are two styles for defining table indexes. The original style uses generic indexer
|
||||
implementations from [hashicorp/go-memdb] (ex: `StringFieldIndex`). These indexes use
|
||||
[reflect] to find values for an index. These generic indexers work well when the index
|
||||
value is a single value available directly from the struct field, and there are no
|
||||
oss/enterprise differences.
|
||||
|
||||
The second style uses custom indexers implemented using only functions and
|
||||
based on the types defined in [indexer.go]. This style of index works well when the index
|
||||
value is derived from one or more fields, or when there are oss/enterprise
|
||||
differences between the indexes.
|
||||
|
||||
[reflect]: https://golang.org/pkg/reflect/
|
||||
[indexer.go]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/indexer.go
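To make the two styles concrete, here is a small, self-contained [hashicorp/go-memdb]
schema sketch (the `services` table and its fields are hypothetical, not Consul's actual
schema): the `id` index uses the generic, reflection-based `StringFieldIndex`, while the
`kind_name` index derives its value from two fields with a hand-written indexer, similar
in spirit to the function-based types in [indexer.go].

```go
package memdbexample

import (
	"fmt"
	"strings"

	"github.com/hashicorp/go-memdb"
)

type service struct {
	ID   string
	Kind string
	Name string
}

// kindNameIndexer is a custom indexer: the index value is derived from two fields,
// which the generic StringFieldIndex cannot express.
type kindNameIndexer struct{}

func (kindNameIndexer) FromObject(raw interface{}) (bool, []byte, error) {
	svc, ok := raw.(*service)
	if !ok {
		return false, nil, fmt.Errorf("unexpected type %T", raw)
	}
	val := strings.ToLower(svc.Kind+"/"+svc.Name) + "\x00"
	return true, []byte(val), nil
}

func (kindNameIndexer) FromArgs(args ...interface{}) ([]byte, error) {
	if len(args) != 2 {
		return nil, fmt.Errorf("expected kind and name, got %d args", len(args))
	}
	val := strings.ToLower(fmt.Sprintf("%v/%v", args[0], args[1])) + "\x00"
	return []byte(val), nil
}

func newSchema() *memdb.DBSchema {
	return &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"services": {
				Name: "services",
				Indexes: map[string]*memdb.IndexSchema{
					// Generic, reflection-based indexer on a single struct field.
					"id": {
						Name:    "id",
						Unique:  true,
						Indexer: &memdb.StringFieldIndex{Field: "ID"},
					},
					// Custom, function-based indexer combining two fields.
					"kind_name": {
						Name:    "kind_name",
						Indexer: kindNameIndexer{},
					},
				},
			},
		},
	}
}
```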
|
||||
|
||||
|
||||
## Snapshot and Restore
|
||||
|
||||
Snapshots are the primary mechanism used to back up the data stored by cluster persistence.
|
||||
If all Consul servers fail, a snapshot can be used to restore the cluster back
|
||||
to its previous state.
|
||||
|
||||
Note that there are two different snapshot and restore concepts that exist at different
|
||||
layers. First, there are the `Snapshot` and `Restore` methods on the raft [FSM] interface,
|
||||
which Consul must implement. These methods are implemented mostly as a passthrough to the
|
||||
state store. These methods may be called internally by raft to perform log compaction
|
||||
(snapshot) or to bootstrap a new follower (restore). Consul implements snapshot and
|
||||
restore using the `Snapshot` and `Restore` types in [agent/consul/state].
|
||||
|
||||
Snapshot and restore also exist as actions that a user may perform. There are [CLI]
|
||||
commands, [HTTP API] endpoints, and [RPC] endpoints that allow a user to capture an
|
||||
archive which contains a snapshot of the state, and restore that state to a running
|
||||
cluster. The [consul/snapshot] package provides some of the logic for creating and reading
|
||||
the snapshot archives for users. See [commands/snapshot] for a reference to these user-
|
||||
facing operations.
|
||||
|
||||
[CLI]: ../cli
|
||||
[HTTP API]: ../http-api
|
||||
[commands/snapshot]: https://www.consul.io/commands/snapshot
|
||||
[consul/snapshot]: https://github.com/hashicorp/consul/tree/main/snapshot
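A hedged sketch of the user-facing flow with the Go `api` client (the file name is
arbitrary): `Snapshot().Save` streams a snapshot archive from the servers, and
`Snapshot().Restore` pushes an archive back into a running cluster.

```go
package main

import (
	"io"
	"log"
	"os"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Save: stream a snapshot archive of the cluster state into a local file.
	rc, _, err := client.Snapshot().Save(nil)
	if err != nil {
		log.Fatal(err)
	}
	f, err := os.Create("backup.snap")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(f, rc); err != nil {
		log.Fatal(err)
	}
	rc.Close()
	f.Close()

	// Restore: push a previously saved archive back into a running cluster.
	in, err := os.Open("backup.snap")
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()
	if err := client.Snapshot().Restore(nil, in); err != nil {
		log.Fatal(err)
	}
}
```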
|
||||
|
||||
Finally, there is also a [snapshot agent] (enterprise only) that uses the snapshot API
|
||||
endpoints to periodically capture a snapshot, and optionally send it somewhere for
|
||||
storage.
|
||||
|
||||
[snapshot agent]: https://www.consul.io/commands/snapshot/agent
|
||||
|
||||
## Raft Autopilot
|
||||
|
||||
[hashicorp/raft-autopilot] is used by Consul to automate some parts of the upgrade process.
|
||||
|
||||
|
||||
[hashicorp/raft-autopilot]: https://github.com/hashicorp/raft-autopilot
|
|
@ -0,0 +1,34 @@
|
|||
graph TB
|
||||
|
||||
requestLeader[request] --> RPCLeader
|
||||
requestFollower[request] --> RPCFollower
|
||||
|
||||
class requestLeader,requestFollower req;
|
||||
classDef req fill:transparent,color:#000,stroke-width:1;
|
||||
|
||||
subgraph Leader
|
||||
RPCLeader[RPC]
|
||||
RaftLeader[Raft]
|
||||
StateStoreLeader[State Store]
|
||||
FSMLeader[FSM]
|
||||
end
|
||||
|
||||
RPCLeader -->|write| RaftLeader
|
||||
RPCLeader -->|read| StateStoreLeader
|
||||
RaftLeader ---> FSMLeader
|
||||
FSMLeader --> StateStoreLeader
|
||||
|
||||
subgraph Follower
|
||||
RPCFollower[RPC]
|
||||
RaftFollower[Raft]
|
||||
StateStoreFollower[State Store]
|
||||
FSMFollower[FSM]
|
||||
end
|
||||
|
||||
RaftLeader <-.->|consensus and replication| RaftFollower
|
||||
|
||||
RPCFollower -->|forward write to leader| RPCLeader
|
||||
RPCFollower -->|read| StateStoreFollower
|
||||
RaftFollower --> FSMFollower
|
||||
FSMFollower --> StateStoreFollower
|
||||
|
After Width: | Height: | Size: 22 KiB |
|
@ -0,0 +1,50 @@
|
|||
# RPC
|
||||
|
||||
This section is a work in progress.
|
||||
|
||||
The RPC subsystem runs exclusively in Server Agents. It consists of two main components:
|
||||
|
||||
1. the "RPC Server" (for lack of a better term) handles multiplexing of many different
|
||||
requests on a single TCP port.
|
||||
2. RPC endpoints handle RPC requests and return responses.
|
||||
|
||||
The RPC subsystem handles requests from:
|
||||
|
||||
1. Client Agents in the local DC
|
||||
2. (if the server is a leader) other Server Agents in the local DC
|
||||
3. Server Agents in other Datacenters
|
||||
4. in-process requests from other components running in the same process (ex: the HTTP API
|
||||
or DNS interface).
|
||||
|
||||
## Routing
|
||||
|
||||
The "RPC Server" accepts requests to the [server port] and routes the requests based on
|
||||
the configuration of the Server and the first byte in the request. The diagram below shows
|
||||
all the possible routing flows.
|
||||
|
||||
[server port]: https://www.consul.io/docs/agent/options#server_rpc_port
|
||||
|
||||
![RPC Routing](./routing.svg)
|
||||
|
||||
<sup>[source](./routing.mmd)</sup>
|
||||
|
||||
The main entrypoint to RPC routing is `handleConn` in [agent/consul/rpc.go].
|
||||
|
||||
[agent/consul/rpc.go]: https://github.com/hashicorp/consul/blob/main/agent/consul/rpc.go
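The sketch below is not Consul's code; it only illustrates the general pattern that
`handleConn` follows, with invented type bytes: read the first byte of each accepted
connection and dispatch the rest of the stream to a protocol-specific handler.

```go
package rpcdemo

import (
	"bufio"
	"log"
	"net"
)

// Hypothetical protocol identifiers; Consul's real values live in agent/pool
// and are handled in agent/consul/rpc.go.
const (
	byteRPCConsul   = 0x01
	byteRPCRaft     = 0x02
	byteRPCSnapshot = 0x03
)

func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Println("accept:", err)
			return
		}
		go handleConn(conn)
	}
}

// handleConn reads a single type byte and routes the rest of the stream.
func handleConn(conn net.Conn) {
	buf := bufio.NewReader(conn)
	typ, err := buf.ReadByte()
	if err != nil {
		conn.Close()
		return
	}
	switch typ {
	case byteRPCConsul:
		handleConsulConn(buf, conn) // a net/rpc server would take over here
	case byteRPCRaft:
		handleRaftConn(buf, conn) // hand off to the raft transport layer
	case byteRPCSnapshot:
		handleSnapshotConn(buf, conn) // stream snapshot data
	default:
		log.Printf("unknown RPC type byte %#x", typ)
		conn.Close()
	}
}

func handleConsulConn(r *bufio.Reader, conn net.Conn)   { conn.Close() }
func handleRaftConn(r *bufio.Reader, conn net.Conn)     { conn.Close() }
func handleSnapshotConn(r *bufio.Reader, conn net.Conn) { conn.Close() }
```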
|
||||
|
||||
|
||||
## RPC Endpoints
|
||||
|
||||
This section is a work in progress; it will eventually cover topics like:
|
||||
|
||||
- net/rpc - (in the stdlib)
|
||||
- new grpc endpoints
|
||||
- [Streaming](./streaming)
|
||||
|
||||
|
||||
## RPC connections and load balancing
|
||||
|
||||
This section is a work in progress; it will eventually cover topics like:
|
||||
|
||||
- agent/router
|
||||
- agent/pool
|
|
@ -0,0 +1,33 @@
|
|||
graph LR
|
||||
|
||||
handleConn
|
||||
|
||||
handleConn -->|RPCConsul| handleConsulConn
|
||||
handleConn -->|RPCRaft| raftLayer
|
||||
handleConn -->|RPCTLS| handleConn
|
||||
handleConn -->|RPCMultiplexV2| handleMultiplexV2
|
||||
handleConn -->|RPCSnapshot| handleSnapshotConn
|
||||
handleConn -->|RPCTLSInsecure| handleInsecureConn
|
||||
handleConn -->|RPCGossip| handleGossipConn
|
||||
|
||||
handleConsulConn --> RPCServer
|
||||
handleMultiplexV2 --> handleConsulConn
|
||||
|
||||
%% new after 1.6.9
|
||||
|
||||
handleConn -->|PeekForTLS| handleNativeTLS
|
||||
|
||||
handleNativeTLS -->|ALPN_RPCConsul| handleConsulConn
|
||||
handleNativeTLS -->|ALPN_RPCRaft| raftLayer
|
||||
handleNativeTLS -->|ALPN_RPCMultiplexV2| handleMultiplexV2
|
||||
handleNativeTLS -->|ALPN_RPCSnapshot| handleSnapshotConn
|
||||
handleNativeTLS -->|ALPN_RPCGRPC| grpcHandler
|
||||
handleNativeTLS -->|ALPN_WANGossipPacket| handleWANGossipPacket
|
||||
handleNativeTLS -->|ALPN_WANGossipStream | handleWANGossipStream
|
||||
handleNativeTLS -->|ALPN_RPCGossip| handleGossipConn
|
||||
|
||||
handleMultiplexV2 -->|RPCGossip| handleGossipConn
|
||||
handleConn -->|RPCGRPC| grpcHandler
|
||||
|
||||
|
||||
|
After Width: | Height: | Size: 34 KiB |
|
@ -2,4 +2,6 @@
|
|||
|
||||
This section is still a work in progress.
|
||||
|
||||
1. [catalog](./catalog.md)
|
||||
1. [DNS Interface](./dns.md)
|
||||
1. health checking
|
||||
|
|
|
@ -0,0 +1,36 @@
|
|||
erDiagram
|
||||
|
||||
CheckServiceNode
|
||||
Node
|
||||
NodeService
|
||||
ServiceNode
|
||||
HealthCheck
|
||||
|
||||
CheckServiceNode ||--|| Node: has
|
||||
CheckServiceNode ||--|| NodeService: has
|
||||
CheckServiceNode ||--o{ HealthCheck: has
|
||||
|
||||
Store ||--o{ Node: "stored in the node table"
|
||||
Store ||--o{ ServiceNode: "stored in the service table"
|
||||
Store ||--o{ HealthCheck: "stored in the checks table"
|
||||
|
||||
ServiceNode ||--|| Node: references
|
||||
HealthCheck ||--o| Node: references
|
||||
HealthCheck ||--o| Service: references
|
||||
|
||||
RegisterRequest ||--o| Node: has
|
||||
RegisterRequest ||--o| NodeService: has
|
||||
RegisterRequest ||--o{ HealthCheck: has
|
||||
|
||||
|
||||
CheckDefinition
|
||||
HealthCheckDefinition
|
||||
CheckType
|
||||
|
||||
HealthCheck ||--|| HealthCheckDefinition: has
|
||||
|
||||
ServiceDefinition ||--|| NodeService: "is essentially a"
|
||||
ServiceDefinition ||--o{ CheckType: "has"
|
||||
|
||||
Config ||--o{ CheckDefinition: "has"
|
||||
Config ||--o{ ServiceDefinition: "has"
|
|
@ -0,0 +1,6 @@
|
|||
# Catalog
|
||||
|
||||
This section is a work in progress.
|
||||
|
||||
The catalog is at the core of both Service Discovery and Service Mesh. It accepts
|
||||
registrations and deregistrations of Services, Nodes, and Checks.
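As a rough illustration (using the external Go `api` client; the node, service, and check
names are made up), a single catalog registration can carry a Node, a Service, and a
Check together:

```go
package main

import (
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// One registration ties together a node, a service on that node, and a
	// health check for that service.
	_, err = client.Catalog().Register(&api.CatalogRegistration{
		Node:    "db-node-1",
		Address: "10.0.0.10",
		Service: &api.AgentService{
			ID:      "db-1",
			Service: "db",
			Port:    5432,
		},
		Check: &api.AgentCheck{
			Node:      "db-node-1",
			CheckID:   "db-1-alive",
			Name:      "db alive",
			Status:    api.HealthPassing,
			ServiceID: "db-1",
		},
	}, nil)
	if err != nil {
		log.Fatal(err)
	}
}
```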
|
|
@ -0,0 +1,24 @@
|
|||
erDiagram
|
||||
|
||||
CheckServiceNode
|
||||
Node
|
||||
NodeService
|
||||
ServiceNode
|
||||
HealthCheck
|
||||
|
||||
CheckServiceNode ||--|| Node: has
|
||||
CheckServiceNode ||--|| NodeService: has
|
||||
CheckServiceNode ||--o{ HealthCheck: has
|
||||
|
||||
Store ||--o{ Node: "stored in the node table"
|
||||
Store ||--o{ ServiceNode: "stored in the service table"
|
||||
Store ||--o{ HealthCheck: "stored in the checks table"
|
||||
|
||||
ServiceNode ||--|| Node: references
|
||||
HealthCheck ||--o| Node: references
|
||||
HealthCheck ||--o| Service: references
|
||||
|
||||
RegisterRequest ||--o| Node: has
|
||||
RegisterRequest ||--o| NodeService: has
|
||||
RegisterRequest ||--o{ HealthCheck: has
|
||||
|
|
@ -0,0 +1,12 @@
|
|||
# Service Mesh (Connect)
|
||||
|
||||
- call out: envoy/proxy is the data plane, Consul is the control plane
|
||||
- agent/xds - gRPC service that implements
|
||||
[xDS](https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol)
|
||||
- [agent/proxycfg](https://github.com/hashicorp/consul/blob/master/agent/proxycfg/proxycfg.go)
|
||||
- CA Manager - certificate authority
|
||||
- command/connect/envoy - bootstrapping and running envoy
|
||||
- command/connect/proxy - built-in proxy that is dev-only and not supported
|
||||
for production.
|
||||
- `connect/` - "Native" service mesh
|
||||
|