# Bound Keypair Joining Admin Guide

This guide covers tasks that administrators of Bound Keypair clients may need to perform over the lifespan of a bot or agent.

## Allowing additional recovery attempts

When using the `standard` recovery mode, only a configured number of recovery attempts can be made. If the limit is reached, no further recovery attempts can be made until the limit is increased.

To increase this limit and allow an expired client to join again, edit the token using `tctl edit`:

```
$ tctl edit token/example-token
```

Find the `spec.bound_keypair.recovery.limit` field and increment the limit by the desired amount. You are free to select any desired threshold. For example, consider these use cases:

- If human intervention is desired for each recovery, you can increase this value by 1. This single recovery attempt will be immediately consumed, so future recoveries will again require human intervention, and may result in downtime.

  While this approach makes downtime likely, it does ensure a human verifies the state of the client on each recovery.

- If you want human intervention for each recovery, but want to avoid downtime, you can increase this value by 2. The first attempt will be consumed immediately, but the client will have one recovery attempt for automatic future use.

  A human user can periodically audit the recovery count and client to ensure a recovery attempt is always available and the host is behaving as expected.

- Any larger value will increase the amount of time between required human interventions. You can select your tolerance for automatic client recoveries as desired.

Alternatively, if you wish to allow an unlimited number of automatic recovery attempts, [refer to the entry below](#allowing-unlimited-recovery-attempts) on the `relaxed` recovery mode.

Note that the recovery limit is always relative to the recovery counter (in the `status.bound_keypair.recovery_count` field in the token resource). It is valid to decrease the limit or set it to zero; however, doing so may prevent future recovery attempts until the limit is increased again.

Additionally, note that [join state verification](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#join-state-verification) is still required, and will prevent multiple concurrent uses of the same keypair and token. In other words, increasing the recovery limit will not allow multiple clients to join.
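
The relationship between the limit and the counter can be sketched as simple arithmetic. The helper names below are hypothetical and are not part of `tctl`, and the sketch assumes a recovery is permitted in `standard` mode only while `recovery_count` is less than `limit`; check the values on your own token before relying on it.

```shell
# Illustrative helpers only. Assumes a recovery is allowed in `standard`
# mode while recovery_count < limit.

# Print how many recoveries remain for a given limit and counter.
recoveries_remaining() {
  limit=$1
  count=$2
  if [ "$count" -ge "$limit" ]; then
    echo 0
  else
    echo $((limit - count))
  fi
}

# Print the value to write to spec.bound_keypair.recovery.limit so that
# `wanted` further recoveries are available from the current counter.
limit_for() {
  count=$1
  wanted=$2
  echo $((count + wanted))
}

# A token whose counter and limit are both 3 has no recoveries left.
recoveries_remaining 3 3
# Raising the limit to this value would allow two more recoveries.
limit_for 3 2
```

Under that assumption, the two example calls print `0` and `5`: a token with a counter of 3 needs its limit raised to 5 to cover the immediate recovery plus one spare.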

## Allowing unlimited recovery attempts

To allow unlimited recovery attempts, the `spec.bound_keypair.recovery.mode` field should be set to `relaxed`. To do this, use `tctl edit` to edit the token:

```
$ tctl edit token/example-token
```

Find or create the `spec.bound_keypair.recovery.mode` field and set the value to `relaxed`. Save the file and quit your editor to update the token.

When the recovery mode is set to `relaxed`, the `limit` field is ignored and the `status.bound_keypair.recovery_count` field may increase beyond the written limit. If the mode is later changed back to `standard`, be aware that future recovery attempts will fail unless the `limit` is increased to accommodate the current value of `recovery_count`.

Note that when `relaxed` mode is in use, [join state verification](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#join-state-verification) is still required and will prevent multiple concurrent uses of the same keypair and token. If your use case requires concurrent use, you can [disable join state verification](#disabling-join-state-verification), but doing so weakens the security of the token.

## Requesting a keypair rotation

To request a keypair rotation, set the `.spec.bound_keypair.rotate_after` field to contain a timestamp. On the next authentication attempt after that timestamp has elapsed, the bot will automatically rotate its keypair.

To simplify this process, you can use the `tctl bound-keypair rotate` helper:

```
$ tctl bound-keypair rotate token-name
```

This sets the timestamp to the current time. Note that by default bots only reauthenticate every 20 minutes, so it may take some time for the request to be acknowledged. You can monitor the rotation status by watching the token's `.status.bound_keypair.last_rotated_at` field.

If you have access to the bot host and do not want to wait for the next reauthentication, you can restart the `tbot` process, or send it a signal with `pkill -usr1 tbot`, to request an early rotation.

Note that the previous 10 keypairs are retained on the client for use in case of a cluster rollback; refer to the [cluster rollback](#recovery-after-a-cluster-rollback) section for additional information.

---

AGENT NOTE

Note that non-bot agents manage their certificates differently from bots: they generally do not reauthenticate to Teleport on a regular basis and only do so under certain conditions, like if role changes are requested (adding a new service role) or cluster CAs are rotated. Bound Keypair rotation can only happen during the bound keypair challenge process, which for agents only happens during these rare, manual events.

---

## Locking a `bound_keypair` client

The simplest way to lock out a client - either bot or agent - that joined using the `bound_keypair` join method is to use a join token lock target:

```
$ tctl lock --join-token=token-name
```

As a bound keypair token is linked to a single client, this will effectively lock it. It will not be able to reauthenticate, recover, interact with the Teleport API, or otherwise use its credentials until the lock is removed.

### Bot-specific locking notes

Note that if a bot is locked for long enough - bots have a 1 hour certificate TTL by default - its certificates will expire. If you intend to remove this lock and reinstate the bot, you may also need to increase the recovery limit (`.spec.bound_keypair.recovery.limit`) to accommodate the additional recovery attempt.

Other lock targets can also be used, but are not preferred:

- Bot instance (`tctl lock --bot-instance-id ...`): will lock only a single instance of the bot. Note that if the recovery limit allows for it, the [automatic recovery process](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#recovery) will attempt to rejoin and, if successful, will generate a new bot instance ID.
- Bot name (`tctl lock --user bot-<name>`): will lock all bots using the same bot / user. This may be overly broad and lock other instances running under this bot user.

### Locking agents by their unique ID

Similar to bots, agents have a unique ID. Find the unique ID by running:

```
$ tctl inventory ls
```

Find the desired host in the list and copy its ID from the "Server ID" column. Then, create a lock for that ID:

```
$ tctl lock --server-id=1234abcd-1234-5678-abcd-1234abcd5678
```

## Recovering a locked `bound_keypair` client

Bots or agents joined with the `bound_keypair` join method can become automatically locked under various conditions, including:

- Failing to correctly complete [join state verification](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#join-state-verification)
- Connecting with certificates that have an invalid [generation counter](https://goteleport.com/docs/reference/architecture/machine-id-architecture.md#ephemeral-token) (bots only)
- Being locked manually by a cluster admin

To recover a client that has become locked, first ensure the client's internal storage has not been compromised. This means the bot's `storage` directory (usually `/var/lib/teleport/bot`) or the agent's data directory (`/var/lib/teleport`). These locking conditions are designed to trigger if more than one client tries to join using a copy of the same certificates and private key. This can occur due to a misconfiguration or due to an attacker copying a bot's credentials, so ideally the latter should be ruled out before unlocking the client.

Next, determine the name (UUID) of the lock or locks targeting the client:

```
$ tctl get lock
kind: lock
metadata:
  name: 372af058-76d1-4e64-93da-3b04d7d03ac2
spec:
  target:
    user: bot-example
version: v2
---
kind: lock
metadata:
  name: 791d0b1d-01b4-4752-8a99-9b2908aebfae
spec:
  target:
    bot_instance_id: e7d494ae-a0ff-4d12-b935-de5e2025f667
version: v2
---
kind: lock
metadata:
  name: a69fdbb2-8e53-406a-b453-48b2cda6991d
spec:
  target:
    join_token: example-token-name
version: v2
---
kind: lock
metadata:
  name: 7a892628-97a6-4038-be36-d4739bf78109
spec:
  target:
    server_id: 3d8b0237-f2e0-4add-aec1-7f293278b35e
version: v2
```

Note the different locks and lock targets shown above. Bots can be targeted by any of these lock types:

- their Teleport user name (`bot-example`)
- the bot instance ID (a UUID)
- the join token name

Agents can be targeted by a different set of lock types:

- the join token name
- the server ID (manually created locks only)

Locks created automatically for clients using Bound Keypair Joining will typically use a `join_token` target, but a lock targeting any of these values could be created manually.

Note that locks may have a message field containing details about why the lock was created.

Once the lock name(s) have been determined, remove each using `tctl rm`:

```
$ tctl rm lock/372af058-76d1-4e64-93da-3b04d7d03ac2
```
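
If several locks target the same client, the removal can be looped. This is a minimal sketch: `remove_locks` is a hypothetical helper, and the `TCTL` variable is an assumption added here so the commands can be previewed with `TCTL=echo` before anything is deleted.

```shell
# Command used to talk to the cluster; set TCTL=echo for a dry run.
TCTL=${TCTL:-tctl}

# Remove each lock named in the arguments using `tctl rm lock/<name>`.
# Stops at the first failure so a partial removal is noticed.
remove_locks() {
  for name in "$@"; do
    "$TCTL" rm "lock/$name" || return 1
  done
}
```

For example, `remove_locks 372af058-76d1-4e64-93da-3b04d7d03ac2 a69fdbb2-8e53-406a-b453-48b2cda6991d` would remove the first and third locks from the listing above. Review each lock's target before removing it, since a lock may have been created deliberately.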

Next, reset the join state. Make a note of the token's current recovery mode (`standard` or `relaxed`), then use `tctl edit` to set it to `insecure`:

```
$ tctl edit token/example-token
```

Change the `.spec.bound_keypair.recovery.mode` field to `insecure`, save, and quit the editor.

The client is now allowed to rejoin. Given sufficient time it will retry on its own, but if you have access to the host, run the following to restart it:

- For bots: `systemctl restart tbot`
- For agents: `systemctl restart teleport`

The client should now be able to join successfully. You can monitor progress by watching for new audit events in Teleport's web UI, or by waiting for the recovery counter to increase:

```
$ tctl get token/example-token --format=json | jq '.[].status.bound_keypair.recovery_count'
```
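
To avoid re-running this command by hand, the check can be wrapped in a polling loop. The sketch below is one hedged way to do it: `fetch_recovery_count` and `wait_for_recovery` are hypothetical helper names, and the retry count and delay are arbitrary defaults rather than recommended values.

```shell
# Print the token's current recovery counter, using the same tctl and jq
# pipeline shown above.
fetch_recovery_count() {
  tctl get "token/$1" --format=json | jq '.[].status.bound_keypair.recovery_count'
}

# Poll until the counter exceeds the starting value, or give up.
# Arguments: token name, starting count, attempts (default 30),
# delay in seconds between attempts (default 10).
wait_for_recovery() {
  token=$1
  start=$2
  tries=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    current=$(fetch_recovery_count "$token")
    # A non-numeric result (for example, a failed lookup) counts as
    # "not yet recovered" and is retried.
    if [ "$current" -gt "$start" ] 2>/dev/null; then
      echo "recovered: count=$current"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for recovery" >&2
  return 1
}
```

Record the counter before restarting the client, then run, for example, `wait_for_recovery example-token 3` if the counter was 3 beforehand.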

Once the client has joined successfully, reset the recovery mode to its previous value using `tctl edit`:

```
$ tctl edit token/example-token
```

If you do suspect the client's credentials may have been compromised, you may also want to [request a keypair rotation](#requesting-a-keypair-rotation) in addition to taking other steps to ensure the host is properly secured.

## Disabling join state verification

It is occasionally useful to intentionally disable join state verification. For example, this can enable use with:

- CI/CD providers without an explicit [delegated join method](https://goteleport.com/docs/reference/deployment/join-methods.md#delegated-join-methods).
- Nodes with immutable storage that cannot store an updated join state document after each join.

Before continuing, be aware that disabling join state verification will prevent Teleport from detecting if multiple clients are joining using the same bound keypair token. In other words, if the private key is copied by an attacker, they will be able to join indefinitely. Take care to protect the keypair, and for bots, make certain to limit access from the bot identity using Teleport's [RBAC system](https://goteleport.com/docs/reference/access-controls/roles.md).

When ready, use `tctl edit` to modify the Bound Keypair token:

```
$ tctl edit token/example-token
```

Find or add the `spec.bound_keypair.recovery.mode` field and set it to `insecure`. Save and quit your editor to update the token.

With the mode set to `insecure`, the `recovery.limit` is ignored, allowing unlimited reuse of the token, and join state verification is disabled, allowing concurrent or stateless reuse.
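
The three recovery modes described in this guide differ in two behaviors: whether `recovery.limit` is enforced and whether join state verification runs. The helper below is only a summary aid for that comparison. Its name and output format are made up for illustration, and it does not query Teleport.

```shell
# Summarize how each recovery mode treats the recovery limit and join
# state verification, as described in this guide.
describe_recovery_mode() {
  case "$1" in
    standard) echo "limit=enforced join_state_verification=enabled" ;;
    relaxed)  echo "limit=ignored join_state_verification=enabled" ;;
    insecure) echo "limit=ignored join_state_verification=disabled" ;;
    *)
      echo "unknown recovery mode: $1" >&2
      return 1
      ;;
  esac
}

# Print the comparison for all three modes.
for mode in standard relaxed insecure; do
  printf '%s: %s\n' "$mode" "$(describe_recovery_mode "$mode")"
done
```

Seen this way, `relaxed` removes only the limit, while `insecure` removes both protections, which is why it is best reserved for the cases listed at the start of this section.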

## Recovery after a cluster rollback

If your Teleport cluster is rolled back for any reason, joining clients may fail [join state verification](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#join-state-verification) as their local join state document may not match the values currently (or previously) known to Teleport.

The simplest workaround is to temporarily set all bound keypair tokens to `insecure` recovery mode for the first join attempt following a cluster restore. After each client has joined, it will again have a valid join state, so the recovery mode can be restored to its previous value.

To change the recovery mode, use `tctl edit` to modify the token resource:

```
$ tctl edit token/example-token
```

Find the `spec.bound_keypair.recovery.mode` field, and set the value to `insecure`. Repeat this for each bound keypair token. Wait for all bound keypair clients to reauthenticate, then repeat this process to restore each token's recovery mode to its previous value.

If [keypairs were rotated](#requesting-a-keypair-rotation) between the snapshot and restore of the Teleport cluster, note that clients only keep a record of the previous 10 keypairs. This means server-side recovery may be impossible if the keypair expected by the restored Teleport cluster has been rotated out of the client-side history, or if the client-side history has been lost or deleted.

## Manually rotating static keys

Static keys prevent automatic key rotation as clients cannot update keys in an arbitrary - and potentially remote - keystore. However, it may still be possible to automate rotation if your environment or secret store allows you to update secrets through an API.

The specific steps needed to automate this will vary based on your environment, but the general steps are:

1. Generate a new keypair on any node using the tbot client:

   ```
   $ tbot keypair create --proxy-server example.teleport.sh:443 --static --format json
   ```

2. Parse the `.public_key` and `.private_key` values using your tool of choice, like `jq` or any other JSON parser.

3. Replace the token in Teleport to trust the new public key, using the value in the `.public_key` field:

   ```
   $ cat my-token.yaml
   version: v2
   kind: token
   metadata:
     name: my-token
   spec:
     bot_name: example-bot
     bound_keypair:
       onboarding:
         initial_public_key: <insert .public_key value here>
       recovery:
         mode: insecure
     join_method: bound_keypair
     roles:
     - Bot
   $ tctl create -f my-token.yaml
   ```

4. Insert the new private key into your keystore. This will vary depending on which keystore or provider you are using.

   - If passing the private key via an environment variable, copy the value directly
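   - If passing the private key via file, decode the base64-encoded private key first. Use the keypair generated in step 1, for example by saving that command's output to a file as shown here: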
     ```
     $ tbot keypair create --proxy-server example.teleport.sh:443 --static --format json > bound_keypair.json
     $ cat bound_keypair.json | jq -r .private_key | base64 -d
     ```
     ...and store the result as needed.

5. Future jobs should now use the new keypair.

Frequently rotating static keys can help to mitigate the security tradeoffs of `insecure` recovery. See the [concepts page](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#static-keys) for more information about static keys.

## Limitations of Bound Keypair joining for Teleport Agents

As of Teleport v18.8, the `bound_keypair` join method supports standard agent types as well as Machine & Workload Identity bots.

Note that the benefits of Bound Keypair joining for agents are more limited, so most users will probably still prefer to use the traditional `token` join method:

- Agents are ultimately still issued long-lived certificates and do not generally renew these identities outside of limited circumstances, namely during [cluster CA rotation](https://goteleport.com/docs/zero-trust-access/management/security/ca-rotation.md) or when new system roles are added to an existing agent.
- As most Agents only join a single time, Bound Keypair's [Join State Verification](https://goteleport.com/docs/reference/machine-workload-identity/bound-keypair/concepts.md#join-state-verification) is effectively unused.
- As [keypair rotation](#requesting-a-keypair-rotation) can only take place at join time, agents will only honor a keypair rotation request under certain conditions, like a role change.
- Agent identities don't have access privileges for other Teleport resources, so they don't benefit from or take advantage of the additional protections provided by Bound Keypair joining.
- The Bound Keypair join method is more complex than `token` joining, and does introduce potential new failure modes.
- Agents can only join via registration secrets (Bound Keypair's default mode) or static key files. Non-static preregistered keys are not currently supported.

That said, you might consider Bound Keypair for Agents in certain circumstances:

- You want an alternative to standard `token`-type tokens that is strictly single-use.

  Bound Keypair tokens (and their registration secrets) are strictly paired to a single host (be it a bot or agent), so providing a bound keypair token and registration secret to a user would only allow them to join a single node.

- You do not want any shared secrets in your joining process. Bound Keypair's [static keys](https://goteleport.com/docs/installation/agents/bound-keypair-static-keys.md) allow you to generate a keypair on the target node itself. Only the public key would ever leave the system, and no secrets or private key material would ever have to leave the host.

If you'd like to join an agent using Bound Keypair joining, refer to these guides:

- [Joining an agent using Bound Keypair](https://goteleport.com/docs/installation/agents/bound-keypair.md)
- [Joining an agent using Bound Keypair Static Keys](https://goteleport.com/docs/installation/agents/bound-keypair-static-keys.md)
