CLAWDINATOR Agent Notes
Read these before acting:
- docs/PHILOSOPHY.md
- docs/ARCHITECTURE.md
- docs/SHARED_MEMORY.md
- docs/SECRETS.md
- docs/POC.md
- BOOTSTRAP.md
- IDENTITY.md
- SOUL.md
- TOOLS.md
- USER.md
Memory references:
- For project goals, read memory/project.md
- For architecture decisions, read memory/architecture.md
- For ops runbook, read memory/ops.md
- For Discord context, also read memory/discord.md
Repo rule: no inline scripting languages (Python/Node/etc.) in Nix or shell blocks; put logic in script files and call them.
System ownership (3 repos):
- openclaw: upstream runtime and behavior.
- nix-openclaw: packaging/build fixes for clawbot.
- clawdinators: infra, NixOS config, secrets wiring, deployment flow.
Maintainer role:
- Monitor issues + PRs and keep an inventory of what needs human attention.
- Surface priorities and context; do not file issues or modify code unless asked.
- Track running versions (openclaw/nix-openclaw/clawdinators) and note them in `memory/ops.md`.
Toolchain workflow (repo source of truth):
- Add/remove tools in `nix/tools/clawdinator-tools.nix` (packages + descriptions).
- Tools list is rendered into `/etc/clawdinator/tools.md` by Nix and appended to workspace `TOOLS.md` at seed time.
- Keep `clawdinator/workspace/TOOLS.md` aligned with upstream template; do not hardcode tool lists there.
- When you add a new tool, verify it appears in `/etc/clawdinator/tools.md` and in the workspace `TOOLS.md` after seed.
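The verification step in the last bullet can be scripted. This is a minimal sketch, assuming the rendered list mentions each tool by name as a whole word; `tool_listed` and `tool_seeded` are hypothetical helpers, not part of this repo, and the path to the workspace `TOOLS.md` is left for the caller to supply.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo) for checking that a tool name
# shows up in the rendered tools list and in the seeded workspace file.

# Succeeds if the tool name appears as a whole word in the given file.
tool_listed() {
  local tool="$1" file="$2"
  # -F: fixed string, -q: quiet, -w: whole-word match; -- ends option parsing.
  grep -Fqw -- "$tool" "$file"
}

# Succeeds only if the tool appears in every file passed after the tool name.
tool_seeded() {
  local tool="$1"
  shift
  local file
  for file in "$@"; do
    if ! tool_listed "$tool" "$file"; then
      echo "missing: $tool not found in $file" >&2
      return 1
    fi
  done
  return 0
}
```

On a host, a call might look like `tool_seeded <tool> /etc/clawdinator/tools.md <path-to-workspace>/TOOLS.md`, which returns non-zero and names the first file where the tool is missing.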
The Zen of ~~Python~~ Moltbot, ~~by~~ shamelessly stolen from Tim Peters:
- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Complex is better than complicated.
- Flat is better than nested.
- Sparse is better than dense.
- Readability counts.
- Special cases aren't special enough to break the rules.
- Although practicality beats purity.
- Errors should never pass silently.
- Unless explicitly silenced.
- In the face of ambiguity, refuse the temptation to guess.
- There should be one-- and preferably only one --obvious way to do it.
- Although that way may not be obvious at first unless you're Dutch.
- Now is better than never.
- Although never is often better than right now.
- If the implementation is hard to explain, it's a bad idea.
- If the implementation is easy to explain, it may be a good idea.
- Namespaces are one honking great idea -- let's do more of those!
Deploy flow (automation-first):
- Use `devenv.nix` for tooling (nixos-generators, awscli2).
- Build a bootstrap NixOS image with nixos-generators (raw) and upload it to S3.
  - Use `nix/hosts/clawdinator-1-image.nix` for image builds.
- The old CI AMI/update/release workflows are intentionally disabled under `.github/workflows-disabled/`; AMI builds and deploys now require an explicit code change or a local operator run.
- Image history is bounded on purpose: raw `clawdinator-nixos-*` uploads expire automatically, and old CLAWDINATOR AMIs/snapshots are pruned after successful builds while keeping the live fleet AMI plus a short rollback window.
- Resume AMI pipeline work immediately if it stalls; do not use rsync as a workaround. Host edits are allowed but must be committed and baked into a new AMI to persist.
- CI must provide `CLAWDINATOR_AGE_KEY` to build + upload the runtime bootstrap bundle to S3.
- Bootstrap bundle location: `s3://${S3_BUCKET}/bootstrap/<instance>/` (secrets + repo seeds).
- Bootstrap S3 bucket + scoped IAM user + VM Import role with `infra/opentofu/aws` (use homelab-admin creds).
- Bootstrap AWS instances from the AMI with `infra/opentofu/aws` (set `TF_VAR_ami_id`).
- Import the image into AWS as an AMI (snapshot import + register image).
- Ensure secrets are encrypted to the baked agenix key (see `../nix/nix-secrets/secrets.nix`).
- Ensure required secrets exist: `clawdinator-github-app.pem`, `clawdinator-discord-token-<n>`, `clawdinator-control-token`, `clawdinator-control-aws-*`, `clawdinator-anthropic-api-key`.
- Update `nix/hosts/<host>.nix` (Discord allowlist, GitHub App installationId, identity name).
- Discord must use `messages.queue.byChannel.discord = "interrupt"`; `queue` delays replies to heartbeat and makes the bot appear dead.
- Ensure `/var/lib/clawd/repos/clawdinators` contains this repo (self-update requires it).
- Verify systemd services: `clawdinator`; `clawdinator-github-app-token` only on hosts that explicitly enable GitHub App auth.
- Commit and push changes; repo is the source of truth.
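The bootstrap bundle location listed in the deploy flow follows a fixed pattern, so it can be derived instead of typed by hand. This is a minimal sketch, assuming a hypothetical helper `bootstrap_bundle_uri` (not part of this repo) that lives in a script file, in line with the repo rule about keeping logic out of inline blocks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): build the S3 prefix for a
# host's bootstrap bundle, i.e. s3://<bucket>/bootstrap/<instance>/.
bootstrap_bundle_uri() {
  local bucket="${1:-}" instance="${2:-}"
  # Refuse to guess: both parts are required.
  if [ -z "$bucket" ] || [ -z "$instance" ]; then
    echo "usage: bootstrap_bundle_uri <bucket> <instance>" >&2
    return 1
  fi
  printf 's3://%s/bootstrap/%s/\n' "$bucket" "$instance"
}
```

With credentials exported, something like `aws s3 ls "$(bootstrap_bundle_uri "$S3_BUCKET" <instance>)"` should list the bundle contents for that instance.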
Bootstrap (local):
- Agenix identity is `~/.ssh/id_ed25519` (primary SSH key).
- Decrypt homelab admin creds:
  - `RULES=../nix/nix-secrets/secrets.nix agenix -d homelab-admin.age -i ~/.ssh/id_ed25519`
- OpenTofu env:
  - `TF_VAR_aws_region=eu-central-1`
  - `TF_VAR_ami_id=ami-...` (empty string skips instance creation)
  - `TF_VAR_ssh_public_key="$(cat ~/.ssh/id_ed25519.pub)"` (required when ami_id is set)
  - `TF_VAR_root_volume_size_gb=40` (bump if Nix store runs out of space)
- Run `tofu init` + `tofu apply` in `infra/opentofu/aws`.
- After apply, update CI secrets from outputs:
  - `tofu output -raw access_key_id` → `clawdinator-image-uploader-access-key-id.age`
  - `tofu output -raw secret_access_key` → `clawdinator-image-uploader-secret-access-key.age`
  - `tofu output -raw bucket_name` → `clawdinator-image-bucket-name.age`
  - `tofu output -raw aws_region` → `clawdinator-image-bucket-region.age`
  - Then `gh secret set` for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `S3_BUCKET`.
- Get the latest AMI ID:
  - `aws ec2 describe-images --region eu-central-1 --owners self --filters "Name=tag:clawdinator,Values=true" --query "Images | sort_by(@,&CreationDate)[-1].[ImageId,Name,CreationDate]" --output text`
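The rule in the OpenTofu env notes (an SSH public key is required whenever `TF_VAR_ami_id` is set, and an empty AMI ID skips instance creation) can be checked before running `tofu apply`. This is a minimal sketch; `check_tofu_env` is a hypothetical helper, not part of this repo, and it only encodes the one dependency stated in these notes.

```shell
#!/usr/bin/env bash
# Hypothetical preflight (not part of the repo) for the OpenTofu env vars.
# Encodes only the documented rule: TF_VAR_ssh_public_key is required
# when TF_VAR_ami_id is non-empty.
check_tofu_env() {
  if [ -z "${TF_VAR_ami_id:-}" ]; then
    # Empty or unset AMI ID: instance creation is skipped, no key needed.
    echo "TF_VAR_ami_id is empty: instance creation will be skipped"
    return 0
  fi
  if [ -z "${TF_VAR_ssh_public_key:-}" ]; then
    echo "TF_VAR_ssh_public_key is required when TF_VAR_ami_id is set" >&2
    return 1
  fi
  echo "TF_VAR_ami_id=${TF_VAR_ami_id}: instances will be created"
}
```

Running `check_tofu_env && tofu apply` from `infra/opentofu/aws` would stop early when the key is missing instead of failing partway through the apply.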
End-to-end SDLC (local → AMI → host) (verified):
- Decrypt AWS creds (homelab admin) and export:
  - `cd ~/code/nix/nix-secrets`
  - `RULES=./secrets.nix agenix -d homelab-admin.age -i ~/.ssh/id_ed25519 > /tmp/homelab-admin.env`
  - `set -a; source /tmp/homelab-admin.env; set +a`
  - Cleanup: `trash /tmp/homelab-admin.env`
- Build/import a new AMI explicitly. The old GitHub Actions build/deploy paths are disabled under `.github/workflows-disabled/`.
- Redeploy from the new AMI (instance replacement):
  - `devenv shell -- bash -lc "cd infra/opentofu/aws && TF_VAR_ami_id=<AMI_ID> TF_VAR_ssh_public_key=\"$(cat ~/.ssh/id_ed25519.pub)\" TF_VAR_aws_region=eu-central-1 tofu apply -auto-approve"`
- New IP:
  - `tofu output -json instance_public_ips | jq -r '."clawdinator-1"'`
  - `ssh -o StrictHostKeyChecking=accept-new root@<ip>`
- Post-deploy sanity:
  - `systemctl is-active clawdinator`
  - `systemctl is-active clawdinator-github-app-token.timer` only if the target host explicitly enables `githubApp`
  - `GH_CONFIG_DIR=/var/lib/clawd/gh gh auth status -h github.com` only if the target host explicitly enables GitHub auth
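The post-deploy sanity checks depend on whether the host opts in to GitHub App auth, which is easy to get wrong by hand. This is a minimal sketch of that branching; `sanity_units` and `check_units` are hypothetical helpers, not part of this repo, and the `true`/`false` argument stands in for however the operator knows that the host enables `githubApp`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo) for the post-deploy checks.

# Print the systemd units that should be active on a host.
# $1: "true" if the host explicitly enables GitHub App auth (default: false).
sanity_units() {
  local github_app="${1:-false}"
  echo "clawdinator"
  if [ "$github_app" = "true" ]; then
    echo "clawdinator-github-app-token.timer"
  fi
}

# Run on the host: report every expected unit that is not active.
check_units() {
  local unit rc=0
  # Unit names contain no whitespace, so word splitting is safe here.
  for unit in $(sanity_units "${1:-false}"); do
    if ! systemctl is-active --quiet "$unit"; then
      echo "inactive: $unit" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

On a host without GitHub credentials, `check_units false` only inspects `clawdinator`; `check_units true` also requires the token timer to be active.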
Important:
- Repo/workspace on host is seeded from the AMI snapshot. `git pull` is ephemeral; rebuild AMI for persistent changes.
- Any manual host fix is triage-only; always rebuild the AMI and redeploy before calling it done.
- If SSH access is lost, use SSM (instance profile is attached via OpenTofu) to re-add `/root/.ssh/authorized_keys`.
Key principle: mental notes don’t survive restarts — write it to a file.
Cattle vs pets: hosts are disposable. Prefer re-provisioning from OpenTofu + NixOS configs over in-place manual fixes.
One way only: AWS AMI pipeline via S3 + VM Import. This is a greenfield repo. Do not reference alternate paths anywhere in code or docs.