# sojuboy

An IRC bouncer companion service for soju that:

- Watches your bouncer-connected channels continuously
- Notifies you on mentions via Pushover (default)
- Stores messages in SQLite for summaries and on-demand inspection
- Generates AI digests (OpenAI by default) on a schedule or on demand
- Exposes a small HTTP API for health, tailing messages, metrics, and triggering digests

Note: this is not a bot and never replies in IRC. It passively attaches as a soju multi-client on your main account.

## Why

If you use soju as a bouncer, you may want per-client alerts and AI summaries without running a heavy IRC client all the time. This service connects to soju as a distinct client identity (e.g., `username/network@client`) and handles notifications and summaries for you. It is containerized and easy to run on a Synology or any Docker host.

## High-level architecture

- Language: Go (single static binary, low memory footprint)
- Long-lived IRC client: raw IRC using a lightweight parser (sorcix/irc) with an irssi-style handshake tailored for soju
- Message storage: SQLite via modernc.org/sqlite
- Scheduling: github.com/robfig/cron/v3
- Notifications: github.com/gregdel/pushover
- Summarization (LLM): github.com/sashabaranov/go-openai
- HTTP API: Go stdlib `net/http`

Runtime modules:

- `internal/soju`: soju connection, capability negotiation, irssi-style PASS/USER auth, joins, message ingestion, event playback, CHATHISTORY fallback
- `internal/store`: SQLite schema and queries
- `internal/notifier`: Pushover notifier (pluggable interface)
- `internal/summarizer`: OpenAI client with GPT-5 defaults, GPT-4o-mini fallback
- `internal/scheduler`: cron-based digest scheduling and daily retention job
- `internal/httpapi`: `/healthz`, `/ready`, `/tail`, `/trigger`, `/metrics`
- `internal/config`: env config loader and helpers

## Features

- Mention/keyword detection: punctuation-tolerant (letters, digits, `_` and `-` are word chars)
- Mention tuning: allow/deny channels, urgent keywords bypass quiet hours, rate limiting
- AI digest generation: concise natural summaries (no rigid sections); integrates pasted multi-line posts and referenced link context; image links sent to GPT-5 as vision inputs
- Configurable schedules (cron), quiet hours, and summary parameters
- Local persistence with retention pruning (daily at 03:00)
- HTTP endpoints: health, tail, metrics, on-demand digests
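
To make the punctuation-tolerant matching rule concrete, here is a minimal sketch that treats letters, digits, `_` and `-` as word characters and everything else as a separator. The function names are hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isWordChar reports whether r is part of a word: letters, digits, '_' and '-'.
func isWordChar(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '-'
}

// containsKeyword reports whether keyword occurs in text as a whole word.
// Case-insensitive matching is an assumption of this sketch.
func containsKeyword(text, keyword string) bool {
	if keyword == "" {
		return false
	}
	kw := strings.ToLower(keyword)
	// Split on every rune that is not a word character.
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !isWordChar(r)
	})
	for _, w := range words {
		if w == kw {
			return true
		}
	}
	return false
}

func main() {
	for _, line := range []string{
		"yourNick: are you around?",
		"(ping yourNick!)",
		"yourNick-2 just joined",
	} {
		fmt.Printf("%q -> %v\n", line, containsKeyword(line, "yourNick"))
	}
}
```

Because `-` counts as a word character, `yourNick-2` is a different word and does not trigger a mention for `yourNick`, while surrounding punctuation such as `:`, `!` or parentheses is ignored.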

## How it works

1) The service connects to soju and negotiates IRCv3 capabilities:
- Requests: `server-time`, `message-tags`, `batch`, `cap-notify`, `echo-message`, `draft/event-playback`; optional fallback `draft/chathistory` when needed
- Joins happen after numeric 001 (welcome)

2) Authentication:
- PASS then irssi-style `USER <username/network@client> <same> <host> :<realname>`
- Soju’s per-client identity preserves distinct history
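
The registration lines described in steps 1 and 2 can be sketched as follows. The capability list and the `USER` layout come from the text above; the `identity` type, the `NICK` line, and the exact ordering of `CAP REQ`, `PASS`, `NICK`, `USER` and `CAP END` are assumptions of this sketch, and a real client would wait for the server's `CAP ACK` before sending `CAP END`.

```go
package main

import (
	"fmt"
	"strings"
)

// identity is a hypothetical holder for the IRC_* settings.
type identity struct {
	Nick     string
	Username string // e.g. "yourUser/your-network@sojuboy"
	Host     string
	Realname string
	Password string
}

// wantedCaps lists the capabilities named in step 1.
var wantedCaps = []string{
	"server-time", "message-tags", "batch",
	"cap-notify", "echo-message", "draft/event-playback",
}

// handshakeLines returns the raw lines a client would send to register.
// The ordering here is an assumption, not soju's documented requirement.
func handshakeLines(id identity) []string {
	return []string{
		"CAP REQ :" + strings.Join(wantedCaps, " "),
		"PASS " + id.Password,
		"NICK " + id.Nick,
		// irssi-style USER: the username appears twice, then host and realname.
		fmt.Sprintf("USER %s %s %s :%s", id.Username, id.Username, id.Host, id.Realname),
		"CAP END",
	}
}

func main() {
	id := identity{
		Nick:     "yourNick",
		Username: "yourUser/your-network@sojuboy",
		Host:     "bnc.example.org",
		Realname: "Your Real Name",
		Password: "yourSojuClientPassword",
	}
	for _, line := range handshakeLines(id) {
		fmt.Println(line)
	}
}
```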

3) Playback and backfill:
- If `draft/event-playback` is enabled, soju replays missed messages automatically
- Optional fallback: `CHATHISTORY LATEST <channel> timestamp=<RFC3339Nano> <limit>` using the last stored timestamp per channel (disabled by default)

4) Messages and mentions:
- Each `PRIVMSG` is stored with server-time when available
- Mentions trigger Pushover notifications subject to quiet hours, urgency, and rate limits
- Debug logs include: mention delivered or suppression reason (backfill, quiet hours, rate limit)

5) Summarization:
- `/trigger` or the scheduler loads a window and calls OpenAI
- GPT-5 context: ~272k input tokens + up to 128k output tokens (400k total)
- Summaries are concise/natural and integrate multi-line posts, article text (readability-extracted), and image links (vision)

6) HTTP API:
- `/healthz` → `200 ok`
- `/ready` → `200` only when connected to soju
- `/tail?channel=%23chan&limit=N` → plaintext tail (chronological)
- `/trigger?channel=%23chan&window=6h` → returns digest and sends via notifier
- `/metrics` → Prometheus text metrics
- Protect `/tail` and `/trigger` with `HTTP_TOKEN`, supplied as a Bearer token, a `token` query parameter, an `X-Auth-Token` header, or basic auth (`token:<HTTP_TOKEN>`)

## Health and readiness

- `/healthz` always returns 200
- `/ready` returns 200 only when connected to soju
- Binary supports `--health` to perform a local readiness check and exit 0/1. Example Docker healthcheck:

```yaml
healthcheck:
  test: ["CMD", "/sojuboy", "--health"]
  interval: 30s
  timeout: 3s
  retries: 3
```

## Installation

### Prerequisites

- Docker (or Synology Container Manager)
- A soju bouncer you can connect to
- Pushover account and app token (for push)
- OpenAI API key (for AI summaries)

### Build and run (Docker Compose)

1) Create `.env` in repo root (see example below)

2) Start:

```bash
docker-compose up -d --build
```

3) Health check:

```bash
curl -s http://localhost:8080/healthz
```

4) Tail last messages (remember to URL-encode `#` as `%23`):

```bash
curl -s "http://localhost:8080/tail?channel=%23animaniacs&limit=50" \
  -H "Authorization: Bearer $HTTP_TOKEN"
```

5) Trigger a digest for the last 6 hours:

```bash
curl -s "http://localhost:8080/trigger?channel=%23animaniacs&window=6h" \
  -H "Authorization: Bearer $HTTP_TOKEN"
```

6) Metrics:

```bash
curl -s http://localhost:8080/metrics
```

## Quick start (Docker Compose)

```bash
docker-compose up -d --build
# wait for healthy
docker inspect --format='{{json .State.Health}}' sojuboy | jq
```

Compose includes a healthcheck calling the binary’s `--health` flag, which returns 0 only when `/ready` is 200.

## Configuration options

You can configure the service via a `.env` file or an inline `environment:` block in your compose YAML. Both approaches are shown below. Defaults for all variables are listed in the table after the examples.

### Option A: .env file (recommended for development)

The example below shows maximum or large-but-reasonable values. Defaults are noted where they are also the maximum or when otherwise relevant.

```env
# soju / IRC
SOJU_HOST=bnc.example.org
SOJU_PORT=6697
SOJU_TLS=true
SOJU_NETWORK=your-network

# Client identity: include client suffix for per-client history in soju
IRC_NICK=yourNick
IRC_USERNAME=yourUser/your-network@sojuboy
IRC_REALNAME=Your Real Name
IRC_PASSWORD=yourSojuClientPassword

# Channels to auto-join (comma-separated)
CHANNELS=#animaniacs,#general
KEYWORDS=yourNick,YourCompany

# Auth method hint (raw is used; value is ignored but kept for compatibility)
SOJU_AUTH=raw

# Notifier (Pushover)
NOTIFIER=pushover
PUSHOVER_USER_KEY=your-pushover-user-key
PUSHOVER_API_TOKEN=your-pushover-app-token

# Summarizer (OpenAI)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-5
# Max completion (output) tokens for GPT-5 is ~128k (model limit). Default 128000.
OPENAI_MAX_TOKENS=128000
# Summarizer tuning
SUMM_FOLLOW_LINKS=true # default true
SUMM_LINK_TIMEOUT=20s # default 20s
SUMM_LINK_MAX_BYTES=1048576 # default 1048576 (1 MiB/article)
SUMM_GROUP_WINDOW=120s # default 120s
SUMM_MAX_LINKS=20 # default 20
SUMM_MAX_GROUPS=20000 # default 0 (no cap); example large
SUMM_TIMEOUT=10m # request timeout; default 10m

# Digests
DIGEST_CRON=0 */6 * * * # every 6 hours
DIGEST_WINDOW=24h # default 24h
QUIET_HOURS= # e.g., 22:00-07:00

# Mentions/alerts
NOTIFY_BACKFILL=false # default false
MENTION_MIN_INTERVAL=30s # no hard max; rate-limit between alerts
MENTIONS_ONLY_CHANNELS= # optional allow-list (CSV)
MENTIONS_DENY_CHANNELS= # optional deny-list (CSV)
URGENT_KEYWORDS=urgent,priority # bypass quiet hours

# HTTP API
HTTP_LISTEN=:8080
HTTP_TOKEN=put-a-long-random-token-here

# Storage
STORE_PATH=/data/app.db
STORE_RETENTION_DAYS=365 # default 365

# Logging
LOG_LEVEL=info
```

Compose (with localhost bind suitable for Synology reverse proxy):

```yaml
services:
  sojuboy:
    image: code.cravey.net/your-user/sojuboy:v0.1.0-beta1
    restart: unless-stopped
    env_file: .env
    ports:
      - "127.0.0.1:8080:8080" # bind only to localhost; fronted by DSM Reverse Proxy
    volumes:
      - /volume1/docker/sojuboy/data:/data
    healthcheck:
      test: ["CMD", "/sojuboy", "--health"]
      interval: 30s
      timeout: 3s
      retries: 3
```

### Option B: Inline environment in compose (no .env)

```yaml
services:
  sojuboy:
    image: code.cravey.net/your-user/sojuboy:v0.1.0-beta1
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080" # bind only to localhost; fronted by DSM Reverse Proxy
    volumes:
      - /volume1/docker/sojuboy/data:/data
    environment:
      SOJU_HOST: "bnc.example.org" # default 127.0.0.1
      SOJU_PORT: "6697" # default 6697
      SOJU_TLS: "true" # default true
      SOJU_NETWORK: "your-network" # default ""
      IRC_NICK: "yourNick" # default sojuboy
      IRC_USERNAME: "yourUser/your-network@sojuboy" # default IRC_NICK
      IRC_REALNAME: "Your Real Name" # default sojuboy
      IRC_PASSWORD: "yourSojuClientPassword" # default ""
      CHANNELS: "#animaniacs,#general" # default "" (none)
      KEYWORDS: "yourNick,YourCompany" # default IRC_NICK
      SOJU_AUTH: "raw" # default sasl (hint only)
      NOTIFIER: "pushover" # default pushover
      PUSHOVER_USER_KEY: "..." # default ""
      PUSHOVER_API_TOKEN: "..." # default ""
      LLM_PROVIDER: "openai" # default openai
      OPENAI_API_KEY: "sk-..." # default ""
      OPENAI_BASE_URL: "https://api.openai.com/v1" # default ""
      OPENAI_MODEL: "gpt-5" # default gpt-5
      OPENAI_MAX_TOKENS: "128000" # default 128000
      SUMM_FOLLOW_LINKS: "true" # default true
      SUMM_LINK_TIMEOUT: "20s" # default 20s
      SUMM_LINK_MAX_BYTES: "1048576" # default 1048576
      SUMM_GROUP_WINDOW: "120s" # default 120s
      SUMM_MAX_LINKS: "20" # default 20
      SUMM_MAX_GROUPS: "20000" # default 0 (no cap)
      SUMM_TIMEOUT: "10m" # default 10m
      DIGEST_CRON: "0 */6 * * *" # default 0 */6 * * *
      DIGEST_WINDOW: "24h" # default 24h
      QUIET_HOURS: "" # default ""
      NOTIFY_BACKFILL: "false" # default false
      MENTION_MIN_INTERVAL: "30s" # default 30s
      MENTIONS_ONLY_CHANNELS: "" # default ""
      MENTIONS_DENY_CHANNELS: "" # default ""
      URGENT_KEYWORDS: "urgent,priority" # default ""
      HTTP_LISTEN: ":8080" # default :8080
      HTTP_TOKEN: "<long-random-token>" # default ""
      STORE_PATH: "/data/app.db" # default /data/app.db
      STORE_RETENTION_DAYS: "365" # default 365
      LOG_LEVEL: "info" # default info
    healthcheck:
      test: ["CMD", "/sojuboy", "--health"]
      interval: 30s
      timeout: 3s
      retries: 3
```

### Defaults reference

| Variable | Default |
|---|---|
| SOJU_HOST | 127.0.0.1 |
| SOJU_PORT | 6697 |
| SOJU_TLS | true |
| IRC_NICK | sojuboy |
| IRC_USERNAME | IRC_NICK |
| IRC_REALNAME | sojuboy |
| IRC_PASSWORD | (empty) |
| SOJU_NETWORK | (empty) |
| CHANNELS | (empty) |
| KEYWORDS | IRC_NICK |
| SOJU_AUTH | sasl |
| NOTIFIER | pushover |
| PUSHOVER_USER_KEY | (empty) |
| PUSHOVER_API_TOKEN | (empty) |
| LLM_PROVIDER | openai |
| OPENAI_API_KEY | (empty) |
| OPENAI_BASE_URL | (empty) |
| OPENAI_MODEL | gpt-5 |
| OPENAI_MAX_TOKENS | 700 |
| SUMM_FOLLOW_LINKS | true |
| SUMM_LINK_TIMEOUT | 6s |
| SUMM_LINK_MAX_BYTES | 262144 |
| SUMM_GROUP_WINDOW | 90s |
| SUMM_MAX_LINKS | 5 |
| SUMM_MAX_GROUPS | 0 |
| SUMM_TIMEOUT | 5m |
| DIGEST_CRON | 0 */6 * * * |
| DIGEST_WINDOW | 6h |
| QUIET_HOURS | (empty) |
| NOTIFY_BACKFILL | false |
| MENTION_MIN_INTERVAL | 30s |
| MENTIONS_ONLY_CHANNELS | (empty) |
| MENTIONS_DENY_CHANNELS | (empty) |
| URGENT_KEYWORDS | (empty) |
| HTTP_LISTEN | :8080 |
| HTTP_TOKEN | (empty) |
| STORE_PATH | /data/app.db |
| STORE_RETENTION_DAYS | 7 |
| LOG_LEVEL | info |

## Pushover setup

1) Install the Pushover iOS app and log in
2) Get your User Key (in the app or on the website)
3) Create an application at `pushover.net/apps/build` to get an API token
4) Put them in `.env` as `PUSHOVER_USER_KEY` and `PUSHOVER_API_TOKEN`

## OpenAI setup

- Set `OPENAI_API_KEY`
- Set `OPENAI_BASE_URL` to exactly `https://api.openai.com/v1`
- If `gpt-5` isn’t available on your account, use a supported model like `gpt-4o-mini`
- GPT-5 limits: ~272k input + 128k output tokens (400k context)

## HTTP API

- `GET /healthz` → `200 ok`
- `GET /tail?channel=%23chan&limit=50`
  - Returns plaintext messages (chronological)
  - Auth: provide `HTTP_TOKEN` as a Bearer token (or query param `token=`)
- `GET /trigger?channel=%23chan&window=6h`
  - Returns plaintext digest
  - Also sends via notifier when configured
  - Auth as above
- `GET /metrics`
  - Prometheus metrics: `sojuboy_messages_ingested_total`, `sojuboy_notifications_sent_total`, `sojuboy_messages_pruned_total`, `sojuboy_connected`

## Troubleshooting

- Empty tail while there’s activity
  - Ensure the service logs `join requested:` followed by `joined` for your channels
  - Confirm `.env` `CHANNELS` contains your channels
  - Check `/metrics` and the logs for recent message ingestion

- 401 Unauthorized from `/tail` or `/trigger`
  - Provide `Authorization: Bearer $HTTP_TOKEN` or `?token=$HTTP_TOKEN`

- OpenAI 502/URL errors
  - Ensure `OPENAI_BASE_URL=https://api.openai.com/v1`
  - Try `OPENAI_MODEL=gpt-4o-mini` if `gpt-5` isn’t enabled for your account

## Roadmap

- Additional notifiers (ntfy, Telegram)
- Long-form HTML digest rendering
- Admin endpoints (e.g., `/join?channel=%23chan`)

## Development notes

Project layout (selected):

- `cmd/sojuboy/main.go` – entrypoint, wiring config/services
- `internal/soju` – soju connector and ingestion
- `internal/store` – SQLite schema and queries
- `internal/notifier` – Pushover notifier
- `internal/summarizer` – OpenAI client and prompts
- `internal/httpapi` – health, tail, trigger, metrics endpoints
- `internal/scheduler` – cron jobs

Go toolchain: see `go.mod` (Go 1.23). The Dockerfile builds a static binary for a distroless image.

## License

MIT for code dependencies; this repository’s license will follow your preference (add a LICENSE if needed).