Thomas Cravey 575edf2c73 fix(http): avoid err redeclare in /trigger handler		2025-09-05 07:20:41 -05:00
cmd/sojuboy	feat(summarizer): add SummarizeForPush and use for Pushover; keep full WebUI/on-demand output; clamp only on push\ndocs: add AGENTS.md; revert CLAUDE.md release section	2025-09-05 06:58:38 -05:00
internal	fix(http): avoid err redeclare in /trigger handler	2025-09-05 07:20:41 -05:00
.dockerignore	defaults: raise max/defaults (OPENAI_MAX_TOKENS=128000, larger summarizer timeouts/limits, DIGEST_WINDOW=24h, RETENTION=365); docs: add inline env option, defaults table; compose: bind 127.0.0.1:8080	2025-08-16 12:29:58 -05:00
.gitignore	feat: initial Beta 1 release	2025-08-15 18:06:28 -05:00
AGENTS.md	feat(summarizer): add SummarizeForPush and use for Pushover; keep full WebUI/on-demand output; clamp only on push\ndocs: add AGENTS.md; revert CLAUDE.md release section	2025-09-05 06:58:38 -05:00
CHANGELOG.md	docs: update README for Web UI, SSE, link cards/summary; add CHANGELOG Beta 2 (v0.2.0)	2025-08-17 20:37:27 -05:00
CLAUDE.md	feat(summarizer): add SummarizeForPush and use for Pushover; keep full WebUI/on-demand output; clamp only on push\ndocs: add AGENTS.md; revert CLAUDE.md release section	2025-09-05 06:58:38 -05:00
docker-compose.yml	defaults: raise max/defaults (OPENAI_MAX_TOKENS=128000, larger summarizer timeouts/limits, DIGEST_WINDOW=24h, RETENTION=365); docs: add inline env option, defaults table; compose: bind 127.0.0.1:8080	2025-08-16 12:29:58 -05:00
Dockerfile	feat: initial Beta 1 release	2025-08-15 18:06:28 -05:00
go.mod	ui: single footer with version+hash; summarizer gets navbar and channel dropdown; cookie auth redirect on /summarizer	2025-08-16 20:23:12 -05:00
go.sum	docs: expand .env example to show max/large values; add SUMM_TIMEOUT and summarizer tunables\n\nfeat: summarizer improvements\n- readability extraction for articles\n- image links passed to model as vision inputs\n- configurable max groups/links/bytes and timeout\n- higher default ceilings; resilient fallback summary	2025-08-15 20:41:31 -05:00
README.md	docs: update README for Web UI, SSE, link cards/summary; add CHANGELOG Beta 2 (v0.2.0)	2025-08-17 20:37:27 -05:00
TODO.md	feat(summarizer): add SummarizeForPush and use for Pushover; keep full WebUI/on-demand output; clamp only on push\ndocs: add AGENTS.md; revert CLAUDE.md release section	2025-09-05 06:58:38 -05:00

README.md

sojuboy

An IRC bouncer companion service for soju that:

Watches your bouncer-connected channels continuously
Notifies you on mentions via Pushover (default)
Stores messages in SQLite for summaries and on-demand inspection
Generates AI digests (OpenAI by default) on schedule or on demand
Exposes a small HTTP API and a minimal Web UI (Pico.css) for status, tail, history, link cards, and on-demand summaries

Note: this is not a bot and never replies in IRC. It passively attaches as a soju multi-client on your main account.

Why

If you use soju as a bouncer, you may want per-client alerts and AI summaries without running a heavy IRC client all the time. This service connects to soju as a distinct client identity (e.g., username/network@client) and handles notifications and summaries for you, containerized and easy to run on a Synology or any Docker host.

High-level architecture

Language: Go (single static binary, low memory footprint)
Long-lived IRC client: raw IRC using a lightweight parser (sorcix/irc) with an irssi-style handshake tailored for soju
Message storage: SQLite via modernc.org/sqlite (WAL enabled)
Scheduling: github.com/robfig/cron/v3
Notifications: github.com/gregdel/pushover
Summarization (LLM): github.com/sashabaranov/go-openai
HTTP API + Web UI: Go stdlib net/http + html/template + embedded static assets

Runtime modules:

internal/soju: soju connection, capability negotiation, irssi-style PASS/USER auth, joins, message ingestion, event playback, CHATHISTORY fallback
internal/store: SQLite schema and queries
internal/notifier: Pushover notifier (pluggable interface)
internal/summarizer: OpenAI client with GPT-5 defaults, GPT-4o-mini fallback; separate link-summarization prompt
internal/scheduler: cron-based digest scheduling and daily retention job
internal/httpapi: /healthz, /ready, /tail, /trigger, /metrics, Web UI and JSON APIs
internal/config: env config loader and helpers

Features

Mention/keyword detection: punctuation-tolerant (letters, digits, _ and - are word chars)
Mention tuning: allow/deny channels, urgent keywords bypass quiet hours, rate limiting
AI digest generation: concise natural summaries (no rigid sections); integrates pasted multi-line posts and referenced link context; image links sent to GPT‑5 as vision inputs
Configurable schedules (cron), quiet hours, and summary parameters
Local persistence with retention pruning (daily at 03:00)
Web UI with:
- Realtime chat tail via SSE; auto-scroll to bottom; preload older history with infinite scroll-up
- Link cards with OG/Twitter metadata (X posts via oEmbed), YouTube oEmbed embeds, direct image previews
- Inline on-demand link summarization with caching (24h), and a single summarize toggle (🌚/🌝)
- Channel selector in the menubar, login interstitial using HTTP_TOKEN
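The punctuation-tolerant matching described in the first feature can be sketched as a whole-word search in which letters, digits, `_` and `-` count as word characters. This is a minimal illustration, not the service's actual code: the function names are hypothetical, and case-insensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isWordChar reports whether r counts as part of a word:
// letters, digits, '_' and '-'.
func isWordChar(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '-'
}

// containsMention reports whether keyword occurs in text as a whole word.
// Matching is case-insensitive, which is an assumption of this sketch.
func containsMention(text, keyword string) bool {
	if keyword == "" {
		return false
	}
	t := []rune(strings.ToLower(text))
	k := []rune(strings.ToLower(keyword))
	for i := 0; i+len(k) <= len(t); i++ {
		if string(t[i:i+len(k)]) != string(k) {
			continue
		}
		// The match only counts when it is not glued to other word chars.
		beforeOK := i == 0 || !isWordChar(t[i-1])
		afterOK := i+len(k) == len(t) || !isWordChar(t[i+len(k)])
		if beforeOK && afterOK {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsMention("hey yourNick: ping", "yourNick"))         // trailing colon is tolerated
	fmt.Println(containsMention("(yourNick), are you there?", "yourNick")) // surrounding punctuation is tolerated
	fmt.Println(containsMention("yourNick-bot joined", "yourNick"))        // '-' is a word char, so this is a different word
}
```

Treating `-` and `_` as word characters keeps nicks such as `yourNick-bot` or `yourNick_away` from triggering alerts for `yourNick`.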

How it works

The service connects to soju and negotiates IRCv3 capabilities:
- Requests: server-time, message-tags, batch, cap-notify, echo-message, draft/event-playback; optional fallback draft/chathistory when needed
- Joins happen after numeric 001 (welcome)
Authentication:
- PASS then irssi-style USER <username/network@client> <same> <host> :<realname>
- Soju’s per-client identity preserves distinct history
Playback and backfill:
- If draft/event-playback is enabled, soju replays missed messages automatically
- Optional fallback: CHATHISTORY LATEST <channel> timestamp=<RFC3339Nano> <limit> using the last stored timestamp per channel (disabled by default)
Messages and mentions:
- Each PRIVMSG is stored with server-time when available
- Mentions trigger Pushover notifications subject to quiet hours, urgency, and rate limits
Summarization:
- Digests: /trigger or the scheduler loads a window and calls OpenAI with a conversation-focused prompt
- Link summaries: dedicated prompt that ignores chat context; fetches page content with readability; includes oEmbed hints for YouTube and X; passes images to vision models
HTTP + JSON API:
- /healthz → 200 ok
- /ready → 200 only when connected to soju
- /tail?channel=#chan&limit=N → JSON tail for UI
- /history?channel=#chan&before=<RFC3339>&limit=N → JSON older messages (infinite scroll)
- /trigger?channel=#chan&window=6h → returns digest JSON and (optionally) pushes via notifier
- /linkcard?url=... → card JSON (title/desc/image or embed HTML)
- /linksummary?url=... → brief AI summary of a single URL (cached 24h)
- /metrics → Prometheus text metrics
- Protect UI + JSON with HTTP_TOKEN cookie; APIs also allow Bearer/query token
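The token protection in the last item could look roughly like the middleware below, which accepts the token from a cookie, a Bearer header, or a `token` query parameter. This is a sketch under assumptions: the cookie name `auth_token`, the function names, and the choice to reject every request when no token is configured are all hypothetical and may differ from the real service.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// cookieName is a hypothetical cookie name used only for this sketch.
const cookieName = "auth_token"

// tokenFromRequest looks for a token in the cookie, then the Authorization
// header, then the "token" query parameter.
func tokenFromRequest(r *http.Request) string {
	if c, err := r.Cookie(cookieName); err == nil && c.Value != "" {
		return c.Value
	}
	if h := r.Header.Get("Authorization"); strings.HasPrefix(h, "Bearer ") {
		return strings.TrimPrefix(h, "Bearer ")
	}
	return r.URL.Query().Get("token")
}

// requireToken wraps next and answers 401 unless the request carries the
// expected token. An empty expected token rejects everything (assumption:
// this sketch fails closed).
func requireToken(expected string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := tokenFromRequest(r)
		if expected == "" || subtle.ConstantTimeCompare([]byte(got), []byte(expected)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one GET through the middleware and returns the status code.
func statusFor(target, bearer string) int {
	h := requireToken("s3cret", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, target, nil)
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/tail?channel=%23chan", ""))              // no token supplied
	fmt.Println(statusFor("/tail?channel=%23chan", "s3cret"))        // Bearer header
	fmt.Println(statusFor("/tail?channel=%23chan&token=s3cret", "")) // query parameter
}
```

The constant-time comparison avoids leaking how many leading characters of a guessed token were correct.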

Health and readiness

/healthz always returns 200
/ready returns 200 only when connected to soju
The binary supports a --health flag that performs a local readiness check and exits with 0 or 1. Example Docker healthcheck:

healthcheck:
  test: ["CMD", "/sojuboy", "--health"]
  interval: 30s
  timeout: 3s
  retries: 3
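One way such a --health flag might be wired is shown below: the process probes its own /ready endpoint and maps the result to an exit code. This is a sketch only. The probe address assumes the default HTTP_LISTEN of :8080, the 2s client timeout is chosen to stay under the 3s Docker timeout, and `checkReady` is a hypothetical helper rather than the project's function.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"os"
	"time"
)

// checkReady performs one GET against readyURL and returns nil only on HTTP 200.
func checkReady(readyURL string) error {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(readyURL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("not ready: status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	health := flag.Bool("health", false, "probe the local /ready endpoint and exit 0 or 1")
	flag.Parse()
	if *health {
		// Assumed address; it matches the default HTTP_LISTEN of :8080.
		if err := checkReady("http://127.0.0.1:8080/ready"); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		os.Exit(0)
	}
	fmt.Println("normal startup would continue here")
}
```

Building the probe into the binary means the container needs no curl or shell, which suits a distroless image.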

Installation

Prerequisites

Docker (or Synology Container Manager)
A soju bouncer you can connect to
Pushover account and app token (for push)
OpenAI API key (for AI summaries)

Build and run (Docker Compose)

Create .env in repo root (see example below)
Start:

docker-compose up -d --build

Health check:

curl -s http://localhost:8080/healthz

Tail last messages (remember to URL-encode # as %23):

curl -s "http://localhost:8080/tail?channel=%23animaniacs&limit=50" \
  -H "Authorization: Bearer $HTTP_TOKEN"

Trigger a digest for the last 6 hours:

curl -s "http://localhost:8080/trigger?channel=%23animaniacs&window=6h" \
  -H "Authorization: Bearer $HTTP_TOKEN"

Metrics:

curl -s http://localhost:8080/metrics
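The same calls can be made from Go. The sketch below builds the request URLs with net/url so that `#` in a channel name is encoded as `%23` automatically, then attaches the Bearer token. The helper names `endpointURL` and `newRequest` are hypothetical; the request is prepared but not sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// endpointURL builds a URL for one of the JSON endpoints. url.Values.Encode
// escapes the channel name, so "#animaniacs" becomes "%23animaniacs".
func endpointURL(base, path, channel string, extra map[string]string) string {
	q := url.Values{}
	q.Set("channel", channel)
	for k, v := range extra {
		q.Set(k, v)
	}
	return base + path + "?" + q.Encode()
}

// newRequest prepares an authenticated GET; the caller sends it with an http.Client.
func newRequest(rawURL, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	base := "http://localhost:8080"
	tail := endpointURL(base, "/tail", "#animaniacs", map[string]string{"limit": "50"})
	trigger := endpointURL(base, "/trigger", "#animaniacs", map[string]string{"window": "6h"})
	fmt.Println(tail)
	fmt.Println(trigger)

	req, err := newRequest(trigger, os.Getenv("HTTP_TOKEN"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(req.Method, req.URL.String())
}
```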

Quick start (Docker Compose)

docker-compose up -d --build
# wait for healthy
docker inspect --format='{{json .State.Health}}' sojuboy | jq

Compose includes a healthcheck calling the binary’s --health flag, which returns 0 only when /ready is 200.

Configuration options

You can configure the service through a .env file or an inline environment: block in your compose YAML. Both approaches are shown below, and the defaults for all variables are listed in the table after the examples.

Option A: .env file (recommended for development)

The example below uses maximum or large but reasonable values. A default is noted where it is also the maximum, or where it is otherwise relevant.

# soju / IRC
SOJU_HOST=bnc.example.org
SOJU_PORT=6697
SOJU_TLS=true
SOJU_NETWORK=your-network

# Client identity: include client suffix for per-client history in soju
IRC_NICK=yourNick
IRC_USERNAME=yourUser/your-network@sojuboy
IRC_REALNAME=Your Real Name
IRC_PASSWORD=yourSojuClientPassword

# Channels to auto-join (comma-separated)
CHANNELS=#animaniacs,#general
KEYWORDS=yourNick,YourCompany

# Auth method hint (raw is used; value is ignored but kept for compatibility)
SOJU_AUTH=raw

# Notifier (Pushover)
NOTIFIER=pushover
PUSHOVER_USER_KEY=your-pushover-user-key
PUSHOVER_API_TOKEN=your-pushover-app-token

# Summarizer (OpenAI)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-5
# Max completion (output) tokens for GPT‑5 is ~128k (model limit). Default 128000.
OPENAI_MAX_TOKENS=128000
# Summarizer tuning
SUMM_FOLLOW_LINKS=true            # default true
SUMM_LINK_TIMEOUT=20s             # default 20s
SUMM_LINK_MAX_BYTES=1048576       # default 1048576 (1 MiB/article)
SUMM_GROUP_WINDOW=120s            # default 120s
SUMM_MAX_LINKS=20                 # default 20
SUMM_MAX_GROUPS=20000             # default 0 (no cap); example large
SUMM_TIMEOUT=10m                  # request timeout; default 10m

# Digests
DIGEST_CRON=0 */6 * * *           # every 6 hours
DIGEST_WINDOW=24h                 # default 24h
QUIET_HOURS=                      # e.g., 22:00-07:00

# Mentions/alerts
NOTIFY_BACKFILL=false             # default false
MENTION_MIN_INTERVAL=30s          # no hard max; rate-limit between alerts
MENTIONS_ONLY_CHANNELS=           # optional allow-list (CSV)
MENTIONS_DENY_CHANNELS=           # optional deny-list (CSV)
URGENT_KEYWORDS=urgent,priority   # bypass quiet hours

# HTTP API
HTTP_LISTEN=:8080
HTTP_TOKEN=put-a-long-random-token-here

# Storage
STORE_PATH=/data/app.db
STORE_RETENTION_DAYS=365          # default 365

# Logging
LOG_LEVEL=info
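A QUIET_HOURS value such as 22:00-07:00 spans midnight, so a plain "start <= now < end" comparison is not enough. The sketch below shows one way to evaluate such a window. It is illustrative only: the function names are hypothetical, and treating the start as inclusive, the end as exclusive, and an empty value as "no quiet hours" are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseClock converts "HH:MM" into minutes since midnight.
func parseClock(s string) (int, error) {
	t, err := time.Parse("15:04", strings.TrimSpace(s))
	if err != nil {
		return 0, err
	}
	return t.Hour()*60 + t.Minute(), nil
}

// inQuietHours reports whether minuteOfDay falls inside spec ("HH:MM-HH:MM").
// The start is inclusive and the end exclusive; an empty spec disables
// quiet hours (both are assumptions of this sketch).
func inQuietHours(spec string, minuteOfDay int) (bool, error) {
	if strings.TrimSpace(spec) == "" {
		return false, nil
	}
	parts := strings.SplitN(spec, "-", 2)
	if len(parts) != 2 {
		return false, fmt.Errorf("quiet hours %q: want HH:MM-HH:MM", spec)
	}
	start, err := parseClock(parts[0])
	if err != nil {
		return false, err
	}
	end, err := parseClock(parts[1])
	if err != nil {
		return false, err
	}
	if start <= end {
		return minuteOfDay >= start && minuteOfDay < end, nil
	}
	// The window wraps past midnight, e.g. 22:00-07:00.
	return minuteOfDay >= start || minuteOfDay < end, nil
}

func main() {
	now := time.Now()
	quiet, err := inQuietHours("22:00-07:00", now.Hour()*60+now.Minute())
	fmt.Println(quiet, err)
}
```

A notifier would consult this check before pushing and skip it for messages matching URGENT_KEYWORDS.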

Compose (with localhost bind suitable for Synology reverse proxy):

services:
  sojuboy:
    image: code.cravey.net/your-user/sojuboy:v0.2.0-beta2
    restart: unless-stopped
    env_file: .env
    ports:
      - "127.0.0.1:8080:8080"  # bind only to localhost; fronted by DSM Reverse Proxy
    volumes:
      - /volume1/docker/sojuboy/data:/data
    healthcheck:
      test: ["CMD", "/sojuboy", "--health"]
      interval: 30s
      timeout: 3s
      retries: 3

Option B: Inline environment in compose (no .env)

services:
  sojuboy:
    image: code.cravey.net/your-user/sojuboy:v0.2.0-beta2
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"  # bind only to localhost; fronted by DSM Reverse Proxy
    volumes:
      - /volume1/docker/sojuboy/data:/data
    environment:
      SOJU_HOST: "bnc.example.org"           # default 127.0.0.1
      SOJU_PORT: "6697"                      # default 6697
      SOJU_TLS: "true"                       # default true
      SOJU_NETWORK: "your-network"           # default ""
      IRC_NICK: "yourNick"                   # default sojuboy
      IRC_USERNAME: "yourUser/your-network@sojuboy"  # default IRC_NICK
      IRC_REALNAME: "Your Real Name"         # default sojuboy
      IRC_PASSWORD: "yourSojuClientPassword" # default ""
      CHANNELS: "#animaniacs,#general"       # default "" (none)
      KEYWORDS: "yourNick,YourCompany"       # default IRC_NICK
      SOJU_AUTH: "raw"                        # default sasl (hint only)
      NOTIFIER: "pushover"                   # default pushover
      PUSHOVER_USER_KEY: "..."               # default ""
      PUSHOVER_API_TOKEN: "..."              # default ""
      LLM_PROVIDER: "openai"                 # default openai
      OPENAI_API_KEY: "sk-..."               # default ""
      OPENAI_BASE_URL: "https://api.openai.com/v1"  # default ""
      OPENAI_MODEL: "gpt-5"                  # default gpt-5
      OPENAI_MAX_TOKENS: "128000"            # default 128000
      SUMM_FOLLOW_LINKS: "true"              # default true
      SUMM_LINK_TIMEOUT: "20s"               # default 20s
      SUMM_LINK_MAX_BYTES: "1048576"         # default 1048576
      SUMM_GROUP_WINDOW: "120s"              # default 120s
      SUMM_MAX_LINKS: "20"                   # default 20
      SUMM_MAX_GROUPS: "20000"               # default 0 (no cap)
      SUMM_TIMEOUT: "10m"                    # default 10m
      DIGEST_CRON: "0 */6 * * *"             # default 0 */6 * * *
      DIGEST_WINDOW: "24h"                    # default 24h
      QUIET_HOURS: ""                         # default ""
      NOTIFY_BACKFILL: "false"               # default false
      MENTION_MIN_INTERVAL: "30s"            # default 30s
      MENTIONS_ONLY_CHANNELS: ""             # default ""
      MENTIONS_DENY_CHANNELS: ""             # default ""
      URGENT_KEYWORDS: "urgent,priority"     # default ""
      HTTP_LISTEN: ":8080"                   # default :8080
      HTTP_TOKEN: "<long-random-token>"      # default ""
      STORE_PATH: "/data/app.db"             # default /data/app.db
      STORE_RETENTION_DAYS: "365"            # default 365
      LOG_LEVEL: "info"                      # default info
    healthcheck:
      test: ["CMD", "/sojuboy", "--health"]
      interval: 30s
      timeout: 3s
      retries: 3

Defaults reference

Variable	Default
SOJU_HOST	127.0.0.1
SOJU_PORT	6697
SOJU_TLS	true
IRC_NICK	sojuboy
IRC_USERNAME	IRC_NICK
IRC_REALNAME	sojuboy
IRC_PASSWORD	(empty)
SOJU_NETWORK	(empty)
CHANNELS	(empty)
KEYWORDS	IRC_NICK
SOJU_AUTH	sasl
NOTIFIER	pushover
PUSHOVER_USER_KEY	(empty)
PUSHOVER_API_TOKEN	(empty)
LLM_PROVIDER	openai
OPENAI_API_KEY	(empty)
OPENAI_BASE_URL	(empty)
OPENAI_MODEL	gpt-5
OPENAI_MAX_TOKENS	128000
SUMM_FOLLOW_LINKS	true
SUMM_LINK_TIMEOUT	20s
SUMM_LINK_MAX_BYTES	1048576
SUMM_GROUP_WINDOW	120s
SUMM_MAX_LINKS	20
SUMM_MAX_GROUPS	0
SUMM_TIMEOUT	10m
DIGEST_CRON	0 */6 * * *
DIGEST_WINDOW	24h
QUIET_HOURS	(empty)
NOTIFY_BACKFILL	false
MENTION_MIN_INTERVAL	30s
MENTIONS_ONLY_CHANNELS	(empty)
MENTIONS_DENY_CHANNELS	(empty)
URGENT_KEYWORDS	(empty)
HTTP_LISTEN	:8080
HTTP_TOKEN	(empty)
STORE_PATH	/data/app.db
STORE_RETENTION_DAYS	365
LOG_LEVEL	info
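A loader that applies the defaults in the table could be sketched as follows. It covers only a handful of the variables, the struct and helper names are hypothetical, and silently falling back to the default on an unparsable value is a design assumption; the real internal/config package may instead report an error.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// envString returns the value of key, or def when the variable is unset or empty.
func envString(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// envInt parses key as an integer and falls back to def on any error.
func envInt(key string, def int) int {
	n, err := strconv.Atoi(os.Getenv(key))
	if err != nil {
		return def
	}
	return n
}

// envDuration parses key with time.ParseDuration ("20s", "10m", "24h")
// and falls back to def on any error.
func envDuration(key string, def time.Duration) time.Duration {
	d, err := time.ParseDuration(os.Getenv(key))
	if err != nil {
		return def
	}
	return d
}

// sketchConfig holds a few of the documented variables; the struct and its
// field names are hypothetical.
type sketchConfig struct {
	HTTPListen    string
	MaxLinks      int
	LinkTimeout   time.Duration
	DigestWindow  time.Duration
	RetentionDays int
}

// loadSketchConfig reads the environment and applies the documented defaults.
func loadSketchConfig() sketchConfig {
	return sketchConfig{
		HTTPListen:    envString("HTTP_LISTEN", ":8080"),
		MaxLinks:      envInt("SUMM_MAX_LINKS", 20),
		LinkTimeout:   envDuration("SUMM_LINK_TIMEOUT", 20*time.Second),
		DigestWindow:  envDuration("DIGEST_WINDOW", 24*time.Hour),
		RetentionDays: envInt("STORE_RETENTION_DAYS", 365),
	}
}

func main() {
	fmt.Printf("%+v\n", loadSketchConfig())
}
```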

Pushover setup

Install the Pushover iOS app and log in
Get your User Key (in the app or on the website)
Create an application at pushover.net/apps/build to get an API token
Put them in .env as PUSHOVER_USER_KEY and PUSHOVER_API_TOKEN

OpenAI setup

Set OPENAI_API_KEY
Set OPENAI_BASE_URL to exactly https://api.openai.com/v1
If gpt-5 isn’t available on your account, use a supported model like gpt-4o-mini
GPT‑5 limits: ~272k input + 128k output tokens (400k context)

HTTP API

GET /healthz → 200 ok
GET /tail?channel=%23chan&limit=50 (JSON)
GET /history?channel=%23chan&before=<RFC3339>&limit=50 (JSON)
GET /trigger?channel=%23chan&window=6h (JSON)
GET /linkcard?url=… (JSON)
GET /linksummary?url=… (JSON)
GET /metrics

Troubleshooting

Empty tail while there’s activity
- Ensure the service logs readiness and joins for your channels
- Confirm .env CHANNELS contains your channels
- Check /metrics and logs for recent message ingestion
401 Unauthorized from UI/API
- Log in at /login with HTTP_TOKEN, or pass it via Bearer/token=
OpenAI 502/URL errors
- Ensure OPENAI_BASE_URL=https://api.openai.com/v1
- Try OPENAI_MODEL=gpt-4o-mini if gpt-5 isn’t enabled for your account

Roadmap

Additional notifiers (ntfy, Telegram)
Long-form HTML digest rendering
Admin endpoints (e.g., /join?channel=#chan)

Development notes

Project layout (selected):

cmd/sojuboy/main.go – entrypoint, wiring config/services
internal/soju – soju connector and ingestion
internal/store – SQLite schema and queries
internal/notifier – Pushover notifier
internal/summarizer – OpenAI client and prompts
internal/httpapi – UI and endpoints
internal/scheduler – cron jobs

Go toolchain: see go.mod (Go 1.23). The Dockerfile builds a static binary for a distroless image.

License

MIT for code dependencies; this repository’s license will follow your preference (add a LICENSE if needed).
