docs: expand .env example to show max/large values; add SUMM_TIMEOUT and summarizer tunables

feat: summarizer improvements
- readability extraction for articles
- image links passed to model as vision inputs
- configurable max groups/links/bytes and timeout
- higher default ceilings; resilient fallback summary

Thomas Cravey 2025-08-15 20:41:31 -05:00
parent 2954e85e7a
commit 9ecf4f4f4c
7 changed files with 296 additions and 53 deletions


@@ -31,14 +31,14 @@ Runtime modules:
- `internal/notifier`: Pushover notifier (pluggable interface)
- `internal/summarizer`: OpenAI client with GPT-5 defaults, GPT-4o-mini fallback
- `internal/scheduler`: cron-based digest scheduling and daily retention job
- `internal/httpapi`: `/healthz`, `/tail`, `/trigger`, `/metrics`
- `internal/httpapi`: `/healthz`, `/ready`, `/tail`, `/trigger`, `/metrics`
- `internal/config`: env config loader and helpers
## Features
- Mention/keyword detection: punctuation-tolerant (letters, digits, `_` and `-` are word chars)
- Mention tuning: allow/deny channels, urgent keywords bypass quiet hours, rate limiting
- AI digest generation: concise natural summaries (no rigid sections); integrates pasted multi-line posts and referenced link context
- AI digest generation: concise natural summaries (no rigid sections); integrates pasted multi-line posts and referenced link context; image links sent to GPT-5 as vision inputs
- Configurable schedules (cron), quiet hours, and summary parameters
- Local persistence with retention pruning (daily at 03:00)
- HTTP endpoints: health, tail, metrics, on-demand digests
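The punctuation-tolerant matching described above (letters, digits, `_` and `-` as word chars) can be sketched with a small Go matcher. The `mentionRegex` helper is hypothetical; the real detector in `internal/irc` may differ:

```go
package main

import (
	"fmt"
	"regexp"
)

// mentionRegex treats letters, digits, '_' and '-' as word characters,
// so "nick:" or "(nick)" match while "nick-dev" does not.
func mentionRegex(nick string) *regexp.Regexp {
	return regexp.MustCompile(`(?i)(^|[^\pL\pN_-])` + regexp.QuoteMeta(nick) + `($|[^\pL\pN_-])`)
}

func main() {
	re := mentionRegex("sojuboy")
	fmt.Println(re.MatchString("hey sojuboy: ping")) // true
	fmt.Println(re.MatchString("sojuboy-dev ping"))  // false: '-' is a word char
}
```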
@@ -63,9 +63,9 @@ Runtime modules:
- Debug logs include: mention delivered or suppression reason (backfill, quiet hours, rate limit)
5) Summarization:
- `/trigger` or the scheduler loads a window and calls OpenAI (with a 60s timeout)
- Defaults to `OPENAI_MODEL=gpt-5` with `MaxCompletionTokens`; temperature omitted for reasoning-like models
- Tunables let you follow link targets and group multi-line posts (see env below)
- `/trigger` or the scheduler loads a window and calls OpenAI
- GPT-5 context: ~272k input tokens + up to 128k output tokens (400k total)
- Summaries are concise/natural and integrate multi-line posts, article text (readability-extracted), and image links (vision)
6) HTTP API:
- `/healthz` → `200 ok`
@@ -83,7 +83,7 @@ Runtime modules:
```yaml
healthcheck:
test: ["/sojuboy", "--health"]
test: ["CMD", "/sojuboy", "--health"]
interval: 30s
timeout: 3s
retries: 3
@@ -146,6 +146,8 @@ Compose includes a healthcheck calling the binary's `--health` flag, which ret
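The `--health` flag the compose healthcheck calls can be sketched as a self-probe against the local `/healthz` endpoint. The port and flag wiring here are assumptions, not the project's actual implementation:

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"os"
	"time"
)

// healthCheck probes the given URL and returns a process exit code:
// 0 on HTTP 200, 1 otherwise, so Docker can use the binary itself.
func healthCheck(url string) int {
	c := &http.Client{Timeout: 2 * time.Second}
	resp, err := c.Get(url)
	if err != nil {
		return 1
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 1
	}
	return 0
}

func main() {
	health := flag.Bool("health", false, "probe /healthz and exit")
	flag.Parse()
	if *health {
		os.Exit(healthCheck("http://127.0.0.1:8080/healthz"))
	}
	fmt.Println("normal startup continues here")
}
```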
## Configuration (.env example)
The values below show maximum or large/reasonable settings. Defaults are noted where they are also the maximum or where otherwise relevant.
```env
# soju / IRC
SOJU_HOST=bnc.example.org
@@ -176,22 +178,25 @@ LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-5
OPENAI_MAX_TOKENS=700
# Max completion (output) tokens for GPT-5: ~128k (model limit). Default 700.
OPENAI_MAX_TOKENS=128000
# Summarizer tuning
SUMM_FOLLOW_LINKS=true # fetch small snippets from referenced links
SUMM_LINK_TIMEOUT=6s # HTTP timeout per link
SUMM_LINK_MAX_BYTES=262144 # max bytes fetched per link
SUMM_GROUP_WINDOW=90s # group multi-line posts within this window
SUMM_MAX_LINKS=5 # limit links fetched per summary
SUMM_FOLLOW_LINKS=true # default true
SUMM_LINK_TIMEOUT=20s # no hard max; example large
SUMM_LINK_MAX_BYTES=1048576 # no hard max; example large (1 MiB/article)
SUMM_GROUP_WINDOW=120s # no hard max; example large grouping window
SUMM_MAX_LINKS=20 # no strict max; example large
SUMM_MAX_GROUPS=20000 # 0=no cap; example large
SUMM_TIMEOUT=10m # request timeout; default 5m
# Digests
DIGEST_CRON=0 */6 * * *
DIGEST_WINDOW=6h
QUIET_HOURS=
DIGEST_CRON=0 */6 * * * # every 6 hours
DIGEST_WINDOW=24h # no hard max; example large window
QUIET_HOURS= # e.g., 22:00-07:00
# Mentions/alerts
NOTIFY_BACKFILL=false # if true, notify even for replayed (older) messages
MENTION_MIN_INTERVAL=30s # min interval between alerts per channel/keyword
NOTIFY_BACKFILL=false # default false
MENTION_MIN_INTERVAL=30s # no hard max; rate-limit between alerts
MENTIONS_ONLY_CHANNELS= # optional allow-list (CSV)
MENTIONS_DENY_CHANNELS= # optional deny-list (CSV)
URGENT_KEYWORDS=urgent,priority # bypass quiet hours
@@ -202,7 +207,7 @@ HTTP_TOKEN=put-a-long-random-token-here
# Storage
STORE_PATH=/data/app.db
STORE_RETENTION_DAYS=7
STORE_RETENTION_DAYS=365 # example large retention
# Logging
LOG_LEVEL=info
@@ -220,7 +225,7 @@ LOG_LEVEL=info
- Set `OPENAI_API_KEY`
- Set `OPENAI_BASE_URL` to exactly `https://api.openai.com/v1`
- If `gpt-5` isn't available on your account, use a supported model like `gpt-4o-mini`
- GPT-5 beta limitations: temperature fixed; use `MaxCompletionTokens`
- GPT-5 limits: ~272k input + 128k output tokens (400k context)
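A quick sanity check of that budget (values as documented; the input/output split is approximate):

```go
package main

import "fmt"

func main() {
	const contextWindow = 400_000 // total tokens
	const maxOutput = 128_000     // completion cap (the OPENAI_MAX_TOKENS ceiling)
	inputBudget := contextWindow - maxOutput
	fmt.Println(inputBudget) // 272000
}
```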
## HTTP API