Hyperion (database + Redis)
Hyperion is Kraite's stateful core — one dedicated AMD-EPYC box that runs only MySQL and Redis. Every other server in the fleet (athena for ingestion + web, eos / iris / nyx for trading workers, tyche for indicators) reads and writes against this single shared backend over the private kraite-net network. Hyperion runs no Laravel code, no Horizon, no daemons; it does two storage jobs and gets out of the way.
This is the server lens view. For application-level data shapes, see the Domains chapters.
What runs on Hyperion
| Service | Version | Notes |
|---|---|---|
| MySQL | 8.4.8 | The single shared kraite schema. Bound to 10.0.0.2 (private interface) only; public 3306 is firewalled. |
| Redis | 8.0.5 | The shared Horizon queue + cache backend. Bound to 10.0.0.2 + 127.0.0.1. Database 2 is the fleet-wide convention — all servers use REDIS_DB=2. |
No web server, no PHP-FPM, no Horizon supervisor, no scheduler. Every Laravel process on every other server connects here over the 10.0.0.0/16 private network.
athena ──┐
eos ──┤ private LAN (10.0.0.0/16)
iris ──┼──────────────────────────► Hyperion
nyx ──┤ ├─ MySQL :3306
tyche ──┘ ├─ Redis :6379
└─ Redis :6379
Why MySQL and Redis are co-located
Architectural decision
Splitting MySQL and Redis onto separate boxes would have doubled the spend on the dedicated-CPU SKU (CCX23 ≈ €34/mo) without buying meaningful isolation — both are storage workloads, both are private-network-only, and neither competes with the other for the same resource bucket (Redis is RAM-bound, MySQL is IO-bound). Co-locating them on one well-provisioned AMD-EPYC box with NVMe storage keeps the latency envelope tight (every dispatch-daemon tick reads Redis; every position-state change writes MySQL — both are hot paths). The blast radius of losing Hyperion is identical whether it's one box or two, so the operational benefit of splitting is zero.
MySQL tuning that deviates from defaults
The MySQL config has been moved away from stock defaults wherever the workload demanded it.
| Setting | Value | Why |
|---|---|---|
innodb_buffer_pool_size | 10G | The 2 G stock value was the bottleneck on hot-set caching of steps, orders, positions. Kraite's working set fits in 10 G — every query that hits the buffer pool avoids disk. |
innodb_io_capacity | 5000 | The stock 200 is a 2010-era spinning-rust assumption. NVMe storage sustains far more random write IOPS; the background flusher couldn't keep up during HH:08 storms (kline-fetch fan-outs + mark-price daemon UPDATE bursts) and writers stalled on Innodb_buffer_pool_wait_free. |
innodb_io_capacity_max | 20000 | Companion to the above — gives the flusher headroom during write bursts. |
innodb_flush_log_at_trx_commit | 2 | Trades up to one second of crash-window for substantial commit throughput. Acceptable because every position-changing flow is idempotent on exchange_order_id and re-runnable from exchange state. |
innodb_log_buffer_size | 32M | Reduces flushing pressure on the large transactions step-block writes generate (many child rows in one commit). |
skip-log-bin | enabled | Kraite has no replication; binlogs would be pure overhead. |
max_connections | 256 | Sized for the fleet-wide ~162 sustained worker connections plus admin / console ad-hoc queries. Stock 151 was too tight. |
Redis hardening
Redis on Hyperion is bound to 10.0.0.2 (the private interface) plus 127.0.0.1 (loopback for local maintenance). The public interface is firewalled at the UFW layer — 6379 from the public internet is dropped.
The dangerous commands FLUSHALL, FLUSHDB, KEYS, and DEBUG have been renamed to random strings so an authenticated client cannot wipe the queue surface by accident. The requirepass directive is set; every worker connects with the shared password from ~/Herd/.credentials/kraite/servers.json.
Backups
Backups are taken by the application layer (running on athena via spatie/laravel-backup) using non-blocking mysqldump options — no global write lock, no impact on writers during the backup window. Backups land on Backblaze B2 with a tiered GFS retention strategy (daily / weekly / monthly / yearly).
The migration-ownership rule (documented in the operator runbook): only ingestion.kraite.com runs migrations against Hyperion. Admin and the public site read this schema; they never alter it.
Failure semantics
Hyperion is the single point of failure. A Hyperion outage halts every app, every worker, every daemon — including the queue surface (losing Redis means losing every job not yet picked up by a worker). Recovery is operational:
- Restore MySQL from the most recent B2 snapshot onto a fresh CCX23 box.
- Restore Redis (cache only — queued jobs in Redis are not snapshotted; they re-emit naturally as the scheduler ticks on athena).
- Update
/etc/hostson every fleet box with the new private IP. - Restart application servers.
The decision to keep one shared MySQL instance instead of read replicas is a deliberate trade — simplicity over horizontal-scale reads while the workload still fits one well-tuned box.
Cross-lens links
- Architecture overview — the full topology Hyperion anchors
- Athena (ingestion + web) — runs the application that owns migrations against Hyperion
- Horizon queues — the queue surface that lives in Hyperion's Redis
- Open positions — the dominant write workload Hyperion serves