Latency Critical Systems
Use this skill when the user cares about realtime behavior, hot paths, streaming freshness, or execution speed. This includes HFT-like infrastructure, but the skill is engineering-focused. It does not authorize live trading or financial advice.
Split The Metrics
Do not collapse everything into "fast." Track:
- p50, p95, and p99 latency;
- throughput;
- freshness age;
- queue depth;
- cache hit rate;
- provider/API response time;
- browser render time;
- correctness under load;
- failure and retry behavior.
Map The Hot Path
Write the path from user/event to final visible state:
source event -> provider API -> ingest worker -> queue -> cache -> edge route
-> client stream -> browser render -> user-visible state
Then measure each segment separately.
Optimization Order
- Remove unnecessary round trips.
- Cache stable reads with freshness metadata.
- Batch small calls and writes.
- Move compute closer to the data or the user.
- Split hot and cold paths.
- Apply backpressure before queues grow unbounded.
- Use streaming only when it improves freshness or user experience.
- Add canaries for stale data, degraded providers, and bad cache state.
Verification
Use live readbacks when a deployed surface exists:
- HTTP timing and response headers;
- provider freshness timestamp;
- queue or job state;
- edge/cache state;
- browser verification for actual UI freshness;
- logs around retries and degraded mode.
For market-data or execution-adjacent paths, also verify orderbook age, VWAP assumptions, provider status, and kill-switch behavior before calling the path ready.
Guardrails
- Do not optimize latency by dropping required validation.
- Do not hide stale data behind fast cache hits.
- Do not claim millisecond behavior from client labels without measurement.
- Do not run live orders, destructive migrations, or customer-impacting deploys without an explicit approval gate.
- Keep secrets and private payloads out of logs and benchmark artifacts.