# Metrics

## Overview
apiexec exposes runtime metrics for observability. The interface is callback-based rather
than embedded: the engine maintains thread-safe atomic counters and invokes a user-supplied
callback after every successful fetch, passing a read-only MetricsSnapshot. This keeps the
library footprint small and lets applications route metrics to whichever backend they already
use - Prometheus, StatsD, Datadog, OpenTelemetry, or a simple log line.
- **Thread safety:** all counters are std::atomic; snapshot() is lock-free.
- **Overhead:** incrementing a counter is a single atomic add; a snapshot is 11 atomic loads. The prefetch benchmark shows identical throughput with metrics enabled and disabled.
- **Header:** source/core/metrics.hpp - no external dependencies.
## Available metrics
| Metric | Type | Description | Incremented when |
|---|---|---|---|
| request_count | counter (int64) | Total HTTP requests made | Every call to transport->execute() |
| retry_count | counter (int64) | Total retry attempts (429 + 5xx + network) | Every time handle_error decides to retry |
| success_count | counter (int64) | Successful fetches (response parsed OK) | After parse_response() returns true |
| records_total | counter (int64) | Batches delivered to the caller | Once per successful batch |
| error_rate_limit | counter (int64) | Terminal 429 errors after retries exhausted | STREAM_ERROR_RATE_LIMIT returned to caller |
| error_server | counter (int64) | Terminal 5xx errors after retries exhausted | STREAM_ERROR_SERVER returned to caller |
| error_client | counter (int64) | Terminal 4xx errors (non-429, no retry) | STREAM_ERROR_CLIENT returned to caller |
| error_network | counter (int64) | Terminal network errors after retries exhausted | STREAM_ERROR_NETWORK returned to caller |
| error_parse | counter (int64) | Response parse failures | parse_response() returns false |
| window_size_ms | gauge (double) | Current cursor time window size in ms | After every cursor advancement |
| cumulative_cost | gauge (double) | Cumulative cost units reported by the adapter | After every adapter->response_cost() call |
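
Derived rates are easy to compute from a snapshot. A minimal sketch using only the fields documented above; `engine` and metrics_snapshot() are introduced in the sections below:

```cpp
// Sketch: deriving aggregate rates from the documented counters.
auto s = engine.metrics_snapshot();

// Retries per request actually issued.
double retry_rate = s.request_count > 0
    ? static_cast<double>(s.retry_count) / static_cast<double>(s.request_count)
    : 0.0;

// Terminal errors across all error classes.
int64_t terminal_errors = s.error_rate_limit + s.error_server
                        + s.error_client + s.error_network + s.error_parse;

std::cout << "retry rate: " << retry_rate
          << ", terminal errors: " << terminal_errors << "\n";
```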
## What is NOT exposed
- Per-adapter labels - metrics are per-stream. For per-adapter aggregation, create one stream per adapter and aggregate externally.
- Latency histograms - the library does not measure request duration. Track it around your next_batch() calls if needed (see the sketch after this list).
- Prefetch queue depth - prefetch is depth 0 or 1 (double-buffer); no queue to measure.
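
If you do need latency, one option is to time each next_batch() call in the calling loop. A minimal sketch; record_latency_ms is a hypothetical stand-in for your backend's histogram API, and note that with prefetch enabled this measures time-to-dequeue, not network time:

```cpp
#include <chrono>

// Hypothetical sink: swap in your backend's histogram, e.g. Observe() in prometheus-cpp.
auto record_latency_ms = [](double ms) { /* histogram.Observe(ms) */ };

while (engine.has_next()) {
    auto t0 = std::chrono::steady_clock::now();
    auto result = engine.next_batch();  // includes any retries
    auto t1 = std::chrono::steady_clock::now();

    record_latency_ms(std::chrono::duration<double, std::milli>(t1 - t0).count());

    // ... process result ...
}
```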
## MetricsSnapshot
A read-only snapshot is returned by engine.metrics_snapshot() and passed to the callback. Snapshots are cheap (one atomic load per field) and lock-free.
```cpp
struct MetricsSnapshot {
    int64_t request_count;
    int64_t retry_count;
    int64_t success_count;
    int64_t error_rate_limit;
    int64_t error_server;
    int64_t error_client;
    int64_t error_network;
    int64_t error_parse;
    int64_t records_total;
    double window_size_ms;
    double cumulative_cost;
};
```

## Usage patterns
### 1. Poll the snapshot on demand
Query metrics at any time - for example, to log periodically or expose on a /metrics endpoint.
#include "core/engine.hpp"
ExecutionEngine<JsonBatch> engine(/* ... */);
while (engine.has_next()) {
auto result = engine.next_batch();
// ... process result ...
}
auto snap = engine.metrics_snapshot();
std::cout << "Requests: " << snap.request_count
<< " Retries: " << snap.retry_count
<< " Rate-limit errors: " << snap.error_rate_limit << "\n";2. Callback on every successful fetch
Register a MetricsCallback - it fires once per successful next_batch() call.
```cpp
engine.set_metrics_callback([](const apiexec::MetricsSnapshot& s) {
    // Snapshot fields are cumulative totals, so increment Prometheus
    // counters by the delta since the last callback.
    static int64_t last_requests = 0, last_retries = 0;
    requests_total.Increment(static_cast<double>(s.request_count - last_requests));
    retries_total.Increment(static_cast<double>(s.retry_count - last_retries));
    last_requests = s.request_count;
    last_retries  = s.retry_count;

    // Gauges can take the snapshot value directly - e.g. StatsD:
    statsd.gauge("apiexec.window_size_ms", s.window_size_ms);
});
```

> **Keep callbacks fast.** The callback runs on the thread calling next_batch() (or the prefetch thread when prefetch is enabled). Do not perform synchronous I/O inside it - forward data to a dedicated reporting thread if needed, as in the sketch below.
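
A minimal sketch of that hand-off, using only the standard library: the callback just enqueues the snapshot, and a dedicated reporting thread does the (possibly slow) exporting.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

std::mutex mu;
std::condition_variable cv;
std::deque<apiexec::MetricsSnapshot> pending;
bool done = false;

// Reporting thread: the only place that may block on I/O.
std::thread reporter([&] {
    std::unique_lock<std::mutex> lock(mu);
    for (;;) {
        cv.wait(lock, [&] { return done || !pending.empty(); });
        while (!pending.empty()) {
            apiexec::MetricsSnapshot s = pending.front();
            pending.pop_front();
            lock.unlock();
            // Export s here: StatsD write, log line, Prometheus push...
            lock.lock();
        }
        if (done) break;
    }
});

// The callback itself stays cheap: lock, push, notify.
engine.set_metrics_callback([&](const apiexec::MetricsSnapshot& s) {
    {
        std::lock_guard<std::mutex> lock(mu);
        pending.push_back(s);
    }
    cv.notify_one();
});

// ... drive the stream ...

{
    std::lock_guard<std::mutex> lock(mu);
    done = true;
}
cv.notify_one();
reporter.join();
```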
### 3. Prometheus text format
The Metrics class has a built-in Prometheus text exposition formatter:
```cpp
auto prom_text = engine.metrics().to_prometheus("apiexec");
// Write prom_text to your HTTP /metrics response body
```

Sample output:

```
apiexec_requests_total 1250
apiexec_retries_total 47
apiexec_successes_total 1200
apiexec_errors_rate_limit_total 5
apiexec_errors_server_total 2
apiexec_errors_client_total 0
apiexec_errors_network_total 1
apiexec_errors_parse_total 0
apiexec_records_total 1200
apiexec_window_size_ms 3600000.000000
apiexec_cumulative_cost_units 0.000000
```
The prefix is configurable (default "apiexec"). Use a different prefix per stream if you
run multiple streams in the same process.
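
For example, with two engines in one process (the engine names here are illustrative):

```cpp
// Distinct prefixes keep the two streams' series separate in Prometheus.
auto orders_text = orders_engine.metrics().to_prometheus("apiexec_orders");
auto events_text = events_engine.metrics().to_prometheus("apiexec_events");
```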
### 4. Expose /metrics over HTTP
Pair the Prometheus formatter with any small HTTP server:
```cpp
#include <httplib.h>  // or any HTTP server library

httplib::Server svr;
svr.Get("/metrics", [&engine](const auto& req, auto& res) {
    res.set_content(
        engine.metrics().to_prometheus("apiexec"),
        "text/plain; version=0.0.4"
    );
});
svr.listen("0.0.0.0", 9090);
```

## Interpreting the metrics
### Healthy stream

```
request_count = 100
success_count = 100
retry_count   = 0
error_*       = 0
```
Every request succeeds on the first try. The window may be growing toward max_window_ms.
### Stream under moderate rate-limit pressure

```
request_count    = 150
success_count    = 100
retry_count      = 50
error_rate_limit = 0
window_size_ms   = 1800000   (shrunk from initial 3600000)
```
50 retries were needed to complete 100 successful fetches - a 50% retry rate. The window
shrunk once on the first 429 and has stabilized. Consider reducing window_grow_factor or
increasing min_window_ms to reduce oscillation.
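
What that adjustment might look like; the configuration type and field placement are assumptions (this page only names the knobs window_grow_factor and min_window_ms), so check your actual stream configuration:

```cpp
// Hypothetical configuration struct - only the two knobs are from this page.
apiexec::StreamConfig cfg;     // name is illustrative
cfg.window_grow_factor = 1.5;  // regrow the window more gently after successes
cfg.min_window_ms = 900000;    // 15-minute floor limits how far a 429 can shrink it
```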
### Stream hitting budget cap

```
request_count   = 10
success_count   = 10
cumulative_cost = 1000.0   (equal to budget_tokens)
```
Budget exhausted. The next call to next_batch() returns BUDGET_EXHAUSTED.
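
A sketch of handling that in the fetch loop, assuming BUDGET_EXHAUSTED is surfaced through the batch result's status (the result.status() accessor is an assumption; adapt it to how your result type reports status):

```cpp
while (engine.has_next()) {
    auto result = engine.next_batch();
    if (result.status() == apiexec::BUDGET_EXHAUSTED) {  // accessor name assumed
        auto snap = engine.metrics_snapshot();
        std::cerr << "budget spent: " << snap.cumulative_cost << " cost units\n";
        break;  // stop fetching; resume later with a fresh budget
    }
    // ... process result ...
}
```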
### Stream with persistent network issues

```
request_count = 60
success_count = 10
retry_count   = 50
error_network = 5
```
5 batches hit max_retries and were returned to the caller as terminal network failures.
Check your logging callback for the specific failure modes.
## Metrics in the language bindings

> **C API limitation.** The C API does not currently expose per-stream metrics directly. Full metric access (stream_metrics_snapshot_v2) is planned for ABI v2. In the meantime, the C API exposes stream_cost_info_v1 for budget and cost queries only.
| Binding | Metrics access |
|---|---|
| C++ (direct) | engine.metrics_snapshot() / engine.set_metrics_callback() - full access |
| C API | stream_cost_info_v1() - budget/cost only |
| Go | stream.CostInfo() - budget/cost only |
| Rust | stream.cost_info() - budget/cost only |
| Python | Not yet exposed beyond cost |
| Java | Not yet exposed beyond cost |
| JavaScript | Not yet exposed beyond cost |
If you need full metrics access from a binding, the recommended pattern is to host the engine
in a C++ process, expose metrics via a /metrics HTTP endpoint, and have the binding consumer
poll that endpoint.
