Metrics
The metrics subsystem lets you define a metric once — its measures and dimensions — and then emit measure values by name. Where those values go (a log file, the messaging transport, CloudWatch, or a Prometheus scrape endpoint) is chosen by configuration, not by your code. The same business logic emits the same metric regardless of platform.
The push targets serialize to CloudWatch EMF (Embedded Metric Format); the pull target serves a Prometheus/OpenMetrics exposition.
You obtain the service from your GGCommons instance. The accessor name differs per language:
Java:

    MetricEmitter metrics = gg.getMetrics();

Python:

    # get_metrics() returns the MetricEmitter CLASS: every method is static, with
    # process-global state. There is no per-instance emitter in Python.
    metrics = gg.get_metrics()

Rust:

    // metrics() returns an Arc<dyn MetricService>.
    let metrics = gg.metrics();

TypeScript:

    // metrics() returns a MetricService.
    const metrics = gg.metrics();

Define a metric
Build a metric with MetricBuilder: give it a name, add one or more measures (each with a unit
and a storage resolution), and optionally add dimensions. A measure’s storage resolution is
either 1 (high resolution, 1-second granularity) or 60 (standard, 1-minute granularity).
Java:

    metrics.defineMetric(
        MetricBuilder.create("performance")
            .withConfig(config) // fills thingName, componentName, namespace
            .addMeasure("replyLatency", "Milliseconds", 1)
            .addDimension("instance", "main")
            .build());

Python:

    # with_config() fills thing + component ONLY (not namespace) in Python: set the
    # namespace explicitly, or build() substitutes the placeholder "GGCommons/Metrics".
    metric = (
        MetricBuilder.create("performance")
        .with_config(config)
        .with_namespace("MyApp/Metrics")
        .add_measure("reply_latency", "Milliseconds", 1)
        .add_dimension("instance", "main")
        .build()
    )
    metrics.define_metric(metric)

Rust:

    metrics.define_metric(
        MetricBuilder::create("performance")
            .with_config(&config) // fills thing, component, namespace
            .add_measure("replyLatency", "Milliseconds", 1)
            .add_dimension("instance", "main")
            .build(),
    );

TypeScript:

    const metric = MetricBuilder.create("performance")
      .withConfig(config) // fills thing, component, namespace
      .addMeasure("replyLatency", "Milliseconds", 1)
      .addDimension("instance", "main")
      .build();
    metrics.defineMetric(metric);

Dimensions and the auto-injected three
Dimensions are key/value labels attached to every datum of the metric. Three are injected automatically when you build:
| Dimension | Value | Source |
|---|---|---|
| category | the metric’s name | injected by the builder on build() |
| coreName | the IoT Thing name | from withThingName / config |
| component | the component name | from withComponentName / config |
CloudWatch caps a metric at 10 dimensions (MAX_DIMENSIONS = 10), counting the three injected
ones. The SDKs enforce the cap differently when you exceed it:
- Java, Python, and TypeScript throw (Java at addDimension and build; Python and TypeScript at build).
- Rust is infallible: build() trims the excess custom dimensions (keeping the three injected) and logs a tracing::warn! instead of returning an error.
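
The two enforcement styles can be sketched as a single function. This is an illustrative model only, not the SDK implementation: build_dimensions and its strict flag are hypothetical names, the sketch assumes custom keys never collide with the injected ones, and it assumes the trimming path drops the most recently added custom dimensions (the text above does not say which ones are trimmed).

```python
MAX_DIMENSIONS = 10  # cap from the text above; includes the three injected dimensions


def build_dimensions(injected, custom, strict=True):
    """Merge injected and custom dimensions under the 10-dimension cap.

    strict=True models the SDKs that throw; strict=False models the one that trims.
    Hypothetical helper for illustration only.
    """
    room = MAX_DIMENSIONS - len(injected)
    if len(custom) > room:
        if strict:
            total = len(injected) + len(custom)
            raise ValueError(f"{total} dimensions exceeds the cap of {MAX_DIMENSIONS}")
        # Assumption: keep the first `room` custom dimensions in insertion order.
        custom = dict(list(custom.items())[:room])
    return {**injected, **custom}
```

With three injected dimensions there is room for seven custom ones, so passing nine either raises (strict) or returns ten dimensions with the last two custom entries dropped.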
Emit measure values
After a metric is defined, emit values by passing the metric name and a map of measureName -> value. There are two emission paths:
- emitMetric (batched): records the value into the target’s buffer; it is delivered on the target’s normal interval / batching policy.
- emitMetricNow (immediate): bypasses the buffer and delivers right away.

flushMetrics force-drains the buffer on demand. The measure-value container type is
language-specific: Map<String, Float> (Java; note Float, not double), a plain dict
(Python), HashMap<String, f64> (Rust), and Record<string, number> (TypeScript).
Java:

    Map<String, Float> values = new HashMap<>();
    values.put("replyLatency", 12.5f); // Map<String, Float>

    metrics.emitMetric("performance", values);    // buffered, drained on the emit interval
    metrics.emitMetricNow("performance", values); // immediate, bypasses the buffer
    metrics.flushMetrics();                       // force-drain now
    // metrics.close();                           // flush + release the target on shutdown

Python:

    metrics.emit_metric("performance", {"reply_latency": 12.5})      # buffered
    metrics.emit_metric_now("performance", {"reply_latency": 12.5})  # immediate

    # Python's MetricEmitter has NO public flush method; targets own their batching.
    # Use shutdown() to close the target (which flushes/releases it) on exit.
    metrics.shutdown()

Rust:

    let mut values = HashMap::new();
    values.insert("replyLatency".to_string(), 12.5_f64); // HashMap<String, f64>

    metrics.emit_metric("performance", values.clone()).await?; // buffered
    metrics.emit_metric_now("performance", values).await?;     // immediate
    metrics.flush_metrics().await?;                            // force-drain now
    // metrics.shutdown().await;                               // release the target on exit

TypeScript:

    await metrics.emitMetric("performance", { replyLatency: 12.5 });    // buffered
    await metrics.emitMetricNow("performance", { replyLatency: 12.5 }); // immediate
    await metrics.flushMetrics();                                       // force-drain now
    // await metrics.shutdown();                                        // release on exit

Targets and how one is chosen
A target decides where emitted metrics actually go. You select it with metricEmission.target
in your component config:
| target | What it does |
|---|---|
| log | Writes EMF JSON lines to the metric log file (the library default). |
| messaging | Publishes EMF over the active messaging transport. |
| cloudwatch | Sends EMF to CloudWatch through a durable store-and-forward buffer (see below). |
| cloudwatchcomponent | Hands metrics to the AWS-managed CloudWatch metrics Greengrass component. |
| prometheus | Exposes metrics on an HTTP scrape endpoint (the default on Kubernetes). |
The effective target is resolved by a three-tier precedence, identical across all four SDKs:
- The explicit metricEmission.target from config, if set.
- Otherwise the platform-profile default: prometheus on Kubernetes.
- Otherwise the library default, log.

An unknown target name logs a warning and falls back to log.
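
The precedence and the unknown-name fallback can be modeled in a few lines. This is a sketch of the rule as described, not the SDKs' code: resolve_target, the "kubernetes" platform identifier, and the exact-match comparison of target names are all assumptions made for illustration.

```python
import logging

log = logging.getLogger("metrics")

KNOWN_TARGETS = {"log", "messaging", "cloudwatch", "cloudwatchcomponent", "prometheus"}
PLATFORM_DEFAULTS = {"kubernetes": "prometheus"}  # hypothetical platform identifier
LIBRARY_DEFAULT = "log"


def resolve_target(configured, platform):
    """Return the effective target name for a configured value and a platform."""
    if configured:
        candidate = configured  # tier 1: explicit metricEmission.target
    else:
        # tier 2: platform-profile default, else tier 3: library default
        candidate = PLATFORM_DEFAULTS.get(platform, LIBRARY_DEFAULT)
    if candidate not in KNOWN_TARGETS:
        log.warning("unknown metric target %r, falling back to %r", candidate, LIBRARY_DEFAULT)
        return LIBRARY_DEFAULT
    return candidate
```

For example, resolve_target(None, "kubernetes") yields prometheus, while an explicit cloudwatch setting wins on any platform.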
    // component config: metricEmission section
    {
      "metricEmission": {
        "target": "cloudwatch",
        "namespace": "MyApp/Metrics"
      }
    }

Prometheus pull endpoint
The prometheus target inverts the usual lifecycle. Instead of pushing, emit/emitNow only
update an in-process latest-value gauge registry; flush is a no-op (a scrape pulls the data);
shutdown/close stops the HTTP listener. A Prometheus server (or any scraper) pulls the current
values on its own schedule.
- The HTTP server binds on 0.0.0.0, default port 9090, default path /metrics.
- GET <path> returns 200 with the OpenMetrics text exposition; any other path returns 404 (and, on Java, a non-GET on the metrics path returns 405).
- A bind/start failure is logged and swallowed: the component keeps running and emits keep updating the registry; only the scrape endpoint is unavailable.
Gauge naming: the gauge name is sanitize(lowercase("{namespace}_{measureName}")) (restricted to
[a-z0-9_], prefixed with _ if it would start with a digit). Dimensions become labels, with each
label name sanitized to [a-zA-Z_][a-zA-Z0-9_]*. The namespace used is the configured
metricEmission.namespace (not the per-metric namespace), defaulting to ggcommons.
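
As a rough illustration of the naming rule, the sketch below derives a gauge name and a label name. The helper names are hypothetical, and two details are assumptions rather than documented behavior: disallowed characters are replaced with an underscore, and a label name that would start with a digit gets an underscore prefix (the text states the prefix rule only for gauge names).

```python
import re


def gauge_name(namespace, measure_name):
    """Lowercase "{namespace}_{measureName}" and restrict it to [a-z0-9_]."""
    raw = f"{namespace}_{measure_name}".lower()
    name = re.sub(r"[^a-z0-9_]", "_", raw)  # assumption: replace with "_"
    if name and name[0].isdigit():
        name = "_" + name  # a gauge name must not start with a digit
    return name


def label_name(dimension):
    """Sanitize a dimension key to match [a-zA-Z_][a-zA-Z0-9_]*."""
    name = re.sub(r"[^a-zA-Z0-9_]", "_", dimension)  # assumption: replace with "_"
    if not name or name[0].isdigit():
        name = "_" + name  # assumption: same prefix rule as gauge names
    return name
```

Under these assumptions, a namespace of MyApp/Metrics and a measure named replyLatency would produce the gauge myapp_metrics_replylatency.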
Durable CloudWatch buffer
The cloudwatch target is backed by a durable, disk-backed store-and-forward buffer by default
(buffer.type defaults to durable). Datums are written to local disk first and drained to
CloudWatch in batches, so metrics survive process restarts and cloud disconnects — the edge-first
default. Set buffer.type to memory to opt back into in-memory batching only.
    {
      "metricEmission": {
        "target": "cloudwatch",
        "namespace": "MyApp/Metrics",
        "buffer": {
          "type": "durable",
          "maxDiskBytes": 134217728,
          "path": "/var/lib/ggcommons/metrics/{ComponentName}/cw",
          "onFull": "dropOldest",
          "fsync": "perBatch"
        }
      }
    }

The defaults shown above (128 MiB cap, the per-component cw path, dropOldest when full,
fsync per batch) apply when the buffer block is omitted.
Behavior to be aware of:
- Durable mode fails fast if the bundled ggstreamlog native core is missing; that is a deployment error, since the core ships with the SDK. If the core is present but the buffer cannot open (bad path / IO error), it degrades to in-memory batching.
- The drain batches up to 1000 datums (≈900 KB) per CloudWatch request, drops datums outside CloudWatch’s accept window (≈14 days past / ≈2 hours future), and groups by namespace.
- close() on a durable buffer flushes to disk and stops the engine. It does not drain to the cloud, so any backlog persists for the next restart.
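
The drain rules in the second bullet can be pictured with a small planning function. This is a simplified sketch, not the buffer's actual drain loop: plan_drain and the datum field names (namespace, timestamp_s) are hypothetical, the window bounds use the approximate figures quoted above, and the ≈900 KB byte limit is deliberately not modeled.

```python
from collections import defaultdict

MAX_BATCH = 1000               # datums per CloudWatch request
MAX_PAST_S = 14 * 24 * 3600    # approx. 14 days in the past
MAX_FUTURE_S = 2 * 3600        # approx. 2 hours in the future


def plan_drain(datums, now_s):
    """Drop out-of-window datums, group by namespace, split into batches.

    Each datum is a dict with hypothetical keys "namespace" and "timestamp_s"
    (epoch seconds). Returns a list of (namespace, batch) tuples.
    """
    by_namespace = defaultdict(list)
    for datum in datums:
        age = now_s - datum["timestamp_s"]
        if age > MAX_PAST_S or -age > MAX_FUTURE_S:
            continue  # outside the accept window: dropped, not retried
        by_namespace[datum["namespace"]].append(datum)

    batches = []
    for namespace, items in by_namespace.items():
        for start in range(0, len(items), MAX_BATCH):
            batches.append((namespace, items[start:start + MAX_BATCH]))
    return batches
```

A real drain would also cap each batch by serialized size, which can produce batches smaller than 1000 datums.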
EMF timestamp units
For the push (EMF) targets, the _aws.Timestamp field is in epoch milliseconds. The
cloudwatchcomponent message carries its own timestamp in epoch seconds (a deliberately
different unit), and its JSON location differs by language — Java places it at
request.metricData.timestamp, while Python places it at request.timestamp (a sibling of
metricData).
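
To make the unit difference concrete, the sketch below builds a minimal EMF record with a millisecond timestamp and shows where a seconds timestamp would sit in the two cloudwatchcomponent layouts. The function names are hypothetical, the EMF envelope follows the published Embedded Metric Format layout, and the cloudwatchcomponent dicts include only the timestamp paths named above; every other field of that message is omitted.

```python
import time


def emf_record(namespace, dimensions, measure, value, unit, resolution, now_s=None):
    """Build a minimal EMF record; _aws.Timestamp is epoch MILLISECONDS."""
    now_s = time.time() if now_s is None else now_s
    record = {
        "_aws": {
            "Timestamp": int(now_s * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [list(dimensions)],  # one dimension set, keys only
                "Metrics": [{"Name": measure, "Unit": unit, "StorageResolution": resolution}],
            }],
        },
        measure: value,
    }
    record.update(dimensions)  # dimension values live at the top level
    return record


def component_timestamp_java_layout(now_s):
    """Epoch SECONDS at request.metricData.timestamp (other fields omitted)."""
    return {"request": {"metricData": {"timestamp": int(now_s)}}}


def component_timestamp_python_layout(now_s):
    """Epoch SECONDS at request.timestamp, a sibling of metricData (other fields omitted)."""
    return {"request": {"timestamp": int(now_s), "metricData": {}}}
```

A consumer that reads both message kinds should therefore divide or multiply by 1000 depending on the source, rather than assuming one unit.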