Metrics
Satella’s metrics
Metrics and instruments are a system to output real-time statistics.
Metrics are defined and meant to be used in a similar way to Python logging. Each metric has a name (hierarchical, dot-separated) that does not have to correspond to particular modules or classes. A metric can be in one of 4 states:
- DISABLED
- RUNTIME
- DEBUG
- INHERIT
These are contained in an enum:
-
class
satella.instrumentation.metrics.
MetricLevel
¶ An enumeration.
By default, an instrument runs in RUNTIME mode. This means that statistics are collected only from metrics of this instrument that are set to at least RUNTIME. If a user wants to dig deeper, they can switch the instrument to DEBUG, which will cause more data to be registered. If a metric is in state INHERIT, it inherits the metric level from its parent, traversing the tree upwards if required. The tree node separator is a dot in the metric’s name, e.g. satella.metrics.my_metric.
INHERIT is the default state for every metric other than the root; for the root the default is RUNTIME. The root metric cannot be set to INHERIT, as it has nothing to inherit from.
Also, if the parent is RUNTIME and the child is DEBUG, the metrics reported by the child won’t be included in the parent’s metric data.
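The level resolution described above can be pictured with a small standalone sketch. It does not use Satella itself: the Level enum, the levels dictionary and effective_level() are hypothetical names invented for illustration, and the numeric values of the enum members are an assumption.

```python
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    # Hypothetical stand-in for MetricLevel; the numeric values are assumed
    DISABLED = 1
    RUNTIME = 2
    DEBUG = 3
    INHERIT = 4


def effective_level(name: str, levels: Dict[str, Level]) -> Level:
    """Resolve INHERIT by walking from a dotted name up towards the root ('')."""
    while True:
        level = levels.get(name, Level.INHERIT)  # unset metrics default to INHERIT
        if level is not Level.INHERIT:
            return level
        if name == '':
            return Level.RUNTIME  # the root cannot inherit, so use its default
        name = name.rpartition('.')[0]  # 'a.b.c' -> 'a.b', and 'a' -> ''


levels = {'': Level.RUNTIME, 'satella.metrics': Level.DEBUG}
print(effective_level('satella.metrics.my_metric', levels).name)  # DEBUG
print(effective_level('satella.other', levels).name)              # RUNTIME
```

Here satella.metrics.my_metric picks up DEBUG from its parent, while satella.other finds no explicit level on the way up and ends with the root’s RUNTIME.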
You can switch the level of a metric at any time by assigning a valid value to its level property, or by specifying its metric level during a call to getMetric().
Note that the decision to accept or reject a value provided to handle() is made when handle() is called, based on the level current at that moment. If you change the level, it may take some time for the metric to return correct values.
The call to getMetric() is specified as follows:
-
satella.instrumentation.metrics.
getMetric
(metric_name: str = '', metric_type: str = 'base', metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, **kwargs)¶ Obtain a metric of given name.
Parameters: - metric_name – a metric name. Subsequent nesting levels have to be separated with a dot
- metric_type – metric type
- metric_level – a metric level to set this metric to.
Raises: - MetricAlreadyExists – a metric having this name already exists, but with a different type
- ValueError – metric name contains a forbidden character
You obtain metrics using getMetric() as follows:
metric = getMetric(__name__+'.StringMetric', 'string', MetricLevel.RUNTIME, **kwargs)
Please note that metric name must match the following regex:
[a-zA-Z_:][a-zA-Z0-9_:]*
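A quick way to check candidate names against this pattern is sketched below. The regex is the one quoted above; is_valid_metric_name() is a hypothetical helper, and applying the pattern to each dot-separated component (rather than to the whole name) is an assumption, made because the pattern itself does not allow dots.

```python
import re

# The pattern quoted above; it contains no dot, so each component is checked on its own
NAME_PART = re.compile(r'[a-zA-Z_:][a-zA-Z0-9_:]*')


def is_valid_metric_name(metric_name: str) -> bool:
    """Assumption: every dot-separated component must fully match the regex."""
    if metric_name == '':
        return True  # the empty name is treated here as the root metric
    return all(NAME_PART.fullmatch(part) is not None
               for part in metric_name.split('.'))


print(is_valid_metric_name('satella.metrics.my_metric'))  # True
print(is_valid_metric_name('my-metric'))                  # False
```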
internal is for those cases where the application itself is the consumer of the metrics, and you don’t want them exposed to the outside. Take care to examine this field of MetricData if you write custom exporters!
The second argument to getMetric() is the metric type. The following metric types are available:
base - for just a container metric
int - for int values
float - for float values
empty - disregard all provided values, outputs nothing
counter - starts from zero and increments or decrements the counter value. It can optionally also register the number of calls
-
class
satella.instrumentation.metrics.metric_types.
CounterMetric
(name, root_metric: Metric = None, metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, internal: bool = False, sum_children: bool = True, count_calls: bool = False, *args, **kwargs)¶ A counter that can be adjusted by a given value.
Parameters: - sum_children – whether to sum up all calls to children
- count_calls – count the amount of calls to handle()
cps - counts the calls made to handle() during the last time period, as specified by the user
-
class
satella.instrumentation.metrics.metric_types.
ClicksPerTimeUnitMetric
(*args, time_unit_vectors: Optional[List[float]] = None, aggregate_children: bool = True, internal: bool = False, **kwargs)¶ This tracks the number of calls to handle() during the last time periods, as specified by time_unit_vectors (in seconds). You may specify multiple time periods as consecutive entries in the list.
By default (if you do not specify otherwise) this will track calls made during the last second.
This was once deprecated, but because of platforms which suck at calculating derivatives of their series (AWS, I’m looking at you!) it was decided to undeprecate it.
Note
Normally you should use a counter and calculate a rate() from it, but since some platforms suck at calculating rates, a decision was made to keep this.
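To make the idea of counting calls over several trailing time windows concrete, here is a minimal standalone sketch. It is not Satella’s implementation: CallsPerWindow and its methods are hypothetical, and only the notion of a list of window lengths in seconds is taken from the description of time_unit_vectors above.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Sequence


class CallsPerWindow:
    """Hypothetical counter of handle() calls seen in each trailing time window."""

    def __init__(self, windows: Sequence[float] = (1.0,),
                 time_getter: Callable[[], float] = time.monotonic):
        self.windows = sorted(windows)          # window lengths in seconds
        self.time_getter = time_getter
        self.stamps: Deque[float] = deque()     # timestamps of recent calls

    def handle(self) -> None:
        self.stamps.append(self.time_getter())

    def counts(self) -> List[int]:
        now = self.time_getter()
        # Forget calls that are older than the longest window
        while self.stamps and now - self.stamps[0] > self.windows[-1]:
            self.stamps.popleft()
        return [sum(1 for t in self.stamps if now - t <= w) for w in self.windows]


clock = [0.0]                                   # fake clock for a deterministic demo
cps = CallsPerWindow((1.0, 10.0), time_getter=lambda: clock[0])
for moment in (0.0, 0.5, 5.0):
    clock[0] = moment
    cps.handle()
clock[0] = 5.5
print(cps.counts())  # [1, 3]
```

The time source is injectable so that the behaviour can be tested without sleeping.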
linkfail - for tracking whether given link is online or offline
-
class
satella.instrumentation.metrics.metric_types.
LinkfailMetric
(name: str, root_metric: Metric = None, metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, labels: Optional[dict] = None, internal: bool = False, consecutive_failures_to_offline: int = 100, consecutive_successes_to_online: int = 10, callback_on_online: Callable[[int, dict], None] = <function LinkfailMetric.<lambda>>, callback_on_offline: Callable[[int, dict], None] = <function LinkfailMetric.<lambda>>, *args, **kwargs)¶ Metric that measures whether given link is operable.
Parameters: - consecutive_failures_to_offline – consecutive failures needed for link to become offline
- consecutive_successes_to_online – consecutive successes needed for link to become online after a failure
- callback_on_online – callback that accepts an address of a link that becomes online and labels
- callback_on_offline – callback that accepts an address of a link that becomes offline and labels
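The two thresholds above describe a small state machine, which can be sketched without Satella as follows. LinkState is a hypothetical class; starting in the online state and resetting the streak whenever a result agrees with the current state are assumptions of this sketch, not documented behaviour.

```python
class LinkState:
    """Hypothetical online/offline tracker driven by consecutive results."""

    def __init__(self, failures_to_offline: int = 100, successes_to_online: int = 10):
        self.failures_to_offline = failures_to_offline
        self.successes_to_online = successes_to_online
        self.online = True   # assumption: a link starts as online
        self.streak = 0      # consecutive results that contradict the current state

    def report(self, success: bool) -> bool:
        """Record one attempt and return whether the link is now considered online."""
        if success == self.online:
            self.streak = 0  # the result agrees with the state, so the streak is broken
        else:
            self.streak += 1
            needed = self.successes_to_online if success else self.failures_to_offline
            if self.streak >= needed:
                self.online = success
                self.streak = 0
        return self.online


link = LinkState(failures_to_offline=3, successes_to_online=2)
print([link.report(ok) for ok in (False, False, False, True, True)])
# [True, True, False, False, True]
```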
summary - a metric that keeps a rolling window of values and provides a way to calculate percentiles. Corresponds to Prometheus’ summary metrics.
-
class
satella.instrumentation.metrics.metric_types.
SummaryMetric
(name, root_metric: Metric = None, metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, internal: bool = False, last_calls: int = 100, quantiles: Sequence[float] = (0.5, 0.95), aggregate_children: bool = True, count_calls: bool = True, *args, **kwargs)¶ A metric that can register values sequentially and then calculate quantiles from them. It calculates configurable quantiles over a sliding window of the most recent measurements.
Parameters: - last_calls – last calls to handle() to take into account
- quantiles – a sequence of quantiles to return in to_metric_data
- aggregate_children – whether to sum up children values (if present)
- count_calls – whether to count total amount of calls and total time
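The interplay of last_calls and quantiles can be illustrated with a standalone sketch. RollingQuantiles is a hypothetical class, and the nearest-rank method used here is an assumption; the way Satella actually interpolates quantiles is not stated in this document.

```python
import math
from collections import deque
from typing import Deque, Dict, Sequence


class RollingQuantiles:
    """Hypothetical sliding-window quantile calculator (nearest-rank method)."""

    def __init__(self, last_calls: int = 100, quantiles: Sequence[float] = (0.5, 0.95)):
        self.window: Deque[float] = deque(maxlen=last_calls)  # old values fall out
        self.quantiles = tuple(quantiles)

    def handle(self, value: float) -> None:
        self.window.append(value)

    def snapshot(self) -> Dict[float, float]:
        data = sorted(self.window)
        if not data:
            return {}
        # Nearest rank: the smallest value with at least q * n values at or below it
        return {q: data[max(1, math.ceil(q * len(data))) - 1] for q in self.quantiles}


summary = RollingQuantiles(last_calls=4, quantiles=(0.5, 1.0))
for value in (1, 2, 3, 4, 5, 6):
    summary.handle(value)        # only 3, 4, 5 and 6 remain in the window
print(summary.snapshot())        # {0.5: 4, 1.0: 6}
```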
histogram - a metric that puts given values into predefined buckets. Corresponds to Prometheus’ histogram metric
-
class
satella.instrumentation.metrics.metric_types.
HistogramMetric
(name: str, root_metric: Metric = None, metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, internal: bool = False, buckets: Sequence[float] = (0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0), aggregate_children: bool = True, *args, **kwargs)¶ A histogram, by Prometheus’ interpretation.
Parameters: - buckets – buckets to add. The first bucket spans from zero to the first value, the second from the first value to the second, and the last bucket from the last value to infinity, so there are len(buckets)+1 buckets. Buckets are expected to be passed in sorted order!
- aggregate_children – whether to accept child calls to be later presented as total
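The bucket layout described above, with len(buckets)+1 buckets for a sorted list of bounds, can be sketched with a binary search. bucket_counts() is a hypothetical helper, and treating each upper bound as inclusive (as Prometheus does with its le label) is an assumption about boundary handling.

```python
from bisect import bisect_left
from typing import Iterable, List, Sequence


def bucket_counts(values: Iterable[float], buckets: Sequence[float]) -> List[int]:
    """Count values per bucket; buckets must be sorted upper bounds."""
    counts = [0] * (len(buckets) + 1)            # the extra bucket runs to infinity
    for value in values:
        # Index of the first bound >= value, so upper bounds are inclusive
        counts[bisect_left(buckets, value)] += 1
    return counts


print(bucket_counts([0.05, 0.1, 0.3, 0.7, 2.0], buckets=(0.1, 0.5, 1.0)))
# [2, 1, 1, 1]
```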
callable - a metric whose value is a result of a given callable
-
class
satella.instrumentation.metrics.metric_types.
CallableMetric
(name, root_metric: Metric = None, metric_level: Optional[satella.instrumentation.metrics.metric_types.base.MetricLevel] = None, labels: Optional[dict] = None, internal: bool = False, value_getter: Optional[Callable[[], float]] = None, *args, **kwargs)¶ A metric whose value at any given point in time is the result of its callable.
Parameters: value_getter – a callable() that returns a float - the current value of this metric. It should be easy and cheap to compute, as this callable will be called each time a snapshot of metric state is requested
uptime - a metric to report uptime
-
class
satella.instrumentation.metrics.metric_types.
UptimeMetric
(*args, time_getter: Callable[[], float] = <built-in function monotonic>, **kwargs)¶ A metric that gives the difference between the current value of time_getter and its value at the initialization of this metric
Parameters: time_getter – a zero-argument callable that returns a float representing the passage of time. By default it’s the safe time.monotonic
Note that metric.measure() will include the time the client spends processing the generator’s content, so you might want to avoid measuring generators. However, if this is the behaviour that you want, you get it.
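Why a timed generator also accounts for the consumer’s time can be seen in the following standalone sketch. measure_generator() is a hypothetical decorator written for illustration and is not the signature of metric.measure(); a fake clock replaces real time so that the numbers are deterministic.

```python
import functools
import time
from typing import Callable


def measure_generator(record: Callable[[float], None],
                      time_getter: Callable[[], float] = time.monotonic):
    """Hypothetical decorator: report how long a generator took, start to exhaustion."""
    def decorator(gen_fn):
        @functools.wraps(gen_fn)
        def wrapper(*args, **kwargs):
            start = time_getter()
            try:
                yield from gen_fn(*args, **kwargs)
            finally:
                # Runs only once the consumer has finished with the generator
                record(time_getter() - start)
        return wrapper
    return decorator


clock = [0.0]              # fake clock, so the example is deterministic
recorded = []


@measure_generator(recorded.append, time_getter=lambda: clock[0])
def numbers():
    for i in range(3):
        clock[0] += 1.0    # one second of work in the generator itself
        yield i


for _ in numbers():
    clock[0] += 10.0       # ten seconds of work in the consumer

print(recorded)  # [33.0]
```

The generator itself did only 3 seconds of work, yet 33 seconds are recorded, because the generator is suspended while the consumer processes each item.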
Note that if you request an existing metric via getMetric() with a different type, a MetricAlreadyExists exception will be raised:
-
class
satella.exceptions.
MetricAlreadyExists
(msg, name, requested_type, existing_type)¶ Metric with given name already exists, but with a different type
The third parameter is optional. If set, all child metrics created during this metric’s instantiation will receive that metric level. If the metric already exists, its level will be set to the provided metric level, if one was passed.
All child metrics (going from the root metric to 0) will be initialized with the level that you just passed. To keep them consistent, the metric_level parameter passed to getMetric(), if specified, will also be set on the returned metric, even if that metric already existed.
This will be set on all children created by this call. If you have any children from previous calls, they will remain unaffected.
If you specify any kwargs, they will be delivered to the constructor of the last metric in the chain.
Since metrics in Satella are primarily thought out to end up in Prometheus, it is very important to understand Prometheus’ data model.
The root metric’s to_metric_data will output a flat set, called a MetricDataCollection:
-
class
satella.instrumentation.metrics.
MetricDataCollection
(*values)¶ A bunch of metric datas
-
postfix_with
(postfix: str) → satella.instrumentation.metrics.data.MetricDataCollection¶ Postfix every child with given postfix and return self
-
prefix_with
(prefix: str) → satella.instrumentation.metrics.data.MetricDataCollection¶ Prefix every child with given prefix and return self
-
remove_internals
()¶ Remove entries marked as internal
-
set_timestamp
(timestamp: float) → satella.instrumentation.metrics.data.MetricDataCollection¶ Assign every child this timestamp and return self
-
set_value
(value) → satella.instrumentation.metrics.data.MetricDataCollection¶ Set all children to a particular value and return self. Most useful
-
strict_eq
(other: satella.instrumentation.metrics.data.MetricDataCollection) → bool¶ Do values in other MetricDataCollection match also?
-
to_json
() → Union[list, dict, str, int, float, None]¶ Return a JSON-able representation of this object
-
which consists of MetricData:
-
class
satella.instrumentation.metrics.
MetricData
(name: str, value: float, labels: dict = None, timestamp: Optional[float] = None, internal: bool = False)¶ -
to_json
(prefix: str = '') → Union[list, dict, str, int, float, None]¶ Return a JSON-able representation of this object
-
On most metrics you can specify additional labels. They will serve to create an independent “sub-metric” of sorts, eg.
from satella.instrumentation.metrics import getMetric, MetricData, MetricDataCollection
metric = getMetric('root', 'int')
metric.runtime(2, label='value')
metric.runtime(3, label='key')
assert metric.to_metric_data() == MetricDataCollection(MetricData('root', 2, {'label': 'value'}),
                                                        MetricData('root', 3, {'label': 'key'}))
This functionality is provided by the below class:
-
class
satella.instrumentation.metrics.metric_types.
EmbeddedSubmetrics
(name, root_metric: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, metric_level: str = None, labels: Optional[dict] = None, internal: bool = False, *args, **kwargs)¶ A metric that can optionally accept some labels in its handle, and this will be counted as a separate metric. For example:
>>> metric = getMetric('root.test.IntValue', 'int', enable_timestamp=False)
>>> metric.handle(2, label='key')
>>> metric.handle(3, label='value')
If you try to inherit from it, refer to simple.IntegerMetric to see how to do it. Also, please pass all the arguments received from the child class into this constructor, as this constructor actually stores them! Refer to cps.ClicksPerTimeUnitMetric on how to do that.
-
clone
(labels: dict) → satella.instrumentation.metrics.metric_types.base.LeafMetric¶ Return a fresh instance of this metric, with its parent set to this metric, a particular set of labels, and the level INHERIT.
-
get_specific_metric_data
(labels: dict) → satella.instrumentation.metrics.data.MetricDataCollection¶ Return a MetricDataCollection for a child with given labels
-
Rolling your own metrics
In order to roll your own metrics, you must first subclass Metric. You can subclass any of the following classes, whichever suits you best. Please also refer to the existing metric implementations to see how best to subclass them.
-
class
satella.instrumentation.metrics.metric_types.
Metric
(name, root_metric: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, metric_level: Union[satella.instrumentation.metrics.metric_types.base.MetricLevel, int, None] = None, internal: bool = False, *args, **kwargs)¶ Container for child metrics. A base metric class, as well as the default metric.
Switch levels by setting metric.level to a proper value
Parameters: - enable_timestamp – append timestamp of last update to the metric
- internal – if True, this metric won’t be visible in exporters
-
get_timestamp
() → Optional[float]¶ Return this timestamp, or None if no timestamp support is enabled
-
reset
() → None¶ Delete all child metrics that this metric contains.
Also, if called on root metric, sets the runlevel to RUNTIME
-
class
satella.instrumentation.metrics.metric_types.
LeafMetric
(name, root_metric: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, metric_level: str = None, labels: Optional[dict] = None, internal: bool = False, *args, **kwargs)¶ A metric capable of generating only leaf entries.
You cannot hook up any children to a leaf metric.
-
class
satella.instrumentation.metrics.metric_types.base.
EmbeddedSubmetrics
(name, root_metric: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, metric_level: str = None, labels: Optional[dict] = None, internal: bool = False, *args, **kwargs)¶ A metric that can optionally accept some labels in its handle, and this will be counted as a separate metric. For example:
>>> metric = getMetric('root.test.IntValue', 'int', enable_timestamp=False)
>>> metric.handle(2, label='key')
>>> metric.handle(3, label='value')
If you try to inherit from it, refer to simple.IntegerMetric to see how to do it. Also, please pass all the arguments received from the child class into this constructor, as this constructor actually stores them! Refer to cps.ClicksPerTimeUnitMetric on how to do that.
-
clone
(labels: dict) → satella.instrumentation.metrics.metric_types.base.LeafMetric¶ Return a fresh instance of this metric, with its parent set to this metric, a particular set of labels, and the level INHERIT.
-
get_specific_metric_data
(labels: dict) → satella.instrumentation.metrics.data.MetricDataCollection¶ Return a MetricDataCollection for a child with given labels
-
Remember to define a class attribute named CLASS_NAME, a string that defines the name under which your metric type is requested. Once everything is done, register it by applying the following decorator to your metric class:
-
satella.instrumentation.metrics.metric_types.
register_metric
(cls)¶ Decorator to register your custom metrics
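The registration pattern, a class attribute naming the type plus a decorator that records the class, can be sketched on its own. This is not Satella’s code: METRIC_TYPES, register() and MaxMetric are hypothetical, and a real custom metric would subclass one of the classes listed above instead of being a plain class.

```python
from typing import Dict


METRIC_TYPES: Dict[str, type] = {}     # hypothetical registry: type name -> class


def register(cls):
    """Hypothetical class decorator: file the class under its CLASS_NAME."""
    METRIC_TYPES[cls.CLASS_NAME] = cls
    return cls                         # return the class unchanged so it stays usable


@register
class MaxMetric:
    CLASS_NAME = 'max'                 # the string under which the type is looked up

    def __init__(self):
        self.value = None

    def handle(self, value: float) -> None:
        if self.value is None or value > self.value:
            self.value = value


metric = METRIC_TYPES['max']()         # look the class up by name, then instantiate
metric.handle(3)
metric.handle(1)
print(metric.value)  # 3
```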
To zip together two or more metrics, you can use the following class:
-
class
satella.instrumentation.metrics.
AggregateMetric
(*metrics)¶ A virtual metric grabbing a few other metrics and having a single .handle() call represent a bunch of calls to other metrics. Ie, the following:
>>> m1 = getMetric('summary', 'summary')
>>> m2 = getMetric('histogram', 'histogram')
>>> m1.runtime()
>>> m2.runtime()
Is the same as:
>>> am = AggregateMetric(getMetric('summary', 'summary'), getMetric('histogram', 'histogram'))
>>> am.runtime()
Note that this class supports only reporting. It doesn’t read data, or read/write metric levels.
To automatically apply labels you can use this class:
-
class
satella.instrumentation.metrics.
LabeledMetric
(metric_to_wrap, **labels)¶ A wrapper around another metric that will always call its .runtime and .handle with some predefined labels
Use like:
>>> a = getMetric('a', 'counter')
>>> b = LabeledMetric(a, key=5)
Then this:
>>> a.runtime(1, key=5)
Will be equivalent to this:
>>> b.runtime(1)
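Both wrappers from this section follow the same forwarding idea, which the standalone sketch below tries to show. FanOut, WithLabels and Recorder are hypothetical classes that only mimic the behaviour described for AggregateMetric and LabeledMetric; the runtime(value, **labels) shape of the call is an assumption based on the examples above.

```python
class Recorder:
    """Hypothetical stand-in for a metric: remembers every reported value."""

    def __init__(self):
        self.seen = []

    def runtime(self, value=None, **labels):
        self.seen.append((value, labels))


class FanOut:
    """Forward a single call to several wrapped metrics."""

    def __init__(self, *metrics):
        self.metrics = metrics

    def runtime(self, *args, **labels):
        for metric in self.metrics:
            metric.runtime(*args, **labels)


class WithLabels:
    """Forward calls to one metric, always adding predefined labels."""

    def __init__(self, metric, **labels):
        self.metric = metric
        self.labels = labels

    def runtime(self, *args, **labels):
        # Labels given at call time are merged over the predefined ones
        self.metric.runtime(*args, **{**self.labels, **labels})


first, second = Recorder(), Recorder()
FanOut(first, second).runtime(1)
WithLabels(first, key=5).runtime(2)
print(first.seen)   # [(1, {}), (2, {'key': 5})]
print(second.seen)  # [(1, {})]
```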
Exporting data
In order to export data to Prometheus, you can use the following function:
-
satella.instrumentation.metrics.exporters.
metric_data_collection_to_prometheus
(mdc: satella.instrumentation.metrics.data.MetricDataCollection) → str¶ Render the data in the form understandable by Prometheus.
Values marked as internal will be skipped.
Parameters: - mdc – Metric data collection to render
- tree – MetricDataCollection returned by the root metric (or any metric for that instance).
Returns: a string output to present to Prometheus
For example in such a way:
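from satella.instrumentation.metrics import getMetric
from satella.instrumentation.metrics.exporters import metric_data_collection_to_prometheus

def export_to_prometheus():
    metric = getMetric()    # no name given, so this is the root metric
    return metric_data_collection_to_prometheus(metric.to_metric_data())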
Dots in metric names will be replaced with underscores.
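For orientation, a single sample in Prometheus’ text exposition format looks like name{label="value"} value, optionally followed by a timestamp in milliseconds. The sketch below renders one such line, including the dot replacement mentioned above. render_sample() is a hypothetical helper, not Satella’s exporter, and the assumption that the input timestamp is given in seconds is made for this sketch only.

```python
from typing import Mapping, Optional


def _escape(value: str) -> str:
    # Label values escape backslashes, double quotes and newlines
    return value.replace('\\', '\\\\').replace('"', '\\"').replace('\n', '\\n')


def render_sample(name: str, value: float,
                  labels: Optional[Mapping[str, object]] = None,
                  timestamp: Optional[float] = None) -> str:
    """Render one sample line; `timestamp` is assumed to be in seconds."""
    line = name.replace('.', '_')               # dots are not valid in Prometheus names
    if labels:
        pairs = ','.join('%s="%s"' % (key, _escape(str(val)))
                         for key, val in sorted(labels.items()))
        line += '{' + pairs + '}'
    line += ' %s' % (value,)
    if timestamp is not None:
        line += ' %d' % (timestamp * 1000)      # the text format uses milliseconds
    return line


print(render_sample('satella.metrics.my_metric', 2, {'label': 'value'}))
# satella_metrics_my_metric{label="value"} 2
```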
If you need an HTTP server that exports metrics for Prometheus, use this class, a daemon thread that makes it easy to expose metrics to Prometheus:
-
class
satella.instrumentation.metrics.exporters.
PrometheusHTTPExporterThread
(interface: str, port: int, extra_labels: Optional[dict] = None, enable_metric: bool = False)¶ A daemon thread that listens on the given interface as an HTTP server, ready to serve as a connection point for Prometheus to scrape metrics off this service.
Additionally, if the user requests it, this may export a metric called prometheus.exports_per_time, which is a cps with time_unit_vectors=[1, 20, 60] counting the number of exports in the given time periods.
Parameters: - interface – an interface to bind to
- port – a port to bind to
- extra_labels – extra labels to add to each metric data point, such as the name of the service or the hostname
- enable_metric – whether to enable the metric
-
get_metric_data
() → satella.instrumentation.metrics.data.MetricDataCollection¶ Obtain metric data.
Overload to provide custom source of metric data.
-
run
() → None¶ Calls self.loop() indefinitely, until terminating condition is met
-
terminate
(force: bool = False) → satella.instrumentation.metrics.exporters.prometheus.PrometheusHTTPExporterThread¶ Order this thread to terminate and return self.
You will need to .join() on this thread to ensure that it has quit.
Parameters: force – whether to terminate this thread by injecting an exception into it
Useful data structures
Sometimes you want data structures that report metrics about themselves. Here they are:
-
class
satella.instrumentation.metrics.structures.
MetrifiedThreadPoolExecutor
(max_workers=None, thread_name_prefix='', initializer=None, initargs=(), time_spent_waiting=None, time_spent_executing=None, waiting_tasks: Optional[satella.instrumentation.metrics.metric_types.callable.CallableMetric] = None, metric_level: satella.instrumentation.metrics.metric_types.base.MetricLevel = <MetricLevel.RUNTIME: 2>)¶ A thread pool executor that provides execution statistics as metrics.
This class will also backport some features of Python 3.8’s thread pool executor to earlier Pythons: thread name prefix, initializer, initargs and BrokenThreadPool behaviour.
Parameters: - time_spent_waiting – a metric (can be aggregate) to which times spent waiting in the queue will be deposited
- time_spent_executing – a metric (can be aggregate) to which times spent executing will be deposited
- waiting_tasks – a fresh CallableMetric that will be patched to yield the number of currently waiting tasks
- metric_level – a level with which to log to these two metrics
-
get_queue_length
() → int¶ Return the amount of tasks currently in the queue
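The general idea of an executor that reports how long tasks waited and how long they ran can be sketched with the standard library alone. TimedExecutor and its on_wait/on_exec callbacks are hypothetical names; this is not how MetrifiedThreadPoolExecutor is implemented, only an illustration of the two timings named in the parameters above. The positional-only marker in submit() requires Python 3.8 or newer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class TimedExecutor(ThreadPoolExecutor):
    """Hypothetical executor reporting queue wait time and execution time."""

    def __init__(self, *args, on_wait=None, on_exec=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.on_wait = on_wait or (lambda seconds: None)
        self.on_exec = on_exec or (lambda seconds: None)

    def submit(self, fn, /, *args, **kwargs):
        queued_at = time.monotonic()

        def timed():
            started_at = time.monotonic()
            self.on_wait(started_at - queued_at)       # time spent in the queue
            try:
                return fn(*args, **kwargs)
            finally:
                self.on_exec(time.monotonic() - started_at)  # time spent running

        return super().submit(timed)


waits, runs = [], []
with TimedExecutor(max_workers=1, on_wait=waits.append, on_exec=runs.append) as pool:
    result = pool.submit(pow, 2, 5).result()

print(result)                  # 32
print(len(waits), len(runs))   # 1 1
```

In a metrics setting the two callbacks would be replaced by calls that deposit the measured times into the corresponding metrics.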
-
class
satella.instrumentation.metrics.structures.
MetrifiedCacheDict
(stale_interval, expiration_interval, value_getter, value_getter_executor=None, cache_failures_interval=None, time_getter=<built-in function monotonic>, default_value_factory=None, cache_hits: Optional[satella.instrumentation.metrics.metric_types.counter.CounterMetric] = None, cache_miss: Optional[satella.instrumentation.metrics.metric_types.counter.CounterMetric] = None, refreshes: Optional[satella.instrumentation.metrics.metric_types.counter.CounterMetric] = None, how_long_refresh_takes: Optional[satella.instrumentation.metrics.metric_types.measurable_mixin.MeasurableMixin] = None)¶ A CacheDict with metrics!
Parameters: - cache_hits – a counter metric that will be updated with +1 each time there’s a cache hit
- cache_miss – a counter metric that will be updated with +1 each time there’s a cache miss
- refreshes – a metric that will be updated with +1 each time there’s a cache refresh
- how_long_refresh_takes – a metric that will be ticked with time value_getter took
-
class
satella.instrumentation.metrics.structures.
MetrifiedLRUCacheDict
(stale_interval: float, expiration_interval: float, value_getter, value_getter_executor=None, cache_failures_interval=None, time_getter=<built-in function monotonic>, default_value_factory=None, max_size: int = 100, cache_hits: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, cache_miss: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, refreshes: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, how_long_refresh_takes: Optional[satella.instrumentation.metrics.metric_types.measurable_mixin.MeasurableMixin] = None, evictions: Optional[satella.instrumentation.metrics.metric_types.base.Metric] = None, **kwargs)¶ A LRUCacheDict with metrics!
Parameters: - cache_hits – a counter metric that will be updated with +1 each time there’s a cache hit
- cache_miss – a counter metric that will be updated with +1 each time there’s a cache miss
- refreshes – a metric that will be updated with +1 each time there’s a cache refresh
- how_long_refresh_takes – a metric that will be ticked with time value_getter took
-
class
satella.instrumentation.metrics.structures.
MetrifiedExclusiveWritebackCache
(*args, cache_hits: Optional[satella.instrumentation.metrics.metric_types.counter.CounterMetric] = None, cache_miss: Optional[satella.instrumentation.metrics.metric_types.counter.CounterMetric] = None, entries_waiting: Optional[satella.instrumentation.metrics.metric_types.callable.CallableMetric] = None, **kwargs)¶