Observability
Prometheus Metrics
For monitoring and alerting, Fennel exposes all relevant metrics behind a Prometheus endpoint. You can point Grafana, New Relic, or any other metrics system that speaks the Prometheus protocol at this endpoint. Once your metrics system is connected to Fennel's Prometheus endpoint, you can keep using your existing monitoring and alerting stack.
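As a rough illustration of what a client sees when it scrapes the endpoint, the sketch below parses a few lines of Prometheus text exposition format using only the standard library. The sample payload, its label values, and the helper name parse_samples are made up for illustration. In practice the text would be fetched from your deployment's endpoint URL, which is not shown here.

```python
import re

# Hypothetical scrape output; metric names come from this page,
# label values and numbers are invented for the example.
SAMPLE = """\
# TYPE fennel_source_dataset_metrics gauge
fennel_source_dataset_metrics{source_name="kafka_src",dataset_name="orders",metric="backlog"} 42
# TYPE fennel_request_counter counter
fennel_request_counter{target="log",code="200"} 1000
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


fennel_samples = [s for s in parse_samples(SAMPLE) if s[0].startswith("fennel_")]
```

This is a minimal parser for reading and debugging. A maintained client library or your existing Prometheus-compatible collector is likely the better choice for production scraping.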
The following metrics are exposed via the Prometheus endpoint:
Write path metrics
fennel_source_dataset_metrics
Reports the rate and the health of data ingestion from sources into datasets.
Type: Gauge
Labels:
source_name - name of the source
dataset_name - name of the dataset
metric - metric to report. Possible values are backlog and event_timestamp_seconds. backlog reports the number of events ingested by the source but not yet applied to the dataset. event_timestamp_seconds reports the event timestamp of the last message processed (this is sampled and hence approximate).
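One way to turn these two gauge values into a health signal is to compare the last processed event timestamp against the current time. The sketch below assumes event_timestamp_seconds is a Unix epoch timestamp in seconds, which the name suggests but this page does not state. The function names and thresholds are hypothetical.

```python
import time


def ingestion_lag_seconds(event_timestamp_seconds, now=None):
    """Seconds between now and the last processed event (never negative).

    Assumes the gauge value is Unix epoch seconds.
    """
    if now is None:
        now = time.time()
    return max(0.0, now - event_timestamp_seconds)


def is_source_unhealthy(backlog, event_timestamp_seconds,
                        max_backlog=10_000, max_lag_seconds=300.0, now=None):
    """Flag a source/dataset pair whose backlog or lag exceeds a threshold.

    The default thresholds are placeholders, not recommendations.
    """
    lag = ingestion_lag_seconds(event_timestamp_seconds, now)
    return backlog > max_backlog or lag > max_lag_seconds
```

Because event_timestamp_seconds is sampled, a lag computed this way is approximate and is best used with generous thresholds.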
fennel_source_dataset_counter
Reports the number of messages processed and the number of schema errors encountered by the data source.
Type: Counter
Labels:
source_name - name of the source
dataset_name - name of the dataset
format - format of the data being processed
metric - metric to report. Possible values are msgs_processed and schema_error. msgs_processed reports the number of messages processed by the source dataset so far. schema_error reports the number of messages that failed due to schema mismatch errors.
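Since both values are counters, a schema error ratio is more meaningful over an interval than over the lifetime totals. The sketch below computes it from two successive scrapes. It assumes msgs_processed includes the messages that failed schema validation, which this page does not confirm. The function names are hypothetical.

```python
def counter_increase(previous, current):
    """Increase of a counter between two scrapes.

    If the counter went down, assume it was reset (e.g. a restart)
    and count everything since the reset.
    """
    if current < previous:
        return current
    return current - previous


def schema_error_ratio(prev, curr):
    """Fraction of messages that hit schema errors between two scrapes.

    prev and curr map the metric label value ("msgs_processed",
    "schema_error") to the counter value at each scrape.
    """
    processed = counter_increase(prev["msgs_processed"], curr["msgs_processed"])
    errors = counter_increase(prev["schema_error"], curr["schema_error"])
    if processed == 0:
        return 0.0
    return errors / processed
```

For example, going from 1000 processed and 10 errors to 1500 processed and 15 errors gives 5 errors over 500 messages for that interval.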
fennel_pipeline_metrics
Reports the volume and the health of data flowing through the pipeline into the dataset.
Type: Gauge
Labels:
pipeline_name - the name of the pipeline
dataset_name - the name of the dataset
metric - metric to report. Possible values are backlog, event_timestamp_seconds, and error. backlog reports the number of events in the input datasets that haven't been processed by the pipeline. event_timestamp_seconds reports the event timestamp of the last message processed (this is sampled and hence approximate). error is the number of messages for which the pipeline encountered an error.
fennel_expectation_status_counter
Reports the number of rows in the dataset that passed/failed the given data expectation.
Type: Counter
Labels:
dataset - the name of the dataset whose data is monitored
expectation_name - the name of the specific expectation
status - whether the expectation passed or not. Valid values are success or failure.
fennel_source_watermark_discarded_rows
Reports the number of rows discarded at the source dataset due to watermarking, i.e. rows that are older than the disorder set on the source.
Type: Counter
Labels:
source_name - the name of the source
dataset_name - the name of the dataset
fennel_source_watermark_timestamp
Reports the current watermark timestamp of the source dataset. Any event older than this timestamp is discarded and reported in the fennel_source_watermark_discarded_rows metric.
Type: Gauge
Labels:
source_name - the name of the source
dataset_name - the name of the dataset
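The relationship between the two watermark metrics can be pictured with a small sketch: events with a timestamp older than the current watermark are dropped and counted, and the rest are kept. This is only an illustration of the rule described above, not Fennel's implementation. The function name and the event representation are hypothetical, and treating an event exactly at the watermark as kept is an assumption.

```python
def split_by_watermark(events, watermark_ts):
    """Partition (event_ts, payload) pairs around a watermark timestamp.

    Events strictly older than the watermark are treated as discarded.
    Returns (kept, discarded).
    """
    kept, discarded = [], []
    for event_ts, payload in events:
        if event_ts < watermark_ts:
            discarded.append((event_ts, payload))
        else:
            kept.append((event_ts, payload))
    return kept, discarded


events = [(100, "a"), (250, "b"), (180, "c"), (300, "d")]
kept, discarded = split_by_watermark(events, watermark_ts=200)
# len(discarded) plays the role of the discarded-rows counter increase.
```

A steadily growing discarded-rows counter may indicate that the disorder configured on the source is too small for how late the data actually arrives.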
Read path metrics
fennel_request_counter
Reports the number of API calls to Fennel, broken down by endpoint.
Type: Counter
Labels:
target - the endpoint invoked. Valid values are sync, extract_features, log, and extract_historical_features.
code - the status code corresponding to the request. Valid values are valid HTTP codes.
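The target and code labels together are enough to derive a per-endpoint error rate. The sketch below aggregates counter samples by target and treats any status code of 400 or above as an error. That cutoff is an assumption, and you may prefer to count only 5xx codes. The function name and the sample values are hypothetical.

```python
from collections import defaultdict


def error_rate_by_target(samples, error_from=400):
    """Compute the share of error responses per target.

    samples is an iterable of (labels, value) pairs, where labels has
    the "target" and "code" keys described above.
    """
    totals = defaultdict(float)
    errors = defaultdict(float)
    for labels, value in samples:
        target = labels["target"]
        totals[target] += value
        if int(labels["code"]) >= error_from:
            errors[target] += value
    return {t: (errors[t] / totals[t] if totals[t] else 0.0) for t in totals}


samples = [
    ({"target": "extract_features", "code": "200"}, 990.0),
    ({"target": "extract_features", "code": "500"}, 10.0),
    ({"target": "log", "code": "200"}, 500.0),
]
rates = error_rate_by_target(samples)
```

Applied to raw counter values this gives a lifetime rate. For alerting, the same calculation over counter increases between scrapes is usually more useful.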
fennel_extract_features_counter
Reports the number of times a given feature has been extracted.
Type: Counter
Labels:
featureset - the name of the featureset in which the feature to be extracted is defined
feature - the name of the feature to be extracted
workflow - the name of the workflow set in the feature extraction call. Note that it defaults to default when not set.
target - either extract_features or extract_historical_features
fennel_extract_features_error_counter
Reports the number of errors in feature extraction calls, by workflow.
Type: Counter
Labels:
workflow - the name of the workflow set in the feature extraction call. Note that it defaults to default when not set.
target - either extract_features or extract_historical_features
code - the status code corresponding to the request. Valid values are valid HTTP codes.
fennel_request_latency_seconds
Reports the latencies of the API calls made to Fennel.
Type: Histogram
Labels:
target - valid values are extract_features, extract_historical_features, sync, and log.
extract_features_latency_seconds
Reports the latencies of feature extraction calls in particular.
Type: Histogram
Labels:
workflow - the name of the workflow set in the feature extraction call. Note that it defaults to default when not set.
target - either extract_features or extract_historical_features
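Prometheus histograms are conventionally exposed as cumulative bucket counts keyed by an le (upper bound) label, and latency percentiles are estimated from those buckets. The sketch below shows one way to estimate a quantile by linear interpolation within a bucket, similar in spirit to PromQL's histogram_quantile. The bucket boundaries and counts are invented for the example, and the function is an illustration rather than a replacement for the query language.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-th quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs and is
    expected to include the +Inf bucket. The lowest bucket is assumed
    to start at 0.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest
                # finite bound, since nothing better is known.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical latency buckets in seconds: (le, cumulative count).
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, latency_buckets)
```

The estimate is only as precise as the bucket boundaries allow, so a percentile that lands in a wide bucket should be read as a range rather than an exact latency.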