Generator for retention graphs from ElasticSearch data
Jakub Sokołowski 8d09740b02
add --fleet flag to narrow down query to eth.prod by default
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2020-07-15 20:08:23 +02:00
examples add README 2020-07-08 15:34:54 +02:00
.gitignore ignore *.png files 2020-07-08 15:24:13 +02:00
README.md add --fleet flag to narrow down query to eth.prod by default 2020-07-15 20:08:23 +02:00
graph.py parametrize figsize 2020-07-08 15:21:32 +02:00
main.py add --fleet flag to narrow down query to eth.prod by default 2020-07-15 20:08:23 +02:00
query.py add --fleet flag to narrow down query to eth.prod by default 2020-07-15 20:08:23 +02:00
requirements.txt add missing dependency on elasticsearch 2020-07-08 16:00:17 +02:00


Description

This Python script generates graphs that visualize peer retention in the Status app.

Details

The script queries an ElasticSearch endpoint for logstash-* indices and counts, for every day, the log messages in which the peer_id field is set.
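A minimal sketch of what such an aggregation request could look like is shown below. It is not taken from query.py: the function name `build_query`, the `@timestamp` and `fleet` field names, the default values, and the use of `calendar_interval` (which requires a reasonably recent ElasticSearch version) are all assumptions made for illustration.

```python
def build_query(field="peer_id", fleet="eth.prod", max_size=10000):
    """Build a request body that buckets log messages per day and per field value.

    Hypothetical helper: field names and defaults are assumptions,
    not the values used by the actual script.
    """
    return {
        # No individual hits are needed, only the aggregation buckets.
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    # Keep only messages where the counted field is set.
                    {"exists": {"field": field}},
                    # Narrow the query down to a single fleet (assumed field name).
                    {"term": {"fleet": fleet}},
                ]
            }
        },
        "aggs": {
            "per_day": {
                # One bucket per calendar day (assumed timestamp field).
                "date_histogram": {"field": "@timestamp", "calendar_interval": "1d"},
                "aggs": {
                    # Within each day, count messages per distinct field value.
                    "peers": {"terms": {"field": field, "size": max_size}}
                },
            }
        },
    }
```

With the official `elasticsearch` Python client, a body like this would typically be passed to the client's `search` method together with the index pattern; the exact call depends on the client version in use.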

This data is then analyzed using Pandas and graphed using Seaborn; the resulting graphs are saved as PNG images.
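The analysis step can be illustrated with a small cohort-retention sketch. It assumes the input is a list of (day, peer_id) pairs and that a peer's cohort is the first day it was seen; the function names `retention_matrix` and `plot_heatmap`, the heatmap styling, and the default DPI and figure size are hypothetical and may differ from what graph.py actually does.

```python
import pandas as pd


def retention_matrix(rows):
    """Return the fraction of each cohort seen again N days after first contact.

    rows: iterable of (day, peer_id) pairs (assumed input shape).
    """
    df = pd.DataFrame(rows, columns=["day", "peer_id"]).drop_duplicates()
    df["day"] = pd.to_datetime(df["day"])
    # A peer's cohort is the first day on which it was seen.
    df["cohort"] = df.groupby("peer_id")["day"].transform("min")
    # Age is the number of days elapsed since the peer's first appearance.
    df["age"] = (df["day"] - df["cohort"]).dt.days
    # Count distinct peers per (cohort, age) cell.
    counts = df.pivot_table(
        index="cohort", columns="age", values="peer_id", aggfunc="nunique"
    )
    # Divide every row by its cohort size (the age-0 column).
    return counts.div(counts[0], axis=0)


def plot_heatmap(matrix, path, dpi=150, figsize=(12, 8)):
    """Render the matrix as a Seaborn heatmap and save it as a PNG."""
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=figsize)
    sns.heatmap(matrix, annot=True, fmt=".0%", cmap="YlGnBu", ax=ax)
    fig.savefig(path, dpi=dpi)
    plt.close(fig)
```

Each row of the resulting matrix is one cohort and each column is the number of days since that cohort first appeared, so the first column is always 100% and later columns show how many of those peers came back.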

This was built by combining a CSV-generating script with a cohort analysis done by @jakubgs and @bgits.

Usage

The script provides a number of options:

Usage: main.py [options]

This generates a CSV with buckets of peer_ids for every day.

Options:
  -h, --help            show this help message and exit
  -H ES_HOST, --host=ES_HOST
                        ElasticSearch host.
  -P ES_PORT, --port=ES_PORT
                        ElasticSearch port.
  -i INDEX_PATTERN, --index-pattern=INDEX_PATTERN
                        Pattern for matching indices.
  -f FIELD, --field=FIELD
                        Name of the field to count.
  -F FLEET, --fleet=FLEET
                        Name of the fleet to query.
  -m MAX_SIZE, --max-size=MAX_SIZE
                        Max number of counts to find.
  -d IMAGE_DPI, --image-dpi=IMAGE_DPI
                        DPI of generated PNG images.
  -o OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        Dir into which images are generated.

Example: ./unique_count.py -i "logstash-2019.11.*" -f "peer_id"
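The help text above has the shape produced by Python's `optparse` module. The sketch below shows how a parser with the same flags could be declared. The `dest` names follow the metavariables in the help output, while every default value is an assumption (only `eth.prod` as the default fleet is suggested by the commit history), and `parse_opts` is a hypothetical function name.

```python
from optparse import OptionParser


def parse_opts(argv):
    """Parse command line flags; all defaults here are illustrative assumptions."""
    parser = OptionParser(usage="Usage: main.py [options]")
    parser.add_option("-H", "--host", dest="es_host", default="localhost",
                      help="ElasticSearch host.")
    parser.add_option("-P", "--port", dest="es_port", type="int", default=9200,
                      help="ElasticSearch port.")
    parser.add_option("-i", "--index-pattern", dest="index_pattern",
                      default="logstash-*", help="Pattern for matching indices.")
    parser.add_option("-f", "--field", dest="field", default="peer_id",
                      help="Name of the field to count.")
    parser.add_option("-F", "--fleet", dest="fleet", default="eth.prod",
                      help="Name of the fleet to query.")
    parser.add_option("-m", "--max-size", dest="max_size", type="int",
                      default=10000, help="Max number of counts to find.")
    parser.add_option("-d", "--image-dpi", dest="image_dpi", type="int",
                      default=200, help="DPI of generated PNG images.")
    parser.add_option("-o", "--output-dir", dest="output_dir", default="./",
                      help="Dir into which images are generated.")
    # optparse returns (options, positional_args); only options are used here.
    opts, _ = parser.parse_args(argv)
    return opts
```

For instance, `parse_opts(["-i", "logstash-2019.11.*", "-f", "peer_id"])` would yield an options object whose `index_pattern` is `logstash-2019.11.*`, with the remaining values left at their (assumed) defaults.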

Example