Prometheus - инструмент сбора метрик приложений.
Grafana - отображение собранных статистических данных о работе приложений.
На картинке ниже отображены результат работы двух приложений, расположенных на разных хостах.
Запущенный процесс Prometheus:
1289 ? Ssl 2:41 /usr/bin/prometheus-node-exporter --collector.diskstats.ignored-devices=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvmed+nd+p)d+$ --collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/) --collector.netdev.ignored-devices=^lo$ --collector.textfile.directory=/var/lib/prometheus/node-exporter 21144 ? Ssl 3:02 /usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.console.templates=/etc/prometheus/consoles --web.console.libraries=/etc/prometheus/console_libraries --web.listen-address=0.0.0.0:9090 --storage.tsdb.retention.time=2d --storage.tsdb.retention.size=512MB
Параметры --storage.tsdb.retention.time=2d и --storage.tsdb.retention.size=512MB ограничивают количество сохраненных данных(метрик). Если не ограничить, то их накапливается ОГРОМНОЕ(ПРОСТО НЕРЕАЛЬНОЕ) количество. Заданы в сервисе старта prometheus (/etc/init.d/prometheus см. в конце заметки полный текст):
HELPER_ARGS="--name=$NAME --output=$LOGFILE --pidfile=$PIDFILE --user=$USER --storage.tsdb.retention=2d"
Просмотр Prometheus.
Grafana - инструмент мониторинга приложений. Получение метрик из spring-boot приложения делается с помощью io.spring.dependency-management.
Пример проекта shop_kotlin
Установка Grafana.
Загрузить Grafana с https://grafana.com/grafana/download
или так:
sudo apt-get install -y adduser libfontconfig1 wget https://dl.grafana.com/enterprise/release/grafana-enterprise_10.1.0_amd64.deb sudo dpkg -i grafana-enterprise_10.1.0_amd64.deb
TODO:
- Обстрелять yandex-tankом проект shop_kotlin
Ссылки:
Set up and observe a Spring Boot application with Grafana Cloud, Prometheus, and OpenTelemetry
Установка prometheus
Осваиваем мониторинг с Prometheus. Часть 3. Настройка Prometheus server (!!!)
Файл /etc/init.d/prometheus:
#!/bin/sh # kFreeBSD do not accept scripts as interpreters, using #!/bin/sh and sourcing. if [ true != "$INIT_D_SCRIPT_SOURCED" ] ; then set "$0" "$@"; INIT_D_SCRIPT_SOURCED=true . /lib/init/init-d-script fi ### BEGIN INIT INFO # Provides: prometheus # Required-Start: $remote_fs $syslog # Required-Stop: $remote_fs $syslog # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Monitoring system and time series database # Description: Prometheus is a systems and services monitoring system. It # collects metrics from configured targets at given # intervals, evaluates rule expressions, displays the # results, and can trigger alerts if some condition is # observed to be true. ### END INIT INFO # Author: Martín Ferrari <Адрес электронной почты защищен от спам-ботов. Для просмотра адреса в браузере должен быть включен Javascript. > DESC="monitoring system and time series database" DAEMON=/usr/bin/prometheus NAME=prometheus USER=prometheus PIDFILE=/var/run/prometheus/prometheus.pid LOGFILE=/var/log/prometheus/prometheus.log CFGFILE=/etc/prometheus/prometheus.yml HELPER=/usr/bin/daemon HELPER_ARGS="--name=$NAME --output=$LOGFILE --pidfile=$PIDFILE --user=$USER --storage.tsdb.retention=2d" ARGS="" [ -r /etc/default/$NAME ] && . /etc/default/$NAME config_check() { retcode=0 errors="$(/usr/bin/promtool check config $CFGFILE 2>&1)" || retcode=$? if [ $retcode -ne 0 ]; then log_failure_msg echo "Configuration test failed. Output of config test was:" >&2 echo "$errors" >&2 return $retcode fi } do_start_prepare() { mkdir -p `dirname $PIDFILE` || true chown -R $USER: /var/lib/prometheus/metrics2 chown -R $USER: `dirname $LOGFILE` chown -R $USER: `dirname $PIDFILE` } do_start_cmd() { # Return # 0 if daemon has been started # 1 if daemon was already running # 2 if daemon could not be started $HELPER $HELPER_ARGS --running && return 1 config_check || return 2 $HELPER $HELPER_ARGS -- $DAEMON $ARGS || return 2 return 0 } do_stop_cmd() { # Return # 0 if daemon has been stopped # 1 if daemon was already stopped # 2 if daemon could not be stopped # other if a failure occurred $HELPER $HELPER_ARGS --running || return 1 $HELPER $HELPER_ARGS --stop || return 2 # wait for the process to really terminate for n in 1 2 3 4 5; do sleep $n $HELPER $HELPER_ARGS --running || break done $HELPER $HELPER_ARGS --running || return 0 return 2 } do_reload() { log_daemon_msg "Reloading $DESC configuration files" "$NAME" $HELPER $HELPER_ARGS --running || return 1 config_check || return 2 helper_pid=$(cat $PIDFILE) if [ -z "$helper_pid" ]; then log_failure_msg "Unable to find PID" return 1 fi start-stop-daemon --stop --signal 1 --quiet \ --ppid "$helper_pid" --exec "$DAEMON" log_end_msg $? }
Для Prometheus нужно настроить задания для сбора метрик. Пример заданий в /etc/prometheus/prometheus.yml:
global: scrape_interval: 15s evaluation_interval: 15s external_labels: monitor: 'example' rule_files: - "rules.yml" scrape_configs: - job_name: 'prometheus' scrape_interval: 5s scrape_timeout: 5s static_configs: - targets: ['localhost:9090'] - job_name: "camel_kafka_consumer_extdto(1.57)" scrape_interval: 5s metrics_path: "/camel_kafka_consumer_extdto/api/actuator/prometheus/" static_configs: - targets: ["192.168.1.57:8992"] - job_name: "shop_kotlin(1.20)" scrape_interval: 5s metrics_path: "/shop_kotlin/api/actuator/prometheus" static_configs: - targets: ["192.168.1.20:8988"] - job_name: "camel_kafka_consumer_extdto(1.20)" scrape_interval: 5s metrics_path: "/camel_kafka_consumer_extdto/api/actuator/prometheus" static_configs: - targets: ["192.168.1.20:8992"] - job_name: "shop_kafka_consumer(1.20)" scrape_interval: 5s metrics_path: "/camel_kafka_consumer_extdto/api/actuator/prometheus" static_configs: - targets: ["192.168.1.20:8992"] - job_name: "shop_kafka_consumer(1.57:8698)" scrape_interval: 5s metrics_path: "/shop_kafka_consumer/api/actuator/prometheus" static_configs: - targets: ["192.168.1.57:8698"] - job_name: "shop_kotlin(1.57:8988)" scrape_interval: 5s metrics_path: "/shop_kotlin/api/actuator/prometheus" static_configs: - targets: ["192.168.1.57:8988"] - job_name: "vacancy_backend(1.57:8988)" scrape_interval: 5s metrics_path: "/vacancy/api/actuator/prometheus" static_configs: - targets: ["192.168.1.57:8988"] - job_name: "vacancy_backend(1.20:8988)" scrape_interval: 5s metrics_path: "/vacancy/api/actuator/prometheus" static_configs: - targets: ["192.168.1.20:8988"]
Путь к файлу заданий указан в параметре "CFGFILE" файла /etc/init.d/prometheus (см.выше "CFGFILE=/etc/prometheus/prometheus.yml")