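Kapacitor is the open-source data processing engine from InfluxData for processing, monitoring, and alerting on time series data. Tasks are defined in TICKscript. Kapacitor can process both stream and batch data from InfluxDB: it can periodically aggregate and transform data from InfluxDB and write the results back to InfluxDB, or it can raise alerts through channels such as Email, HTTP, TCP, HipChat, OpsGenie, Alerta, Sensu, PagerDuty, and Slack.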
1. Installation
After changing data_dir and the [logging] path in kapacitor.conf, the service failed on restart with the following error:
[XXXXXXXX kapacitor]# systemctl status kapacitor
● kapacitor.service - Time series data processing engine.
Loaded: loaded (/usr/lib/systemd/system/kapacitor.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2019-01-29 15:17:42 CST; 312ms ago
Docs: https://github.com/influxdb/kapacitor
Process: 28842 ExecStart=/usr/bin/kapacitord -config /etc/kapacitor/kapacitor.conf $KAPACITOR_OPTS (code=exited, status=1/FAILURE)
Main PID: 28842 (code=exited, status=1/FAILURE)
Jan 29 15:17:42 systemd[1]: Unit kapacitor.service entered failed state.
Jan 29 15:17:42 systemd[1]: kapacitor.service failed.
Jan 29 15:17:42 systemd[1]: kapacitor.service holdoff time over, scheduling restart.
Jan 29 15:17:42 systemd[1]: Stopped Time series data processing engine..
Jan 29 15:17:42 systemd[1]: start request repeated too quickly for kapacitor.service
Jan 29 15:17:42 systemd[1]: Failed to start Time series data processing engine..
Jan 29 15:17:42 systemd[1]: Unit kapacitor.service entered failed state.
Jan 29 15:17:42 systemd[1]: kapacitor.service failed.
The service file /usr/lib/systemd/system/kapacitor.service shows that the service runs as the kapacitor user.
Solution: give the kapacitor user ownership of the new directories:
chown -R kapacitor:kapacitor data
chown -R kapacitor:kapacitor logs
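Before restarting, it can help to confirm that every relocated directory really belongs to the service account. The following is a minimal sketch, assuming GNU stat (as found on a systemd-based Linux host like the one above). The helper name check_owner and the example paths are placeholders, not part of Kapacitor.

```shell
#!/bin/sh
# check_owner USER DIR...
# Prints each directory whose owner is not USER and returns 1 if any was found.
# Hypothetical helper; relies on the -c option of GNU stat.
check_owner() {
    want=$1
    shift
    rc=0
    for dir in "$@"; do
        # If stat itself fails (missing path, no access), count it as a failure.
        owner=$(stat -c '%U' "$dir") || { rc=1; continue; }
        if [ "$owner" != "$want" ]; then
            echo "$dir is owned by $owner, expected $want"
            rc=1
        fi
    done
    return $rc
}

# Example (paths are placeholders for your own data_dir and log directory):
#   check_owner kapacitor /path/to/data /path/to/logs || echo "run chown first"
```

If the function reports a mismatch, the chown commands above fix it.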
2. Errors while debugging
The debug command was:
kapacitor record stream -task cpu_alert -duration 60s
Error message:
failed to create recording file: open /var/lib/kapacitor/replay/119w1985-0101-120c-83b0-c9XXXXXXXXX.srpl: permission denied
Check the permissions on the directory named in the error; it must be writable by the kapacitor user.
Solution:
chown -R kapacitor:kapacitor replay
3. Log file too large: adjust the log level
After one week in production, kapacitor.log had grown to 4 GB.
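The notes above record only the symptom. One common way to cut log volume is to raise the level key in the [logging] section of kapacitor.conf (for example from INFO to ERROR) and restart the service. The following is a minimal sketch, assuming the stock layout with a level = "..." line under [logging]. The helper set_log_level is hypothetical, and it writes to stdout so the result can be reviewed before the real file is replaced.

```shell
#!/bin/sh
# set_log_level LEVEL FILE
# Prints FILE with the `level` key inside the [logging] section set to LEVEL.
# Keys named `level` in any other section are left untouched.
set_log_level() {
    awk -v lvl="$1" '
        # Any section header switches the flag; it is on only for [logging].
        /^[ \t]*\[/ { in_logging = ($0 ~ /^[ \t]*\[logging\]/) }
        # Rewrite the value, keeping the original indentation and key name.
        in_logging && /^[ \t]*level[ \t]*=/ {
            sub(/=.*/, "= \"" lvl "\"")
        }
        { print }
    ' "$2"
}

# Example:
#   set_log_level ERROR /etc/kapacitor/kapacitor.conf > /tmp/kapacitor.conf.new
```

After checking the output, copy it over kapacitor.conf and restart with systemctl restart kapacitor.service.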
4. Starting, stopping, and checking the service
Start the service:
systemctl start kapacitor.service
Stop the service:
systemctl stop kapacitor.service
Check the service status:
systemctl status kapacitor.service
5. Some useful commands
List the deployed tasks:
kapacitor list tasks
To change a task, edit its TICKscript file directly, then redefine the task.
For example, for a task with ID cpu_alert:
kapacitor define cpu_alert -tick cpu_alert.tick
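The edit-and-redefine step can be wrapped so that a mistyped file name is caught before the CLI is called, and the stored definition is printed afterwards for a quick check. This is only a sketch: redefine_task is a hypothetical helper, while kapacitor define and kapacitor show are the existing CLI subcommands.

```shell
#!/bin/sh
# redefine_task ID TICKFILE
# Re-registers an edited TICKscript under the same task ID, then prints the
# stored task so the change can be confirmed. Hypothetical helper.
redefine_task() {
    id=$1
    tick=$2
    if [ ! -f "$tick" ]; then
        echo "TICKscript not found: $tick" >&2
        return 1
    fi
    kapacitor define "$id" -tick "$tick" || return 1
    kapacitor show "$id"
}

# Example:
#   redefine_task cpu_alert cpu_alert.tick
```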
6. Batch tasks: record batch has no -duration option
For example:
kapacitor record batch -task XXXXX -duration 60s
This fails with:
flag provided but not defined: -duration
The command's usage text explains why:
Usage: kapacitor record batch [options]
Record the result of a InfluxDB query from a task.
Prints the recording ID on exit.
See "kapacitor help replay" for how to replay a recording.
Examples:
$ kapacitor record batch -task cpu_idle -start 2015-09-01T00:00:00Z -stop 2015-09-02T00:00:00Z
This records the result of the query defined in task "cpu_idle" and runs the query
until the queries reaches the stop time, starting at time "start" and incrementing
by the schedule defined in the task.
$ kapacitor record batch -task cpu_idle -past 10h
This records the result of the query defined in task "cpu_idle" and runs the query
until the queries reaches the present time.
The starting time for the queries is "now - 10h" and increments by the schedule defined in the task.
Options:
-no-wait
Do not wait for the recording to finish.
-past string
Set start time via "now - past".
-recording-id string
The ID to give to this recording. If not set an random ID is chosen.
-start string
The start time for the set of queries.
-stop string
The stop time for the set of queries (default now).
-task string
The ID of a task. Uses the queries contained in the task.
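Based on the usage text above, the batch equivalent of the failing command uses -past (or -start/-stop) instead of -duration. The following is a minimal sketch that records a batch task and replays the recording against the same task. The helper name record_and_replay is hypothetical, and the task ID and time range in the example are placeholders.

```shell
#!/bin/sh
# record_and_replay TASK PAST
# Records the task's queries from "now - PAST" until now, then replays the
# recording against the same task. Hypothetical helper.
record_and_replay() {
    task=$1
    past=$2
    # "record batch" prints the recording ID on exit (see the usage text).
    rid=$(kapacitor record batch -task "$task" -past "$past") || return 1
    echo "recording id: $rid"
    kapacitor replay -recording "$rid" -task "$task"
}

# Example:
#   record_and_replay cpu_idle 10h
```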
References
https://docs.influxdata.com/kapacitor/v1.5/introduction/getting-started/#test-the-task