Observability is a headache for most small and medium-sized companies, mainly in the following respects:
Nightingale (N9e), the subject of this article, is not actually a unified solution yet. At the current stage it still relies on different open source components to implement different functions. N9e lets you view them all from the same main panel, but the data from those components is still not correlated.
Then why do we still need to study N9e?
Because it is developing in this direction.
As mentioned above, Grafana is already doing this. With the Grafana, Loki, Tempo, and Prometheus combination, logs, metrics, and traces can be correlated. So what is the difference between N9e and Grafana?
In Mr. Qin’s words: Grafana is better at managing monitoring panels, and N9e is better at managing alarm rules.
N9e can route different alert rules to different business groups and teams. This avoids flooding a single group with alert messages, which over time turns into a "boy who cried wolf" situation where alerts get ignored.
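To make the routing idea concrete, here is a minimal Python sketch of per-group alert routing. It is not N9e's implementation: the `Alert` class, the `route_alerts` function, and the group and channel names are all hypothetical, chosen only to show how sending each alert to its own business group's channel keeps any single channel from being flooded.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    rule: str            # name of the alert rule that fired
    business_group: str  # business group that owns the rule (hypothetical field)


def route_alerts(alerts, channels, default_channel="ops-fallback"):
    """Group alert rule names by the channel of their business group.

    `channels` maps business group -> channel name. Groups without a
    mapping fall back to `default_channel` so no alert is silently dropped.
    """
    routed = defaultdict(list)
    for alert in alerts:
        channel = channels.get(alert.business_group, default_channel)
        routed[channel].append(alert.rule)
    return dict(routed)


if __name__ == "__main__":
    # Hypothetical groups and channels, for illustration only.
    channels = {"payments": "payments-oncall", "search": "search-oncall"}
    alerts = [
        Alert("HighCPU", "payments"),
        Alert("DiskFull", "search"),
        Alert("HighLatency", "payments"),
        Alert("NodeDown", "infra"),
    ]
    for channel, rules in route_alerts(alerts, channels).items():
        print(channel, rules)
```

Each channel only receives the rules owned by its group, and the fallback channel catches anything that has no owner.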
Having said so much, what does N9e look like?
The following is a system I have deployed.
As you can see, on this panel, we can implement:
This way, you don’t need to switch back and forth between several applications, which saves time.
Without understanding the architecture, everything else is wasted effort.
Now let’s take a look at the architecture of N9e. Understanding how N9e works at the architectural level is a great help for both deployment and maintenance.
N9e mainly offers two deployment solutions, a centrally converged deployment and an edge-sinking hybrid deployment, both of which are explained below.
First picture:
This solution sets up a single N9e cluster, and the monitoring data of the other regions is sent to it, which requires a good network connection between the central cluster and the other regions.
For the central cluster, it mainly includes the following components:
For other Regions, you only need to deploy Categraf, which will push local monitoring data to the central cluster.
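As a rough illustration, the sketch below renders the `[[writers]]` section that a Categraf agent in a remote region might use to push to the central cluster. The write path `/prometheus/v1/write` and port 17000 are the ones used in the single-machine setup later in this article. The function name `render_categraf_writer` and the host `n9e.central.example.com` are hypothetical.

```python
def render_categraf_writer(central_host, port=17000, scheme="http"):
    """Return a Categraf [[writers]] TOML snippet pointing at the central N9e.

    Every region uses the same central address, which is why this solution
    depends on a reliable network link between each region and the center.
    """
    if not central_host:
        raise ValueError("central_host must not be empty")
    url = f"{scheme}://{central_host}:{port}/prometheus/v1/write"
    return f'[[writers]]\nurl = "{url}"\n'


if __name__ == "__main__":
    # Hypothetical central address; replace it with your own.
    print(render_categraf_writer("n9e.central.example.com"))
```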
This architecture is simple and relatively cheap to maintain. The prerequisite is that the network links between data centers are reasonably good. If they are not, the following solution must be used.
This architecture supplements the central deployment solution and is mainly intended for cases where the network is poor:
In the edge data center, deploy the time series database, the alert engine, and the forwarding gateway. Note that the alert engine depends on the database because it needs to synchronize alert rules, and the forwarding gateway also depends on the database because it needs to register objects there, so the corresponding network access must be opened.
PS: This solution targets a poor network yet still needs network access to be opened, so it may still be affected by network problems.
Why should we choose stand-alone deployment here?
Because I want to deploy the components one by one, which helps in understanding how N9e operates as a whole.
Tips: I am using Ubuntu 22.04.1.
Tips: For the sake of speed, I installed MariaDB.

# Update the package index
$ sudo apt-get update
# Upgrade installed packages
$ sudo apt-get upgrade
# Install MariaDB
$ sudo apt-get install mariadb-server-10.6

It will start automatically after the installation is completed. Then set a user password for the database.

# Connect to the database
$ sudo mysql
# Set privileges and password
> GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '1234';
> FLUSH PRIVILEGES;
# Update the package index
$ sudo apt-get update
# Upgrade installed packages
$ sudo apt-get upgrade
# Install Redis
$ sudo apt install redis-server
# Download the binary package
$ wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.90.0/victoria-metrics-linux-amd64-v1.90.0.tar.gz
# Extract
$ tar xf victoria-metrics-linux-amd64-v1.90.0.tar.gz
# Start in the background
$ nohup ./victoria-metrics-prod &>victoria.log &
# Download the latest binary package
$ wget https://github.com/ccfos/nightingale/releases/download/v6.0.0-ga.3/n9e-v6.0.0-ga.3-linux-amd64.tar.gz
# Extract into the n9e directory
$ mkdir n9e
$ tar xf n9e-v6.0.0-ga.3-linux-amd64.tar.gz -C n9e/
# Enter the directory and check its contents
$ cd n9e
$ ll
total 35332
drwxrwxr-x  7 jokerbai jokerbai     4096 Apr 12 14:05 ./
drwxr-xr-x  4 jokerbai jokerbai     4096 Apr 12 14:05 ../
drwxrwxr-x  3 jokerbai jokerbai     4096 Apr 12 14:05 cli/
drwxrwxr-x 10 jokerbai jokerbai     4096 Apr 12 14:05 docker/
drwxrwxr-x  4 jokerbai jokerbai     4096 Apr 12 14:09 etc/
drwxrwxr-x 20 jokerbai jokerbai     4096 Apr 12 14:05 integrations/
-rwxr-xr-x  1 jokerbai jokerbai 25280512 Apr  6 19:05 n9e*
-rwxr-xr-x  1 jokerbai jokerbai 10838016 Apr  6 19:05 n9e-cli*
-rw-r--r--  1 jokerbai jokerbai    29784 Apr  6 19:04 n9e.sql
drwxrwxr-x  6 jokerbai jokerbai     4096 Apr 12 14:05 pub/
# Import the database schema
$ mysql -uroot -p <n9e.sql
[[Pushgw.Writers]]
# Url = "http://127.0.0.1:8480/insert/0/prometheus/api/v1/write"
Url = "http://127.0.0.1:8428/api/v1/write"
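The two `Url` values in the `[[Pushgw.Writers]]` config above correspond to the two VictoriaMetrics deployment modes. The commented-out one targets the `vminsert` component of a VictoriaMetrics cluster (port 8480), while the active one targets the single-node binary (port 8428) used in this article. A small Python sketch, with a hypothetical helper name `vm_remote_write_url`, shows how the two URLs are formed. The default account ID of 0 matches the commented-out line.

```python
def vm_remote_write_url(host, cluster=False, account_id=0):
    """Build the Prometheus remote-write URL for VictoriaMetrics.

    Single-node: http://<host>:8428/api/v1/write
    Cluster:     http://<host>:8480/insert/<account_id>/prometheus/api/v1/write
    The ports are the defaults that appear in the config above.
    """
    if cluster:
        return f"http://{host}:8480/insert/{account_id}/prometheus/api/v1/write"
    return f"http://{host}:8428/api/v1/write"


if __name__ == "__main__":
    print(vm_remote_write_url("127.0.0.1"))
    print(vm_remote_write_url("127.0.0.1", cluster=True))
```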
# Start the service
$ nohup ./n9e &>n9e.log &
# Check that port 17000 is listening
$ ss -ntl | grep 17000
LISTEN 0      4096      *:17000      *:*
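Because the service is started in the background, it can take a moment before the port is open. As an alternative to re-running `ss` by hand, this small Python sketch polls the port until it accepts TCP connections. The helper name `wait_for_port` and the timeout values are my own choices, not part of N9e.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 17000)` returns `True` once N9e accepts connections on its HTTP port. If it returns `False`, `n9e.log` is the first place to look.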
Enter http://127.0.0.1:17000 in the browser, then enter the username root and password root.2020 to log in to the system.
# Download
$ wget https://download.flashcat.cloud/categraf-v0.2.38-linux-amd64.tar.gz
# Extract
$ tar xf categraf-v0.2.38-linux-amd64.tar.gz
# Enter the directory
$ cd categraf-v0.2.38-linux-amd64/
[[writers]]
url = "http://127.0.0.1:17000/prometheus/v1/write"

[heartbeat]
enable = true
$ nohup ./categraf &>categraf.log &
Then you can see the basic information on the main interface.
If you try to view the time series metrics now, nothing can be queried, because no data source has been added yet.
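Before adding the data source, it can be useful to confirm that the data pushed by Categraf has actually reached VictoriaMetrics. The sketch below queries the Prometheus-compatible `/api/v1/query` endpoint on the single-node address configured earlier (`127.0.0.1:8428`). The metric name `cpu_usage_active` is an assumption about what Categraf reports, so substitute any metric you expect to exist. Both function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for a Prometheus-compatible HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def count_series(base_url, promql, timeout=5.0):
    """Run an instant query and return how many series came back."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return len(payload["data"]["result"])
```

Calling `count_series("http://127.0.0.1:8428", "cpu_usage_active")` should return a number greater than zero once Categraf has been running for a short while, assuming that metric name exists in your setup. A result of zero suggests the write path is misconfigured rather than the data source.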
Add a data source in System Configuration->Data Source, as follows:
Then you can see the corresponding metric data. You can also view the host's monitoring data through the built-in dashboard, as follows:

Summary

At present, Nightingale has been updated to V6. This version tries out many new features, such as integration with ELK and Jaeger. This series will continue to be updated in the future.