Monitoring Linux server metrics in Home Assistant via MQTT

I needed to set up another server at home, and I decided to monitor its health from my smart home, which runs on Home Assistant. A quick but careful Google search turned up no universal solution, so I reinvented the wheel.





Introduction: we will monitor CPU load and temperature, RAM and swap usage, free disk space, uptime, the overall system load, the temperature and SMART status of each drive, and the state of the RAID (a simple software RAID 1 was set up on a server running Ubuntu Server 20). The hardware is WD Green drives and a GA-525 motherboard with an integrated Atom 525.

The mosquitto broker was already configured on the smart home server, so MQTT was chosen as the transport.

The first sections describe how each kind of data is collected; the data transfer scripts and the HA configuration come at the end.

All commands in the examples are run as root.





Table of contents

Collecting system sensor readings

Collecting system load data

Collecting hard drive status data

Collecting RAID status data

Sending the collected data

Configuring Home Assistant





Collecting system sensor readings

To read the built-in sensors we will use the sensors utility. If it is not installed, install it:

apt-get install lm-sensors

First, find all available sensors. Run

sensors-detect

and answer y to every question. After that, look at the result:

sensors

Note that on my machine sensors showed all the detected sensors only after a reboot. Possibly a bug; I don't know.





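The sensors utility can also print its readings as JSON:

sensors -A -u -j

To pull individual values out of the JSON we need a parser. We will use jq; on Ubuntu it is installed with: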



apt-get install jq







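jq addresses values with path expressions, much like XPath does for XML. For example, the CPU temperature, the fan speed and the temp3 sensor are read like this: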



sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input'

sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input'

sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input'
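Each of the commands above runs sensors again. One possible variation, sketched here, captures the JSON once and queries it several times. The helper name read_sensor and the sample JSON are made up for illustration; the keys are the ones used in the commands above, and jq must be installed.

```shell
#!/bin/bash
# Sketch: query one captured JSON document several times instead of
# re-running sensors for every value. read_sensor is a hypothetical helper.

# $1 = JSON text, $2 = chip, $3 = feature, $4 = subfeature
read_sensor() {
    printf '%s' "$1" | jq -r --arg c "$2" --arg f "$3" --arg s "$4" '.[$c][$f][$s]'
}

# On the real server: json=$(sensors -A -u -j)
# Illustrative sample with the same shape as the sensors output:
json='{"coretemp-isa-0000":{"Core 0":{"temp2_input":41.5}},
       "it8720-isa-0290":{"fan1":{"fan1_input":1171}}}'

if command -v jq >/dev/null 2>&1; then
    read_sensor "$json" "coretemp-isa-0000" "Core 0" "temp2_input"
    read_sensor "$json" "it8720-isa-0290" "fan1" "fan1_input"
fi
```

A key that does not exist yields the string null, which is worth checking for before publishing the value.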








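Collecting system load data

Memory usage is reported by the free utility; with the -m flag the values are in mebibytes. From its output we take the line and the column we need, for example the total amount of RAM: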



free -m | grep "Mem" | awk '{print $2}'







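grep selects the required line and awk prints the required column from it. Used memory and the values of the Swap line are read the same way.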
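The grep and awk steps can be folded into a single awk call that also computes the percentage. The sketch below assumes the usual free -m layout (label, total, used, ...); used_percent is a hypothetical helper name and the sample text is illustrative.

```shell
#!/bin/bash
# Sketch: compute used RAM and swap percentages from `free -m` output in
# a single awk pass. used_percent is a hypothetical helper name.

# Reads `free -m` output on stdin; $1 is the row label ("Mem" or "Swap").
# Prints the integer percentage used, or 0 when the total is 0 (no swap).
used_percent() {
    awk -v row="$1:" '$1 == row { if ($2 > 0) printf "%d\n", $3 * 100 / $2; else print 0 }'
}

# Illustrative sample; on the real server use: free -m | used_percent Mem
sample='              total        used        free      shared  buff/cache   available
Mem:           3931         812        1893          12        1225        2870
Swap:             0           0           0'

printf '%s\n' "$sample" | used_percent Mem
printf '%s\n' "$sample" | used_percent Swap
```

The zero check matters on machines without swap, where a plain division would fail.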





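Disk usage is reported by df. Run df to find the devices you need, then take the fifth column (Use%, the percentage of space used) from their lines and strip the percent sign: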









df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//'

df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//'
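Device names such as md127p1 are specific to this server and may change. A possible alternative, sketched below, selects the row by mount point using the POSIX df -P format. The helper name used_percent_for_mount, the mount points and the sample numbers are assumptions for illustration only.

```shell
#!/bin/bash
# Sketch: select the df row by mount point instead of device name.
# used_percent_for_mount is a hypothetical helper name.

# Reads `df -P` output on stdin; $1 is the mount point.
# Prints the used-space percentage without the percent sign.
used_percent_for_mount() {
    awk -v mp="$1" 'NR > 1 && $6 == mp { sub(/%$/, "", $5); print $5 }'
}

# Illustrative sample; on the real server use: df -P | used_percent_for_mount /var
sample='Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/md126p1      51343840 12345678  36360450      26% /
/dev/md127p1     102687672 20537534  76890450      22% /var'

printf '%s\n' "$sample" | used_percent_for_mount /var
```

This simple form does not handle mount points that contain spaces.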








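The system load average is stored in /proc/loadavg. Its first three fields are the load averaged over the last 1, 5 and 15 minutes. We take the 15-minute value: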



cat /proc/loadavg | awk '{print $3}'
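The same value can be read with the bash builtin read, without cat or awk. This is only a sketch; load15 is a hypothetical helper name, and the temporary file stands in for /proc/loadavg so the example runs anywhere.

```shell
#!/bin/bash
# Sketch: read the load averages with the bash builtin `read`.
# load15 is a hypothetical helper name.

# $1 = path to a file in /proc/loadavg format (defaults to /proc/loadavg).
# Prints the third field, the 15-minute load average.
load15() {
    local one five fifteen rest
    read -r one five fifteen rest < "${1:-/proc/loadavg}"
    echo "$fifteen"
}

# Illustrative sample file in the /proc/loadavg format
tmp=$(mktemp)
echo "0.20 0.18 0.12 1/80 11206" > "$tmp"
load15 "$tmp"
rm -f "$tmp"
```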







Uptime is taken from the output of uptime:



uptime | awk '{print $3}' | sed 's/,$//'
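The third field of the uptime output is a number of days only once the machine has been up for more than a day; before that it holds hours and minutes. A possible alternative, sketched below, formats the seconds counter from /proc/uptime instead. The helper name format_uptime and the output format are assumptions, not part of the original scripts.

```shell
#!/bin/bash
# Sketch: format the first field of /proc/uptime (seconds since boot).
# format_uptime is a hypothetical helper name.

# $1 = seconds, possibly with a fractional part such as 93784.12
format_uptime() {
    local secs=${1%%.*}                      # drop the fractional part
    local days=$(( secs / 86400 ))
    local hours=$(( secs % 86400 / 3600 ))
    local mins=$(( secs % 3600 / 60 ))
    echo "${days}d ${hours}h ${mins}m"
}

# On the real server: read -r up _ < /proc/uptime; format_uptime "$up"
format_uptime 93784.12    # prints: 1d 2h 3m
```

If Home Assistant should graph the value, publishing the plain number of days or seconds is probably more convenient than a formatted string.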







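CPU load can be measured with mpstat. It is part of the sysstat package, installed with apt install sysstat. We need the %idle column of the summary (all) row, which is field 13 in this output: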



mpstat | grep all | awk '{print $13}'







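mpstat reports the share of time the CPU was idle, so the load is 100 minus that value. bash has no floating-point arithmetic, so the subtraction is done with bc: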



cpuidle=$(mpstat | grep all | awk '{print $13}')

cpuload=$(echo "100-$cpuidle" | bc -l)

echo "CPU load: $cpuload"
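The position of %idle depends on the time format: with a 24-hour clock there is no AM/PM field and the column shifts by one. The sketch below takes the last field of the all row instead and does the subtraction in awk, so bc is not needed. cpu_load is a hypothetical helper name and the sample output is illustrative.

```shell
#!/bin/bash
# Sketch: compute CPU load from mpstat output in awk alone, taking %idle
# as the last field. cpu_load is a hypothetical helper name.

# Reads mpstat output on stdin and prints 100 - %idle for the "all" row.
cpu_load() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "all") { printf "%.2f\n", 100 - $NF; exit } }'
}

# Illustrative sample; on the real server use: mpstat | cpu_load
sample='Linux 5.4.0-42-generic (srv)  08/10/2020  _x86_64_  (4 CPU)

10:15:01 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
10:15:01 AM  all    3.25    0.00    1.10    0.40    0.00    0.05    0.00    0.00    0.00   95.20'

printf '%s\n' "$sample" | cpu_load
```

Run without an interval, mpstat reports averages since boot; mpstat 1 1 samples one second, at the cost of an extra summary row in the output.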








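Collecting hard drive status data

Drive temperature is read with hddtemp. Install it:

apt-get install hddtemp

Usage: pass the device name; the -n flag prints only the number, for example hddtemp /dev/sda -n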





SMART data is read with smartmontools:



apt-get install smartmontools







To print all the SMART information for a drive, use the -a flag:



smartctl -a /dev/sda







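The attributes worth watching are:

  • Raw_Read_Error_Rate: the rate of hardware errors while reading data from the disk surface;
  • Reallocated_Sector_Ct: the number of bad sectors that have been remapped to the spare area;
  • Seek_Error_Rate: the rate of head positioning errors;
  • Spin_Retry_Count: the number of repeated attempts to spin the platters up to operating speed;
  • Reallocated_Event_Count: the number of remap operations, successful or not;
  • Offline_Uncorrectable: the number of sectors that could not be corrected during offline scanning.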





smartctl can also print JSON; add the -j flag:



smartctl -a -j /dev/sda







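From this JSON we extract the values we need with jq (the positions in the attribute table are those of my drives and may differ on other models):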





smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value' #Raw_Read_Error_Rate

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value' #Reallocated_Sector_Ct

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value' #Seek_Error_Rate

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value' #Spin_Retry_Count

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value' #Reallocated_Event_Count

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value' #Offline_Uncorrectable
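Because table positions can differ between drive models, one possible refinement is to select an attribute by its name. The sketch below assumes the smartctl JSON layout used in the commands above (ata_smart_attributes.table entries with name and raw.value); smart_attr is a hypothetical helper name and the sample JSON is heavily trimmed.

```shell
#!/bin/bash
# Sketch: look up a SMART attribute by name rather than by table index.
# smart_attr is a hypothetical helper name; it needs jq.

# $1 = JSON text from `smartctl -a -j`, $2 = attribute name.
# Prints the raw value of the attribute, or nothing if it is absent.
smart_attr() {
    printf '%s' "$1" | jq -r --arg name "$2" \
        '.ata_smart_attributes.table[] | select(.name == $name) | .raw.value'
}

# On the real server: json=$(smartctl -a /dev/sda -j)
# Illustrative, heavily trimmed sample with the same shape:
json='{"ata_smart_attributes":{"table":[
  {"id":1,"name":"Raw_Read_Error_Rate","raw":{"value":0}},
  {"id":5,"name":"Reallocated_Sector_Ct","raw":{"value":7}}]}}'

if command -v jq >/dev/null 2>&1; then
    smart_attr "$json" Reallocated_Sector_Ct
fi
```

Capturing the JSON once per drive also avoids calling smartctl separately for every attribute.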








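The drive's overall health verdict is printed with the -H flag. The -a output already includes it, and with -j it is available as JSON, so we read it from there: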



smartctl -a /dev/sda -j | jq '.smart_status.passed' #smart_status







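Self-tests are not started by the commands above; they have to be launched by hand or on a schedule, for example from cron. The short test: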



smartctl -t short /dev/sda







It takes about 2 minutes. The long test:



smartctl -t long /dev/sda







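The long test takes much longer. smartmontools also includes the smartd daemon, which can run such tests on a schedule; it is not used in this article.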





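Collecting RAID status data

The software RAID is built with mdadm; the server has two arrays, md126 and md127, one of which holds /var. The kernel exposes the state of the arrays through sysfs, under /sys/block. [1] [2]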





The current state of the arrays can be viewed with cat /proc/mdstat
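The scripts later in the article publish only the mismatch counter. As a possible addition, the member map in /proc/mdstat (for example [UU] or [U_]) can be scanned for missing disks. This is a sketch under that assumption; degraded_arrays is a hypothetical helper name and the sample text is illustrative.

```shell
#!/bin/bash
# Sketch: list arrays whose member map (e.g. [U_]) shows a missing disk.
# degraded_arrays is a hypothetical helper name.

# Reads /proc/mdstat-formatted text on stdin, prints one array name per line.
degraded_arrays() {
    awk '/^md/ { name = $1 }
         $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }'
}

# Illustrative sample; on the real server use: degraded_arrays < /proc/mdstat
sample='Personalities : [raid1]
md126 : active raid1 sdb2[1] sda2[0]
      104792064 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sdb1[1] sda1[0]
      52395008 blocks super 1.2 [2/1] [U_]'

printf '%s\n' "$sample" | degraded_arrays
```

An empty result means every array has all of its members.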





A consistency check of an array is started by writing check to its sync_action file:









echo 'check' >/sys/block/md126/md/sync_action

echo 'check' >/sys/block/md127/md/sync_action

When the check has finished, the number of mismatches it found is read from mismatch_cnt:












cat /sys/block/md126/md/mismatch_cnt

cat /sys/block/md127/md/mismatch_cnt








If the value is 0, the array is consistent.
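While a check is still running, the counter is not final. A small guard, sketched below, reads sync_action first and reports mismatch_cnt only when the array is idle. raid_mismatches is a hypothetical helper name, and the temporary directory imitates /sys/block/md126/md so the example runs anywhere.

```shell
#!/bin/bash
# Sketch: report mismatch_cnt only when no check or resync is running.
# raid_mismatches is a hypothetical helper name.

# $1 = md directory, e.g. /sys/block/md126/md
# Prints mismatch_cnt and returns 0 when sync_action is "idle";
# prints nothing and returns 1 while an operation is in progress.
raid_mismatches() {
    local action
    action=$(cat "$1/sync_action") || return 2
    [ "$action" = "idle" ] || return 1
    cat "$1/mismatch_cnt"
}

# Illustrative fake sysfs directory so the sketch runs anywhere
fake=$(mktemp -d)
echo "idle" > "$fake/sync_action"
echo "0" > "$fake/mismatch_cnt"
raid_mismatches "$fake"
rm -rf "$fake"
```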





Sending the collected data

The data is published with the mosquitto client tools. Install them:



apt-get install mosquitto-clients







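Next we create the scripts that collect and send the data. I made three: one for the system metrics, one for the disks (RAID state and disk usage), and one for SMART: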



touch system.sh && touch drives.sh && touch smart.sh

chmod u+x system.sh && chmod u+x drives.sh && chmod u+x smart.sh








Their contents:





system.sh
#!/bin/bash
# MQTT broker address and credentials
ip=xx.xx.xx.xx
usr="xx"
pass="xx"

# Drive temperatures
tempdrive1=$(hddtemp "/dev/sda" -n)
echo "Drive 1 temperature: $tempdrive1"
tempdrive2=$(hddtemp "/dev/sdb" -n)
echo "Drive 2 temperature: $tempdrive2"

# Built-in sensors
tempcpu=$(sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input')
echo "CPU temperature: $tempcpu"
fan=$(sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input')
echo "Fan speed: $fan"
temp3=$(sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input')
echo "temp3 temperature: $temp3"

# RAM, in MiB and as a percentage
totalram=$(free -m | grep "Mem" | awk '{print $2}')
echo "Total RAM: $totalram"
usedram=$(free -m | grep "Mem" | awk '{print $3}')
echo "Used RAM: $usedram"
usedrampercent=$((usedram * 100 / totalram))
echo "Used RAM, percent: $usedrampercent"

# Swap; the check avoids a division by zero when no swap is configured
totalswap=$(free -m | grep "Swap" | awk '{print $2}')
echo "Total swap: $totalswap"
usedswap=$(free -m | grep "Swap" | awk '{print $3}')
echo "Used swap: $usedswap"
if [ "$totalswap" -gt 0 ]; then
    usedswappercent=$((usedswap * 100 / totalswap))
else
    usedswappercent=0
fi
echo "Used swap, percent: $usedswappercent"

# 15-minute load average
averageload=$(cat /proc/loadavg | awk '{print $3}')
echo "Load average: $averageload"

uptimedata=$(uptime | awk '{print $3}' | sed 's/,$//')
echo "Uptime: $uptimedata"

cpuidle=$(mpstat | grep all | awk '{print $13}')
cpuload=$(echo "100-$cpuidle" | bc -l) # bc is used because bash has no floating-point arithmetic
echo "CPU load: $cpuload"

echo ""
echo "Publishing to MQTT"

# Values are quoted so that an empty reading does not break the command line
mosquitto_pub -h "$ip" -t "srv/tempdrive1" -m "$tempdrive1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/tempdrive2" -m "$tempdrive2" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/tempcpu" -m "$tempcpu" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/fan" -m "$fan" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/temp3" -m "$temp3" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/usedrampercent" -m "$usedrampercent" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/usedswappercent" -m "$usedswappercent" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/averageload" -m "$averageload" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/uptimedata" -m "$uptimedata" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/cpuload" -m "$cpuload" -u "$usr" -P "$pass"

      
      



drives.sh
#!/bin/bash
# MQTT broker address and credentials
ip=xx.xx.xx.xx
usr="xx"
pass="xx"

# Mismatch counters from the last RAID check (0 means consistent)
raid_system_status=$(cat /sys/block/md126/md/mismatch_cnt)
echo "RAID system array mismatches: $raid_system_status"
raid_var_status=$(cat /sys/block/md127/md/mismatch_cnt)
echo "RAID var array mismatches: $raid_var_status"

# Use% column of df, i.e. the percentage of space used
freesystemdisk=$(df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//')
echo "System disk used, percent: $freesystemdisk"
freedatadisk=$(df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//')
echo "Data disk used, percent: $freedatadisk"

echo ""
echo "Publishing to MQTT"

mosquitto_pub -h "$ip" -t "srv/raid_system_status" -m "$raid_system_status" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/raid_var_status" -m "$raid_var_status" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/freesystemdisk" -m "$freesystemdisk" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/freedatadisk" -m "$freedatadisk" -u "$usr" -P "$pass"

      
      



smart.sh
#!/bin/bash
# MQTT broker address and credentials
ip=xx.xx.xx.xx
usr="xx"
pass="xx"

# SMART attributes of drive 1 (table positions are specific to these drives)
Raw_Read_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate drive 1: $Raw_Read_Error_Rate1"
Reallocated_Sector_Ct1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct drive 1: $Reallocated_Sector_Ct1"
Seek_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate drive 1: $Seek_Error_Rate1"
Spin_Retry_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count drive 1: $Spin_Retry_Count1"
Reallocated_Event_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count drive 1: $Reallocated_Event_Count1"
Offline_Uncorrectable1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable drive 1: $Offline_Uncorrectable1"

# Overall health verdict of drive 1 (true or false)
smart_status1=$(smartctl -a /dev/sda -j | jq '.smart_status.passed')
echo "SMART status drive 1: $smart_status1"

# SMART attributes of drive 2
Raw_Read_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate drive 2: $Raw_Read_Error_Rate2"
Reallocated_Sector_Ct2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct drive 2: $Reallocated_Sector_Ct2"
Seek_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate drive 2: $Seek_Error_Rate2"
Spin_Retry_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count drive 2: $Spin_Retry_Count2"
Reallocated_Event_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count drive 2: $Reallocated_Event_Count2"
Offline_Uncorrectable2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable drive 2: $Offline_Uncorrectable2"

# Overall health verdict of drive 2 (true or false)
smart_status2=$(smartctl -a /dev/sdb -j | jq '.smart_status.passed')
echo "SMART status drive 2: $smart_status2"

echo ""
echo "Publishing to MQTT"

mosquitto_pub -h "$ip" -t "srv/Raw_Read_Error_Rate1" -m "$Raw_Read_Error_Rate1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Sector_Ct1" -m "$Reallocated_Sector_Ct1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Seek_Error_Rate1" -m "$Seek_Error_Rate1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Spin_Retry_Count1" -m "$Spin_Retry_Count1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Event_Count1" -m "$Reallocated_Event_Count1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Offline_Uncorrectable1" -m "$Offline_Uncorrectable1" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/Raw_Read_Error_Rate2" -m "$Raw_Read_Error_Rate2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Sector_Ct2" -m "$Reallocated_Sector_Ct2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Seek_Error_Rate2" -m "$Seek_Error_Rate2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Spin_Retry_Count2" -m "$Spin_Retry_Count2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Reallocated_Event_Count2" -m "$Reallocated_Event_Count2" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/Offline_Uncorrectable2" -m "$Offline_Uncorrectable2" -u "$usr" -P "$pass"

mosquitto_pub -h "$ip" -t "srv/smart_status1" -m "$smart_status1" -u "$usr" -P "$pass"
mosquitto_pub -h "$ip" -t "srv/smart_status2" -m "$smart_status2" -u "$usr" -P "$pass"
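
The publish lines in the three scripts repeat the same options. One possible way to shorten them, and to skip readings that came back empty or as null, is a small wrapper. This is only a sketch: the names publish and PUB_CMD are hypothetical, and the srv/ prefix and the connection variables are the ones from the scripts above.

```shell
#!/bin/bash
# Sketch: a wrapper around mosquitto_pub. publish and PUB_CMD are
# hypothetical names, not part of the original scripts.
ip="xx.xx.xx.xx"
usr="xx"
pass="xx"
PUB_CMD="${PUB_CMD:-mosquitto_pub}"   # can be overridden for a dry run

# $1 = topic name without the srv/ prefix, $2 = value
# Returns 1 without publishing when the value is empty or "null".
publish() {
    local topic="$1" value="$2"
    if [ -z "$value" ] || [ "$value" = "null" ]; then
        echo "skip srv/$topic: no value" >&2
        return 1
    fi
    "$PUB_CMD" -h "$ip" -t "srv/$topic" -m "$value" -u "$usr" -P "$pass"
}

# Usage inside a script, after the values have been collected:
#   publish tempcpu "$tempcpu"
#   publish fan "$fan"
```

Skipping null matters because jq prints null for a missing key, and Home Assistant would otherwise show that string as the sensor state.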

      
      



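Run the scripts and check that the messages reach the Mosquitto broker used by Home Assistant. Then schedule the scripts to run periodically, for example with cron.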





Configuring Home Assistant

Now the data has to be received on the other side. Add the sensors to the Home Assistant configuration:





sensor:
  - platform: mqtt
    state_topic: "srv/tempdrive1"
    name: "Nextcloud server drive 1 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/tempdrive2"
    name: "Nextcloud server drive 2 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/tempcpu"
    name: "Nextcloud server CPU temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/fan"
    name: "Nextcloud server fan speed"
    unit_of_measurement: "rpm"
  - platform: mqtt
    state_topic: "srv/temp3"
    name: "Nextcloud server temp3 temperature"
    unit_of_measurement: "°C"
  - platform: mqtt
    state_topic: "srv/usedrampercent"
    name: "Nextcloud server RAM usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/usedswappercent"
    name: "Nextcloud server SWAP usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freesystemdisk"
    name: "Nextcloud server system disk usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freedatadisk"
    name: "Nextcloud server data disk usage"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/averageload"
    name: "Nextcloud server load average"
  - platform: mqtt
    state_topic: "srv/uptimedata"
    name: "Nextcloud server uptime"
  - platform: mqtt
    state_topic: "srv/cpuload"
    name: "Nextcloud server CPU load"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate1"
    name: "Nextcloud server drive 1 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct1"
    name: "Nextcloud server drive 1 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate1"
    name: "Nextcloud server drive 1 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count1"
    name: "Nextcloud server drive 1 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count1"
    name: "Nextcloud server drive 1 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable1"
    name: "Nextcloud server drive 1 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status1"
    name: "Nextcloud server drive 1 SMART status"
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate2"
    name: "Nextcloud server drive 2 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct2"
    name: "Nextcloud server drive 2 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate2"
    name: "Nextcloud server drive 2 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count2"
    name: "Nextcloud server drive 2 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count2"
    name: "Nextcloud server drive 2 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable2"
    name: "Nextcloud server drive 2 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status2"
    name: "Nextcloud server drive 2 SMART status"
  - platform: mqtt
    state_topic: "srv/raid_system_status"
    name: "Nextcloud server RAID system array mismatches"
  - platform: mqtt
    state_topic: "srv/raid_var_status"
    name: "Nextcloud server RAID var array mismatches"
      
      



After Home Assistant is restarted, the new sensors appear and can be placed on a dashboard:










The screenshot shows that the server in question is intended for nextcloud. Its internal metrics can also be added to HA: nextcloud has a wonderful API for that, and HA has a built-in integration.







