Zabbix a prueba de fallos: migraci贸n de replicaci贸n as铆ncrona a grupal

Introducci贸n

Zabbix admite varias bases de datos, pero solo MySQL y PostgreSQL se consideraron las m谩s adecuadas para mi instalaci贸n. PostgreSQL con su repomgr y pgbouncer o alg煤n stolon por un lado y MySQL Group Replication por el otro. Debido al uso de MySQL en la configuraci贸n actual y el deseo de un equipo est谩ndar, la elecci贸n recay贸 en la segunda opci贸n.





Entonces, 驴qu茅 es exactamente MySQL Group Replication? Como sugiere el nombre, se trata de un grupo de servidores que almacena el mismo conjunto de datos. El n煤mero m谩ximo de nodos en un grupo est谩 limitado a 9. Puede funcionar en modo primario 煤nico o primario m煤ltiple. Pero lo m谩s interesante es que todo funciona autom谩ticamente, ya sea la elecci贸n de un nuevo servidor maestro, la detecci贸n de un nodo roto, el cerebro dividido o la recuperaci贸n de la base de datos. Esta funcionalidad se suministra como complementos group_replication y mysql_clone, la comunicaci贸n se produce a trav茅s del protocolo Group Communication System, que se basa en el algoritmo Paxos. Este tipo de replicaci贸n es compatible desde las versiones 5.7.17 y 8.0.1.





Mi instalaci贸n actual funciona en Zabbix 5.0 LTS y MySQL 5.7, la migraci贸n se realizar谩 elevando la versi贸n de MySQL a 8.0, por lo que es m谩s interesante).





Supervisi贸n de la replicaci贸n

S铆 S铆. Es como TDD, solo que en la administraci贸n, primero debe preparar el monitoreo para que el nuevo cl煤ster ingrese inmediatamente a los radares de nuestro sistema de monitoreo y ni un solo problema escape a su ojo atento. Dado que a煤n no tiene replicaci贸n de grupo (GR), la salida de los comandos a continuaci贸n estar谩 en blanco, por lo que estoy dando un ejemplo de salida de un cl煤ster en ejecuci贸n.





La principal fuente de informaci贸n sobre el estado de los nodos es el comando:





SELECCIONAR * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID	                           | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 500049c2-99b7-11e9-8d36-e4434b5f9d0c | example1.com |      3306   | ONLINE       | SECONDARY   | 8.0.13         |
| group_replication_applier | 50024be2-9889-11eb-83da-e4434ba03de0 | example2.com |      3306   | ONLINE       | PRIMARY     | 8.0.13         |
| group_replication_applier | 500b2035-986e-11eb-a9f8-564d00018ad1 | example3.com |      3306   | ONLINE       | SECONDARY   | 8.0.13         |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
      
      



MEMBER_STATE . https://dev.mysql.com/doc/refman/8.0/en/group-replication-server-states.html. , , .





:





SELECT * FROM performance_schema.replication_group_member_stats\G
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16178860996821458:41
                                 MEMBER_ID: 500049c2-99b7-11e9-8d36-e4434b5f9d0c
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 75715997
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1957048
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 500049c2-99b7-11e9-8d36-e4434b5f9d0c:1-1821470279,
500293cf-594c-11ea-aafd-e4434ba03de0:1-622868371,
5000d25c-059e-11e8-822b-564d00018ad1:1-140221041,
c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:1-125382195
            LAST_CONFLICT_FREE_TRANSACTION: c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:125471159
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 5664
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 75710337
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16178860996821458:41
                                 MEMBER_ID: 50024be2-9889-11eb-83da-e4434ba03de0
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 75720452
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1955202
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 500049c2-99b7-11e9-8d36-e4434b5f9d0c:1-1821470279,
500293cf-594c-11ea-aafd-e4434ba03de0:1-622868371,
5000d25c-059e-11e8-822b-564d00018ad1:1-140221041,
c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:1-125377993
            LAST_CONFLICT_FREE_TRANSACTION: c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:125470919
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 75711354
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 9105
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 3. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16178860996821458:41
                                 MEMBER_ID: 500b2035-986e-11eb-a9f8-564d00018ad1
               COUNT_TRANSACTIONS_IN_QUEUE: 38727
                COUNT_TRANSACTIONS_CHECKED: 49955241
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1250063
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 500049c2-99b7-11e9-8d36-e4434b5f9d0c:1-1821470279,
500293cf-594c-11ea-aafd-e4434ba03de0:1-622868371,
5000d25c-059e-11e8-822b-564d00018ad1:1-140221041,
c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:1-125382195
            LAST_CONFLICT_FREE_TRANSACTION: c9aae4fb-97a6-11eb-89d1-e4434b5f9d0c:125430975
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 47096
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 49908155
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
3 rows in set (0.00 sec)
      
      



COUNT_TRANSACTIONS_IN_QUEUE, Seconds_Behind_Master . , .





, () . , , . , - , . , , , , .





, - . , . , .





:





  1. ( TCP 33061 ). ;





  2. MySQL 8.0 (FreeBSD, Poudriere - );





  3. , Zabbix ( );





  4. , Secondary ( , - ). ;





  5. MySQL 5.7 ;





  6. ( , );





  7. ;





  8. MySQL 8.0 (mysql_upgrade , 8 );





  9. , ( , , . . ). , ;





  10. , , ( , );





  11. ( RESET SLAVE ALL;);





  12. ;





  13. Zabbix Zabbix ;





  14. ( 4 8, 8 , . . );





  15. ;





  16. Ansible Playbook' ;





  17. ;





  18. HADNS;





  19. ;





:





  1. MySQL ;





  2. ;





  3. , MySQL ;





  4. ;





, .





9, 12 14 .





9:

. .





.





SELECT tables.table_schema , tables.table_name , tables.engine
 FROM information_schema.tables
 LEFT JOIN (
 SELECT table_schema , table_name
 FROM information_schema.statistics
 GROUP BY table_schema, table_name, index_name HAVING
 SUM( case when non_unique = 0 and nullable != 'YES' then 1 else 0 end ) = count(*) ) puks
ON tables.table_schema = puks.table_schema and tables.table_name = puks.table_name
WHERE puks.table_name is null
 AND tables.table_type = 'BASE TABLE' AND Engine="InnoDB";
      
      



Zabbix, . Zabbix, dbversion . .





ALTER TABLE history ADD COLUMN `id` BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;
ALTER TABLE history_uint ADD COLUMN `id` BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;
ALTER TABLE history_text ADD COLUMN `id` BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;
ALTER TABLE history_str ADD COLUMN `id` BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;
ALTER TABLE history_log ADD COLUMN `id` BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;
ALTER TABLE dbversion ADD PRIMARY KEY (mandatory);
      
      



. , - Zabbix.





12:

, , .





server-id=[     ]
gtid_mode=ON
enforce_gtid_consistency=ON
log_bin=binlog
log_slave_updates=ON
binlog_format=ROW
master_info_repository=TABLE
relay_log_info_repository=TABLE
transaction_write_set_extraction=XXHASH64

disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
plugin_load_add='group_replication.so;mysql_clone.so'
ssl-ca=/usr/local/etc/ssl/mysql/ca.crt
ssl-cert=/usr/local/etc/ssl/mysql/server.crt
ssl-key=/usr/local/etc/ssl/mysql/server.key
group_replication_ssl_mode=VERIFY_IDENTITY
group_replication_group_name="[   ,     SELECT UUID();]"
group_replication_start_on_boot=off #       
group_replication_local_address="[   ].com:33061"
group_replication_group_seeds="example1.com:33061,example2.com:33061,example3.com:33061"
group_replication_ip_allowlist="2.2.2.2/32,3.3.3.3/32,4.4.4.4/32"
group_replication_member_weight=50

group_replication_recovery_use_ssl=ON
group_replication_recovery_ssl_verify_server_cert=ON
group_replication_recovery_ssl_ca=/usr/local/etc/ssl/mysql/ca.crt
group_replication_recovery_ssl_cert=/usr/local/etc/ssl/mysql/server.crt
group_replication_recovery_ssl_key=/usr/local/etc/ssl/mysql/server.key
      
      



my.cnf, , , . , . group_replication_start_on_boot, , .





SHOW VARIABLES LIKE 'binlog_format'; SET GLOBAL binlog_format = RAW; , .





group_replication_ssl_mode group_replication_recovery_ssl_verify_server_cert , Subject Alternative Name (SAN) , group_replication_group_seeds.





group_replication_member_weight . , , , .





:





SET SQL_LOG_BIN=0;
CREATE USER 'replication'@'%' IDENTIFIED BY '[ ]' REQUIRE SSL;
GRANT replication slave ON *.* TO 'replication'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'replication'@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
      
      



:





INSTALL PLUGIN group_replication SONAME 'group_replication.so';
INSTALL PLUGIN clone SONAME 'mysql_clone.so';
SHOW PLUGINS;
      
      



, :





CHANGE REPLICATION SOURCE TO SOURCE_USER='replication', SOURCE_PASSWORD='[ ]' \\
  FOR CHANNEL 'group_replication_recovery';
      
      



. group_replication_bootstrap_group , , :





SET GLOBAL group_replication_bootstrap_group=ON; #     
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF; #     
      
      



, :





mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE  |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| group_replication_applier | ce9be252-2b71-11e6-b8f4-00212844f856 |example1.com |       3306  | ONLINE        |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
      
      



MySQL .





14:

Zabbix , . MySQL , , .





12- , (server-id, group_replication_local_address). , group_replication_bootstrap_group .





Distributed Recovery mysql_clone . , , , .





, , .





, my.cnf group_replication_start_on_boot off on MySQL .





SELECT * FROM performance_schema.replication_group_members; - .





SELECT * FROM performance_schema.replication_group_member_stats\G - .





SELECT group_replication_set_as_primary('[uuid ]'); - .





Zabbix

Zabbix , , . . , Primary , , Zabbix , , . HADNS, Zabbix IP DNS .





Quiz谩s no todo est茅 hecho con la elegancia que nos gustar铆a. Es posible que desee utilizar mysql-shell, mysqlrouter y convertir Group Replication en InnoDB Cluster, o agregar HAProxy, especialmente cuando est谩 implementando Zabbix desde cero. Espero que esta historia sirva como un buen punto de partida y sea 煤til. 隆Gracias por la atenci贸n!





literatura adicional

https://dev.mysql.com/doc/refman/8.0/en/group-replication.html





https://blog.zabbix.com/scaling-zabbix-with-mysql-innodb-cluster/8472/





https://en.wikipedia.org/wiki/Paxos_(computer_science)








All Articles