Home News Feeds Planet MySQL


Content View Hits : 12066
Planet MySQL
Planet MySQL - https://planet.mysql.com

  • Setting up a High Availability for PMM
    We have been working with PMM for quite a long time, we do most of the performance analysis with PMM for most of our clients. It also provides the flexibility that we have built our own custom dashboard. PMM has many advantages Easy to deploy (docker based) Flexible Customizable Query Analytics One-stop solution for MySQL,Mongo,ProxySQL & PostgresSQL Orchestrator Rich and Deep metrics stats Highly recommended for any production deployments its equivalent to Enterprise-grade monitoring and graphing tool. Recently we have been working for our client on MySQL Consulting to scale peak sale of the year. Wherein we have deployed PMM to view the performance insights of DB. We started on-boarding a considerable number of servers under PMM, Everyone liked the way PMM was able to project the performance insights with a beautiful dashboard over a period of time. Need for HA and Backup of PMM server When PMM started covering border group of server, we were raised on the question of HA for PMM server, of course it is good to have a High Availability for all possible solutions. What if the server crashes due to Hardware failures? Does PMM support redundancy? Is PMM server a single point of failure (SPOF) ? What would be the backup strategy? We had search for many options to achieve HA, with containerized (docker) PMM server. We explored lsyncd – Rsync based daemon to copy incremental changes at the block level, failed when we tried to sync the entire /var/lib/docker and start back docker services DRBD – will work, but it has complexity with the setup and maintenance The Accidental HA solution: As the name suggests, “it was Accidental,”. lets see how it’s done.We have below servers, those were running on debian9(Stretch), with the same version of docker. Source : live.mydbops.com (production) Destination: livebackup.mydbops.com (Replica) The requirement here is to sync the metrics data between the source(live.mydbops.com) and destination (livebackup.mydbops.com), On the source server we had Prometheus data set size around 178G, so the initial sync took some time. Stop PMM-server on source # docker stop pmm-server Copy the existing data within the docker volume. #docker cp pmm-data:/opt/prometheus/data opt/prometheus/ # docker cp pmm-data:/opt/consul-data opt/ # docker cp pmm-data:/var/lib/mysql var/lib/ # docker cp pmm-data:/var/lib/grafana var/lib/ Once the initial copy is done, start back the source PMM server # docker start pmm-server Next, to transfer the data copy to the destination server, I have used SCP here. #scp -r /backup/* livebackup.mydbops.com:/backup Now on the destination server, make sure to have the same version of the PMM server as the source, I have here used version “1.17-1” Since the monitoring server used for testing does not have internet access enabled I had to do an offline load of PMM image as below # docker load < pmm-server_1.17.tar 071d8bd76517: Loading layer [==================================================>] 210.2MB/210.2MB 6e66f289f547: Loading layer [==================================================>] 899.3MB/899.3MB Loaded image: percona/pmm-server:latest Creating docker volume as same as the source server # docker create -v /opt/prometheus/data -v /opt/consul-data -v /var/lib/mysql -v /var/lib/grafana --name pmm-data percona/pmm-server:latest /bin/true 985630d4aa64375cc4834ceb0f66654bada98cda151636bcd63e9f9b0d3fe3a2 Once the data volume is created now proceed to copy back the data into docker data volume as below # docker cp /backup/opt/prometheus/data pmm-data:/opt/prometheus/ # docker cp /backup/opt/consul-data pmm-data:/opt/ # docker cp /backp/var/lib/mysql pmm-data:/var/lib/ # docker cp /backup/var/lib/grafana pmm-data:/var/lib/ Once the data-copy is done, PMM data volume is ready to be used after the change of ownership as below. docker run --rm --volumes-from pmm-data -it percona/pmm-server:latest chown -R pmm:pmm /opt/prometheus/data /opt/consul-data docker run --rm --volumes-from pmm-data -it percona/pmm-server:latest chown -R grafana:grafana /var/lib/grafana docker run --rm --volumes-from pmm-data -it percona/pmm-server:latest chown -R mysql:mysql /var/lib/mysql Now let’s proceed with the final step to run the pmm-server #docker run -d -p 8080:80 --volumes-from pmm-data --name pmm-server -e SERVER_USER=fhuser -e SERVER_PASSWORD=FHPassword -e METRICS_RETENTION=1500h -e ORCHESTRATOR_ENABLED=true -e ORCHESTRATOR_USER='pmm_user' -e ORCHESTRATOR_PASSWORD='PMM_pass' -e METRICS_RESOLUTION=5s --restart always percona/pmm-server:latest Note this command should be exactly the same command executed on the master.After a few moment the PMM service was started in the destination box, we could see the live metrics data polling in (Graph below). We wanted to check the data sync between the source and destination, so we decided to make it run for a couple of more days After this test run period, We have verified the graphs between source and destination, which seems exactly the same, I am just sharing a sample graph for a DB server( between source and destination. Graph from Destination Graph from source : How does it work? In the first instance when we saw the data is being polled in the backup we were astonished !! and more curious to know how it’s actually working ?. We had a long discussion with our expert team on this This is just a conceptual understanding to explain this scenario.Let’s go back to the official architecture of the PMM server by Percona, As below. . As the above architecture depicts PMM has two parts one is the QAN(Query Analytics) and (MM) Metrics Monitor, where we have achieved the redundancy. Metrics monitor works by a “PULL” mechanism, ie., the Prometheus collects the metrics data from registered exporters. Since we have duplicated and created the destination server with the same registered exporter from servers. Hence destination, it makes one more pull from DB servers and stores data on to its own data-volume. What we have achieved is in the below architecture To analyze this we decided to capture packets on the DB machine for those servers and we could TCP packet analysis via TCP Dump: root@mydbops-db-8046755:/home/kabilesh.pr# tcpdump -c 20 -tttt -i eth0 src livebackup.mydbops.com tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes 2019-10-08 13:49:30.593668 IP livebackup.mydbops.com.59502 > mydbops-db-8046755.42002: Flags [P.], seq 1314972321:1314972594, ack 1936498908, win 1444, options [nop,nop,TS val 130344282 ecr 540951618], length 273 2019-10-08 13:49:30.614009 IP livebackup.mydbops.com.59502 > mydbops-db-8046755.42002: Flags [.], ack 4126, win 1422, options [nop,nop,TS val 130344287 ecr 540952633], length 0 2019-10-08 13:49:30.614022 IP livebackup.mydbops.com.59502 > mydbops-db-8046755.42002: Flags [.], ack 9052, win 1390, options [nop,nop,TS val 130344287 ecr 540952633], length 0 2019-10-08 13:49:30.712817 IP livebackup.mydbops.com.59516 > mydbops-db-8046755.42002: Flags [P.], seq 1684281946:1684282219, ack 3948074006, win 1444, options [nop,nop,TS val 130344312 ecr 540951623], length 273 2019-10-08 13:49:31.803979 IP livebackup.mydbops.com.42998 > mydbops-db-8046755.42000: Flags [P.], seq 703692847:703693117, ack 1617572969, win 1444, options [nop,nop,TS val 130344585 ecr 540952412], length 270 2019-10-08 13:49:31.816501 IP livebackup.mydbops.com.42998 > mydbops-db-8046755.42000: Flags [.], ack 15606, win 1436, options [nop,nop,TS val 130344588 ecr 540952933], length 0 20 packets captured 23 packets received by filter 0 packets dropped by kernel root@mydbops-db-8046755:/home/kabilesh.pr# tcpdump -c 20 -tttt -i eth0 src live.mydbops.com tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes 2019-10-08 13:49:42.615378 IP live.mydbops.com.40336 > mydbops-db-8046755.42000: Flags [P.], seq 2445371654:2445371924, ack 1808532851, win 1444, options [nop,nop,TS val 905019213 ecr 540954908], length 270 2019-10-08 13:49:42.627810 IP live.mydbops.com.40336 > mydbops-db-8046755.42000: Flags [.], ack 4126, win 1428, options [nop,nop,TS val 905019217 ecr 540955636], length 0 2019-10-08 13:49:42.627833 IP live.mydbops.com.40336 > mydbops-db-8046755.42000: Flags [.], ack 14262, win 1386, options [nop,nop,TS val 905019217 ecr 540955636], length 0 2019-10-08 13:49:43.490791 IP live.mydbops.com.55843 > mydbops-db-8046755.42002: Flags [P.], seq 611310810:611311083, ack 2742084436, win 1444, options [nop,nop,TS val 905019432 ecr 540954843], length 273 2019-10-08 13:49:43.512377 IP live.mydbops.com.55843 > mydbops-db-8046755.42002: Flags [.], ack 4126, win 1432, options [nop,nop,TS val 905019438 ecr 540955857], length 0 2019-10-08 13:49:43.512388 IP live.mydbops.com.55843 > mydbops-db-8046755.42002: Flags [.], ack 9064, win 1410, options [nop,nop,TS val 905019438 ecr 540955857], length 0 20 packets captured 20 packets received by filter 0 packets dropped by kernel Key Takeaways: This works well for replicating exactly the same set of servers which is already registered with the source. If a new server is added, I believe the samessteps have to be followed back. This is not officially supported, though it works and full-fills the requirement. QAN HA is not done here, but it can be done since the PMM server uses MySQL internally to store the QAN data, hence it can be replicated by exposing the MySQL port. There can be minor differences or deviation in the metrics data since the time at which it polls the data might be different between the servers. As said earlier it is good to have a High Availability for PMM as it plays a critical role.

  • Automatic member fencing with OFFLINE_MODE in Group Replication
    Group Replication enables you to create fault-tolerant systems with redundancy by replicating the system state to a set of servers. Even if some of the servers subsequently fail, as long it is not all or a majority, the system is still available.… Tweet Google Plus Share

  • Dirty reads in High Availability solution
     Understand dirty reads when using ProxySQL  Recently I had been asked to dig a bit about WHY some user where getting dirty reads when using PXC and ProxySQL.  While the immediate answer was easy, I had taken that opportunity to dig a bit more and buildup a comparison between different HA solutions.   For the ones that cannot wait, the immediate answer is …drum roll, PXC is based on Galera replication, and as I am saying from VERY long time (2011), Galera replication is virtually synchronous. Given that if you are not careful you MAY hit some dirty reads, especially if configured incorrectly.   There is nothing really bad here, we just need to know how to handle it right.  In any case the important thing is to understand some basic concepts.  Two ways of seeing the world (the theory) Once more let us talk about data-centric approach and data-distributed. We can have one data state:  Where all the data nodes see a single state of the data. This is it, you will consistently see the same data at a given T moment in time, where T is the moment of commit on the writer.   Or we have data distributed: Where each node has an independent data state. This means that data can be visible on the writer, but not yet visible on another node at the moment of commit, and that there is no guarantee that data will be passed over in a given time.   The two extremes can be summarized as follow: Tightly coupled database clusters Data Centric approach (single state of the data, distributed commit) Data is consistent in time cross nodes Replication requires high performing link Geographic distribution is forbidden Loosely coupled database clusters Single node approach (local commit) Data state differs by node Single node state does not affect the cluster Replication link doesn’t need to be high performance Geographic distribution is allowed    Two ways of seeing the world (the reality) Given life is not perfect and we do not have only extremes, the most commonly used MySQL solution find their place covering different points in the two-dimensional Cartesian coordinate system: This graph has the level of high availability on the X axis and the level of Loose – Tight relation on the Y axis.  As said I am only considering the most used solutions: MySQL – NDB cluster Solutions based on Galera  MySQL Group replication / InnoDB Cluster Basic Asynchronous MySQL replication  InnoDB Cluster and Galera are present in two different positions, while the others take a unique position in the graph. At the two extreme position we have Standard replication, which is the one less tight and less HA, and NDB Cluster who is the tightest solution and higher HA.    Translating this into our initial problem, it means that when using NDB we NEVER have Dirty Reads, while when we use standard replication we know this will happen. Another aspect we must take in consideration when reviewing our solutions, is that nothing come easy. So, the more we want to move to the Right-Top corner the more we need to be ready to give. This can be anything, like performance, functionalities, easy to manage, etc.  When I spoke about the above the first time, I got a few comments, the most common was related on why I decided to position them in that way and HOW I did test it.   Well initially I had a very complex approach, but thanks to the issue with the Dirty Reads and the initial work done by my colleague Marcelo Altman, I can provide a simple empiric way that you can replicate just use the code and instructions from HERE.   Down into the rabbit hole  The platform To perform the following tests, I have used: A ProxySQL server An NDB cluster of 3 MySQL nodes 6 data nodes (3 Node Groups) A cluster of 3 PXC 5.7 single writer An InnoDB cluster 3 nodes single writer  A 3 nodes MySQL replica set 1 Application node running a simple Perl script All nodes where connected with dedicated backbone network, different from front end receiving data from the script.  The tests I have run the same simple test script with the same set of rules in ProxySQL. For Galera and InnoDB cluster I had used the native support in ProxySQL, also because I was trying to emulate the issues I was asked to investigate.  For Standard replication and NDB I had used the mysql_replication_hostgroup settings, with the difference that the later one had 3 Writers, while basic replication has 1 only. Finally, the script was a single threaded operation, creating a table in the Test schema, filling it with some data, then read the Ids in ascending order, modify the record with update, and try to read immediately after.  When doing that with ProxySQL, the write will go to the writer Host Group (in our case 1 node also for NDB, also if this is suboptimal), while reads are distributed cross the READ Host Group. If for any reason an UPDATE operation is NOT committed on one of the nodes being part of the Reader HG, we will have a dirty read. Simple no?! The results     Let us review the graph. Number of dirty reads significantly reduce moving from left to the right of the graph, dropping from 70% of the total with basic replication to the 0.06% with Galera (sync_wait =0). The average lag is the average time taken from the update commit to when the script returns the read with the correct data.  It is interesting to note a few factors: The average cost time in GR between EVENTUAL and AFTER is negligible Galera average cost between sync_wait=0 and sync_wait=3 is 4 times longer  NDB is getting an average cost that is in line with the other BUT its max Lag is very low, so the fluctuation because the synchronization is minimal (respect to the others) GR and Galera can have 0 dirty reads but they need to be configured correctly.   Describing a bit more the scenario, MySQL NDB cluster is the best, period! Less performant in single thread than PXC but this is expected, given NDB is designed to have a HIGH number of simultaneous transactions with very limited impact. Aside that it has 0 dirty pages no appreciable lag between writer commit – reader.  On the other side of the spectrum we have MySQL replication with the highest number of dirty reads, still performance was not bad but data is totally inconsistent.  Galera (PXC implementation) is the faster solution when single threaded and has only 0.06% of dirty reads with WSREP_SYNC_WAIT=0, and 0 dirty pages when SYNC_WAIT=3. About galera we are seen and paying something that is like that by design. A very good presentation (https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work) from Fred Descamps explain how the whole thing works. This slide is a good example: By design the apply and commit finalize in Galera may have (and has) a delay between nodes. When changing the parameter wsrep_sync_wait as explained in the documentation the node initiates a causality check, blocking incoming queries while it catches up with the cluster.  Once all data on the node receiving the READ request is commit_finalized, the node perform the read.  MySQL InnoDB Cluster is worth a bit of discussion. From MySQL 8.0.14 Oracle introduced the parameter group_replication_consistency please read (https://dev.mysql.com/doc/refman/8.0/en/group-replication-consistency-guarantees.html), in short MySQL Group replication can now handle in different way the behavior in respect of Write transactions and read consistency. Relevant to us are two settings: EVENTUAL Both RO and RW transactions do not wait for preceding transactions to be applied before executing. This was the behavior of Group Replication before the group_replication_consistency variable was added. A RW transaction does not wait for other members to apply a transaction. This means that a transaction could be externalized on one member before the others. AFTER A RW transaction waits until its changes have been applied to all of the other members. This value has no effect on RO transactions. This mode ensures that when a transaction is committed on the local member, any subsequent transaction reads the written value or a more recent value on any group member. Use this mode with a group that is used for predominantly RO operations to ensure that applied RW transactions are applied everywhere once they commit. This could be used by your application to ensure that subsequent reads fetch the latest data which includes the latest writes.   As shown above using AFTER is a win and will guarantee us to prevent dirty reads with a small cost. ProxySQL ProxySQL has native support for Galera and Group replication, including the identification of the transactions/writeset behind. Given that we can think ProxySQL SHOULD prevent dirty reads, and it actually does when the entity is such to be caught.  But dirty reads can happen in such so small-time window that ProxySQL cannot catch them.  As indicated above we are talking of microseconds or 1-2 milliseconds. To catch such small entity ProxySQL monitor should pollute the MySQL servers with requests, and still possibly miss them given network latency.  Given the above, the dirty read factor, should be handled internally as MySQL Group Replication and Galera are doing, providing the flexibility to choose what to do.  There are always exceptions, and in our case the exception is in the case of basic MySQL replication. In that case, you can install and use the ProxySQL binlog reader, that could help to keep the READS under control, but will NOT be able to prevent them when happening a very small time and number. Conclusion Nothing comes for free, dirty reads is one of “those” things that can be prevented but we must be ready to give something back.  It doesn’t matter what, but we cannot get all at the same time.  Given that is important to identify case by case WHICH solution fits better, sometimes it can be NDB, others Galera or Group replication.  There is NOT a silver bullet and there is not a single way to proceed.  Also, when using Galera or GR the more demanding setting to prevent dirty reads, can be set at the SESSION level, reducing the global cost. Summarizing NDB is the best, but is complex and fits only some specific usage like high number of threads; simple schema definition; in memory dataset Galera is great and it helps in joining performance and efficiency. It is a fast solution but can be flexible enough to prevent dirty reads with some cost.Use WSREP_SYNC_WAIT to tune that see (https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait) MySQL Group Replication come actually attached, we can avoid dirty reads, it cost a bit use SET group_replication_consistency= 'AFTER' for that. Standard replication can use ProxySQL Binlog Reader, it will help but will not prevent the dirty reads.  To be clear: With Galera use WSREP_SYNC_WAIT=3 for reads consistency  With GR use group_replication_consistency= 'AFTER' I suggest to use SESSION not GLOBAL and play a bit with the settings to understand well what is going on.   I hope this article had given you a better understanding of what solutions we have out there, such that you will be able to perform an informed decision when in need.    Reference https://www.proxysql.com/blog/proxysql-gtid-causal-reads https://github.com/Tusamarco/proxy_sql_tools/tree/master/proxy_debug_tools https://en.wikipedia.org/wiki/Isolation_(database_systems)#Dirty_reads https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait https://dev.mysql.com/doc/refman/8.0/en/group-replication-configuring-consistency-guarantees.html https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work

    MySQL 8.0.18 was just released, and it contains a brand new feature to analyze and understand how queries are executed: EXPLAIN ANALYZE. What is it? EXPLAIN ANALYZE is a profiling tool for your queries that will show you where MySQL spends time on your query and why.… Facebook Twitter LinkedIn

  • MySQL load data infile made faster .
    Loading any large file into MySQL server using the LOAD DATA INFILE is a time consuming process , because it is single threaded and it is a single transaction too. But with modern hardwares system resource is not a bottle neck. At Mydbops we focus on improving the efficiency of process as we value performance more. MySQL introduced the parallel load data operations in its latest minor release MySQL 8.0.17 . I had the curiosity to test this feature and wanted to know, how it can improve the data loading comparing to the existing single threaded method  . Through this blog I am going to  compare the both methods . Remember you can use the parallel data loading utility only via MySQL Shell . Internal Work Flow : This section describes the working flow of the parallel data loading in MySQL 8.0.17 . The importTable utility will analyse the input data file The importTable utility will divide the large file into chunks The importTable utility will update the chunks to destination MySQL server with parallel threads Lab Environment : Below is the server configuration , which I was used to test the data loading operation . 4 core CPU 8 GB RAM SSD Disk Centos 7 I have a CSV file to be loaded , the size of the file is around 8 GB . sakthisrii:mysql-files sakthivel$ ls -lrth -rw-r--r-- 1 root _mysql 8.06G Sep 12 18:16 load_data.csv Let’s start the test with the existing method ( single threaded LOAD DATA INFILE ) . Loading via load data infile: MySQL localhost:33060+ ssl osm SQL > load data infile '/usr/local/mysql/mysql-files/load_data.csv' into table single.single_load fields terminated by ',' optionally enclosed by '"' lines terminated by '\n' (user_id,visible,name,size,latitude,longitude,timestamp,public,description,inserted); Query OK, 14244516 rows affected, 0 warnings (39 min 35.5036 sec) Records: 14244516 Deleted: 0 Skipped: 0 Warnings: 0 Well it took 39 minutes and 36 seconds .  No, I am going to load same data on a different table , with new Muti threaded in MySQL shell utility . With multi threaded in MySQL Shell. MySQL localhost osm JS > util.importTable("/usr/local/mysql/mysql-files/load_data.csv", {schema: "parallel", table: "parallel_load", columns: ["user_id","visible","name","size","latitude","longitude","timestamp","public","description","inserted"], dialect: "csv-unix", skipRows: 0, showProgress: true, fieldsOptionallyEnclosed: true, fieldsTerminatedBy: ",", linesTerminatedBy: "\n",fieldsEnclosedBy: '"',threads: 8, bytesPerChunk: "1G", maxRate: "2G"}) Importing from file '/usr/local/mysql/mysql-files/load_data.csv' to table `parallel`.`parallel_load` in MySQL Server at /tmp%2Fmysql.sock using 8 threads [Worker01] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker02] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker03] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker04] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker05] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker06] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker07] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 [Worker08] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0 100% (8.06 GB / 8.06 GB), 535.17 KB/s File '/usr/local/mysql/mysql-files/load_data.csv' (8.06 GB) was imported in 6 min 30.0411 sec at 18.81 MB/s Total rows affected in parallel.parallel_load: Records: 14244516 Deleted: 0 Skipped: 0 Warnings: 0 Yes, It took only 6 minutes and 30 seconds . This is 6x faster than the single threaded method and improved the speed tremendously. Options needs to be taken care : Below are the important options which involved in the performance of effective data loading. Every thresholds should be provided with the optimal values based on the available system resources ( CPU / RAM / Disk IOPS ) else it can degrade the performance. Threads BytesPerChunk MaxRate Threads :  You can define the number of parallel threads to process the data from the input file to the target server. The default value is 8 threads BytesPerChunk : This defines the size of the chunk for each LOAD DATA call . All the threads will process the separate chunk during the operation . We can define the threshold based on the available core and file size . MaxRate: The maximum limit on data throughput in bytes per second per thread. Use this option if you need to avoid saturating the network or the I/O or CPU for the client host or target server. Hope this blog helps to identify the difference between the existing load data operation and latest MySQL Shell utility parallel data loading operation . At Mydbops, We are keep testing the new things on MySQL and related tools, will be back soon with an exciting blog soon. Featured image by Joao Marcelo Marques on Unsplash