Home  >  Article  >  Database  >  Remember an analysis of MySQL semaphore crash

Remember an analysis of MySQL semaphore crash

步履不停
步履不停Original
2019-07-02 16:08:042307browse

Remember an analysis of MySQL semaphore crash

BA should be familiar with InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crashed the server because it appears to be hung. MySQL background thread srv_error_monitor_thread When it is discovered that there is a latch lock that has been blocked for more than 600 seconds, if the lock is not released after being detected for 10 consecutive times, panic will be triggered to prevent the service from continuing to hang.

What happened

Version number: MySQL 5.5.40

The log continues to output threads waiting for data Dictionary lock, location is dict0dict.c line 305, waiting time exceeds 900s.

The thread holding the lock is 139998697924352, and its hexadecimal value is 7f53fca8a700.

--Thread 139998393616128 has waited at dict0dict.c line 305 for 934.00 seconds the semaphore:X-lock on RW-latch at 0x105a1b8 created in file dict0dict.c line 748a writer (thread id 139998697924352) has reserved it in mode exclusivenumber of readers 0, waiters flag 1, lock_word: 0Last time read locked in file dict0dict.c line 302Last time write locked in file /pb2/build/sb_0-13157587-1410170252.03/rpm/BUILD/mysql-5.5 .40/mysql-5.5.40/storage/innobase/dict/dict0dict.c line 305

Lock the transaction information of thread 139998697924352 and query the data dictionary table operation.

--- Transaction 0, Not Started Updating Table Statistics: MySQL Thread ID 14075, OS Thread Handle 0x7F53FCA8A700, Query ID 110414021 21.14.5.139 JZJK USRSELECT ROUND (SUM (Data_Length Index_length Data_free)/1024/1024 /1024) AS MY_DB_TOTAL_SIZE FROM information_schema.TABLES

Check the lock-holding thread 139998697924352 to see if there are other locks waiting.

Found thread 139998697924352, self-lock in btr0sea.c line 1134, this lock structure is related to AHI.

--Thread 139998697924352 has waited at btr0sea.c line 1134 for 934.00 seconds the semaphore:X-lock (wait_ex) on RW-latch at 0x1eb06448 created in file btr0sea.c line 178a writer (thread id 139998697924352) has reserved it in mode wait exclusivenumber of readers 1, waiters flag 1, lock_word: ffffffffffffffffLast time read locked in file btr0sea.c line 1057Last time write locked in file /pb2/build/sb_0-13157587-1410170252.03/rpm/BUILD /mysql-5.5.40/mysql-5.5.40/storage/innobase/btr/btr0sea.c line 1134

Next look at which functions the two lock structures are in:

dict0dict.c line 305 in the dict_table_stats_lock function

btr0sea.c line 1134 in the btr_search_drop_page_hash_index function

Under what circumstances will these functions be called?

Enable innodb_table_monitor and call dict_table_stats_lock to apply the X lock when outputting logs. This case is not enabled.

When innodb_stats_on_metadata is enabled, querying the data dictionary table will trigger the update operation of statistical information, and the X lock on dict_table_stats_lock will be called. This matches the transaction information of the thread holding the lock.

Adaptive hash index (AHI) is a hash table structure used by InnoDB to speed up index page search. When the number of page visits meets certain conditions, the address of this page will be stored in a hash table to reduce the cost of B-tree queries.

MySQL version 5.5 AHI uses the global lock btr_search_latch to maintain the consistency of hash table modifications.

InnoDB buffer pool status shows that free buffer basically remains 0 idle. When the InnoDB buffer pool evicts a data page, it calls the btr_search_drop_page_hash_index function to clear the data page from the AHI.

--------------------------------BUFFER POOL AND MEMORY-------- ---------------------Total memory allocated 17582522368; in additional pool allocated 0

Dictionary memory allocated 4289681

##Buffer pool size 1048576

Free buffers 0

Database pages 1040831

Old database pages 384193

Modified db pages 0

Summary

AHI's global lock btr_search_latch is often a hot spot of competition and affects performance. It has been improved after version 5.7 and is split into multiple instances like InnoDB buffer. In this case, when the Innodb_stats_on_metadata parameter is turned on, statistical information is updated when querying metadata information, and the data dictionary is locked, which blocks a large number of business operations. In addition, due to insufficient buffer pool space, the table evicts old pages and triggers AHI's btr_search_latch lock competition, which ultimately leads to The semaphore times out and crashes.

>> Easter Egg <<

It is quite a task to analyze the Semaphore crash in logs of several megabytes and find the relationship between locks, threads and transactions. Maddening. With the help of the three magic weapons of sed, awk and grep, although the efficiency has been improved, it is still not efficient enough.

To be lazy, I wrote a small program to help the DBA quickly sort out these relationships.

Its usage is like this:

hongbin@MBP ~> mysqldba doctor -f /Users/hongbin/workbench/mysqld_safe.log

Target version, look for the corresponding version when checking the code:

MySQL Server Version: '5.7.16-log'

The number of semaphore crashes that appear in the log and The number of mysql starts. If the number of starts is greater than the number of crashes, it may be caused by normal startup or other crashes:

##************ MySQL service start count ******* ***

MySQL Semaphore crash -> 3 times ["2018-08-13 23:12:18" "2018-08-14 12:13:43" "2018 -08-16 13:42:36"]

MySQL Service start -> 3 times ["2018-08-13 23:12:59" "2018-08-14 12:15:20" "2018-08-16 13:46:37"]

Which RW-latches are the threads mainly waiting for? The content includes: lock position, number of occurrences, thread id (occurrence times), focus on the ones that appear more frequently:

********** Which thread waited lock **********row0purge.cc:861 - > 58 140477266503424:(57) 140617703745280:(1)gi.cc:14791 -> 1 140477035656960:(1)trx0undo.ic:171 -> 1 1406176827655 68:(1)ha_innodb.cc:14791 -> 620 140617389913856:(58) 140202719565568:(58) 140202716903168:(57) 140477029533440:(56) 140617407219456:(55) 140477035656960:(5 2) 140477035124480:(29) 140477108467456:(29) 140477025539840:(26) 140477031130880:(25) 140477027669760:(22) 140617634944768:(21) 140617634146048:(21) 140477019948800:(21) 140477026604800:(20) 140477022078720:(1 8) 140477018883840:(16) 140477038585600:(15) 140477028734720:(10) 140477022877440:(9) 140477034325760:(1) 140477031663360:(1)srv0srv.cc:1968 -> 208 140477276993280:(185) 140617714235136:(23)ha_innodb.cc:5510 -> ; 601 140617398167296:(57) 140617409615616:(55) 140617392043776: (53) 140477110597376:(52) 140617395771136:(50) 140617636275968:(45) 140617632548608:(40) 140617634146048:(33) 14061763967564 8:(32) 140617397102336:(28) 140617639409408:(23) 140617635743488:(21) 140617637811968: (18) 140617638610688:(16) 140617399232256:(12) 140617638344448:(10) 140617638078208:(10) 140477033793280:(10) 14047702926720 0:(10) 140617397368576:(9) 140617635211008:(6) 140617393641216:(5) 140617637545728: (3) 140617402693376:(2) 140477037254400:(1)dict0dict.cc:1239 -> 136 140477122623232:(50) 140617392842496:(35) 140202726487808:(2 6) 140477123688192:(12) 140477038851840:(5) 140477030065920:( 4) 140617634412288:(4)row0trunc.cc:1835 -> 1 140477109798656:(1)

Which write threads hold the X lock for the above lock, focus on the ones that appear more often:

************ Which writer threads block at **********

##140616681907968 -> 1 trx0undo.ic:171:(1)140477173069568 -> 243 srv0srv.cc:1968:(185) row0purge.cc:861:(57) row0trunc.cc:1835:(1)140617682765568 -> 29 srv0sr v.cc: 1968:(23) ha_innodb.cc:5510:(5) row0purge.cc:861:(1)

Write the transaction information corresponding to the thread. There may also be log records that do not output transaction information:

********** These writer threads trx state **********MySQL thread id 83874, OS thread handle 140477173069568, query id 13139674 10.0.1.146 aml deleting from reference tables

Statistics on S lock held by writing threads:

****These writer threads at last time reads locked ****

140477173069568 -> 243 row0purge.cc:861:(243)140617682765568 -> 24 row0purge.cc:861:(24)140616681907968 -> 1 trx0undo.ic: 190:(1 )

Statistics on X lock held by write threads:

****These writer threads at last time write locked ****

140477173069568 -> 243 dict0stats.cc:2366:(243)140617682765568 -> 24 dict0stats.cc:2366:(24)140616681907968 -> 1 buf0flu.cc:1 198:(1)

Through post-event log analysis, it is possible that the transaction information of the thread is not output to the log, and it is impossible to know the specific operations performed by the transaction. In response to this situation, the mini program has added the ability to collect transaction information during the transaction.

The usage is like this:

hongbin@MBP ~> mysqldba -uxxx -pxxx doctor -w

It will monitor the errors of the target mysql In the log, once the keyword "a writer (thread id 140616681907968) has reserved it in mode" appears, query the transaction information in ps and save it.

The above is just a usage of the mini program. As a mini program that serves DBA, there are other functions waiting for you to discover. Feel free to share your thoughts with me.

For more technical articles related to SQL, please visit the

SQL Tutorial

column to learn!

The above is the detailed content of Remember an analysis of MySQL semaphore crash. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn