InnoDB Errors - Table Cache Mutex Contention Detected, Wait Threshold Exceeded
I've been encountering an issue, since the beginning of July, that causes my MariaDB instance to grind to a halt at almost the same time of day. Daily. Around 16:30 ET.
At first, I thought it was a Nextcloud issue. I didn't see much (if any) performance impact on other applications/services that relied on the MariaDB instance, initially.
However, further investigation has led me to believe that something has gone wrong with my MariaDB instance. This afternoon, I had to forcefully stop/restart MariaDB -- which is something I've never had to do before. MariaDB usually stops and starts pretty easily.
This would be for MariaDB 10.6.4, running on Windows Server 2019.
I grabbed the lines from the error log for today:
2024-07-23 0:29:00 27 [Warning] Aborted connection 27 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 14 [Warning] Aborted connection 14 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 15 [Warning] Aborted connection 15 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 16 [Warning] Aborted connection 16 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 22 [Warning] Aborted connection 22 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 21 [Warning] Aborted connection 21 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 20 [Warning] Aborted connection 20 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 101 [Warning] Aborted connection 101 to db: 'oe2-db' user: 'oe2-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 19 [Warning] Aborted connection 19 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 18 [Warning] Aborted connection 18 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:29:00 17 [Warning] Aborted connection 17 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 0:34:30 3 [Warning] Aborted connection 3 to db: 'hmaildb' user: 'hMailServerDBMan' host: 'localhost' (Got timeout reading communication packets) 2024-07-23 0:43:52 0 [Note] Windows service "MariaDB": received SERVICE_CONTROL_SHUTDOWN 2024-07-23 0:43:53 0 [Note] C:\Program Files\MariaDB 10.6\bin\mysqld.exe (initiated by: unknown): Normal shutdown 2024-07-23 0:43:54 0 [Note] InnoDB: FTS optimize thread exiting. 2024-07-23 0:43:54 0 [Note] InnoDB: Starting shutdown... 2024-07-23 0:43:54 0 [Note] InnoDB: Dumping buffer pool(s) to E:\MariaDB\ib_buffer_pool 2024-07-23 0:43:54 0 [Note] InnoDB: Restricted to 32404 pages due to innodb_buf_pool_dump_pct=25 2024-07-23 0:43:54 0 [Note] InnoDB: Buffer pool(s) dump completed at 240723 0:43:54 2024-07-23 0:44:05 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1" 2024-07-23 0:44:05 0 [Note] InnoDB: Shutdown completed; log sequence number 97586551533; transaction id 95165637 2024-07-23 0:44:05 0 [Note] C:\Program Files\MariaDB 10.6\bin\mysqld.exe: Shutdown complete 2024-07-23 8:42:24 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2024-07-23 8:42:24 0 [Note] InnoDB: Number of pools: 1 2024-07-23 8:42:24 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2024-07-23 8:42:24 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728 2024-07-23 8:42:24 0 [Note] InnoDB: Completed initialization of buffer pool 2024-07-23 8:42:26 0 [Note] InnoDB: 128 rollback segments are active. 2024-07-23 8:42:27 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2024-07-23 8:42:27 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2024-07-23 8:42:27 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2024-07-23 8:42:27 0 [Note] InnoDB: 10.6.4 started; log sequence number 97586551533; transaction id 95165636 2024-07-23 8:42:27 0 [Note] InnoDB: Loading buffer pool(s) from E:\MariaDB\ib_buffer_pool 2024-07-23 8:42:27 0 [Note] Plugin 'FEEDBACK' is disabled. 2024-07-23 8:42:29 0 [Note] Server socket created on IP: '::'. 2024-07-23 8:42:29 0 [Note] Server socket created on IP: '0.0.0.0'. 2024-07-23 8:42:34 0 [Note] C:\Program Files\MariaDB 10.6\bin\mysqld.exe: ready for connections. Version: '10.6.4-MariaDB' socket: '' port: 3306 mariadb.org binary distribution 2024-07-23 8:43:01 0 [Note] InnoDB: Buffer pool(s) load completed at 240723 8:43:01 2024-07-23 9:27:01 8 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:01 9 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:01 9 [Warning] Aborted connection 9 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:01 8 [Warning] Aborted connection 8 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:01 9 [Warning] Aborted connection 9 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:01 8 [Warning] Aborted connection 8 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:03 10 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:03 10 [Warning] Aborted connection 10 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:03 10 [Warning] Aborted connection 10 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:03 11 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:03 11 [Warning] Aborted connection 11 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:03 11 [Warning] Aborted connection 11 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:04 12 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:04 12 [Warning] Aborted connection 12 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:04 12 [Warning] Aborted connection 12 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:06 13 [Warning] IP address '10.12.8.3' could not be resolved: No such host is known. 2024-07-23 9:27:06 13 [Warning] Aborted connection 13 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:06 13 [Warning] Aborted connection 13 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:16 24 [Warning] Aborted connection 24 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:16 24 [Warning] Aborted connection 24 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:27:38 18 [Warning] Aborted connection 18 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:38 21 [Warning] Aborted connection 21 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:38 20 [Warning] Aborted connection 20 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:27:38 14 [Warning] Aborted connection 14 to db: 'meshcentral' user: 'meshcentral-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:29:24 31 [Warning] Aborted connection 31 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:29:24 31 [Warning] Aborted connection 31 to db: 'unconnected' user: 'unauthenticated' host: '10.12.8.3' (This connection closed normally without authentication) 2024-07-23 9:32:39 32 [Note] Detected table cache mutex contention at instance 1: 25% waits. Additional table cache instance activated. Number of instances after activation: 2. 2024-07-23 9:32:51 52 [Warning] Aborted connection 52 to db: 'oe2-db' user: 'oe2-dbMan' host: '10.12.8.3' (Got an error reading communication packets) 2024-07-23 9:34:44 76 [Note] Detected table cache mutex contention at instance 1: 31% waits. Additional table cache instance activated. Number of instances after activation: 3. 2024-07-23 9:34:44 142 [Note] Detected table cache mutex contention at instance 2: 29% waits. Additional table cache instance activated. Number of instances after activation: 4. 2024-07-23 9:35:27 157 [Note] Detected table cache mutex contention at instance 2: 23% waits. Additional table cache instance activated. Number of instances after activation: 5. 2024-07-23 16:27:52 22 [Warning] InnoDB: A long wait (503 seconds) was observed for dict_sys.mutex 2024-07-23 16:29:31 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.mutex. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ 240723 16:29:35 [ERROR] mysqld got exception 0x80000003 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.6.4-MariaDB key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=154 max_threads=65537 thread_count=55 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 139517 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... server.dll!my_parameter_handler()[my_init.c:278] ucrtbase.dll!raise() ucrtbase.dll!abort() server.dll!ib::fatal::~fatal()[ut0ut.cc:519] server.dll!srv_monitor_task()[srv0srv.cc:1270] server.dll!tpool::thread_pool_win::native_timer::timer_callback()[tpool_win.cc:97] ntdll.dll!RtlAddRefActivationContext() ntdll.dll!RtlAcquireSRWLockExclusive() KERNEL32.DLL!BaseThreadInitThunk() ntdll.dll!RtlUserThreadStart() The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains information that should help you find out what is causing the crash. Writing a core file at E:\MariaDB\ 2024-07-23 16:30:01 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.mutex. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ 2024-07-23 16:30:01 7709 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.mutex. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
When checking Task Manager, RAM usage sits consistently at 60%. CPU usage doesn't appear to sit anywhere near 100% during occurrences of this issue. I think that I've configured my instance incorrectly. But, I'm not quite sure what to change yet.
Answer Answered by Daniel Black in this comment.
"Detected table cache mutex contention" isn't an error or warning, its a note that MariaDB is sizing up its table cache because they are frequently happing. Nothing to worry about on this one.
THe ERROR FATAL ... obviously is a problem. There have been almost 3 years of bug fixes since 10.6.4 and some would fix this kind of problem. Please update to a newer version in the 10.6 series.