How to check hardware errors in Linux

WBOY (original)
2022-05-17 10:02:34

In Linux, you can use mcelog to check for hardware errors. mcelog is a tool that detects and logs hardware errors. Whether it can capture an uncorrected error depends on whether the error caused a warm restart or a hard restart: after a warm restart the error information is captured, while after a hard restart it may not be. You can install the tool with the "yum install mcelog" command.

Operating environment for this tutorial: Linux 7.3 system, Dell G3 computer.

1. mcelog is a tool used on Linux systems to check hardware errors, especially memory and CPU errors.

Uncorrected errors are critical exceptions that often lead to a kernel panic if the CPU cannot recover. The system then resets, which interrupts running applications.

For uncorrected errors, mcelog's ability to catch the error depends on whether the error resulted in a warm restart or a hard restart.

After a warm restart, the information is captured by mcelog and can be viewed once the system has recovered. A hard restart can result in data loss, and mcelog may not capture the event.

2. Installation

[root@RedHat_test ~]# yum install mcelog.x86_64

3. How to start mcelog

  • cron: the oldest method. mcelog runs as a scheduled task, so some errors may be lost between runs.

  • daemon: mcelog runs as a daemon process. This is the method used on EL7.

  • trigger: a newer method in which mcelog is run when an error is triggered; see man mcelog.

4. mcelog-related files

  • /dev/mcelog: device file
  • /var/log/mcelog: log file
  • /etc/mcelog/mcelog.conf: configuration file
  • /var/run/mcelog.pid: process ID file

By default, faults are recorded only in /var/log/mcelog and not in the system log.

If faults also need to appear in the system log, edit /etc/mcelog/mcelog.conf, uncomment the relevant option by removing its leading #, and save the file.
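The configuration edit can also be scripted. The following is only a minimal sketch: the helper name uncomment_option is hypothetical, and the option name syslog in the usage comment is an assumption about which commented line your mcelog.conf contains, so check the file first. The helper keeps a backup copy and works on any path you pass, so it can be tried on a copy of the configuration file.

```shell
#!/bin/sh
# uncomment_option FILE KEY   (hypothetical helper, not part of mcelog)
# Remove the leading "#" from a line of the form "#KEY = value" in FILE.
# The original file is kept as FILE.bak.
uncomment_option() {
    file="$1"
    key="$2"
    cp "$file" "$file.bak" || return 1
    # Match "#", optional whitespace, then exactly KEY followed by "=".
    # Comment lines that merely mention KEY are left untouched.
    sed "s/^#[[:space:]]*\($key[[:space:]]*=\)/\1/" "$file.bak" > "$file"
}

# Usage (assumed option name; run as root on the real file):
# uncomment_option /etc/mcelog/mcelog.conf syslog
```

Because the pattern requires "=" right after the key, an option whose name only begins with the same letters is not uncommented by accident.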

5. Run mcelog in the background

[root@RedHat_test ~]# mcelog --daemon
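After starting the daemon, it can be useful to confirm that it is still alive. The sketch below is one possible check, not an mcelog feature: the function name mcelog_daemon_status is hypothetical, and it assumes the daemon was configured to write its pid to /var/run/mcelog.pid, the path listed in section 4. If no pid file is written on your system, the function reports "not running" even when the daemon is up.

```shell
#!/bin/sh
# mcelog_daemon_status [PIDFILE]   (hypothetical helper)
# Print "running" if PIDFILE names a live process, otherwise "not running".
# PIDFILE defaults to the path listed in this article (an assumption).
mcelog_daemon_status() {
    pidfile="${1:-/var/run/mcelog.pid}"
    if [ -r "$pidfile" ]; then
        pid=$(cat "$pidfile")
        # On Linux, /proc/<pid> exists for every live process.
        if [ -n "$pid" ] && [ -d "/proc/$pid" ]; then
            echo "running"
            return 0
        fi
    fi
    echo "not running"
    return 1
}

# Usage:
# mcelog_daemon_status
```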

6. Check whether the system is abnormal

1. How to run mcelog manually

[root@RedHat_test ~]# mcelog --daemon

2. Check the mcelog log

[root@RedHat_test ~]# tail /var/log/mcelog # no output means everything is normal
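The same log check can be wrapped in a small function for use in a script or a monitoring job. This is a sketch under stated assumptions: the name check_mcelog_log is hypothetical, and a missing log file is treated the same as an empty one, which is only meaningful if mcelog is actually running.

```shell
#!/bin/sh
# check_mcelog_log [LOGFILE]   (hypothetical helper)
# Print "OK" and return 0 when LOGFILE is missing or empty;
# otherwise print its last 20 lines and return 1.
check_mcelog_log() {
    log="${1:-/var/log/mcelog}"
    if [ -s "$log" ]; then
        echo "hardware errors logged in $log:"
        tail -n 20 "$log"
        return 1
    fi
    echo "OK"
    return 0
}

# Usage:
# check_mcelog_log || echo "investigate the entries above"
```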

3. Check whether the mcelog daemon detects error information

[root@RedHat_test ~]# mcelog --client # no output means everything is normal

4. Parse the mcelog output when the system exception occurs

[root@RedHat_test ~]# mcelog --ascii < file.log
# or
[root@RedHat_test ~]# mcelog --ascii --file file.log
