Home > Article > Operation and Maintenance > Frequently asked Linux interview questions: Find large files and securely clear them
本篇文章给大家分享一个Linux 线上面试高频问题:如何查找大文件并安全地清除?,给大家分析分析,大家也可以对照着自己分析一下,希望对大家有所帮助!
服务线上环境,会出现一些磁盘使用率过高而告警的情况,可能是某个日志文件过大,没有及时清理回收,如何找到大目录和大文件?
如何安全的清理大文件?
如何使占用的磁盘空间快速释放掉?
(这里以当前目录 ./ 为例,统计 top5)
【du -k --max-depth=1 ./ |sort -nr|head -n5】
[root@test-001 /]# du -k --max-depth=1 ./ |sort -nr|head -n5 137450839518./ 6785876./data 2182577./usr 1830341./home 446856./var //du -k # 显示目录或文件大小时,以 kB 为单位; //du --max-depth=1 [目录] # 只显示指定目录下第一层目录(不含单个文件)的大小; //sort -nr # 以行为单位,根据数字大小从大到小排序; //head -n5 # 显示内容的开头 5 行,这里显示就是 top5 内容;
(这里以当前目录 ./ 为例,统计 top5)
(1)命令详情和说明
【du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}' 】
[root@test-001 /]# du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}' 7.13G data 2.17G usr 1.75G home 447.04M var 408.50M run //du -sk * # 显示当前目录下每个文件夹和文件的大小以KB为单位(最常用),s表示汇总,k是以KB为统计单位; //./ #当前目录下 //sort -nr # 以行为单位,根据数字大小从大到小排序; //awk -F'\t'# 以水平制表符进行分割,后面的程序就是进行换算单位,格式化输出成易懂的形式;
(2)du、head、sort、awk 详细说明参考已有文章附录
(3)Linux 中 printf 命令使用参考
// Linux 中 printf 命令使用参考 // https://www.linuxprobe.com/linux-printf-example.html '{ if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) { printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2 } else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) { printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2 } else if (1024 * 1024 > $1 && $1 >= 1024) { printf "%.2fM\t\t %s\n", $1/1024, $2 } else { printf "%sk\t\t %s\n", $1, $2 } }'
(1)rm 命令有哪些坑?
rm -rf / # 这个命令绝逼不能操作,删除根目录下的文件,就是系统中的所有文件都要被删除。如果是线上服务机器操作了,那就悲剧了!误操作了怎么办?赶快ctrl+c、ctrl+z 能保住多少是多少吧。
rm -rf / home/apps/logs/ # 这也是个天坑命令!目的是删除日志文。结果书写时“多了一个空格”的 bug,看懂了么?这就变成了 [rm -rf /] !
埋藏隐患的日志清理 shell 脚本!脚本关键内容如下。
cd ${log_path} rm -rf *
目的是:进入到日志目录,然后把日志都删除。隐患:当目录不存在时,悲剧就发生了!
(2)如何安全使用 rm 命令?
在生产环境把 [rm -rf] 命令替换为 [mv],再写个脚本程序定期清理,模拟回收站的功能。
把日志清理 shell 脚本,改用逻辑与 && 进行连接。
cd ${log_path} rm -rf *
改用逻辑与 && 进行连接,合并成一句,前半句逻辑失败,后半句命令不执行:
```shell
cd ${log_path} && rm -rf *
完整的日志清理 shell 脚本如下:
```shell #!/bin/bash base_home="/home/apps" log_path=${base_home}/logs cd ${log_path} && rm -rf *
(1)问题情景
1 磁盘使用率监控报警,进入机器可以 (df -h) 命令看到磁盘使用率确实超过了报警阀值。
2 使用命令查看大目录,并进入到目录下 【du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}'
】
3 依然没找到大文件,该怎么办呢?
(2)排查思路
1 思考:是不是有文件已经被删除了,但进程还在占用该文件,进程未结束,空间未释放?
2 使用「lsof |grep -i deleted
」命令查看,能查看到已删除,空间没有释放的文件,包含文件大小,进程和服务名等信息。
lsof(List Open Files) 用于查看进程打开的文件,打开文件的进程,进程打开的端口(TCP、UDP),找回/恢复删除的文件。是十分方便的系统监视工具,因为 lsof 命令需要访问核心内存和各种文件,所以需要root 用户权限执行。
(3) Release the occupied disk space
Restart the service pointed to by the process, and the occupied disk space will be released. Be careful during online production operations. Do not kill the process directly. Evaluate whether there is a restart command for the process service itself, and evaluate whether the service can be restarted.
(4) Remarks Appendix
1 When a file is being used by a process and the user deletes the file, the file will only be removed from the directory removed from the structure but not removed from disk.
2 When the process using this file ends, the file will be truly deleted from the disk and the occupied space will be released. When Linux opens a file, the kernel will create a file for each process in the /proc/
『/proc/{nnnn}/fd/
folder ({nnnn}
(for pid)" Create a folder named after its pid to save the relevant information of the process, and its subfolder fd saves the fd (fd: file descriptor) of all the files opened by the process.
3 Ctrl C
and Ctrl Z
are both interrupt commands. Ctrl C
is to forcefully interrupt the execution of the program, and the process has been terminated; Ctrl Z
is to suspend the task (meaning to pause), it is still in the process, it just maintains the suspended state.
6 Commonly used safe cleaning commands for large files in production environments
Safe cleaning of production environments What are the requirements for large files? It is necessary not only to not affect the normal operation of the service, but also to quickly release the occupied disk space (it is not our purpose to make the files disappear, our purpose is to quickly release the occupied disk space).
Do not use "rm -rf xxx.log
"; commonly use "echo "" > xxx.log
".
It is assumed that xxx.log is a large file. For example, this xxx.log has dozens of GB. "echo "" > xxx.log
" is used A ""
content overwrites the original file content, causing disk space to be released instantly!
summarizes the commonly used combination commands for finding large directories and large files (involving du, head , sort, awk and other commands);
And how to use the rm command safely;
There is also a disk usage alarm, but it cannot be checked How to troubleshoot specific large files;
Finally, we also mentioned the commonly used echo command to overwrite the original file to instantly release the disk space occupied.
Related recommendations: "Linux Video Tutorial"
The above is the detailed content of Frequently asked Linux interview questions: Find large files and securely clear them. For more information, please follow other related articles on the PHP Chinese website!