DEVICESCAN failed: glob(3) aborted matching pattern /dev/discs/disc*

  • March 31, 2020
  • Notes

Problem description

In a production environment, the following error showed up:

Mar 28 10:09:36 localhost smartd[9865]: DEVICESCAN failed: glob(3) aborted matching pattern /dev/discs/disc*  

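The error is not frequent, but with a large enough fleet it keeps turning up, too often to ignore. Until today I had assumed it was caused by hardware, for example a faulty disk. Today, however, I built a ramdisk OS for PXE boot, and it reproduces the error 100% of the time on machines that were otherwise working normally, so I suspect the real cause lies elsewhere.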

smartd(8)

smartd is a daemon that monitors the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into many ATA-3 and later ATA, IDE and SCSI-3 hard drives. The purpose of SMART is to monitor the reliability of the hard drive and predict drive failures, and to carry out different types of drive self-tests. This version of smartd is compatible with ATA/ATAPI-7 and earlier standards.

smartd will attempt to enable SMART monitoring on ATA devices (equivalent to smartctl -s on) and polls these and SCSI devices every 30 minutes (configurable), logging SMART errors and changes of SMART Attributes via the SYSLOG interface. The default location for these SYSLOG notifications and warnings is /var/log/messages.

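Analysis

The messages log is shown below. It shows that when the smartd service started, its scan found no disk; the kernel only initialized the disk afterwards.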

Mar 28 10:09:36 localhost smartd[9865]: Configuration file /etc/smartmontools/smartd.conf was parsed, found DEVICESCAN, scanning devices
Mar 28 10:09:36 localhost smartd[9865]: DEVICESCAN failed: glob(3) aborted matching pattern /dev/discs/disc*
Mar 28 10:09:36 localhost smartd[9865]: In the system's table of devices NO devices found to scan
Mar 28 10:09:36 localhost smartd[9865]: Monitoring 0 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
...
Mar 28 10:09:38 localhost kernel: scsi 0:0:0:0: Attached scsi generic sg0 type 0
Mar 28 10:09:38 localhost kernel: AMD64 EDAC driver v3.4.0
Mar 28 10:09:38 localhost kernel: Request for unknown module key 'Mellanox Technologies signing key: 61feb074fc7292f958419386ffdd9d5ca999e403' err -11
Mar 28 10:09:38 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] 937703088 512-byte logical blocks: (480 GB/447 GiB)
Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 28 10:09:38 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 10:09:38 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI removable disk
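The ordering in the log above can also be checked mechanically. The following is only a rough sketch, not part of smartmontools: the helper names (`parse_ts`, `scan_before_attach`) are made up for illustration, and it assumes the classic syslog line layout seen above plus the "scanning devices" and "Attached SCSI" message texts taken from this log.

```python
import re
from datetime import datetime

# Matches classic syslog lines such as
# "Mar 28 10:09:36 localhost smartd[9865]: message text"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?P<tag>[^:\[\s]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)


def parse_ts(ts):
    # Syslog timestamps carry no year; a fixed year is enough
    # to order events that belong to the same boot.
    return datetime.strptime("2020 " + " ".join(ts.split()), "%Y %b %d %H:%M:%S")


def scan_before_attach(lines):
    """Return (scan_time, attach_time, raced) for the first smartd
    device scan and the first kernel disk attach found in `lines`."""
    scan_time = attach_time = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when = parse_ts(m["ts"])
        if scan_time is None and m["tag"] == "smartd" and "scanning devices" in m["msg"]:
            scan_time = when
        if (attach_time is None and m["tag"] == "kernel"
                and re.search(r"\[sd[a-z]+\] Attached SCSI", m["msg"])):
            attach_time = when
    raced = (scan_time is not None and attach_time is not None
             and scan_time < attach_time)
    return scan_time, attach_time, raced


sample = [
    "Mar 28 10:09:36 localhost smartd[9865]: Configuration file /etc/smartmontools/smartd.conf was parsed, found DEVICESCAN, scanning devices",
    "Mar 28 10:09:38 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI removable disk",
]
scan, attach, raced = scan_before_attach(sample)
print("smartd scanned before the disk was attached:", raced)
print("gap in seconds:", (attach - scan).total_seconds())
```

Syslog timestamps only have one-second resolution, so events logged within the same second cannot be ordered this way; in the log above the gap is two seconds, which is enough.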

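After boot, the smartd service status and the disk status are as follows. Even after logging in, the service is still not monitoring the disk.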

[root@localhost ~]# systemctl status smartd.service
● smartd.service - Self Monitoring and Reporting Technology (SMART) Daemon
   Loaded: loaded (/usr/lib/systemd/system/smartd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2020-03-28 10:24:35 CST; 45min ago
     Docs: man:smartd(8)
           man:smartd.conf(5)
 Main PID: 8613 (smartd)
   CGroup: /system.slice/smartd.service
           └─8613 /usr/sbin/smartd -n -q never

Mar 28 10:24:35 localhost.localdomain systemd[1]: Started Self Monitoring and Reporting Technology (SMART) Daemon.
Mar 28 10:24:36 localhost.localdomain smartd[8613]: smartd 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.el7.x86_64] (local build)
Mar 28 10:24:36 localhost.localdomain smartd[8613]: Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
Mar 28 10:24:36 localhost.localdomain smartd[8613]: Opened configuration file /etc/smartmontools/smartd.conf
Mar 28 10:24:36 localhost.localdomain smartd[8613]: Configuration file /etc/smartmontools/smartd.conf was parsed, found DEVICESCAN, scanning devices
Mar 28 10:24:36 localhost.localdomain smartd[8613]: DEVICESCAN failed: glob(3) aborted matching pattern /dev/discs/disc*
Mar 28 10:24:36 localhost.localdomain smartd[8613]: In the system's table of devices NO devices found to scan
Mar 28 10:24:36 localhost.localdomain smartd[8613]: Monitoring 0 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
[root@localhost ~]# lsblk
NAME MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda    8:0    1 447.1G  0 disk
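Since the service reports "active (running)" even though it monitors nothing, the unit state alone is not a useful health check across a large fleet. A minimal sketch of a log-based check follows; the function names are hypothetical, and it assumes the summary line keeps exactly the wording shown in the smartd 6.5 output above.

```python
import re

# Wording copied from the smartd 6.5 summary line in the log above.
SUMMARY_RE = re.compile(
    r"Monitoring (\d+) ATA/SATA, (\d+) SCSI/SAS and (\d+) NVMe devices"
)


def monitored_counts(log_text):
    """Return device counts from the last 'Monitoring ...' summary
    line in `log_text`, or None if no such line is present."""
    matches = SUMMARY_RE.findall(log_text)
    if not matches:
        return None
    ata, scsi, nvme = (int(n) for n in matches[-1])
    return {"ata": ata, "scsi": scsi, "nvme": nvme}


def smartd_sees_nothing(log_text):
    """True when smartd logged a summary and it lists zero devices."""
    counts = monitored_counts(log_text)
    return counts is not None and sum(counts.values()) == 0


log_text = (
    "Mar 28 10:24:36 localhost.localdomain smartd[8613]: "
    "Monitoring 0 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices\n"
)
print(monitored_counts(log_text))
print(smartd_sees_nothing(log_text))
```

Combined with a non-empty lsblk listing, a zero count here is the signature of the problem described in this note.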

Attempted fix

To fix this, I tried delaying smartd so that it starts after the disk has been initialized. The original smartd.service unit file is as follows:

[root@server ~]# systemctl cat smartd.service
[Unit]
Description=Self Monitoring and Reporting Technology (SMART) Daemon
Documentation=man:smartd(8) man:smartd.conf(5)
After=syslog.target

[Service]
EnvironmentFile=-/etc/sysconfig/smartmontools
ExecStart=/usr/sbin/smartd -n $smartd_opts
ExecReload=/bin/kill -HUP $MAINPID
StandardOutput=syslog

[Install]
WantedBy=multi-user.target
[root@localhost FX2010700017L]#

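To make sure smartd.service starts after the disk has been initialized, I added multi-user.target to the After= line of the unit file, so that smartd starts only after the system has reached the multi-user target, that is, after the terminals have been initialized. (This system boots to runlevel 3; for runlevel 5, use graphical.target instead.) I then repackaged the ramdisk OS.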

[root@localhost ~]# cat /usr/lib/systemd/system/smartd.service
[Unit]
Description=Self Monitoring and Reporting Technology (SMART) Daemon
Documentation=man:smartd(8) man:smartd.conf(5)
After=syslog.target multi-user.target

[Service]
EnvironmentFile=-/etc/sysconfig/smartmontools
ExecStart=/usr/sbin/smartd -n $smartd_opts
ExecReload=/bin/kill -HUP $MAINPID
StandardOutput=syslog

[Install]
WantedBy=multi-user.target
[root@localhost FX2010700017L]#
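When repackaging an image it is easy to ship the old unit file by mistake, so a small check that the After= change actually made it in can be handy. The sketch below is an illustration only: `after_targets` is a hypothetical helper, the two unit texts are trimmed copies of the [Unit] sections shown above, and Python's configparser is used as a rough approximation of systemd's unit-file syntax (it does not handle every systemd feature, such as repeated keys being merged).

```python
from configparser import ConfigParser

ORIGINAL_UNIT = """\
[Unit]
Description=Self Monitoring and Reporting Technology (SMART) Daemon
Documentation=man:smartd(8) man:smartd.conf(5)
After=syslog.target
"""

PATCHED_UNIT = """\
[Unit]
Description=Self Monitoring and Reporting Technology (SMART) Daemon
Documentation=man:smartd(8) man:smartd.conf(5)
After=syslog.target multi-user.target
"""


def after_targets(unit_text):
    """Return the units listed in After= of the [Unit] section."""
    # "=" is the only delimiter so values like "man:smartd(8)" stay intact;
    # interpolation is disabled because unit files may contain "%" specifiers.
    parser = ConfigParser(interpolation=None, strict=False, delimiters=("=",))
    parser.optionxform = str  # keep key case, e.g. "After" rather than "after"
    parser.read_string(unit_text)
    return parser.get("Unit", "After", fallback="").split()


for name, text in (("original", ORIGINAL_UNIT), ("patched", PATCHED_UNIT)):
    ordered = "multi-user.target" in after_targets(text)
    print(name, "is ordered after multi-user.target:", ordered)
```

On a running system, `systemctl show -p After smartd.service` reports the ordering systemd actually computed, which is the more authoritative check.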

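After booting from the new ramdisk OS, the messages log and the smartd service status are shown below. smartd now starts after terminal initialization has completed, and it finds the disk when it scans at service startup. With that, the problem is solved.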

[root@localhost ~]# cat /var/log/messages
...
Mar 28 15:31:43 localhost kernel: scsi 0:0:0:0: Attached scsi generic sg0 type 0
Mar 28 15:31:43 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 15:31:43 localhost kernel: sd 0:0:0:0: [sda] 937703088 512-byte logical blocks: (480 GB/447 GiB)
Mar 28 15:31:43 localhost kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Mar 28 15:31:43 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Mar 28 15:31:43 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 28 15:31:43 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 15:31:43 localhost kernel: ata1.00: Enabling discard_zeroes_data
Mar 28 15:31:43 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI removable disk
...
Mar 28 15:31:47 localhost systemd: Reached target Multi-User System.
Mar 28 15:31:47 localhost systemd: Started Self Monitoring and Reporting Technology (SMART) Daemon.
Mar 28 15:31:47 localhost systemd: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Mar 28 15:31:47 localhost systemd: Starting Update UTMP about System Runlevel Changes...
Mar 28 15:31:47 localhost systemd: Started Update UTMP about System Runlevel Changes.
Mar 28 15:31:47 localhost systemd: Startup finished in 52.156s (kernel) + 8.846s (userspace) = 5min 1.858s.
Mar 28 15:31:47 localhost smartd[42226]: smartd 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.el7.x86_64] (local build)
Mar 28 15:31:47 localhost smartd[42226]: Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
Mar 28 15:31:47 localhost smartd[42226]: Opened configuration file /etc/smartmontools/smartd.conf
Mar 28 15:31:47 localhost smartd[42226]: Configuration file /etc/smartmontools/smartd.conf was parsed, found DEVICESCAN, scanning devices
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda, type changed from 'scsi' to 'sat'
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda [SAT], opened
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda [SAT], 480 GB
Mar 28 15:31:47 localhost systemd: Created slice User Slice of root.
Mar 28 15:31:47 localhost systemd-logind: New session 1 of user root.
Mar 28 15:31:47 localhost systemd: Started Session 1 of user root.
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda [SAT], not found in smartd database.
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
Mar 28 15:31:47 localhost smartd[42226]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Mar 28 15:31:47 localhost smartd[42226]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
...
[root@localhost ~]# systemctl status smartd.service
● smartd.service - Self Monitoring and Reporting Technology (SMART) Daemon
   Loaded: loaded (/usr/lib/systemd/system/smartd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2020-03-28 15:31:47 CST; 15min ago
     Docs: man:smartd(8)
           man:smartd.conf(5)
 Main PID: 42226 (smartd)
   CGroup: /system.slice/smartd.service
           └─42226 /usr/sbin/smartd -n -q never

Mar 28 15:31:47 localhost.localdomain smartd[42226]: Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Opened configuration file /etc/smartmontools/smartd.conf
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Configuration file /etc/smartmontools/smartd.conf was parsed, found DEVICESCAN, scanning devices
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda, type changed from 'scsi' to 'sat'
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda [SAT], opened
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda [SAT], INTEL SSDSCKKB480G8, S/N:PHYH951400VK480K, WWN:5-5cd2e4-1520d72fb, FW:XC311120, 480 GB
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda [SAT], not found in smartd database.
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Mar 28 15:31:47 localhost.localdomain smartd[42226]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices