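Problem description
I'm running Ubuntu Server 13.04 64-bit with native ZFS. I have a zpool made up of 4 hard drives, one of which died yesterday and is no longer recognized by the operating system or the BIOS.
Unfortunately I only noticed the problem after the next reboot, so the drive's label is now missing and I can't replace the disk by following the official instructions.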
Running zpool status hermes -x prints:
root@zeus:~# zpool status hermes -x
pool: hermes
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun 9 00:28:24 2013
config:
NAME STATE READ WRITE CKSUM
hermes DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
ata-ST3300620A_5QF0MJFP ONLINE 0 0 0
ata-ST3300831A_5NF0552X UNAVAIL 0 0 0
ata-ST3200822A_5LJ1CHMS ONLINE 0 0 0
ata-ST3200822A_3LJ0189C ONLINE 0 0 0
errors: No known data errors
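For longer status listings, the affected rows can be picked out mechanically. The following is only a rough sketch: the function name list_unhealthy is made up for illustration, and it assumes the config table keeps the NAME/STATE column layout shown above.

```shell
#!/bin/sh
# list_unhealthy (hypothetical helper): read `zpool status` text on stdin and
# print "NAME STATE" for every row of the config table that is not ONLINE.
# Assumes the table starts at the "NAME STATE ..." header and ends at "errors:".
list_unhealthy() {
  awk '
    $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # table header
    $1 == "errors:"               { in_config = 0 }         # end of table
    in_config && NF >= 5 && $2 != "ONLINE" { print $1, $2 }
  '
}

# Usage against a live pool:
#   zpool status hermes | list_unhealthy
```

On the output above this would list the pool, the raidz1-0 vdev, and the UNAVAIL disk, since the degraded state propagates upward from the missing member.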
I have already installed a new drive (its label is /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ).
Each of these commands
zpool replace hermes /dev/disk/by-id/ata-ST3300831A_5NF0552X /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
zpool offline hermes /dev/disk/by-id/ata-ST3300831A_5NF0552X
zpool detach hermes /dev/disk/by-id/ata-ST3300831A_5NF0552X
fails:
root@zeus:~# zpool offline hermes /dev/disk/by-id/ata-ST3300831A_5NF0552X
cannot offline /dev/disk/by-id/ata-ST3300831A_5NF0552X: no such device in pool
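because the dead drive's label no longer exists in the system. I also tried the commands above with the path omitted from the drive label, to no avail.
How do I replace the "ghost" disk?
Best solution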
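After digging endlessly tonight, I finally found the solution. In short: you can pass the disk's GUID (which persists even after the drive has been disconnected) to the zpool command.
The long answer: I got the disk's GUID from the zdb command, which gave me the following output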
root@zeus:/dev# zdb
hermes:
version: 28
name: 'hermes'
state: 0
txg: 162804
pool_guid: 14829240649900366534
hostname: 'zeus'
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 14829240649900366534
children[0]:
type: 'raidz'
id: 0
guid: 5355850150368902284
nparity: 1
metaslab_array: 31
metaslab_shift: 32
ashift: 9
asize: 791588896768
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 11426107064765252810
path: '/dev/disk/by-id/ata-ST3300620A_5QF0MJFP-part2'
phys_path: '/dev/gptid/73b31683-537f-11e2-bad7-50465d4eb8b0'
whole_disk: 1
create_txg: 4
children[1]:
type: 'disk'
id: 1
guid: 15935140517898495532
path: '/dev/disk/by-id/ata-ST3300831A_5NF0552X-part2'
phys_path: '/dev/gptid/746c949a-537f-11e2-bad7-50465d4eb8b0'
whole_disk: 1
create_txg: 4
children[2]:
type: 'disk'
id: 2
guid: 7183706725091321492
path: '/dev/disk/by-id/ata-ST3200822A_5LJ1CHMS-part2'
phys_path: '/dev/gptid/7541115a-537f-11e2-bad7-50465d4eb8b0'
whole_disk: 1
create_txg: 4
children[3]:
type: 'disk'
id: 3
guid: 17196042497722925662
path: '/dev/disk/by-id/ata-ST3200822A_3LJ0189C-part2'
phys_path: '/dev/gptid/760a94ee-537f-11e2-bad7-50465d4eb8b0'
whole_disk: 1
create_txg: 4
features_for_read:
The GUID I was looking for is 15935140517898495532, which let me do
root@zeus:/dev# zpool offline hermes 15935140517898495532
root@zeus:/dev# zpool status
pool: hermes
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun 9 00:28:24 2013
config:
NAME STATE READ WRITE CKSUM
hermes DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
ata-ST3300620A_5QF0MJFP ONLINE 0 0 0
ata-ST3300831A_5NF0552X OFFLINE 0 0 0
ata-ST3200822A_5LJ1CHMS ONLINE 0 0 0
ata-ST3200822A_3LJ0189C ONLINE 0 0 0
errors: No known data errors
and then
root@zeus:/dev# zpool replace hermes 15935140517898495532 /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
root@zeus:/dev# zpool status
pool: hermes
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sun Jun 9 01:44:36 2013
408M scanned out of 419G at 20,4M/s, 5h50m to go
101M resilvered, 0,10% done
config:
NAME STATE READ WRITE CKSUM
hermes DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
ata-ST3300620A_5QF0MJFP ONLINE 0 0 0
replacing-1 OFFLINE 0 0 0
ata-ST3300831A_5NF0552X OFFLINE 0 0 0
ata-ST3500320AS_9QM03ATQ ONLINE 0 0 0 (resilvering)
ata-ST3200822A_5LJ1CHMS ONLINE 0 0 0
ata-ST3200822A_3LJ0189C ONLINE 0 0 0
errors: No known data errors
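Since the resilver takes hours, it can be convenient to poll for its completion rather than re-running zpool status by hand. A minimal sketch, assuming the scan line keeps the wording "resilver in progress" seen in the output above; the function name resilver_running is invented for this example.

```shell
#!/bin/sh
# resilver_running (hypothetical helper): read `zpool status` text on stdin and
# succeed (exit status 0) only while it reports a resilver in progress.
# Assumes the scan line contains the phrase "resilver in progress".
resilver_running() {
  grep -q 'resilver in progress'
}

# Usage against a live pool: check once a minute until the resilver is done.
#   while zpool status hermes | resilver_running; do sleep 60; done
#   zpool status hermes
```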
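After the resilver completed, everything worked fine. It would be good to mention in the zpool man page that a disk's GUID, as obtained through zdb, can be used with the zpool command.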
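The GUID lookup can also be scripted instead of read off by eye. This is just a sketch under the assumption that each vdev entry in the zdb output lists its guid line before its path line, as in the listing above; the helper name guid_for_path is hypothetical.

```shell
#!/bin/sh
# guid_for_path (hypothetical helper): read plain `zdb` output on stdin and
# print the GUID of the first vdev whose path contains the given substring.
# Assumes every vdev prints its "guid:" line before its "path:" line.
guid_for_path() {
  awk -v want="$1" '
    $1 == "guid:" { guid = $2 }            # remember the most recent GUID
    !found && $1 == "path:" && index($2, want) { print guid; found = 1 }
  '
}

# Usage against the live system (as root):
#   guid=$(zdb | guid_for_path ata-ST3300831A_5NF0552X)
#   zpool offline hermes "$guid"
#   zpool replace hermes "$guid" /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
```

Matching on a substring of the path means the stale by-id name from zpool status can be used directly, even though the device node itself is gone.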
Edit
As durval points out below, the zdb command may not output anything at all. In that case you can try
zdb -l /dev/<name-of-device>
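to explicitly list the information about the device (even if it is already missing from the system).
Second-best solution
The problem is that the disk is referenced by ID rather than by device.
Here is a workaround that should work: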
ln -s /dev/null /dev/ata-ST3300831A_5NF0552X
zpool export hermes
zpool import hermes
zpool status
# note the new device name that should appear here
zpool offline hermes xxxx
zpool replace hermes xxxx /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
Edit: I was 30 seconds late…
Third solution
@Marcus: thanks for posting such an excellent answer to your own question; it helped me a lot.
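A few days ago I ran into a twist that may interest you (and anyone else googling this in the future): I had a cache device dropped from a pool (and marked "UNAVAIL") because of the same error (ZFS-8000-4J, "label is missing or invalid"), and trying to offline/remove/replace it failed with exactly the same "no such device in pool" message.
However, when I tried to apply your solution, plain "zdb" (without arguments) did not list the device, let alone its GUID.
After some digging, I found that "zdb -l /dev/DEVICENAME" lists the GUID (read directly from the device rather than from the pool's records), and using that GUID let me do the replacement (I actually did a "zpool offline" followed by a "zpool remove" and then a "zpool add", which worked fine).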