问题描述
Previously I asked about mounting GlusterFS at boot in an Ubuntu 12.04 server,答案是这在12.04中有问题,并在14.04中起作用。很好奇,我尝试了在笔记本电脑上运行的虚拟机,并在14.04中成功了。由于这对我来说至关重要,因此我决定将正在运行的服务器升级到14.04,只是发现GlusterFS也不自动安装本地主机卷。
这是一个Linode服务器,fstab如下所示:
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
/dev/xvda / ext4 noatime,errors=remount-ro 0 1
/dev/xvdb none swap sw 0 0
/dev/xvdc /var/lib/glusterfs/brick01 ext4 defaults 1 2
koraga.int.example.com:/public_uploads /var/www/shared/public/uploads glusterfs defaults,_netdev 0 0
引导过程像这样(在网络安装部分周围,这是唯一的失败):
* Stopping Mount network filesystems [ OK ]
* Starting set sysctls from /etc/sysctl.conf [ OK ]
* Stopping set sysctls from /etc/sysctl.conf [ OK ]
* Starting configure virtual network devices [ OK ]
* Starting Bridge socket events into upstart [ OK ]
* Starting Waiting for state [fail]
* Stopping Waiting for state [ OK ]
* Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
* Starting Waiting for state [fail]
* Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
* Stopping Waiting for state [ OK ]
* Starting Signal sysvinit that remote filesystems are mounted [ OK ]
* Starting GNU Screen Cleanup [ OK ]
我相信日志文件/var/log/glusterfs/var-www-shared-public-uploads.log
包含了解决该问题的主要线索,因为它是唯一一个在安装无法正常工作的服务器与本地虚拟服务器(实际上是安装服务器)之间真正不同的文件:
[2014-07-10 05:51:49.762162] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.1 (/usr/sbin/glusterfs --volfile-server=koraga.int.example.com --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-07-10 05:51:49.774248] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-07-10 05:51:49.774278] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-07-10 05:51:49.775573] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 192.168.134.227:24007 failed (Connection refused)
[2014-07-10 05:51:49.775634] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: koraga.int.example.com (No data available)
[2014-07-10 05:51:49.775649] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2014-07-10 05:51:49.776284] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f6718bf3f83] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7f6718bf7da0] (-->/usr/sbin/glusterfs(+0xcf13) [0x7f67192bbf13]))) 0-: received signum (1), shutting down
[2014-07-10 05:51:49.776314] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.
该卷的状态为:
Volume Name: public_uploads
Type: Distribute
Volume ID: 52aa6d85-f4ea-4c39-a2b3-d20d34ab5916
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: koraga.int.example.com:/var/lib/glusterfs/brick01/public_uploads
Options Reconfigured:
auth.allow: 127.0.0.1,192.168.134.227
client.ssl: off
server.ssl: off
nfs.disable: on
如果在启动后运行mount -a
,则表明卷已正确安装:
koraga.int.example.com:/public_uploads on /var/www/shared/public/uploads type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
几个相关的日志文件显示如下:
/var/log/upstart/mounting-glusterfs-_var_www_shared_public_uploads.log
:
start: Job failed to start
/var/log/upstart/wait-for-state-mounting-glusterfs-_var_www_shared_public_uploadsstatic-network-up.log
:
status: Unknown job: static-network-up
start: Unknown job: static-network-up
但是在我的测试服务器上,它显示的内容完全相同,因此,我认为这无关紧要。
有什么主意吗?
更新:我尝试将WAIT_FOR从static-network-up更改为网络,但仍然无法正常工作,但是启动时所有[fail]消息都消失了。在以下情况下,这些是日志文件的包含:
/var/log/glusterfs/var-www-shared-public-uploads.log
包含:
wait-for-state stop/waiting
/var/log/upstart/wait-for-state-mounting-glusterfs-_var_www_shared_public_uploadsstatic-network-up.log
包含:
start: Job is already running: networking
/var/log/glusterfs/var-www-shared-public-uploads.log
包含:
[2014-07-11 17:19:38.000207] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.1 (/usr/sbin/glusterfs --volfile-server=koraga.int.example.com --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-07-11 17:19:38.029421] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-07-11 17:19:38.029450] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-07-11 17:19:38.030288] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 192.168.134.227:24007 failed (Connection refused)
[2014-07-11 17:19:38.030331] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: koraga.int.example.com (No data available)
[2014-07-11 17:19:38.030345] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2014-07-11 17:19:38.030984] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7fd9495b7f83] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7fd9495bbda0] (-->/usr/sbin/glusterfs(+0xcf13) [0x7fd949c7ff13]))) 0-: received signum (1), shutting down
[2014-07-11 17:19:38.031013] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.
更新2:我还在upstart文件中尝试过此操作:
start on (started glusterfs-server and mounting TYPE=glusterfs)
但是计算机无法启动(不知道为什么)。
最佳回答
我设法通过在该线程与以下线程中的答案组合来完成这项工作:GlusterFS is failing to mount on boot
按照@Dan Pisarski编辑/etc/init/mounting-glusterfs.conf
读取:
exec start wait-for-state WAIT_FOR=networking WAITER=mounting-glusterfs-$MOUNTPOINT
按照@ dialt0ne将/etc/fstab
更改为:
[serverip]:[vol] [mountpoint] glusterfs defaults,nobootwait,_netdev,backupvolfile-server=[backupserverip],direct-io-mode=disable 0 0
在Ubuntu 14.04.2 LTS上适用于我(tm)
次佳回答
我在Ubuntu 12.04上的AWS上遇到了相同的问题。您可以为我做一些事情:
-
在fstab中添加更多fetch-attempts
这将允许您在网络不可用时重试卷文件服务器。
-
在fstab中添加备份volfile服务器
如果主数据库由于某种原因关闭,这将允许您从另一个gluster服务器成员挂载文件系统。
-
在您的fstab中添加
nobootwait
这允许实例在未安装此文件系统的情况下继续引导。
我当前的fstab中的示例条目为:
10.20.30.40:/fs1 /example glusterfs defaults,nobootwait,_netdev,backupvolfile-server=10.20.30.41,fetch-attempts=10 0 2
我尚未在14.04上对此进行测试,但对于我的12.04实例而言,它可以正常工作。