分享

使用openstack遇到的问题

nettman 发表于 2014-3-2 23:42:25 [显示全部楼层] 回帖奖励 阅读模式 关闭右栏 2 36777

1.openstack-nova-compute安装后没有日志输出可能的原因是什么?

2.安装完nova-compute后启动服务:没有初始化数据会报告一个无法查询数据库的错误,该如何解决?

更多问题可查看下面


1.安装完openstack-nova-compute后没有日志输出:

缺少python依赖包,安装依赖包

python-repoze.lru-0.3-1.1.x86_64.rpm


2.安装完nova-compute后启动服务:

此时如果没有初始化数据会报告一个无法查询数据库的错误。

解决方法:

配置nova.conf的nova数据库,并使用nova-manage db sync初始化数据库。


3.配置libvirt和libvirt_type,启动nova-compute,出现问题:

2012-04-13 23:56:24 AUDIT nova.service [-] Starting compute node (version 2012.1-LOCALBRANCH:LOCALREVISION)

2012-04-13 23:56:24 CRITICAL nova [-] [Errno 2] No such file or directory: ‘/usr/lib64/python2.6/site-packages/instances’

解决方法,创建目录:

mkdir -p /usr/lib64/python2.6/site-packages/instances


4.nova-compute启动时出现:

2012-04-15 18:25:18 TRACE nova     raise exception.ClassNotFound(class_name=class_str, exception=exc)

2012-04-15 18:25:18 TRACE nova ClassNotFound: Class API could not be found: No module named glance.common

2012-04-15 18:25:18 TRACE nova

解决方法:(安装缺少的python包)

Installing: python-dateutil-1.5-1.1 [done]

Installing: python-pycrypto-2.5-1.1 [done]

Installing: python-passlib-1.5.3-1.1 [done]

Installing: python-xattr-0.6.2-1.1 [done]

Installing: python-PasteScript-1.7.5-1.1 [done]

Installing: python-python-memcached-1.47-1.1 [done]

Installing: libmysqlclient_r15-5.0.94-0.2.4.1 [done]

Installing: python-ldap-2.3.5-1.21 [done]

Installing: python-mysql-1.2.2-2.12 [done]

Installing: python-keystone-2012.1-1.1 [done]


5.安装完成后启动nova-compute,启动,nova-compute日志

2012-04-15 19:51:06 TRACE nova   File “/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py”, line 330, in do_execute

2012-04-15 19:51:06 TRACE nova     cursor.execute(statement, parameters)

2012-04-15 19:51:06 TRACE nova   File “/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py”, line 166, in execute

2012-04-15 19:51:06 TRACE nova     self.errorhandler(self, exc, value)

2012-04-15 19:51:06 TRACE nova   File “/usr/lib64/python2.6/site-packages/MySQLdb/connections.py”, line 35, in defaulterrorhandler

2012-04-15 19:51:06 TRACE nova     raise errorclass, errorvalue

目前来看nova数据库需要连接后数据库还没初始化。

解决方法,初始化nova数据库:

susesp2:/var/log/nova # nova-manage db sync

2012-04-15 20:02:45 DEBUG nova.utils [-] backend <module ‘nova.db.sqlalchemy.migration’ from ‘/usr/lib64/python2.6/site-packages/nova/db/sqlalchemy/migration.pyc’> from (pid=10941) __get_backend /usr/lib64/python2.6/site-packages/nova/utils.py:658

2012-04-15 20:03:23 WARNING nova.utils [-] /usr/lib64/python2.6/site-packages/nova/db/sqlalchemy/migrate_repo/versions/075_convert_bw_usage_to_store_network_id.py:49: SADeprecationWarning: useexisting is deprecated.  Use extend_existing.

useexisting=True)

2012-04-15 20:03:31 WARNING nova.utils [-] /usr/lib64/python2.6/site-packages/nova/db/sqlalchemy/migrate_repo/versions/081_drop_instance_id_bw_cache.py:40: SADeprecationWarning: useexisting is deprecated.  Use extend_existing.

useexisting=True)


6.libvirt连接错误:

2012-04-15 20:24:08 TRACE nova   File “/usr/lib64/python2.6/site-packages/libvirt.py”, line 2836, in getVersion

2012-04-15 20:24:08 TRACE nova     if ret == -1: raise libvirtError (‘virConnectGetVersion() failed’, conn=self)

2012-04-15 20:24:08 TRACE nova libvirtError: internal error Cannot find suitable emulator for x86_64

解决方法:

Essex默认配置nova.conf的libvirt_type=”xen”默认配置文件中需要有引号,无法读取,解决方式,libvirt_type=xen这样即可。


7.在用户生成证书时报如下错误:

在susesp2:~/key # nova-manage project zipfile –project=mycloud –user=kevin –file=nova.zip

Stderr: “Using configuration from ./openssl.cnf\nerror loading the config file ‘./openssl.cnf’\n15649:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen(‘./openssl.cnf’,'rb’)\n15649:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:\n15649:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:197:\n”

The above error may show that the certificate db has not been created.

Please create a database by running a nova-cert server on this host.

解决方法:

susesp2:~/key # zypper install openstack-nova-cert

susesp2:~/key # /etc/init.d/openstack-nova-cert start

susesp2:~/key # chkconfig openstack-nova-cert on


8.在使用nova查看虚拟实例时出现400错误:

susesp2:~/key # nova image-list

ERROR: n/a (HTTP 400)

解决方法:

susesp2:~ # zypper search nova-api

Loading repository data…

Reading installed packages…

S | Name               | Summary                        | Type

–+——————–+——————————–+——–

| openstack-nova-api | OpenStack Compute API services | package

susesp2:~ # zypper install openstack-nova-api

其它问题引起的http 400错误,novarc环境变量写错,这点很重要:

suse11sp2:~/user # cat novarc

NOVARC=$(readlink -f “${BASH_SOURCE:-${0}}” 2>/dev/null) ||

NOVARC=$(python -c ‘import os,sys; print os.path.abspath(os.path.realpath(sys.argv[1]))’ “${BASH_SOURCE:-${0}}”)

NOVA_KEY_DIR=${NOVARC%/*}

export EC2_ACCESS_KEY=”kevin:mycloud”

export EC2_SECRET_KEY=”f20bb381-9cbf-40a7-a84f-499b815efa19″

export EC2_URL=”http://192.168.1.76:8773/services/Cloud

export S3_URL=”http://192.168.1.76:3333

export EC2_USER_ID=42 # nova does not use user id, but bundling requires it

export EC2_PRIVATE_KEY=${NOVA_KEY_DIR}/pk.pem

export EC2_CERT=${NOVA_KEY_DIR}/cert.pem

export NOVA_CERT=${NOVA_KEY_DIR}/cacert.pem

export EUCALYPTUS_CERT=${NOVA_CERT} # euca-bundle-image seems to require this set

alias ec2-bundle-image=”ec2-bundle-image –cert ${EC2_CERT} –privatekey ${EC2_PRIVATE_KEY} –user 42 –ec2cert ${NOVA_CERT}”

alias ec2-upload-bundle=”ec2-upload-bundle -a ${EC2_ACCESS_KEY} -s ${EC2_SECRET_KEY} –url ${S3_URL} –ec2cert ${NOVA_CERT}”

export NOVA_API_KEY=”kevin”

export NOVA_USERNAME=”kevin”

export NOVA_PROJECT_ID=”mycloud”

export NOVA_URL=”http://192.168.1.76:8774/v1.1/

export NOVA_VERSION=”1.1″


9.在openstack-nova-compute启动时报错:

2012-04-14 00:33:54 TRACE nova     return libvirt.openAuth(uri, auth, 0)

2012-04-14 00:33:54 TRACE nova   File “/usr/lib64/python2.6/site-packages/libvirt.py”, line 102, in openAuth

2012-04-14 00:33:54 TRACE nova     if ret is None:raise libvirtError(‘virConnectOpenAuth() failed’)

2012-04-14 00:33:54 TRACE nova libvirtError: Failed to connect socket to ‘/var/run/libvirt/libvirt-sock’: No such file or directory

2012-04-14 00:33:54 TRACE nova

问题libvirt服务没启动,需要启动libvirt服务。

SUSE SP2上在物理机启动过程中,openstack-nova-compute先于libvirtd启动, 每次重启物理机需要在手动重启openstack-nova-compute。(不知道他们怎么理解的,估计这个是个BUG,嘿嘿)

另外造成上述错误也有可能缺少相关的软件包,安装并重启服务:

suse:/var/log/nova # zypper install avahi

Loading repository data…

Reading installed packages…

Resolving package dependencies…

The following NEW packages are going to be installed:

avahi avahi-lang libavahi-core5 libdaemon0 nss-mdns nss-mdns-32bit


10.启动nova-network时报地址池被占用:

The ‘listeners’ argument to Pool (and create_engine()) is deprecated. Use event.listen().\n Pool.__init__(self, creator, **kw)\n\n2012-04-16 12:50:30 WARNING nova.utils [req-c4afc2fa-361a-4586-93aa-e203bff0937b None None] /usr/lib64/python2.6/site-packages/sqlalchemy/pool.py:145: SADeprecationWarning: Pool.add_listener is deprecated. Use event.listen()\n self.add_listener(l)\n\n\ndnsmasq: failed to create listening socket for 172.16.0.1: Address already in use\n”

解决方法:(这个问题为dnsmasq服务启动,如果再启动会占用原来的进程,多启动了一次)

/etc/init.d/dnsmasq stop

chkconfig dnsmasq off



排查问题使用–debug或者–verbose参数跟踪:

susesp2:~ # nova –debug list

connect: (127.0.0.1, 8774)

send: ‘GET /v1.1 HTTP/1.1\r\nHost: 127.0.0.1:8774\r\nx-auth-project-id: mycloud\r\naccept-encoding: gzip, deflate\r\nx-auth-user: kevin\r\nuser-agent: python-novaclient\r\nx-auth-key: kevin\r\naccept: application/json\r\n\r\n’

reply: ‘HTTP/1.1 204 No Content\r\n’

header: Content-Length: 0

header: X-Auth-Token: kevin:mycloud

header: X-Server-Management-Url: http://127.0.0.1:8774/v1.1/mycloud

header: Content-Type: text/plain; charset=UTF-8

header: Date: Mon, 16 Apr 2012 03:19:47 GMT

send: ‘GET /v1.1/mycloud/servers/detail HTTP/1.1\r\nHost: 127.0.0.1:8774\r\nx-auth-project-id: mycloud\r\nx-auth-token: kevin:mycloud\r\naccept-encoding: gzip, deflate\r\naccept: application/json\r\nuser-agent: python-novaclient\r\n\r\n’

reply: ‘HTTP/1.1 200 OK\r\n’

header: X-Compute-Request-Id: req-753d19f9-7267-410f-8591-f0fccb413cf9

header: Content-Type: application/json

header: Content-Length: 15

header: Date: Mon, 16 Apr 2012 03:19:47 GMT

+—-+——+——–+———-+

| ID | Name | Status | Networks |

+—-+——+——–+———-+

+—-+——+——–+———-+



11.在测试的时候,多次对于网络操作,会引起如下错误:
2012-05-11 17:51:04 TRACE nova.rpc.amqp [u'Traceback (most recent call last):\n', u'  File "/usr/lib64/python2.6/site-packages/nova/rpc/amqp.py", line 252, in _process_data\n    rval = node_func(context=ctxt, **node_args)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 258, in wrapped\n    return func(self, context, *args, **kwargs)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 321, in allocate_for_instance\n    **kwargs)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 258, in wrapped\n    return func(self, context, *args, **kwargs)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 907, in allocate_for_instance\n    requested_networks=requested_networks)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 196, in _allocate_fixed_ips\n    utils.to_primitive(network)}})\n', u'  File "/usr/lib64/python2.6/site-packages/nova/rpc/__init__.py", line 68, in call\n    return _get_impl().call(context, topic, msg, timeout)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/rpc/impl_kombu.py", line 674, in call\n    return rpc_amqp.call(context, topic, msg, timeout, Connection.pool)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/rpc/amqp.py", line 338, in call\n    rv = list(rv)\n', u'  File "/usr/lib64/python2.6/site-packages/nova/rpc/amqp.py", line 306, in __iter__\n    raise result\n', u'RemoteError: Remote error: NetworkNotFound Network 4 could not be found.\n[u\'Traceback (most recent call last):\\n\', u\'  File "/usr/lib64/python2.6/site-packages/nova/rpc/amqp.py", line 252, in _process_data\\n    rval = node_func(context=ctxt, **node_args)\\n\', u\'  File "/usr/lib64/python2.6/site-packages/nova/network/manager.py", line 785, in set_network_host\\n    self.host)\\n\', u\'  File "/usr/lib64/python2.6/site-packages/nova/db/api.py", line 818, in network_set_host\\n    return IMPL.network_set_host(context, network_id, host_id)\\n\', u\'  File "/usr/lib64/python2.6/site-packages/nova/db/sqlalchemy/api.py", line 102, in wrapper\\n    return f(*args, **kwargs)\\n\', u\'  File "/usr/lib64/python2.6/site-packages/nova/db/sqlalchemy/api.py", line 2110, in network_set_host\\n    raise exception.NetworkNotFound(network_id=network_id)\\n\', u\'NetworkNotFound: Network 4 could not be found.\\n\'].\n’].

解决方法:
drop database nova;
create database nova;
重新初始化数据库:
nova-manage db sync



尚未解决的问题:(路过的大侠指点一下),在key注入时失败:

2012-04-19 16:46:44 DEBUG nova.utils [req-a9beb228-97cb-4854-b6a6-e16009f10f94 kevin mycloud] Running cmd (subprocess): sudo kpartx -d /dev/loop3 from (pid=24116) execute /usr/lib64/python2.6/site-packages/nova/utils.py:219

2012-04-19 16:46:44 DEBUG nova.utils [req-a9beb228-97cb-4854-b6a6-e16009f10f94 kevin mycloud] Running cmd (subprocess): sudo losetup –detach /dev/loop3 from (pid=24116) execute /usr/lib64/python2.6/site-packages/nova/utils.py:219

2012-04-19 16:46:45 WARNING nova.virt.libvirt.connection [req-a9beb228-97cb-4854-b6a6-e16009f10f94 kevin mycloud] [instance: 07525848-d3d6-4545-866a-cd3da516182d] Ignoring error injecting data into image a0c337e6-0716-46db-9e33-078321398bf3 (Unexpected error while running command.

Command: sudo mkdir -p /tmp/tmpGTSXMs/root/.ssh

Exit code: 1

Stdout: ”

Stderr: “mkdir: cannot create directory `/tmp/tmpGTSXMs/root’: Read-only file system\n”)

2012-04-19 16:46:52 DEBUG nova.virt.libvirt.connection [req-a9beb228-97cb-4854-b6a6-e16009f10f94 kevin mycloud]

对于在SUSE SP2上安装openstack之前建议先测试python版本是否满足openstack Essex需要:

test script:

## vim: set syn=on ts=4 sw=4 sts=0 noet foldmethod=indent:

## purpose: check if python interpreter contains blocking issues for OpenStack

## copyright: B1 Systems GmbH <info@b1-systems.de>, 2012.

## license: GPLv3+, http://www.gnu.org/licenses/gpl-3.0.html

## author: Christian Berendt <berendt@b1-systems.de>, 2012.

print “test if keyword arguments with unicode keys are working without problems”

print “expected result: {u’key’: ‘value’}”

print

def testing(**kwargs):

print kwargs

try:

testing(**{u’key’: ‘value’})

except TypeError:

print “ERROR: catched TypeError, keyword arguments with unicode keys are NOT working”

print

print “test if ident of thread is working without problems”

print “expected result: a long integer (for example: 140630265239296)”

print

import threading

t = threading.current_thread()

ident = t.ident

if ident == None:

print “ERROR: ident of current thread is None, ident of threads is NOT working”

else:

print ident

如果输出为这段代码,基本python解析器满足要求,不会有大问题,不然在对应xml解析等问题上还是有很多问题:

print “test if ident of thread is working without problems”

print “expected result: a long integer (for example: 140630265239296)”



说明针对XEN镜像注入网络配置文件和key的问题,目前来看由于XEN的镜像不支持注入,需要修改openstack代码,目前还在研究代码,杯具,还不会改,求有经验者指导!!!


遇到问题追加及补充



补充问题

1.rabbitmq 停掉以后,compute会退出
当rabbitmq 停掉以后,过两分钟左右,compute会自动退出,日志中出现:
2012-03-25 21:41:26 INFO nova.rpc.common [-] Reconnecting to AMQP server on 192.168.28.5:5672
2012-03-25 21:41:27 ERROR nova.rpc.common [-] AMQP server on 192.168.28.5:5672 is unreachable: [Errno 113] EHOSTUNREACH. Trying again in 7 seconds.
(nova.rpc.common): TRACE: Traceback (most recent call last):
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py”, line 446, in reconnect
(nova.rpc.common): TRACE: self._connect()
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py”, line 423, in _connect
(nova.rpc.common): TRACE: self.connection.connect()
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/kombu/connection.py”, line 118, in connect
(nova.rpc.common): TRACE: return self.connection
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/kombu/connection.py”, line 438, in connection
(nova.rpc.common): TRACE: self._connection = self._establish_connection()
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/kombu/connection.py”, line 404, in _establish_connection
(nova.rpc.common): TRACE: conn = self.transport.establish_connection()
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py”, line 242, in establish_connection
(nova.rpc.common): TRACE: connect_timeout=conninfo.connect_timeout)
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py”, line 51, in __init__
(nova.rpc.common): TRACE: super(Connection, self).__init__(*args, **kwargs)
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/amqplib/client_0_8/connection.py”, line 125, in __init__
(nova.rpc.common): TRACE: self.transport = create_transport(host, connect_timeout, ssl)
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/amqplib/client_0_8/transport.py”, line 220, in create_transport
(nova.rpc.common): TRACE: return TCPTransport(host, connect_timeout)
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/amqplib/client_0_8/transport.py”, line 58, in __init__
(nova.rpc.common): TRACE: self.sock.connect((host, port))
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/eventlet/greenio.py”, line 179, in connect
(nova.rpc.common): TRACE: socket_checkerr(fd)
(nova.rpc.common): TRACE: File “/usr/lib/python2.6/site-packages/eventlet/greenio.py”, line 43, in socket_checkerr
(nova.rpc.common): TRACE: raise socket.error(err, errno.errorcode[err])
(nova.rpc.common): TRACE: error: [Errno 113] EHOSTUNREACH
(nova.rpc.common): TRACE:

这个问题,是由于openstack中,对rabbitmq 如果失去连接,会进行尝试,缺省是尝试12次,每次间隔10秒,到时间还不能连接,就抛出错误,退出。解决办法,在 nova.conf 加入下面的参数:
#防止 rabbitmq重启导致 compute 死掉
rabbit_max_retries=0

具体原因,可以参见代码:impl_kombu.py
self.max_retries = FLAGS.rabbit_max_retries
def reconnect(self):
“”"Handles reconnecting and re-establishing queues.
Will retry up to self.max_retries number of times.
self.max_retries = 0 means to retry forever.
Sleep between tries, starting at self.interval_start
seconds, backing off self.interval_stepping number of seconds
each attempt.
“”"
if self.max_retries and attempt == self.max_retries:
LOG.exception(_(‘Unable to connect to AMQP server on ‘
‘%(hostname)s:%(port)d after %(max_retries)d ‘
‘tries: %(err_str)s’) % log_info)
# NOTE(comstud): Copied from original code. There’s
# really no better recourse because if this was a queue we
# need to consume on, we have no way to consume anymore.
sys.exit(1)


加微信w3aboutyun,可拉入技术爱好者群

已有(2)人评论

跳转到指定楼层
zw2002 发表于 2015-4-24 22:10:29
前辈,学习了。
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

关闭

推荐上一条 /2 下一条