OpenStack Troubleshooting
Guiding questions: 1. How do you check OpenStack services?
2. What are the common approaches to resolving OpenStack errors?
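I have been doing some practice exercises over the past few days and ran into a number of errors along the way, so I am writing them up here for everyone's reference as well as my own.
I. Swift issue
Running the step below produced the following error. It reports HTTP 500, a server-side error: the service handling the request (here, the authorization step) is either not running or not working properly.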
#swift list
Authorization Failure. Authorization Failed: An unexpected error prevented the server from fulfilling your request. (HTTP 500)
Check one of the Swift services:
#systemctl status openstack-swift-proxy.service
openstack-swift-proxy.service - OpenStack Object Storage (swift) - Proxy Server
Loaded: loaded (/usr/lib/systemd/system/openstack-swift-proxy.service; enabled)
Active: active (running) since Wed 2015-04-08 14:49:13 CST; 7s ago
Main PID: 27408 (swift-proxy-ser)
CGroup: /system.slice/openstack-swift-proxy.service
├─27408 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27413 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27414 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27415 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27416 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27417 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27418 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
├─27419 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
└─27420 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
Apr 08 14:49:13 server1-a.example.com proxy-server: Configuring auth_uri to point to the public identity endpoint is require...dpoint
Apr 08 14:49:13 server1-a.example.com proxy-server: Using /tmp/keystone-signing-swift as cache directory for signing certificate
Apr 08 14:49:13 server1-a.example.com proxy-server: Adding required filter dlo to pipeline at position 0
Apr 08 14:49:13 server1-a.example.com proxy-server: Adding required filter gatekeeper to pipeline at position 0
Apr 08 14:49:13 server1-a.example.com proxy-server: Adding required filter catch_errors to pipeline at position 0
Apr 08 14:49:13 server1-a.example.com proxy-server: Pipeline was modified. New pipeline is "catch_errors gatekeeper dlo heal...roxy".
Apr 08 14:49:13 server1-a.example.com proxy-server: Starting keystone auth_token middleware
Apr 08 14:49:13 server1-a.example.com proxy-server: Configuring admin URI using auth fragments. This is deprecated, use 'ide...stead.
Apr 08 14:49:13 server1-a.example.com proxy-server: Configuring auth_uri to point to the public identity endpoint is require...dpoint
Apr 08 14:49:13 server1-a.example.com proxy-server: Using /tmp/keystone-signing-swift as cache directory for signing certificate
Hint: Some lines were ellipsized, use -l to show in full.
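OK, the service status looks fine, so let's go straight to the Keystone log.
The Keystone log shows that the endpoint for the Swift service is misconfigured: the URL contains "%(tenant-id)s" where it should contain "%(tenant_id)s".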
#tail -f /var/log/keystone/keystone.log
2015-04-08 14:47:03.810 27239 TRACE root error: Address already in use
2015-04-08 14:47:03.810 27239 TRACE root
2015-04-08 14:47:03.811 27239 CRITICAL keystone [-] error: Address already in use
2015-04-08 14:48:19.559 27143 ERROR keystone.catalog.core [-] Malformed endpoint http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'
2015-04-08 14:48:19.559 27143 WARNING keystone.common.wsgi [-] An unexpected error prevented the server from fulfilling your request.
2015-04-08 14:48:48.248 27143 ERROR keystone.catalog.core [-] Malformed endpoint http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'
2015-04-08 14:48:48.249 27143 WARNING keystone.common.wsgi [-] An unexpected error prevented the server from fulfilling your request.
2015-04-08 14:49:32.258 27447 WARNING keystone.openstack.common.versionutils [-] Deprecated: keystone.middleware.core.XmlBodyMiddleware is deprecated as of Icehouse in favor of support for "application/json" only and may be removed in K.
2015-04-08 14:49:48.012 27458 WARNING keystone.openstack.common.versionutils [-] Deprecated: keystone.middleware.core.XmlBodyMiddleware is deprecated as of Icehouse in favor of support for "application/json" only and may be removed in K.
^C
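The "unknown key u'tenant-id'" message is consistent with an ordinary Python string substitution failing on a misspelled placeholder. The sketch below reproduces that mechanism in isolation; it is not Keystone's actual code, and the helper name render_endpoint and the tenant ID value are made up for illustration.

```python
# Sketch of how a templated endpoint URL fails when a placeholder key is
# misspelled. This is NOT Keystone's code; render_endpoint and the tenant ID
# below are hypothetical and exist only to illustrate the mechanism.

def render_endpoint(template, values):
    """Fill %(key)s placeholders; return (url, None) or (None, error message)."""
    try:
        return template % values, None
    except KeyError as exc:
        # The placeholder name is not among the substitution values.
        return None, "unknown key %s" % exc


values = {"tenant_id": "0123456789abcdef"}  # hypothetical tenant ID

bad = "http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s"
good = "http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s"

print(render_endpoint(bad, values))
print(render_endpoint(good, values))
```

Only the URL with the underscore can be filled in; the hyphenated key has no matching value, so the substitution fails before any request reaches Swift.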
Log in to the MySQL (MariaDB) database and correct the data:
#mysql -uroot -predhat
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 15
Server version: 5.5.37-MariaDB-wsrep MariaDB Server, wsrep_25.10.r3980
Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| keystone |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.00 sec)
MariaDB [(none)]> use keystone;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [keystone]> show tables;
+-----------------------+
| Tables_in_keystone |
+-----------------------+
| assignment |
| credential |
| domain |
| endpoint |
| group |
| migrate_version |
| policy |
| project |
| region |
| role |
| service |
| token |
| trust |
| trust_role |
| user |
| user_group_membership |
+-----------------------+
16 rows in set (0.00 sec)
MariaDB [keystone]> select * from endpoint;
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
| id | legacy_endpoint_id | interface | region | service_id | url | extra | enabled |
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
| 5207f7d59e3a4b458c4f3af0f1655453 | ca2f88cbd2e94c1389751ef3c7deaf81 | admin | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s | {} | 1 |
| 6abcaec50de449e8af9d052e89393545 | 311dfd01ef7a4cb794f7b6b0ff54812f | internal| regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0 | {} | 1 |
| a59e60335f404583979098c0162e43e5 | 311dfd01ef7a4cb794f7b6b0ff54812f | admin | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:35357/v2.0 | {} | 1 |
| cfa7bacf373d49418e287d90ee0e595f | ca2f88cbd2e94c1389751ef3c7deaf81 | public | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {} | 1 |
| e58a543fd5b4417487fd5469747f099f | 311dfd01ef7a4cb794f7b6b0ff54812f | public | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0 | {} | 1 |
| ff9001eafac64c1399a1758dcfb50ea5 | ca2f88cbd2e94c1389751ef3c7deaf81 | internal| regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {} | 1 |
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
6 rows in set (0.00 sec)
MariaDB [keystone]> update endpoint set url="http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s" where id='5207f7d59e3a4b458c4f3af0f1655453';
Query OK, 1 row affected (0.33 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB [keystone]> select * from endpoint;
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
| id | legacy_endpoint_id | interface | region | service_id | url | extra | enabled |
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
| 5207f7d59e3a4b458c4f3af0f1655453 | ca2f88cbd2e94c1389751ef3c7deaf81 | admin | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {} | 1 |
| 6abcaec50de449e8af9d052e89393545 | 311dfd01ef7a4cb794f7b6b0ff54812f | internal| regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0 | {} | 1 |
| a59e60335f404583979098c0162e43e5 | 311dfd01ef7a4cb794f7b6b0ff54812f | admin | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:35357/v2.0 | {} | 1 |
| cfa7bacf373d49418e287d90ee0e595f | ca2f88cbd2e94c1389751ef3c7deaf81 | public | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {} | 1 |
| e58a543fd5b4417487fd5469747f099f | 311dfd01ef7a4cb794f7b6b0ff54812f | public | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0 | {} | 1 |
| ff9001eafac64c1399a1758dcfb50ea5 | ca2f88cbd2e94c1389751ef3c7deaf81 | internal| regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {} | 1 |
+----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
6 rows in set (0.00 sec)
MariaDB [keystone]> quit
Bye
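With more than a handful of endpoints, spotting one bad placeholder by eye is error-prone. The following is a minimal sketch that flags endpoint URLs whose %(...)s placeholders are not in an expected set. The rows are copied from the table above; the ALLOWED_KEYS set and the function name unknown_keys are assumptions for illustration, so adjust the set to whatever your release actually accepts.

```python
import re

# Assumption: only these substitution keys are expected in endpoint URLs.
# This set is illustrative, not an authoritative list for any release.
ALLOWED_KEYS = {"tenant_id", "user_id"}

# Matches %(name)s and captures the name between the parentheses.
PLACEHOLDER = re.compile(r"%\(([^)]*)\)s")


def unknown_keys(url, allowed=ALLOWED_KEYS):
    """Return the placeholder names in url that are not in the allowed set."""
    return [key for key in PLACEHOLDER.findall(url) if key not in allowed]


# (id, url) pairs taken from the endpoint table shown above.
endpoints = [
    ("5207f7d59e3a4b458c4f3af0f1655453",
     "http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s"),
    ("cfa7bacf373d49418e287d90ee0e595f",
     "http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s"),
    ("e58a543fd5b4417487fd5469747f099f",
     "http://server1-a.example.com:5000/v2.0"),
]

# Map each suspect endpoint id to its unexpected placeholder names.
suspect = {}
for endpoint_id, url in endpoints:
    bad = unknown_keys(url)
    if bad:
        suspect[endpoint_id] = bad

print(suspect)
```

Run against the pre-fix data, only the admin endpoint of the Swift service is flagged, which is the row the UPDATE statement corrects.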
After making the change, restart the services:
#systemctl restart openstack-keystone
#systemctl restart mariadb.service
#systemctl restart openstack-keystone
#systemctl restart memcached
#systemctl restart openstack-swift-proxy
#systemctl restart openstack-swift-container.service
#systemctl restart openstack-swift-account.service
OK, problem solved:
#swift list
II. Neutron issue
Creating a router produced the following error:
#neutron router-create router1
Authentication required
Check the log:
#tail -f /var/log/keystone/keystone.log
2015-04-09 09:31:48.966 1444 WARNING keystone.common.wsgi [-] Could not find role, admin.
2015-04-09 09:31:48.997 1444 WARNING keystone.common.wsgi [-] Could not find project, services.
2015-04-09 11:47:57.934 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
2015-04-09 11:47:58.081 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
2015-04-09 11:48:41.130 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
2015-04-09 11:48:41.273 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
2015-04-09 12:06:56.024 1444 WARNING keystone.common.wsgi [-] You are not authorized to perform the requested action, admin_required.
2015-04-09 12:07:09.367 1444 WARNING keystone.common.wsgi [-] You are not authorized to perform the requested action, admin_required.
2015-04-09 12:07:54.750 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
2015-04-09 12:07:54.886 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
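The Keystone log shows no fault in Keystone itself, and its status confirms the service is running.
Running the command with --debug revealed the problem. For various reasons the debug output is not reproduced here, but it showed that the configuration was wrong.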
#neutron --debug router-create router1
Adding admin_user fixed it (this step had been missed during configuration):
#vim /etc/neutron/neutron.conf
admin_user = neutron
Finally, restart the service:
#systemctl restart neutron-server.service
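A missing option like this is easy to overlook in a long configuration file. Here is a rough sketch of a check for it, under stated assumptions: that the auth options sit in a [keystone_authtoken] section, and that admin_password and admin_tenant_name are also worth checking. The source only confirms admin_user; the section name, the other two keys and the function name missing_options are illustrative guesses.

```python
import configparser

# Assumption: these are the auth options to look for. Only admin_user is
# confirmed by the text above; the other two are illustrative.
REQUIRED = ("admin_user", "admin_password", "admin_tenant_name")


def missing_options(conf_text, section="keystone_authtoken", required=REQUIRED):
    """Return the required option names that are absent from the section."""
    # interpolation=None keeps literal % characters in values from raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return list(required)
    return [key for key in required if not parser.has_option(section, key)]


# A made-up fragment standing in for /etc/neutron/neutron.conf.
sample = """
[DEFAULT]
verbose = True

[keystone_authtoken]
admin_tenant_name = services
admin_password = secret
"""

print(missing_options(sample))
```

To check a real file, read it first, for example with open("/etc/neutron/neutron.conf").read(), and pass the text to missing_options.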
III. Checking OpenStack services
To see which OpenStack services are running:
# openstack-status
# openstack-service neutron
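Besides using --debug and reading the logs, you can also diagnose a problem by starting a service by hand with --config-file pointing at its configuration files.
Taking neutron as an example:
# ps aux | grep neutron
# neutron-l3-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini
(Pass each of the service's configuration files with --config-file. The service's systemd unit file lists which configuration files it uses, as follows.)
# ls /usr/lib/systemd/system | grep neutron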
neutron-dhcp-agent.service
neutron-l3-agent.service
neutron-lbaas-agent.service
neutron-metadata-agent.service
neutron-netns-cleanup.service
neutron-openvswitch-agent.service
neutron-ovs-cleanup.service
neutron-server.service
#vim /usr/lib/systemd/system/neutron-l3-agent.service
[Unit]
Description=OpenStack Neutron Layer 3 Agent
After=syslog.target network.target
[Service]
Type=simple
User=neutron
ExecStart=/usr/bin/neutron-l3-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini --config-file /etc/neutron/fwaas_driver.ini --log-file /var/log/neutron/l3-agent.log
PrivateTmp=false
KillMode=process
[Install]
WantedBy=multi-user.target
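Reading the configuration files off the ExecStart line by eye works, but it can also be scripted. A minimal sketch follows, using the ExecStart line from the unit file above; the function name config_files is hypothetical.

```python
import shlex


def config_files(exec_start):
    """Return the paths given via --config-file in a command line string."""
    tokens = shlex.split(exec_start)
    files = []
    for index, token in enumerate(tokens):
        if token == "--config-file" and index + 1 < len(tokens):
            # Form: --config-file PATH
            files.append(tokens[index + 1])
        elif token.startswith("--config-file="):
            # Form: --config-file=PATH
            files.append(token.split("=", 1)[1])
    return files


# The ExecStart line copied from neutron-l3-agent.service above.
line = (
    "ExecStart=/usr/bin/neutron-l3-agent "
    "--config-file /usr/share/neutron/neutron-dist.conf "
    "--config-file /etc/neutron/neutron.conf "
    "--config-file /etc/neutron/l3_agent.ini "
    "--config-file /etc/neutron/fwaas_driver.ini "
    "--log-file /var/log/neutron/l3-agent.log"
)

# Drop the "ExecStart=" prefix, then collect the configuration files.
exec_start = line.split("=", 1)[1]
for path in config_files(exec_start):
    print(path)
```

The resulting list is what you would pass, one --config-file per path, when starting the agent by hand.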
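IV. Cannot log in to the Horizon dashboard
Problem description:
Logging in to the Horizon dashboard fails, although the IP address responds to ping. The Nova API service and the httpd service are both running normally, port 80 is open, and the Horizon logs contain no errors or warnings.
For various reasons, the screenshot of the error on the login page is not included here. This one had me puzzled for a while, but before long I thought of the CA certificate. (The browser was Firefox.)
So I tried logging in from another computer, and it worked. It was indeed a CA certificate problem: someone else had previously logged in to the Dashboard from that browser. Deleting the CA certificate from the browser and restarting it solved the problem.
V. Summary
Remember: if the error is a 40X, Keystone is usually the problem, so go read its log and check the endpoint, user, tenant and so on for the service in question. If it is a 50X error, the service itself is not running properly.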
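This rule of thumb can be written down as a small triage helper. The sketch below pulls the status code out of a client error message such as the one in section I and maps it to a first place to look. The function names and the wording of the hints are made up for illustration, and the rule itself is a heuristic rather than a guarantee.

```python
import re

# Matches the "(HTTP 500)" suffix that the command-line clients print.
STATUS = re.compile(r"\(HTTP (\d{3})\)")


def http_status(message):
    """Return the HTTP status code found in a client error message, or None."""
    match = STATUS.search(message)
    return int(match.group(1)) if match else None


def first_place_to_look(status):
    """Map a status code to a starting point, following the rule of thumb."""
    if status is None:
        return "no status code found: rerun the command with --debug"
    if 400 <= status < 500:
        return "keystone: read its log, check endpoint, user and tenant"
    if 500 <= status < 600:
        return "service: check that it is running and read its log"
    return "no rule of thumb for this status"


# Error message copied from the Swift example in section I.
message = ("Authorization Failure. Authorization Failed: An unexpected error "
           "prevented the server from fulfilling your request. (HTTP 500)")

print(first_place_to_look(http_status(message)))
```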
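Likely causes of OpenStack errors
1. About 20% of the time you mistyped a command, so check it carefully.
2. About 60% of the time you made a mistake in a configuration file, so check it for errors.
3. About 4% of the time you have hit a bug, and you can turn to the bug community for help.
4. Other causes.
If you have better suggestions or advice on troubleshooting OpenStack errors, feel free to discuss them here or on the blog.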
Good post, thanks.
What if the restart command runs but the service still does not restart?