OpenStack Troubleshooting

Posted by 徐超 on 2015-4-9 23:05:07
Guiding questions:
1. How do you check services in OpenStack?
2. What are the common approaches to resolving OpenStack errors?




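I have been working through some practice exercises over the past few days and ran into a few errors along the way, so I have written them up here for everyone's convenience, and my own.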


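I. Swift Issues
Running the command below fails with the following error. The HTTP 500 status indicates a server-side failure: the service handling the request is either not running or not running correctly.
  # swift list
  Authorization Failure. Authorization Failed: An unexpected error prevented the server from fulfilling your request. (HTTP 500)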

Check the status of one of the swift services:
  # systemctl status openstack-swift-proxy.service
  openstack-swift-proxy.service - OpenStack Object Storage (swift) - Proxy Server
     Loaded: loaded (/usr/lib/systemd/system/openstack-swift-proxy.service; enabled)
     Active: active (running) since Wed 2015-04-08 14:49:13 CST; 7s ago
   Main PID: 27408 (swift-proxy-ser)
     CGroup: /system.slice/openstack-swift-proxy.service
             ├─27408 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27413 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27414 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27415 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27416 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27417 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27418 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             ├─27419 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
             └─27420 /usr/bin/python /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
  Apr 08 14:49:13 server1-a.example.com proxy-server[27419]: Configuring auth_uri to point to the public identity endpoint is require...dpoint
  Apr 08 14:49:13 server1-a.example.com proxy-server[27419]: Using /tmp/keystone-signing-swift as cache directory for signing certificate
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Adding required filter dlo to pipeline at position 0
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Adding required filter gatekeeper to pipeline at position 0
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Adding required filter catch_errors to pipeline at position 0
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Pipeline was modified. New pipeline is "catch_errors gatekeeper dlo heal...roxy".
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Starting keystone auth_token middleware
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Configuring admin URI using auth fragments. This is deprecated, use 'ide...stead.
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Configuring auth_uri to point to the public identity endpoint is require...dpoint
  Apr 08 14:49:13 server1-a.example.com proxy-server[27420]: Using /tmp/keystone-signing-swift as cache directory for signing certificate
  Hint: Some lines were ellipsized, use -l to show in full.
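When several services need checking, the only part of this output that matters at first glance is the Active: line. A minimal Python sketch for pulling that line out of saved `systemctl status` text follows; the function name and the sample text are illustrative only, and the regular expression assumes the "Active: state (sub-state)" layout shown above.

```python
import re

# Matches the "Active:" line printed by `systemctl status`, for example:
#   Active: active (running) since Wed 2015-04-08 14:49:13 CST; 7s ago
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)\s+\(([^)]+)\)", re.MULTILINE)


def parse_active_state(status_text):
    """Return (state, sub_state) from `systemctl status` output, or None."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample text copied from the status output above (first lines only).
SAMPLE_STATUS = """\
openstack-swift-proxy.service - OpenStack Object Storage (swift) - Proxy Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-proxy.service; enabled)
   Active: active (running) since Wed 2015-04-08 14:49:13 CST; 7s ago
"""

print(parse_active_state(SAMPLE_STATUS))
```

As this thread shows, a service reporting active (running) can still sit behind a failing request, so this check only rules out the simplest cause.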

The service itself is running normally, so go straight to the keystone log. It shows that the swift endpoint is misconfigured: the URL template uses the key "tenant-id" (with a hyphen), while the other swift endpoints use "tenant_id".
  # tail -f /var/log/keystone/keystone.log
  2015-04-08 14:47:03.810 27239 TRACE root error: [Errno 98] Address already in use
  2015-04-08 14:47:03.810 27239 TRACE root
  2015-04-08 14:47:03.811 27239 CRITICAL keystone [-] error: [Errno 98] Address already in use
  2015-04-08 14:48:19.559 27143 ERROR keystone.catalog.core [-] Malformed endpoint http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'
  2015-04-08 14:48:19.559 27143 WARNING keystone.common.wsgi [-] An unexpected error prevented the server from fulfilling your request.
  2015-04-08 14:48:48.248 27143 ERROR keystone.catalog.core [-] Malformed endpoint http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'
  2015-04-08 14:48:48.249 27143 WARNING keystone.common.wsgi [-] An unexpected error prevented the server from fulfilling your request.
  2015-04-08 14:49:32.258 27447 WARNING keystone.openstack.common.versionutils [-] Deprecated: keystone.middleware.core.XmlBodyMiddleware is deprecated as of Icehouse in favor of support for "application/json" only and may be removed in K.
  2015-04-08 14:49:48.012 27458 WARNING keystone.openstack.common.versionutils [-] Deprecated: keystone.middleware.core.XmlBodyMiddleware is deprecated as of Icehouse in favor of support for "application/json" only and may be removed in K.
  ^C
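In a longer log the "Malformed endpoint" lines are easy to miss among the warnings. The following is a rough Python sketch for picking them out; the function name is made up for illustration, and the pattern is written against the exact message format shown in the log above, which may differ in other keystone releases.

```python
import re

# Pattern for the keystone.catalog.core error seen in the log above.
MALFORMED_RE = re.compile(
    r"ERROR keystone\.catalog\.core \[-\] Malformed endpoint (\S+)"
    r" - unknown key u?'([^']+)'"
)


def find_malformed_endpoints(log_lines):
    """Return the unique (url, bad_key) pairs reported in the log lines."""
    seen = []
    for line in log_lines:
        match = MALFORMED_RE.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen


# Three lines taken from the keystone log above.
SAMPLE_LOG = [
    "2015-04-08 14:47:03.811 27239 CRITICAL keystone [-] error: "
    "[Errno 98] Address already in use",
    "2015-04-08 14:48:19.559 27143 ERROR keystone.catalog.core [-] Malformed endpoint "
    "http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'",
    "2015-04-08 14:48:48.248 27143 ERROR keystone.catalog.core [-] Malformed endpoint "
    "http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s - unknown key u'tenant-id'",
]

for url, key in find_malformed_endpoints(SAMPLE_LOG):
    print("bad key %r in endpoint %s" % (key, url))
```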

Log in to the MySQL (MariaDB) database and find the incorrect row:
  # mysql -uroot -predhat
  Welcome to the MariaDB monitor.  Commands end with ; or \g.
  Your MariaDB connection id is 15
  Server version: 5.5.37-MariaDB-wsrep MariaDB Server, wsrep_25.10.r3980
  Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  MariaDB [(none)]> show databases;
  +--------------------+
  | Database           |
  +--------------------+
  | information_schema |
  | keystone           |
  | mysql              |
  | performance_schema |
  | test               |
  +--------------------+
  5 rows in set (0.00 sec)
  MariaDB [(none)]> use keystone;
  Reading table information for completion of table and column names
  You can turn off this feature to get a quicker startup with -A
  Database changed
  MariaDB [keystone]> show tables;
  +-----------------------+
  | Tables_in_keystone    |
  +-----------------------+
  | assignment            |
  | credential            |
  | domain                |
  | endpoint              |
  | group                 |
  | migrate_version       |
  | policy                |
  | project               |
  | region                |
  | role                  |
  | service               |
  | token                 |
  | trust                 |
  | trust_role            |
  | user                  |
  | user_group_membership |
  +-----------------------+
  16 rows in set (0.00 sec)
  MariaDB [keystone]> select * from endpoint;
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  | id                               | legacy_endpoint_id               | interface | region    | service_id                       | url                                                     | extra | enabled |
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  | 5207f7d59e3a4b458c4f3af0f1655453 | ca2f88cbd2e94c1389751ef3c7deaf81 | admin     | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s | {}    |       1 |
  | 6abcaec50de449e8af9d052e89393545 | 311dfd01ef7a4cb794f7b6b0ff54812f | internal  | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0                  | {}    |       1 |
  | a59e60335f404583979098c0162e43e5 | 311dfd01ef7a4cb794f7b6b0ff54812f | admin     | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:35357/v2.0                 | {}    |       1 |
  | cfa7bacf373d49418e287d90ee0e595f | ca2f88cbd2e94c1389751ef3c7deaf81 | public    | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {}    |       1 |
  | e58a543fd5b4417487fd5469747f099f | 311dfd01ef7a4cb794f7b6b0ff54812f | public    | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0                  | {}    |       1 |
  | ff9001eafac64c1399a1758dcfb50ea5 | ca2f88cbd2e94c1389751ef3c7deaf81 | internal  | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {}    |       1 |
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  6 rows in set (0.00 sec)

  MariaDB [keystone]> update endpoint set url="http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s" where id='5207f7d59e3a4b458c4f3af0f1655453';
  Query OK, 1 row affected (0.33 sec)
  Rows matched: 1  Changed: 1  Warnings: 0

  MariaDB [keystone]> select * from endpoint;
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  | id                               | legacy_endpoint_id               | interface | region    | service_id                       | url                                                     | extra | enabled |
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  | 5207f7d59e3a4b458c4f3af0f1655453 | ca2f88cbd2e94c1389751ef3c7deaf81 | admin     | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {}    |       1 |
  | 6abcaec50de449e8af9d052e89393545 | 311dfd01ef7a4cb794f7b6b0ff54812f | internal  | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0                  | {}    |       1 |
  | a59e60335f404583979098c0162e43e5 | 311dfd01ef7a4cb794f7b6b0ff54812f | admin     | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:35357/v2.0                 | {}    |       1 |
  | cfa7bacf373d49418e287d90ee0e595f | ca2f88cbd2e94c1389751ef3c7deaf81 | public    | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {}    |       1 |
  | e58a543fd5b4417487fd5469747f099f | 311dfd01ef7a4cb794f7b6b0ff54812f | public    | regionOne | 607cb091a6154bd7bad83835e6e32496 | http://server1-a.example.com:5000/v2.0                  | {}    |       1 |
  | ff9001eafac64c1399a1758dcfb50ea5 | ca2f88cbd2e94c1389751ef3c7deaf81 | internal  | regionOne | d035698982f34af89469a01ab6beb81c | http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s | {}    |       1 |
  +----------------------------------+----------------------------------+-----------+-----------+----------------------------------+---------------------------------------------------------+-------+---------+
  6 rows in set (0.00 sec)
  MariaDB [keystone]> quit
  Bye
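The "unknown key" message is consistent with the URL being expanded by ordinary Python %-style substitution, where a placeholder such as %(tenant-id)s looks up the literal key "tenant-id" in a dictionary. The sketch below illustrates that behaviour and checks a template before it goes into the database. It is not keystone's own code: the function names are hypothetical, and the set of accepted keys is an assumption kept deliberately small.

```python
import re

# Keys this sketch accepts in an endpoint URL template (an assumption,
# not the full list keystone supports).
KNOWN_KEYS = {"tenant_id", "user_id"}

# Finds every %(name)s placeholder in a template.
PLACEHOLDER_RE = re.compile(r"%\(([^)]+)\)s")


def unknown_keys(url_template, known_keys=KNOWN_KEYS):
    """Return placeholder names in the template that are not in known_keys."""
    return [k for k in PLACEHOLDER_RE.findall(url_template) if k not in known_keys]


def render(url_template, values):
    """Expand the template with %-formatting; raises KeyError on a bad key."""
    return url_template % values


BAD_URL = "http://server1-a.example.com:8080/v1/AUTH_%(tenant-id)s"
GOOD_URL = "http://server1-a.example.com:8080/v1/AUTH_%(tenant_id)s"

print(unknown_keys(BAD_URL))    # the hyphenated key is reported
print(unknown_keys(GOOD_URL))   # nothing to report
print(render(GOOD_URL, {"tenant_id": "abc"}))
```

Calling render(BAD_URL, {"tenant_id": "abc"}) raises KeyError for 'tenant-id', which matches the key named in the log message.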

After making the change, restart the services:
  # systemctl restart openstack-keystone
  # systemctl restart mariadb.service
  # systemctl restart openstack-keystone
  # systemctl restart memcached
  # systemctl restart openstack-swift-proxy
  # systemctl restart openstack-swift-container.service
  # systemctl restart openstack-swift-account.service

The problem is solved, and the command now runs without the error:
  # swift list


II. Neutron Issues
Creating a router fails with the following error:
  # neutron router-create router1
  Authentication required

Check the log:
  # tail -f /var/log/keystone/keystone.log
  2015-04-09 09:31:48.966 1444 WARNING keystone.common.wsgi [-] Could not find role, admin.
  2015-04-09 09:31:48.997 1444 WARNING keystone.common.wsgi [-] Could not find project, services.
  2015-04-09 11:47:57.934 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
  2015-04-09 11:47:58.081 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
  2015-04-09 11:48:41.130 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
  2015-04-09 11:48:41.273 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
  2015-04-09 12:06:56.024 1444 WARNING keystone.common.wsgi [-] You are not authorized to perform the requested action, admin_required.
  2015-04-09 12:07:09.367 1444 WARNING keystone.common.wsgi [-] You are not authorized to perform the requested action, admin_required.
  2015-04-09 12:07:54.750 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
  2015-04-09 12:07:54.886 1444 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1

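The keystone log shows nothing wrong with keystone itself, and its status confirms the service is running.
Rerunning the command with --debug revealed more. For various reasons the debug output is not reproduced here, but it showed that the configuration was wrong.
  # neutron --debug router-create router1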

Adding admin_user fixes it (this step had been missed during configuration):
  # vim /etc/neutron/neutron.conf
  [keystone_authtoken]
  admin_user = neutron
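A missing option like this can be caught before restarting anything. Here is a minimal Python sketch using the standard configparser module. The list of required options is an assumption for illustration (only admin_user comes from this post), the function name is hypothetical, and the sample values are placeholders.

```python
import configparser

# Options this sketch expects under [keystone_authtoken]. Only admin_user
# comes from the post; the other two are assumed for illustration.
REQUIRED_OPTIONS = ("admin_user", "admin_password", "admin_tenant_name")


def missing_authtoken_options(conf_text, required=REQUIRED_OPTIONS):
    """Return the required options absent from [keystone_authtoken]."""
    # strict=False tolerates repeated options; interpolation=None leaves
    # any '%' characters in values untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    section = "keystone_authtoken"
    if not parser.has_section(section):
        return list(required)
    return [opt for opt in required if not parser.has_option(section, opt)]


# Placeholder configuration that reproduces the omission described above.
SAMPLE_CONF = """\
[keystone_authtoken]
admin_tenant_name = services
admin_password = secret
"""

print(missing_authtoken_options(SAMPLE_CONF))
```

For a real check, read the text of /etc/neutron/neutron.conf and pass it to the function in place of SAMPLE_CONF.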

Finally, restart the service:
  # systemctl restart neutron-server.service

III. Checking OpenStack Services

Check which OpenStack services are running:

  # openstack-status
  # openstack-service status neutron

Besides using --debug and reading the logs, you can also diagnose a problem by starting a service by hand with --config-file pointing at its configuration files, and watching what it reports.
Taking neutron as an example:
  # ps aux | grep neutron
  # neutron-l3-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini
The command takes the configuration files that belong to the service. The systemd unit files show which configuration files each service uses:
  # ls /usr/lib/systemd/system | grep neutron
  neutron-dhcp-agent.service
  neutron-l3-agent.service
  neutron-lbaas-agent.service
  neutron-metadata-agent.service
  neutron-netns-cleanup.service
  neutron-openvswitch-agent.service
  neutron-ovs-cleanup.service
  neutron-server.service
  # vim /usr/lib/systemd/system/neutron-l3-agent.service
  [Unit]
  Description=OpenStack Neutron Layer 3 Agent
  After=syslog.target network.target
  [Service]
  Type=simple
  User=neutron
  ExecStart=/usr/bin/neutron-l3-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini --config-file /etc/neutron/fwaas_driver.ini --log-file /var/log/neutron/l3-agent.log
  PrivateTmp=false
  KillMode=process
  [Install]
  WantedBy=multi-user.target
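Reading the --config-file arguments off the ExecStart line by eye works for one unit, but it can also be scripted. The sketch below is one possible way to do it in Python, using shlex from the standard library to split the command line; the function name is illustrative, and the sample is the ExecStart line from the unit file above.

```python
import shlex


def config_files_from_unit(unit_text):
    """Return the --config-file paths found on ExecStart lines of a unit file."""
    files = []
    for line in unit_text.splitlines():
        line = line.strip()
        if not line.startswith("ExecStart="):
            continue
        args = shlex.split(line[len("ExecStart="):])
        for i, arg in enumerate(args):
            # Handle both "--config-file PATH" and "--config-file=PATH".
            if arg == "--config-file" and i + 1 < len(args):
                files.append(args[i + 1])
            elif arg.startswith("--config-file="):
                files.append(arg.split("=", 1)[1])
    return files


# Excerpt of neutron-l3-agent.service as shown above.
SAMPLE_UNIT = """\
[Service]
Type=simple
User=neutron
ExecStart=/usr/bin/neutron-l3-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini --config-file /etc/neutron/fwaas_driver.ini --log-file /var/log/neutron/l3-agent.log
PrivateTmp=false
"""

for path in config_files_from_unit(SAMPLE_UNIT):
    print(path)
```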

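IV. Cannot Log In to Horizon

Problem description:

The Horizon dashboard would not accept logins even though the IP address responded to ping. The nova-api and httpd services were both running normally, port 80 was open, and the Horizon logs contained no errors or warnings.

For various reasons, the screenshot of the error shown on the login page is not included here. This one had me puzzled for a while, but before long I suspected a CA certificate problem. (The browser in use was Firefox.)

I tried logging in from another computer, and it worked. It was indeed a CA certificate problem: someone else had previously logged in to the Dashboard from the original browser. Deleting the CA certificate from that browser and restarting it solved the problem.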

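V. Summary
Remember: if the error is a 40X, the problem is usually on the keystone side, so go and read the log, and check the service's endpoint, user, tenant and so on. If it is a 50X error, the service itself is not running properly.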
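The rule of thumb above can be written down as a small lookup, which may be handy inside a larger triage script. This is only a sketch of the heuristic as stated in this post, with a made-up function name; it is not a complete mapping of HTTP status codes.

```python
def first_place_to_look(status_code):
    """Map an HTTP status code to the starting point suggested in this post."""
    if 400 <= status_code < 500:
        return "keystone: read its log and check the service's endpoint, user and tenant"
    if 500 <= status_code < 600:
        return "service: check that the service is running and read its log"
    return "no rule of thumb for this status code"


for code in (401, 500):
    print(code, "->", first_place_to_look(code))
```

The swift case earlier in this post shows the limits of the heuristic: the HTTP 500 there was raised while keystone was handling authentication, so the keystone log was still the right place to look.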


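Likely causes of OpenStack errors
1. There is a 20% chance you mistyped a command; check it carefully.
2. There is a 60% chance you made a mistake in a configuration file; go through it and check.
3. There is a 4% chance you have hit a bug; in that case you can ask the bug community for help.
4. Other ........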


If you have better suggestions or advice on troubleshooting OpenStack, feel free to discuss them here or on the blog.




2 comments

j112929, posted 2015-4-10 12:19:09
Good post, thanks.

lilili, posted 2015-11-7 20:37:06
What if the restart command runs but the service does not actually restart?
