desehawk posted on 2016-6-23 18:42
---- Found some problems in the log; it seems to be because the qdhcp namespace does not exist ---
2016-06-23 15:21:39.505 2876 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-23 15:21:39.535 2876 INFO neutron.agent.dhcp.agent [req-572ca88c-99d5-4ae9-8c06-86b62ee54745 ] Synchronizing state
2016-06-23 15:21:39.540 2876 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-23 15:21:39.586 2876 INFO neutron.agent.dhcp.agent [-] DHCP agent started
2016-06-23 15:21:40.584 2876 ERROR neutron.agent.linux.utils [req-572ca88c-99d5-4ae9-8c06-86b62ee54745 ]
Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'qdhcp-60cf7464-09c8-4c4a-ba8d-cd3004970bd6']
Exit code: 1
Stdin:
Stdout:
Stderr: Cannot remove namespace file "/var/run/netns/qdhcp-60cf7464-09c8-4c4a-ba8d-cd3004970bd6": No such file or directory
2016-06-23 15:21:40.584 2876 WARNING neutron.agent.linux.dhcp [req-572ca88c-99d5-4ae9-8c06-86b62ee54745 ] Failed trying to delete namespace: qdhcp-60cf7464-09c8-4c4a-ba8d-cd3004970bd6
2016-06-23 15:21:40.585 2876 INFO neutron.agent.dhcp.agent [req-572ca88c-99d5-4ae9-8c06-86b62ee54745 ] Synchronizing state complete
2016-06-23 15:21:45.586 2876 INFO neutron.agent.dhcp.agent [-] Synchronizing state
2016-06-23 15:21:45.693 2876 INFO neutron.agent.dhcp.agent [-] Synchronizing state complete
2016-06-23 16:50:22.062 2833 INFO neutron.common.config [-] Logging enabled!
2016-06-23 16:50:22.063 2833 INFO neutron.common.config [-] /usr/bin/neutron-dhcp-agent version 2015.1.2
2016-06-23 16:50:22.100 2833 WARNING oslo_config.cfg [req-606d77c7-5a86-4edc-b03e-2440cb627da1 ] Option "lock_path" from group "DEFAULT" is deprecated. Use option "lock_path" from group "oslo_concurrency".
2016-06-23 16:50:22.106 2833 INFO oslo_messaging._drivers.impl_rabbit [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Connecting to AMQP server on controller0:5672
2016-06-23 16:50:22.114 2833 INFO neutron.agent.dhcp.agent [-] Synchronizing state
2016-06-23 16:50:22.135 2833 INFO oslo_messaging._drivers.impl_rabbit [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Connected to AMQP server on controller0:5672
2016-06-23 16:50:22.155 2833 INFO oslo_messaging._drivers.impl_rabbit [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Connecting to AMQP server on controller0:5672
2016-06-23 16:50:22.168 2833 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-23 16:50:22.208 2833 INFO oslo_messaging._drivers.impl_rabbit [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Connected to AMQP server on controller0:5672
2016-06-23 16:50:22.210 2833 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-23 16:50:22.370 2833 INFO neutron.agent.dhcp.agent [-] Synchronizing state complete
2016-06-23 16:50:22.402 2833 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-23 16:50:22.421 2833 INFO neutron.agent.dhcp.agent [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Synchronizing state
2016-06-23 16:50:22.433 2833 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-23 16:50:22.463 2833 INFO neutron.agent.dhcp.agent [-] DHCP agent started
2016-06-23 16:50:22.521 2833 INFO neutron.agent.dhcp.agent [req-d6ceb7b0-dd30-45aa-bcb6-32e3d6f04be8 ] Synchronizing state complete
2016-06-23 16:51:49.764 2833 INFO neutron.openstack.common.service [req-606d77c7-5a86-4edc-b03e-2440cb627da1 ] Caught SIGTERM, exiting
2016-06-23 16:51:49.819 2833 ERROR oslo_messaging._drivers.impl_rabbit [-] Failed to consume message from queue:
2016-06-23 16:51:50.416 3288 INFO neutron.common.config [-] Logging enabled!
2016-06-23 16:51:50.416 3288 INFO neutron.common.config [-] /usr/bin/neutron-dhcp-agent version 2015.1.2
2016-06-23 16:51:50.430 3288 WARNING oslo_config.cfg [req-98f04cfd-f5db-4ff7-957d-9b436ac4a7ad ] Option "lock_path" from group "DEFAULT" is deprecated. Use option "lock_path" from group "oslo_concurrency". |
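That is a bit odd. Check the other logs; you need to find the actual error. For this kind of hung state, where the service looks alive but is not responding, a restart should fix it. |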
openask posted on 2016-6-23 16:32
---- Googled it ----
When the agents first boot up, they are out of sync, and that is normal behaviour. They then synchronize, but no message is written back to the logs. So when an agent is restarted, seeing the following messages is expected behaviour:
Agent tunnel out of sync with plugin!
Agent out of sync with plugin!
-------- A possible lead found via Google --------
In other words, the agents report to neutron-server periodically; if neutron-server has not received a report from an agent within 75s, that agent is shown as xxx. Starting from this point, I suggest checking the scheduled tasks.
Agents report their own status to neutron-server periodically. The default timeout is 75 seconds. If neutron-server does not receive a report within 75 seconds, the alive field of the agent will be xxx. It changes back to :-) after a new status report is received.
Which translates into: if this happens all of the time and is causing issues with scheduling, then you should look into the load on the servers where the agents are running, look into the logs of the agents, and see if there are any issues with scheduled tasks. |
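The liveness rule described above can be sketched in a few lines. This is only an illustration of the idea, not Neutron's actual code: the function name `alive_marker` and the constant `AGENT_DOWN_TIME` are made up for this example, and the 75-second threshold is taken from the post.

```python
from datetime import datetime, timedelta

# Threshold quoted in the post above; treated here as an assumption.
AGENT_DOWN_TIME = timedelta(seconds=75)


def alive_marker(last_report, now, down_time=AGENT_DOWN_TIME):
    """Return ':-)' if the last status report is recent enough, else 'xxx'."""
    return ":-)" if now - last_report <= down_time else "xxx"


# Hypothetical timestamps, chosen only to exercise both branches.
last_report = datetime(2016, 6, 22, 19, 39, 36)
print(alive_marker(last_report, last_report + timedelta(seconds=30)))
print(alive_marker(last_report, last_report + timedelta(seconds=120)))
```

Under this rule an agent whose process is running can still show xxx, as long as its reports are not reaching neutron-server.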
This post was last edited by openask on 2016-6-23 18:10
One more addition: on the compute node I stopped a service and started it again, then checked the log and found these messages in it:
Agent out of sync with plugin!
Agent tunnel out of sync with plugin!
---------------------- Stopping the service and starting it again ----------------
systemctl stop neutron-openvswitch-agent.service
[root@compute1 ~]# systemctl status neutron-openvswitch-agent.service
● neutron-openvswitch-agent.service - OpenStack Neutron Open vSwitch Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2016-06-22 22:00:18 EDT; 1min 34s ago
[root@compute1 ~]# systemctl start neutron-openvswitch-agent.service
[root@compute1 ~]# systemctl status neutron-openvswitch-agent.service
● neutron-openvswitch-agent.service - OpenStack Neutron Open vSwitch Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 22:02:05 EDT; 38s ago
---------------log-------------------
[root@compute1 neutron]# tail -f openvswitch-agent.log
2016-06-22 22:02:05.933 17125 INFO neutron.common.config [-] Logging enabled!
2016-06-22 22:02:05.934 17125 INFO neutron.common.config [-] /usr/bin/neutron-openvswitch-agent version 2015.1.2
2016-06-22 22:02:05.943 17125 WARNING oslo_config.cfg [-] Option "lock_path" from group "DEFAULT" is deprecated. Use option "lock_path" from group "oslo_concurrency".
2016-06-22 22:02:06.952 17125 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.015 17125 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.036 17125 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.075 17125 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.705 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.726 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.745 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.763 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.778 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.795 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.814 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.835 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.852 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.872 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.890 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.907 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.925 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connecting to AMQP server on controller0:5672
2016-06-22 22:02:07.943 17125 INFO oslo_messaging._drivers.impl_rabbit [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Connected to AMQP server on controller0:5672
2016-06-22 22:02:07.964 17125 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Agent initialized successfully, now running...
2016-06-22 22:02:07.976 17125 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Agent out of sync with plugin!
2016-06-22 22:02:08.106 17125 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-a019d63c-ddf2-4df3-93a4-841fdda04cd7 ] Agent tunnel out of sync with plugin!
------------neutron-dhcp-agent.log-------
In this log I can see that the status report was not sent; it raised a "Failed reporting state" error.
2016-06-21 22:55:35.950 13361 INFO neutron.agent.dhcp.agent [-] Synchronizing state complete
2016-06-21 22:55:35.952 13361 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-21 22:55:35.970 13361 INFO neutron.agent.dhcp.agent [req-12dfeee8-c542-46ad-b3f5-2557c2fcbd2f ] Synchronizing state
2016-06-21 22:55:35.995 13361 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-21 22:55:36.057 13361 INFO neutron.agent.dhcp.agent [-] DHCP agent started
2016-06-21 22:55:36.074 13361 INFO neutron.agent.dhcp.agent [req-12dfeee8-c542-46ad-b3f5-2557c2fcbd2f ] Synchronizing state complete
2016-06-22 19:39:35.733 2848 INFO neutron.common.config [-] Logging enabled!
2016-06-22 19:39:35.756 2848 INFO neutron.common.config [-] /usr/bin/neutron-dhcp-agent version 2015.1.2
2016-06-22 19:39:35.869 2848 WARNING oslo_config.cfg [req-8833acf1-c0ca-4783-bbfc-12ccfe4717d6 ] Option "lock_path" from group "DEFAULT" is deprecated. Use option "lock_path" from group "oslo_concurrency".
2016-06-22 19:39:35.884 2848 INFO oslo_messaging._drivers.impl_rabbit [req-fba865de-cb4e-44ee-9ccd-25ca25b23be9 ] Connecting to AMQP server on controller0:5672
2016-06-22 19:39:35.908 2848 INFO neutron.agent.dhcp.agent [-] Synchronizing state
2016-06-22 19:39:35.945 2848 INFO oslo_messaging._drivers.impl_rabbit [req-fba865de-cb4e-44ee-9ccd-25ca25b23be9 ] Connected to AMQP server on controller0:5672
2016-06-22 19:39:35.957 2848 INFO oslo_messaging._drivers.impl_rabbit [req-fba865de-cb4e-44ee-9ccd-25ca25b23be9 ] Connecting to AMQP server on controller0:5672
2016-06-22 19:39:35.973 2848 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-22 19:39:36.258 2848 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller0:5672
2016-06-22 19:39:36.269 2848 INFO oslo_messaging._drivers.impl_rabbit [req-fba865de-cb4e-44ee-9ccd-25ca25b23be9 ] Connected to AMQP server on controller0:5672
2016-06-22 19:40:36.281 2848 ERROR neutron.agent.dhcp.agent [-] Unable to sync network state.
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent Traceback (most recent call last):
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 157, in sync_state
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent active_networks = self.plugin_rpc.get_active_networks_info()
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 417, in get_active_networks_info
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent host=self.host)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent retry=self.retry)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent timeout=timeout, retry=retry)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent retry=retry)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent result = self._waiter.wait(msg_id, timeout)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent message = self.waiters.get(msg_id, timeout=timeout)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent 'to message ID %s' % msg_id)
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent MessagingTimeout: Timed out waiting for a reply to message ID 699644629f1740e8b6013baba374bbc2
2016-06-22 19:40:36.281 2848 TRACE neutron.agent.dhcp.agent
2016-06-22 19:40:36.310 2848 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on controller0:5672
2016-06-22 19:40:36.320 2848 ERROR neutron.agent.dhcp.agent [req-fba865de-cb4e-44ee-9ccd-25ca25b23be9 ] Failed reporting state!
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent Traceback (most recent call last):
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 575, in _report_state
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent self.state_rpc.report_state(ctx, self.agent_state, self.use_call)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 80, in report_state
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent return method(context, 'report_state', **kwargs)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent retry=self.retry)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent timeout=timeout, retry=retry)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent retry=retry)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent result = self._waiter.wait(msg_id, timeout)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent message = self.waiters.get(msg_id, timeout=timeout)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent 'to message ID %s' % msg_id)
2016-06-22 19:40:36.320 2848 TRACE neutron.agent.dhcp.agent MessagingTimeout: Timed out waiting for a reply to message ID 27290842d8d241e1b71757fb33c57f65 |
An addition: the actual state of these services is active:
----1------
● neutron-l3-agent.service - OpenStack Neutron Layer 3 Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-l3-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 19:39:34 EDT; 1h 31min ago
 Main PID: 2847 (neutron-l3-agen)
   CGroup: /system.slice/neutron-l3-agent.service
-----2-----
● neutron-openvswitch-agent.service - OpenStack Neutron Open vSwitch Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 19:39:34 EDT; 1h 30min ago
 Main PID: 2846 (neutron-openvsw)
   CGroup: /system.slice/neutron-openvswitch-agent.service
----3------
● neutron-dhcp-agent.service - OpenStack Neutron DHCP Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-dhcp-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 19:39:34 EDT; 2h 12min ago
 Main PID: 2848 (neutron-dhcp-ag)
   CGroup: /system.slice/neutron-dhcp-agent.service
----4----
● neutron-metadata-agent.service - OpenStack Neutron Metadata Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-metadata-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 21:18:53 EDT; 34min ago
 Main PID: 13505 (neutron-metadat)
   CGroup: /system.slice/neutron-metadata-agent.service
---5-----
[root@compute1 ~]# systemctl status neutron-openvswitch-agent.service
● neutron-openvswitch-agent.service - OpenStack Neutron Open vSwitch Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-22 19:39:13 EDT; 2h 16min ago
 Main PID: 1435 (neutron-openvsw)
   CGroup: /system.slice/neutron-openvswitch-agent.service |
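systemd's "active (running)" only says the process is up. The neutron-dhcp-agent process listed here (Main PID 2848) is the same PID that logged the MessagingTimeout errors in neutron-dhcp-agent.log. As a small sketch, the two timestamps below are copied from that log (the last "Connected to AMQP server" entry and the first "Unable to sync network state" error) and subtracted to see how long the agent waited before giving up. The helper name `seconds_between` is made up for this example, and reading the result as an RPC reply timeout of about 60 seconds is an assumption, not something confirmed in this thread.

```python
from datetime import datetime

# Timestamp layout used by the neutron logs quoted in this thread.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start, end):
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Both values are taken from the neutron-dhcp-agent.log excerpt in this thread.
waited = seconds_between("2016-06-22 19:39:36.269", "2016-06-22 19:40:36.281")
print(round(waited, 3))
```

A gap of roughly one minute with no log entries in between suggests the agent was connected to RabbitMQ but never got an answer from neutron-server, which would fit the xxx state.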
active is false. OP, check the logs under /var/log/neutron for the details; the services are probably not working properly. |
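For going through the logs under /var/log/neutron, a small filter that keeps only ERROR and WARNING entries can save some scrolling. This is a minimal sketch: the line layout is assumed from the log lines quoted in this thread, and the names `LOG_LINE` and `find_problems` are made up for this example.

```python
import re

# Layout assumed from the log lines quoted in this thread:
# "<date> <time> <pid> <LEVEL> <module> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>\S+) (?P<msg>.*)$"
)


def find_problems(lines, levels=("ERROR", "WARNING")):
    """Return (timestamp, level, module) for every line at one of the given levels."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("module")))
    return hits


# Sample lines copied from the logs posted in this thread.
sample = [
    "2016-06-23 15:21:39.586 2876 INFO neutron.agent.dhcp.agent [-] DHCP agent started",
    "2016-06-23 15:21:40.584 2876 WARNING neutron.agent.linux.dhcp [req-572ca88c-99d5-4ae9-8c06-86b62ee54745 ] Failed trying to delete namespace: qdhcp-60cf7464-09c8-4c4a-ba8d-cd3004970bd6",
    "2016-06-22 19:40:36.281 2848 ERROR neutron.agent.dhcp.agent [-] Unable to sync network state.",
]
for hit in find_problems(sample):
    print(hit)
```

To run it on a real file, pass an open file object instead of the sample list; lines that do not match the assumed layout are simply skipped.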