Last edited by howtodown on 2014-11-10 08:37
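Guiding questions:
1. What are the practical ways to live-migrate a virtual machine?
2. What configuration is needed for a libvirt-based migration?
3. What should we do when a migration fails?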
Hypervisor: KVM
libvirt version: 0.8.8
The virtual machine uses local storage
Steps:
1. Prepare libvirt: enable TCP listening
Edit /etc/libvirt/libvirtd.conf
Uncomment:
listen_tls = 0
listen_tcp = 1
Uncomment and change the value:
auth_tcp = "none"
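If you would rather script this edit than make it by hand, here is a minimal sketch of step 1. It works on the file's text as a string; the function name enable_tcp_listen and the sample input are my own illustration, not part of libvirt.

```python
import re

# The three settings from step 1, written exactly as they should appear in the file.
WANTED = {"listen_tls": "0", "listen_tcp": "1", "auth_tcp": '"none"'}


def enable_tcp_listen(conf_text):
    """Return conf_text with the WANTED keys uncommented and set to the values above."""
    remaining = dict(WANTED)
    out = []
    for line in conf_text.splitlines():
        # Match "key = value" lines, optionally commented out with a leading '#'.
        m = re.match(r"^\s*#?\s*(\w+)\s*=", line)
        if m and m.group(1) in remaining:
            key = m.group(1)
            out.append("%s = %s" % (key, remaining.pop(key)))
        else:
            out.append(line)
    # Append any key that did not appear in the file at all.
    for key, value in remaining.items():
        out.append("%s = %s" % (key, value))
    return "\n".join(out) + "\n"


# Hypothetical sample input, only for demonstration.
sample = '#listen_tls = 0\n#listen_tcp = 1\n#auth_tcp = "sasl"\nlog_level = 3\n'
print(enable_tcp_listen(sample))
```

To apply it for real, read /etc/libvirt/libvirtd.conf, pass its contents through the function, write the result back, and restart libvirtd. With libvirt of this generation, libvirtd normally also has to be started with the --listen flag before the TCP settings take effect.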
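2. scp the image file, console.log, and any other files to the destination host
Which files need to be copied depends on your setup. If you miss one, the migration will fail with an error, and you can then scp the missing file as the error message indicates. It is best to keep the paths identical on the source and destination hosts.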
3. Migrate
virsh migrate vm_name --live qemu+ssh://intent_ip/system --copy-storage-inc
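The command above can also be assembled programmatically, which helps when migrating several VMs. This is only a sketch: build_migrate_command is a hypothetical helper of mine, and vm_name and intent_ip are the same placeholders used in the command above.

```python
import shlex


def build_migrate_command(vm_name, dest_host, incremental=True):
    """Build the virsh argument list for a live migration over SSH with local storage."""
    # --copy-storage-inc copies the disk incrementally on top of a base image that
    # already exists on the destination; --copy-storage-all copies the whole disk.
    storage_flag = "--copy-storage-inc" if incremental else "--copy-storage-all"
    return ["virsh", "migrate", "--live", storage_flag, vm_name,
            "qemu+ssh://%s/system" % dest_host]


cmd = build_migrate_command("vm_name", "intent_ip")
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

On the source host the list can be handed to subprocess.call(cmd). The incremental flag is the reason step 2 copies the image to the destination beforehand.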
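During the migration you will be prompted for the root password of the destination host (you can set up passwordless login beforehand; for passwordless login as a regular user, see my blog post on the subject). Follow the prompts, then log in to the destination host and watch the migration progress in the VM's log file (in this example the VM is named zhruxgpy):
tail -f /var/log/libvirt/qemu/zhruxgpy.log
4. Clean up the source node
Destroy the virtual machine on the source node, delete its disk files, remove the firewall rules associated with it, and so on.
In fact, live-migrate of a virtual machine in OpenStack is basically a combination of the steps above. I will add the specific commands later when I have time.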
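OpenStack VM live migration failure: problem and solution
Notes
Update history:
2013.07.17: This problem no longer exists on the latest trunk branch. Part of the nova-scheduler code has been refactored, which fixed the bug, whether intentionally or not.
version: OpenStack Grizzly 2013.1.2
hypervisor: KVM
shared storage: no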
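1. Problem description
There is one virtual machine in my environment, with the following details:
The environment has two nodes:
The virtual machine is running normally. I ran a live-migration on it:
nova live-migration --block-migrate 6cd558d9-e924-4598-8e63-e86a20929bd9
The following error was returned: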
{
    "badRequest": {
        "message": "Live migration of instance 6cd558d9-e924-4598-8e63-e86a20929bd9 to host controller failed",
        "code": 400
    }
}
2. Problem analysis
First, look at the exception traceback in the log:
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions Traceback (most recent call last):
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions rval = self.proxy.dispatch(ctxt, version, method, **args)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions return getattr(proxyobj, method)(ctxt, **kwargs)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 117, in live_migration
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions context, ex, {})
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions self.gen.next()
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 96, in live_migration
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions block_migration, disk_over_commit)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 196, in schedule_live_migration
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions ignore_hosts)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 272, in _live_migration_dest_check
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions filter_properties)[0]
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 146, in select_hosts
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions request_spec, filter_properties, instance_uuids)]
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 336, in _schedule
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions filter_properties)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/host_manager.py", line 342, in get_filtered_hosts
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions hosts, filter_properties)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/filters.py", line 53, in get_filtered_objects
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions return list(objs)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/filters.py", line 39, in filter_all
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions if self._filter_one(obj, filter_properties):
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/filters/__init__.py", line 30, in _filter_one
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions return self.host_passes(obj, filter_properties)
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions File "/usr/lib/python2.7/dist-packages/nova/scheduler/filters/image_props_filter.py", line 78, in host_passes
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions image_props = spec.get('image', {}).get('properties', {})
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions AttributeError: 'NoneType' object has no attribute 'get'
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44.979 32348 TRACE nova.api.openstack.compute.contrib.admin_actions
2013-07-10 15:07:44 INFO [nova.api.openstack.wsgi 673] [32348] HTTP exception thrown: Live migration of instance 6cd558d9-e924-4598-8e63-e86a20929bd9 to another host failed
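So the problem is in scheduling. The log makes it clear: the exception is raised in image_props_filter, where spec.get('image', {}) returned None and the chained .get() call then failed. (The {} default applies only when the 'image' key is missing; here the key is present with the value None.) So why is it None? Let's trace in the code where the image attribute of spec comes from: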
if not instance_ref['image_ref']:
    image = None
else:
    image = self.image_service.show(context,
                                    instance_ref['image_ref'])
request_spec = {'instance_properties': instance_ref,
                'instance_type': instance_type,
                'instance_uuids': [instance_ref['uuid']],
                'image': image}
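Looking back at the virtual machine's details, it turns out that this VM was booted from a volume (boot from volume), so its image_ref is empty and image is set to None. That is the root cause.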
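The failure can be reproduced outside Nova with a plain dictionary. The first function uses the same expression as line 78 of image_props_filter.py in the traceback. The second is a hypothetical defensive variant I wrote for illustration; I am not claiming it is how upstream fixed the bug.

```python
def image_props_unsafe(spec):
    # Same expression as in the traceback: the {} default is used only when the
    # 'image' key is absent, not when it is present with the value None.
    return spec.get('image', {}).get('properties', {})


def image_props_safe(spec):
    # Hypothetical defensive variant: treat a None image like a missing one.
    return (spec.get('image') or {}).get('properties', {})


# A boot-from-volume instance ends up with 'image': None in the request spec.
boot_from_volume_spec = {'image': None}

try:
    image_props_unsafe(boot_from_volume_spec)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'get'

print(image_props_safe(boot_from_volume_spec))  # {}
```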
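3. Solution
There are two ways to solve this:
1) Change the live-migration command to specify the destination host explicitly, which skips the scheduling stage, as shown below (note: for a VM booted from a volume, the --block-migrate option must not be used):
nova live-migration 6cd558d9-e924-4598-8e63-e86a20929bd9 compute
2) Edit the Nova option scheduler_default_filters (the default is ['RetryFilter', 'AvailabilityZoneFilter', 'RamFilter', 'ComputeFilter', 'ComputeCapabilitiesFilter', 'ImagePropertiesFilter']), remove ImagePropertiesFilter from it, and restart the nova-scheduler process. Running the migration again then succeeds.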
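For the second approach, the new option value can be derived from the default list quoted above. A small sketch follows; the helper names are mine, and I am assuming the usual comma-separated form for list options in nova.conf, so check it against your own configuration file.

```python
# Default filter list as quoted above for Grizzly.
DEFAULT_FILTERS = ['RetryFilter', 'AvailabilityZoneFilter', 'RamFilter',
                   'ComputeFilter', 'ComputeCapabilitiesFilter',
                   'ImagePropertiesFilter']


def without_filter(filters, name):
    """Return a copy of the filter list with the given filter removed."""
    return [f for f in filters if f != name]


def as_conf_line(filters):
    """Render the list as a nova.conf line (comma-separated form is an assumption)."""
    return "scheduler_default_filters=" + ",".join(filters)


print(as_conf_line(without_filter(DEFAULT_FILTERS, 'ImagePropertiesFilter')))
```

Keep in mind that dropping ImagePropertiesFilter disables the image property checks for every scheduling request, not only for live migration, so the first approach is the less invasive one.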