Check 1: Is the Open vSwitch version >= 2.7 and the DB schema >= 7.14.0? Check on every node.
[root@cmp1 ~]# ovs-vsctl --version
This is required because CT (connection tracking) is only available after v2.6.0.
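The version comparison above can be sketched as a small helper. This is a minimal sketch, not part of the original notes: the function names `version_ge` and `check_ovs_versions` are hypothetical, GNU `sort -V` is assumed to be available, and the parser assumes `ovs-vsctl --version` prints one line containing "Open vSwitch" and one containing "DB Schema", each ending in the version number.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumption: GNU coreutils `sort -V` (version sort) is installed.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_ovs_versions: reads `ovs-vsctl --version` output on stdin and
# compares it with the thresholds from Check 1 (2.7 and 7.14.0).
check_ovs_versions() {
    out=$(cat)
    ovs_ver=$(printf '%s\n' "$out" | awk '/Open vSwitch/ { print $NF; exit }')
    schema_ver=$(printf '%s\n' "$out" | awk '/DB Schema/ { print $NF; exit }')
    version_ge "$ovs_ver" 2.7 || { echo "FAIL: OVS $ovs_ver < 2.7"; return 1; }
    version_ge "$schema_ver" 7.14.0 || { echo "FAIL: DB schema $schema_ver < 7.14.0"; return 1; }
    echo "OK: OVS $ovs_ver, DB schema $schema_ver"
}

# Usage on each node:
#   ovs-vsctl --version | check_ovs_versions
```

Reading the command output from stdin keeps the check usable over ssh or against saved output from several nodes.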
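Check 2: Is the kernel module version >= 2.7? Check on every node. Use lsmod to see whether the kernel has actually loaded the module, and modinfo to see whether the module is installed.
modprobe openvswitch <== asks the kernel to use the new kernel module; if that still does not work, the only option left is a reboot.
Running rmmod openvswitch and then loading the module again (insmod with the path to openvswitch.ko, or modprobe openvswitch) is also worth a try.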
[root@cmp1 ~]# lsmod |grep openvswitch
openvswitch 255478 4 vport_geneve
nf_nat_ipv6 14131 1 openvswitch
nf_nat_ipv4 14115 1 openvswitch
nf_defrag_ipv6 34768 2 openvswitch,nf_conntrack_ipv6
nf_defrag_ipv4 12729 2 openvswitch,nf_conntrack_ipv4
nf_nat 26146 3 openvswitch,nf_nat_ipv4,nf_nat_ipv6
nf_conntrack 105745 7 openvswitch,nf_nat,nf_nat_ipv4,nf_nat_ipv6,nf_conntrack_netlink,nf_conntrack_ipv4,nf_conntrack_ipv6
gre 13796 1 openvswitch
libcrc32c 12644 2 xfs,openvswitch
[root@cmp1 ~]# modinfo openvswitch
filename: /lib/modules/3.10.0-327.el7.x86_64/extra/openvswitch.ko
alias: net-pf-16-proto-16-family-ovs_packet
alias: net-pf-16-proto-16-family-ovs_flow
alias: net-pf-16-proto-16-family-ovs_vport
alias: net-pf-16-proto-16-family-ovs_datapath
version: 2.7.100
license: GPL
description: Open vSwitch switching datapath
rhelversion: 7.2
srcversion: 1A67978438AC8A6C7F305A7
depends: nf_conntrack,nf_nat,nf_defrag_ipv6,libcrc32c,nf_nat_ipv6,nf_nat_ipv4,gre,nf_defrag_ipv4
vermagic: 3.10.0-327.el7.x86_64 SMP mod_unload modversions
parm: udp_port:Destination UDP port (ushort)
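The two observations in Check 2 (module loaded, module version) can be extracted mechanically from the output shown above. This is a rough sketch; the helper names `module_loaded` and `modinfo_field` are hypothetical, and both read the command output from stdin.

```shell
# module_loaded NAME: reads `lsmod` output on stdin; succeeds if NAME
# appears in the first column (i.e. the module itself is loaded, not
# merely listed as a user of another module).
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# modinfo_field FIELD: reads `modinfo` output on stdin and prints the
# value of the first line whose label is exactly "FIELD:".
modinfo_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Usage on each node:
#   lsmod | module_loaded openvswitch || echo "openvswitch module is not loaded"
#   modinfo openvswitch | modinfo_field version
```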
Check 3: Is ovn-controller running normally? Check on every node that needs to run VMs.
[root@cmp1 ~]# ps aux|grep ovn-controller
root 2510 0.0 0.0 46864 832 ? S<s Apr22 0:00 ovn-controller: monitoring pid 2511 (healthy)
root 2511 0.1 0.0 48444 3700 ? S< Apr22 5:51 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/openvswitch/ovn-controller.log --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor
[root@cmp1 ~]# tail -f /var/log/openvswitch/ovn-controller.log
Look for any error messages.
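Instead of reading the whole log with tail, the problem lines can be filtered out. The sketch below assumes the usual OVS log layout of `timestamp|sequence|module|LEVEL|message` with `|` as the separator; the function name `ovs_log_problems` is hypothetical.

```shell
# ovs_log_problems FILE: print only the lines whose level field
# (4th '|'-separated column) is WARN, ERR or EMER.
ovs_log_problems() {
    awk -F'|' '$4 == "WARN" || $4 == "ERR" || $4 == "EMER"' "$1"
}

# Usage:
#   ovs_log_problems /var/log/openvswitch/ovn-controller.log | tail -n 20
```

The same filter should work for the other log files under /var/log/openvswitch, as long as they use the same layout.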
Q1: If there is a problem with the SB connection, verify that the SB database is reachable, and also confirm that the SB listen port has been configured.
#ovn-nbctl get-connection
#ovn-sbctl get-connection
Confirm that the reported listen port and address are configured correctly.
How to configure:
#ovn-nbctl set-connection ptcp:6641:{{ ovn_db_ip }}
#ovn-sbctl set-connection ptcp:6642:{{ ovn_db_ip }}
#ovs-appctl -t ovsdb-server ovsdb-server/add-remote ptcp:6640:{{ ovn_db_ip }}
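If the previous commands fail, you can configure the remote directly with the command below. Some OVS versions have removed the commands above.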
#ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/add-remote ptcp:6642:{{ ovn_db_ip }}
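Once a listener is configured, reachability from a compute node can be tested without any OVN tooling. This is a sketch under assumptions: the helper names `split_target` and `tcp_reachable` are hypothetical, only IPv4 addresses or hostnames are handled, and the probe relies on bash's `/dev/tcp` redirection plus the coreutils `timeout` command.

```shell
# split_target TARGET: print "HOST PORT" for the two target forms used
# in these notes: ptcp:PORT:HOST (listener) and tcp:HOST:PORT (client).
split_target() {
    case "$1" in
        ptcp:*) echo "$1" | awk -F: '{ print $3, $2 }' ;;
        tcp:*)  echo "$1" | awk -F: '{ print $2, $3 }' ;;
        *)      return 1 ;;
    esac
}

# tcp_reachable HOST PORT: succeeds if a TCP connection can be opened
# within 3 seconds. Assumes bash and coreutils `timeout` are installed.
tcp_reachable() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage from a compute node (address taken from the examples in these notes):
#   set -- $(split_target tcp:192.168.200.91:6642)
#   tcp_reachable "$1" "$2" && echo "SB reachable" || echo "SB NOT reachable"
```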
Check 4: Check ovn-northd
# tail -f /var/log/openvswitch/ovn-northd.log
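ovn-northd starts two OVS databases, NB and SB. If openvswitch was not stopped first during installation, the NB/SB database ports can end up occupied by the existing openvswitch process (because the monitor is running, you have to stop the monitor first before you can stop the openvswitch process).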
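To see whether something already holds the NB/SB ports, the listening sockets can be filtered by port. A minimal sketch, assuming `ss` from iproute2 is available and that its `-ltnp` output has the state in column 1 and the local address:port in column 4; the function name `listeners_on` is hypothetical, and the ports 6641/6642 are the ones used elsewhere in these notes.

```shell
# listeners_on PORT: reads `ss -ltnp` output on stdin and prints the
# LISTEN sockets whose local address ends in ":PORT".
listeners_on() {
    awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$")'
}

# Usage on the node running ovn-northd (run as root to see process names):
#   ss -ltnp | listeners_on 6641
#   ss -ltnp | listeners_on 6642
```

If the owning process is not the expected NB/SB ovsdb-server, stop its monitor process first, as described above.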
Check 5: Check the vswitchd log for errors. Check on every node.
[root@cmp1 ~]# tail -f /var/log/openvswitch/ovs-vswitchd.log
Errors here are usually flow-setup problems. The most likely cause is a version mismatch between the kernel module and the user-space openvswitch.
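A quick way to spot such a mismatch is to compare the two version strings directly. In this sketch, treating "same major.minor" as compatible is my own simplification, not a rule stated in these notes; the helper names are hypothetical, and `modinfo -F version` returns a value only when the module carries a version field (as the module shown in Check 2 does).

```shell
# major_minor VERSION: print the first two dot-separated components,
# e.g. 2.7.100 -> 2.7.
major_minor() {
    echo "$1" | cut -d. -f1,2
}

# versions_match KERNEL_VER USER_VER: succeeds when both versions share
# the same major.minor prefix (an assumed notion of "compatible").
versions_match() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Usage on each node:
#   kver=$(modinfo -F version openvswitch)
#   uver=$(ovs-vswitchd --version | awk 'NR == 1 { print $NF }')
#   versions_match "$kver" "$uver" || echo "mismatch: kernel $kver, user space $uver"
```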
Check 6: Check whether the tunnels have been established. Check on every node.
[root@cmp1 ~]# ovs-vsctl show
b8d3be1f-9d8a-44b3-a415-2e0434d76b94
Bridge br-int
fail_mode: secure
Port "ovn-f4001b-0"
Interface "ovn-f4001b-0"
type: geneve
options: {csum="true", key=flow, remote_ip="192.168.200.91"}
Port br-int
Interface br-int
type: internal
Port "tapccdd9d77-21"
Interface "tapccdd9d77-21"
Port "ovn-b250f7-0"
Interface "ovn-b250f7-0"
type: geneve
options: {csum="true", key=flow, remote_ip="192.168.200.93"}
ovs_version: "2.7.100"
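With more than a few nodes it helps to reduce the output above to the list of tunnel endpoints. A minimal sketch, assuming the `ovs-vsctl show` layout shown above (a `type:` line followed by an `options:` line per interface); the function name `geneve_remote_ips` is hypothetical.

```shell
# geneve_remote_ips: reads `ovs-vsctl show` output on stdin and prints
# the remote_ip of every interface whose type is geneve.
geneve_remote_ips() {
    awk '
        $1 == "type:" { geneve = ($2 == "geneve"); next }
        geneve && $1 == "options:" {
            if (match($0, /remote_ip="[^"]*"/)) {
                # Skip the 11 characters of remote_ip=" and drop the closing quote.
                print substr($0, RSTART + 11, RLENGTH - 12)
            }
            geneve = 0
        }
    '
}

# Usage on each node; expect one line per remote chassis:
#   ovs-vsctl show | geneve_remote_ips
```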
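Check 7: Check whether OVN-SB has learned the chassis that correspond to the tunnels in Check 6. Verify this for every node.
Geneve alone is generally sufficient. VXLAN is intended for HW VTEP use; if you have no HW VTEP devices, it is not needed.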
[root@ctrl ~]# ovn-sbctl show
Chassis "f4001bf1-3cb4-4828-8cc1-9a8f3d99711a"
hostname: ctrl
Encap vxlan
ip: "192.168.200.91"
options: {csum="true"}
Encap geneve
ip: "192.168.200.91"
options: {csum="true"}
Chassis "7db0d092-2d3b-4a16-b553-348ca34f233b"
hostname: "cmp1"
Encap geneve
ip: "192.168.200.92"
options: {csum="true"}
Encap vxlan
ip: "192.168.200.92"
options: {csum="true"}
Chassis "b250f793-28e6-4d14-878e-5eba43381728"
hostname: "cmp2"
Encap geneve
ip: "192.168.200.93"
options: {csum="true"}
Encap vxlan
ip: "192.168.200.93"
options: {csum="true"}
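The chassis listing above can be condensed into one "hostname ip" line per chassis, which is easier to compare against the tunnel endpoints seen on each node. A sketch under the assumption that `ovn-sbctl show` keeps the layout shown above; the function name `chassis_encap_ips` is hypothetical.

```shell
# chassis_encap_ips TYPE: reads `ovn-sbctl show` output on stdin and
# prints "hostname ip" for every chassis that has an Encap of TYPE.
chassis_encap_ips() {
    awk -v want="$1" '
        $1 == "hostname:" { host = $2; gsub(/"/, "", host) }
        $1 == "Encap"     { encap = $2 }
        $1 == "ip:" && encap == want {
            ip = $2; gsub(/"/, "", ip)
            print host, ip
        }
    '
}

# Usage on the controller:
#   ovn-sbctl show | chassis_encap_ips geneve
```

A node that is missing from this list has not registered with the SB database, so its settings under "How to configure" are the first thing to review.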
How to configure:
systemctl start openvswitch
ovs-vsctl set Open_Vswitch . external-ids:hostname={hostname}
All of the following settings are required. If any one is missing, you will see repeated warnings about it in the ovs-vswitchd log.
ovs-vsctl set open . external-ids:ovn-remote=tcp:{ctrl_ip}:6642
ovs-vsctl set open . external-ids:ovn-encap-type=geneve,vxlan
ovs-vsctl set open . external-ids:ovn-encap-ip={node_ip}
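Rather than waiting for warnings in the log, the three required keys can be checked up front. A minimal sketch: the function name `missing_ovn_keys` is hypothetical, and it assumes `ovs-vsctl get open . external_ids` prints the map as `{key=value, ...}` on one line.

```shell
# missing_ovn_keys: reads the external_ids map on stdin and prints each
# of the three required OVN keys that does not appear in it.
missing_ovn_keys() {
    ids=$(cat)
    for key in ovn-remote ovn-encap-type ovn-encap-ip; do
        case "$ids" in
            *"$key="*) ;;            # key present
            *) echo "$key" ;;        # key missing
        esac
    done
}

# Usage on each node; no output means all three keys are set:
#   ovs-vsctl get open . external_ids | missing_ovn_keys
```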
Check 8: Check /var/log/neutron/server.log for errors
Q1: SQL problems and ovsdb problems are both database related. Confirm that the Open vSwitch database schema version is correct, and that the neutron database has been upgraded.
#neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade heads
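Q2: The ideal installation environment has no networks or security groups configured yet. If networks or security groups already exist, synchronize them with the command below.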
#neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
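Q3: neutron fails to authenticate with nova
A3: After neutron successfully creates a port in OVN, it sends a callback to nova, and that callback requires authentication.
Add the following to /etc/neutron/neutron.conf. The settings below apply only to the Liberty release; later releases use nova_admin_xxx names instead.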
[nova]
# Authentication URL (string value)
auth_url=http://{ctrl_ip}:35357
# Authentication type to load (string value)
auth_type=password
# Username (string value)
username=nova
# User's password (string value); search for nova_admin_password to find the actual value
password=a0da31d154f74c18
# Project name to scope to (string value)
project_name=services
# Tenant Name (string value)
tenant_name=services
You can use curl first to verify that the authentication settings are filled in correctly.
curl -i -X POST http://192.168.200.71:35357/v2.0/tokens -H "Content-Type: application/json" -H "User-Agent: python-keystoneclient" -d '{"auth": {"tenantName": "admin", "passwordCredentials": {"username": "admin", "password": "password"}}}'
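To test exactly the values neutron will use, they can be read back from the [nova] section instead of being retyped. This is a sketch only: `ini_get` is a hypothetical helper, it handles plain `key=value` lines without inline comments, and it does not implement oslo.config features such as interpolation or multi-line values.

```shell
# ini_get FILE SECTION KEY: print the value of KEY inside [SECTION].
# Whitespace around the key and the value is trimmed.
ini_get() {
    awk -F= -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Usage on the controller:
#   conf=/etc/neutron/neutron.conf
#   echo "auth_url=$(ini_get "$conf" nova auth_url) username=$(ini_get "$conf" nova username)"
# The printed values can then be substituted into the curl request above.
```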
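Q4: If the listen port is already in use, the cause may be a root wrapper process that was not killed together with the service. Use the command below to kill all neutron-related processes.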
#ps -ef |awk '/[n]eutron/{print $2}'|xargs kill
Q5: If there is a problem connecting to the OVN database, confirm that the ml2 settings are correct.
[ovn]
ovn_nb_connection = tcp:{ctrl_ip}:6641
ovn_sb_connection = tcp:{ctrl_ip}:6642
ovn_l3_mode = False
ovn_l3_scheduler = chance
ovn_native_dhcp = True
Q7: Schema does not match:
Exception: {u'error': u'unknown database', u'details': u'get_schema request specifies unknown database OVN_Southbound', u'syntax': u'["OVN_Southbound"]'}
A7: Maybe ovn-sb or ovn-nb cannot be reached through TCP, so set the connection:
ovn-nbctl set-connection ptcp:6641:{ctrl_ip}
ovn-sbctl set-connection ptcp:6642:{ctrl_ip}
NOTE: The OVN-SB and OVN-NB databases are created automatically when you install the packages, but by default they are reachable only through the unix socket. Configure TCP access as needed.
Check 9: Check that neutron-l3-agent has no errors
tail -f /var/log/neutron/l3-agent.log
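If this installation uses the l3 agent, the l3 agent creates routers by means of ip netns. Do not remove these namespaces casually; otherwise the l3 agent fails and the OVN logical router gets used by mistake.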
# ip netns
qrouter-241093c1-da31-4c1e-a6ff-e5df65b5da76
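Q1: ovn-nbctl show displays lr-xxxx
A1: This means the l3 agent has hit an error and neutron does not know that the agent is still alive. Delete this router and confirm that the l3 agent is healthy.
Q2: When I add a router port to the external network, no corresponding patch port is created between br-int and br-ex.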
A2.1: Check the interface_driver setting in /etc/neutron/l3_agent.ini
interface_driver = openvswitch
A2.2: Check external-ids:ovn-bridge-mappings; it is easy to mistype.
# ovs-vsctl get open . external-ids:ovn-bridge-mappings
"public:br-ex"
#ovs-vsctl set open . external-ids:ovn-bridge-mappings=public:br-ex
Removal example:
#ovs-vsctl remove Open_vSwitch 2280f056-46bf-4a0a-9130-130d2fad8a13 external_ids ovn-bridge-mappings
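Because the mapping string is easy to mistype, a simple syntax check before setting it can save a debugging round. This sketch only validates the `physnet:bridge[,physnet:bridge...]` shape; the function name `check_bridge_mappings` is hypothetical, and whether the named bridge exists has to be checked separately (for example with `ovs-vsctl br-exists`).

```shell
# check_bridge_mappings MAPPINGS: succeeds when every comma-separated
# entry has the form NAME:BRIDGE with both parts non-empty.
# Surrounding double quotes (as printed by `ovs-vsctl get`) are ignored.
check_bridge_mappings() {
    mappings=$(echo "$1" | tr -d '"')
    [ -n "$mappings" ] || { echo "malformed: empty mapping"; return 1; }
    status=0
    old_ifs=$IFS
    IFS=,
    for pair in $mappings; do
        case "$pair" in
            *:*:*) echo "malformed: $pair"; status=1 ;;   # too many colons
            ?*:?*) ;;                                     # looks fine
            *)     echo "malformed: $pair"; status=1 ;;   # missing part or colon
        esac
    done
    IFS=$old_ifs
    return $status
}

# Usage:
#   check_bridge_mappings "$(ovs-vsctl get open . external-ids:ovn-bridge-mappings)"
#   ovs-vsctl br-exists br-ex || echo "bridge br-ex does not exist"
```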
A2.3: If the patch port still cannot be created, check whether a qg port exists. Delete it and try again.
Bridge br-ex
Port "qg-e9111b6f-82"
Interface "qg-e9111b6f-82"
type: internal
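Leftover qg ports can be located and removed with standard ovs-vsctl subcommands. The sketch below only prints the removal commands so they can be reviewed before anything is deleted; the function name `qg_cleanup_commands` is hypothetical, and the bridge name br-ex is taken from the example above.

```shell
# qg_cleanup_commands BRIDGE: reads `ovs-vsctl list-ports BRIDGE` output
# on stdin and prints one del-port command per port starting with "qg-".
# Nothing is deleted; the commands are printed for review.
qg_cleanup_commands() {
    awk -v br="$1" '/^qg-/ { print "ovs-vsctl del-port", br, $1 }'
}

# Usage:
#   ovs-vsctl list-ports br-ex | qg_cleanup_commands br-ex
# Review the printed commands, run them, then retry adding the router port.
```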