Check 1: Is the Open vSwitch version >= 2.7 and the DB schema >= 7.14.0? Check every node.

[root@cmp1 ~]# ovs-vsctl --version

This matters because CT (connection tracking) is only available from v2.6.0 onward.
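
The version check can be scripted so it is easy to repeat on every node. This is a minimal sketch, assuming GNU `sort -V` is available and that `ovs-vsctl --version` prints its usual two lines (`ovs-vsctl (Open vSwitch) X.Y.Z` and `DB Schema A.B.C`). The function names `version_ge` and `check_ovs_versions` are made up for this example.

```shell
# version_ge A B: succeed when dotted version A >= B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_ovs_versions: read `ovs-vsctl --version` output on stdin and succeed
# only when Open vSwitch >= 2.7 and DB Schema >= 7.14.0.
check_ovs_versions() {
    out=$(cat)
    ovs_ver=$(printf '%s\n' "$out" | awk '/Open vSwitch/ {print $NF}')
    db_ver=$(printf '%s\n' "$out" | awk '/DB Schema/ {print $NF}')
    version_ge "$ovs_ver" 2.7 && version_ge "$db_ver" 7.14.0
}

# Usage on a node:
# ovs-vsctl --version | check_ovs_versions && echo "OK" || echo "too old"
```

A version-aware sort is used because a plain string comparison would rank 7.9.0 above 7.14.0.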

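Check 2: Is the openvswitch kernel module version >= 2.7? Check every node. Use lsmod to see whether the kernel has actually loaded the module, and modinfo to see whether it has been installed.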

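modprobe openvswitch <== asks the kernel to load the new kernel module; if that still fails, the only option left is a reboot.

rmmod openvswitch -> insmod with the full path to openvswitch.ko (or modprobe openvswitch) is also worth a try.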

[root@cmp1 ~]# lsmod |grep openvswitch
openvswitch           255478  4 vport_geneve
nf_nat_ipv6            14131  1 openvswitch
nf_nat_ipv4            14115  1 openvswitch
nf_defrag_ipv6         34768  2 openvswitch,nf_conntrack_ipv6
nf_defrag_ipv4         12729  2 openvswitch,nf_conntrack_ipv4
nf_nat                 26146  3 openvswitch,nf_nat_ipv4,nf_nat_ipv6
nf_conntrack          105745  7 openvswitch,nf_nat,nf_nat_ipv4,nf_nat_ipv6,nf_conntrack_netlink,nf_conntrack_ipv4,nf_conntrack_ipv6
gre                    13796  1 openvswitch
libcrc32c              12644  2 xfs,openvswitch

[root@cmp1 ~]# modinfo openvswitch
filename:       /lib/modules/3.10.0-327.el7.x86_64/extra/openvswitch.ko
alias:          net-pf-16-proto-16-family-ovs_packet
alias:          net-pf-16-proto-16-family-ovs_flow
alias:          net-pf-16-proto-16-family-ovs_vport
alias:          net-pf-16-proto-16-family-ovs_datapath
version:        2.7.100
license:        GPL
description:    Open vSwitch switching datapath
rhelversion:    7.2
srcversion:     1A67978438AC8A6C7F305A7
depends:        nf_conntrack,nf_nat,nf_defrag_ipv6,libcrc32c,nf_nat_ipv6,nf_nat_ipv4,gre,nf_defrag_ipv4
vermagic:       3.10.0-327.el7.x86_64 SMP mod_unload modversions
parm:           udp_port:Destination UDP port (ushort)
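
To spot a module that is installed on disk but is not the build currently running, compare the two version strings. This is a rough sketch: `modinfo -F version` reads the on-disk module, and `/sys/module/openvswitch/version` is assumed to expose the version of the loaded module (it does when the module declares one). `module_state` is a hypothetical helper, not part of OVS.

```shell
# module_state LOADED INSTALLED: classify the two version strings.
module_state() {
    if [ -z "$2" ]; then
        echo "not-installed"      # modinfo found no module on disk
    elif [ -z "$1" ]; then
        echo "not-loaded"         # installed, but the kernel has not loaded it
    elif [ "$1" != "$2" ]; then
        echo "stale"              # kernel still runs a different build
    else
        echo "ok"
    fi
}

# On a node (left commented out so the sketch has no side effects):
# module_state "$(cat /sys/module/openvswitch/version 2>/dev/null)" \
#              "$(modinfo -F version openvswitch 2>/dev/null)"
```

A "stale" result is the case where reloading the module, or rebooting, is needed.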

Check 3: Is ovn-controller running properly? Check every node that needs to run VMs.

[root@cmp1 ~]# ps aux|grep ovn-controller
root      2510  0.0  0.0  46864   832 ?        S<s  Apr22   0:00 ovn-controller: monitoring pid 2511 (healthy)
root      2511  0.1  0.0  48444  3700 ?        S<   Apr22   5:51 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/openvswitch/ovn-controller.log --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor

[root@cmp1 ~]# tail -f /var/log/openvswitch/ovn-controller.log
Look for any error messages.
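
OVS and OVN daemons write log lines in the form `timestamp|sequence|module|LEVEL|message`, so problems can be filtered on the fourth field instead of reading the whole file. A small sketch follows; the helper name `vlog_problems` is an assumption of this example.

```shell
# vlog_problems [FILE...]: print only EMER, ERR and WARN lines.
# Reads stdin when no file is given.
vlog_problems() {
    awk -F'|' '$4 == "EMER" || $4 == "ERR" || $4 == "WARN"' "$@"
}

# Usage:
# vlog_problems /var/log/openvswitch/ovn-controller.log
```

The same filter should work on the other logs under /var/log/openvswitch, since they share the format.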

Q1: If there is a problem with the SB connection, confirm that the SB database is reachable, and also confirm that an SB listen port has been configured.

#ovn-nbctl get-connection
#ovn-sbctl get-connection

Confirm that the reported listen port and address are configured correctly.

How to configure it:

#ovn-nbctl set-connection ptcp:6641:{{ ovn_db_ip }}
#ovn-sbctl set-connection ptcp:6642:{{ ovn_db_ip }}
#ovs-appctl -t ovsdb-server ovsdb-server/add-remote ptcp:6640:{{ ovn_db_ip }}

If the commands above fail, you can configure it directly with the command below. Some OVS versions have removed the commands above.

#ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/add-remote ptcp:6642:{{ ovn_db_ip }}
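
Before changing the listen configuration, it can help to test from a compute node whether the SB port is reachable at all. This sketch uses bash's `/dev/tcp` redirection and coreutils `timeout`. `can_connect` is a made-up helper name, and `{ctrl_ip}` in the usage line is a placeholder.

```shell
# can_connect HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Inside `bash -c`, $0 is HOST and $1 is PORT.
can_connect() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage:
# can_connect {ctrl_ip} 6642 && echo "SB reachable" || echo "SB unreachable"
```

A failure here points to the listen configuration or a firewall rather than to ovn-controller itself.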

Check 4: Check ovn-northd

# tail -f /var/log/openvswitch/ovn-northd.log

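ovn-northd starts the two OVSDB databases, NB and SB. Sometimes openvswitch was not stopped first during installation, which leaves the NB/SB database ports held by the existing openvswitch process. Because the monitor is running, you must stop the monitor before you can stop the openvswitch process.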
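
To see which process is holding the NB/SB ports, filter the listening sockets for 6641 and 6642. This sketch assumes `ss` from iproute2 is installed. `ovn_db_listeners` is a hypothetical helper that only filters text.

```shell
# ovn_db_listeners: keep lines that contain an address ending in :6641 or :6642.
ovn_db_listeners() {
    grep -E ':(6641|6642)[[:space:]]'
}

# Usage (as root, so the owning process is shown):
# ss -ltnp | ovn_db_listeners
```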

Check 5: Check whether vswitchd has logged errors. Check every node.

[root@cmp1 ~]# tail -f /var/log/openvswitch/ovs-vswitchd.log

Errors here are usually flow configuration problems. The most likely cause is a version mismatch between the kernel module and the user-space openvswitch.

Check 6: Check that the tunnels have been established. Check every node.

[root@cmp1 ~]# ovs-vsctl show
b8d3be1f-9d8a-44b3-a415-2e0434d76b94
    Bridge br-int
        fail_mode: secure
        Port "ovn-f4001b-0"
            Interface "ovn-f4001b-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="192.168.200.91"}
        Port br-int
            Interface br-int
                type: internal
        Port "tapccdd9d77-21"
            Interface "tapccdd9d77-21"
        Port "ovn-b250f7-0"
            Interface "ovn-b250f7-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="192.168.200.93"}
    ovs_version: "2.7.100"
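
On a node with many ports, the tunnel endpoints are easier to read when extracted from the output. This short sketch only parses the text format shown above; `tunnel_remote_ips` is an illustrative name.

```shell
# tunnel_remote_ips: read `ovs-vsctl show` output on stdin and print one
# remote_ip per line.
tunnel_remote_ips() {
    sed -n 's/.*remote_ip="\([^"]*\)".*/\1/p'
}

# Usage:
# ovs-vsctl show | tunnel_remote_ips | sort
```

For the sample output above this yields 192.168.200.91 and 192.168.200.93, the two peers of this node.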

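Check 7: Check that OVN-SB has learned the chassis that correspond to the tunnels from Check 6. Check every node.

Geneve alone is normally enough. VXLAN is there for HW VTEP use; if you have no HW VTEP devices, you do not need it.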

[root@ctrl ~]# ovn-sbctl show
Chassis "f4001bf1-3cb4-4828-8cc1-9a8f3d99711a"
    hostname: ctrl
    Encap vxlan
        ip: "192.168.200.91"
        options: {csum="true"}
    Encap geneve
        ip: "192.168.200.91"
        options: {csum="true"}
Chassis "7db0d092-2d3b-4a16-b553-348ca34f233b"
    hostname: "cmp1"
    Encap geneve
        ip: "192.168.200.92"
        options: {csum="true"}
    Encap vxlan
        ip: "192.168.200.92"
        options: {csum="true"}
Chassis "b250f793-28e6-4d14-878e-5eba43381728"
    hostname: "cmp2"
    Encap geneve
        ip: "192.168.200.93"
        options: {csum="true"}
    Encap vxlan
        ip: "192.168.200.93"
        options: {csum="true"}
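
The Geneve endpoint of each chassis can be pulled out of this listing and compared with the remote_ip values that `ovs-vsctl show` reports on each node. This sketch assumes the `ovn-sbctl show` layout shown above; `chassis_geneve_ips` is a hypothetical helper.

```shell
# chassis_geneve_ips: read `ovn-sbctl show` output on stdin and print the ip
# of every "Encap geneve" block, one per line.
chassis_geneve_ips() {
    awk '
        $1 == "Encap" { encap = $2 }
        $1 == "ip:" && encap == "geneve" { gsub(/"/, "", $2); print $2 }
    '
}

# Usage:
# ovn-sbctl show | chassis_geneve_ips | sort
```

A node's own IP appears in this list but not among its tunnels, so each node should have a tunnel to every listed IP except its own.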

How to configure it:

systemctl start openvswitch
ovs-vsctl set Open_Vswitch . external-ids:hostname={hostname}

All of the settings below are required. If one is missing, you will see it reported as a repeated warning in the ovs-vswitchd log.

ovs-vsctl set open . external-ids:ovn-remote=tcp:{ctrl_ip}:6642
ovs-vsctl set open . external-ids:ovn-encap-type=geneve,vxlan
ovs-vsctl set open . external-ids:ovn-encap-ip={node_ip}
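
A quick way to confirm that none of the three settings was forgotten is to scan the external_ids map for them. This sketch assumes the map is printed in the usual `{key=value, ...}` form by `ovs-vsctl get`; `missing_ovn_keys` is a made-up helper name.

```shell
# missing_ovn_keys: read the external_ids map on stdin and print each
# required OVN key that is absent.
missing_ovn_keys() {
    ids=$(cat)
    for key in ovn-remote ovn-encap-type ovn-encap-ip; do
        printf '%s\n' "$ids" | grep -Eq "(^|[{ ,\"])${key}\"?=" || echo "$key"
    done
}

# Usage:
# ovs-vsctl get Open_vSwitch . external_ids | missing_ovn_keys
```

Empty output means all three keys are present.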

Check 8: Check /var/log/neutron/server.log for errors.

Q1: SQL problems and ovsdb problems are both database related. Confirm that the Open vSwitch database schema version is correct and that the neutron database has been upgraded.

#neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade heads

Q2: The ideal installation environment has no networks or security groups configured yet. If some already exist, synchronize them with the command below.

#neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini

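Q3: neutron fails to authenticate with nova

A3: After neutron successfully creates a port in OVN, it sends a callback to nova, and that call requires authentication.

Add the following to /etc/neutron/neutron.conf. This applies only to the Liberty release; later releases use the nova_admin_xxx option names.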

[nova]
# Authentication URL (string value)
auth_url=http://{ctrl_ip}:35357

# Authentication type to load (string value)
auth_type=password

# Username (string value)
username=nova

# User's password (string value); search for nova_admin_password to find it
password=a0da31d154f74c18

# Project name to scope to (string value)
project_name=services

# Tenant Name (string value)
tenant_name=services

You can use curl first to confirm that the authentication values are correct.

curl -i -X POST http://192.168.200.71:35357/v2.0/tokens -H "Content-Type: application/json" -H "User-Agent: python-keystoneclient" -d '{"auth": {"tenantName": "admin", "passwordCredentials": {"username": "admin", "password": "password"}}}'

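Q4: If the listen port is already in use, the root wrapper may have been left running when the other processes were removed. Use the command below to kill all neutron-related processes.

#ps -ef |awk '/neutron/{print $2}'|xargs kill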

Q5: If connecting to the OVN database fails, confirm that the ml2 configuration is correct.

[ovn]
ovn_nb_connection = tcp:{ctrl_ip}:6641
ovn_sb_connection = tcp:{ctrl_ip}:6642
ovn_l3_mode = False
ovn_l3_scheduler = chance
ovn_native_dhcp = True
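
To check what neutron will actually use, the values can be read back from the file rather than by eye. This is a minimal ini reader using awk; `ini_get` is a hypothetical helper, and it ignores ini features such as multi-line values.

```shell
# ini_get FILE SECTION KEY: print the value of KEY inside [SECTION].
ini_get() {
    awk -v section="$2" -v key="$3" '
        { line = $0; gsub(/^[ \t]+|[ \t]+$/, "", line) }
        # A section header switches matching on or off.
        line ~ /^\[.*\]$/ { in_section = (line == "[" section "]"); next }
        # Inside the wanted section, split non-comment lines at the first "=".
        in_section && line !~ /^[#;]/ && (eq = index(line, "=")) > 0 {
            k = substr(line, 1, eq - 1); v = substr(line, eq + 1)
            gsub(/[ \t]+$/, "", k); gsub(/^[ \t]+/, "", v)
            if (k == key) print v
        }
    ' "$1"
}

# Usage:
# ini_get /etc/neutron/plugins/ml2/ml2_conf.ini ovn ovn_sb_connection
```

The printed value should be the `tcp:` form of the address and port that the SB database listens on.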

Q7: Schema mismatch:

Exception: {u'error': u'unknown database', u'details': u'get_schema request specifies unknown database OVN_Southbound', u'syntax': u'["OVN_Southbound"]'}

A7: The ovn-sb or ovn-nb database may not be reachable over TCP, so set the connection:

ovn-nbctl set-connection ptcp:6641:{ctrl_ip}
ovn-sbctl set-connection ptcp:6642:{ctrl_ip}

NOTE: The OVN-SB and OVN-NB databases are created automatically when you install the packages, but by default they only listen on the unix socket. Configure TCP as needed.

Check 9: Check that neutron-l3-agent has no errors.

tail -f /var/log/neutron/l3-agent.log

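If this installation uses the l3 agent, the agent creates routers through ip netns. Do not remove these namespaces casually; otherwise the l3 agent will fail and the OVN logical router will be used by mistake.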

# ip netns
qrouter-241093c1-da31-4c1e-a6ff-e5df65b5da76

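Q1: ovn-nbctl show lists lr-xxxx

A1: This means the l3 agent hit an error and neutron does not know the agent is alive. Delete this router and confirm that the l3 agent is working properly.

Q2: When I add a router port to the external network, no corresponding patch port is created between br-int and br-ex.

A2.1: Check the interface_driver setting in /etc/neutron/l3_agent.ini.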

interface_driver = openvswitch

A2.2: Check external-ids:ovn-bridge-mappings; it is very easy to mistype.

# ovs-vsctl get open . external-ids:ovn-bridge-mappings
"public:br-ex"

#ovs-vsctl set open . external-ids:ovn-bridge-mappings=public:br-ex
Example of removing it:
#ovs-vsctl remove Open_vSwitch 2280f056-46bf-4a0a-9130-130d2fad8a13 external_ids ovn-bridge-mappings
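
Since the mapping string is easy to mistype, a small format check can catch the obvious mistakes before ovn-controller reads it. This sketch assumes each entry must look like `network:bridge`, with entries separated by commas; `check_bridge_mappings` is an illustrative helper, not an OVS command.

```shell
# check_bridge_mappings "net:bridge[,net:bridge...]"
# Prints malformed entries; succeeds only when every entry is NAME:BRIDGE.
check_bridge_mappings() {
    [ -n "$1" ] || { echo "empty mapping"; return 1; }
    status=0
    old_ifs=$IFS; IFS=,
    for entry in $1; do
        case "$entry" in
            *:*:*|:*|*:) echo "bad entry: $entry"; status=1 ;;
            *:*) ;;                       # exactly one colon, both sides set
            *) echo "bad entry: $entry"; status=1 ;;
        esac
    done
    IFS=$old_ifs
    return $status
}

# Usage (strip the quotes that ovs-vsctl prints around the value):
# check_bridge_mappings "$(ovs-vsctl get open . external-ids:ovn-bridge-mappings | tr -d '"')"
```

This checks the format only. Whether the bridge itself exists can be tested separately with `ovs-vsctl br-exists br-ex`.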

A2.3: If the patch port still cannot be created, check whether a qg port exists. Delete it and try again.

Bridge br-ex
    Port "qg-e9111b6f-82"
        Interface "qg-e9111b6f-82"
            type: internal
