Performance Testing: OVN vs. ML2+OVS

2020-06-16 17:12:51

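We have run many performance tests on OVN, but we were missing a head-to-head comparison between OVN and ML2+OVS. I worked with a number of people to compare the two backends. This article is the first part: a comparison of control plane performance. Data plane results will follow in a separate post.

Control Plane Differences

The ML2+OVS control plane is built on OpenStack components. It relies on a large set of agents written in Python. The Neutron server communicates with these agents through an AMQP-based RPC mechanism (the setup in this article uses RabbitMQ, the most widely deployed option).

OVN's control plane is driven by a distributed database. Configuration and state are managed through two databases: the OVN northbound and southbound databases, both based on OVSDB. Instead of receiving updates over RPC, OVN components monitor the relevant tables in the database for changes and apply the latest rows locally. For more detail on these components, see the post on the first release of OVN and the ovn-architecture document.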
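
The database-driven pattern can be illustrated with a toy sketch. This is not the OVSDB protocol or any OVN API: ToyDatabase and ToyController are hypothetical names invented for illustration, and the only real identifier borrowed is the southbound table name Port_Binding. The point it tries to show is that a component which monitors a table converges on the current state whenever it subscribes, rather than depending on having received every message pushed to it.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Row = Dict[str, str]
Callback = Callable[[str, Row], None]


class ToyDatabase:
    """In-memory stand-in for a monitored database (hypothetical, not OVSDB)."""

    def __init__(self) -> None:
        self.tables: Dict[str, Dict[str, Row]] = defaultdict(dict)
        self.monitors: Dict[str, List[Callback]] = defaultdict(list)

    def monitor(self, table: str, callback: Callback) -> None:
        # Replay existing rows first so a late subscriber sees current state.
        for key, row in self.tables[table].items():
            callback(key, row)
        self.monitors[table].append(callback)

    def upsert(self, table: str, key: str, row: Row) -> None:
        # Store the row, then notify every component monitoring this table.
        self.tables[table][key] = row
        for callback in self.monitors[table]:
            callback(key, row)


class ToyController:
    """Keeps a local copy of the rows it monitors (hypothetical per-host agent)."""

    def __init__(self, name: str, db: ToyDatabase) -> None:
        self.name = name
        self.local_state: Dict[str, Row] = {}
        db.monitor("Port_Binding", self.on_change)

    def on_change(self, key: str, row: Row) -> None:
        # Apply the latest version of the row locally.
        self.local_state[key] = row


db = ToyDatabase()
db.upsert("Port_Binding", "port-1", {"chassis": "compute-1"})
early = ToyController("compute-1", db)   # subscribes after port-1 exists
db.upsert("Port_Binding", "port-2", {"chassis": "compute-2"})
late = ToyController("compute-2", db)    # subscribes after both rows exist

# Both controllers hold the same view regardless of when they subscribed.
print(early.local_state == late.local_state)
```

In an RPC-style design, by contrast, the sender decides who is notified and when, so a receiver that was down has to be resynchronized explicitly.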

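OVN does not use any Neutron agents. Instead, all functionality, including security groups, DHCP, L3 routing, and NAT, is implemented by ovn-controller and OVS flows.

Hardware and Software Used in the Test Environment

The tests used 13 machines in our lab, assigned the following roles:

1 OpenStack TripleO Undercloud for provisioning
3 Controllers (OpenStack and OVN control plane services)
9 Compute Nodes (Hypervisors)

Hardware specifications: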

2x E5-2620 v2 (12 total cores, 24 total threads)
64GB RAM
4 x 1TB SATA
1 x Intel X520 Dual Port 10G

Software:

CentOS 7.2
OpenStack, OVS, and OVN from their master branches (early December, 2016)
Neutron configuration notes
- (OVN) 6 API workers, 1 RPC worker (since rpc is not used and neutron requires at least 1) for neutron-server on each controller (x3)
- (ML2+OVS) 6 API workers, 6 RPC workers for neutron-server on each controller (x3)
- (ML2+OVS) DVR was enabled

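Configuration and Testing

The performance tests were run with OpenStack Rally. We used Browbeat to quickly install, configure, and run the tests, and to store, analyze, and compare the results.

The Rally parameters in the Browbeat configuration were: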

rerun: 3
...
rally:
  enabled: true
  sleep_before: 5
  sleep_after: 5
  venv: /home/stack/rally-venv/bin/activate
  plugins:
    - netcreate-boot: rally/rally-plugins/netcreate-boot
    - subnet-router-create: rally/rally-plugins/subnet-router-create
    - neutron-securitygroup-port: rally/rally-plugins/neutron-securitygroup-port
  benchmarks:
    - name: neutron
      enabled: true
      concurrency:
        - 8
        - 16
        - 32
      times: 500
      scenarios:
        - name: create-list-network
          enabled: true
          file: rally/neutron/neutron-create-list-network-cc.yml
        - name: create-list-port
          enabled: true
          file: rally/neutron/neutron-create-list-port-cc.yml
        - name: create-list-router
          enabled: true
          file: rally/neutron/neutron-create-list-router-cc.yml
        - name: create-list-security-group
          enabled: true
          file: rally/neutron/neutron-create-list-security-group-cc.yml
        - name: create-list-subnet
          enabled: true
          file: rally/neutron/neutron-create-list-subnet-cc.yml
    - name: plugins
      enabled: true
      concurrency:
        - 8
        - 16
        - 32
      times: 500
      scenarios:
        - name: netcreate-boot
          enabled: true
          image_name: cirros
          flavor_name: m1.xtiny
          file: rally/rally-plugins/netcreate-boot/netcreate_boot.yml
        - name: subnet-router-create
          enabled: true
          num_networks: 10
          file: rally/rally-plugins/subnet-router-create/subnet-router-create.yml
        - name: neutron-securitygroup-port
          enabled: true
          file: rally/rally-plugins/neutron-securitygroup-port/neutron-securitygroup-port.yml

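The configuration above defines several scenarios. Each one runs 500 times at each of three concurrency levels. Finally, the "rerun: 3" at the top means the whole configuration is run three times. If that sounds convoluted, an example helps:

The "netcreate-boot" scenario creates a network and boots a VM on that network. It is executed as follows: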

Run 1
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up
Run 2
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up
Run 3
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up

In total, 4500 VMs are created (3 runs x 3 concurrency levels x 500 VMs).
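
The arithmetic behind that total can be checked with a few lines of Python. This is only a sketch of how the three settings multiply out; the constants mirror "rerun", "concurrency", and "times" from the configuration, while run_matrix is a hypothetical helper and not part of Browbeat or Rally.

```python
from itertools import product

RERUN = 3                  # "rerun: 3" in the Browbeat settings
CONCURRENCY = [8, 16, 32]  # "concurrency" list of the plugins benchmark
TIMES = 500                # "times: 500"


def run_matrix(rerun, concurrency_levels, times):
    """Return one (run, concurrency, iterations) tuple per scenario execution."""
    runs = range(1, rerun + 1)
    return [(run, level, times) for run, level in product(runs, concurrency_levels)]


matrix = run_matrix(RERUN, CONCURRENCY, TIMES)
total_vms = sum(times for _, _, times in matrix)

for run, level, times in matrix:
    print(f"Run {run}: create {times} VMs, {level} at a time")
print("total VMs:", total_vms)
```

Each tuple corresponds to one bullet in the Run 1 to Run 3 listing, so the matrix has nine entries.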

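Test Results

Browbeat can store the results that Rally generates. They can then be queried with Elasticsearch and displayed in a browser with Kibana.

The following tables show API performance in terms of average, 95th percentile, maximum, and minimum times.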

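Translator's note: percentile response times are a better performance indicator than the average response time. The mean is not a robust statistic: a handful of values far outside the typical range can skew it heavily.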
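
A small sketch makes the outlier effect concrete. The numbers are invented for illustration and are not taken from the test results; the median (the 50th percentile) stands in here for percentile-based metrics in general.

```python
import statistics


def summarize(samples):
    """Return the mean and median of a list of response times."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
    }


# Illustrative data only: 100 requests that all take 1 second...
steady = [1.0] * 100
# ...versus 98 such requests plus two that take a full minute.
with_outliers = [1.0] * 98 + [60.0, 60.0]

print("steady:       ", summarize(steady))
print("with outliers:", summarize(with_outliers))
```

Two slow requests out of a hundred more than double the mean, while the median does not move.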

Translator's note: the average is the mean time across the runs at concurrency 8, 16, and 32.

API	ML2+OVS Average	OVN Average	% improvement
nova.boot_server	80.672	23.45	70.93%
neutron.list_ports	6.296	6.478	-2.89%
neutron.list_subnets	5.129	3.826	25.40%
neutron.add_interface_router	4.156	3.509	15.57%
neutron.list_routers	4.292	3.089	28.03%
neutron.list_networks	2.596	2.628	-1.23%
neutron.list_security_groups	2.518	2.518	0.00%
neutron.remove_interface_router	3.679	2.353	36.04%
neutron.create_port	2.096	2.136	-1.91%
neutron.create_subnet	1.775	1.543	13.07%
neutron.delete_port	1.592	1.517	4.71%
neutron.create_security_group	1.287	1.372	-6.60%
neutron.create_network	1.352	1.285	4.96%
neutron.create_router	1.181	0.845	28.45%
neutron.delete_security_group	0.763	0.793	-3.93%
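
The "% improvement" column is consistent with the relative reduction in time, (ML2+OVS - OVN) / ML2+OVS, so a negative value means OVN was slower. The sketch below assumes that formula (it is inferred from the numbers, not stated in the article) and recomputes three rows of the table above; improvement is a hypothetical helper name.

```python
def improvement(ml2_ovs: float, ovn: float) -> float:
    """Percent reduction in time from ML2+OVS to OVN (positive = OVN faster)."""
    return round((ml2_ovs - ovn) / ml2_ovs * 100, 2)


# (ML2+OVS average, OVN average) copied from the table above.
rows = {
    "nova.boot_server": (80.672, 23.45),
    "neutron.list_ports": (6.296, 6.478),
    "neutron.create_router": (1.181, 0.845),
}

for api, (ml2_ovs, ovn) in rows.items():
    print(f"{api}: {improvement(ml2_ovs, ovn):.2f}%")
```

The results match the table's 70.93%, -2.89%, and 28.45% for these rows.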

Translator's note: the 95th percentile is found by sorting the 4500 results in ascending order and taking the 4275th value (4500 * 0.95 = 4275).
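
The selection rule in the note is the nearest-rank method, sketched below. Whether Browbeat or Elasticsearch computes percentiles exactly this way is not stated in the article, so treat this as an illustration of the note rather than of the tooling; percentile_nearest_rank is a hypothetical helper and the samples are synthetic.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the sample at 1-based rank ceil(pct * n / 100) after sorting."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic data in which every value equals its own rank, in reverse order
# to show that sorting happens inside the helper.
samples = list(range(4500, 0, -1))
p95 = percentile_nearest_rank(samples, 95)
print("95th percentile is the value at rank", p95)
```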

API	ML2+OVS 95%	OVN 95%	% improvement
nova.boot_server	163.2	35.336	78.35%
neutron.list_ports	11.038	11.401	-3.29%
neutron.list_subnets	10.064	6.886	31.58%
neutron.add_interface_router	7.908	6.367	19.49%
neutron.list_routers	8.374	5.321	36.46%
neutron.list_networks	5.343	5.171	3.22%
neutron.list_security_groups	5.648	5.556	1.63%
neutron.remove_interface_router	6.917	4.078	41.04%
neutron.create_port	5.521	4.968	10.02%
neutron.create_subnet	4.041	3.091	23.51%
neutron.delete_port	2.865	2.598	9.32%
neutron.create_security_group	3.245	3.547	-9.31%
neutron.create_network	3.089	2.917	5.57%
neutron.create_router	2.893	1.92	33.63%
neutron.delete_security_group	1.776	1.72	3.15%

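Translator's note: Maximum is the longest time recorded for a single call to each API, presumably during the runs at concurrency 32.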

API	ML2+OVS Maximum	OVN Maximum	% improvement
nova.boot_server	221.877	47.827	78.44%
neutron.list_ports	29.233	32.279	-10.42%
neutron.list_subnets	35.996	17.54	51.27%
neutron.add_interface_router	29.591	22.951	22.44%
neutron.list_routers	19.332	13.975	27.71%
neutron.list_networks	12.516	13.765	-9.98%
neutron.list_security_groups	14.577	13.092	10.19%
neutron.remove_interface_router	35.546	9.391	73.58%
neutron.create_port	53.663	40.059	25.35%
neutron.create_subnet	46.058	26.472	42.52%
neutron.delete_port	5.121	5.149	-0.55%
neutron.create_security_group	14.243	13.206	7.28%
neutron.create_network	32.804	32.566	0.73%
neutron.create_router	14.594	6.452	55.79%
neutron.delete_security_group	4.249	3.746	11.84%

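Translator's note: Minimum is the shortest time recorded for a single call to each API, presumably during the runs at concurrency 8.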

API	ML2+OVS Minimum	OVN Minimum	% improvement
nova.boot_server	18.665	3.761	79.85%
neutron.list_ports	0.195	0.22	-12.82%
neutron.list_subnets	0.252	0.187	25.79%
neutron.add_interface_router	1.698	1.556	8.36%
neutron.list_routers	0.185	0.147	20.54%
neutron.list_networks	0.21	0.174	17.14%
neutron.list_security_groups	0.132	0.184	-39.39%
neutron.remove_interface_router	1.557	1.057	32.11%
neutron.create_port	0.58	0.614	-5.86%
neutron.create_subnet	0.42	0.416	0.95%
neutron.delete_port	0.464	0.46	0.86%
neutron.create_security_group	0.081	0.094	-16.05%
neutron.create_network	0.113	0.179	-58.41%
neutron.create_router	0.077	0.053	31.17%
neutron.delete_security_group	0.092	0.104	-13.04%

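Analysis

As the tables show, the largest improvement with OVN is in "nova.boot_server". This measurement covers not only the time it takes to load configuration into Neutron but also the time it takes to actually provision the network. (Translator's note: the "server" here is simply a virtual machine.)

When Nova boots a VM, it must first wait for an event from Neutron indicating that the port is ready. Only after receiving this event is the VM started, and it becomes ACTIVE once boot completes. Both ML2+OVS and OVN use this mechanism. Our test scenario measured the time it took for the VM to reach the ACTIVE state.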
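
The wait-for-event handshake can be modeled with a toy sketch. It uses Python's threading.Event as a stand-in for the notification Neutron sends to Nova; boot_server and backend_provisions_port are hypothetical names, the delays are arbitrary, and none of this is Nova or Neutron code. It shows why the measured boot time includes however long the network backend takes to report the port as ready.

```python
import threading
import time


def boot_server(port_ready: threading.Event, timeout: float) -> str:
    """Toy model: the VM only becomes ACTIVE after the port-ready signal."""
    # Block until the backend signals the port is ready, or give up.
    if port_ready.wait(timeout):
        return "ACTIVE"
    return "ERROR"


def backend_provisions_port(port_ready: threading.Event, delay: float) -> None:
    """Toy model of the network backend: provision, then send the event."""
    time.sleep(delay)  # stands in for the time spent wiring up the port
    port_ready.set()


port_ready = threading.Event()
worker = threading.Thread(target=backend_provisions_port, args=(port_ready, 0.05))

start = time.monotonic()
worker.start()
status = boot_server(port_ready, timeout=2.0)
elapsed = time.monotonic() - start
worker.join()

print(f"{status} after {elapsed:.3f}s")
```

A slower backend (a larger delay) lengthens the time to ACTIVE by the same amount, which is the effect the nova.boot_server rows capture.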

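In future tests, we will disable this synchronization between Nova and Neutron and compare ML2+OVS and OVN again. That should confirm whether the extra time is being spent waiting for Neutron to report that the port is ready.

To be clear, you should not disable this synchronization. The only reason it can be disabled is that not all Neutron backends support it (both ML2+OVS and OVN do). With synchronization in place, race conditions are avoided and the network is guaranteed to be ready before the VM starts. The question is how long it takes Neutron to provide a working network. Future work will analyze where Neutron (ML2+OVS) spends most of its time while provisioning the network.

Permanent link to this article: http://www.linuxidc.com/Linux/2017-06/144766.htm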
