Replacing ovs-vsctl calls with native OVSDB in neutron
Summary
I have a patch up for review that adds a drop-in replacement for ovs_lib. It uses openvswitch’s python bindings to make OVSDB calls instead of running ovs-vsctl. Here is the spec that was approved for Juno, which I will need to update for Kilo.
Both the current ovs_lib and ovs-vsctl seem to scale quadratically with the number of ports on a system, whereas ovs_lib2 scales linearly.
Please take a look at the review and make suggestions. There’s still some stuff to do, but it should be in a testable state.
Benchmarking
Test setup
Test setup is just a devstack VM with the dummy network kernel module loaded and set to create 1000 dummy devices. Create /etc/modprobe.d/dummy.conf with:
and then:
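A minimal sketch of these two steps, assuming the dummy module's `numdummies` option is what sets the device count and that the second step is loading the module with modprobe. The helper names are hypothetical, and the script only prints what it would do, since both steps need root:

```python
# Sketch only: "numdummies" is assumed to be the dummy module option that
# sets how many dummyN devices are created when the module loads.
DUMMY_CONF_PATH = "/etc/modprobe.d/dummy.conf"


def dummy_conf_text(num_devices):
    """Return the modprobe.d line asking for num_devices dummy devices."""
    return "options dummy numdummies=%d\n" % num_devices


def modprobe_argv():
    """Return the command that loads the dummy module (needs root)."""
    return ["sudo", "modprobe", "dummy"]


if __name__ == "__main__":
    # Print instead of writing or executing anything on the host.
    print("%s:\n%s" % (DUMMY_CONF_PATH, dummy_conf_text(1000)))
    print(" ".join(modprobe_argv()))
```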
Baseline - bash and ovs-vsctl
First, let’s remove the need for sudo by quickly doing:
Now we can test adding 100 ports w/o sudo overhead:
So that isn’t too bad, actually. What happens if we use sudo?
So we’re about 5x slower just having to use sudo from the CLI. What about rootwrap?
Using sudo rootwrap is around 20x slower than the baseline of using no privilege escalation tool at all.
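The three variants compared above differ only in the prefix placed in front of ovs-vsctl. The following is a rough sketch of a timing harness for them, not the commands used for the measurements in this post. The bridge name `br-int`, the rootwrap config path, and the function names are assumptions; the `runner` argument is injectable so the loop can be dry-run without touching OVS:

```python
import time

# Privilege-escalation prefixes compared in the text. The rootwrap config
# path is an assumption; adjust it for the deployment.
PREFIXES = {
    "none": [],
    "sudo": ["sudo"],
    "rootwrap": ["sudo", "neutron-rootwrap", "/etc/neutron/rootwrap.conf"],
}


def add_port_argv(prefix, bridge, port):
    """Build the argv for one ovs-vsctl add-port call under a given prefix."""
    return PREFIXES[prefix] + ["ovs-vsctl", "add-port", bridge, port]


def time_add_ports(prefix, bridge, count, runner):
    """Run `count` add-port calls through `runner`; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        runner(add_port_argv(prefix, bridge, "dummy%d" % i))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Dry run: print the first command per prefix instead of executing it.
    # On a real host, pass subprocess.check_call as the runner.
    for name in ("none", "sudo", "rootwrap"):
        print(" ".join(add_port_argv(name, "br-int", "dummy0")))
```

Passing `subprocess.check_call` as `runner` on a test VM would execute the commands for real; each process spawn then includes the cost of the chosen prefix.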
Now, what about adding 1000 ports? Does it scale linearly? Do we get around 13 seconds for adding 1000 ports with no sudo?
No, we do not. ovs-vsctl does a dump of most of the database each time it runs. The more ports in the DB, the slower each successive call will be.
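To see why a per-call dump leads to quadratic growth, here is a simplified cost model. It assumes each call reads one row for every port already in the database and ignores all other tables and fixed costs, so it illustrates the trend rather than predicting timings. The function names are made up for this sketch:

```python
def rows_read_per_call_dump(num_ports):
    """Total rows read when call k re-reads the k-1 ports already present."""
    return sum(range(num_ports))


def rows_read_closed_form(num_ports):
    """Same total via the arithmetic series n * (n - 1) / 2."""
    return num_ports * (num_ports - 1) // 2


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, rows_read_per_call_dump(n))
```

Under this model, going from 100 to 1000 ports multiplies the dump work by roughly 100 rather than 10.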
Testing ovs_lib1 against ovs_lib2
Here is a simple script to benchmark ovs_lib1 against ovs_lib2.
which results in:
So calling ovs-vsctl directly with sudo seems to be about 2x as fast as using ovs_lib1 to do the same thing, and using ovs_lib2 is roughly the same speed as calling ovs-vsctl directly without sudo.
What about with 1000 ports instead? Does ovs_lib2 scale better than ovs-vsctl? Let’s bump the range(100) to range(1000) and remove the greenpool stuff (1000 spawns would just cause ovs-vsctl timeouts and open file descriptor errors) and see:
Yes! ovs_lib2 does scale linearly. OVS’s python IDL library also caches the database, but it does so once, upon connection, and ovs_lib2 maintains that connection and reuses it, so the cost of the initial dump is not paid again on every call.
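The effect of reusing one connection can be illustrated with a toy model. `CachedConnection` below is a hypothetical stand-in, not the real OVS IDL class or ovs_lib2 code; it only counts how many rows a long-lived client would have to read if it dumps the database once and then applies single-row updates:

```python
class CachedConnection(object):
    """Toy stand-in for a long-lived OVSDB client (not the real ovs IDL)."""

    def __init__(self, existing_ports):
        # One full dump when the connection is opened.
        self.cache = set(existing_ports)
        self.rows_read = len(self.cache)

    def add_port(self, name):
        # Afterwards only the changed row is applied to the cache.
        if name not in self.cache:
            self.cache.add(name)
            self.rows_read += 1


def rows_read_with_reuse(num_ports):
    """Rows read when all ports are added over a single connection."""
    conn = CachedConnection(existing_ports=[])
    for i in range(num_ports):
        conn.add_port("dummy%d" % i)
    return conn.rows_read
```

In this model the work grows in step with the number of ports, which matches the linear behaviour described above.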
Caveats
If we go this route, we’ll have to talk about the recommended way of handling privileges. This could be done by connecting to openvswitch via TCP/SSL and controlling access with firewall rules, and/or by having deployment tools or packaging modify the owner and permissions of the ovsdb unix socket.
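As one possible shape for the socket-permissions option, a deployment tool could hand group access on the socket to the group the agent runs as. This is only a sketch: the socket path is the commonly used default and may differ per distribution, the choice of mode 0660 is an assumption, and `grant_group_access` is a hypothetical helper, not part of the patch under review:

```python
import os
import stat

# Assumed default location of the ovsdb-server unix socket.
DEFAULT_DB_SOCK = "/var/run/openvswitch/db.sock"


def grant_group_access(path, gid):
    """Give group `gid` read/write access to `path`, leaving the owner as is."""
    os.chown(path, -1, gid)   # -1 keeps the current owner uid
    os.chmod(path, 0o660)     # rw for owner and group, nothing for others
    return stat.S_IMODE(os.stat(path).st_mode)
```

A packaging script running as root might call `grant_group_access(DEFAULT_DB_SOCK, gid)` with the gid of the agent's group after ovsdb-server has created the socket.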
Conclusion
Even without the overhead of sudo or rootwrap, there is room for dramatically improving performance of OVSDB operations by moving away from calling ovs-vsctl. Please take a look at the review even though I have it marked as Work In Progress. I crave your feedback!