I've spent the last few days working on a support case where a customer was comparing the speed of Connector/Net with an application that uses our native C api. His test was using a server side prepared statement to insert 3 sets of values in a loop that executed 600,000 times. The table is a simple 3 column table and the SQL looks like this:
INSERT INTO test VALUES (@p1, @p2, @p3), (@p4, @p5, @p6), (@p7, @p8, @p9)
He was seeing 34,529 records / second using libmysql.dll but only about 24,000 records / second using Connector/Net. He understandably wanted to know why there was such a difference and if we could do something about it. Absolutely!
We solved this problem through a combination of optimizations made to our code and some changes made to the testing application. I won't say much about the optimizations. We made some obvious changes like moving a lot more of the prepared statement work to the prepare phase and out of the execute phase. We also discovered that BitConverter.ToBytes() is a lot slower than our hand-coded methods for reading and writing integer values. We replaced usages of ArrayList with generic equivalents (it's pretty amazing how much faster generics are actually).
One of the bigger changes we made was to use a second hash for parameter name lookups. When you are talking about loops with lots of iterations even the smallest inefficiencies can kill you. Since .NET supports name lookup for parameters we use a hashtable to map parameter names to indices (we made this change when one of our customer was complaining that parameter lookup was slow on one of his commands that had > 30,000 parameters. Yeah.). Name lookup should be case-insensitive so we use a case-insensitive comparer with our hashtable. So the optimization we made this time is to use two hash tables. The first is case sensitive and is checked first. If that fails then we check the second and return failure only if both fail.
That's what we did in our code. We also made some changes to the testing app to make sure the comparison is truly apples to apples. Let's first look at the loop code in each case.
Don't get hung up on the output of these being slightly different. The point here is that the C code is using simple pointer arithmetic during each loop where the C# code is doing array lookup. The array lookup code is quite a bit slower because the get accessor for the parameter collection does some bounds checking so that a nicer exception will be thrown in the event of an out-of-bounds index.
Another change that was made to the testing app was to disable the command timeout in the C# code. This is a small point but when the command timeout is greater than zero then a timer is constructed and started during each execution. In this case that is 600,000 timers! Again, since the C code doesn't provide any type of command timeout it's not a true apples-to-apples comparison.
So, in a nutshell, benchmarking against libmysql is sort of like racing a jet engine. Very fast but not a great deal of "safety features". Still, after many hours with dotTrace the results speak for themselves.
C app = 35,294 records per second.
C# app = 35,321 records per second.