Anyone who wants to do "serious benchmark testing" must be in this special club or else they will take months and months to perform even the most basic tests. When they ask for help, they will be told, as I was by g-dog, that they need to show results first.
This is their "scientific" method.
You understand nothing! Absolutely nothing! One can do a reliable test by hand. One game takes at most one hour.
Let's also be very clear on the distinction between asking for help (assistance with the technical details of the methodology) and asking for "help" (volunteers to whom one can delegate the work of executing one's specific implementation of the methodology).
The latter demands that some level of credibility be built first (especially given the demands on others' time), whereas the former does not.


Steve, do you have to do the analysis and move comparison by hand or is the process automated in some fashion?
Do you just set the engine to Full Analysis with the parameters noted or do you have to walk through using Infinite Analysis and compare?
As has been stated, I use Batch Analyzer, which was developed in 2008 for use on another site.
After several days of trial and error, I managed to get it running in Vista/Windows 7.
The Delphi program has a 4-million-game database built in and starts analysis of multiple PGNs once out of book, under whatever depth or time-per-ply conditions you choose.
Here is the interface: