We should have a chess.com tournament with 100 randomly-selected female players and 100 randomly-selected male players, then look at the final rankings.
That should eliminate the sampling bias from the male-skewed player population, while still being large enough to give a vaguely statistically significant result.
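For anyone wondering what "look at the final rankings" might mean in practice, here is a rough sketch in Python. It is only one way to do it, not a worked-out design: the function names (`sample_players`, `mean_rank_gap`, `permutation_p_value`) are made up for illustration, and the choice of a permutation test on the gap in average finishing position is my assumption, not something chess.com provides. The idea is to draw the two groups at random, then ask how often a gap as large as the observed one would turn up if the group labels were shuffled arbitrarily.

```python
import random


def sample_players(pool, n, rng):
    """Pick n players uniformly at random, without replacement (hypothetical helper)."""
    return rng.sample(pool, n)


def mean_rank_gap(ranks_a, ranks_b):
    """Difference between the two groups' average finishing positions."""
    return sum(ranks_a) / len(ranks_a) - sum(ranks_b) / len(ranks_b)


def permutation_p_value(ranks_a, ranks_b, rounds=10_000, seed=0):
    """Estimate how often shuffled group labels give a gap at least as
    large as the observed one (two-sided permutation test)."""
    rng = random.Random(seed)
    observed = abs(mean_rank_gap(ranks_a, ranks_b))
    pooled = list(ranks_a) + list(ranks_b)
    n_a = len(ranks_a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(mean_rank_gap(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)
```

You would feed it the final standings (1 = winner, 200 = last) split by group. A small value suggests the gap is unlikely to be down to chance alone; a large one means the tournament did not separate the groups.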
It's quite unusual to be in a position to organise something like this (no educational psychologist has access to thousands of players at the drop of a hat to perform such an experiment - I can sense them squirming in jealous paroxysms already).
We should also keep in mind that computer engines are about to completely surpass human skill, so in the future there will need to be a separate robot championship. Then, when the powers of technology and biology are combined, a cyborg championship.