Is there a program that...

ChessinBlackandWhite

Is there a program that I can plug games into that will show the approximate strength each side played at?

Rsava

Hey Michael, take a look at LucasChess:

http://lucaschess.host22.com/

I just started playing with it but it appears to do that - I think.

ChessinBlackandWhite

Does it allow you to enter a PGN? I am interested in looking at past OTB results.

Irontiger

This does not answer the question, but if one side crushed the other, it will be hard, no matter how well-designed the program, to assess the stronger side's rating. I mean, Carlsen and I would play moves of approximately equal value (if not the same moves) against a 1200 beginner, because there is nothing subtle to find in totally lost/won positions.

IpswichMatt

Lucas Chess is a worthwhile program and well worth downloading, but I don't believe it will do what the OP is asking for.

I think I've seen this question before on these forums, and the consensus was that there is no such program.

Which is surprising, because I'd have thought someone would have had a go at this. I guess the problem is that the program would work by comparing each move chosen against an engine's evaluation, and it's much harder to get a close correlation with the engine in complicated positions than in simpler ones.
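
(For illustration, a minimal sketch of that move-by-move comparison in Python, assuming the python-chess library and a UCI engine such as Stockfish on the system path; the fixed depth and the centipawn-loss measure are just one plausible way to do it, not how any particular product works:)

```python
# Sketch only: average centipawn loss per move, using python-chess and a UCI engine.
# Assumes "stockfish" is on the PATH and game.pgn holds one game; depth 15 is arbitrary.
import chess
import chess.engine
import chess.pgn

def centipawn_losses(pgn_path, engine_path="stockfish", depth=15):
    with open(pgn_path) as f:
        game = chess.pgn.read_game(f)
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    losses = []
    try:
        board = game.board()
        for move in game.mainline_moves():
            mover = board.turn
            # Best available evaluation before the move, from the mover's point of view.
            before = engine.analyse(board, chess.engine.Limit(depth=depth))
            best_cp = before["score"].pov(mover).score(mate_score=10000)
            board.push(move)
            # Evaluation after the move actually played, same point of view.
            after = engine.analyse(board, chess.engine.Limit(depth=depth))
            played_cp = after["score"].pov(mover).score(mate_score=10000)
            losses.append(max(0, best_cp - played_cp))
    finally:
        engine.quit()
    return losses
```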

Shivsky

This is a fascinating question: just how would any tool make an objective evaluation of a player's strength across a whole game?

Most "Playing vs Computer" tools like ChessMaster or Fritz will typically recommend pairing you UP (vs. a higher rated player) as you keep winning ... but to make an assessment in one game what Player A or B might be rated sounds rather error-prone.

Update:

Perhaps a set of "negative" metrics, like a driving license test, could be a start: you start the game at the engine's rating (2500+), and as it evaluates each move, it docks points (as low as 10, as high as 500, etc.) based on the mistakes/blunders/inaccuracies you played (qualitatively and quantitatively), normalizing for how often you deviate from the move the engine recommends (some nifty stats calculation). In the end, you get a +/- 100-200 point assessment of what it thinks your playing strength is. Not so accurate at the higher end, but it should be able to say "Player A *has* to be at least ____ rating".
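
(To make that concrete, here is one very rough way to code such a docking scheme on top of per-move centipawn losses like the ones sketched above; every threshold, penalty, and the 2500 ceiling is an illustrative guess, not a calibrated model:)

```python
def dock_points_estimate(cp_losses, ceiling=2500, floor=400):
    """Hypothetical 'driving test' metric: start at the engine's ceiling and
    dock points for each inaccuracy/mistake/blunder. All numbers are made up."""
    rating = ceiling
    for loss in cp_losses:
        if loss >= 300:      # blunder
            rating -= 200
        elif loss >= 100:    # mistake
            rating -= 75
        elif loss >= 50:     # inaccuracy
            rating -= 25
    return max(rating, floor)

# e.g. dock_points_estimate(centipawn_losses("game.pgn"))
```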

TBentley
IpswichMatt wrote:

Lucas Chess is a worthwhile program and well worth downloading, but I don't believe it will do what the OP is asking for.

I think I've seen this question before on these forums, and the consensus was that there is no such program.

Which is surprising, because I'd have thought someone would have had a go at this. I guess the problem is that the program would work by comparing each move chosen against an engine's evaluation, and it's much harder to get a close correlation with the engine in complicated positions than in simpler ones.

Your last paragraph is accurate. Here are a couple of articles that employ that technique (for a person's career or a tournament rather than a single game):

http://en.chessbase.com/home/TabId/211/PostId/4003455

http://chessbase.com/Home/TabId/211/PostId/4009400/the-quality-of-play-at-the-candidates-080413.aspx

And here's one that discusses the technique:

http://en.chessbase.com/home/TabId/211/PostId/4007621

IpswichMatt

@Shivsky, yes I would have thought such a tool would be available, but I'm not aware of any. As I said, I believe that the problem is that it is much more difficult to play the best moves when the position is complicated.

There was a study that attempted to determine the strongest chess player of all time by comparing players' moves with Houdini's, so a tool must have been developed for that. Capablanca was shown to have the closest correlation. The authors pointed out, though, that Capa's games tended to contain simpler positions, so it wasn't really a fair comparison.

IpswichMatt

@TBentley, thanks for that - I wasn't aware they'd done this analysis with the Candidates matches

Irontiger
Shivsky wrote:

Perhaps a set of "negative" metrics, like a driving license test, could be a start: you start the game at the engine's rating (2500+), and as it evaluates each move, it docks points (as low as 10, as high as 500, etc.) based on the mistakes/blunders/inaccuracies you played (qualitatively and quantitatively), normalizing for how often you deviate from the move the engine recommends (some nifty stats calculation). In the end, you get a +/- 100-200 point assessment of what it thinks your playing strength is. Not so accurate at the higher end, but it should be able to say "Player A *has* to be at least ____ rating".

If I understand your idea, it is the following:

1. Each move has some value, scaling from "perfect" to "huge blunder", that the computer can assess;

2. Ratings are converted to probabilities of playing a "perfect" move, a "huge blunder", etc., depending on how many such moves are available (for example, if all moves except one hang the queen, even a weak player could find the right one, whereas if all moves but one lose in 40 moves, it takes good calculation);

3. By matching how many blunders etc. a player makes in a game against those probabilities, the computer converts his performance to a rating (with some error margin depending on the length of the game, etc.).
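
(A minimal Python sketch of steps 2 and 3, assuming the per-move quality labels from step 1 are already available; the rating bands and per-move probabilities are invented purely for illustration:)

```python
import math

# Hypothetical probability of (perfect, inaccuracy, mistake, blunder) per move,
# by rating band. These figures are invented for illustration only.
BANDS = {
    1200: (0.55, 0.20, 0.15, 0.10),
    1600: (0.65, 0.20, 0.10, 0.05),
    2000: (0.75, 0.15, 0.08, 0.02),
    2400: (0.85, 0.10, 0.04, 0.01),
}

def most_likely_band(counts):
    """counts = (perfect, inaccuracy, mistake, blunder) observed over one game.
    Returns the band whose multinomial likelihood best matches the counts."""
    best_band, best_ll = None, -math.inf
    for band, probs in BANDS.items():
        log_likelihood = sum(n * math.log(p) for n, p in zip(counts, probs))
        if log_likelihood > best_ll:
            best_band, best_ll = band, log_likelihood
    return best_band

# e.g. most_likely_band((30, 6, 3, 1)) picks the band most consistent with 40 moves.
```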

 

I agree fully with 2 and 3, but I don't think a computer is adequately able to discriminate between blunders, mistakes, etc. A move that shows great planning skill but loses a piece to a 20-move tactic would be deemed worse by the computer than a passive waiting move. Humans and computers do not think the same way.

To sum up my objection: I think it would be easy to make a program that lets computers evaluate other computers, but not one that evaluates humans.

 

Add to this my earlier objection: a move that takes 10 extra moves to mate is deemed a blunder by the computer, even if it is safer, humanly speaking.

ChessinBlackandWhite

Hmm, I was thinking that using databases of players of all levels, one could create a program that gives a statistical analysis of which rating range would most likely play each move of the game, and then averages the ratings assigned to each move. Some moves would carry more significance than others, but this does not seem beyond the capabilities of today's computers and programmers. This is a disappointment :P
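
(For what it's worth, a rough sketch of that idea, assuming a pre-built index that maps (position, move) pairs to the ratings of the players who chose that move in the reference database; building the index and weighting the "more significant" moves are left out:)

```python
def rating_from_reference_games(game_moves, move_index):
    """game_moves: list of (fen, move_san) pairs from the game being rated.
    move_index: hypothetical dict mapping (fen, move_san) -> list of ratings
    of players who played that move in the reference database."""
    per_move_estimates = []
    for fen, move_san in game_moves:
        ratings = move_index.get((fen, move_san), [])
        if ratings:
            per_move_estimates.append(sum(ratings) / len(ratings))
    if not per_move_estimates:
        return None
    return sum(per_move_estimates) / len(per_move_estimates)
```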

ChessinBlackandWhite

I like the negative-metric idea. I don't need an answer like "Player A played at 1651"; a range like 1600-1700 or 1650-1700 would be helpful, I think.

Shivsky

@Irontiger:

Good points ... the algorithm could be tweaked to stop being "negative" about the human not playing the most efficient mate or not cleaning up the board in the most efficient manner; if he is winning in a less-than-optimal (but safe by human standards) way, then he is still winning, so no points docked there.

A classic example is a human (who is winning) playing the h2-h3/h4 luft to avoid silliness on the back rank, even though an engine would NEVER play that unless it was a factor in a forcing line :)

Though I agree 100% that a move that shows great planning and understanding but misses a 5-7 ply deep tactical shot (the kind that Masters do miss) may cruelly penalize a strong player and make him "appear" weak to the computer.

So the premise is indeed murky ...

TBentley
Irontiger wrote:

Add to this my earlier objection: a move that takes 10 extra moves to mate is deemed a blunder by the computer, even if it is safer, humanly speaking.

In the analysis in the ChessBase articles, they ignored moves where the evaluations of both the player's move and the engine's move were above +2 or below -2.
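
(One reading of that filter in code, taking +/-2 pawns as 200 centipawns:)

```python
def keep_for_scoring(best_cp, played_cp):
    """Skip moves where both the engine's choice and the played move evaluate
    above +200 cp, or both below -200 cp: the position is already decided."""
    both_winning = best_cp > 200 and played_cp > 200
    both_losing = best_cp < -200 and played_cp < -200
    return not (both_winning or both_losing)
```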

theshrewdking

This is an interesting question, but only theoretically. In reality, measuring a human's playing strength with a computer doesn't make much sense, because humans make different kinds of moves than machines do. But computers can be used with good results to determine other computers' strength.

Shivsky

@Sydfhd: You are missing what this thread is really about. Elo calculation based on win/loss performance and prior ratings is not what we're talking about here.

Shivsky

Putting it simply: he wants a tool that takes in a PGN with one or more games between two people, with no prior ratings embedded in the PGN, and evaluates how strong those players are by statistically analyzing the strength/weakness of all the moves played.

ChessinBlackandWhite

Yes, but not compared to an engine; compared to what other humans have played. For each position, given say 100 recorded games, look at which moves each rating range most often chose, to get an idea of the level people are playing at in a given game.

TBentley

If you're talking about looking at existing games, that applies only to the opening, and some opening databases can give you performance ratings. For example, in chesstempo's database of 2200+ vs 2200+ games, the performance rating for e4 is 2453 and for d4 it is 2473. This is probably explained by the fact that the average rating of the white player is 2416 for e4 and 2434 for d4. But that isn't what your opening post asked for.

Shivsky
Sydfhd wrote:

Shivsky, the software I listed does exactly that.

No, it doesn't ... and it's nothing similar, let alone "exactly" the same as this thread's topic.

There is no indication of engine analysis parameters or anything of the sort on the software's homepage or in its usage notes.

(http://remi.coulom.free.fr/Bayesian-Elo/#history)

Your tools only work if the PGNs contain the players' Elo ratings prior to the games. We are talking about engine analysis making an assessment of a player's strength or rating.

Please be clear on what you are posting.