The Solsky Evaluation

solskytz

Why should only computers have their "numeric" evaluation? I want one too. 

It's only reasonable that I would - seeing that with computers, evaluations go from zero (equality) to infinity or a ridiculously high number (equals checkmate). 

Humans generally go through the steps of equality - slight advantage - clear advantage - winning - but they have no established way (short of leaning on computer numbers, and that only rather recently) to speak about the various nuances and degrees of winning positions - see lower down to get what I'm talking about...

I will be using numbers for the evals - but there is NO PARALLEL between my numbers and computer numbers. 

Computers calculate an advantage in pawns. I'm just offering a scale of ARBITRARY numbers, which don't directly represent ANYTHING. 

For a computer evaluation of 2.00, you can ask - "two of what"? and the answer will be "two pawns or the equivalent thereof". 

For a Solsky 2.00 evaluation, you can ask "two of what" all you want... 

- - - - - - - - - - - - - - - - - 

Here is the Solsky Evaluation metric:

0.00 - equal positions.

0.50 - At least equality for white (this is when you still don't want to speak even of a slight advantage. They used that a lot in the Israeli chess club where I started to learn about the game so many years back)

1.00 - white is a bit better / slight advantage

1.50 - slight to significant advantage/ between slight and clear

2.00 - white is clearly better

2.50 - white has a large advantage (but you're not sure that it's actually already winning objectively)

NOTE: these evaluations, being human, are largely individual. A position which is a clear advantage from my point of view, could be, for example, clearly and easily winning to an IM. These evaluations emphasize the importance of really having your own point of view, and knowing what you are capable of and where you come short. 

3.00 - white is winning, but will have to go through hell to actually pull it off (considerable technical difficulties in realization - may easily slip back into "clear advantage" territory or worse)

3.50 - white is winning, but not before he'll have to jump through some hoops (some technical difficulties in converting - losing the winning advantage is still a not-too-unlikely scenario)

4.00 - white is winning, if he plays reasonably well (here we don't expect him to let go of his grip - but stuff happens, you know... at the very least he could still make the win harder for himself and drop a half-step or a whole-step back...)

4.50 - Only a miracle would save black (we can see how that can happen, but frankly, it would really take a miracle for him to pull it off. Of course, we've already seen miracles happen in our times...)

5.00 - black is utterly hopeless and can resign with a clear conscience (of course, he's still playing on, and we begin to think nasty things about his conscience...)

5.50 - black's position is likened to the carcass of an antelope, lying on the ground with an open stomach, into which three tigers (the white pieces) are prying to get the meat out (white pieces competing between themselves on which one causes more damage, eats more pawns, menaces the king more, etc.)

5.50 is a position where if your friends see you play that position, they wonder why you are still playing it, and reconsider their friendship with you. So if you have this position where your friends are present, YOU would feel this INTERNAL URGE to resign - as opposed to eval 5.00, where resignation is just the "respectable" option, but still without any particular emotional drive to actually go and do it. 

6.00 - we give this eval number to a position where we know that we have checkmate in a known number of moves (we know that we mate in seven, no matter what black does, and yes, this includes silly queen interpositions and self-sacrificing time-wasting checks on g1 where we grab the checking piece immediately with no positional relief for the loser)

6.50 - the evaluation for mate on the board - where the capable opponent can still keep staring at the board, not believe that he's REALLY checkmated (he was winning until a move ago, after all, wasn't he?!), think that the game is STILL going on and busily try to FIND A WAY OUT of the checkmate if only he looks hard enough. 
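
For anyone who wants to keep the scale in a script, here is a minimal sketch of the list above as an ordered lookup table. The names `SOLSKY_SCALE`, `describe` and `is_winning` are made up for illustration, and the labels are shortened paraphrases of the descriptions above, not official wording.

```python
# The Solsky scale as an ordered table of (value, label) pairs.
# Labels are shortened paraphrases of the list above (an assumption,
# not the author's exact wording).
SOLSKY_SCALE = [
    (0.0, "equal"),
    (0.5, "at least equality for white"),
    (1.0, "slight advantage"),
    (1.5, "between slight and clear"),
    (2.0, "clearly better"),
    (2.5, "large advantage, not surely winning"),
    (3.0, "winning, considerable technical difficulties"),
    (3.5, "winning, some technical difficulties"),
    (4.0, "winning with reasonable play"),
    (4.5, "only a miracle would save black"),
    (5.0, "black can resign with a clear conscience"),
    (5.5, "black feels the urge to resign"),
    (6.0, "forced mate in a known number of moves"),
    (6.5, "mate on the board"),
]


def describe(value):
    """Return the label for an exact step of the scale (hypothetical helper)."""
    for step, label in SOLSKY_SCALE:
        if step == value:
            return label
    raise ValueError(f"{value} is not a step on the scale")


def is_winning(value):
    """On this scale the 'winning' territory starts at 3.00."""
    describe(value)  # raises if the value is not a valid step
    return value >= 3.0
```

For example, `describe(2.0)` gives the "clearly better" label, and `is_winning(2.5)` is false because 2.50 is still only a large advantage.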

AyoDub

Ok, but what practical value does the scale have?

solskytz

It's good for you when you're playing chess and where your position is winning - you can examine different paths and then decide just "how much winning" it is. 

It is also good when you're losing and trying to find the way to make your opponent's life as tough for him as possible. 

When you're winning, or even just better, you can also use it to try to figure out which resources your opponent has, to make life more of a hell for you...

- - - - - - - - 

In addition, the scale explains how come you were "totally" winning (but maybe only +3 or +3.5 Solsky scale) and then lost your winning advantage...

It's good to differentiate between different types of winning positions, in short :-)

dzikus

There are also positions which are objectively equal but practically one side has advantage because the opponent has to make a number of only moves (very hard to find) to keep balance.

In this kind of position a human's evaluation differs a lot from the engine's point of view: computers say 0.0 (they find that sequence of saving moves) but, say, you are playing against an opponent who is 500 Elo points below you, and it is unlikely for him to find those hard moves.

I have had games like this: feeling I should win but still waiting anxiously to see whether my opponent plays the saving moves and defends.

solskytz

<Dzikus> This is totally seconded!!

Such positions as you describe are by no means equal by human standards - and their evaluation depends on skill level - as it will be much different for 2700s than it would be for 2000s or lower. In addition, personal preferences and familiarities will need to be taken into account. So a personal evaluation system becomes a necessity. 

IpswichMatt

You've got two 5.5's in your list, Solskytz - better fix that before the rest of the trolls arrive...

Or is the second 5.50 para just an elaboration on the first?

solskytz

Yes, Matt, it is indeed - and thanks for your concern :-)

Elubas

Interesting. I kind of pretend I'm a computer when thinking about these things in my head -- not because I want to think like a computer, but because I have an idea of what computer evals go with what sentiments about the position. Usually when it says about 1.3 it means you are very likely winning, but you may have to go through a convincing technical process to "prove" it. Then from there it's just various degrees of how clear/easy the win is. Sometimes of course the 1.3 just means the horizon effect is making the computer misevaluate things, but you get the idea I think. After all, a piece advantage can eventually turn into a larger material advantage, etc, and eventually checkmate -- you just need the technique.

But even a modest .3 advantage feels pretty pleasant -- the opponent's position is very solid, hard to break down, yet is unlikely to attack you and you are free to, gradually, put some pressure on him. And by .3 I mean the kinds of positions Houdini tends to evaluate as .3.

Elubas

"There are also positions which are objectively equal but practically one side has advantage because the opponent has to make a number of only moves (very hard to find) to keep balance."

I actually don't experience many situations like this -- generally what makes an objective evaluation strong are the positional/tactical trumps in your position, which in turn make it easier to play. In fact the main time the situation described happens in my experience is during a tactical combination that has to be executed right in order to get a playable position.

ozzie_c_cobblepot

Holy cow. TL;DR. Actually I did but didn't get past 2.50. While I can appreciate that 2.50 doesn't compare with 2.50 in computer parlance, is there any reason that numbers are used here? They seem much more like layers to me. Numbers imply comparison -- 2.00 is twice as good as 1.00, even if you cannot compare 2.00 in OP-speak to 2.00 in Rybka-speak. If you change it all to layers, then the entire post becomes an OP-exploration of the nuances of the chess vocabulary.

Granted, I already knew that the vocabulary is pretty nuanced -- just try to explain to someone that white is better but not winning, or what on earth a "won game" is - and why it's so difficult to win.

I find it difficult already to explain to six-year-old Cobblepot what it means when I say "and white is winning" - because he very often will point out to me that just losing your queen doesn't mean you're losing, and that he won a game once after losing his queen, which is true. So then I explain that at my level going up a queen (especially in that position he had, which was basically an endgame position) is the same as winning - that I would be able to beat Carlsen from that position. And somehow it still didn't sink in - he was just saying "but it's not _winning_, because you haven't _won_ yet, and I still _could_ win, right?"

solskytz

<NM Ozzie> Of course, my OP numbers are totally arbitrary - so 2.00 isn't "twice as good" as 1.00. 

I suppose that this is the same also in computer evals... If you have a "+1.50" advantage in a game you have a winning position. So is a "+0.75" position "half-winning"? :-)

And I think that your son has a great future in whatever he would choose... six years old, and already poses the right questions!! Bravo :-)

Maybe "Solsky scale" will actually help you explain to him the diverse nuances? Try it, then let me know how it went :-)

solskytz

<Elubas> Computers (sometimes) have no idea what I find easy and what I find hard... they could give me two positions and evaluate both +2.82 - with one I'll beat Houdini blindfolded 100 times out of 100, and with another - I'm not so sure that I'll beat a 2500 GM every time - even without considering tough tactics. 

Position A is a pawn-up pawn ending, clearly winning in the long(ish) run but with a "horizon effect"

Position B is the starting position, minus the a8 rook...

This is an extreme case... 

I want to get back to Matt's idea - there are many positions which Houdini evaluates as around 0.00 although you are down a pawn. Ever happened to you?

And did you see how often these positions little by little drift towards -1.00 the more you play them - and you play normally, as a reasonably strong human player would?

And did you then try to make Houdini "prove" the 0.00 evaluation, and then see how he finds incredible, amazing, impossible positional/tactical "mixes" which really make it so?!

I had it demonstrated to me too many times already... so I'm with Matt here. 

(You would say that Houdini would do that to you also with 0.00 positions with equal material - and you're right... but against a human I'd still rather take the equal material 0.00 position than just be down a pawn for compensation which I don't see or understand...)

solskytz

<NM Ozzie> Coming back to "numbers and their meaning", I'm now reminded of an old childhood joke, where you approach someone and say to them enthusiastically - "you know, you are half a genius!". They would be complimented, and give you a puzzled look, and you continue - "A genius is 140 IQ - and you have 70!".

Elubas

"And did you see how often these positions little by little drift towards -1.00 the more you play them - and you play normally, as a reasonably strong human player would?

And did you then try to make Houdini "prove" the 0.00 evaluation, and then see how he finds incredible, amazing, impossible positional/tactical "mixes" which really make it so?!"

 

Oh yes, I totally agree with that. But that merely suggests that engines are fallible, which hopefully we all know. In general, I tend to find computer evaluations lining up pretty well with my comfort level in a position. I do know what you mean, I just don't seem to experience it too much myself. A lot of people do, but I guess I'm an odd one :)

 

No doubt though, if one studies with an engine, they have to do it right. In fact, getting good with an engine is a tricky but valuable skill that may take a lot of work to develop. I think I have the right balance of objectivity and scepticism when using one, but I certainly didn't in the past!

IpswichMatt

It's one of my chess ambitions to one day sacrifice a pawn for improved piece activity, and later have Houdini confirm that the sacrifice was correct. But it hasn't happened yet, I'm not good enough. (I don't count Queen's Gambit Accepted etc!)

At least now I have a new way to describe the positions I usually seem to get - now I can say "Bollox, I'm once again on the wrong side of a 5.5"

najdorf96

Ultimately, such an eval system must be used demonstrably. Put up a game, say one of your own or any other one that is easy to follow, and integrate it. Maybe then we can readily see how helpful your eval ratings are compared with the traditional anecdotes or the += (slight plus for white), ?! (dubious move), !! (awesome move) system.

najdorf96

(For humans of course, not accustomed to numerical evals like the engines must use, since they cannot, for now, express it in a more intimate way...like how we do with one another?!)

solskytz

<IpswichMatt> That's right :-) when you know what hit you, the blow is softened maybe... or is it?!

<Najdorf> don't worry, man - although this eval system has been with me for 16 years already, ever since I published a considerably longer article about it in the Israeli national chess magazine, I have now put it here for a reason, as indeed, I do have something in mind :-)

I do have a specific game in mind, it is true - but also, from now on, I will be using this eval system also in my other future annotations... so I figured out, if no article about my eval system existed, how could anybody ever understand what I was talking about when I said, for example, "black is totally 5.5'd" (synonymous with "PWNed", give or take) ?!

 

Please note, that I suggested no alternatives to ?! and !! - I would still use them in exactly the same way. 

Also += is still there - it is my +1.00. The difference between my evals and "traditional" evals is only in the "nuancing" of the various types of winning positions. Until you are "winning", there is really no difference - except for my arbitrary numbers...

As the scale is arbitrary, of course, instead of "0, 0.5, 1, 1.5, 2, 2.5 etc." you could call it "0, 1, 2, 3, 4, 5 etc." or, "A, B, C, D, E, F etc.", or "January, February, March..." or whatever lights your fire. 

I decided to call it this way, and I will be consistent in the use of these numbers, if it's any consolation for anybody :-)
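
The point about arbitrary labels can be sketched in a few lines: the scale only carries an order, so any relabeling that keeps the order says the same thing. This is an illustration under my own assumptions (the names `STEPS`, `as_integers`, `as_letters` and `same_order` are hypothetical, not part of the original article).

```python
# The 14 steps of the scale: 0.0, 0.5, ..., 6.5
STEPS = [i / 2 for i in range(14)]

# Two alternative labelings of the same steps: 0, 1, 2, ... and A, B, C, ...
as_integers = {step: i for i, step in enumerate(STEPS)}
as_letters = {step: chr(ord("A") + i) for i, step in enumerate(STEPS)}


def same_order(a, b, labels):
    """True if the labels rank steps a and b the same way the numbers do."""
    return (a < b) == (labels[a] < labels[b])


# Every pair of steps keeps its ranking under both relabelings.
order_preserved = all(
    same_order(a, b, labels)
    for labels in (as_integers, as_letters)
    for a in STEPS
    for b in STEPS
)
```

Only comparisons like "higher than" or "lower than" survive the relabeling. Ratios such as "2.00 is twice 1.00" do not, which is why the numbers should not be read that way.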

zenomorphy

Just curious, ...any corollary to your earliest days of musical development, ...honest introspection of stages of proficiency, ...critical feelings & self-perception regarding progression, expertise, ...useful along the path of advancement to mastery? I can easily imagine a young critical & creative mind developing such a cool tool :). 

solskytz

Not that I'm aware of... :-) generally it's other people who write these kinds of biographies, analyzing the different stages in the development of the master :-)

Soon I'm going to post the game (in a new thread) that has inspired me to dig out (from 16 years ago) this eval system... a game where I was pretty much winning well before move 15, but which dragged on for a further 25 moves - and where the "winning" scale had to be used, to make sure that my opponent never gets more "chances" than he deserved.