Fixing New Analysis

dallin

Thanks for your feedback, everyone. We are looking into problems with inaccurate and inconsistent analysis, and working to find the issue (or issues). At the heart of the problem is that analysis is run at different depths at different stages of the experience. All analysis for Web is done by Stockfish 10, but depths can vary based on the task...

Full Game Analysis depth=20
Self Analysis depth=18-22 (this is used for variations and self analysis, and is time-based, with a default of 10 sec, but you can change that in settings)
Retry Mistakes depth=18

If you were to download the game and run an analysis locally with a depth of 30+, you would get a higher level of accuracy still.
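For anyone who wants to try this locally: UCI engines such as Stockfish accept either a fixed-depth search or a time-limited one. Below is a minimal Python sketch of the two command sequences, assuming you send the lines to the engine yourself. The helper name `uci_commands` and the sample moves are purely illustrative and not anything chess.com uses.

```python
def uci_commands(moves, depth=None, movetime_ms=None):
    """Build the UCI text commands for one search.

    Pass either depth (fixed-depth search) or movetime_ms (time-limited search).
    """
    if (depth is None) == (movetime_ms is None):
        raise ValueError("give exactly one of depth or movetime_ms")
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    if depth is not None:
        go = f"go depth {depth}"
    else:
        go = f"go movetime {movetime_ms}"
    return ["uci", "isready", position, go]


moves = ["e2e4", "e7e5", "g1f3"]          # sample moves in UCI notation
fixed_depth = uci_commands(moves, depth=30)
timed = uci_commands(moves, movetime_ms=10_000)
```

A time-limited search reaches whatever depth the hardware manages in that time, which is one reason the same position can be evaluated differently from one run to the next.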

The differences in depth can account for some, if not all, of the inconsistencies we are seeing here, especially the difference between depth=18 for Retry Mistakes, and what could be depth=22 (or deeper based on setting selection) for Self Analysis.

We are doing several things to improve consistency, but bringing all of our baseline depths to the same level (20) will be the best way to ensure it.
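To make the mismatch concrete, here is a small Python sketch using only the depths listed above. The names `STAGE_DEPTHS`, `mismatched_stages`, and `unify` are made up for illustration and say nothing about how the site is actually implemented.

```python
# Depths as listed in this post. Self Analysis is time-based, so its
# depth is a (low, high) range rather than a single number.
STAGE_DEPTHS = {
    "Full Game Analysis": (20, 20),
    "Self Analysis": (18, 22),
    "Retry Mistakes": (18, 18),
}


def mismatched_stages(stage_depths, baseline):
    """Return the stages whose depth range is not exactly the baseline."""
    return sorted(
        name
        for name, (low, high) in stage_depths.items()
        if (low, high) != (baseline, baseline)
    )


def unify(stage_depths, baseline):
    """Set every stage to the same baseline depth."""
    return {name: (baseline, baseline) for name in stage_depths}


before = mismatched_stages(STAGE_DEPTHS, 20)
after = mismatched_stages(unify(STAGE_DEPTHS, 20), 20)
```

With the current settings, two of the three stages can disagree with the full game analysis; once everything is set to 20, none can.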

Consistency is great, but what we really value is Accuracy. We want things to be consistently accurate. For those users who do not feel like depth=20 is deep enough, we are adding options for our premium members to change the depth of their Full Game Analysis.

Unfortunately, going deeper will come at a cost of time. We can run a full game analysis at depth=20 on the server in 10-15 seconds. But a depth=22+ analysis would need to be run on the client (your computer processor) and could take up to two minutes based on your processor speed and the game. But we want to give you the choice to make analysis as accurate or as fast as possible.

Thanks for your patience as we get this right. Please let us know any other thoughts you have on improvement.

flashlight002

@GhstlJrg1-0 not sure what threads you are referring to. I haven't deleted anything.

By the way...for all those reading about how this new engine is so inaccurate here is a post I wrote documenting an additional scenario where the analysis engine gave back absolute nonsense. It classified its own move in its own suggested variation as a blunder but when I played the "best" move it gave in the "on the fly evaluation" at the top of the screen....it reclassified the blunder move as an alternative best move!! Ask me how it can classify the same move as a blunder and a best move?? That's just NOT possible!!

https://www.chess.com/forum/view/site-feedback/new-analysis-feature-is-not-accurate-enough

It's ridiculous. I really really hope the chess.com dev team is taking note of all these complaints of how badly this engine is predicting. It is completely unreliable at present. It's a shame, because if it was acting reliably it would be a very nice analysis system and GUI.

 

harbi_canoshi

I am wondering, has a woman been involved, with all these latest changes. They all seem far more complicated, than they used to be. There is too much information in the new analysis. Most is lost in the thicket.

9thBlunder

I miss the deep analysis. All these changes without any concern to user experience makes me reevaluate my membership.

Romme63

The "standings chart" was somehow bugged on my screen. Only showed 25 moves and not the whole game. Like it didn't fit the computer screen. And for some reason the new UI is laggy? 

giancz91

ignoble, thanks for your answer!

Sorry, maybe I've been a bit too blunt. I know you worked hard on this, but it's very inaccurate and that disappointed me a lot, especially because the old analysis was very good and, in my opinion, there was no reason at all to change it. Personally, I don't care about impressive aesthetics with lots of colours, or about waiting. I just want an accurate analysis, and I don't have it anymore.

In the meantime, while you fix it, could you just give us an option to use the old analysis? I don't care at all about waiting times; it's worth it. Thank you in advance!

giancz91
9thBlunder wrote:

I miss the deep analysis. All these changes without any concern to user experience makes me reevaluate my membership.

Yeah, old analysis was very good, why change it? For impressive aesthetics? It seems the world goes this way, appearance is getting more important than substance, but I don't like it. And in chess that's especially stupid. You don't win with magnificent colours and graphic effects.

giancz91

@ignoble P.S. I don't believe the problem is the difference in depth. I didn't even use the "Retry Mistakes" option, and I still saw an impressive number of mistakes. I think the problem is that it has too little time to calculate.

smozi

I would appreciate the option to run the slower, deeper analysis in the browser. The quick server-side analysis is nice but I had no problems with waiting for a better one. I do like the new interface, though I wish that Retry Mistakes did not require the diamond membership - if you were hoping to match up well against lichess, that's a bit of a strike against it. In any case, good luck with polishing this feature.

flashlight002

@ignoble thank you very much for making contact with us. I certainly appreciate it. I am happy to hear that you are working on how to improve the accuracy and consistency of the predictions. This is very very good news.

I also noticed how very inconsistent it was across the different parts of the system. I personally am not even worried if I have to wait longer than 2 minutes or more to do the analysis client side. I mean on the old system a full scan took between 5 and 7 minutes on average...and that didn't worry me. I am more interested in the engine pushing out a highly reliable set of results that I can trust at the end of the day across all its functions. So I am very happy with the idea to give members the option and choice to increase accuracy dramatically across all areas of the new tool, from initial scan to all the other functions (retry, self analyses, etc), hopefully with depths in excess of 25 to 30 or more half moves for the initial scan and across the other sections too! In my opinion a depth of 18 is too low. Even 20 is too low. I mean when I cross checked the terrible results I was getting with the new analysis tool on another program of mine also running Stockfish 10 to see what variations and results it would give me, my program was running at depths of 26 most of the time and even climbed to 50 half moves at certain points near the end! It never even went to depths of 18 or 20. But it returned results that were far far more accurate, and therefore trustworthy.

I am sure you and your team have worked very hard to create this tool, and I can see that you want it to work properly. Giving users the flexibility to determine engine depth and options/settings to improve accuracy from initial scan to all other areas is a great idea. Give us the ability to do this and I will certainly be very very happy. I mean the old system had this ability to choose different levels of accuracy and therefore different speeds that the analysis finished in. 

I look forward to you fixing this all and presenting an even better product. Then I will be truly impressed! 

Please can you keep us up to speed and let us know when you have solved the problems. 

Holding thumbs for you and the team, @ignoble. Please know that we value your work and what you and your dev team are doing on this site.

flashlight002

As an example of how inaccurate this new engine is, here is a screenshot showing a move being played from a variation chosen by the new analysis engine. As you can see, the analysis engine has reclassified its own move suggestion 20...Bxe3+ as a blunder. This is a big no-no. I am seeing this kind of thing ALL the time. In the variation circled, the analysis engine reclassified 2 moves as blunders. There were inaccuracies in the mix too. So basically an inaccurate variation dished up by the new analysis engine! And this is with the engine time limit set to 30 seconds. The default is currently 10 seconds.

I really can't wait till they fix this properly. 

AAJorg

Yeah. In one of my engine-suggested variations the engine just hangs and loses a bishop completely without a motive. And I have seen things like that several times. If the variations are such garbage beyond some number of moves they should at least not be shown beyond those moves    

dallin

Thanks all, especially @flashlight002 for the continued feedback. We are working to address your concerns here in two ways...

1. Allow premium members to select depth
Testing this all right now, but our options will likely be:
10 (for users who just want a basic blunder check)
18 (for users who want something fairly accurate, but fast)
20 (our default setting, and a good balance of accuracy and speed - we can share some excellent studies on the accuracy of d=20)
22 (very accurate, but takes longer)
26 (extremely accurate, but would need to be done on the client, and will take several minutes)
30 (insanely accurate, done on client, and you can have lunch while you wait.)

2. Tie Feedback to Full Analysis depth
This includes move feedback for variations and Retry Mistakes. This will solve most of the inconsistencies we have between the initial game analysis, variations, and retry mistakes. This will have all users waiting a bit longer for this feedback, which is currently set to a depth of 18, but the consistency will be worth it.
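The two changes above can be sketched together in Python. This is only an illustration of the proposal as described in this post: the names `DEPTH_OPTIONS`, `CLIENT_SIDE_DEPTHS`, and `analysis_settings` are hypothetical, and only depths 26 and 30 are marked as client-side because those are the only ones this post says would run on the client.

```python
# Proposed depth options, with the descriptions given in this post.
DEPTH_OPTIONS = {
    10: "basic blunder check",
    18: "fairly accurate, but fast",
    20: "default; balance of accuracy and speed",
    22: "very accurate, but takes longer",
    26: "extremely accurate; several minutes",
    30: "insanely accurate; have lunch while you wait",
}
# Only these two are described here as needing to run on the client.
CLIENT_SIDE_DEPTHS = {26, 30}
DEFAULT_DEPTH = 20


def analysis_settings(depth=DEFAULT_DEPTH):
    """Tie variation and Retry Mistakes feedback to the full analysis depth."""
    if depth not in DEPTH_OPTIONS:
        raise ValueError(f"unsupported depth: {depth}")
    return {
        "full_game": depth,
        "variations": depth,       # same depth as the full analysis
        "retry_mistakes": depth,   # same depth as the full analysis
        "stated_client_side": depth in CLIENT_SIDE_DEPTHS,
    }


default_settings = analysis_settings()
deep_settings = analysis_settings(26)
```

Because every kind of feedback reads the same depth value, a move classified one way in the full analysis is searched to the same depth when it is re-checked in a variation or in Retry Mistakes.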

If you have any concerns about the proposed revisions, please share.

flashlight002

Hi @ignoble, only a pleasure helping out to make the tools on chess.com the best they can be.

Your solutions are sounding good. 

What do you mean by "2. Tie Feedback to Full Analysis depth"? Are you referring to the depth the engine will now run at on the fly, e.g. when one clicks on a move in a variation or retry variation? And what do you mean by "full analysis depth"? What depth are you referring to?

Once again thanks to you and the team for your continued efforts in working to get this right.

dallin

Yes, I am referring to on-the-fly analysis there, @flashlight002. Full analysis depth is referring to the initial analysis that is done when your game is first analyzed.

There are other issues with consistency and accuracy that we are working to solve. We won't have all of our issues resolved until next week. Thanks for your patience and continued insight!

flashlight002

@ignoble thanks for this update and the expected timeline. I forgot to ask: am I right in assuming you are using Stockfish 10 as the analysis engine? I don't think the release article that was put up on the site when the new system went live actually mentioned which engine is being used! Out of interest, what speed in kN/s and/or MN/s are you running at? Good luck ironing out all the additional issues affecting accuracy! I know we will all be very grateful once you have cracked it. Plus, it's only right that the best chess site in the world has the best analysis tools!

We will await your next communiqué once it's all working perfectly.

HebrewWildChild

Aside from taking my most mundane and almost forced defense of a weak pawn likely to be lost anyway and scoring it as 'brilliant', it's also killing me on data usage, just to review one position, just killing me. I can stay on here for everything else but I had to start going back over to another dare-not-name server to even try to post-mortem a game. Sorry not sorry.

flashlight002

@PawnstormPossie once the initial scan of your move list is done, it is technically "done" and saved. If you go back to the game in your archive and click "starts analysis" in the block that appears stating who won etc. (or access it via the link "computer analysis" in the game info tab), it will bring up the full analysis again quite quickly, with everything already analysed and in place to review. I have tested this. Only when you start adding new line variations to the game list does a new "save" icon appear, and if you click it, all these new variations will be there the next time you access the analysis.

@HebrewMenace if you don't want the system to consume data, untick "show lines" in the analysis tab for a start. I have checked, and once the scan is done, clicking on the analysed moves in the move list does not need more data. I actually switched off my data connection and ran through the analysed game, and it showed all the move classifications, suggested moves, etc. You only need data when you start clicking on new line variations, or when you use the retry section and play through a new variation based on a retry, because it then reclassifies the moves in the new variations inserted into the original game move list.

flashlight002

@PawnstormPossie the way I understand it is they are working on a way where you will be able to choose the depth of the analysis. See ignoble's further explanations of all the depths they are considering giving as options higher up in the post feeds. They consider a depth of 22 to be very accurate and 26 to be extremely accurate. 

Personally I would be putting my money on the 26 depth but then it will run on the client side. So the time to complete will probably be like the old full scan...but I am guessing here. 

When the new revised system parameters come out I will check out all of them! Even the 30 deep!

batgirl
ignoble wrote:

Thanks all, especially @flashlight002 for the continued feedback. We are working to address your concerns here in two ways...

1. Allow premium members to select depth
Testing this all right now, but our options will likely be:
10 (for users who just want a basic blunder check)
18 (for users who want something fairly accurate, but fast)
20 (our default setting, and a good balance of accuracy and speed - we can share some excellent studies on the accuracy of d=20)
22 (very accurate, but takes longer)
26 (extremely accurate, but would need to be done on the client, and will take several minutes)
30 (insanely accurate, done on client, and you can have lunch while you wait.)

2. Tie Feedback to Full Analysis depth
This includes move feedback for variations and Retry Mistakes. This will solve most of the inconsistencies we have between the initial game analysis, variations, and retry mistakes. This will have all users waiting a bit longer for this feedback, which is currently set to a depth of 18, but the consistency will be worth it.

If you have any concerns about the proposed revisions, please share.

+1