This problem is driving me up the wall - help required!

stephen_33

I help with the technical aspects of running the Knockout match league (I use Python 3.9), and one of my tasks is to provide lists of match results, adjusted for closed accounts and fair-play violations, for each round of a tournament. Here's a typical example:

Obsessive Chess Disorder. (31.0) vs FanatikClub (29.0)(no result)
Chess Dream Team (34.0) vs 1 day per move club (27.0)(1-0)


That format of team scores and overall result (in blue) has been working faultlessly for years but then we began a new tournament with an Arab-themed club and suddenly I noticed this problem...

Team Match Chess (35.0) vs Arab National Team - منتخب العرب (25.0)(1-0)


I thought it was just another niggling encoding issue and tried using Python's 'unescape' function from the html module to convert the escaped Arabic characters, but it doesn't seem to change the output.
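
For what it's worth, the escapes in the endpoint's name string (the \uXXXX kind) are JSON string escapes, not HTML entities, which would explain why html.unescape has no visible effect. A minimal sketch, assuming the name arrives as raw JSON text; the variable names are mine:

```python
import html
import json

# The club name as it appears in the raw JSON text of the match endpoint.
# A raw string (r'...') keeps the backslashes exactly as the server sent them.
raw = r'"Arab National Team - \u0645\u0646\u062a\u062e\u0628 \u0627\u0644\u0639\u0631\u0628"'

# The JSON parser turns every \uXXXX escape into the real character.
name = json.loads(raw)
print(name)

# html.unescape only converts HTML entities such as &amp; or &#1605;,
# so a string that is already decoded comes back unchanged.
unchanged = html.unescape(name) == name
print(unchanged)
```

So by the time the name is a Python string, the Arabic letters are most likely already real characters and there is nothing left for unescape to convert.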

This is the club: Arab National Team - منتخب العرب

URL: https://www.chess.com/club/arab-national-team-mntkhb-l-rb

The name string as given in the match endpoint:-

"Arab National Team - \u0645\u0646\u062a\u062e\u0628 \u0627\u0644\u0639\u0631\u0628"


Another thing that took me a while to work out - not only is the result of (1-0) misplaced in the string, it's also reversed! Then I realised Arabic text is always read from right to left. Something seems to be causing some of the non-Arabic text to be treated as if it were Arabic and output in the same way (right to left).
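
A possible explanation, as far as I understand the Unicode Bidirectional Algorithm (UAX #9): the string Python holds is in logical order, and the reordering is done by whatever displays it (console, browser, forum editor). Each character carries a bidirectional class that the display works from. The sketch below only prints those classes using the standard-library unicodedata.bidirectional; the choice of sample characters is mine:

```python
import unicodedata

# A Latin letter, an Arabic letter, and the punctuation and digits
# that appear in a score like "(25.0)" and a result like "(1-0)".
samples = ["A", "\u0628", " ", "(", "1", "-", "0", ".", ")"]

# Map each sample character to its Unicode bidirectional class.
classes = {ch: unicodedata.bidirectional(ch) for ch in samples}

for ch, bidi_class in classes.items():
    print(repr(ch), bidi_class)
```

Arabic letters are class AL (strongly right-to-left), digits are EN, the full stop is CS (common separator) and the hyphen is ES (European separator). My reading of the rules is that digits following Arabic letters are pulled into the right-to-left run. A full stop between two digits stays glued to the number, so 25.0 still reads as 25.0, whereas the hyphen does not, so 1-0 is laid out as three separate pieces from right to left and shows up as 0-1. Treat that as a likely explanation rather than a definitive one.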

The odd thing is, if I omit the match score or the result it outputs correctly in my console, but when I copy the text here it's then reversed...

Arab National Team - منتخب العرب (1-0)
Arab National Team - منتخب العرب (25)

Strangely it's not reversing the match score, only the result. Here's a screenshot of what I see when I print the various strings to my Python console...

Can anyone else repeat these results? And can anyone suggest a solution?

stephen_33

I've tried various ways of formatting the string, including concatenation, but all produce the same result.

Martin_Stahl

This doesn't have anything about Python, that I see, but may give pointers on what to look for.

https://opensource.com/life/16/3/twisted-road-right-left-language-support

stephen_33
Martin_Stahl wrote:

This doesn't have anything about Python, that I see, but may give pointers on what to look for.

https://opensource.com/life/16/3/twisted-road-right-left-language-support

Thanks for the link - fascinating read and the kind of thing that leaves you needing to lie down in a darkened room for an hour?

But sadly it doesn't suggest any solution to my problem unless I missed that. I think I'll have to treat that club name as a special case and add it to my look-up table for special handling.

acity609

Without looking too deeply, I can tell you that in Java there is a 'left-to-right' mark, '\u200e', that makes it print properly. I suspect it's the same for Python.
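
In Python the same character can be written as the string escape '\u200e'. A minimal sketch of how it might be applied to a result line, assuming the mark is appended to each club name; format_match and its parameters are hypothetical names, not anything from the league's actual code:

```python
LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK: invisible, but strongly left-to-right


def format_match(home, home_score, away, away_score, result):
    """Build one result line, closing each club name with an LRM.

    Hypothetical helper: the LRM after a name gives the score and result
    that follow a left-to-right character to attach to.
    """
    return (
        f"{home}{LRM} ({home_score}) vs "
        f"{away}{LRM} ({away_score})({result})"
    )


line = format_match(
    "Team Match Chess", 35.0,
    "Arab National Team - \u0645\u0646\u062a\u062e\u0628 \u0627\u0644\u0639\u0631\u0628", 25.0,
    "1-0",
)
print(line)
```

The mark takes no visible space, so names without any right-to-left text look the same as before. It is still a real character in the string, though, so it would need stripping again before comparing names or using them as look-up keys.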

Martin_Stahl
stephen_33 wrote:

Thanks for the link - fascinating read and the kind of thing that leaves you needing to lie down in a darkened room for an hour?

But sadly it doesn't suggest any solution to my problem unless I missed that. I think I'll have to treat that club name as a special case and add it to my look-up table for special handling.

 

Sorry, I didn't have much time to dig into the article but had hoped it might give some indication on what exactly to look for.

stephen_33
acity609 wrote:

Without looking too deeply, I can tell you that in Java there is a 'left-to-right' mark, '\u200e', that makes it print properly. I suspect it's the same for Python.

Thanks - tried that and it works perfectly. I've already fixed this problem with a workaround but I'll make a note of that escape sequence in case I come across this problem somewhere else.

stephen_33
Martin_Stahl wrote:

Sorry, I didn't have much time to dig into the article but had hoped it might give some indication on what exactly to look for.

No, it really was useful and is full of tips on how to work around various issues. But I got the impression the author was saying this is a much more complicated subject than most people realise: how a particular browser or software package deals with RTL text is tricky to predict, and there may not be a ready solution to a problem.

Since I have to cope with clubs changing their names during a round, I've treated this club as if it's one of those and converted the name to cut out the Arabic text. Works nicely.
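
For anyone who would rather not keep a look-up entry per club, a generic alternative (not the look-up table approach described above) is to drop any right-to-left letters from the name. A rough sketch, assuming it's acceptable to lose that part of the name entirely; strip_rtl is a hypothetical helper:

```python
import unicodedata


def strip_rtl(name):
    """Drop right-to-left letters, then trim leftover spaces and dashes.

    Hypothetical helper: characters whose bidirectional class is R
    (e.g. Hebrew) or AL (Arabic) are removed, everything else is kept.
    """
    kept = "".join(
        ch for ch in name
        if unicodedata.bidirectional(ch) not in ("R", "AL")
    )
    # Removing the letters can leave a dangling " - " at either end.
    return kept.strip(" -")


name = "Arab National Team - \u0645\u0646\u062a\u062e\u0628 \u0627\u0644\u0639\u0631\u0628"
print(strip_rtl(name))
```

Names with no right-to-left text pass through untouched. If the removed text sat in the middle of a name, this would leave a double space behind, so it may need a further tidy-up in that case.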