yeah it will be able to calculate more moves ahead
A 3000 could easily beat a 2000, but could a 4000 easily beat a 3000?

Isn't this kind of a silly question? Ratings are based on win probability. A rating difference of 1000 means roughly a 99.7% chance of the higher rated player winning. If the 4000 player didn't have a 99.7% chance of winning, then they wouldn't be rated 4000. It's as simple as that.
Some people here are saying that there will be many draws. No: many draws happen between engines of similar rating. It is mathematically impossible for two players with a huge rating gap to draw many of their games.

Yes, presuming such a player could exist (non-trivial), the 4000 would need to achieve around 99.97% to justify his rating. Doing so would require effort (top performance always does). While such dominance of such a strong player might seem incomprehensible, bear in mind that the present top engines are already capable of getting close to 100% against the top humans.
A lower (but still dominant) score by a 4000 against a 3000 could be achieved with less effort (e.g. in a simultaneous display).

Exactly. Anyone who thinks there would be a bunch of draws has a fundamental misunderstanding of what ratings are supposed to measure. A ratings difference of 1000 BY DEFINITION means a 99.7% chance of the higher rated player winning. If they don't win that much, then the rating difference isn't 1000... BY DEFINITION!
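To put numbers on the "by definition" point, here is a rough sketch that inverts the usual logistic Elo formula to get the rating gap implied by a given score. It assumes the standard 400-point logistic scale and treats "score" as wins plus half of draws; the function name is just made up for illustration.

```python
import math


def rating_gap_from_score(score: float) -> float:
    """Rating gap implied by an expected score (assumes the logistic Elo model).

    `score` is the fraction of points scored: wins plus half of draws.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    # Inverse of E = 1 / (1 + 10 ** (-gap / 400))
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    for score in (0.997, 0.95, 0.75):
        gap = rating_gap_from_score(score)
        print(f"score {score:.3f} -> gap of about {gap:.0f} points")
```

So if the stronger player scores noticeably less than about 99.7%, the gap this formula gives back is noticeably less than 1000.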

The question is: can you even get to 4000 Elo?
Stockfish's current 3800+ Elo is based on imbalanced openings. Stockfish 12 and later versions will simply draw Stockfish 14.1 in every game played from the standard start position, due to NNUE and insane search depth.

If it was "every" they would technically have the same rating for chess. Forced openings are a different game.
A variant of chess with a specified distribution of forced openings has its own separate Elo scale.

The math is simple. A player rated 1000 points higher than another should score 99.7 out of 100 points.
In chess, a 4000 rating is probably impossible. But in a game that is not a theoretical draw it doesn’t matter whether the lower rated player is 300 or 3000. A 1000 point difference requires the higher rated player to score 99.7/100.
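For anyone who wants to check the 99.7 figure, here is a minimal sketch of the expected-score formula, assuming the standard logistic Elo model with a 400-point scale (the function name is my own). It also shows that only the difference matters, not the absolute ratings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo model assumed).

    The result counts a win as 1, a draw as 0.5 and a loss as 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Same 1000-point gap at very different absolute levels.
    for high, low in ((4000, 3000), (1300, 300)):
        e = expected_score(high, low)
        print(f"{high} vs {low}: expected score {e * 100:.1f} out of 100")
```

Both pairs give the same number because the formula only looks at rating_b - rating_a.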

beat this https://www.chess.com/practice/custom

yeah the higher rating the more likely the draw
It is worth noting that this is a separate mode of variation to the expected score.
- Expected score is a monotone increasing function of difference in rating.
- Probability of a draw is a monotone increasing function of rating for a given difference in rating.
To put it another way, variance of result is a monotone decreasing function of rating for a given difference in rating.
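The link between draw rate and variance is plain arithmetic, and can be sketched as below. This only shows that, for a fixed expected score, more draws means less variance per game; it does not model how the draw rate actually changes with rating, and the draw probabilities in the loop are made-up inputs, not measured values.

```python
def result_variance(expected: float, draw_prob: float) -> float:
    """Variance of a single game result (1, 0.5 or 0) for a given expected score.

    With expected = win + draw / 2, the variance works out to
    expected * (1 - expected) - draw / 4.
    """
    win = expected - draw_prob / 2.0
    loss = 1.0 - win - draw_prob
    if min(win, loss, draw_prob) < -1e-12:
        raise ValueError("draw_prob is too large for this expected score")
    second_moment = win * 1.0 + draw_prob * 0.25  # E[X^2]
    return second_moment - expected ** 2


if __name__ == "__main__":
    expected = 0.76  # roughly a 200-point gap on the logistic scale
    for draw_prob in (0.0, 0.2, 0.4):  # hypothetical draw rates
        var = result_variance(expected, draw_prob)
        print(f"draw rate {draw_prob:.1f}: variance {var:.4f}")
```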

There will never be anywhere close to 4000, unless chess is solved, and even then they may be around 3700.

When a rating gets 3000 and above, it's all the same right? A 10,000 vs a 3000 would be a challenge.
If the ratings were accurate to playing strength, then a 4000 would mop the floor with a 3000 as if the 3000 were nothing at all.
Heck, even a 500-point difference is monstrous.
Take Stockfish 15 NNUE (3500+ Elo) and pit it against Crafty 25.2 (3000 Elo). See how it turns out.

I've read here that there are not enough human vs engine games to accurately compare their ratings. That's not true: on ICC, MadChess v1.4 played thousands of games against human players and got a 2100 classical rating and 2400 blitz. Other engines too. Also on Lichess.
That same engine is rated 2200 on the CCRL list. That suggests CCRL ratings can be directly compared to human ratings (give or take some points, but pretty accurate).
Yes, a 4000 Elo player will definitely crush a 3000 Elo player, just like Stockfish 14.1 on 250 threads (3862) will crush Stockfish 14.1 (3689) itself.