>>514799677
AI isn't good at that kind of thing, because of the way it works.
It's predicting the probability of the next token in a sequence (a word, a sub-word, or a character, depending on how it's designed). What that means is if you ask an AI what 2 + 2 equals, it does not solve the equation to figure it out; it has instead effectively assigned probabilities to the possible continuations:
A. The answer is 4 (88% probability)
B. The answer is 5 (10% probability)
C. The answer is 2 (1% probability)
...
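The list above can be sketched in code. This is a toy illustration, not a real model: the distribution below is made up, and a real LLM produces one like it with a softmax over its entire vocabulary.

```python
import random

# Hypothetical next-token distribution for the prompt "2+2=",
# mirroring the probabilities listed above (made up for illustration).
next_token_probs = {"4": 0.88, "5": 0.10, "2": 0.01, "22": 0.01}

# Greedy decoding: always take the single most probable token.
greedy = max(next_token_probs, key=next_token_probs.get)

# Sampled decoding: draw according to the probabilities, so the model
# will occasionally emit a wrong answer even when "4" dominates.
tokens, weights = zip(*next_token_probs.items())
sampled = random.choices(tokens, weights=weights, k=1)[0]

print(greedy)   # "4"
print(sampled)  # usually "4", but sometimes "5" or "2"
```

The point is that nothing in either decoding step ever *computes* 2 + 2; the answer is only as reliable as the probability mass the training data put on "4".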
When you have large enough training sets with a high frequency of the correct answer, the AI is more likely to get the answer right. If you ask any top-of-the-line AI right now what 2 + 2 is, it'll probably get it correct. If you ask it what 5918317810135 x 318843450234 equals, it will almost certainly get it wrong.
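For contrast, ordinary arithmetic solves that exactly, because it runs an algorithm rather than ranking likely digit strings; a product like this one almost never appears verbatim in training text. A trivial sketch:

```python
# Exact arithmetic: an algorithm, not a probability lookup.
# Python integers have arbitrary precision, so this product is exact.
a = 5918317810135
b = 318843450234
product = a * b
print(product)  # a 25-digit number, computed, not recalled
```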
So in your case, you're asking for something very niche. There isn't much training data reinforcing the correct answer, and it has exactly one correct answer, not a range of acceptable outputs the way a sentence can be phrased with different verbs. Techniques like chain-of-thought reasoning are band-aids for this problem, but they don't fundamentally solve it. There are further improvements coming down the pipe that take cues from how our brains work, but it'll be months before we see them in big models.
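The intuition behind chain-of-thought can be shown with long multiplication: break a rare problem into common sub-steps, each of which shows up constantly in training data, so each step is easy even when the whole answer is not. This is an illustration of the idea in plain Python, not an actual model:

```python
# Chain-of-thought intuition: a rare problem decomposed into frequent
# sub-problems. Each partial product (a times a single digit) is the kind
# of small fact a model sees often, unlike the full 25-digit answer.
def long_multiply(a: int, b: int) -> int:
    total = 0
    for i, digit in enumerate(reversed(str(b))):
        partial = a * int(digit)       # small, frequent sub-problem
        total += partial * 10 ** i     # shift into place and accumulate
    return total

print(long_multiply(5918317810135, 318843450234) ==
      5918317810135 * 318843450234)  # True
```

Of course, a model still has to get every step right in sequence, which is why this is a band-aid: one sampled mistake mid-chain and the final answer is wrong anyway.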
tl;dr: what you're asking for is basically the thing AI is worst at.
Keep in mind there is no *understanding* of your question or of what kind of answer you're expecting (yet; like I said, stuff is coming along). We haven't even formalized methods or data structures for getting AI to do math properly. Once that changes and AI can handle logic, questions like the ones you're asking will start to get better results.