Top AI chatbots outperform students on a new relativity concept test, but fail on a few image-based items | arXiv News