Large language models can carry out parts of statistical proofs but struggle to pick the right strategy | arXiv News