StatEval: A Comprehensive Benchmark for Large Language Models in Statistics
Submitted by Runpeng Dai
Shanghai University of Finance and Economics