From pixels to world models: a five-level roadmap for intelligent visual generation | arXiv News