Thinking in Blender: using vision-language models to turn a single photo into an editable 3D Blender scene | arXiv News