ThinkJEPA blends a dense video predictor with a vision–language “thinker” to forecast longer-range hand movements | arXiv News