"""Preprocess 122 isometric Grok videos for SCD training. Encodes MP4+TXT pairs into precomputed latents + text embeddings for SCD LoRA training. Uses combined prompts (_combined.txt from ...
Train vision-language models (VLMs) with reinforcement learning using Group Relative Policy Optimization (GRPO) on multimodal image+text tasks via the verl framework. This workflow trains ...