Upload a video (max 10 min), transcribe with Whisper Turbo, then generate a script with an Inference Providers model.