TikTok Interview Question

Describe the GRPO (Group Relative Policy Optimization) loss and other RL algorithms.
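
As a starting point for an answer, here is a minimal sketch of the GRPO objective, not a reference implementation. GRPO keeps the clipped surrogate from PPO but drops the learned value network: each prompt gets a group of sampled completions, and every completion's advantage is its reward normalized by that group's mean and standard deviation. A KL penalty toward a frozen reference policy is added to the objective. The sketch assumes one summed log-probability per completion (real implementations usually work per token), and the names `grpo_loss`, `group_relative_advantages`, `clip_eps`, and `beta`, along with their default values, are illustrative choices rather than anything given in the question.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group: (r - mean) / (std + eps).

    Assumption: population standard deviation; eps avoids division by zero
    when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def grpo_loss(logp_new, logp_old, logp_ref, rewards, clip_eps=0.2, beta=0.04):
    """Sequence-level GRPO loss for one group of completions of one prompt.

    logp_new: log-probs under the policy being optimized
    logp_old: log-probs under the policy that sampled the completions
    logp_ref: log-probs under the frozen reference policy
    rewards:  one scalar reward per completion
    clip_eps, beta: illustrative defaults (assumptions, not from the source)
    """
    advantages = group_relative_advantages(rewards)
    total = 0.0
    for lp_new, lp_old, lp_ref, adv in zip(logp_new, logp_old, logp_ref, advantages):
        # PPO-style importance ratio between the new and the sampling policy.
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (minimum) of the unclipped and clipped surrogate terms.
        surrogate = min(ratio * adv, clipped * adv)
        # Non-negative KL estimator: pi_ref/pi - log(pi_ref/pi) - 1.
        log_ref_ratio = lp_ref - lp_new
        kl = math.exp(log_ref_ratio) - log_ref_ratio - 1.0
        total += surrogate - beta * kl
    # The objective is maximized, so the loss is its negative group mean.
    return -total / len(rewards)


if __name__ == "__main__":
    # Two completions for one prompt; only the first one was rewarded.
    loss = grpo_loss(
        logp_new=[-1.0 + math.log(2.0), -1.0],
        logp_old=[-1.0, -1.0],
        logp_ref=[-1.0 + math.log(2.0), -1.0],
        rewards=[1.0, 0.0],
    )
    print(loss)
```

In the usage example the first completion's probability has doubled relative to the sampling policy, so its ratio of 2 is clipped to 1.2 before being multiplied by its positive advantage. This is the same trust-region behavior as PPO; the main difference in this sketch is that the baseline comes from group statistics instead of a critic.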