Xuhui Zhou

Blog

RL from Xuhui's Perspective

Apr 13, 2026

A deep dive into REINFORCE, PPO, GRPO, and REINFORCE++ — and the single theoretical idea that ties them all together.

The Quest of User-Effective AI Agents

Nov 2, 2025

Exploring what makes AI agents truly effective for users, beyond benchmark performance.

The overlooked "bad" word list ☠️

Dec 15, 2024

Stop using outdated bad word lists. Use ToxicTrig instead for better toxic language analysis.