Jing Wang, Bo Sun, Mingyuan Zhao
Chinese Academy of Sciences · Institute of Automation; University of Chinese Academy of Sciences
Optimal trade execution in the Chinese A-share market is complicated by T+1 settlement, daily price limits, and pronounced order-book imbalance dynamics. We present a deep deterministic policy gradient (DDPG) agent that ingests microsecond-level Level-2 order book snapshots and learns adaptive child-order schedules. Against TWAP, VWAP, and Almgren-Chriss baselines, our agent achieves a 23% reduction in implementation shortfall on a held-out sample of 200 stocks over 2022–2023, with statistically significant gains in volatile market regimes.
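For readers unfamiliar with the evaluation metric, the following minimal sketch shows one common way to compute implementation shortfall against the arrival price, using an equal-slice TWAP schedule as the example. The function names, prices, and quantities are illustrative assumptions and are not taken from the paper's data or code; the paper's exact definition (for example, its treatment of fees or unfilled shares) may differ.

```python
from typing import Sequence


def twap_schedule(total_shares: int, n_slices: int) -> list[int]:
    """Split a parent order into n_slices near-equal child orders (hypothetical helper)."""
    base, remainder = divmod(total_shares, n_slices)
    # Distribute any remainder one share at a time over the earliest slices.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


def implementation_shortfall_bps(
    arrival_price: float,
    fill_prices: Sequence[float],
    fill_shares: Sequence[int],
    side: int = 1,
) -> float:
    """Shortfall of the average fill price versus the arrival price, in basis points.

    side = +1 for a buy, -1 for a sell. Positive values mean a cost.
    Assumes the order is fully filled and ignores fees (a simplification).
    """
    total = sum(fill_shares)
    if total <= 0:
        raise ValueError("no shares filled")
    # Share-weighted average execution price across all child orders.
    avg_price = sum(p * q for p, q in zip(fill_prices, fill_shares)) / total
    return side * (avg_price - arrival_price) / arrival_price * 1e4


# Illustrative numbers only: buy 1,000 shares in 4 slices, arrival price 10.00.
shares = twap_schedule(1000, 4)
prices = [10.01, 10.02, 10.03, 10.04]
cost_bps = implementation_shortfall_bps(10.00, prices, shares)
```

Under this convention, a relative reduction in shortfall could be computed as the difference between a baseline's shortfall and the agent's shortfall, divided by the baseline's shortfall, then aggregated over orders. Whether the paper aggregates in this way is an assumption here.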