We create three real-time games, Freeway, Snake, and Overcooked, to study the challenge of real-time reasoning. In these games, agents must respond to dynamic environments both correctly and promptly to achieve high rewards. Experiments show that under cognitive load and time pressure, AgileThinker (ours), which engages two LLMs to combine System 1 and System 2 reasoning, substantially outperforms agents that engage only one LLM. Scores are normalized to [0, 1] for each game and then averaged.
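The aggregation described in the caption can be sketched as below. This is only an illustration: the caption does not say how scores are normalized, so min-max scaling is an assumption, and the raw scores and per-game bounds are hypothetical values rather than results from the experiments.

```python
def normalize(score, lo, hi):
    """Min-max scale a raw game score into [0, 1] (assumed scheme)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(score, lo), hi)  # keep out-of-range scores inside the bounds
    return (clipped - lo) / (hi - lo)


def average_normalized(raw_scores, bounds):
    """Normalize each game's score with its own bounds, then average."""
    normalized = [normalize(raw_scores[g], *bounds[g]) for g in raw_scores]
    return sum(normalized) / len(normalized)


# Hypothetical raw scores and per-game (min, max) bounds, for illustration only.
raw = {"Freeway": 5.0, "Snake": 10.0, "Overcooked": 30.0}
bounds = {"Freeway": (0.0, 10.0), "Snake": (0.0, 20.0), "Overcooked": (0.0, 40.0)}
overall = average_normalized(raw, bounds)
```

Normalizing per game before averaging keeps a game with a large raw score range from dominating the combined number.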
Abstract
Agents in the real world need to make not only logical but also timely judgments, which demands continuous awareness of the dynamic environment where hazards emerge, opportunities arise, and other agents act, all while the agent's own reasoning is still unfolding. Despite significant advances in the reasoning capabilities of language models, existing approaches fail to account for this dynamic nature. We introduce real-time reasoning as a new problem formulation for bringing reasoning capabilities to agents operating in evolving environments and build a Real-Time Reasoning Gym to demonstrate it. We study two paradigms for deploying reasoning language models in agents: System 1 Agent, which employs reasoning models with bounded computation for rapid responses, and System 2 Agent, which allows extended computation for complex problems. Our experiments reveal that even state-of-the-art language models struggle to make logical and timely judgments under either paradigm. To address this limitation, we propose AgileThinker, a parallel architecture that engages both reasoning systems simultaneously. Its advantage grows as task difficulty and time pressure increase, because it manages the trade-off between reasoning depth and response latency. Our work establishes real-time reasoning as a critical frontier for developing practical reasoning agents and provides a foundation for future research in temporally constrained artificial intelligence systems, highlighting a path toward real-time capable language agents.
Using the realtimegym Python Package
Installation
Install the Real-Time Reasoning Gym package from source by cloning the repository and running pip in editable mode:
git clone git@github.com:wenyl22/RealtimeGym.git
cd RealtimeGym
pip install -e .
Quick Start
Get started with a simple example:
import realtimegym
# Create environment
env, seed, renderer = realtimegym.make('Freeway-v0')
obs, done = env.reset()
Available Environments
Freeway
Navigate through dynamic traffic with real-time decision making.
realtimegym.make('Freeway-v0')
Snake
Strategic planning for food collection while avoiding obstacles.
realtimegym.make('Snake-v0')
Overcooked
Cooperative cooking with coordination and task prioritization.
realtimegym.make('Overcooked-v0')
Agent Implementations
Reactive Agent
Fast, intuitive System 1
Always react quickly with bounded compute; no planning thread.
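class ReactiveAgent:
    def think(timeout):
        # Start a fresh reactive thread on the latest observation (no plan as context).
        start_reactive_thread(current_observation, "")
        run_reactive_thread(internal_budget)
        # If the thread has not finished within its budget, force it to answer now.
        if reactive_thread_is_alive():
            s1_budget_forcing()
        action = get_reactive_thread_response()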
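The budget-forcing step above can be illustrated with a toy stand-in for a streaming model. This is a sketch under stated assumptions, not the package's implementation: `ToyReasoner`, its token accounting, and the fallback action are all hypothetical, and "forcing" is modeled simply as committing to a best-effort action once the budget runs out.

```python
class ToyReasoner:
    """Hypothetical stand-in for a streaming LLM that needs `cost` tokens to finish."""

    def __init__(self, cost, answer, fallback):
        self.cost = cost          # tokens required to complete the reasoning
        self.answer = answer      # action produced if reasoning completes
        self.fallback = fallback  # action produced if reasoning is cut short
        self.used = 0

    def run(self, budget):
        """Consume up to `budget` tokens; return True if reasoning finished."""
        self.used += min(budget, self.cost - self.used)
        return self.used >= self.cost

    def force_answer(self):
        """Budget forcing (assumed behavior): stop reasoning, commit to an action."""
        return self.fallback

    def response(self):
        return self.answer


def reactive_think(reasoner, internal_budget):
    """Run the reasoner within the budget; force an answer if it is still running."""
    finished = reasoner.run(internal_budget)
    if not finished:
        return reasoner.force_answer()
    return reasoner.response()


quick = reactive_think(ToyReasoner(100, "UP", "STAY"), internal_budget=256)
slow = reactive_think(ToyReasoner(1000, "UP", "STAY"), internal_budget=256)
```

Here `quick` finishes inside the budget and returns its own answer, while `slow` is cut off and returns the fallback, which is the property that keeps the reactive agent's latency bounded.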
Planning Agent
Slow, deliberate System 2
Plan first within the full timeout, then execute the first action.
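class PlanningAgent:
    def think(timeout):
        # The planning thread persists across calls; start one only if none is running.
        if not planning_thread_is_alive():
            start_planning_thread(current_observation)
        run_planning_thread(timeout)
        # Collect a new plan only once the thread has finished.
        if not planning_thread_is_alive():
            plan = get_planning_thread_response()
        # Execute the next planned action and keep the remainder for later steps.
        action = plan[0]; plan = plan[1:]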
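Because the planning thread can outlive a single step, the agent may have no finished plan for several steps. The toy below sketches that behavior; `ToyPlanningAgent`, the token cost, and the default action used while no plan is available are assumptions made for illustration, since the pseudocode does not say what happens when the plan is empty.

```python
class ToyPlanningAgent:
    """Hypothetical planner that needs `plan_cost` tokens to produce a plan."""

    def __init__(self, plan_cost, plan, default_action="STAY"):
        self.plan_cost = plan_cost            # tokens needed for one full plan
        self.planned = list(plan)             # plan the toy "model" will output
        self.default_action = default_action  # assumed fallback when no plan exists
        self.progress = None                  # None means no planning thread is running
        self.plan = []

    def think(self, timeout):
        if self.progress is None:             # start a new planning thread
            self.progress = 0
        self.progress += timeout              # run the thread for this step's budget
        if self.progress >= self.plan_cost:   # thread finished: collect the plan
            self.plan = list(self.planned)
            self.progress = None

    def act(self):
        if not self.plan:
            return self.default_action
        action, self.plan = self.plan[0], self.plan[1:]
        return action


agent = ToyPlanningAgent(plan_cost=10000, plan=["UP", "UP", "LEFT"])
actions = []
for _ in range(5):
    agent.think(timeout=4096)
    actions.append(agent.act())
```

With these numbers the first two steps fall back to the default action because planning is still in progress, which is the latency cost that deliberate planning pays under time pressure.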
AgileThinker
Parallel: System 1 + System 2
Plan in parallel with a fast reactive thread; use budget-aware forcing.
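class AgileThinker:
    def think(timeout):
        # Keep one long-running planning thread alive across steps.
        if not planning_thread_is_alive():
            start_planning_thread(current_observation)
        # Reserve internal_budget for the reactive thread; planning gets the rest.
        run_planning_thread(timeout - internal_budget)
        plan = get_planning_thread_response()
        # The reactive thread sees the latest observation plus the plan so far.
        start_reactive_thread(current_observation, plan)
        run_reactive_thread(internal_budget)
        if reactive_thread_is_alive(): s1_budget_forcing()
        action = get_reactive_thread_response()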
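The budget split in the pseudocode above can be sketched with a toy planner whose output is consumed incrementally. Everything here is hypothetical: the planner is a fixed list of tokens, the reactive policy is a simple rule instead of an LLM, and it responds instantly, so budget forcing is not modeled. The sketch only shows how the reactive side acts on whatever partial plan exists at each step.

```python
ACTIONS = {"UP", "DOWN", "STAY"}  # hypothetical action vocabulary


def reactive_policy(partial_plan, default="STAY"):
    """Pick the planner's latest suggested action, else a safe default."""
    suggestions = [tok for tok in partial_plan if tok in ACTIONS]
    return suggestions[-1] if suggestions else default


def agile_step(trace, progress, timeout, internal_budget):
    """Advance the planner, then act on whatever it has produced so far."""
    if internal_budget >= timeout:
        raise ValueError("internal_budget must leave room for planning")
    planning_budget = timeout - internal_budget  # the rest is reserved for reacting
    progress = min(progress + planning_budget, len(trace))
    action = reactive_policy(trace[:progress])
    return action, progress


trace = ["hmm", "car", "ahead", "UP"]  # toy planner output, one token per item
first, progress = agile_step(trace, 0, timeout=3, internal_budget=1)
second, progress = agile_step(trace, progress, timeout=3, internal_budget=1)
```

In the first step the partial plan contains no action yet, so the reactive policy falls back to its default; in the second step the planner's suggestion has arrived and is used.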
Complete Example
import realtimegym
from realtimegym.agents.agile import AgileThinker
from realtimegym.prompts import freeway as prompt
env, seed, _ = realtimegym.make("Freeway-v0")
obs, done = env.reset()
log_file = "freeway_v0_agile.csv"
agent = AgileThinker(prompt, log_file, 'token')
while not done:
    agent.observe(obs)  # Fast observation
    agent.think(timeout=4096)  # Bounded thinking (token or seconds)
    action = agent.act()
    obs, done, reward, reset = env.step(action)
BibTeX
@article{wen2024realtime,
title={Real-Time Reasoning Agents in Evolving Environments},
author={Wen, Yule and Ye, Yixin and Zhang, Yanzhe and Yang, Diyi and Zhu, Hao},
journal={International Conference on Learning Representations},
year={2025},
url={https://bleaves.github.io/real-time-reasoning/}
}