Real-Time Reasoning Agents in Evolving Environments

Wen, Yule; Ye, Yixin; Zhang, Yanzhe; Yang, Diyi; Zhu, Hao

Yule Wen¹*, Yixin Ye²*, Yanzhe Zhang³, Diyi Yang⁴, Hao Zhu⁴

¹Tsinghua University ²Shanghai Jiao Tong University
³Georgia Institute of Technology ⁴Stanford University *Co-leading authors

Figure 1 showing the overview of AgileThinker architecture

We create three real-time games, Freeway, Snake, and Overcooked, to study the challenge of real-time reasoning. In these games, agents must respond to dynamic environments both intelligently and promptly to achieve high rewards. Experiments show that under cognitive load and time pressure, AgileThinker (ours), which engages two LLMs for System 1 and System 2 reasoning, greatly outperforms agents that engage only one LLM. Scores are normalized to [0, 1] for each game and then averaged.
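The per-game normalization and averaging mentioned in the caption can be sketched as follows. This is only an illustration: the min-max form, the score bounds, and the raw scores are assumptions made for the example, not values or formulas taken from the paper.

```python
def normalize(score: float, lo: float, hi: float) -> float:
    """Min-max scale a raw game score into [0, 1], clipping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def average_normalized(raw: dict[str, float], bounds: dict[str, tuple[float, float]]) -> float:
    """Normalize each game's score separately, then take the unweighted mean."""
    normalized = [normalize(raw[game], *bounds[game]) for game in raw]
    return sum(normalized) / len(normalized)


# Hypothetical bounds and raw scores, for illustration only.
bounds = {"Freeway": (0.0, 10.0), "Snake": (-5.0, 15.0), "Overcooked": (0.0, 40.0)}
raw = {"Freeway": 5.0, "Snake": 5.0, "Overcooked": 30.0}
overall = average_normalized(raw, bounds)
```

Normalizing before averaging keeps a game with a large raw score range from dominating the combined number.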

Abstract

Agents in the real world need to make not only logical but also timely judgments, which demands continuous awareness of the dynamic environment where hazards emerge, opportunities arise, and other agents act, all while the agent's own reasoning is still unfolding. Despite significant advances in reasoning capabilities of language models, existing approaches fail to account for this dynamic nature. We introduce real-time reasoning as a new problem formulation for bringing reasoning capabilities to agents operating in evolving environments and build a Real-Time Reasoning Gym to demonstrate it. We study two paradigms for deploying reasoning language models in agents: System 1 Agent, which employs reasoning models with bounded computation for rapid responses, and System 2 Agent, which allows extended computation for complex problems. Our experiments reveal that even state-of-the-art language models struggle to make logical and timely judgments via either of the two paradigms. To address this limitation, we propose AgileThinker, a parallel architecture that simultaneously engages both reasoning systems. This approach demonstrates superior performance as task difficulty and time pressure increase, managing the trade-off between reasoning depth and response latency. Our work establishes real-time reasoning as a critical frontier for developing practical reasoning agents and provides a foundation for future research in temporally constrained artificial intelligence systems, highlighting a path toward real-time capable language agents.

Using the realtimegym Python Package

Installation

Clone the repository and install the Real-Time Reasoning Gym package in editable mode with pip:

git clone git@github.com:wenyl22/RealtimeGym.git
cd RealtimeGym
pip install -e .

Quick Start

Get started with a simple example:


import realtimegym

# Create environment
env, seed, renderer = realtimegym.make('Freeway-v0')
obs, done = env.reset()

Available Environments

Freeway

Navigate through dynamic traffic with real-time decision making.

realtimegym.make('Freeway-v0')

Snake

Strategic planning for food collection while avoiding obstacles.

realtimegym.make('Snake-v0')

Overcooked

Cooperative cooking with coordination and task prioritization.

realtimegym.make('Overcooked-v0')

Agent Implementations

Reactive Agent

Fast, intuitive System 1

Always react quickly with bounded compute; no planning thread.


                  
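class ReactiveAgent:
  def think(self, timeout):
    # Start a fresh reactive thread on the latest observation; there is no plan to pass in.
    start_reactive_thread(current_observation, "")
    # Let it run for at most the internal budget.
    run_reactive_thread(internal_budget)
    # If it is still generating, force it to commit to an answer now.
    if reactive_thread_is_alive():
      s1_budget_forcing()
    action = get_reactive_thread_response()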

Planning Agent

Slow, deliberate System 2

Plan first within the full timeout, then execute the first action.


                  
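class PlanningAgent:
  def think(self, timeout):
    # Start a new planning thread only when none is running.
    if not planning_thread_is_alive():
      start_planning_thread(current_observation)
    # Let the planner consume the full per-step budget.
    run_planning_thread(timeout)
    # Adopt the new plan only once the planner has finished.
    if not planning_thread_is_alive():
      plan = get_planning_thread_response()
    # Execute the first planned action and keep the rest for later steps.
    action = plan[0]
    plan = plan[1:]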
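To see how this loop behaves when a plan takes longer than one step, here is a small runnable simulation. It is a sketch under stated assumptions: `ToyPlanningAgent`, the token cost of a plan, the fixed plan contents, and the `"stay"` default used while no plan is ready are all hypothetical and not part of the realtimegym package.

```python
class ToyPlanningAgent:
    """Simulates a planner that needs `plan_cost` tokens before its plan is usable."""

    def __init__(self, plan_cost, plan, default="stay"):
        self.plan_cost = plan_cost   # hypothetical tokens needed to finish one plan
        self.new_plan = list(plan)   # hypothetical output of every finished planner run
        self.default = default       # assumed action while no plan is available
        self.remaining = 0           # tokens the running planner still needs (0 = not running)
        self.plan = []               # actions left over from the last finished plan

    def think(self, timeout):
        # Start a new planner only when none is running.
        if self.remaining == 0:
            self.remaining = self.plan_cost
        # Spend this step's whole budget on planning.
        self.remaining = max(0, self.remaining - timeout)
        # Adopt the new plan only once the planner has finished.
        if self.remaining == 0:
            self.plan = list(self.new_plan)
        # Execute the next planned action, or fall back if nothing is ready.
        return self.plan.pop(0) if self.plan else self.default


agent = ToyPlanningAgent(plan_cost=8, plan=["up", "left"])
actions = [agent.think(timeout=4) for _ in range(4)]
```

With a plan that costs two steps of budget, the first step has nothing to execute, and every later action comes from a plan that was started on an older observation.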

AgileThinker

Parallel: System 1 + System 2

Plan in parallel with a fast reactive thread; use budget-aware forcing.


                  
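class AgileThinker:
  def think(self, timeout):
    # Keep one planning thread running across steps; restart it only after it finishes.
    if not planning_thread_is_alive():
      start_planning_thread(current_observation)
    # Give the planner everything except the budget reserved for the reactive thread.
    run_planning_thread(timeout - internal_budget)
    # Read whatever the planner has produced so far.
    plan = get_planning_thread_response()
    # Start a reactive thread conditioned on the latest observation and that plan.
    start_reactive_thread(current_observation, plan)
    run_reactive_thread(internal_budget)
    # If it is still generating, force it to commit to an answer now.
    if reactive_thread_is_alive():
      s1_budget_forcing()
    action = get_reactive_thread_response()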
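The budget split in the pseudocode above can be exercised without any LLM by simulating each thread as a token stream. The following is a minimal sketch, not the package's implementation: `SimulatedThread`, `ToyAgileThinker`, the `react` function, the token lists, and the `"stay"` fallback standing in for budget forcing are all assumptions made for illustration.

```python
class SimulatedThread:
    """Stand-in for a streaming LLM call: emits one token per unit of budget."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def run(self, budget):
        # Advance by at most `budget` tokens; a non-positive budget does nothing.
        self.pos = min(len(self.tokens), self.pos + max(0, budget))

    def is_alive(self):
        return self.pos < len(self.tokens)

    def response(self):
        # Partial output is readable while the thread is still running.
        return self.tokens[: self.pos]


class ToyAgileThinker:
    def __init__(self, internal_budget, plan_tokens, react, fallback="stay"):
        self.internal_budget = internal_budget  # tokens reserved for the reactive thread
        self.plan_tokens = plan_tokens          # hypothetical planner output
        self.react = react                      # maps a partial plan to reactive tokens
        self.fallback = fallback                # assumed outcome of budget forcing
        self.planner = None

    def think(self, timeout):
        # Keep one planner running across steps; restart only after it finishes.
        if self.planner is None or not self.planner.is_alive():
            self.planner = SimulatedThread(self.plan_tokens)
        self.planner.run(timeout - self.internal_budget)
        partial_plan = self.planner.response()

        # The reactive thread starts fresh each step and is built from the partial plan.
        reactive = SimulatedThread(self.react(partial_plan))
        reactive.run(self.internal_budget)
        if reactive.is_alive():
            action = self.fallback              # cut off: answer immediately
        else:
            action = reactive.response()[-1]    # last token is the chosen action
        return partial_plan, action


def react(partial_plan):
    # Two "thinking" tokens, then follow the newest plan token if one exists.
    return ["look", "decide", partial_plan[-1] if partial_plan else "stay"]


agent = ToyAgileThinker(internal_budget=3, plan_tokens=["up", "left", "down", "right"], react=react)
first = agent.think(timeout=5)
second = agent.think(timeout=5)
```

In this toy run the planner advances two tokens per step while the reactive thread always finishes within its reserved three tokens. Shrinking `internal_budget` to 2 gives the planner more room but leaves the reactive thread unfinished, so the fallback action is used instead.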

Complete Example


import realtimegym
from realtimegym.agents.agile import AgileThinker
from realtimegym.prompts import freeway as prompt

# Create the environment with the same call used in Quick Start
env, seed, _ = realtimegym.make("Freeway-v0")
obs, done = env.reset()

log_file = "freeway_v0_agile.csv"
agent = AgileThinker(prompt, log_file, 'token')

while not done:
    agent.observe(obs)           # Fast observation
    agent.think(timeout=4096)    # Bounded thinking (token or seconds)
    action = agent.act()
    obs, done, reward, reset = env.step(action)
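The loop above relies only on three agent methods (`observe`, `think`, `act`) and on the return shapes of `env.reset()` and `env.step()`. The sketch below reproduces that protocol with stand-ins so it can run without the package or an LLM. `CountdownEnv`, `AlwaysUpAgent`, the observation dictionary, and the action strings are hypothetical; the fourth value returned by `step` is a placeholder because its meaning is not described on this page.

```python
class CountdownEnv:
    """Toy stand-in environment (not part of realtimegym): ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return {"step": self.t}, False

    def step(self, action):
        self.t += 1
        reward = 1 if action == "up" else 0
        done = self.t >= self.horizon
        # Same 4-tuple order as in the example loop; the last value is a placeholder.
        return {"step": self.t}, done, reward, False


class AlwaysUpAgent:
    """Minimal agent exposing the observe/think/act methods used by the loop."""

    def __init__(self):
        self.obs = None
        self.action = "stay"

    def observe(self, obs):
        self.obs = obs

    def think(self, timeout):
        # A real agent would spend up to `timeout` tokens or seconds reasoning here.
        self.action = "up"

    def act(self):
        return self.action


env = CountdownEnv(horizon=3)
agent = AlwaysUpAgent()
obs, done = env.reset()
total_reward = 0
while not done:
    agent.observe(obs)
    agent.think(timeout=16)
    obs, done, reward, _ = env.step(agent.act())
    total_reward += reward
```

Separating `think` from `act` is what lets the caller bound reasoning per step: the environment advances whether or not the agent has finished thinking.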

BibTeX

@article{wen2024realtime,
  title={Real-Time Reasoning Agents in Evolving Environments},
  author={Wen, Yule and Ye, Yixin and Zhang, Yanzhe and Yang, Diyi and Zhu, Hao},
  journal={International Conference on Learning Representations},
  year={2025},
  url={https://bleaves.github.io/real-time-reasoning/}
}