Simulated-economy benchmark: a computational planner outperforms markets in the model, with important caveats
This paper builds a reproducible computer benchmark that compares three ways of running an economy: a centralized computational planner, a decentralized agent-based market, and a hybrid “meta-market.” In the reported runs the planner produced lower average welfare losses than the other two systems. Across ten training worlds the planner’s mean loss was 0.1018, versus 0.3072 for the agent-based market and 0.2679 for the meta-market. Similar gaps appear in five nominal holdout worlds (planner 0.1100, market 0.3127, meta-market 0.2697).
The authors generate many synthetic “worlds.” Each world has 150 sectors, ten regions, seven social-need groups, and 140 firms. The model uses a Leontief input–output core, an accounting structure that ties each sector’s output to the inputs it needs from other sectors. The benchmark also includes capacity limits, differing firm productivities, prices that adjust inside the simulation, eleven welfare loss components (essential unmet demand, total unmet demand, inequality across groups, quality shortfalls, corruption, rights violations, logistics failures, innovation shortfalls, environmental burden, volatility, and administrative cost), randomized welfare weights, structural shocks, and adversarial stress tests.
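To make the Leontief core concrete, here is a minimal sketch of the accounting identity x = A x + d, where x is gross output per sector, d is final demand, and A[i][j] is the amount of sector i's output needed per unit of sector j's output. The three-sector matrix, the demand vector, and the function name leontief_output are hypothetical illustrations, not values or code from the paper, and the fixed-point iteration is chosen for readability; the paper does not say which solver it uses.

```python
# Hypothetical 3-sector example; coefficients are illustrative, not from the paper.
# A[i][j] = units of sector i's output needed to produce one unit of sector j's output.
A = [
    [0.1, 0.2, 0.0],
    [0.3, 0.1, 0.2],
    [0.0, 0.1, 0.1],
]
d = [10.0, 20.0, 30.0]  # final demand per sector (assumed numbers)


def leontief_output(A, d, tol=1e-10, max_iter=10_000):
    """Solve x = A x + d by fixed-point iteration.

    The iteration converges when the spectral radius of A is below 1,
    which holds here because every row of A sums to less than 1.
    """
    n = len(d)
    x = list(d)  # start from final demand alone
    for _ in range(max_iter):
        x_new = [d[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")


x = leontief_output(A, d)

# Check the accounting identity: gross output minus intermediate use equals final demand.
residual = [
    x[i] - sum(A[i][j] * x[j] for j in range(len(d))) - d[i]
    for i in range(len(d))
]
print("gross output:", [round(v, 3) for v in x])
print("max residual:", max(abs(r) for r in residual))
```

Gross output exceeds final demand in every sector because part of each sector's production is consumed as an input elsewhere. At the benchmark's scale of 150 sectors, a direct linear solve such as numpy.linalg.solve applied to (I - A) x = d would be a natural alternative to the iteration shown here.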
The three allocation systems are simulated with distinct rules. The planner receives a noisy demand signal plus some local corrections, directly uses the production matrix to solve the Leontief system, enforces capacity and reserve rules, and prioritizes essential sectors. The agent-based market assigns firms to sectors, updates prices in response to excess demand over ten periods, and lets firms update output, quality, cash-mediated productivity, and reputation. The meta-market adds stronger arbitrage and hedging behavior on top of the market process. The benchmark also runs optimizing red-team adversaries and information-reporting experiments to probe failure modes.
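The market's price step can be illustrated with a simple excess-demand rule: raise the price when demand exceeds supply and lower it otherwise, repeated for ten periods. The sketch below is an assumption-laden illustration for a single good. The linear demand and supply curves, the step size, and the names excess_demand and adjust_price are all hypothetical, and the paper's actual update rule may differ.

```python
def excess_demand(price):
    """Demand minus supply for one good, using assumed linear curves."""
    demand = 100.0 - 10.0 * price   # hypothetical demand curve
    supply = 20.0 * price - 20.0    # hypothetical supply curve
    return demand - supply


def adjust_price(price, step=0.02, periods=10):
    """Move the price in the direction of excess demand for a fixed number of periods."""
    history = [price]
    for _ in range(periods):
        # Positive excess demand pushes the price up; negative pushes it down.
        price = max(0.0, price + step * excess_demand(price))
        history.append(price)
    return history


history = adjust_price(1.0)
for period, p in enumerate(history):
    print(f"period {period}: price = {p:.4f}")
```

With these assumed curves the market clears at a price of 4, and the simulated price approaches that value from below over the ten periods. A larger step size can overshoot and oscillate, which is one reason a price process with a limited number of periods may leave demand unmet.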