
Core Team
Lorenzo Steno
Agentic AI in Science
Master's Student
About
Lorenzo Steno is a Master’s student in Computer Science at the University of Twente and is currently completing his thesis at ETH Zurich’s Agentic Systems Lab. His work sits at the intersection of reinforcement learning, efficient post-training, and AI for science, with a particular focus on making language models more capable under real-world computational constraints. Before joining ETH Zurich, he worked at AWS, where he developed a generative AI tool that reduced internal document review time by 40%, and contributed as a research assistant at the University of Twente’s AI & IoT Lab. Lorenzo is especially interested in building AI systems that are not only more powerful, but also more reliable, scalable, and practically deployable.
Research Areas
Connect
Project
Reinforcement Learning for Cost-Aware Recursive Language Models
Large language models remain constrained by one of their most important bottlenecks: long-context reasoning. As input length grows, performance often deteriorates, making it difficult to solve problems that require reasoning across large codebases, long scientific papers, or complex multi-step workflows. This project explores Recursive Language Models (RLMs), a promising alternative in which a model decomposes difficult tasks into smaller subproblems and invokes reduced versions of itself to solve them. His work investigates how reinforcement learning can train these systems to decide when to recurse, how deeply to recurse, and when the added computation is justified by gains in answer quality. By combining this with parameter-efficient fine-tuning methods such as LoRA, the project aims to make advanced recursive reasoning trainable on modest hardware.

Scientifically, the project addresses an underexplored but increasingly important question: how should models optimize not only for correctness, but also for the cost of reasoning itself? Recursive inference introduces a new control dimension in which the model effectively chooses its own computational path. Lorenzo's research studies this as a learning problem, where recursion depth, execution mode, and reasoning quality must be balanced jointly. It also extends current work on reinforcement learning with language models into more agentic, multi-step settings, while examining how LoRA interacts with RL beyond standard single-turn reasoning benchmarks. This makes the work relevant both to efficient model adaptation and to broader questions around inference-time scaling.

Many high-value enterprise use cases now fail not because models are incapable in principle, but because existing systems become too expensive, too brittle, or too limited when tasks exceed standard context windows.
Cost-aware recursive models could enable practical systems for large-scale document analysis, systematic literature review, software engineering support, legal and compliance workflows, and other knowledge-intensive applications that span millions of tokens. A system that can intelligently allocate compute while maintaining strong reasoning performance would be highly attractive in settings where reliability, efficiency, and customization matter at scale.
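The trade-off described above can be made concrete with a minimal sketch of a cost-aware reward signal. This is an illustrative example, not the project's actual method: the penalty coefficients and function names are hypothetical, assuming a simple scalar penalty per token generated and per level of recursion.

```python
# Illustrative sketch of a cost-aware reward for recursive inference.
# All names and coefficients (lam, depth_penalty) are hypothetical and
# chosen for illustration only -- not taken from the project itself.

def cost_aware_reward(correct: bool, tokens_used: int, recursion_depth: int,
                      lam: float = 1e-4, depth_penalty: float = 0.05) -> float:
    """Reward = task accuracy minus a penalty for compute spent.

    The penalty grows with the total tokens generated across recursive
    calls and with recursion depth, so a policy trained on this signal
    is pushed to recurse only when the expected accuracy gain outweighs
    the extra computation.
    """
    accuracy = 1.0 if correct else 0.0
    return accuracy - lam * tokens_used - depth_penalty * recursion_depth

# Example: a correct answer that consumed 2,000 tokens and one recursive call.
r = cost_aware_reward(True, tokens_used=2000, recursion_depth=1)
# r = 1.0 - 0.2 - 0.05 = 0.75
```

Under this kind of shaping, an answer that is correct but obtained through deep, token-heavy recursion can score lower than a slightly cheaper solution, which is exactly the control dimension the project treats as a learning problem.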
Other team members
Students
Interested in collaborating?
We are always looking for talented students, researchers, and industry partners.