
Core Team
Nicolas Zumarraga
Multimodal AI
Master's Student
About
Nicolas Zumarraga works at the intersection of time-series AI, multimodal reasoning, and real-world deployment. As a master's student in Statistics & Data Science, he is focused on advancing Time Series Language Models (TSLMs), with particular interest in healthcare and other domains where continuous data is abundant but difficult to interpret at scale.

What distinguishes Nicolas is the combination of research depth and product-minded execution. Before joining ASL, he worked at Amazon, where he managed business accounts and built large-scale ETL pipelines, and he also founded Gober, a data and AI platform for European public procurement that became part of the Google for Startups and NVIDIA Inception ecosystems. Across industry and entrepreneurship, he has built production-ready RAG systems, fine-tuned large models, and designed scalable cloud architectures.

At ASL, Nicolas is developing methods that help AI systems reason over massive temporal contexts with greater precision. His work aims to make long, continuous data streams more searchable, interpretable, and actionable, an important step toward turning raw sensor and health data into usable intelligence.
Project
Long-Context AI for Time-Series Retrieval
This project focuses on a major frontier in AI: enabling models to reason effectively over long, continuous streams of temporal data. While modern language models perform impressively on text, they remain far less capable when asked to localize, retrieve, and reason over specific events buried inside millions of time-series datapoints. Nicolas's work addresses this challenge by designing architectures and evaluation frameworks for Time Series Language Models (TSLMs) that can handle long-context temporal reasoning more reliably. A central component of this effort is TS-Haystack, a multi-scale retrieval benchmark that tests whether models can identify precise temporal events across extremely large contexts (a minimal illustration of this kind of task appears below). The broader goal is to bridge natural language modeling with real-world continuous signals.

Scientifically, this research addresses a foundational limitation in multimodal AI: the mismatch between discrete token-based modeling and the continuous, multi-resolution structure of temporal data. By moving beyond forecasting and toward retrieval, localization, and reasoning, the project helps define a more rigorous standard for evaluating next-generation TSLMs. It also creates a framework for comparing architectures such as Transformers and State-Space Models under demanding long-context conditions, revealing where current systems lose temporal fidelity. This contributes to a deeper understanding of how AI models can represent and reason over continuous phenomena in domains such as health, infrastructure, and industrial systems.

In sectors such as healthcare, industrial maintenance, and finance, valuable information is often hidden inside large streams of sensor or monitoring data, yet extracting it still depends heavily on manual review or brittle rule-based systems. Models that can natively search, interpret, and reason over long time-series data could make anomaly detection, diagnostics, and monitoring far more scalable and precise. This opens the door to practical systems that help organizations turn raw temporal data into high-value operational insight, with clear relevance for clinical decision support, predictive maintenance, and intelligent monitoring platforms.
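To make the needle-in-a-haystack idea concrete, the Python sketch below hides a short anomalous segment in a long noisy signal and scores a predictor on whether it can localize it. Everything here is an illustrative assumption: the function names (make_haystack, localized), the signal construction, and the parameters are invented for this example and are not TS-Haystack's actual data, metrics, or API.

    # Illustrative needle-in-a-haystack task for time series (hypothetical,
    # not the real TS-Haystack): hide a short anomalous burst in a long noisy
    # signal and score a predictor on localizing it.
    import numpy as np

    def make_haystack(n_points, needle_len, seed=0):
        """Return a noisy signal with a hidden 'needle' and its true start."""
        rng = np.random.default_rng(seed)
        signal = rng.normal(0.0, 1.0, size=n_points)          # background noise
        start = int(rng.integers(0, n_points - needle_len))   # hidden location
        burst = 5.0 * np.sin(np.linspace(0.0, 4.0 * np.pi, needle_len))
        signal[start : start + needle_len] += burst           # inject the needle
        return signal, start

    def localized(pred_start, true_start, tolerance):
        """Binary localization metric: prediction within `tolerance` steps."""
        return abs(pred_start - true_start) <= tolerance

    # A long context (1M points) hiding a 256-point anomaly.
    signal, true_start = make_haystack(n_points=1_000_000, needle_len=256)

    # Non-learned baseline: slide a window and pick the highest-energy one.
    # In the benchmark setting, a TSLM would instead be queried over the series.
    window = 256
    energy = np.convolve(signal ** 2, np.ones(window), mode="valid")
    pred_start = int(np.argmax(energy))

    print(localized(pred_start, true_start, tolerance=window))  # True if found

In the real setting, the prediction would come from a TSLM asked in natural language to locate the event; the sliding-window energy baseline above simply keeps the sketch self-contained and shows why localization, unlike forecasting, has a crisp pass/fail criterion.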