Haseeb Raza

Core Team

AI Research

Master's Student

About

Haseeb Raza is a computer scientist working at the intersection of agentic AI, large language models, and responsible AI. As a final-year BSc student at Eötvös Loránd University, he combines academic research with hands-on engineering experience across industry and scientific settings. At the Agentic Systems Lab at ETH Zurich, he focuses on how multi-agent LLM systems can make scientific workflows more rigorous, transparent, and scalable. Alongside this, he studies bias in language models at ELTE’s Research Center for Computational Social Science and builds multilingual LLM applications as a Software Engineer at Infineon Technologies. He also founded the ELTE Data Science Club, growing it into a community of more than 100 members. His work is driven by a clear theme: building AI systems that do not just generate language, but support better reasoning, evaluation, and decision-making in high-stakes domains.

Research Areas

01 Multi-agent reasoning
02 Scientific AI evaluation
03 Responsible LLM systems

Project

Agentic Scientific Review Systems

Haseeb is developing agentic AI systems designed to support scientific review and research evaluation. His project explores how multiple specialized language model agents can act as structured co-reviewers for research papers, each focusing on dimensions such as methodology, clarity, novelty, reproducibility, and technical consistency. Rather than producing a single undifferentiated judgment, the system decomposes evaluation into explicit analytical steps, creating more transparent and explainable feedback for human researchers, reviewers, and decision-makers. The goal is not to replace expert review, but to build an intelligent layer that helps surface weaknesses, improve paper quality, and make complex scientific assessment more efficient.

From a scientific perspective, the project investigates an important question in modern AI: how agentic LLM architectures can be designed to reason more reliably in domains where precision, justification, and traceability matter. Scientific peer review is a particularly demanding testbed because it requires structured reasoning, sensitivity to uncertainty, and the ability to weigh multiple criteria at once. By studying how multi-agent systems critique, compare, and refine each other’s outputs, this work contributes to broader research on trustworthy AI, evaluation frameworks, and the use of language models in knowledge-intensive environments. It also offers a practical setting for understanding how explainability and reproducibility can be improved in AI-assisted scientific workflows.

Scientific publishers, funding bodies, university labs, R&D organizations, and deep-tech investors all face growing volumes of technical material that require fast but careful evaluation. A system that can flag methodological concerns, summarize risks, and highlight scientific or technical strengths could meaningfully reduce the time and cost of expert assessment while improving consistency.
Beyond publishing, the same infrastructure could support technical due diligence, internal research scouting, and innovation evaluation across industry. In practice, this creates a strong foundation for a research intelligence capability that helps organizations make better, faster decisions on complex technical work.
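The decomposition described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the project's actual implementation: the agent names, the `Finding` structure, and the stubbed keyword heuristic (standing in for a real LLM call) are all assumptions made for the sake of a runnable example. The point it shows is structural: each agent reviews one dimension and returns a labeled finding, and the aggregator preserves per-dimension detail so the final report stays explainable rather than collapsing into a single opaque score.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # One agent's structured judgment on a single review dimension.
    dimension: str
    score: int      # 1 (weak) .. 5 (strong)
    comment: str


@dataclass
class ReviewAgent:
    dimension: str

    def review(self, paper_text: str) -> Finding:
        # Stub heuristic: in a real system this would be an LLM call
        # prompted for this dimension; here a keyword check keeps the
        # sketch self-contained and runnable.
        hit = self.dimension.lower() in paper_text.lower()
        score = 5 if hit else 3
        comment = f"{self.dimension}: {'addressed explicitly' if hit else 'not explicitly discussed'}."
        return Finding(self.dimension, score, comment)


def aggregate(findings: list[Finding]) -> dict:
    # Keep every per-dimension finding in the report so a human reader
    # can trace how the overall score was reached.
    return {
        "overall": round(sum(f.score for f in findings) / len(findings), 2),
        "details": {f.dimension: {"score": f.score, "comment": f.comment}
                    for f in findings},
    }


if __name__ == "__main__":
    agents = [ReviewAgent(d) for d in
              ("Methodology", "Clarity", "Novelty", "Reproducibility")]
    paper = "We describe the methodology and release code for reproducibility."
    report = aggregate([a.review(paper) for a in agents])
    print(report["overall"])          # average across dimensions
    for dim, detail in report["details"].items():
        print(dim, detail["score"], detail["comment"])
```

In a fuller version, the agents could also read and critique each other's findings before aggregation, which is where the multi-agent refinement described above comes in.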

Other team members

Students

Interested in collaborating?

We are always looking for talented students, researchers, and industry partners.

Get in Touch