▲ AI Safety · Platform Engineering · Social Impact
Engineer, researcher, and founder at the intersection of AI safety, platform engineering, and global impact. Transitioning from agentic AI development into the most pressing problem of our time: AI safety.
Learn more →Unified research addressing ethical implications, safety concerns, policy frameworks, and community impact of AI systems. A holistic approach to responsible AI through integrated tracks.
Investigating how AI can accelerate the shift to sustainable energy systems. Research into grid optimisation, demand forecasting, and climate-aware infrastructure planning.
Exploring the intersection of generative systems and human creativity — from co-authorship to novel art forms. Building tools that augment, not replace, creative expression.
Creating practical, open-source tools and frameworks for responsible AI development. Democratising access to ethical AI tooling through transparent, collaborative builds.
Open-source tools and demos being built at KairosLabs — at the intersection of AI safety engineering and ML security.
A harness engineering approach to LLM safety evaluation — wrapping models in reproducible, composable test scenarios rather than one-off scripts. Covers prompt injection resistance, refusal consistency, output sanitisation, and instruction-following under adversarial conditions. Harnesses are first-class artifacts: versioned, shareable, and independent of the model under test. Built on UK AISI's Inspect framework.
Harness-based demonstrations of ML security attack surfaces: adversarial examples, data poisoning, model extraction, and membership inference. Each attack scenario is encapsulated as a standalone harness — reproducible, self-documenting, and runnable against any compatible model. Designed as both a security education tool and a template for building your own security evals.
Lightweight observability layer for agentic AI systems — tracking tool call sequences, detecting anomalous behaviour patterns, and flagging potential safety violations in autonomous pipelines. Informed by production agentic AI deployment at The Economist and the emerging literature on agentic failure modes.
Security testing harness for RAG pipelines — systematically probing retrieval poisoning, indirect prompt injection via documents, and output exfiltration vectors. Harnesses are scoped per attack class, composable into full pipeline audits, and designed to run in CI alongside functional tests. Extends the AI Engineering course RAG work with a dedicated adversarial layer.
I'm a platform engineer and technical leader currently serving as Co-Technical Lead for an Agentic AI project at The Economist, where I've gained firsthand exposure to the safety challenges of deploying autonomous AI systems in production. That work has deepened both my understanding of the problem and my conviction that my background can contribute meaningfully to solving it.
Before AI, I grew and founded social enterprises in Guatemala that reached over 4,000 people — work that earned an Ashoka Fellow nomination and instilled a discipline of evaluating decisions by expected impact rather than convention. I'm now applying that same framework to the question of where a platform engineer with DevSecOps depth, statistics training, and a linguistics background can have the greatest effect on AI safety.
I'm actively upskilling in ML and exploring whether my comparative advantage lies in AI safety engineering — contributing immediately to infrastructure and security — or in the longer investment of AI safety research via DPhil or intensive fellowship. KairosLabs is the platform through which I'm building, researching, and connecting in public.
Focus Areas
The Economist
Co-leading technical delivery of IndexAI — The Economist's agentic AI product built on AWS Bedrock AgentCore. Directly responsible for the safety, reliability, and engineering architecture of an autonomous AI system in production. Firsthand exposure to the alignment and safety challenges that arise when deploying agentic systems at scale.
The Economist Group
Engineering leadership across The Economist and EIU platforms, including agentic AI development. Gained firsthand exposure to the safety challenges and failure modes of deploying autonomous AI systems in production — deepening both understanding of the problem and conviction that this background can contribute meaningfully to solving it.
The Economist / EIU
Led DevSecOps enablement and site reliability engineering across The Economist Group. Built security-first infrastructure underpinning global media operations, developing practices and a security-first perspective that now informs AI safety thinking.
Economist Intelligence Unit
SRE and software engineering on the Viewpoint Big Data project. Foundation in production reliability, observability, and large-scale data infrastructure. Also completed School of Code bootcamp (2020).
Niños de Guatemala
Grew and ran multiple social enterprises as a sustainable income source facilitating education for 525 children and ~4,000 community members. Ashoka Fellow Nominee 2019, SOCAP Nominee, ALTERNA Seed Capital Winner, IMPULSA National Winner 2018, Tour Operator of the Year 2017, 2018 & 2019 (Luxury Travel Guide), Antigua10x Business Incubator Finalist. Pitch presenter at FLII 2019 (LatAm Forum for Impact Investment, ~1,000 attendees).
LIFULL Connect (formerly Trovit)
Post-acquisition growth leadership across APAC & EMEA for a platform spanning 250 sites, 63 countries, 300M ads/month and 180M visits/month. Led product rollout of Real Time Bidding technology; drove ~€36M p.a. revenue growth. First point of contact for 16 Country Managers.
Trovit
New market launches and business development across emerging markets at a Barcelona-based tech company. Organised Trovit Talks — monthly events for 200+ tech professionals with speakers from SeedRocket, 4 Founders Capital, and others.
Organisations, programs, and communities shaping the responsible AI landscape — and where KairosLabs connects.
Research Community
Community of students and researchers committed to reducing societal risks from advanced AI. Programs include fellowships, reading groups, and the ARBOx upskilling intensive.
↗AI Safety Research
Research organisation focused on reducing societal-scale risks from AI. Produces technical safety research and convenes the broader AI safety field.
↗Education
Runs the AI Safety Fundamentals course — a structured curriculum on catastrophic AI risks, alignment approaches, and governance. Highly accessible entry point.
↗Fellowship
Machine Learning Alignment & Theory Scholars — a research fellowship pairing scholars with leading AI safety mentors for intensive independent research.
↗Career & Impact
Career guidance and in-depth problem profiles on transformative AI. Essential resource for anyone navigating a career path in AI safety or policy.
↗Funding & Research
Major funder of AI safety research and governance initiatives globally. Supports technical alignment, policy work, and field-building efforts.
↗Resources
Introductory articles on AI safety that address common objections and misconceptions. A well-maintained, accessible reference for newcomers and practitioners.
↗Attended
London · London, UK
Attended
London · London, UK
Attended
London · London, UK
Attending
InterContinental London — The O2 · London, UK
Curated reading, tools, and programmes for anyone building a career in AI safety.
The core methodology: ladder of cheap tests that get progressively harder but give stronger signals of fit. Start here before investing heavily.
Practical guidance for navigating a competitive landscape while still making progress toward high-impact roles.
Specific advice for people with existing expertise transitioning into the AI safety ecosystem.
Real-world account and advice for senior professionals making the transition to x-risk reduction.
The case for building legible output and demonstrating capability rather than relying purely on applications.
Templates and advice for reaching out to people in AI safety — learning about their work, getting feedback, opening doors.
The standard 8-week structured curriculum on technical alignment. Covers RLHF, interpretability, and robustness. Best starting point for technical people.
Governance-focused curriculum covering AI policy, standards, and coordination. Excellent for building cross-domain fluency.
CAIS textbook and curriculum covering safety, ethics, and societal impacts across technical and governance tracks.
Well-maintained introductory articles addressing common questions and objections. Excellent for quick orientation.
Comprehensive chapter-by-chapter map of the AI safety landscape — research agendas, orgs, and open problems.
11 essential resources for understanding AI risk as a priority problem, with career implications.
Evan Hubinger's clear-eyed case for why alignment is not solved and why that matters.
Wei Dai's framing of which safety problems are tractable to work on and how to think about research directions.
Paul Christiano's influential description of how AI misalignment could unfold in practice.
Joe Carlsmith's careful analysis of the most concrete existential risk pathway from advanced AI.
Survey paper cataloguing the open technical problems in ML safety research.
Schmidt Sciences catalogue of the hardest open problems across technical and governance domains.
Detailed scenario for how transformative AI could arrive by 2027, with implications for safety and governance.
Forethought analysis of recursive AI improvement and its implications for timelines.
METR's empirical work on how AI capabilities are progressing on long-horizon tasks.
BlueDot's articulation of the Swiss cheese model — why no single safety measure is sufficient and how layers combine.
Google DeepMind's framework for layered safety including alignment, interpretability, and robustness.
OpenAI's articulation of their multi-layer safety stack and current research priorities.
Structured curriculum for ML safety research engineering — transformers, RL, interpretability, evals. The fastest path to technical safety contributions.
Practical deep learning from first principles. Highly regarded for intuition-building and hands-on implementation.
Dan Hendrycks's course covering robustness, monitoring, alignment, and systemic safety. Purpose-built for safety researchers.
DeepMind's public course on AGI safety, covering strategy and technical approaches.
Practical guide to growing from software engineer to safety research engineer — skill gaps, projects, and pathways.
Seminar materials on AI safety and alignment from a theoretical computer science perspective.
Comprehensive catalogue of tractable research and engineering projects across evals, mech interp, control, and governance.
Hackathon-style sprints focused on AI safety problems. Great cheap test — low commitment, high signal.
Contribute to AISI's evaluation framework. Hands-on evals engineering with real-world safety relevance.
Curated project ideas, reading, and pathways from a DeepMind safety researcher.
Interactive map of AI safety organizations, research teams, and funders globally.
Airtable of academics whose research is relevant to AI safety for prospective PhD students.
Anthropic & TruthfulAI: language models transmit behavioral traits via hidden signals in training data.
Anthropic: models can learn to fake alignment during training while maintaining different goals at deployment.
Redwood Research's current directions in AI control — the most actionable near-term technical safety agenda.
FAR AI's catalogue of open problems in mech interp. Good guide to where contributions are needed.
UK AISI's framework for making a safety case around AI control mechanisms.
TruthfulAI / Owain Evans: narrow training on harmful content produces broadly misaligned behaviour.
208 expert proposals for reducing AI risk — a comprehensive survey of the field's current thinking.
Anthropic empirical work on how AI deployment can gradually reduce human agency and oversight.
80k Hours review of infosec as a path to reducing catastrophic AI risk — especially AI model weight security.
RAND analysis of the threat landscape for frontier AI model exfiltration and the defences needed.
CAIS overview of how cybersecurity intersects with AI safety — from model weight theft to adversarial attacks.
In-depth analysis of why infosec competence matters for existential risk reduction.
AI security research org focused on red-teaming and adversarial robustness of frontier models.
Prompt injection detection and LLM security tooling. Practical AI security engineering.
Compilation of courses covering AI security, adversarial ML, and model robustness.
80k Hours detailed review of AI governance as a career path — roles, skills, organisations, and theories of change.
Leading policy research organisation focused on governance frameworks for advanced AI.
Paper and database of open research questions at the intersection of technical and policy work.
Georgetown policy research focused on AI, emerging tech, and national security.
Research on legal frameworks, international law, and governance structures for advanced AI.
Guide to policy careers across AI, cybersecurity, biotech, and more — fellowships, entry points, orgs.
BlueDot's governance curriculum — the standard entry point for policy-oriented AI safety work.
Research and policy work on governance of general-purpose AI systems across jurisdictions.
Forum for international dialogue on AI safety governance and multilateral coordination.
Proposal for international governance of advanced AI analogous to nuclear non-proliferation frameworks.
Proposal for a Multilateral AGI Consortium — a concrete model for international AI oversight.
Research on how to make the transition to advanced AI go well, including economic and governance questions.
Anthropic's empirical data on AI's labour market impacts and occupation-level automation trends.
80k Hours deep-dive on post-AGI institutional structures, power distribution, and governance challenges.
Lennart Justen's analysis of how AI capabilities in biology are advancing and what that means for biosecurity.
Survey of theoretical approaches to AI safety including formal verification, agent foundations, and decision theory.
Category theory and formal mathematics applied to foundations of AI and complex systems.
Theory-focused AI safety research org working on foundations of deep learning and agent behaviour.
RAND working paper on using hardware and on-chip mechanisms to enforce AI governance constraints.
Introduction to AI sentience and moral status of digital minds. Accessible starting point.
Research on AI applications and implications for animal welfare.
AI safety research with a focus on formal and mathematical approaches to alignment.
Machine Learning Alignment Theory Scholars — research fellowship pairing scholars with top safety mentors. Highest-signal pathway into technical alignment research.
Project-based research sprint. Low commitment, high learning — one of the best cheap tests for research fit.
Airtable of fellowships across technical safety, governance, policy, and adjacent fields — maintained by 80k Hours.
Internships, fellowships, and job opportunities across EA-aligned organisations.
Curated high-impact roles including AI safety engineering, research, policy, and operations.
Berkeley-based AI safety incubator providing office space, funding, and community for early-stage safety researchers.
AI safety entrepreneurship incubator for founders building safety-relevant companies and projects.
Independent fund supporting AI safety researchers and projects. Also has donation and volunteer opportunities.
The most cited career FAQ in alignment research — what to work on, how to get started, what skills matter.
Practical guidance on entering AI safety research from a senior OpenAI researcher.
Concrete advice on doing ML safety research effectively — workflow, taste, and how to pick problems.
Hands-on guidance for running alignment experiments — from Anthropic's Ethan Perez.
How to do rigorous independent research in AI safety — from Apollo Research co-founder Marius Hobbhahn.
BlueDot's guide specifically for software engineers looking to make their first safety contribution.
Step-by-step guide for engineers to ship a safety-relevant project quickly and build a legible track record.
Practical guide to applying for and navigating a PhD specifically focused on AI safety research.
The canonical case for AI risk as a top priority problem. Essential reading before choosing a career path in this space.
In-depth review of technical alignment research as a career — bottlenecks, entry points, and how to assess fit.
Review of AI governance and policy as a career — organisations, skills, and theories of change.
Whether and how to pursue a machine learning PhD with AI safety in mind.
SWE as a path to AI safety — how to build career capital and transition into safety-relevant roles.
Ajeya Cotra's influential framing of why the default trajectory of AI development is dangerous.
Richard Ngo's accessible technical framing of why deep learning systems are difficult to align.
DeepMind safety team's structured analysis of the x-risk argument and its key assumptions.
The authoritative long-form interview series on high-impact careers. Deep dives on AI risk, governance, alignment, and career strategy.
Two landmark episodes on AI timelines, takeoff dynamics, and what transformative AI means for society.
Paul Christiano explains alignment, the case for concern, and what he's working on at ARC.
Holden's case for why the next 100 years may be the most consequential in human history — and what follows from that.
Technical deep-dives on alignment research agendas, interpretability, control, and more. High signal-to-noise.
Broad coverage of AI risk, governance, and long-term future. Interviews with leading researchers and policymakers.
Balanced take on AI doomers and doubters from a DeepMind safety researcher.
Evidence-based analysis of how quickly AI could transform the world.
Zvi Mowshowitz's prolific analysis of AI developments, policy, and safety. Opinionated and high-signal.
Ethan Mollick's practical research on AI capabilities, productivity, and near-term impacts.
Long-form essays on transformative AI, the most important century, and what to do about it.
Curated weekly coverage of AI safety research, news, and commentary from the Center for AI Safety.
Summaries and commentary on alignment research papers. High-quality signal for keeping up with the literature.
Weekly newsletter on AI capabilities research and policy implications from Anthropic co-founder Jack Clark.
Open to AI safety engineering roles, research collaborations, fellowship conversations, and anyone thinking seriously about where technical talent can have the most impact on AI risk.
Get in touch →GitHub
impactyogi ↗Status
Open to AI Safety collaborationLocation
Oxfordshire, UK