Work
Projects and roles in detail — AI engineering, LLM agents, quantitative systems, and leadership.
Headroom
Independent research & engineering · Mar 2026 – Present
A human-in-the-loop firewall for AI coding agents: a guardian classifies every file, shell, git, and deploy action as safe, approval-required, or blocked (layered rules → LLM risk judgment → human review) before it runs.
- Wrote an accompanying paper showing a counterintuitive result: agent oversight is an inverted-U — a guard that escalates more can make a system less safe, because human review is a finite resource that fatigues.
- Backed it with a replay-based eval framing the guard as selective classification under asymmetric cost (risk–coverage, AURC, and Neyman–Pearson on a 125-action adversarial set), measured against a Fleiss' κ = 0.52 reviewer-agreement floor.
- Engineered the runtime on a LangGraph state graph that pauses risky actions mid-task via interrupt() for resumable approval, with fail-safe denials, an append-only audit log, a live dashboard, and an MCP server any agent can route tool calls through.
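The layered decision described above (rules, then a risk score, then human review) can be sketched roughly as follows. This is an illustrative outline, not Headroom's actual code: the names `Action`, `Verdict`, and `classify`, the pattern lists, and the `low`/`high` thresholds are all assumptions made for the example, and the LLM call is stood in for by any callable that returns a score in [0, 1].

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    SAFE = "safe"
    APPROVAL_REQUIRED = "approval_required"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class Action:
    kind: str      # "file", "shell", "git", or "deploy"
    command: str


# Hypothetical rule tables; naive substring/prefix matching, for illustration only.
BLOCKED_PATTERNS = ("rm -rf /", "git push --force")
SAFE_PREFIXES = ("ls ", "cat ", "git status", "git diff")


def classify(action: Action,
             risk_score: Callable[[Action], float],
             low: float = 0.2,
             high: float = 0.8) -> Verdict:
    """Layer 1: static rules. Layer 2: model risk score. Layer 3: human review."""
    # Deny rules are checked first so a safe-looking prefix cannot mask them.
    if any(pattern in action.command for pattern in BLOCKED_PATTERNS):
        return Verdict.BLOCKED
    if action.command.startswith(SAFE_PREFIXES):
        return Verdict.SAFE
    try:
        score = risk_score(action)
    except Exception:
        # Fail safe: if the scorer is unavailable, deny rather than allow.
        return Verdict.BLOCKED
    if score < low:
        return Verdict.SAFE
    if score >= high:
        return Verdict.BLOCKED
    # Only the uncertain middle band spends scarce human attention.
    return Verdict.APPROVAL_REQUIRED
```

In this sketch, widening the band between `low` and `high` sends more actions to a person, which is the escalation rate the paper argues cannot be raised for free.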
Stack: Python · LangGraph · MCP · LLM risk scoring · selective-classification eval
Live demo
GitHub
The thesis
AI Objectives Institute
Lead AI Engineer · San Francisco · May 2024 – Jan 2026
A nonprofit R&D lab aligning AI with human values. I built the WhatsApp intake agent for Talk to the City — an open-source AI tool for democratic deliberation at scale, used in national-level consultations to help policymakers and peace mediators act on public input without losing the nuance of individual voices.
Elicitation Bot — multimodal intake
Built end-to-end, solo: a multimodal WhatsApp agent (voice + text, Whisper + GPT-4o), deployed in multiple languages to thousands of students across 55 universities and 12+ international conferences (Search for Common Ground, Taiwan AI College Alliance, Notre Dame) — bringing Talk to the City to underrepresented, low-connectivity communities.
- Ran 4 months of weekly live sessions at 10–20K messages/day — sharded onboarding across 5 Twilio numbers to clear the per-number 1K-new-users/day cap (a day-one launch blocker), then lifted delivery success 96% → 98% by fixing Heroku rate-limit and cold-start failures and sizing scaling to the app's CPU/memory profile.
- Built tailored agents for high-profile conferences, plus a REST API and a data-aggregation layer (AWS Lambda + Firebase) that automated processing for clean analysis.
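A minimal sketch of the onboarding sharding idea, assuming a simple least-loaded assignment with sticky routing. The class name `OnboardingSharder`, the in-memory counters, and the placeholder phone numbers are hypothetical; the production system's actual allocation logic and storage are not described in this page.

```python
from collections import defaultdict
from datetime import date

DAILY_NEW_USER_CAP = 1000  # per-number cap on new users per day, as stated above


class OnboardingSharder:
    """Spread first-time users across several sender numbers."""

    def __init__(self, numbers, cap=DAILY_NEW_USER_CAP):
        self.numbers = list(numbers)
        self.cap = cap
        self._counts = defaultdict(int)   # (number, day) -> new users onboarded
        self._assigned = {}               # user_id -> number (sticky routing)

    def assign(self, user_id, day):
        """Return the number this user should talk to, or None if all are full today."""
        if user_id in self._assigned:
            return self._assigned[user_id]  # returning users keep their number
        # Pick the number with the fewest new users so far today.
        number = min(self.numbers, key=lambda n: self._counts[(n, day)])
        if self._counts[(number, day)] >= self.cap:
            return None  # every number is at its cap; the caller should queue the user
        self._counts[(number, day)] += 1
        self._assigned[user_id] = number
        return number


# Usage with placeholder numbers (not real Twilio senders):
sharder = OnboardingSharder([f"+1555000000{i}" for i in range(1, 6)])
first = sharder.assign("user-123", date(2025, 1, 1))
```

With five numbers, a scheme like this raises the daily new-user ceiling from 1,000 to 5,000 while keeping each conversation on a single sender.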
Stack: Python · Twilio Business API · GPT-4o · Whisper · AWS Lambda · Firebase
GitHub
Talk to the City — cloud engineering
As a cloud engineer on the project, handled deployment on Heroku with AWS CloudFront and Docker; resolved a blocking AWS issue (advancing deployment ~80%), built customized Lambda layers, and wrote deployment guides and walkthroughs for future engineers.
Stack: Heroku · AWS CloudFront · Docker · AWS Lambda
Talk to the City GitHub
Moral Learning — AI alignment
Designed API-driven LLM evaluation pipelines automating multilingual survey generation and the assessment of human values, translation quality, and model alignment; built a RESTful API that overcame Prolific's limits and an R/ggplot2 analysis pipeline for annotator performance.
Moral Learning
OperatorLock
Solo build · Jan – Mar 2026
In high-pressure trading, willpower-based risk management fails under stress — the losing trade that's “almost back,” the winner that “should run further,” the drawdown that demands revenge. Trading platforms answer only with warnings and confirmations you can click straight through; there is no way to make your own rules actually binding. OperatorLock is that missing enforcement layer: every constraint is a hard gate enforced server-side, with no “confirm anyway” button — it removes the choice in the moment discipline is most likely to break.
- Nine non-overridable structural gates enforced at the API, including a 5-minute candle lock, a 180-second post-exit cooldown, a per-Renko-bar tempo token, an initial exit lock, a management lock, a zone-entry gate, a hard daily trade limit, and a daily stop-loss that flattens positions and blocks entries for the rest of the day.
- A bar-driven behavioral engine clocked by six independent TradingView Renko webhook streams (1 / 2.5 / 4 / 6 / 8 / 12 pt) rather than wall-clock time — constraint logic reacts to market structure, not the operator's impulses.
- Event-driven Flask + Firestore state machine with state hydrated across restarts, bridged to a live C# Rithmic (RAPI+) execution layer on a Windows VPS that polls for pending orders every second.
- The deeper tension it surfaces — that hard constraints carry a real human cost — is the same control-versus-cost inversion Headroom studies in AI oversight, where past a point more control makes a system less safe.
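To make the "hard gate" idea concrete, here is a small sketch of three of the gates listed above (daily stop-loss, daily trade limit, post-exit cooldown) as a server-side entry check. It is an assumption-laden illustration, not OperatorLock's code: `GateState`, `check_entry`, and the numeric limits other than the 180-second cooldown are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

COOLDOWN_SECONDS = 180  # post-exit cooldown from the list above


@dataclass
class GateState:
    max_trades_per_day: int = 5        # hypothetical limit
    daily_stop_loss: float = -500.0    # hypothetical, in account currency
    trades_today: int = 0
    realized_pnl: float = 0.0
    last_exit_ts: Optional[float] = None  # epoch seconds of the last exit


def check_entry(state: GateState, now: float) -> Tuple[bool, str]:
    """Decide whether a new entry is allowed.

    There is deliberately no `force` or `override` argument: the only way
    past a gate is for its condition to stop being true.
    """
    if state.realized_pnl <= state.daily_stop_loss:
        return False, "daily stop-loss hit"
    if state.trades_today >= state.max_trades_per_day:
        return False, "daily trade limit reached"
    if state.last_exit_ts is not None and now - state.last_exit_ts < COOLDOWN_SECONDS:
        return False, "post-exit cooldown"
    return True, "ok"
```

Because the check runs on the server and takes no override flag, a client that wants to "confirm anyway" has nothing to send.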
Stack: Python · Flask · gunicorn · Google Firestore · Heroku · C# (.NET) · Rithmic RAPI+ · TradingView Renko webhooks
Live demo
GitHub
Generative Alpha (G-Alpha)
AI Developer & Data Engineer · San Francisco · Summer 2023
A financial-investment AI agent and vertical AI foundation-model company — a scalable system of AI analysts simulating human roles in quant trading, fundamental trading, and advisory.
- Engineered an SEC API ingestion pipeline that extracted and curated a decade of disclosures (earnings calls, 10-Ks, 10-Qs, 8-Ks) for the 30 Dow Jones (US30) stocks, then normalized and transformed them into structured training data.
- Generated Q&A pairs from the curated datasets and fine-tuned GPT-4 and LLAMA2 for financial-data comprehension.
- Applied advanced data preprocessing, normalization, and prompt-engineering techniques.
Stack: Python · MongoDB · GPT-4 · LLAMA2 · LangChain · SEC API
G-Alpha on LinkedIn
GitHub
Explomind
Data Scientist & Systems Architect · Los Angeles · 2021 – 2023
A proprietary trading platform merging reinforcement learning with quantitative strategy. Worked across two roles — trading algorithms and systems architecture.
- Developed and validated day-trading strategies with TensorFlow and SciPy; incorporated NLP-based fundamental analysis and platform-independent custom indicators.
- Architected an adaptive reinforcement-learning system with a human-in-the-loop labeling and decision layer; built emergency-handling alternatives and automated performance tracking with AWS SES and Lambda.
- Led and trained a distributed international team — ran the full hiring process and built teaching courses and execution guidelines.
- Designed a copy-trading system from scratch and a full-stack platform for financial NLP and pattern recognition.
Stack: Python · TensorFlow · SciPy · Dash · PostgreSQL · AWS (Lambda, SES) · Selenium · Beautiful Soup · R · MT5/MQL5
Sentium
Quantitative Analyst · 2021 – 2023
A nonprofit advancing machine learning, AI, and SaaS development; I built tooling for high-frequency trading.
- Engineered custom indicators with Python, SciPy, Plotly, and Seaborn.
- Managed millions of data points through TimescaleDB and designed a custom mathematical solution for precise spike detection.
- Overcame AWS Lambda deployment limits by managing Lambda layers with Docker and Cloud9.
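The custom spike-detection method itself is not described here. As a generic stand-in for readers unfamiliar with the problem, the sketch below flags points that sit far from a rolling median, scaled by the median absolute deviation (a robust z-score). The function name `find_spikes`, the window size, and the threshold are assumptions for illustration only.

```python
from statistics import median


def find_spikes(values, window=20, threshold=6.0):
    """Return indices whose robust z-score against the previous `window` points
    exceeds `threshold`. Generic technique, not the method used at Sentium."""
    spikes = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        med = median(recent)
        mad = median([abs(v - med) for v in recent])
        if mad == 0:
            continue  # flat window: the scale is undefined, so skip this point
        # 0.6745 rescales MAD so the score is comparable to a standard z-score.
        z = 0.6745 * (values[i] - med) / mad
        if abs(z) > threshold:
            spikes.append(i)
    return spikes


# Usage: a gently oscillating series with one injected jump.
series = [100 + (i % 3) * 0.1 for i in range(50)]
series[30] = 110.0
print(find_spikes(series))
```

Median-based statistics are used in this sketch because a single large spike barely moves them, so one outlier does not hide the next.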
Stack: Python · SciPy · Plotly · Seaborn · TimescaleDB · polygon.io API · AWS Lambda · Docker · Cloud9
sentium
Info Investment / Ideal Data
Quantitative Analyst Intern
A securities brokerage; I developed and rigorously tested algorithmic trading strategies for the Turkish stock market.
- Backtested each strategy on Turkish stocks and validated it with Monte Carlo simulation across market conditions.
- Modeled execution costs (commissions, slippage) and optimized Sharpe Ratio and Maximum Drawdown.
- Implemented risk management — position sizing, stop-loss, diversification, and real-time liquidity handling.
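For reference, the two metrics named above can be computed as in the following sketch. The function names, the 252-period annualization, and the cost figures are assumptions chosen for the example, not values from the original work.

```python
import math
from statistics import mean, stdev


def net_return(gross_return, commission=0.0005, slippage=0.0005):
    """Subtract round-trip costs (entry + exit) from a trade's gross return.
    The default cost rates are placeholders."""
    return gross_return - 2 * (commission + slippage)


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns (sample std. deviation)."""
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    if sd == 0:
        return 0.0  # no variation: the ratio is undefined, report 0 by convention here
    return mean(excess) / sd * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, (value - peak) / peak)
    return worst


# Usage: the curve falls from a peak of 120 to 90, a 25% drawdown.
print(max_drawdown([100, 120, 90, 130, 117]))
```

Applying `net_return` before computing either metric keeps commissions and slippage from flattering the backtest.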
Stack: C# · Python
Info Investment Ideal Data
Education
University of California, Los Angeles (UCLA)
B.S. Statistics & Data Science — 2024.