Selected Work | Andrew Ji

01 Research · 1st Author

Attributing Extreme Precipitation to Global Warming via AI

A FiLM-conditioned CNN that preserves a causal intervention pathway for global mean temperature (GMT), enabling counterfactual attribution of extreme rainfall across the contiguous United States.

Year

2024 — Present

Role

1st-Author RA

Affiliation

Mamalakis Lab, UVA

Status

Manuscript in preparation

, infrastructure failure, and major economic losses. Yet attributing individual extreme rainfall events to anthropogenic warming has remained especially difficult: precipitation arises from complex multiscale interactions among moisture, circulation, and topography, and conventional statistical models rarely capture the heavy upper tail — precisely where the most societally consequential events live.

The Core Challenge: Predictor Redundancy Threatens Causal Validity

ML-based counterfactual attribution requires a warming variable (GMT) that can be meaningfully manipulated. But even after linearly removing the GMT signal from atmospheric predictors, an auxiliary CNN can still recover GMT with R² ≈ 0.77 from the residual fields. This means the network can infer warming from correlated dynamical patterns — functionally bypassing the explicit intervention variable. Good prediction alone is not enough: if GMT loses its independent causal pathway inside the model, counterfactual experiments become uninterpretable.
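The linear-removal step and its blind spot can be illustrated on synthetic data. This is a minimal sketch, not the study's pipeline: the function name `remove_linear_gmt`, the array layout, and the synthetic predictors are all assumptions made for illustration. It regresses each predictor on GMT, keeps the residuals, and checks that they are linearly uncorrelated with GMT while still carrying a nonlinear GMT signal.

```python
import numpy as np

def remove_linear_gmt(fields, gmt):
    """Regress every predictor column on GMT (with intercept) and return residuals.

    fields: (n_samples, n_features) flattened predictor fields (hypothetical layout)
    gmt:    (n_samples,) global mean temperature series
    """
    design = np.column_stack([np.ones_like(gmt), gmt])       # intercept + GMT
    coef, *_ = np.linalg.lstsq(design, fields, rcond=None)   # shape (2, n_features)
    return fields - design @ coef

rng = np.random.default_rng(0)
gmt = np.linspace(-1.0, 3.0, 500)

# Synthetic predictors: column 0 depends on GMT linearly, column 1 quadratically.
fields = np.column_stack([
    2.0 * gmt + rng.normal(size=gmt.size),
    gmt**2 + 0.1 * rng.normal(size=gmt.size),
])
residuals = remove_linear_gmt(fields, gmt)

# Linear correlation with GMT is removed by construction...
linear_corr = np.array([np.corrcoef(residuals[:, j], gmt)[0, 1] for j in range(2)])
# ...but a nonlinear function of GMT is still strongly encoded in the residuals.
nonlinear_corr = np.corrcoef(residuals[:, 1], (gmt - gmt.mean()) ** 2)[0, 1]

print("linear:", np.round(linear_corr, 6), "nonlinear:", round(float(nonlinear_corr), 3))
```

In the toy case the leftover signal is a simple quadratic; in the real predictor fields it is spread across correlated dynamical patterns, which is what the auxiliary CNN picks up.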

What I Built

I designed a FiLM-conditioned convolutional neural network trained on the 22-member CESM2 Large Ensemble (1850–2100) and fine-tuned on ERA5 reanalysis. The central architectural choice: Feature-wise Linear Modulation (FiLM) layers at every convolutional block, which give GMT a dedicated pathway to directly modulate intermediate spatial features — applying learned affine transformations to each channel — before spatial pooling collapses the representation. This preserves GMT as a causally valid intervention variable within a high-capacity nonlinear model. The framework covers 15 CONUS boxes spanning 30°–45°N and 75°–125°W, focused on November–March synoptically driven winter extremes.
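The FiLM operation itself is small enough to sketch in a few lines. The version below is a hedged NumPy illustration rather than the model's code: the class name, the single linear map from GMT to the scale and shift, and the fixed random parameters are assumptions. In a trained network these parameters would be learned and could come from a deeper conditioning network.

```python
import numpy as np

class FiLM:
    """Feature-wise Linear Modulation driven by one scalar conditioning variable."""

    def __init__(self, n_channels, rng):
        # Hypothetical parameterisation: a linear map from the scalar (e.g. GMT)
        # to a per-channel scale (gamma) and shift (beta).
        # With cond = 0 the layer is the identity: gamma = 1, beta = 0.
        self.w_gamma = 0.1 * rng.normal(size=n_channels)
        self.b_gamma = np.ones(n_channels)
        self.w_beta = 0.1 * rng.normal(size=n_channels)
        self.b_beta = np.zeros(n_channels)

    def __call__(self, features, cond):
        # features: (batch, channels, height, width); cond: (batch,)
        gamma = cond[:, None] * self.w_gamma + self.b_gamma   # (batch, channels)
        beta = cond[:, None] * self.w_beta + self.b_beta      # (batch, channels)
        # Broadcast the per-channel affine transform over the spatial dimensions.
        return gamma[:, :, None, None] * features + beta[:, :, None, None]

rng = np.random.default_rng(0)
film = FiLM(n_channels=4, rng=rng)
features = rng.normal(size=(2, 4, 8, 8))

# Same atmospheric features, two GMT settings: factual vs. counterfactual.
factual = film(features, np.array([1.2, 1.2]))
counterfactual = film(features, np.array([0.0, 0.0]))
```

Because the modulation acts on every channel before pooling, changing only the GMT input changes the spatial features directly, which is the property the counterfactual experiments rely on.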

Attribution Framework: Beyond Risk Ratios

Rather than relying solely on conventional metrics like the fraction of attributable risk (FAR), I incorporated probabilities of necessary and sufficient causation (PN and PS), computed as continuous functions of the exceedance threshold. This distinction is critical for precipitation extremes: as event severity increases, the causal role of warming shifts — from sufficient causation at common thresholds (warming alone can raise risk) toward necessary causation at rare-event thresholds (warming must be present for the event to reach that intensity). The upper tail is modeled via a hybrid KDE–GPD approach for stable, threshold-resolved estimation.
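The shift from sufficient to necessary causation can be seen with a small calculation. The sketch below assumes the standard expressions for PN and PS that hold under exogeneity and monotonicity; the source does not state which estimator the project uses, and the exceedance probabilities are made-up numbers, not results. Under these assumptions PN takes the same form as FAR.

```python
def causation_probabilities(p_factual, p_counterfactual):
    """Return (PN, PS) from exceedance probabilities with and without warming.

    Assumes exogeneity and monotonicity, under which
      PN = max(1 - p0 / p1, 0)
      PS = max(1 - (1 - p1) / (1 - p0), 0)
    where p1 is the factual and p0 the counterfactual exceedance probability.
    """
    p1, p0 = p_factual, p_counterfactual
    pn = max(1.0 - p0 / p1, 0.0) if p1 > 0 else 0.0
    ps = max(1.0 - (1.0 - p1) / (1.0 - p0), 0.0) if p0 < 1 else 0.0
    return pn, ps

# Hypothetical (factual, counterfactual) exceedance probabilities at two thresholds.
cases = {"common": (0.90, 0.60), "rare": (0.02, 0.005)}
results = {name: causation_probabilities(p1, p0) for name, (p1, p0) in cases.items()}

for name, (pn, ps) in results.items():
    print(f"{name}: PN={pn:.2f}, PS={ps:.2f}")
```

With these illustrative inputs, PS dominates at the common threshold and PN dominates at the rare one. Evaluating the same function over a fine grid of thresholds gives the continuous PN and PS curves described above.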

Key Findings

Why This Matters

This work makes both a domain contribution and a methodological one. For climate attribution, it extends ML-based counterfactual approaches from heat extremes to precipitation — a far messier target — and provides the first PN/PS causal decomposition of extreme rainfall events over CONUS. Methodologically, it demonstrates a principle that extends well beyond precipitation: in ML-based attribution, architecture directly determines whether the intervention variable remains causally interpretable. The FiLM-conditioned design offers a blueprint for any attribution setting where the causal signal is dynamically mediated rather than thermodynamically direct.

Good predictive skill is necessary but not sufficient for causal attribution. The architecture must preserve the intervention pathway — otherwise counterfactuals lose their meaning.

Entry 01 of 06

02 Research · Applied

ML-based ESG Investment Return Attribution

Decomposing what ESG signals actually do to long-term, risk-adjusted performance.

Year

2025 — Present

Role

Personal research

Affiliation

Independent

Status

In development

-related signals in asset returns — separating the structural drivers of performance from market noise and from confounded factor exposures. The motivation is simple: ESG investing is over-claimed and under-explained, and most published return decompositions confuse correlation with cause.

Built a custom causal attribution system that evaluates how non-financial signals affect long-term, risk-adjusted performance under controlled conditions. Designed to be auditable rather than impressive — the goal is a number you can defend, not one you can't reproduce.

What I built

Why it matters

In ESG, the number is easy. The defensible number is the work.

Entry 02 of 06

03 Applied Analytics · McIntire

Solidcore Pricing & Customer Value System

Explainable ML for pricing decisions, built so non-technical users can actually use it.

Year

2025

Role

Project lead

Affiliation

McIntire School, UVA

Status

Delivered

Developed an AI-powered decision-support platform using explainable ML (XAI) to model and interpret the impact of pricing on customer lifetime value. The core idea is that a pricing model is only useful if the people setting prices can understand it — so I designed for legibility first and accuracy second.

Layered on top is an interactive training system that lets non-technical users run pricing simulations, compare scenarios side-by-side, and see why the model recommends what it recommends. The simulator was designed to feel like a workshop rather than a black box.

What I built

Outcome

A model that no one trusts can't be deployed. Legibility is part of accuracy.

Entry 03 of 06

04 Data Infrastructure · Deloitte

Global Geospatial Site Selection & ROI

A 90+ variable location intelligence system for EV battery factory siting.

Year

2024

Role

Data Science Lead

Affiliation

Deloitte, Shanghai

Status

Adopted by team

Built a global, multi-region location intelligence system integrating 90+ economic, regulatory, labor, and logistics variables to rank EV battery factory sites and quantify ROI under multiple policy and cost scenarios. Site selection is rarely a single clean decision — it's a sensitivity analysis across many futures, and the framework was designed to make that sensitivity visible.

Standardized and curated a cross-country, decision-grade dataset with automated normalization and sensitivity routines. The team adopted the framework for continued use in future location-strategy work, which I'm prouder of than any single ranking it produced.
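A normalization-plus-sensitivity routine of this kind might look like the sketch below. It is an illustration under stated assumptions, not the delivered framework: the function names, the three variables, the weights, and all values are hypothetical. It scales each variable to [0, 1], scores sites with a weighted sum, then perturbs the weights to see how often each site comes out on top.

```python
import numpy as np

def min_max(matrix):
    """Scale each column to [0, 1]; constant columns map to 0."""
    lo, hi = matrix.min(axis=0), matrix.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (matrix - lo) / span

def top_site_frequency(scores, weights, n_draws=1000, jitter=0.2, seed=0):
    """Perturb the weights and return the share of draws in which each site ranks first."""
    rng = np.random.default_rng(seed)
    wins = np.zeros(scores.shape[0], dtype=int)
    for _ in range(n_draws):
        w = weights * rng.uniform(1 - jitter, 1 + jitter, size=weights.size)
        w = w / w.sum()                       # renormalise so weights sum to 1
        wins[np.argmax(scores @ w)] += 1
    return wins / n_draws

# Hypothetical data: 3 sites x 3 variables (labor cost, logistics index, incentive level).
raw = np.array([
    [30.0, 0.80, 5.0],
    [22.0, 0.60, 9.0],
    [45.0, 0.95, 2.0],
])
scores = min_max(raw)
scores[:, 0] = 1.0 - scores[:, 0]             # labor cost: lower is better
weights = np.array([0.4, 0.4, 0.2])

baseline = scores @ weights
freq = top_site_frequency(scores, weights)
print("baseline scores:", np.round(baseline, 3))
print("share of draws ranked first:", freq)
```

Reporting the win frequencies next to the baseline ranking shows how much of a recommendation depends on the chosen weights.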

What I built

Outcome

A site ranking is just an opinion. The sensitivity analysis is the actual answer.

Entry 04 of 06

05 Venture · Data Pipeline

Investor × Startup Matching Engine

A scoring system that helps founders find the right capital, not just any capital.

Year

2026 — Present

Role

434 Fellow

Affiliation

Venture Central, C'ville

Status

Active

Building a global investor–company matching dataset that aggregates fund theses, past deals, sector focus, check sizes, and geographic preferences — the structured raw material needed to ask: which capital partners actually fit this startup? Most founder–investor matching today is either nepotistic or coarse. Neither scales.

On top of the dataset I'm developing a scoring engine that helps startups prioritize outreach, improve fundraising efficiency, and avoid wasted cycles with mismatched funds. The goal is not to replace judgment, but to give founders a defensible shortlist they can argue with.

What I'm building

Why it matters

Founders deserve a shortlist they can argue with — not one they can only accept.

Entry 05 of 06

06 Internship · Brief

China Securities · Institutional Research

Visualizations and market briefs across seven product lines, plus AI & drone-data exploration.

Year

2025

Role

Institutional Intern

Affiliation

China Securities Co., Ltd

Status

Delivered

Built data visualizations and market intelligence materials for institutional clients across seven-plus product lines, translating dense market activity into briefs that traders and PMs would actually read. Most of the value was in editing — choosing which signals to surface and which to drop entirely.

Alongside the client work I ran research into AI-powered automation (Mobile-ViT for image-based industry signals) and drone-based data systems for industry tracking. The exploration was speculative, but it forced a useful distinction: which workflows can AI legitimately accelerate, and which does it just make more confidently wrong?

What I built

What I learned

The job wasn't producing data — it was deciding what was worth a busy person's attention.

Entry 06 of 06 · End of archive