
Achille Nazaret
Ph.D. Student in Computer Science, Columbia University
I am a Ph.D. student in computer science advised by Prof. David Blei and Prof. Elham Azizi.
I develop mathematical models to describe and understand the world. I specialize in probabilistic and generative models, focusing on (i) scaling them to large datasets and (ii) interpreting what they learned. For example, I train generative models on Apple Watch sensor data to understand subjects' health.
I orient my research towards practical impact, prioritizing simplicity and usability over method complexity. I value teamwork, mathematical rigor, clean code implementation, and clear scientific communication.
Resume
Find a PDF version here.
Education
Columbia University
École Polytechnique
École Spéciale Militaire de Saint-Cyr
Lycée Privé Sainte-Geneviève
Experience
-
Apple Health AI | New York, NY | Machine learning research scientist (part-time, alongside Ph.D.) | Feb 2024 - Dec 2024
- Foundation models of time series from wearables to understand user health and fitness levels.
-
Apple Health AI | New York, NY | Machine learning research scientist (intern) | Jan 2022 - Aug 2022
- Estimated the causal impact of the Watch's notifications on user behavior with novel causal estimators.
- The notifications increase standing rates by 40%.
-
Apple Health AI | New York, NY | Machine learning research scientist (intern) | May 2021 - Aug 2021
- Identified and designed new health biomarkers from the sensor data of Apple devices.
-
Palantir Technologies | (remote) San Francisco, CA | Forward deployed software engineer (intern) | Jun 2020 - Aug 2020
- Scoped, prototyped, and deployed data-driven algorithms to reduce costs for a US healthcare insurer.
-
Yosef Lab, University of California, Berkeley | Berkeley, CA | Research assistant | Apr 2019 - Aug 2019
- Developed an open-source Python package for single-cell data analysis: scvi-tools (1.3k+ GitHub stars).
- Designed generative models to impute unobserved genes in spatial genomics using sc-RNA data.
-
Akwa Group | (remote) Casablanca, Morocco | Machine learning consultant (alongside M.S.) | Sep 2018 - May 2019
- Designed signals and models to predict the performance of new gas stations, surpassing human experts by 25%.
-
IMC Trading | Amsterdam, Netherlands | Software engineer (intern) | Jun 2018 - Sep 2018
- Distributed model training pipelines on a cluster for faster overnight training (HFT, futures).
-
Bernardaud | Paris, France | Operations research consultant (alongside B.S.) | Feb 2018 - Jun 2018
- Designed algorithms to find optimal production processes under factory constraints.
- Created a user-friendly full-stack website connecting my algorithms to the databases.
-
Ministry of Defense | Paris, France | Junior data scientist (intern) | Nov 2016 - Apr 2017
- Developed graph-mining and NLP models for social network analysis to produce intelligence.
Publications
You can find my full list of publications on Google Scholar.
Here are my main first-author publications (* indicates co-first):
-
NeurIPS 2024
-
NeurIPS 2024
-
An earlier version of the work above was presented at the ICML 2024 Workshop on Mechanistic Interpretability.
ICML 2024 Workshop Best paper (3rd)
-
ICML 2024
-
UAI 2024
-
Nature Methods 2024
-
bioRxiv preprint 2024
-
AISTATS 2024
-
Nature Biotechnology 2024
-
Nature Digital Medicine 2023
-
Science Advances 2023
-
ICLR 2023 Challenge First place
-
ICML Workshop 2022 Best poster award
-
ICML 2022
-
ICML 2020
-
ICML Workshop 2019 Best poster award
Here are other publications:
-
arXiv preprint 2023
-
Nature Biotechnology 2022
Open source software
-
Treeffuser
An easy-to-use package for probabilistic prediction on tabular data with tree-based diffusion models.
-
SDCD
A method for inferring causal graphs from labeled interventional data.
-
scvi-tools
A library for analyzing single-cell data with deep generative models.