I joined the Research team at the Wikimedia Foundation in October 2018 as a Research Scientist. I currently live in the glorious New York City, NY, USA.
My background is in geography and human-computer interaction, with a special focus on understanding (and trying to do something about) how structural inequalities find their way online and into algorithmic systems. Since joining WMF, I have also been heavily involved in research towards better understanding reader needs and behavior, how to model and make predictions about Wikimedia content in a language-agnostic manner, and the impact of external re-use of Wikimedia content.
A collection of tools that I've built (or helped build) for showcasing some of our research work:
- Language-agnostic content tagging models
- List-building models
- Social media traffic report
- Differential privacy parameter exploration
- Search referral data for Wikipedia
- User scripts for visualizing link data
And specifically, a number of Python packages:
Various writings about topics relevant to Wikimedia data, research, etc.
- Trade-offs between performance and sustainability in language modeling for Wikimedia
- Potential content tagging models for Wikimedia
- Data gaps that inhibit equitable and effective ML for the Wikimedia projects
- Various analysis "gotchas" when working with Wikimedia data
- Standard approaches to various Wikimedia research tasks
- Aspects to consider when comparing/studying different Wikipedia language editions
Last updated on 11/16/2023
Active projects that I am currently working on:
Completed research projects and reports:
Projects that I've started, but had to put down for the moment: