Research:A System for Large-scale Image Similarity

13:41, 4 August 2021 (UTC)
Duration:  2021-08 – ??

After several requests from different parts of the movement, the Research team is working on an image similarity tool. The tool will take an image as input and return the most similar images on Wikimedia Commons.


We will compute image "embeddings", namely compact image representations containing numerical summaries of the main image characteristics. We will then compute similarity between images based on these features.
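
To make the idea concrete, here is a minimal sketch of embedding-based similarity search. It is an illustration under stated assumptions, not the project's implementation: the page does not say which similarity measure will be used, so cosine similarity is assumed here, and the function names (`cosine_similarity`, `most_similar`), the file names, and the three-dimensional vectors are all hypothetical. Real embeddings would come from an image model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=3):
    """Return the k (name, score) pairs from index closest to query.

    index maps an image name to its embedding. This is a brute-force
    scan that compares the query against every stored embedding.
    """
    scored = [(name, cosine_similarity(query, emb)) for name, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index with made-up embeddings (assumption: illustrative values only).
toy_index = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "bridge.jpg": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for name, score in most_similar([1.0, 0.0, 0.0], toy_index, k=2):
        print(name, round(score, 3))
```

The brute-force scan compares the query against every stored embedding, so its cost grows linearly with the number of images. At the scale of Wikimedia Commons, some form of approximate nearest-neighbour indexing would likely be needed in its place.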


  1. Investigate best tools and methods to efficiently compute image similarity at scale
  2. Implement a first prototype for large-scale image similarity
  3. Iterate on the prototype and implement a public-facing tool