Use a Siamese network to propose categories for images at Commons
A Siamese network can be used to create a kind of digest that acts like a locality-sensitive hash, and that "hash" can be used as a measure of whether an image is similar or dissimilar to a given category.
created on 17:06, 17 January 2018 (UTC)

Project idea

What is the problem you're trying to solve?

Categorizing images on Commons is a huge task. Even experienced contributors find it difficult to identify the correct category. Without categories, finding an image is equally difficult, so it is crucial to tag each image with at least some categories.

If we could find even a few core categories for each image, it would be a huge help, even if those categories were somewhat generic.

What is your solution?

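The described problem can be solved by using a w:siamese network, a kind of w:artificial neural network, to create a kind of digest, or fingerprint, for each image. The digest acts like a locality-sensitive hash and can be used as a measure of whether an image is similar or dissimilar to a given category.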

There is a more in-depth description on /Technical notes.
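
As a rough illustration of how a fingerprint could be compared against categories, here is a minimal sketch. It assumes the network has already turned each image into a vector of numbers, and it does not include the network itself. The choice of Euclidean distance, the idea of averaging a category's example fingerprints into a centroid, the function names, and the sample values are all assumptions made for illustration, not part of the proposal.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length fingerprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of fingerprint vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def propose_categories(fingerprint, category_examples, max_distance):
    """Return (category, distance) pairs within max_distance, nearest first.

    category_examples maps a category name to the fingerprints of images
    already placed in that category (hypothetical data layout).
    """
    proposals = []
    for category, examples in category_examples.items():
        distance = euclidean(fingerprint, centroid(examples))
        if distance <= max_distance:
            proposals.append((category, distance))
    return sorted(proposals, key=lambda pair: pair[1])


# Made-up 3-dimensional fingerprints; a real network would emit longer vectors.
category_examples = {
    "Bridges": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Cats": [[0.0, 0.9, 0.8], [0.1, 1.0, 0.9]],
}
new_image = [0.85, 0.1, 0.05]
proposals = propose_categories(new_image, category_examples, max_distance=0.5)
```

With these sample values only "Bridges" falls within the distance threshold. A suitable threshold would have to be found experimentally.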

Project goals

There would be two project goals: first, to verify that a stable fingerprint can be learned from the existing images; second, to verify that the fingerprints can be turned into categories with sufficient precision and recall. It is probably not necessary to achieve 100% accuracy; it is sufficient to be good enough to be usable. What counts as sufficient is not quite clear, but a recall of half of the correct categories and a precision of half of the proposed categories would be a ballpark figure for barely usable.
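
To make the precision and recall targets concrete, the following sketch computes both for a single image, given the proposed and the correct categories. The function name and the category names are made up for illustration. The sample is chosen to land on the "barely usable" ballpark of one half for each measure.

```python
def precision_recall(proposed, correct):
    """Return (precision, recall) for one image's category proposals.

    precision: share of the proposed categories that are correct.
    recall: share of the correct categories that were proposed.
    """
    proposed, correct = set(proposed), set(correct)
    hits = len(proposed & correct)
    precision = hits / len(proposed) if proposed else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall


# Hypothetical example: two of four proposals are right,
# and two of the four correct categories were found.
proposed = ["Bridges", "Rivers", "Norway", "Cats"]
correct = ["Bridges", "Rivers", "Stone bridges", "Oslo"]
precision, recall = precision_recall(proposed, correct)
```

An evaluation over many images would presumably average these values, or pool the counts, across a held-out set of already categorized images.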

An initial test version should run as part of the mw:ORES framework.

