User talk:Tangotango/Mayflower/Archive 1

Past comments and suggestions regarding Mayflower are archived here.

Message from guillom

I have just discovered this tool, and it's really awesome! Congrats Tangotango! guillom 17:11, 13 February 2007 (UTC)

Thank you guillom :) !! - Tangotango 01:34, 14 February 2007 (UTC)

Layout change

The tool used to have a nice gallery layout, but now all image thumbnails are shown one under another, slightly overlapping. Something must be wrong; I can hardly imagine this change is intentional. - Andre Engels 12:22, 15 February 2007 (UTC)

Yes, it is unintentional :) I uploaded a new version of the tool a few moments ago, which uses a new stylesheet. Please do a hard-refresh or clear the cache in your browser and try again. Cheers, Tangotango 12:25, 15 February 2007 (UTC)
Working now :-) - Andre Engels 12:31, 15 February 2007 (UTC)

Image crop

Currently, portrait images are cropped at the bottom. This may not be best for all images, since on many pictures it cuts off the ground and keeps the sky. --Ikiwaner 21:42, 14 February 2007 (UTC)

  Done This should be fixed now. (There may still be a little cropping at the bottom, but it should be only a few pixels, nothing major.) - Tangotango 12:05, 15 February 2007 (UTC)

Questions about the license

License

This rocks. :) However, I'd like to ask why this is copyrighted. Wouldn't a free license (e.g. GPL) be better for a prospective MediaWiki plugin? --TOR 21:53, 13 February 2007 (UTC)

Thank you :) The tool is copyrighted (as is almost every piece of free software, including MediaWiki), but it will be licensed under an open-source license. I haven't made that clear yet because of the license implications of the word stemmer library that the tool uses. As soon as I get that sorted out, the tool should be freely licensed. - Tangotango 01:34, 14 February 2007 (UTC)

License

Hi there. I couldn't find any reference to the license your tool is under. Can you point me to the right place, please? This is a great, great tool btw! Thank you :-) notafish }<';> 18:25, 15 February 2007 (UTC)

Hello again, I'd really like to know. Thank you! :-) notafish }<';> 13:23, 20 February 2007 (UTC)
It's licensed under the GNU General Public License :) I added a link to the license/source code to the footer of the tool, so please take a look. The license details are also available at User:Tangotango/Mayflower/License. (Sorry for the delayed response :( ) Cheers, Tangotango 14:14, 20 February 2007 (UTC)

Return source

What's it implemented in? Can you have it return its source, or make the source available? This is a totally nifty tool that I want to crib ideas from :) Thanks for creating it, you rock. ++Lar: t/c 14:39, 20 February 2007 (UTC)

(I see above that you say the source is linked from the tool, but it isn't linked on this page: [1], which is where I checked first. I DO see it on the license page... :) ) ++Lar: t/c 14:45, 20 February 2007 (UTC)

Non-ASCII characters

I found that the tool does not handle non-ASCII characters correctly: it treats them as spaces. Thus, when I searched for Cléguer, I got results for Guer instead. - Andre Engels 22:22, 13 February 2007 (UTC)

Thank you for reporting this. I hadn't used the tool to search for non-English words, so I didn't realize this would be a problem. This should be fixable by making changes to the word tokenizer, so I will look into it. - Tangotango 01:34, 14 February 2007 (UTC)
  Done The new index/stemmer treats all accented characters as unaccented characters. A search for Cléguer [2] no longer returns results for "Guer", but it doesn't return any results at all... Neither does Google, for that matter, so unless I'm very much mistaken, it's working properly :) Please tell me if you have any other problems with accented characters. Cheers, Tangotango 15:53, 27 February 2007 (UTC)
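
Illustrative note: the behaviour described in this thread can be sketched in a few lines of Python. This is not Mayflower's code (its implementation is not shown on this page); the function names are made up for the example, and it only assumes that the old tokenizer split on anything outside ASCII letters and digits, and that the fix folds accented letters to their unaccented form before tokenizing.

```python
import re
import unicodedata


def naive_tokens(text):
    # Assumed old behaviour: anything that is not an ASCII letter or digit
    # acts as a separator, so "é" is treated like a space.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def fold_accents(text):
    # Decompose each character (NFKD), then drop the combining marks,
    # which turns "é" into "e" and "è" into "e".
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def folded_tokens(text):
    # Assumed new behaviour: fold accents first, then tokenize as before.
    return naive_tokens(fold_accents(text))


print(naive_tokens("Cléguer"))   # the word is split around the "é"
print(folded_tokens("Cléguer"))  # the word stays in one piece
```

With accents folded, "Cléguer" is indexed and searched as a single word instead of as "cl" and "guer", which matches the report that results for "Guer" no longer appear.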

Searching for dates

Nice presentation. Unfortunately, there's no option to search for dates. Commons:User:Man vyi 05:58, 14 February 2007 (UTC)

Thanks :) I will be working towards allowing more search options, and date should be one of them. - Tangotango 12:25, 15 February 2007 (UTC)
  Done Mayflower can now search by date using the Advanced search feature. - Tangotango 12:34, 24 February 2007 (UTC)

Uncategorized images

Great tool, congratulations Tangotango ! An option to search for uncategorized images would be great ! Fabienkhan 13:19, 14 February 2007 (UTC)

Use OrphanImages for that. pfctdayelise 08:34, 15 February 2007 (UTC)
Thank you :) Yes, please use OrphanImages for that. - Tangotango 12:25, 15 February 2007 (UTC)
Er - yes, but only if you have a lot of time on your hands (well, it is slow for me at least), which is part of the reason for my comment under User_talk:Tangotango/Mayflower#Great --Herby talk thyme 10:56, 25 February 2007 (UTC)

Suggestions from le Korrigan

Hi TangoTango, and thanks for this tool ! A few suggestions below :

  • Internationalise (or at least make i18n'able) the interface.
  • Allow UTF-8 searches, but also let the user choose whether accents are taken into account. For instance, a user could choose whether the letter "é" is searched for as itself or treated as "e".
  • Allow restricting the search to certain MIME types (simple: photos / diagrams / sounds / videos; advanced: specify a particular type).
  • Maybe allow specifying which fields the search looks at (filename, image description, metadata, categories...).
  • Dream: allow searching only uncategorized images (using OrphanImages' code?).
  • Search only certain sizes (again, simple would be "large/medium/small"; advanced would allow specifying a max / min size).
  • Dunno if it is at all possible, but could B&W vs colour images be selected?
  • Change the number of results on each page?
  • Display (or not, the user could choose) the full name of the file, its upload date, its size, the categories it is in...

But don't worry if these ideas are too complicated; your tool is already very useful as it is ! Thanks, le Korrigan bla 14:34, 14 February 2007 (UTC)

You're welcome :) Thanks for the many suggestions.

  • I will be implementing an internationalization system similar to the one I currently have for my other toolserver tools, but it will probably be after most of the features are complete, to reduce redundant work.
  • I will be working on Unicode characters... it may require substantial changes to the word stemmer, but it is definitely a priority. About the accents, I will probably be storing all accented characters unaccented, so searches like the one you mention will be possible.
      Done Accented characters can now be searched for properly. Other Unicode characters (in particular, languages that use spaces as word delimiters) will be coming next; support for languages (mostly CJK, I think) that don't use the space character as a word delimiter will be more difficult to implement, but is under consideration. - Tangotango 15:56, 27 February 2007 (UTC)
  • File type restrictions will be one of the advanced search options in a future version.
      Done Mayflower's "Advanced Search" page now allows users to search by filetype. More fine-grained search options (PNG only, GIF only, etc.) are planned for a future release. - Tangotango 11:36, 20 February 2007 (UTC)
  • Unfortunately, the indexer doesn't store which field it got its words from, so field-specific searches will not be possible at present. However, the new version I uploaded a few moments ago allows inclusion/exclusion of specific categories, so please take a look if you're interested.
  • Uncategorized images are probably best left up to OrphanImages...
  • Size restrictions will also probably be one of the advanced search options.
      Done The "Advanced Search" page now provides this option as well. - Tangotango 11:36, 20 February 2007 (UTC)
  • Hmm... I'm not sure if the database provides bw/colour data, but if the "img_bits" column has got anything to do with this, it will be implementable in a future version, along with the other advanced search options.
  • Yes, results/page will probably appear soon :)
      Done "Advanced Search" page provides this option. - Tangotango 07:40, 23 February 2007 (UTC)
  • This will probably be implemented in a Javascript popup. I am also experimenting with a "panel", which can be invoked in the current version by specifying "&a=1" in the URL. The panel currently provides category information only, but could easily be expanded to include other data (at the expense of bandwidth/CPU/memory usage).
      Done Category/license information can be displayed using the new "Display details" option in Advanced search. Other data will be included in a future release. - Tangotango 07:40, 23 February 2007 (UTC)
    Great ! The icons are cute. Maybe a small tooltip could be more informative as well ? le Korrigan bla 10:57, 23 February 2007 (UTC)
    Thanks :) Yep, I've changed the "alt" attributes on the images to "title"s, so tooltips should now appear when you mouseover the icons. Cheers, Tangotango 14:46, 23 February 2007 (UTC)

Cheers, Tangotango 12:25, 15 February 2007 (UTC)

Suggestions from FrancisTyers

  • Change the number of results on each page. As a dropdown, not a preferences setting like Google Images.
      Done Number of results per page can now be changed via the Advanced search page. I tried to think of a way to include this in the main search page, but I'm wondering if many users will find it necessary or useful--I also do want to reduce any load on the servers that I can. (As a workaround, you can manually change the "z" parameter in the URL to any arbitrary value up to 30 to change the number of results per page.) Please let me know if you require this feature to be on the main search page, rather than the Advanced search page. - Tangotango 16:36, 20 February 2007 (UTC)
  • Search by image size.
      Done Search by image size (currently only the number of kilobytes is searchable; I will probably implement image dimensions as well, later) is now available via the Advanced search page. - Tangotango 16:36, 20 February 2007 (UTC)

License isn't that important, as Commons has only free images. - FrancisTyers 13:35, 14 February 2007 (UTC)

It is important because many people request such an option (even if you wouldn't use it). For Wikipedia, it's not so important, although it would be extremely useful to "exclude questionable licenses" (don't show any images that have a deletion warning on them). pfctdayelise 08:29, 15 February 2007 (UTC)
Possibly an option (checkbox) to enable/disable license showing? — Timichal 08:32, 15 February 2007 (UTC)
  Done License/categories can now be shown/hidden using the new "Display details" feature in "Advanced search". - Tangotango 07:41, 23 February 2007 (UTC)

Relevance

It seems to me that "sort by relevance" gives odd output: e.g., a search for book sorted by relevance shows 2 images at the top that have nothing to do with books. The search doesn't seem to be context-sensitive, for example to where the images are used. Otherwise useful. This should be the standard behaviour when searching for something on Commons. --Ikiwaner 21:42, 14 February 2007 (UTC)

Unfortunately, the way Mayflower indexes words makes it currently impossible to provide the level of context-sensitivity that you request. To Mayflower's little mind, "more of the same word = more relevant". Although this is true in many (most?) cases, it doesn't work every time, so you may see one or two inappropriate results like these. Cheers, Tangotango 12:08, 15 February 2007 (UTC)
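
Illustrative note: the "more of the same word = more relevant" rule can be sketched as a plain term-count ranking. This is a minimal Python sketch under that assumption only; it is not Mayflower's actual scoring code, and the function names and file names are hypothetical.

```python
import re
from collections import Counter


def term_count(description, term):
    # Count how often the search term occurs in a description (case-insensitive).
    words = re.findall(r"[a-z0-9]+", description.lower())
    return Counter(words)[term.lower()]


def rank_by_term_count(descriptions, term):
    # Score every page by raw term count and keep only pages that match.
    scored = [(term_count(text, term), name) for name, text in descriptions.items()]
    hits = [(score, name) for score, name in scored if score > 0]
    # Highest count first; ties are broken alphabetically by name.
    return [name for score, name in sorted(hits, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical description pages, made up for this example.
pages = {
    "Library.jpg": "An old book on a shelf",
    "Receipt.png": "Book fair receipt: book, book and book vouchers",
    "Cat.jpg": "A sleeping cat",
}
print(rank_by_term_count(pages, "book"))
```

Here the receipt outranks the photograph of an actual book simply because its description repeats the word more often, which is the kind of odd top result reported above.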

Search only in some galleries

How about being able to search in some category and its subcategories? Right now I'd need that as I'm looking for pictures of the Finnish countryside. Samulili 18:47, 18 February 2007 (UTC)

Although undocumented, you can already do this to some extent; to search within, say, "[[Category:Flowers]]" only, add "+Category:Flowers" (without quotes) to your search query. Please replace any spaces with underscores. However, note that Duesentrieb's MediaSearch tool is a tool specifically for searching within categories and subcategories (something Mayflower can't currently do), so if you already know the category you want, then you should probably use his tool for now. Cheers, Tangotango 11:07, 20 February 2007 (UTC)
By the way, the search-in-category options can also be specified using the Advanced search page. Cheers, Tangotango 12:41, 24 February 2007 (UTC)
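
Illustrative note: the "+Category:..." syntax described above can be assembled mechanically. The Python sketch below is hedged: the helper names are made up, the "q" parameter and base URL are taken from a search link quoted elsewhere on this page, the "z" parameter and its limit of 30 come from the results-per-page workaround mentioned earlier, and how the tool expects the "+" to be URL-encoded is an assumption.

```python
from urllib.parse import urlencode

# Base URL as it appears in a search link quoted on this page.
BASE_URL = "http://tools.wikimedia.de/~tangotango/mayflower/search.php"
# Upper limit for the "z" (results per page) parameter mentioned on this page.
MAX_RESULTS_PER_PAGE = 30


def category_term(category):
    # Spaces in the category name are replaced with underscores, as requested.
    return "+Category:" + category.strip().replace(" ", "_")


def build_query(words, category=None):
    # Append the category restriction to the free-text part of the query.
    parts = [words.strip()]
    if category:
        parts.append(category_term(category))
    return " ".join(parts)


def search_url(words, category=None, per_page=None):
    # Assumption: the query is sent percent-encoded in the "q" parameter.
    params = {"q": build_query(words, category)}
    if per_page is not None:
        params["z"] = min(max(1, per_page), MAX_RESULTS_PER_PAGE)
    return BASE_URL + "?" + urlencode(params)


print(build_query("rose", "Flowers of Finland"))
print(search_url("rose", "Flowers", per_page=100))
```

The category name "Flowers of Finland" is only an example of a name with spaces; it is not taken from this page.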

Problem with paging

I noticed, while working with the spider images, that there is something wrong with the "pages" register at the bottom of the screen. I wanted to start from the earliest images, so I went to the highest page number that I could find. Then by accident I pushed "next page" and found an additional page. Also, if I happened to be looking at, e.g., p. 54 and clicked on an image to check whether it was already in the spider gallery, when I clicked on "back" I would be sent to page 55 instead of page 54. Getting to 54 involved the extra time of scrolling to the bottom, punching 54, etc. 152.17.122.151 21:37, 28 February 2007 (UTC)

For your first problem, was the pagination like this: "... 48 49 50 51 52 53 54 ..."? In that case, the ellipsis after 54 signifies that there are additional pages after page 54. If that was not the case, please provide me with a complete URL for page 54 of the query you tried, and I will debug it.
As for your second query, did you look at page 55 before going to page 54? In that case, this may be a browser-related issue. I've found the same problem (it may actually be a feature) occurs with Camino, and Safari also suffers from page jumping while backing through the history. Which browser are you using? And does the same happen with, say, Google (when you're looking through the pages of results and you look at a page, and go back to the Google results)?
Cheers, Tangotango 23:37, 28 February 2007 (UTC)
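
Illustrative note: a pager that prints "... 48 49 50 51 52 53 54 ..." can be sketched as a sliding window around the current page. This Python sketch is an assumption about how such a pager might work, not Mayflower's code; the function name and the window radius of 3 are chosen only to reproduce the example above.

```python
def page_window(current, total, radius=3):
    # Show the pages within `radius` of the current page, clamped to 1..total.
    first = max(1, current - radius)
    last = min(total, current + radius)
    labels = [str(n) for n in range(first, last + 1)]
    # A leading or trailing "..." means more pages exist beyond the window.
    if first > 1:
        labels.insert(0, "...")
    if last < total:
        labels.append("...")
    return labels


print(" ".join(page_window(51, 60)))
print(" ".join(page_window(54, 55)))
```

In this scheme the highest number shown is not necessarily the last page: a trailing "..." is the only hint that "next page" will lead further.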

Wrong results with words containing French diacritics and other questions

1) The following search :

http://tools.wikimedia.de/~tangotango/mayflower/search.php?q=Ari%C3%A8ge&t=r

gives as the first result : Commons:Image:Aria.png, which has nothing to do with the French river called "Ariège".

So it seems that your tool is mixing up the French "è" (%C3%A8) with an "a".

Can you find a solution to this problem ?

Otherwise, your tool is cool.

2) May I suggest using popups that display the full file name each time your tool has to shorten it? Shortening unfortunately makes the name rather obscure.

3) Would you be able to build another search tool aimed at performing special text searches on Wikisource?

Teofilo 13:28, 6 March 2007 (UTC)

On the third point, I would add that an actual (i.e. working and more advanced than the built-in one) text search tool would be useful for any project, not just for Wikisource... le Korrigan bla 18:18, 6 March 2007 (UTC)
(1) This issue relates to the word stemmer that Mayflower uses. First, Mayflower translates "Ariège" to "Ariege", then passes it to the word stemmer library, which, in this case, has turned the word into "Aria". This seems like a useless translation, but it works for English, although sadly not for most other languages. (In English, I can search for "french flag" and get an image described as "flag of france", thanks to the word stemmer.) You'll see, however, that in this instance, the correct results--pages containing "Ariège"--do also appear, so it's not a completely hopeless situation. This isn't easy to solve; turning off the word stemmer is possible, but this must be done on the complete index, meaning that "french flag" will no longer return "flag of france". Considering that the correct results *do* still appear later on in the pages of results, and considering that a large proportion of the image description pages on the Commons are in English, I'm inclined to leave it as-is for now. However, this is certainly something that should be fixed in the long term.
(2) I'm currently working on a Javascript popup/alternative interface :)
(3) I'd like to focus my efforts on Mayflower at the moment, especially with the limited time I currently have to work on the Wikimedia projects. Also, indexing the other, text-based wikis requires significantly more resources and processing time, which might be something best left up to Google and co. ;) However, on a more serious note, this is certainly something that I'd also like to see (a better wiki search). Cheers, Tangotango 06:20, 7 March 2007 (UTC)
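
Illustrative note: the trade-off described in (1) can be sketched with a stemmed index. In this Python sketch the stemmer is replaced by a small lookup table, because the stemmer library is not named on this page; the "franc" stem and all file names except Aria.png are hypothetical, and only the "Ariege" to "Aria" mapping is taken from the explanation above.

```python
import unicodedata
from collections import defaultdict

# Stand-in for the real stemmer library (which is not shown on this page).
TOY_STEMS = {"french": "franc", "france": "franc", "ariege": "aria", "aria": "aria"}


def fold_accents(text):
    # "Ariège" becomes "Ariege": decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_key(word):
    # Fold accents, lowercase, then stem; unknown words are kept unchanged.
    folded = fold_accents(word).lower()
    return TOY_STEMS.get(folded, folded)


def build_index(descriptions):
    # Map each stemmed word to the set of pages whose description contains it.
    index = defaultdict(set)
    for name, text in descriptions.items():
        for word in text.split():
            index[index_key(word)].add(name)
    return index


def search(index, query):
    # A page matches only if it contains every stemmed query word.
    keys = [index_key(word) for word in query.split()]
    if not keys:
        return set()
    results = set(index.get(keys[0], set()))
    for key in keys[1:]:
        results &= index.get(key, set())
    return results


pages = {
    "Flag.svg": "flag of France",
    "Aria.png": "aria",
    "River.jpg": "Ariège river",
}
index = build_index(pages)
print(search(index, "french flag"))
print(search(index, "Ariège"))
```

Because queries and descriptions go through the same stemmer, "french flag" finds "flag of France", but "Ariège" also lands on the same key as "aria". Turning stemming off would fix the second case only by giving up the first, which is why it has to be decided for the whole index.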

Wow

I haven't tried the advanced search yet, but this looks slick and it actually works better than the MediaWiki search. Congrats!

  Done Thanks :) ! Due to a silly CSS error on my part, the advanced search link was there, but was completely white and invisible :P It's now fixed. Thanks for pointing that out! Cheers, Tangotango 13:56, 24 March 2007 (UTC)

Not working?

I've put in a couple of names for spiders, and I get "horrible error" and then no actual details where details are promised. No idea what is going on. 4.152.93.66 04:21, 14 March 2007 (UTC)

The funniest error message I've encountered on the net so far (when searching for my flickr.com username) is the bone-rattling horrible error :) Keep up the good work, cheers 85.131.18.111 07:47, 14 March 2007 (UTC)
As it seems to be working right now, I'm assuming it was a temporary database error (which is, unfortunately, quite common on the toolserver). I'll add more information to the error message so this is clearer. Cheers, Tangotango 13:58, 24 March 2007 (UTC)