Category flatten
This page is kept for historical interest. Its content is outdated or may be wrong. You may find more up-to-date information on the www.mediawiki.org website.
I did not find a page about this specific request. I have heard that someone is working on something that would help with this, but I wasn't able to find it.
I don't know about enwiki, but we at huwiki have a problem with categories: we would actually like to have two kinds of categories:
- the one we're having now: the category only lists its own entries, and
- a category which lists its entries and every entry in its subcategories.
One example is "category:persons", "category:hungarians", "category:writers", "category:sci-fi writers". If we want to have good lists, we have to list a Hungarian sci-fi writer in all of these categories (and in "hungarian sci-fi writers" and more, too). It would be logical to have "category:persons" as a "flategory", which automagically flattens itself and contains all articles in its subcategories.
However, it may be unnecessary to create different category types if one is able to flatten any category (i.e. make it show everything from its subcats too).
- Some categories contain literally thousands (even tens of thousands) of subcat articles. This would prohibit flat views for large categories, or require that these be optional, which would be processor intensive.-SV 01:03, 23 December 2005 (UTC)
Getting it done
I am not familiar with the internals, but I guess there's a category table that connects categories to articles, and that these tables get updated and listed.
- In MediaWiki 1.5 and 1.6, there's one categorylinks table with columns for ID of categorized page, name of category, and name of categorized page (plus timestamp). It records both page and category categorizations. There's a link in Database layout to the actual SQL for it. -- skierpage
Flattening with a recursive walk-down would be extremely slow in large category trees. I would use the database schema below.
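For comparison, a recursive walk-down can be sketched roughly as follows. This is only an illustration: the dictionaries `direct_subcats` and `direct_members` are hypothetical in-memory stand-ins, not MediaWiki structures, and the article "Y" is made up. In a real wiki each visited category would cost at least one database query, which is why this gets slow on large trees.

```python
# Hypothetical in-memory stand-ins for category data (not MediaWiki structures).
direct_subcats = {
    "persons": ["hungarians", "writers"],
    "hungarians": ["hungarian writers"],
    "writers": ["hungarian writers", "sci-fi writers"],
}
direct_members = {
    "hungarian writers": ["X"],
    "sci-fi writers": ["Y"],  # "Y" is an invented article for illustration
}


def flatten(category, seen=None):
    """Collect the articles in `category` and in all of its subcategories."""
    if seen is None:
        seen = set()
    if category in seen:  # already visited: skip it (this also guards against cycles)
        return set()
    seen.add(category)
    articles = set(direct_members.get(category, []))
    for sub in direct_subcats.get(category, []):
        articles |= flatten(sub, seen)
    return articles


visited = set()
result = flatten("persons", visited)
print(sorted(result))  # articles found anywhere under "persons"
print(len(visited))    # number of categories that had to be visited
```

The number of visited categories grows with the size of the subtree, and the walk has to be repeated on every flattened view unless the result is cached.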
Subcat table
Connects a category to all its subcategories or, read the other way, gives the supercategories of a category.
Based on the previous example:
parent | subcat |
---|---|
persons | hungarians |
persons | writers |
persons | hungarian writers |
persons | sci-fi writers |
hungarians | hungarian writers |
writers | hungarian writers |
writers | sci-fi writers |
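The table above holds every ancestor/descendant pair, not just direct links. A minimal sketch of how such rows could be derived, assuming the direct parent links listed in `direct_links` (my reading of the example, not something the wiki stores in this form):

```python
# Direct parent -> subcategory links; an assumed input based on the example.
direct_links = [
    ("persons", "hungarians"),
    ("persons", "writers"),
    ("hungarians", "hungarian writers"),
    ("writers", "hungarian writers"),
    ("writers", "sci-fi writers"),
]


def build_subcat_table(links):
    """Return every (ancestor, descendant) pair reachable through `links`."""
    children = {}
    for parent, sub in links:
        children.setdefault(parent, set()).add(sub)

    rows = set()
    for ancestor in children:
        stack = list(children[ancestor])
        while stack:
            sub = stack.pop()
            if (ancestor, sub) in rows:  # already recorded; also ends loops on cycles
                continue
            rows.add((ancestor, sub))
            stack.extend(children.get(sub, ()))
    return rows


subcat_table = build_subcat_table(direct_links)
for parent, sub in sorted(subcat_table):
    print(parent, "|", sub)
```

The expensive walk happens only when the category tree itself changes; listing a flattened category then becomes a plain lookup.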
Category assignments
This table probably exists already, without the last field.
When article "X" is inserted into the "hungarian writers" category:
- Look up the parents of "hungarian writers" in the Subcat table above, which gives: persons, hungarians, writers.
- Then assign X to these categories, marking the "real one" as master, and the others as, well, others:
article | categories | master category |
---|---|---|
... | ... | ... |
X | hungarian writers | 1 |
X | persons | 0 |
X | hungarians | 0 |
X | writers | 0 |
... | ... | ... |
- If we want just the categories as we have them now, get the rows where master category = 1.
- If we want to flatten one, get all regardless of last field.
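The steps above can be sketched with an in-memory SQLite database. The table and column names (`subcats`, `assignment`, `master`) and the helper functions are hypothetical and chosen for this illustration; they are not the MediaWiki schema.

```python
import sqlite3

# Rows of the Subcat table from the example above.
SUBCAT_ROWS = [
    ("persons", "hungarians"),
    ("persons", "writers"),
    ("persons", "hungarian writers"),
    ("persons", "sci-fi writers"),
    ("hungarians", "hungarian writers"),
    ("writers", "hungarian writers"),
    ("writers", "sci-fi writers"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subcats (parent TEXT, subcat TEXT)")
conn.execute("CREATE TABLE assignment (article TEXT, category TEXT, master INTEGER)")
conn.executemany("INSERT INTO subcats VALUES (?, ?)", SUBCAT_ROWS)


def assign(article, category):
    """Insert the master row, then one non-master row per supercategory."""
    conn.execute("INSERT INTO assignment VALUES (?, ?, 1)", (article, category))
    conn.execute(
        "INSERT INTO assignment "
        "SELECT ?, parent, 0 FROM subcats WHERE subcat = ?",
        (article, category),
    )


def members(category, flat=False):
    """List articles in a category; flat=True ignores the master flag."""
    sql = "SELECT DISTINCT article FROM assignment WHERE category = ?"
    if not flat:
        sql += " AND master = 1"
    return [row[0] for row in conn.execute(sql + " ORDER BY article", (category,))]


assign("X", "hungarian writers")
print(members("hungarian writers"))   # normal view of the master category
print(members("persons"))             # normal view of a supercategory
print(members("persons", flat=True))  # flattened view of the same category
```

Both views are single indexed lookups on `category`; the cost is moved to write time, when the extra rows are inserted.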
Category reassignments
Problem 1: what if "chinese" gets assigned to "persons", with its thousand subcategories?
- (thinking)
Problem 1a: "chinese" gets reassigned from "whatever" to "persons".
- (thinking)
Automagical flattening
As I explained at the top, sometimes a category would only be used flattened. This could be addressed either by making it possible to show it flattened through a new namespace or a modifier (like "category_flat:", which someone mentioned). The other way is to actually set categories as "flat" or "normal", but this would require additional settings and we'd lose flexibility.
- Just to throw in another idea about possible syntax - how would people feel about sub-pages in the Category namespace being used to control flattening? In other words top-level category pages would work as they do now, but (for example) Text Editors/Linux, Text Editors/BSD and Text Editors/Windows would be flattened into Text Editors.
- Also I'm wondering if it might make sense to suppress links to category sub-pages in the page margin for articles, but to include just a link to the main page only.
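A rough sketch of the sub-page idea, assuming the part of the category name before the first "/" is the top-level category to flatten into. The function names and the article names "A", "B", "C" are placeholders invented for this example.

```python
def top_level_category(category_name):
    """Return the part before the first '/', i.e. the page to flatten into."""
    return category_name.split("/", 1)[0]


def flatten_by_subpage(assignments):
    """Group articles under the top-level category of each (article, category) pair."""
    flattened = {}
    for article, category in assignments:
        flattened.setdefault(top_level_category(category), set()).add(article)
    return flattened


# Placeholder article names, for illustration only.
example = [
    ("A", "Text Editors/Linux"),
    ("B", "Text Editors/BSD"),
    ("C", "Text Editors/Windows"),
]
flat = flatten_by_subpage(example)
print(sorted(flat["Text Editors"]))  # all three articles appear under the top-level page
```

With this convention no extra table is needed, since the parent can be read off the name, but a category can then only be flattened into one parent.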
Opinions
Okay, my opinion is that my idea is good. :) Apart from this humble comment, I believe the additional storage these tables require is negligible, and the lookup would be pretty fast (especially when indexed right). I must confess that we at huwiki would be very happy to have these "flattenable" cats, and I would guess others would be happy not to be required to list articles in several sub-subcategories.
I am very curious about other opinions, especially those of people who code in PHP (unlike me).
- This is indeed an important feature, see also bugzilla:1497. In my point of view, there wouldn't be a need for a category "list" if there would be a functional "tree" (lists are just degenerated trees, so they are included). So if there would be a good tree doing the work, there is no need for flattening it. Regards Mkleine --217.234.181.11 23:11, 16 Feb 2005 (UTC)
I think this idea tries to address some serious problems with Categories. I have a related proposal posted at Wikipedia talk:Categorization. -- Wikipedia User:SamuelWantman 11:55, 19 Feb 2005 (UTC)
As far as I can tell, this would cause a problem with category size (large categories are known to be troublesome; that is notably the whole point of, for example, Wikiproject Stub Sorting). Also, IIRC, the current system will not list subcats starting with a letter whose individual articles are not displayed. To sum it up, the highest category would need to list all articles in Wikipedia, which is probably not a good idea, especially considering the higher levels are rife with circular categorizations (notably, Pokémon is a child of Pokemon and vice versa). User:Circeus
What is the current status of this? w:User:Ardric47 04:33, 16 March 2006 (UTC)
I like it. I think flattening should be a run-time option, not an edit-time setting of a category. What I'd really like is a way to restrict a search to articles in a particular category (search for "handsome" in articles categorized "Hungarians"), and in such a search I'd nearly always want to flatten.
Would cycles in category hierarchies break flattening? Wikipedia:Categorization recommends against cycles, but also gives a real example of a cycle in Education categories. -- skierpage 10:06, 21 April 2006 (UTC)