Research:Knowledge Gaps Index/Visualization
After finalizing the Taxonomy of Knowledge Gaps and designing the metrics for their measurement, our next milestone is to share progress on the design of a tool to visualize such gaps. The tool is intended to show the knowledge gaps for each project and how they change over time.
Scopes and Methods
The taxonomy of knowledge gaps identified 3 macro-dimensions across which inequalities exist in Wikimedia projects: Readers, namely the set of individuals who access Wikimedia sites to consume content; Contributors, the community of editors of Wikimedia projects; and Content, i.e. the knowledge contained in Wikimedia projects.
Each of these macro-dimensions contains multiple gaps. For example, for Content, these are: geography, gender, sexual orientation, time and local content. To measure these gaps in content, we proposed an approach to map articles to each of these categories, with the ultimate goal of visualizing them in a tool.
The Knowledge Gaps Index Tool will provide a flexible user interface that allows the user to navigate and explore the different knowledge gaps. At an initial stage, it will focus only on content gaps, but it will add data from the other two dimensions as soon as survey data becomes available. The aim of this tool is to allow any Wikimedian to grasp the state of the gaps in a simple and usable way, without removing the possibility of digging further into the complexity by means of multiple comparisons.
Use Cases based on User Research
To understand how this tool would satisfy user needs, we enumerated a set of use cases. We identified five basic ones, which we structured according to the general and specific questions the user is trying to answer, the general data characteristics, and what the user might want to do next.
The use cases are “the current situation” and four comparisons: compare-past, compare-languages, compare-past-languages, and compare-subgroups. We briefly review each of the knowledge gaps index use cases for the gender gap.
Use Cases Flow Diagram
Each of the use cases presents a visualization of the data that allows answering multiple questions. In this sense, we should see them as “pictures”.
In the following diagram, we see the use cases as different “states”. Depending on the parameters we select, we may be able to switch from one use case to another (like a navigation diagram).
- The diagram contains the parameters selection for each state.
- Between each state, there is only one *parameter change*.
- When switching from certain states, you need to deselect a parameter (in addition to adding a new one).
The opposite paths have not been drawn.
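The navigation rules above can be sketched as a small lookup of parameter selections per state. This is only an illustration: the parameter names (`gap`, `time_span`, `languages`, `subgroups`) and the selections assigned to each use case are assumptions made for the example, not the contents of the actual diagram.

```python
# Hypothetical parameter selections for each use case ("state").
# The real diagram may use different parameters.
STATES = {
    "current": frozenset({"gap"}),
    "compare-past": frozenset({"gap", "time_span"}),
    "compare-languages": frozenset({"gap", "languages"}),
    "compare-past-languages": frozenset({"gap", "time_span", "languages"}),
    "compare-subgroups": frozenset({"gap", "subgroups"}),
}


def transition(source, target):
    """Return the parameters to select and deselect to move between two states."""
    before, after = STATES[source], STATES[target]
    return {
        "select": sorted(after - before),
        "deselect": sorted(before - after),
    }


# Moving from "current" to "compare-past" only adds one parameter.
print(transition("current", "compare-past"))
# Moving from "compare-past" to "compare-subgroups" also deselects one.
print(transition("compare-past", "compare-subgroups"))
```

Under these assumed selections, some transitions add a single parameter, while others first require deselecting a parameter that the target state does not use.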
We included two additional use cases not mentioned before: compare-dimension (1E) and compare-metric (1F). Compare-dimension would allow seeing the distribution of the gender gap in content and in readers. Compare-metric, by contrast, would allow seeing side by side the distributions of the number of articles, the extent-score and the number of inlinks from the main page. We believe these two additional use cases are independent of the others, and that it is not possible to cross them with other use cases or add any extra layer on them.
Design requirements
The user interface is being designed according to the principles of flexibility, progressive disclosure and consistency. While the navigation diagram is the main tool that guides the design of the prototype, there are some additional aspects to be taken into account:
Default graph types: There are 6 states. We need to select a default graph type for each state. Current (treemap), 1A (stacked bars), 1B (treemaps for 2; stacked bars for >2), 1C (line chart), 1D1 (stacked bars), 1D2 (stacked bars).
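The default graph types listed above could be encoded as a simple lookup. The sketch below is an assumption about how such a mapping might look in code; the function name `default_graph_type` and the `n_compared` parameter (the number of items being compared in state 1B) are hypothetical.

```python
# Default graph type per state, as listed above. State 1B is handled
# separately because its default depends on how many items are compared.
DEFAULT_GRAPH = {
    "Current": "treemap",
    "1A": "stacked bars",
    "1C": "line chart",
    "1D1": "stacked bars",
    "1D2": "stacked bars",
}


def default_graph_type(state, n_compared=2):
    """Return the default graph type for a state (hypothetical helper)."""
    if state == "1B":
        # Treemaps for 2 compared items; stacked bars for more than 2.
        return "treemap" if n_compared <= 2 else "stacked bars"
    return DEFAULT_GRAPH[state]


print(default_graph_type("Current"))
print(default_graph_type("1B", n_compared=5))
```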
Time configuration: There are different aspects to configure related to time. Time comparison requires switching from "current" to different time-spans. These are time-related or measurement-related parameters to be taken into account:
- Time-frame (i.e., current, 6 months, 1 year, 5 years, etc.).
- Time aggregation (monthly, quarterly, and yearly).
- Cumulative or incremental. We usually want to see the cumulative values (also referred to as accumulated), but the incremental values are very relevant to understand how well we bridged a gap in a given period, such as the last month.
- Left-axis absolute or relative value (e.g., absolute is the number of articles for each gap-category; relative is the percentage for each gap-category occupying the entire axis).
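The difference between cumulative and incremental values, and between absolute and relative axes, can be illustrated with a short sketch. The function names and the numbers are made up for the example and do not come from the tool or from real data.

```python
def incremental(cumulative):
    """Convert cumulative counts per period into per-period increments.

    The first period is compared against zero, so its increment equals
    its cumulative value.
    """
    increments = []
    previous = 0
    for value in cumulative:
        increments.append(value - previous)
        previous = value
    return increments


def relative(counts):
    """Convert absolute counts per gap-category into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in counts}
    return {category: 100 * value / total for category, value in counts.items()}


# Illustrative numbers only: cumulative article counts over three periods.
print(incremental([100, 120, 150]))
# Illustrative numbers only: articles per gap-category in one period.
print(relative({"category_a": 30, "category_b": 70}))
```

The incremental series shows how much was added in each period, while the relative values show the share of each gap-category when the axis is scaled to 100%.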
Other aspects:
- Table design. There are multiple aspects that can improve table readability, such as coloring cells according to their values, using filters at the top, etc. https://dash.plotly.com/datatable/interactivity
- Microinteractions (e.g., drop-down menus allowing text input to filter, in-graph hovering, download to CSV, sharing button, etc.).
- In-graph interactions (e.g., these are some provided in the Plotly library that are especially useful).
- Legend interaction design (e.g., double click on a category removes the rest automatically from the graph).
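Some of these aspects map onto existing Plotly and Dash options. The sketch below writes them as plain Python dictionaries, so it runs without installing either library; it is an illustration of possible settings, not the tool's actual configuration. The column name `articles` and the threshold are hypothetical.

```python
# Plotly layout fragment: a double click on a legend item isolates it
# (hides the others), and a single click toggles that item.
legend_layout = {
    "legend": {
        "itemclick": "toggle",
        "itemdoubleclick": "toggleothers",
    }
}

# Keyword arguments that could be passed to dash_table.DataTable:
# native filtering and sorting, CSV export, and conditional cell coloring.
table_options = {
    "filter_action": "native",
    "sort_action": "native",
    "export_format": "csv",
    "style_data_conditional": [
        {
            # "articles" and the threshold 1000 are hypothetical.
            "if": {"filter_query": "{articles} > 1000", "column_id": "articles"},
            "backgroundColor": "#d9ead3",
        }
    ],
}

print(legend_layout["legend"]["itemdoubleclick"])
print(sorted(table_options))
```

In a Dash application, `legend_layout` would typically be applied with `fig.update_layout(**legend_layout)` and `table_options` unpacked into the `DataTable` constructor alongside the data and column definitions.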