Learning and Evaluation

Welcome to the Learning and Evaluation glossary. This is to help the Wikimedia community develop a shared language around program evaluation.

It's a work in progress, and we are working on making it as easy to understand as possible. We hope you'll join us in developing it and making it a multilingual tool.[1]

A

activities

Activities are the things we do to help meet our organizational missions and goals. Activities are the actions that happen as part of a program. When you plan a program, your inputs go into your activities, which will result in outputs.

Examples:

  • teaching a new editor how to make a user account at an editathon;
  • designing and handing out a post-workshop survey for participants to provide feedback to organizers;
  • taking photographs during “Wiki Loves Monuments”;
  • arranging for a tour guide to give a Wiki Takes Your City tour;
  • showing students how to edit Wikipedia in the classroom, and enabling them to do this;
  • uploading images of paintings during a GLAM content donation;
  • providing a forum for participants to brainstorm ideas for hacks during a hackathon.


appreciative inquiry

Appreciative inquiry is an evaluation method that focuses on what worked and went well in a program instead of focusing on what didn't work or went badly in a program. By asking positive questions, organizations and program leaders can take what works well and do more of those “good” things to make their program more successful. When you use appreciative inquiry, you're taking on an appreciative inquiry perspective.

Examples:

  • You discover only 10 out of 20 participants in an editathon were editing six months after the program ended. Instead of wondering why 10 participants didn't edit, you inquire about what made 10 participants continue to edit.
  • Wiki Loves Monuments succeeds in encouraging people to take photographs of historic places. You notice that your language Wikipedia is lacking photographs about your country's national parks. What can you take from the successes of WLM and put towards the “Wiki Loves Parks” idea you've developed to make it successful?


assumptions

When you make an assumption (assume something), you're believing without proof that something is true, or that something will happen.

Examples:

  • everyone who attends your editathon will have their own laptop and will bring it;
  • everyone left your workshop with a good understanding of how Wikipedia works.

See also: rationale


B

benchmarking

Benchmarking is when an organization measures the success of a program over time against the best measurements and outputs produced by similar organizations or programs. You do this by looking at programs that are similar to yours and that successfully measured their success or lack of success, and thinking about how those benchmarks might be used or modified for your program. Benchmarking is often more credible when it is expressed in numerical terms, although this is not always possible.

Example:

  • Your chapter wants to partner with a GLAM to do a content donation for an Armenian cultural history museum. You read the case study about the Walters Art Museum and note the important “successes”, which include “55 photographs of museum content were uploaded to Commons”. You modify this to “We expect at least 80 photographs of museum content will be uploaded to Commons with descriptions in Armenian and English during the project, and at least 10 of these to be promoted to valued picture status within three months after uploading”.
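Once benchmarks like these are expressed as numbers, checking them is simple arithmetic. The sketch below is only an illustration: the target values come from the example above, while the `results` figures and the names `targets`, `results`, and `targets_met` are hypothetical and not part of any Wikimedia tool.

```python
# Minimum values taken from the modified benchmark in the example above.
targets = {"photos_uploaded": 80, "valued_pictures": 10}

# Hypothetical end-of-project counts, invented for illustration only.
results = {"photos_uploaded": 92, "valued_pictures": 7}


def targets_met(targets, results):
    """Return, for each target, whether the observed result reached the minimum."""
    return {
        name: results.get(name, 0) >= minimum
        for name, minimum in targets.items()
    }


status = targets_met(targets, results)
for name, met in status.items():
    print(f"{name}: {'met' if met else 'not met'}")
```

With these made-up results, the upload target is met and the valued picture target is not, which tells you where to look when you plan the next project.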


C

cohort

A group of people who come together because they have something in common.

Examples:

  • students in a class that is a part of the “Wikipedia Education Program”;
  • all of the participants from a particular edit-a-thon;
  • photographers who participated in “Wiki Loves Monuments”;
  • Spanish-speaking attendees at a particular Wikimania.


comparison group

A comparison group is a group of people who share similar characteristics to a program's participants, but are not involved in the program.

Systematic data are usually collected from both comparison and participant groups and compared to identify the differences between them in relation to the program and its goals. Because the comparison group did not participate in your program, comparing the data helps to determine whether a program is making a difference in terms of the targeted metrics and outcomes.

A control group is a specific type of comparison group that comes from randomly dividing potential participants into two groups: those who participate and those who don't.

Example:

  • A comparison group could be randomly or strategically selected from the general population of users on Wikimedia, or a particular project, language, or geographic subset relevant to the population targeted by a program. An example is a random sample of new users to Wikimedia Commons as a comparison group for new users to Commons who were recruited through “Wiki Loves Monuments” – the former receiving no program intervention, and the latter entering through the “Wiki Loves Monuments” program.
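To make the idea concrete, here is a minimal sketch of two steps described above: randomly dividing potential participants into two groups (as for a control group), and comparing a simple measure between a participant group and a comparison group. All user names and values are invented, and the helper names `split_into_groups` and `retention_rate` are hypothetical; a real evaluation would use data collected from the wiki.

```python
import random


def split_into_groups(candidates, seed=0):
    """Randomly divide candidates into a participant group and a control group."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def retention_rate(still_editing):
    """Share of a group that is still editing; maps user name -> True/False."""
    if not still_editing:
        return 0.0
    return sum(still_editing.values()) / len(still_editing)


# Hypothetical follow-up data: is each user still editing after the program?
participants = {"user_a": True, "user_b": True, "user_c": False, "user_d": True}
comparison = {"user_e": False, "user_f": True, "user_g": False, "user_h": False}

difference = retention_rate(participants) - retention_rate(comparison)
print(f"Difference in retention rate: {difference:.2f}")
```

A difference like this only suggests that the program made a difference; with groups this small, it could easily be due to chance.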


completion stage

A completion stage, or follow-up stage, is one of four phases in a program evaluation process (identification, design, implementation, and completion/follow-up).

The completion stage focuses on analyzing, reporting, and using evaluation findings. These findings are usually shared through summative and outcome evaluation reporting, to explain what lessons were learned, what target outcomes were reached, and what the next steps should be for a program.


condition

A condition is the environmental, political, social, or civic state that an individual or group of individuals is in.

In impact evaluations, we are attempting to measure whether a program has had an effect on participants' condition.


context evaluation

A type of program evaluation. It looks at variables such as social, political, economic, geographic, and cultural factors, and how these variables affect the optimal running of an activity, and how successful the activity was.

I.e., What external factors influenced the program?


D

data

Qualitative or quantitative pieces of information.

Examples:

  • survey results;
  • contributor edit histories;
  • budgets from multiple organizations;
  • exit interviews;
  • the number of times a person's edits were reverted;
  • articles submitted for assessment on a Wikipedia project;
  • file names.


data collection

The act of collecting, recording and/or gathering data.

Examples:

  • writing, distributing, and collecting survey responses;
  • interviewing people about a topic for their opinions;
  • gathering edit-history logs;
  • compiling a list of all participants at a Wikimedia event;
  • counting the number of men versus women attending a Wikimedia meetup;
  • counting the total number of votes and who voted in an administrator election.


design stage

The design stage is one of four phases in a program evaluation process (identification, design, implementation, and completion/follow-up). The design stage focuses on understanding baseline performance and deciding what your goals and objectives, target indicators, benchmarks, and evaluation strategies will be.


discrepancy perspective

The viewpoint that sees evaluation as a process for identifying and measuring differences and inconsistencies between (i) what a program is in reality (or what you have), and (ii) what you wanted or expected. It works to improve performance by locating challenges and correcting problems in reaching program goals and objectives.


document review

A data collection method in which you intentionally collect, review and analyze existing program documents and materials related to program delivery (documentation of program inputs and feedback such as attendance sheets, presentation materials, programming notes).

Example:

  • After an edit-a-thon, you review the promotional material, blog posts, on-wiki event space, and general observations of contributions by participants. By doing this, you can capture evidence of program delivery in terms of inputs, outputs, and the process in your programming.


E

evaluation;
evaluator

The systematic determination of the merit, worth, or significance of something. Evaluation uses a set of standards or criteria to assess any effort, program, or initiative to understand its impact and make decisions about how to proceed.

The primary purpose of evaluation — in addition to gaining insight into past or present efforts, programs, and initiatives — is to enable reflection and assist in the identification of needs and strategies for future change.


evaluation plan

A detailed description of how an evaluation will be done. An evaluation plan sets out strategies to systematically collect the information needed to tell the story of a program and its success.

This can include who is involved in a program evaluation (a designated member of a Wikimedia user group might be the leader in this evaluation, or volunteers who produced an edit-a-thon might work together to evaluate it), evaluation strategies (monitoring, surveying, tracking performance), individual roles in an evaluation (Julia is in charge of gathering all survey data from participants), the timeline used to execute the evaluation, and the resources available for implementing the plan (what data will be gathered, research methods to be used to gather the data, a description of the roles and responsibilities of sponsors and evaluators, and a timeline for accomplishing tasks).


evaluation question

A question related to a program's outcomes, outputs, indicators, or other definition of success.

The goal of an evaluation effort is to answer one or more evaluation questions.


F

feasibility assessment

Type of program evaluation, or evaluation strategy, done prior to beginning a program. Feasibility assessment focuses on whether the proposed program and its activities are possible, and whether the objectives can be achieved within the proposed plan and timeline.

I.e., What can we reasonably accomplish?

Example:

  • You want to do “Wiki Loves Monuments” in your country this year. There are only five active volunteers in your country and you're unsure whether you'll be able to produce the program efficiently. You do a feasibility study to determine how much it will cost, and whether you'll have the capacity, with only five volunteers, to get a website up and promotion started.


formative assessment

A type of program evaluation or evaluation strategy that focuses on how the learning process works; what works as planned, what does not, and any strategies to meet challenges in achieving the intended outputs and outcomes of the learning component of program delivery.

I.e., How did it happen? What were the pathways for implementation and change?

See also: process evaluation.


G

goals

Goals are clear statements of the overall purpose of your program or organization. There are two types of goals relevant to program evaluation:

  • “Program goals” about the intended aims or impacts of the program.
    Example:
    • Your “Wikipedia education program” may have goals related to improving content on the Hindi Wikipedia using the expertise of students and professors and to bring more awareness to the educational community in your area on how to use Wikipedia in the classroom.
  • “Organizational goals” that set a clear direction for the organization as a whole and provide guidance and direction to staff toward the organization's mission and strategic plan.
    Example:
    • The Wikimedia Foundation's five prioritized strategic goals aim to increase content, participation, quality, reach, and diversity across all Wikimedia projects.


H

Nothing here yet! Feel free to add something.

I

identification stage

The identification stage is one of four phases within a program evaluation process (i.e., identification, design, implementation, completion/follow-up).

The identification stage focuses on discovering what is known from past successes and failures, providing lessons learned, and identifying assumptions.


impact

In general, impact means the overall effect of a program. It is often used broadly, describing the ultimate goals of the program, such as to alleviate poverty or to help children attend college. Programs are conceived to have an impact, but because the ultimate goal is often very large, long-range, and broadly stated, it is often difficult or impossible either to measure it or to understand exactly how one program influences it.

Impact is the area where a program’s goals most often align with the mission of funding organizations: one foundation may desire to “improve children’s lives” and one program operated by that foundation may address this goal in a small way with specific outcomes.

In terms of Wikimedia programs, impact refers to the extent to which an individual program's outcomes lead to long-term and sustained changes toward the Wikimedia vision and strategic goals.

Examples:

  • the number of new contributors recruited through program activities such as “Wiki Loves Monuments” may work to increase overall participation;
  • a content donation in the area of women's history may attract more female Wikipedia editors on a specific wiki (participation and diversity);
  • the quality of content on a Wikipedia language version (quality);
  • an improved retention rate among very active Wikipedia contributors (participation and content).


impact evaluation

Type of program evaluation, or evaluation strategy, that focuses on measuring how well the goals and objectives of a program were met.

Most often, impact evaluations examine comparison and/or control groups, in order to provide comparable data across programs to inform the decisions of those doing the program.

With this type of evaluation, program leaders are able to see what type of impact their program is making, at what cost, and whether they should continue the program, modify the program, or work to reduce or expand the program based on the cost of obtaining impact.

I.e., How much difference did it make?

See also: outcome evaluation


implementation

Each instance of a program being planned and deployed represents a different implementation of that program.

Example:

  • “Wiki Loves Monuments” is a program whereas “Wiki Loves Monuments 2013 in Estonia” is an implementation.


implementation stage

The implementation stage is one of four phases within a program evaluation process (i.e., identification, design, implementation, completion/follow-up).

The implementation stage focuses on monitoring the program delivery, progress toward obtaining expected outputs, and success in collecting the correct data needed to tell the story of the program and its outcomes.


indicators

Indicators are measurable markers that a certain condition or situation exists, or certain outcomes have been achieved. They tell you how much progress has been made toward a particular goal, output, or outcome.


inputs

Something put into a process with the intention of it shaping or affecting the outputs of that process.

In program evaluation, the inputs are the resources (both human and tangible) that you put into your program to make it happen. Time and money are the most fundamental resources.

Example:

  • Inputs for your workshop include how many hours volunteers worked to plan and produce the workshop, how much money was spent from FDC funds to produce the workshop, whether donations were provided by sponsors, and whether money was spent on making booklets to hand out at the workshop.


intermediate outcomes

The critical middle layer of any outcomes measurement framework (i.e., short-term, intermediate, and long-term outcomes).

Intermediate outcomes generally refer to changes in behavior and decision-making: the actions taken by those who participated in the program. Specifically, intermediate outcomes are the actions that are taken that will lead to changes in the conditions related to targeted impact.

Example:

  • You host a “Wiki Loves Monuments” event where you invite participants to photograph a historic site in partnership with a historic monuments organization. 20 people come to your event. An intermediate outcome would be that each participant uploads at least 5 images to Commons at least two weeks after the event, before the end of September.


J–K

Nothing here yet! Feel free to add something.

L

learning organization

This is what we want to see Wikimedia, chapters and affiliates be! Even individuals! (“Learning individual!”) These organizations have the capacity to maintain and improve their performance and programs based on their experience. They evaluate themselves and intentionally create feedback loops to make sure they are always reflecting on their own actions and achievements.


logic model

A visual representation of how your program works: a “picture” of your program. A logic model includes what you put into your program (resource inputs), what you do (program activities and participation), and what you plan to achieve (program outputs and resulting outcomes).

Importantly, a logic model is:

  1. an organized and basic description of a program and its measurable accomplishments;
  2. an ordered series of “if-then” relationships that are expected to lead to the desired program outcomes (also known as a “theory of change”);
  3. a framework for describing the relationships between programming inputs, outputs, and outcomes.


long-term outcomes

Outcomes that generally relate to consequences in terms of changes to the conditions. These are the end-point changes in social, economic, and/or environmental conditions targeted by your theory of change.

Example:

  • An edit-a-thon's intended outputs include 10 articles written about women in science; however, the long-term outcome/impact target may be having more high-quality articles on women's subjects on Wikipedia.

See also: impact.


M

metric

Metrics are well-defined values or sets of values that can be computed and tracked, and are typically used in aggregate to compare different participant groups or projects (e.g., comparing one program cohort against another).

The metrics computed by the UserMetricsAPI help us understand user activity — and behavior — from the quality, quantity and type of user contribution, to how well our editors are retained.
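As a rough illustration of a metric used in aggregate, the sketch below computes the mean number of edits per participant for two cohorts and puts the results side by side. The cohort names and edit counts are invented, and this plain-Python calculation is not how UserMetricsAPI itself works; it only shows the kind of comparison a metric makes possible.

```python
from statistics import mean

# Hypothetical edit counts per participant, grouped by cohort.
edits_by_cohort = {
    "workshop_2013": [12, 0, 5, 3],
    "workshop_2014": [8, 8, 2, 6],
}


def mean_edits(edit_counts):
    """Metric: average number of edits per participant in one cohort."""
    return mean(edit_counts)


# Compute the same metric for every cohort so the cohorts can be compared.
summary = {
    cohort: mean_edits(counts) for cohort, counts in edits_by_cohort.items()
}

for cohort, value in summary.items():
    print(f"{cohort}: {value} edits per participant on average")
```

Because the metric is defined the same way for both cohorts, the two numbers can be compared directly, which is the point of having a well-defined metric.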


mission statement

A statement that describes how an organization’s purpose is aligned with its vision. A mission statement is brief — “short enough to fit on a t-shirt” is one rule of thumb — and describes why the organization exists, what it does, for whom it exists, and the value that it creates, without listing specific activities employed to achieve the mission.


monitoring

Type of program evaluation, or evaluation strategy, focused on tracking and describing programming inputs, delivery of program activities, and outputs. It is literally the “bean-counting” of activities and events to see if you did what you said you would do.

I.e., Did we do what we set out to do?


N

needs assessment

Type of program evaluation, or evaluation strategy, which focuses on looking at demand (e.g., article subjects in which Wikipedia needs improvement) or gaps (e.g., we have thousands of photos of monuments in Mexico, but what are we missing photos of in Mexico?).

It's usually done before the program, during the design stage. The data gathered is then used to formulate the baselines (starting points), goals, objectives and the resources required for the program.

I.e., What change is needed?


O

objectives

An objective is a small component of a goal. It focuses on the specific steps that you will take in order to reach your goal.

Example:

  • If your goal is to get more women editing Wikipedia, your objective would be to hold 10 edit-a-thons in 6 months to help reach that goal.


outcomes

Outcomes are the results that your program aims to achieve. Outcomes are the detailed part of impact: if your edit-a-thon meets its outcomes, in theory it will have impact. They are represented by the changes that happen to participants in terms of knowledge, skills, behavior, and/or other attributes and conditions that are targeted by a program.

When defining outcomes, consider: how does the program touch the lives of individuals, groups, families, households, organizations, or communities in the short-term, intermediate and long-term? Ideally, a program's outcomes should result in sustained impact on the Wikimedia Projects. Examples of outcomes include skills needed to edit a Wikipedia article, gaining new editors, retaining editors, increasing editor participation, etc.

Example:

  • If your overarching goal is to have more women edit Wikipedia in the next year, you will have the objective to have 10 edit-a-thons in the next six months. After your edit-a-thon, you will evaluate the edit-a-thon to discover the outcomes. Each individual edit-a-thon will have outcomes, and the entire program as a whole could as well. Outcomes might include that 20% of the 150 women who participated in the 10 edit-a-thons edited Wikipedia 2 months after the series of events ended.
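The percentage in the example above is a simple share: the number of participants still editing divided by the total number of participants. A minimal sketch, assuming you have already counted both numbers (the function name `share_still_editing` is hypothetical):

```python
def share_still_editing(total_participants, still_editing):
    """Fraction of participants who were still editing at follow-up."""
    if total_participants <= 0:
        raise ValueError("total_participants must be positive")
    return still_editing / total_participants


# Figures from the example: 150 women took part, and 20% of them
# (30 people) were still editing two months after the events ended.
share = share_still_editing(total_participants=150, still_editing=30)
print(f"{share:.0%} of participants were still editing")
```

The same calculation can be run for each individual edit-a-thon and for the whole series, since both levels can have their own outcomes.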


outcomes chain

Also called a “chain of outcomes” or “pathway of outcomes”, this term refers to the sequence in which outcomes are achieved. Outcomes cannot all be attained at the same time, and some outcomes rely on the achievement of outputs and other earlier outcomes.

Your short-term, intermediate, and long-term outcomes should reflect a logical outcomes chain (or pathway of outcomes) in which your longer-term outcomes are supported by, and dependent on, your shorter-term outcomes.


outcomes evaluation

Type of program evaluation, or evaluation strategy, that is most commonly known and focuses on the extent to which a program met its intended goals and objectives for change in its outcomes.

See also: summative evaluation.


outputs

The direct and measurable products of program activities and participation. Outputs are what came out of the program activities during the programming event(s).

This generally includes observations of the number of workshop sessions conducted, educational materials contributed, and participants served. These outputs should be the first step toward the desired outcomes for the program's participants.

Examples:

  • how many people participated (20 participants total, 2 volunteers);
  • what type of people they were (15 medical students and 5 professors, 10 women);
  • how many articles or images were uploaded during the programming event(s);
  • the subject matter of the contributions.


P

performance measures

Consistent quantitative indicators used to assess whether, and to what extent, an outcome is achieved.


problem statement

Also simply “problem”, or “issue statement”, or “challenge”, a problem statement is a concise statement of the problem or a challenge that your program aims to resolve.

Example:

  • Wikimedia Projects suffer from a gender gap — more men contribute than women — therefore your program wants to get more women involved.


process evaluation

Type of program evaluation, or evaluation strategy, which focuses on the degree to which an initiative, program or project has been implemented as planned.

I.e., How did it work/happen?

See also: formative assessment.


program

A group of related projects and activities that share the same objectives, are repeated on a regular basis, are based on a similar theory of change, and use similar processes and interventions to make that change happen.

Key characteristics of programs:

  • shared objective: a group of related projects that share the same objective;
  • sustained: a group of related projects that are on-going or repeated on a regular basis;
  • model: a group of related projects that share a similar theory of change and use similar processes and interventions to make that change happen.

Example:

  • In 2012, Wikimedia Sweden, Argentina, Poland and others were running “Wiki Loves Monuments” projects that were all part of a global “Wiki Loves Monuments” program. The global “Wiki Loves Monuments” program has been organized over the course of several years, while its objectives (getting more photos of monuments being uploaded to Commons) and theory of change remained the same. Every year, the processes got refined (e.g. new upload tools got developed), whereas “uploading pictures” stayed at the core of what was happening in order to achieve impact on Wikipedia (with the goal of having a larger percentage of articles about monuments being equipped with an appropriate image).


program evaluation

The systematic determination of something's merit, worth and/or significance. Evaluation uses a set of standards or criteria to assess any effort, program, or initiative to understand its impact and make decisions about how to proceed.

The primary purpose of evaluation, in addition to gaining insight into past or present efforts, programs, or initiatives, is to enable reflection and assist in the identification of needs and strategies for future change.


program evaluation plan

A document that details strategies for systematic collection of information that will be used to answer critically important questions about a program.

A program evaluation plan provides a framework for developing indicators for program outcomes, and determining how evaluation information will be collected.

See also: evaluation plan.


program leader

A program leader is a person who plans, executes and, most of the time, evaluates programs. Sometimes programs have multiple program leaders.

With regards to Wikimedia programs, program leaders might be individuals with no chapter affiliation. They may be the volunteer President of a chapter, or perhaps a paid employee of a chapter who designs and executes programs specifically for that chapter. They could be a member of an affiliate group recognized by the Wikimedia community. It could be a librarian who hosts workshops at their library to teach people how to edit Wikipedia.

You might be a program leader!


project

An individual or collaborative enterprise frequently involving research or design, planned and designed to achieve a particular aim. Generally has a start and end date and may be repeated, but is not designed for repetition.

Not to be confused with “Wikimedia Projects” like Wikipedia, Wiktionary, Wikiquote, Wikinews, Wikivoyage, Wikisource, Wikibooks, Wikiversity, Wikimedia Commons, Wikidata, Meta-Wiki, MediaWiki, etc.


Q

qualitative data

Data described in terms of “quality”, as opposed to “quantity”.

Qualitative data is often obtained through asking open-ended questions, to which the answers are not limited to a set of choices or a scale. Qualitative data collection is most useful when you would like information in people's own words, or when the questions you are asking have too many possible answers for you to be able to list them. Qualitative data can also be captured through observational measures.

Qualitative data is more time-consuming to analyze than quantitative data, but can be a worthwhile and important part of a data collection effort.

Example:

  • At the end of their semester in the “Wikipedia Education Program”, you ask students to “share what they enjoyed most about editing Wikipedia in the classroom”. They share their own opinions and thoughts, either in a written paragraph in a section on a survey or verbally by sharing answers in a classroom poll, and you then code those answers. Other natural observations of what volunteers enjoy are also qualitative data.


quantitative data

Data described in terms of a “quantity” or number, as opposed to “quality”.

Quantitative data is collected through closed-ended questions, where users are given a limited number of answer choices, or asked to answer on a scale. While quantitative data collection is suitable for collecting numeric data such as age, income, number of staff, number of children, etc., many types of information can be collected quantitatively if placed on a scale.

Example:

  • When you do a survey of students at the end of a “Wikipedia Education Program” semester, you might ask them to select their age, select their gender, and select what year they are hoping to graduate, giving them only boxes to check instead of asking them an open question.


R

rationale

Why do you expect your program activities to lead to a particular set of outcomes? Also called “theoretical assumptions”, your rationales are statements expressing why you believe your activities will lead to your outcomes, or why shorter-term outcomes will lead to longer-term outcomes. They are often based on research, but may also come from past experience, common sense, or knowledge of your specific situation.

Example:

  • Based on the program evaluations of past “Wiki Loves Monuments”, you believe that partnering with your country's government agency responsible for maintaining monuments will bring more participants as an outcome. Thanks to the reporting of other program leaders in the WLM community, you were able to see that they had more success meeting participation-related outcomes due to these partnerships.


S

short-term outcomes

Outcomes generally related to learning. These are the immediate changes in awareness, skills, attitudes, knowledge, and motivations that result from a program along its theory of change.

Example:

  • You host a workshop on how to edit Wikipedia. Participants get first-hand experience in how to edit and learn basic skills. At the end of the workshop, you have the participants fill out a survey about their experience. Out of the 30 participants, 20 say that they left with a better understanding of editing and feel more confident about it. Their improved understanding and confidence is a short-term outcome of your workshop.


summative evaluation

Type of program evaluation, or evaluation strategy, which focuses on assessing whether a program met its intended goals and objectives for change.

I.e., Did we do what we set out to do? What worked? What didn't work?

See also: outcomes evaluation.


T

targets

Targets attach a number to the program’s goals and state expectations for the successful performance of outcomes. Identifying targets helps translate general goals (What will we accomplish?) into specifics (How much will we accomplish by when?).

See also: benchmarking.


test

A measurement of performance; can be used as part of a data collection effort around particular indicators.

Example:

  • At the end of a workshop you might quiz participants on what they learned about Wikipedia policy. This will then inform you about how much they learned.


theory of change

A theory of change is a way to design and evaluate social change initiatives. A theory of change presents a theoretical pathway outlining the action steps that:

  1. link your mission and programming activities toward change through logical cause and effect relationships;
  2. allow for the specification of program outputs and participant outcomes you are trying to effect;
  3. focus on key outcomes that are specific, measurable, attainable, realistic, and time-bound.

A theory of change offers a clear roadmap to achieve your results by identifying the preconditions, pathways, and the activities, outputs, and outcomes necessary for a program's success.


tool

A tool is any physical item that can be used to achieve a goal. Informally the word is used to describe a procedure or process with a specific purpose.

In program evaluation, tools are often used to track and monitor (logs) as well as capture feedback (surveys, interviews, focus groups, etc.) or extract outcome information (reporting tools).


U–Z

Nothing here yet! Feel free to add something.

References

  1. To link to the definition of a term on this page from somewhere else on Meta, use the template {{Glossary|term}} (for simple untranslated terms) or {{Glossary|target-id|displayed term}} (if you need to distinguish an untranslated id from the actual translatable term, or from a minor derivation of that term used in your page, such as a plural). For instance, {{Glossary|assumptions|assumption}}.