IRC office hours/Office hours 2013-03-21

ldavis: STARTING PUBLIC LOGGING

[10:00am] ldavis: Hi everyone, I'm LiAnna Davis, from the Education Program team -- since the program evaluation team hasn't been hired yet, I'm helping Frank with the office hours today to keep discussion on track and make sure we get to all the questions.

[10:00am] fschulenburg: Hi everyone. Thank you for coming to this "office hour inside out".

[10:01am] fschulenburg: For those who didn't read my announcement on wikimedia-l yesterday, here's what this is about: we're going to talk about program evaluation. And I would like to start this conversation with learning about what your understanding of evaluation is and what your hopes and fears are.

[10:01am] fschulenburg: Some of you might remember the blog post that I published some time ago: https://blog.wikimedia.org/2013/03/01/lets-start-talking-about-program-evaluation/

[10:01am] fschulenburg: The blog post also links to a couple of documents that I created on Meta. What's on Meta so far is just a start. There will be more documents in the near future. The goal is to develop a resource that will serve the needs of people who are thinking about program evaluation in the context of the Wikimedia movement.

[10:01am] AnjaJ_WMDE left the chat room. (Quit: AnjaJ_WMDE)

[10:01am] fschulenburg: Now, let's start. As I mentioned in my announcement earlier, this office hour will be different insofar as I am here to listen to you. I prepared a couple of questions and I'm curious to learn more about your experiences with evaluation.

[10:01am] mpeel joined the chat room.

[10:02am] fschulenburg: Here's my first question: How do you define program evaluation?

[10:02am] HaeB left the chat room. (Read error: Connection reset by peer)

[10:03am] jorm left the chat room. (Quit: jorm)

[10:04am] J-Mo: okay, I'll bite: how about "measuring what happened in a program, project, or activity over a period of time in order to figure out how well outcomes match stated goals"

[10:04am] fschulenburg: yeah, that's a good start

[10:05am] Sebaso_WMDE joined the chat room.

[10:05am] bluerasberry: I define program evaluation as being the objective metrics which are already collected by an established field.

[10:05am] sgardner left the chat room. (Ping timeout: 245 seconds)

[10:05am] fschulenburg: bluerasberry: can you expand on that?

[10:05am] Krinkle|detached is now known as Krinkle.

[10:05am] fschulenburg: what do you mean by established field?

[10:06am] tgr joined the chat room.

[10:07am] bluerasberry: "Established field" means industry. Government, culture, non-profit, commercial sector, and foundations will all have different expectations of what metrics they want

[10:07am] bluerasberry: it may be the case that each of these expects different metrics

[10:07am] bluerasberry: the only commonality may be

[10:07am] bluerasberry: that now all of these are subservient

[10:07am] bluerasberry: to whatever their IT can provide them these days

[10:07am] bluerasberry: that is making people appreciate new metrics

[10:07am] bluerasberry: I hardly care what the Wikimedia movement provides

[10:07am] bluerasberry: but I want it to be objective and well-considered

[10:08am] bluerasberry: whatever metrics I have

[10:08am] Nicole_WMDE joined the chat room.

[10:08am] bluerasberry: I will share with my field

[10:08am] Sebaso_WMDE1 joined the chat room.

[10:08am] bluerasberry: that's all

[10:08am] TBloemink is now known as TB|Away.

[10:08am] fschulenburg: thanks a lot. @all: what makes metrics "objective"?

[10:09am] TB|Away left the chat room. (Quit: So flee youthful passions and pursue righteousness, faith, love, and peace, along with those who call on the Operator from a pure heart.)

[10:09am] brest_ joined the chat room.

[10:09am] brion joined the chat room.

[10:09am] Sebaso_WMDE left the chat room. (Ping timeout: 264 seconds)

[10:10am] bluerasberry: "objective" means that people of different backgrounds, if they followed the same process, would get the same metrics

[10:10am] bluerasberry: hopefully user intervention in this part can be minimized

[10:10am] bluerasberry: and that there will be some process

[10:10am] bluerasberry: like with GA, Facebook analytics, or Twitter stuff

[10:10am] heatherw joined the chat room.

[10:10am] bluerasberry: that just outputs slick looking data

[10:10am] StevenW joined the chat room.

[10:10am] Moberg left the chat room. (Ping timeout: 245 seconds)

[10:10am] bluerasberry: and the users just take what they like from that and report it to their funding organization

[10:10am] bluerasberry: that's all

[10:11am] fschulenburg: yeah, that's a good point. so, results would have to be _comparable_ in order to give us a meaningful picture, right?

[10:12am] MatthewARoth joined the chat room.

[10:12am] bluerasberry: maybe not. A lot of people just like having data whether they understand it or not.

[10:12am] bluerasberry: comparisons do make it understandable

[10:12am] bluerasberry: but some people just want data as part of diligence requirements

[10:12am] J-Mo: bluerasberry I agree that 'cherry picking' is something to be avoided. One way to ameliorate that is to define ahead of time what is going to be measured, why it's important, and what the process will be for measuring it.

[10:13am] bluerasberry: that's all

[10:13am] Ainali joined the chat room.

[10:14am] fschulenburg: from the people who are here – who's done some kind of program evaluation and can you tell me more about what you did and how you did it?

[10:14am] bluerasberry: I have... it was rough for me

[10:14am] bluerasberry: There were about 150 Wikipedia articles of interest to me

[10:15am] bluerasberry: I wanted to know the number of pageviews they got over a period of time

[10:15am] bluerasberry: I counted those by manually taking numbers from grok

[10:15am] bluerasberry: and adding them

[10:15am] ragesoss: I define program evaluation as "Systematic evaluation of how successful programs were and whether they were worth the investment."

[10:15am] Nemo_bis: bluerasberry> I define program evaluation as being the objective metrics which are already collected by an established field.

[10:15am] Nemo_bis: I disagree, evaluation doesn't need to be objective

[10:15am] bluerasberry: so I visited every one of those articles repeatedly, because it was hard for me to set up a tool to do it automatically

[10:16am] ragesoss: certainly, program evaluation could be based on objective metrics, but it doesn't necessarily need to be.

[10:16am] bluerasberry: that's all
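
The manual tally bluerasberry describes (look up each article's pageviews, then add them up over a period) could be sketched roughly as follows. This is only an illustration: the article titles, dates, and counts are invented, and the sketch assumes the daily numbers have already been copied into a local dictionary rather than fetched from any statistics service.

```python
from datetime import date

# Hypothetical data: article title -> {day: pageviews}, as if copied by hand
# from a pageview statistics page. Titles and numbers are invented.
daily_views = {
    "Article A": {date(2013, 3, 1): 120, date(2013, 3, 2): 95},
    "Article B": {date(2013, 3, 1): 40, date(2013, 3, 2): 60, date(2013, 3, 3): 10},
}


def total_views(views_by_article, start, end):
    """Sum pageviews per article, and overall, for start <= day <= end."""
    per_article = {
        title: sum(n for day, n in days.items() if start <= day <= end)
        for title, days in views_by_article.items()
    }
    return per_article, sum(per_article.values())


# Tally the first two days of March 2013 for the tracked articles.
per_article, grand_total = total_views(
    daily_views, date(2013, 3, 1), date(2013, 3, 2)
)
print(per_article)
print(grand_total)
```

Keeping the per-article totals alongside the grand total makes it possible to spot-check individual numbers against the source before reporting the sum.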

[10:16am] fschulenburg: I would like to hear from other people as well about their experiences

[10:16am] fschulenburg: Ainali: haven't you set up metrics for the Swedish chapter?

[10:17am] fschulenburg: StevieWMUK: and how about the UK chapter – any experiences with the evaluation of your programmatic activities?

[10:18am] bobserf left the chat room. (Quit: Page closed)

[10:18am] Ainali: fschulenburg: Yes we have. Actually I find it much harder to set up good goals than to do the evaluation. Or rather the trick is in picking SMART goals

[10:18am] fschulenburg: Nicole_WMDE: And how about the German chapter?

[10:18am] K4-713 joined the chat room.

[10:18am] fschulenburg: Ainali: which programs did you evaluate? tell me more about how you set up your goals?

[10:19am] ToniSant: StevieWMUK has just stepped away from his keyboard for a couple of minutes.

[10:19am] SarahStierch joined the chat room.

[10:19am] • SarahStierch waves

[10:19am] fschulenburg: ToniSant: maybe you can share some thoughts with us…

[10:19am] tommorris: hey SarahStierch

[10:19am] sgardner joined the chat room.

[10:20am] tommorris: oh and hey sgardner

[10:20am] Ainali: For example, Wiki Loves Monuments. We had a long debate on what we as a chapter most want to achieve with it. Do we want more pictures or new users?

[10:20am] mpeel: In the UK, we've been doing some thinking about metrics - see http://uk.wikimedia.org/wiki/Project_Metrics - but otherwise I think most of our evaluation is fairly free-form at the moment, e.g. in terms of reports from activities, and the various monthly reports/newsletters.

[10:20am] ToniSant: I'm new to the UK office, so I'd rather let Stevie answer the question, if that's ok.

[10:20am] SarahStierch: hey tommorris

[10:20am] ToniSant: Ah, mpeel to the rescue!

[10:21am] ldavis: frank is taking Ainali's comment first, then mpeel's comment

[10:21am] kipcool joined the chat room.

[10:21am] mpeel: also surveys of how well people have found our events, and informal evaluation between organisers of events.

[10:21am] StevieWMUK: *is back now

[10:21am] SarahStierch: mpeel: are those surveys or metrics being published yet?

[10:22am] fschulenburg: Ainali: so, what was the result? Did you go with more pictures or with new users?

[10:22am] Ainali: We actually did some solid work on statistics: http://wikimediasverige.wordpress.com/2012/10/31/wiki-loves-monuments-2012-statistiken/

[10:22am] mpeel: SarahStierch: no, I think StevieWMUK would have to ask Daria about them.

[10:22am] fschulenburg: ah. interesting

[10:22am] fschulenburg: *click*

[10:23am] SarahStierch: That's one of the biggest challenges with WLM, and then add in the situation about retaining new editors. It's tough. One chapter I spoke to even said they decided not to do WLM because of the "poor quality" of "most" of the images. Which was interesting…and surprising to hear.

[10:23am] Ainali: We were better at attracting new users than at reaching high upload numbers. Which is kind of sad, since we ended up with goals focusing on pictures

[10:24am] Ainali: So for our original goals WLM 2012 was a fail for us.

[10:24am] fschulenburg: Ainali: so, you looked at the statistics and that informed your goals?

[10:25am] fschulenburg: ah, now i get it. your original goal was getting more pictures, but you didn't reach it?

[10:25am] Ainali: exactly

[10:25am] fschulenburg: ok

[10:25am] StevieWMUK: As far as UK metrics go, it's very ad-hoc at the moment. It's all dependent on the activity. Something that my colleague Daria does is record as much information post-activity as she can on our wiki - attendee numbers, outcomes, contacts made, new articles / editors / volunteers

[10:26am] StevieWMUK: We haven't really been able to benchmark what a definition of success is

[10:26am] fschulenburg: mpeel: I just looked at your Project Metrics page. that looks quite impressive

[10:26am] notnarayan left the chat room.

[10:27am] fschulenburg: @both Sverige and UK: who was involved in setting up the goals and the metrics?

[10:27am] Kbavage_ joined the chat room.

[10:28am] Manuel_WMDE joined the chat room.

[10:28am] fschulenburg: Manuel_WMDE: we were just talking about who to involve when setting goals and agreeing about metrics

[10:29am] fschulenburg: Manuel_WMDE: what's your take on that? how does the German chapter set goals? who's involved in that process?

[10:29am] Ainali: For WLM 2012 it was the AGM that decided on the goals, on the board's proposal

[10:29am] fschulenburg: what's AGM?

[10:29am] Ainali: Annual General Meeting

[10:30am] fschulenburg: ah. interesting

[10:30am] fschulenburg: so, you asked your members at the annual meeting?

[10:30am] Ainali: Well, sort of. It was a proposal that was agreed upon without any changes.

[10:31am] James_F|Away is now known as James_F.

[10:31am] fschulenburg: StevieWMUK: is that how you set goals in the UK?

[10:31am] StevieWMUK: We do speak to our community a lot and encourage input from volunteers

[10:32am] Ainali: This year we did not go so much into the metrics at the annual meeting. The board will now decide them in cooperation with each program manager

[10:32am] StevieWMUK: We also have regular contact with our Board, too

[10:32am] fschulenburg: StevieWMUK: does that mean you ask them to leave comments on a talk page?

[10:33am] StevieWMUK: Yes, but we also encourage email too and participation at events

[10:33am] StevieWMUK: We've also recently begun a process of carrying out surveys with members and donors too

[10:33am] StevieWMUK: Sometimes it's difficult to achieve a consensus on what we want to measure and what success looks like, which is a challenge

[10:33am] Manuel_WMDE: hi frank, hi all, at WMDE we involve the AGM, our board, CEO and staff

[10:33am] StevieWMUK: But as we gather more data these things should become clearer

[10:33am] James_F is now known as James_F|Away.

[10:34am] Manuel_WMDE: http://meta.wikimedia.org/wiki/Wikimedia_Deutschland/2013_annual_plan/en#Procedure

[10:34am] fschulenburg: Manuel_WMDE: what's your role in this?

[10:34am] Manuel_WMDE: this describes the process in more detail

[10:34am] fschulenburg: *click*

[10:35am] fschulenburg: Manuel_WMDE: are you working with the individual program managers on defining metrics?

[10:35am] Manuel_WMDE: my role was a consulting role, focussing on the impact evaluation perspective

[10:36am] andrewbogott is now known as andrewbogott_afk.

[10:36am] harshkothari joined the chat room.

[10:37am] TBloemink joined the chat room.

[10:38am] fschulenburg: @all: Ainali mentioned Wiki Loves Monuments as one of the projects that Wikimedia Sweden developed metrics for. What other programs did anyone else do evaluation work on?

[10:38am] tewwy joined the chat room.

[10:40am] Denny_WMDE1 left the chat room.

[10:40am] fschulenburg: None?

[10:41am] Bence: We have had a few article writing or article improving competitions, where the obvious measures we looked at were the number of articles created (with a little hindsight, also the number that were eventually chosen as an FA article) and the new editors involved

[10:41am] Manuel_WMDE: metrics are derived from the intended goals of a program. we achieve this by intensive talks with program managers in a complex and still ongoing process

[10:42am] fschulenburg: Bence: interesting. how did you get that number?

[10:42am] Bence: unfortunately though, it is difficult to have good data, and to know what impact one has had (e.g. we cannot tell whether the rate of article creation increased because of our contest, or whether the contest simply channeled existing activity into the specific topic)

[10:42am] tewwy left the chat room. (Client Quit)

[10:43am] fschulenburg: Bence: how did you find out how many articles got created?

[10:43am] Bence: usually manual counting and checking people's contributions to see if they were new

[10:43am] fschulenburg: ok.

[10:43am] Bence: we had a page for the contest, where the articles that were created were listed

[10:43am] Bence: so counting them was not that difficult

[10:44am] fschulenburg: interesting. thanks for sharing this with us

[10:44am] fschulenburg: Ainali: which tools did you use to find out how many photos got uploaded and how many people created user accounts?

[10:44am] Manuel_WMDE: in this context we are looking forward to tools like the User metrics API

[10:45am] Bence: +1 on the User metrics API - it seems really promising from what I saw (although I haven't found the link to the functioning version yet)

[10:45am] Ainali: fschulenburg: I'll have to get back to you on that one, it was User:Lokal_Profil who worked with the stats

[10:45am] fschulenburg: ah ok

[10:45am] Manuel_WMDE: most of our metrics should be based on qualitative indicators

[10:45am] fschulenburg: StevieWMUK: which tools does WMUK use?

[10:46am] tewwy joined the chat room.

[10:46am] Ainali: Yeah the User metrics API is much wanted

[10:47am] StevieWMUK: It depends on the project really but mostly we gather data from event participants. So, for example, Ada Lovelace Day we created a list in advance of articles we wanted to get in motion and marked them off as they were started

[10:47am] Ainali: We had already started work on a tool to follow contributions from a selected group of users. It is a volunteer working on it, so I guess he will finish it anyway

[10:47am] jorm joined the chat room.

[10:47am] StevieWMUK: As far as specific measurement tools I'd have to check as it's not an area I'm particularly active in

[10:48am] jorm left the chat room. (Remote host closed the connection)

[10:48am] fschulenburg: Ainali: is there some documentation on what this tool will do?

[10:48am] fschulenburg: StevieWMUK: thanks. that's great

[10:48am] ldavis: some information on UserMetrics API if you're not familiar with it: http://www.mediawiki.org/wiki/UserMetrics

[10:49am] fschulenburg: our developers and analysts are working hard on getting this tool deployed as soon as possible

[10:49am] Ainali: Not really. It is quite simple. It will let you select users, either based on a category or by just entering the usernames, and it will later do some nice sums of the activity
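
The behaviour Ainali outlines (pick a group of users by category or by name, then sum their activity) might look something like the sketch below. It is not the Swedish volunteer's tool and makes no claim about how that tool works; the category name, usernames, and contribution records are all hypothetical stand-ins for data that would come from a wiki.

```python
from collections import defaultdict

# Hypothetical inputs, invented for illustration only.
category_members = {"WLM 2012 participants": {"Alice", "Bob"}}
# Each record is (username, size change of the edit in bytes).
contributions = [("Alice", 350), ("Bob", -20), ("Alice", 100), ("Carol", 900)]


def select_users(usernames=(), category=None):
    """Build the cohort from explicit usernames and/or a category's members."""
    users = set(usernames)
    if category is not None:
        users |= category_members.get(category, set())
    return users


def summarize(contribs, users):
    """Count edits and sum byte changes for each selected user."""
    summary = defaultdict(lambda: {"edits": 0, "bytes": 0})
    for user, size_change in contribs:
        if user in users:
            summary[user]["edits"] += 1
            summary[user]["bytes"] += size_change
    return dict(summary)


cohort = select_users(category="WLM 2012 participants")
report = summarize(contributions, cohort)
print(report)
```

Separating cohort selection from the summing step means the same summary can be reused whether the group comes from a category or from a hand-entered list of names.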

[10:49am] StevieWMUK: We also regularly monitor things like membership figures, create attendee lists, monitor new accounts on our wiki, media coverage... lots of stuff in our monthly reports at http://uk.wikimedia.org/wiki/Reports

[10:50am] Manuel_WMDE: building sustainable and scalable structures for creating and sharing free knowledge is very important for us

[10:50am] fschulenburg: Ainali: ah, that's nice

[10:50am] Manuel_WMDE: qualitative metrics are very important in this context

[10:51am] fschulenburg: Manuel_WMDE: tell me more about what you mean by "qualitative metrics". Can you give us an example?

[10:52am] wing2 joined the chat room.

[10:52am] SarahStierch left the chat room. (Quit: Ta-ta!)

[10:53am] Bence: To one of the previous questions: A big part of our activities are focused on creating community events (e.g. meetups, camps), where it is quite difficult to come up with good ways to measure and verify their impact. (And unfortunately, we haven't really done much thinking in this area, beyond simply counting attendees.)

[10:54am] andrewbogott_afk is now known as andrewbogott.

[10:55am] Manuel_WMDE: the quantity of new authors/media/etc is only one side of the coin. their quality and the sustainability of the supporting structures are often more important. you need a qualitative perspective to evaluate this

[10:55am] fschulenburg: Manuel_WMDE: oh, yeah, that's a good point

[10:56am] ldavis: i think we need to wrap up now

[10:56am] fschulenburg: Hey, I just wanted to let you know that I really enjoyed this session

[10:56am] TBloemink is now known as TBTv.

[10:56am] James_F|Away is now known as James_F.

[10:56am] fschulenburg: It was so interesting to hear more about your ideas around evaluation in general

[10:57am] fschulenburg: and also about your efforts and success stories in this field

[10:57am] fschulenburg: I learned so much

[10:57am] fschulenburg: thank you all

[10:57am] Manuel_WMDE: it has been a pleasure

[10:57am] Nicole_WMDE: thanks!

[10:57am] fschulenburg: especially StevieWMUK, Ainali, bluerasberry, Manuel_WMDE, Bence, mpeel, ragesoss and J-Mo

[10:58am] fschulenburg: who's going to be at the chapters meeting in milan?

[10:58am] StevieWMUK: Thank you everyone!

[10:58am] bluerasberry: thanks

[10:58am] Bence: Thanks Frank for hosting this

[10:58am] StevieWMUK: Not me

[10:58am] fschulenburg: I'm planning to go this year

[10:58am] Bence: I'll be there for AffCom

[10:58am] fschulenburg: and I would be eager to continue this discussion

[10:58am] fschulenburg: Bence: great!

[10:59am] mpeel: I'll be there with my FDC hat on.

[10:59am] fschulenburg: StevieWMUK: sorry to hear

[10:59am] Ainali: Thank you! I really liked the metrics page from mpeel

[10:59am] fschulenburg: mpeel: great

[10:59am] ToniSant: Thank you. It's been interesting. Bye!

[10:59am] Ainali: I am going to the chapters meeting

[10:59am] HaiSham left the chat room. (Quit: Page closed)

[10:59am] fschulenburg: yeah, the metrics page from WMUK is great

[10:59am] fschulenburg: I didn't know it existed

[11:00am] fschulenburg: ok, thanks again, everybody. It was great talking to you. I really enjoyed it

[11:00am] fschulenburg: good bye

[11:00am] fschulenburg left the chat room. (Quit: fschulenburg)

[11:00am] mpeel: g'bye

[11:00am] ldavis: STOPPING PUBLIC LOGGING