Wikimedia monthly activities meetings/Quarterly reviews/Services/January 2015

The following are notes from the Quarterly Review meeting with the Wikimedia Foundation's Parsoid and Services teams, January 29, 2015, 11:30 - 12:00 PST.

Please keep in mind that these minutes are mostly a rough transcript of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Combined quarterly review of:

  • Parsoid
  • Services (RESTBase, Auth service, OCG, Mathoid, Citoid, etc.)

Present (in the office): Roan Kattouw, Ori Livneh, Jared Zimmerman, Marko Obrovac, Rachel diCerbo, Andrew Otto, Ellery Wulczyn, Yuri Astrakhan, Trevor Parscal, Toby Negrin, Gabriel Wicke, Subbu Sastry, Erik Möller, Marc Ordinas i Llopis, James Forrester, Rob Lanphier, Bryan Davis, C. Scott Ananian, Tilman Bayer (taking minutes), Damon Sicore, Tomasz Finc, James Douglas, Arlo Breault; participating remotely: Greg Grossmeier, Chris Steipp

Agenda:
Team intro - 1-2 minutes
Parsoid - 10-12 mins
Services - 10-12 mins
Discussion - 5-10 mins

Presentation slides from the meeting

Welcome, team intro

Gabriel:
welcome

[slide 3]

Team:
James and Marko joined; previously the Services team had been just Gabriel

Parsoid

What we said + What we did

Subbu: ...

[slide 6]

main goal: work towards supporting VisualEditor; after that, Parsoid HTML views

[slide 7]

CSS customization: done, needs some additional work before deploy

[slide 8]

improved on <nowiki> issues, got a lot of community complaints there

[slide 9]

What we learned

[slide 11]

CSS customization of Cite extension not controversial among devs

Metrics and key accomplishments

[slide 13]

What's next

[slide 15]

[slide 16]

stable ids: stalled, but picking up again this quarter

[slide 17]

Asks

[slide 19]

stretched thin, need help from either other teams or new hires
did a lot of CSS work within team, but we are not experts
ErikM: ...
Damon: is this about CSS rendering?
yes
Trevor: also needs to be familiar with how Wikipedia uses CSS
CScott: ...
Damon: OK, was wondering if this is about special engineering skillset
Gabriel, Roan: not rocket science
Ori: could farm this out to mobile team, they have lots of CSS experience
CScott: a lot of false positives in visual regression tests; tweaking the CSS to get rid of these would help a lot
secondly, check our output is consistent with mobile
Damon: (question about long tail)
CScott: don't necessarily aim to reduce long tail to 0, but remove as much as possible that creates issues for testing
ErikM: map out the skillset you need, and Damon and I will look at it
Damon: designate VE blockers
does successful VE launch have Parsoid blockers? yes
Damon: are these designated clearly? yes. See VE Q3 and VE Q3-stretch / Q4 columns on https://phabricator.wikimedia.org/project/board/487/

Services

Gabriel:

What we said & what we did

[slide 22]

Ops told us hardware not yet ready for RESTBase v1 deploy in Q2
So we used that time for frontend improvements instead

[slide 23]

Tooling, infrastructure: long term thing
depend on RelEng and Ops
Damon: objective?
Gabriel: make it easy to deploy stuff
and e.g. that someone who develops a new service knows what to do
right now, it's one-off, pinging people on IRC etc.
Toby: come up with...
CScott: have 4-5 deploys so far
RESTBase, Parsoid, Mathoid, Citoid
so now is a good time to [document/specific workflows]

[slide 25]

(Gabriel:)
"RESTBase is like Varnish, but with storage, and richer interaction with the backend"
Damon: so it's like a cache?
Roan, Gabriel: yes, basically a cache that never expires
ErikM: what is actually going to become available in February?
Gabriel: content API (HTML + metadata for each current revision and those requested)
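
The "cache that never expires" description above, together with the on-demand generation of older revisions mentioned later in the meeting, can be illustrated with a minimal read-through store. This is only a sketch: `PersistentContentStore`, `render_backend`, `fake_render`, and the `(title, revision)` key are hypothetical names chosen for illustration, not RESTBase's actual API or storage schema.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, int]  # (page title, revision id); hypothetical key shape


class PersistentContentStore:
    """Read-through store: entries are kept indefinitely, never expired."""

    def __init__(self, render_backend: Callable[[str, int], str]) -> None:
        # render_backend stands in for a call to a backend such as Parsoid.
        self._render_backend = render_backend
        self._storage: Dict[Key, str] = {}
        self.backend_calls = 0

    def get_html(self, title: str, revision: int) -> str:
        key = (title, revision)
        if key not in self._storage:
            # Miss: generate on demand, then keep the result permanently.
            self.backend_calls += 1
            self._storage[key] = self._render_backend(title, revision)
        return self._storage[key]


def fake_render(title: str, revision: int) -> str:
    # Placeholder backend used only for this example.
    return f"<html><body>{title} @ r{revision}</body></html>"


store = PersistentContentStore(fake_render)
first = store.get_html("Example", 1)   # miss: rendered by the backend
second = store.get_html("Example", 1)  # hit: served from storage
```

Unlike a conventional cache, nothing here is evicted or given a time-to-live; a revision that was rendered once is simply read back from storage afterwards.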

What we learned

[slide 27]

collaboration with Ops, Sean very interested, but has lots of other things on his hands
Dev Summit has cleared the air a bit re SOA
ErikM: consensus is basically about moving forward with new services, not necessarily about converting existing code

[slide 28]

(Gabriel:)
RESTBase as of now reduces compressed enwiki Parsoid HTML from 160G to 100G by storing data-parsoid on the server
can improve things further on template-heavy pages like Obama: from 3.5MB to current mobile HTML size - 950kb (https://phabricator.wikimedia.org/T78676)
microcontributions should be really fast (ideas at https://phabricator.wikimedia.org/T87556)
HTML rewrite needs for apps: e.g. they want to move infobox around
Demand from third party users
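
For orientation, the size figures quoted above can be expressed as relative savings. The calculation below is a rough sketch that assumes decimal units (1 MB = 1000 kB) and reads "950kb" as 950 kilobytes; `percent_reduction` is an illustrative helper. Note that the second figure is the potential improvement mentioned for template-heavy pages, not a measured result.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative size reduction in percent."""
    return (before - after) / before * 100.0


# Compressed enwiki Parsoid HTML: 160 GB -> 100 GB
enwiki_saving = percent_reduction(160.0, 100.0)

# Obama page: 3.5 MB -> 950 kB (assumes 1 MB = 1000 kB)
obama_saving = percent_reduction(3500.0, 950.0)

print(f"enwiki storage: {enwiki_saving:.1f}% smaller")
print(f"Obama page: {obama_saving:.1f}% smaller")
```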

Metrics & other key accomplishments

[slide 30]

RESTBase latency
Toby: this is out of cache probably?
Damon: this looks pretty good

[slide 31]

Gabriel: test coverage
Damon: I like that

What's next

(Gabriel:)

[slide 33]

[slide 34]

pretty much ready for mid-Feb deploy, waiting for hardware
section editing can also speed up VE by cutting down save POST to edited section(s) only & serializing only that section instead of entire DOM
Ori: it's still a bit further away, need to address other tasks first
Roan: (wary about implementing new methods)
Gabriel: section editing *API* has dual benefit of enabling micro-contribution experimentation on mobile & faster editing in VE
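
The section-editing idea discussed above (sending only the edited sections in the save request rather than the whole document) could look roughly like the following. This is an illustrative sketch only: the section ids, the dictionary representation, and `build_save_payload` are assumptions made for the example, not the actual VisualEditor, Parsoid, or RESTBase interface.

```python
from typing import Dict


def build_save_payload(original: Dict[str, str], edited: Dict[str, str]) -> Dict[str, str]:
    """Return only the sections whose HTML is new or differs from the original."""
    return {
        section_id: html
        for section_id, html in edited.items()
        if original.get(section_id) != html
    }


original = {
    "s0": "<p>Intro</p>",
    "s1": "<p>History</p>",
    "s2": "<p>References</p>",
}
edited = dict(original, s1="<p>History, revised</p>")

# Only the changed section ends up in the payload; s0 and s2 are omitted.
payload = build_save_payload(original, edited)
```

The sketch ignores deleted or reordered sections, which a real API would need to represent explicitly.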

Asks

(Gabriel:)
need more bridging between Ops and dev. had to turn candidates with strong DevOps skills away
Ori: one takeaway from this quarter: should not discount internal architecture expertise, see example of (Cassandra?) RfC re databases
Gabriel: disagree about that example
Ori: more involvement of Ops/architecture input from beginning would mean less blockage now. There is an NIH syndrome, rejecting existing solutions
Gabriel: actually Ops people like Sean agree with [move to Cassandra], just don't have time
Ori: fear we will have same issue with monitoring
ErikM: don't entirely agree with Ori ...
(more general discussion about architecture leadership structure)
ErikM: distribution..?
Greg: need consistent release (learn our lessons from MW releases, the more different they are from what WMF uses, the less they are supported), e.g. does releasing by images make sense? This is tied in with how we do deployments for services (containers?); complicated
ErikM: (question about VE blocker re template editing)
JamesF, Subbu: not a blocker
Roan, Erik: ...
Ori: as the person responsible for VE performance, I am willing to rely on RESTBase being available, if you are confident. But could also look into alternatives
ErikM: by end of Feb, all revs available?
Gabriel: not all in storage right away (old ones will be generated on demand), but all since the turn-on date
have resources, 6TB replicated storage provisioned (18T total, 3-way replication)
Roan: on Labs currently?
Gabriel: actually already in production, but only on three 250G boxes
Roan: cool, then I propose to start VE testing right now
Ori: worry about Services team not meeting deadline through no fault of their own, due to external factors
Damon: that's a normal situation actually
Gabriel: in worst case you can start testing using the current test boxes
Ori: OK