Wikipedia:Wikipedia Signpost/2012-10-08/Technology report

Technology report

The ups and downs of September and October, plus extension code review analysis

September engineering report published

In September:
  • 98 unique committers contributed patchsets of code to MediaWiki (up 1 on August).
  • The total number of unresolved commits went from about 360 to 440.
  • About 35 shell requests were processed (no change).
  • 41 developers received developer access to Git and Wikimedia Labs (up 16).
  • Wikimedia Labs now hosts 131 projects (up 11), has seen 241 instances created (up 27) and has 633 registered users (up 46).

—Adapted from Engineering metrics, Wikimedia blog

The Wikimedia Foundation's engineering report for September 2012 was published this week on the Wikimedia Techblog and on the MediaWiki wiki, giving an overview of all Foundation-sponsored technical operations in that month (as well as brief coverage of progress on Wikimedia Deutschland's Wikidata project, phase 1 of which is edging its way towards its first deployment). Three of the seven headline items in the report have already been covered in the Signpost: problems with the corruption of several Gerrit (code) repositories, the introduction of widespread translation memory across Wikimedia wikis, and the launch of the "Page Curation" tool on the English Wikipedia, with development work on that project now winding down. The report also drew attention to the end of Google Summer of Code 2012, the deployment to the English Wikipedia of a new ePUB (electronic book) export feature, and improvements to the Wiki Loves Monuments app aimed at more serious photographers.

It was also a strong month for the Labs and Database Download projects (as detailed in previous editions of the Signpost), as well as for the "Micro Design" team, which seems to have received something of an ad-hoc consensus for its changes to the English Wikipedia's edit window (albeit with the promise of future fixes). OpenPath, external contractors working on a J2ME app (designed for low-end mobile devices), presented their final app during the month; it was verified as working properly and is now awaiting improvements to its memory footprint before release.

By contrast, one of the big disappointments of the month was the unexpected difficulty in "closing out" the long-proposed switch of primary datacentre from the WMF facility in Tampa, Florida, to its base in Ashburn, Virginia. Detailed in both the report and a followup post on the wikitech-l mailing list, that move, with its promise of better stability and expansion capability, is now scheduled for the new year. Progress on the VisualEditor (VE) was also strictly non-visual in September (though it is worth noting for the benefit of regular readers that the current schedule puts the first VE deployment only in June 2013 in any case). On the other hand, a deployment of the TimedMediaHandler is expected "soon", notes the report.

Extension code review stable

CheckUser is one example of a WMF-deployed extension

Following on from the recent report on code review times for changes to core MediaWiki code, The Signpost can this week publish its own figures for the code review state of the many MediaWiki extensions in use on both Wikimedia wikis and elsewhere. The figures for extensions are, however, of (relatively speaking) inferior quality to those for core: approximately half of all extensions are not yet in Gerrit, some were late joiners (skewing the time series statistics), and some of those that are on Gerrit choose not to use its code review system on one or more branches. It is also necessary to exclude certain sets of "mass edits", although they do not greatly affect the aggregate figures in any case.

Despite these difficulties, it is still possible to gain a sense of how extensions are faring in a post-Gerrit world (correct as of 19 September). The headline figures, in particular, seem strong: the median patchset (of the 4823 sampled) waits 2 hours 30 minutes for a first review and 95% are reviewed within a week. Of those two figures, the median was stable across May, June, July and August, with the 95th percentile improving significantly over the period.
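For readers unfamiliar with the two headline statistics, the following is a minimal sketch of how a median and a 95th percentile of first-review waiting times might be computed. It is not the method used to produce the figures above, which this report does not describe. The function names, the nearest-rank percentile definition and the sample waiting times are all assumptions chosen for illustration.

```python
import math
from datetime import timedelta
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct% of the data is at or below it (one of several common definitions)."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise_waits(waits):
    """Summarise a list of timedelta waits for a first review."""
    hours = [w.total_seconds() / 3600 for w in waits]
    one_week = 7 * 24  # hours in a week
    return {
        "median_hours": median(hours),
        "p95_hours": percentile(hours, 95),
        "share_within_week": sum(h <= one_week for h in hours) / len(hours),
    }


# Hypothetical waiting times in hours; NOT the sampled patchset data.
sample_waits = [
    timedelta(hours=h) for h in (0.5, 1, 2, 2.5, 3, 6, 12, 30, 100, 400)
]
summary = summarise_waits(sample_waits)
print(summary)
```

The median is insensitive to a handful of patchsets that wait a very long time, whereas the 95th percentile is driven by exactly those cases. That is one reason the two figures can move differently over the same period.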

There is a large disparity between WMF-authored and volunteer-authored code: the latter waits three times longer on average for its first review (whether or not the sample is limited to WMF-deployed extensions). Even so, patches for WMF-deployed extensions wait longer for their first review than patches to those extensions which have not been deployed to any WMF wikis. Such evidence lends tentative support to the notion that individuals are put off from reviewing WMF-deployed code for fear of giving an incorrect judgement. Naturally, there are many possible confounding variables to consider, such as the amount of new code included in every commit or the quality of the reviews themselves, all of which prevent a more insightful analysis.

In related news, discussions continue about refocussing WMF "20%" time from direct code review to skill sharing, the impact of which is expected to be overwhelmingly negative on all short-term indicators. That initiative is expected to focus on extensions with few active maintainers, contributors to which often struggle to find a proper reviewer.

In brief

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.

  • Partial outage: Wikimedia wikis suffered a partial outage on Sunday UTC, with readers unable to load some pages and editors in some cases unable to load any (wikitech-l mailing list). The problems were blamed on a problematic "fallback" image server which is already scheduled for imminent replacement (as of the time of writing). In related news, the software on one of the servers responsible for rescaling images has been updated, suggesting that a more widespread upgrade (which typically improves image rendering times and the display of SVG graphics) could be on its way (also wikitech-l).
  • Skin list trim proposed: Moves to reduce the number of skins available for selection by users continued this week (gerrit changeset #25170). Although skins have rarely been removed from the list in the past, the increasingly outdated formatting of the older skins has caused continual problems for developers trying to accommodate all of them. "Half the time we add a feature or extension that happens to add a link somewhere we end up with a 'This doesn't work on legacy skins.' bug," explained Daniel Friesen, listing the Nostalgia, Standard, and CologneBlue skins as all needing either a significant rewrite or complete removal.
  • GitHub replication, read-only: MediaWiki code is now being replicated to the popular Git-based source code hosting service GitHub (wikitech-l). Replication allows developers familiar with GitHub to get easier access to an up-to-date copy of MediaWiki code, which they can then "fork" and develop themselves. At the moment, there is no easy way for them to contribute back to the central code review system, Gerrit, effectively rendering the service read-only (manual copy-and-pastes excepted).