Wikimedia engineering report, March 2013

Engineering metrics in March:

  • Approximately 113 unique committers contributed patchsets of code to MediaWiki.
  • The total number of unresolved commits went from about 830 to about 816.
  • About 48 shell requests were processed.
  • Wikimedia Labs now hosts 154 projects and 1,103 users; to date 1,641 instances have been created.

Major news in March include:

Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.

Upcoming events

There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.

For a more complete and up-to-date list, check out the Project:Calendar.

Date | Type | Event | Contact
31 March 2013–7 April 2013 | IRL (physical) meet-ups and conferences | Offline Hackathon 2013 (Paris, France) |
1 April 2013–7 April 2013 | Fresh bugs | QA: Fresh bugs triage: Skin and page rendering, with a dedicated bugday on 2 April | AKlapper, Valeriej
6 April 2013 | Features testing | QA: Weekend Testing Americas focuses on the new Account creation user experience | Steven Walling, Cmcmahon, Quim Gil
6 April 2013 | IRL (physical) meet-ups and conferences | Iconathon with The Noun Project (San Francisco, CA, USA) | VBamba (WMF), MAssaf
10 April 2013 | Online meetings | Language Engineering team office hour in #wikimedia-office | Runa Bhattacharjee
11 April 2013 | IRL (physical) meet-ups and conferences | Meetup: GSoC and other open source internship programs (San Francisco, CA, USA) | Quim Gil
14 April 2013 | IRL (physical) meet-ups and conferences | THATCamp London 2013 and Europeana hackathon (London, UK) | Wikimedia UK
15 April 2013 | Old bugs | QA: Old bugs triage: Bug reports without updates for 18 months | AKlapper
18 April 2013 | Online meetings | Tech Talk: 3 tech projects receiving Wikimedia grants | Quim Gil
20 April 2013 | IRL (physical) meet-ups and conferences | Last day to register for the May Amsterdam hackathon and request travel subsidy | Multichill
24 April 2013 | Old bugs | QA: Language Engineering team monthly bug triage on #mediawiki-i18n: Translate UX (TUX) bugs | Runa Bhattacharjee
30 April 2013 | IRL (physical) meet-ups and conferences | Last day to submit a talk for Wikimania in August | Wikimania organizing team

Personnel

Work with us

Are you looking to work for Wikimedia? We are hiring for many positions, and we really love talking to active community members about these roles.

In Engineering:

We have seven additional openings in other Foundation departments.

Announcements

Two new full-time employees started in WMF engineering in March:

  • Yuri Astrakhan, Senior Software Engineer in the Mobile group (announcement).
  • Adam Baso, Senior Software Engineer, Mobile (Engineering) (announcement).

Technical Operations

Site infrastructure

This month we saw a few short site glitches that lasted from about a minute to ten minutes each. The outages did not noticeably affect readers, but editors and contributors experienced intermittent problems.
  • The first incident was triggered by a deployment of Article Feedback Tool v5, and once the code was reverted, the site outage ended. The incident lasted for about 10 minutes (incident documentation).
  • The other two were jobqueue-related, according to Asher Feldman. The current MySQL jobqueue implementation is far too costly. In analyzing the data during that 24-hour period, we see that 75% of all queries that take over 450ms to run on the English Wikipedia master are related to the jobqueue, and all major actions result in replicated writes. In fact, the jobqueue takes 58% of all query execution time when the analysis is not limited to queries over the slow threshold. If 1 million refresh-links jobs are queued as quickly as possible without paying attention to replication lag, the resulting lag causes the Apache servers to time out. MediaWiki depends on reading from slaves to scale, and avoids lagged ones. If all slaves are lagged, the master is used for everything, and if this happens to English Wikipedia, the site falls over. This MySQL jobqueue was identified as a scaling bottleneck a while ago, and thus we will be switching to Redis very soon. We're currently aiming for that switch to coincide with the release of 1.22wmf1, but we may be able to backport to 1.21wmf12 and get this done in early April.
  • On March 12, we experienced an Esams site outage which was probably caused by packet loss between Esams and Eqiad. Leslie changed routes from Esams to Eqiad to fix the packet loss, which caused Esams to recover. While we still don't clearly understand what caused the outage, we did notice that it coincided with the news of the new Pope's election, which triggered a surge in traffic to our web properties.
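The failure mode in the jobqueue incidents above (reads avoid lagged slaves, and fall back to the master when every slave is lagged) can be illustrated with a minimal sketch. This is not MediaWiki's load balancer: the `Replica` type, the `pick_read_server` function, the server names and the 5-second threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical threshold; the report does not state the lag limit actually used.
MAX_LAG_SECONDS = 5.0


@dataclass(frozen=True)
class Replica:
    name: str
    lag_seconds: float


def pick_read_server(master: str, replicas: Sequence[Replica],
                     max_lag: float = MAX_LAG_SECONDS) -> str:
    """Return the least-lagged usable replica, or the master if none is usable."""
    usable = [r for r in replicas if r.lag_seconds <= max_lag]
    if not usable:
        # Every replica is lagged: all reads now land on the master,
        # which is the overload scenario described in the incident.
        return master
    return min(usable, key=lambda r: r.lag_seconds).name
```

With one healthy replica the read goes there; once a burst of writes pushes every replica past the threshold, the same call returns the master for every request, which is why unthrottled job insertion can take the site down.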
In March, we had a short security sprint led by Leslie Carr. We patched servers that needed security upgrades. In addition, we continued to work on MariaDB migration, Ceph deployment and fixing Varnish bugs.
TechOps has initiated a fortnightly meeting with the engineering teams to drive alignment amongst the various engineering projects and TechOps regarding requirements and expectations. This is also the process to surface potential deployment issues (such as capacity demand, new infrastructure and performance). Meeting minutes are documented on the meeting Etherpad.

Fundraising

Added logging to fundraising deployment scripts.

Data Dumps

Work is continuing on tools for import. Setting up a local copy of a wiki which includes only a subset of the page content has always been problematic, since this requires use of the notoriously slow and finicky importDump.php maintenance script. Under development is a tool to filter the currently produced SQL table dumps against a list of page IDs of a content subset; these tables could then be imported into a MySQL database, along with tables produced from the content subset, bypassing the need for importDump.php. Additionally, these SQL files could be shared with other users who are interested in the same content subset. We hope to be able to launch this in April.
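The filtering idea can be sketched roughly as follows. This is not the tool under development: the function names are hypothetical, the sketch assumes that the page ID is the first column of each row (which depends on the table), and it only handles backslash-escaped single quotes inside string values.

```python
from typing import Iterator, Set


def split_rows(values_sql: str) -> Iterator[str]:
    """Yield each "(...)" tuple from the VALUES part of an INSERT statement.

    Single-quoted strings and backslash escapes are tracked so that commas
    and parentheses inside page titles do not split a row.
    """
    depth = 0
    in_string = False
    escaped = False
    start = 0
    for i, ch in enumerate(values_sql):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == "'":
                in_string = False
        elif ch == "'":
            in_string = True
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                yield values_sql[start:i + 1]


def filter_insert(statement: str, keep_ids: Set[int]) -> str:
    """Keep only the rows whose first column (assumed page ID) is in keep_ids."""
    head, _, values = statement.partition(" VALUES ")
    kept = [row for row in split_rows(values)
            if int(row[1:-1].split(",", 1)[0]) in keep_ids]
    if not kept:
        return ""  # nothing from this statement belongs to the subset
    return head + " VALUES " + ",".join(kept) + ";"
```

Applied to each INSERT statement of a table dump with the page IDs of the content subset, this yields a smaller SQL file that can be loaded directly into MySQL.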

Wikimedia Labs

A schedule has been posted for LabsDB. Some glusterfs stability work has been done, and we've begun work on a replacement for project storage. A new feature has been added in support of tool labs: service users and groups, which can be managed per project. A number of interface fixes were made, such as adding the admin list to the project page and improving the layout of the instance pages. Network changes were made to the instances and network hosts: the network node has 3 bonded 1GB NICs and all instances were changed to use the virtio network driver, which increases their speed to the speed of the host. Work on tool labs progressed well this month; most of the necessary infrastructure is now available for many tools and bots.

Features Engineering

Editor retention: Editing tools

VisualEditor

In March, the team worked on the major new features that will be added in the coming months. The objective is for VisualEditor to be the default editor for all users, capable of letting them edit the majority of content without needing to use the wikitext editor, in July 2013. This will mean adding support for references, (at least) basic templates, categories and images, each of which is a very large piece of work. This month the primary focus was editing of categories and templates, with draft designs created and initial code developed. The team undertook its first ever "Quarterly Review", whose slides detail these designs, the work done to date and expectations for the near future. The alpha version of VisualEditor on mediawiki.org and the English Wikipedia was updated twice (1.21-wmf11 and -wmf12), adding better input and selection support, fixing a number of bugs, and restructuring the back-end so that the new features will be simpler to create.

Parsoid

In March, the Parsoid team continued with improvements to internationalization, serialization, and extension handling.

The parser test framework now supports language-specific tests, which required support for loading language-specific default settings in Parsoid.

The serializer is now fully DOM-based and uses constraint-based newline / white-space separator handling, which will make the serializer less sensitive to newlines and whitespace in HTML. Round-trip test results of 82% (pages without any diffs) and 98% (pages without semantic diffs) indicate that the new serializer is on par with the old serializer currently deployed in production.

Extension content is now parsed all the way to DOM, which enforces proper nesting. The generic support for balanced fragment parsing will later also be applied to templates. Parsing of transclusion directives (includeonly and friends) has also been improved and simplified.

The DOM specification for images and templated / extension content was fleshed out in preparation for full editing support.

Late in March, C. Scott Ananian joined us as a contractor. Welcome!

Editor engagement features

Echo (Notifications)

In March, the team continued to deploy new features for the Notifications project (code-named Echo) on mediawiki.org. Ryan Kaldari and Fabrice Florin created a new Thanks notification that lets you express your gratitude to users who make constructive edits by notifying them that they have been thanked (this feature was designed to give positive feedback to new editors during their first steps on Wikipedia). Benny Situ built the User rights notification, which is sent when your user rights are changed (this feature was requested by power users on the English Wikipedia). Luke Welling developed new code to send HTML email notifications, based on designs from Vibha Bamba. Fabrice Florin led discussions about these new features to serve the needs of both new and current users, then updated their feature requirements; he also co-wrote this metrics plan with Dario Taraborelli, as well as a socialization plan and new project pages with Oliver Keyes. We are now completing these final features and are aiming for a first release on the English Wikipedia later this month; in the meantime, you can help us test the current version on mediawiki.org. To learn more, read this project update on the Wikimedia blog.

Article feedback

In March, we released an updated version of Article Feedback v5 (AFT5) on the French and German Wikipedia, for evaluation by their communities. Developer Matthias Mullie completed final features for this release, including a new feedback link, auto-archive and a tool that lets you discuss useful feedback on article talk pages, based on designs from Pau Giner and suggestions from Oliver Keyes. Product manager Fabrice Florin worked with Denis Barthel and Sebastian Peisker on the German release and Benoît Evelin on the French release, and we are very grateful to them and many other community members for their invaluable contributions. A community vote is expected in May on the German Wikipedia and in October on the French Wikipedia, when each community will decide whether or not to deploy the tool across its entire site. Due to data caching issues, the tool was temporarily turned off on the English Wikipedia, where we expect to re-deploy it on an opt-in basis as soon as practical, as described on this talk page. After the English Wikipedia re-deployment in April, we plan to monitor community feedback on all three pilots before making any more updates, but other projects interested in the tool are invited to read this 2013 release plan and contact us if they would like Article Feedback on their sites later this year.

Flow Portal/Project information

Design work continued on Flow. We continued creating a "Portal" that will encourage discussion about Flow at three locations (mediawiki.org, meta, and the English Wikipedia), and carried out research.

Editor engagement experiments

In March, the Editor Engagement Experiments team largely placed other projects, such as guided tours and EventLogging, on hold to focus on two key initiatives: the "Getting Started" process for onboarding new Wikipedians, and making the redesign of account creation and login a permanent, internationalized part of MediaWiki core.

For the Getting Started project, the team launched a new version on English Wikipedia, which included a new landing page with additional types of tasks suggested for brand new editors to try. The list of tasks is now generated by a basic recommender system built by Ori Livneh, which gathers, filters, and delivers a fresh list of tasks automatically for every editor. This new backend paves the way for releasing the "getting started" feature on other projects, after we've completed data analysis and testing to understand which kinds of tasks are ideal for first-time editors. Additionally, Matt Flaschen collaborated with the Editor Engagement Features team to build notifications to welcome new editors and invite them to contribute via Getting Started.

For the account creation and login work, S Page, Munaf Assaf, and the rest of the team rebuilt our design to work with MediaWiki core, and solicited reviews from outside the team. We currently plan to launch both interface redesigns on an opt-in basis in April, to have editors test the localization and other functional aspects of the forms via a URL parameter, before we enable them as default.

Support

2012 Wikimedia fundraiser

In March, we wrapped up our 2012/13 non-English international fundraising efforts, making approximately 5 million USD over the course of the month. Originally, we had planned to run the non-English international fundraiser continuously until June, but were forced to accelerate our plans due to some potential instability at the beginning of April with one of our crucial payment gateways. At the very end of March, we started publishing aggregate public fundraising data to samarium.wikimedia.org.

Language engineering

Language tools

Highlights for this month's team progress include:

Milkshake

The language engineering team continued adding more input methods and web fonts contributions to jQuery.ime and jQuery.webfonts (Milkshake components). UX designer Pau Giner iterated with Howie Fung and Erik Möller to incorporate UX changes to handle logged-in use cases for the Universal Language Selector (jQuery.uls). ULS deployment is targeted for this fiscal year (by the end of June 2013).

Language community outreach:

The Language Engineering team kickstarted its Language Support Maven plan for getting language tools feedback from Wikimedian community members who are using internationalisation and localisation tools developed by the team. The team also held its regular monthly office hours in March. The team's outreach coordinator also reported team progress with multiple blog posts on the technology blog. The team plans to restore its bug triage sessions, starting in April 2013.

Mobile

Apps/Commons

The initial version of the Commons photo uploader app for Android is available for download on Google Play. The iOS version is still in beta, but should be available in the store next month.

Wikipedia Zero

In March we added new telecom partners (such as Axiata Group Berhad), fixed some bugs, and brought new staff online. We also won an SXSW Interactive "Activism" award for Wikipedia Zero. In April we aim to start improving the code and the IP detection.

OpenStreetMap

Max Semenik, Arthur Richards and Faidon Liambotis held an OSM mini-hackathon at Open Source Days 2013 in Copenhagen. During the event, they agreed on an implementation strategy for the WMF mapping cluster.

Mobile design/Uploads

In March, we added the ability to easily upload a lead image to articles that lack one in the stable version of the mobile site. We also deployed a workaround for an issue we discovered with heightened security features in some newer browsers that make logging in to all the projects via CentralAuth impossible in certain circumstances; that had prevented a number of users from being able to upload photos via the mobile site. We are now well on our way to reaching our goal of 1000 unique uploaders/month by the end of the fiscal year. Check out the mobile app dashboard to see mobile contributions via the website and via apps. Also of note: we've added thumbnails of lead images from articles in the mobile watchlist view, as well as a "last modified" timestamp on articles in the stable version of the mobile site. We are currently focusing on some performance enhancements for the mobile site. In April we will graduate the "uploads dashboard" feature from beta to stable, will further refine our photo upload features, and will expose a feature to identify articles on subjects near your current location to the beta version of the site.

Platform Engineering

MediaWiki Core

MediaWiki 1.21/Roadmap

1.21wmf11 and 1.21wmf12 were deployed to Wikimedia sites in March. We created the REL1_21 branch on March 25, with the goal of a release candidate at the end of the month.

Lua scripting

We launched Lua scripting on all wikis, wrote about the launch's wider significance, and held IRC office hours. In March we also added frame:callParserFunction() and frame:extensionTag(), improved CPU time accounting, and allowed argument expansion to be excluded. We have patches outstanding for "text" module including unstrip functionality, as well as improved debug output. We've also made significant improvements to templates since the launch.

Auth systems

This activity kicked off on March 26. We're planning a minimal OpenID implementation and an OAuth implementation in the coming months. The very tentative target date is the end of May.

Search

Search was deployed to the Beta Cluster. The search code was instrumented for better troubleshooting and identification of issues, and work is underway to add PoolCounter support. The plan for April is to make search updates more robust.

Wikidata deployment

Phase II deployment will complete on March 27 to several wikis, including Italian, Hebrew and Hungarian Wikipedias. This phase includes a new Lua interface to Wikidata so as to make infobox population from Wikidata possible. The Wikidata section of the monthly report has more details.

Multimedia

Mostly bugfixing this month, as well as hiring for the two multimedia positions. Jan Gerber finished work on an API to rotate images (bug 33186), which needs a little site configuration work to get deployed. Transcoded videos have been moved to their own container in the Swift filesystem in anticipation of video-specific optimizations (bug 43343). Jan also improved the user interface in cases where an embedded media player is too small to display credits and player controls. TimedMediaHandler was extended so that other MediaWiki extensions can render player elements using PHP. The Score extension can now use TimedMediaHandler or OggHandler to render the audio player (see bug 43388), which puts the Score extension one step closer to deployment and music staves one step closer to generation and display in wiki pages (see bug 189).

Site performance and architecture

JobQueueRedis was merged, and JobQueueAggregatorRedis was merged and deployed. These improvements to the jobqueue should help site performance (see the Technical Operations section of the monthly report).

Admin tools development

We've implemented improved support for blocking users coming in through proxies with GlobalBlocking. Work continues on identifying accounts that have not been merged with Single User Login, with the goal of merging those accounts starting in April.

Security auditing and response

The fundraising code base review is done. A MediaWiki security release, 1.20.3, was published on March 4. A review of the User Metrics API is underway.

Quality assurance

QA

Beta cluster

"Phase 1" support on beta for Mobile is complete and Mobile is using the beta cluster for testing now. We added search to beta, including Mobile. Lucene instances have been set up to provide search suggestion and ... search capabilities, but it's a rough base which still needs to be improved. More automated tests are now targeting beta cluster, and targeting the test2wiki/production cluster is underway. Jenkins is now upgrading the database schemas on an hourly basis and deploying changes to the MediaWiki configuration just after they have been merged. If you are curious, have a look at the Jenkins dashboard for the beta project.

Continuous integration

Timo Tijhof implemented a Jenkins job to run the QUnit JavaScript tests for MediaWiki core. That will definitely help us catch most of the JavaScript issues.

The continuous integration site has been moved from integration.mediawiki.org to integration.wikimedia.org and is now always on HTTPS. The index page has been rewritten based on Twitter Bootstrap (see integration.wikimedia.org).

Antoine Musso has given our Zuul status page an overhaul. It features live reloading through Ajax and contains direct links to the Gerrit changesets and Jenkins jobs. This is a big improvement over the plain text version.

Antoine Musso and Timo Tijhof set up the new doc.wikimedia.org portal. The MediaWiki core (Doxygen-generated) PHP documentation has been moved here (svn.wikimedia.org/doc is now a redirect). We're currently working on packaging JSDuck and writing Jenkins jobs to generate JavaScript documentation with it.

We've packaged various Python modules for the Debian project, which will in turn let us simplify deployment. Meanwhile, we're experimenting with having our Debian/Ubuntu packages built by Jenkins directly.

This month we've continued to extend Jenkins coverage for Gerrit repositories. We're happy to announce that almost all repositories for MediaWiki extensions in Gerrit now have Jenkins integration.

QA/Browser testing

Analytics

Visualization, Reporting & Applications

In order to support mobile initiatives (including the Mobile Website, Mobile Apps, and Wikipedia Zero), we focused on providing data extracts and visualizations for them. New visualizations include the Mobile app dashboard.
In addition, we updated the report card for the March Metrics Meeting, improved the robustness of the reportcard infrastructure, added target bars and added links to the metric definitions.

Wikistats

We are currently working on a new mobile pageview report.

Services & Access Points

In March, we saw the launch of the User Metrics API, a service that allows researchers to perform cohort analysis on various data sets, making it easier to measure the effects of programs and platform experiments among discrete sets of users. We are currently working on improving the web-based user interface to make it available for use outside of Wikimedia Foundation staff in the coming months.

Analytics Infrastructure

Our big-data cluster, known as Kraken, has undergone no major changes in capability, but we have been working to make it more robust and improve security. Our udp2log monitoring has become more accurate, and Limn can be installed on both production and Labs instances.

Misc: Defects Closed

Fixed the "Space characters in pagecounts-raw titles" bug.

Misc: Management & Communication

The Analytics team has started to use Mingle to manage its work more effectively day-to-day. Bugzilla remains our primary interface for managing defects with respect to communicating their priority and status.
Finally, we had our Analytics Reboot meeting, where all internal WMF Analytics stakeholders convened and we surveyed what customer opportunities were out there, what Analytics models are currently available, and how to improve inter-team communication.

Engineering community team

Bug management

In Bugzilla, a way to mark bugfixes to copy from the development branch to stable branches was introduced, to make it easier to identify important bugfixes to include in tarball releases.

Two bugdays took place as part of the QA Weekly Goals: cleaning up and retesting General MediaWiki reports and a bugday concentrating on the LiquidThreads extension. For the latter, 76 out of 218 open reports received updates. Valerie analyzed which important Wikimedia feedback channels link to each other and Bugzilla, and created a diagram of the current situation. Valerie also published two blogposts explaining how to create a good first bug report and how to help Wikimedia squash software bugs. Andre improved the Bugzilla Weekly Report email to the wikitech mailing list. On most open bug reports with a target milestone set to future MediaWiki version 1.21.0, reminder comments were added for developers. Andre and Valerie also held the first IRC Office Hour on Bugzilla and Bug management for those interested in discussing problems and improvements with Wikimedia's bug management. In Bugzilla's internal product and component taxonomy, several Mobile application products were merged into a single "Wikipedia App" product and two Search components were merged, to simplify finding information for developers and reporters.

Also, the bug management task list received a major cleanup, making it clearer what is being worked on and what you can help with.

Mentorship programs

Quim Gil focused on:

Technical communications

Guillaume Paumier reviewed, published and advertised several tech blog posts, and did communications support for the deployment of Extension:Scribunto; this included reviewing blog posts, organizing IRC office hours, and announcing the deployment on wikis and lists. He modified a template so that events from the MediaWiki calendar show as bulleted items on the home page. He continued to investigate SugarCRM, and expanded Product development to facilitate the involvement of contributors, and redesigned, reorganized and simplified the How to contribute landing page. He added "Open tasks" sections to pages of activities in need of Product help, and linked to them from the Product hub. He also met with Quim Gil to discuss his contributors plan, and followed up on the gerrit tagging proposal. He researched ways to create and maintain translations of the Wikimedia Glossary, and started a discussion on the lists about the best way to move forward with glossaries scattered across wikis. Last, he started to plan for the centralization of mobile documentation, and drafted thoughts about a consolidation of technical communities.

Volunteer coordination and outreach

Quim Gil focused on:

Kiwix

The Kiwix project is funded and executed by Wikimedia CH.

Work on the 0.rc3 release of Kiwix is ongoing, mostly consisting of bug fixing and a few UI improvements. The release is expected in about a month. For the first time, a ZIM file of Wikisource (in French) was produced, within the scope of the Afripedia project.

Wikidata

The Wikidata project is funded and executed by Wikimedia Deutschland.

Denny Vrandečić and Lydia Pintscher gave a short update on Wikidata's status at the metrics and activities meeting. A more detailed analysis can be found in our blog post. In addition, Wikidata phase 1 (language links) has been activated on the remaining 282 Wikipedias. This means that all Wikipedias now get their language links from Wikidata. Not too long after that, phase 2 (infoboxes) was activated on the first 11 Wikipedias. They can now make use of shared structured data from Wikidata in their articles. On Wikidata itself we introduced a new data type (string), extended references in statements (they can now have multiple values), and improved the search box.
We have written down how we envision queries on Wikidata and would appreciate your feedback.
As a nice demonstration of the potential of Wikidata we've seen two new projects this month: Wiri and a tree of life.

Future

The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.
This article is issued from Mediawiki. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.