Wikipedia:Wikipedia Signpost/2005-09-05/Statistics released

Statistics released

Delayed Wikipedia statistics released

An updated set of the comprehensive statistics produced by Erik Zachte was released last week, after an extended delay caused by the upgrade to MediaWiki 1.5. Meanwhile, Wikipedia moves closer to cementing its status as a top-50 site in terms of web traffic.

On Wednesday, Zachte announced that he had produced a new set of statistics for Wikimedia Foundation projects and posted them to the Wikistats sitemap. The statistics cover all projects and languages except for the French and German Wikipedias, as the statistics script failed with those two (in the case of the German Wikipedia at least, the database dump was corrupted).

As Zachte had indicated earlier (see archived story), the database changes in MediaWiki 1.5 required significant alterations in the process of collecting these statistics. As a result, this was the first update to the statistics since May. Even with the new update, the information was already several weeks old, since the most recent dump of the MediaWiki database dated from 13 July.

In connection with this problem, Zachte questioned why the database dumps were being given such a low priority. Chief Technical Officer Brion Vibber explained that he had been working for several days on "getting the dump generation infrastructure up and running in a more consistent, effective fashion." He pointed out that, in addition to bugs that needed fixing, the process being used required going through the database twice, once for current files only and once for all revisions. Vibber reported that he was working on a new tool that would produce both versions simultaneously and also generate filtered database dumps that omit user pages and talk pages. The new tool received its first test run on Saturday.
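The difference between the two-pass process and a single-pass one can be illustrated with a minimal Python sketch. This is not Vibber's tool; the `Revision` record, the `single_pass_dump` function and the in-memory lists standing in for dump files are all hypothetical, and the sketch assumes revisions arrive grouped by page with the oldest first. The only MediaWiki detail it relies on is the namespace numbering convention, in which namespace 2 is User and odd-numbered namespaces are talk pages.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

USER_NAMESPACE = 2  # MediaWiki convention: 2 = User, odd numbers = talk pages


@dataclass
class Revision:
    """Hypothetical stand-in for one stored revision of a page."""
    page_id: int
    namespace: int
    rev_id: int
    text: str


def is_user_or_talk(namespace: int) -> bool:
    """True for pages that a filtered dump would leave out."""
    return namespace == USER_NAMESPACE or namespace % 2 == 1


def single_pass_dump(
    revisions: Iterable[Revision],
) -> Tuple[List[Revision], List[Revision], List[Revision]]:
    """Read the revisions once and build three outputs together.

    Assumes revisions are grouped by page, oldest first, so the last
    revision seen for a page is its current one.
    Returns (all revisions, current only, current without user/talk pages).
    """
    full: List[Revision] = []
    current: List[Revision] = []
    filtered: List[Revision] = []
    previous: Optional[Revision] = None

    def emit_current(rev: Revision) -> None:
        current.append(rev)
        if not is_user_or_talk(rev.namespace):
            filtered.append(rev)

    for rev in revisions:
        full.append(rev)  # the full-history output receives every revision
        if previous is not None and rev.page_id != previous.page_id:
            emit_current(previous)  # page changed: previous was the latest
        previous = rev

    if previous is not None:
        emit_current(previous)  # latest revision of the final page
    return full, current, filtered
```

Because each revision is read only once, the cost of scanning the database is paid a single time however many output variants are produced.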

Wikipedia traffic continues to grow as well, and the site should soon move into the top 50 on the Alexa list of the top 500 global websites. As of Sunday, Wikipedia stood at #51, right behind Mediaplex, whose ranking comes mostly from its role as an internet advertising provider. The ranking is calculated over three-month periods, and will likely continue moving higher, since Wikipedia's daily traffic rank has been fairly consistently in the 40s for a few weeks.