Wikipedia:Wikipedia Signpost/2006-05-22/Statistics

Statistics

Project statistics updated, except for Wikipedia

For the first time this year, Erik Zachte has been able to provide an update of his popular statistical reports for Wikimedia Foundation projects. Because of the technical demands involved, however, the reports are so far available for every project except Wikipedia.

The update provided last week, which makes the information current as of 10 May, covers five months of additional data, as the last report came in December. This is also the first time that Wikisource statistics have been available broken down by individual languages.

Wikipedia remains the one project for which the statistics are only current as of 10 December. Because Wikipedia is the largest project and its database is correspondingly large, its report takes considerably longer to produce. Zachte hopes to have it completed sometime this week if no further problems come up in the process. (May 24: Wikipedia statistics are done [1], although for English the last working dump dates from February. Other languages are up through 10 May.)

The gap between updates is due to a combination of factors. Ongoing changes from the MediaWiki software development process force Zachte to keep adjusting the scripts that produce the statistics. The process also depends on having current database downloads, which have sometimes been irregular, especially for the largest projects, where the database dump is more likely to fail.

Alternative strategies for finding the resources needed to monitor statistics face their own challenges. The toolserver, which generates reports such as the popular individual user statistics and has been considered as a place to run Zachte's scripts, is not a perfect solution either. It has to deal with time lags in the data and, because of disk problems, has not been working with a current version of the English Wikipedia for a few weeks.

These and other problems have caused Zachte's reports to come out only sporadically over the past year or so (see archived stories). The Communications committee is working to coordinate efforts to keep these statistical updates running on a more regular basis. Some outside researchers and groups are also interested in developing their own statistics to study Wikipedia, and the Wikimedia Foundation frequently gets inquiries from people seeking various kinds of data about the projects.