Wikipedia:Wikipedia Signpost/Single/2012-08-27

The Signpost
Single-page Edition
27 August 2012

 

2012-08-27

Tough journey for new travel guide

The Portuguese town of Oporto, a destination of the month featured on the WikiVoyage front page. WikiVoyage, which is set to migrate to the Wikimedia Foundation, is already in the process of picking up freely licensed material from Wikitravel.org.


Wikimedia editors have been debating a community proposal for the adoption of a new project to host free travel-guide content. The debate reached a new stage when a three-month request for comment on Meta came to an end, with a decision to set up the first new type of Wikimedia project in half a decade (Signpost coverage). The original proposal for the travel guide unfolded during April on Meta and the Wikimedia-l mailing lists (I, II, and III; Signpost coverage), centring around the wish of volunteer contributors to the WikiTravel project to work in a non-commercial environment.

WikiTravel (WikiTravel.org) is owned by the for-profit California-based company Internet Brands (website), which operates online media, community, and e-commerce sites in vertical markets. Internet Brands is in turn owned by private equity investors Hellman & Friedman LLC, which bought the company in a US$640M deal almost a year ago. According to The New York Times, Investopedia, and Small Cap Investor, Internet Brands' strategy is to focus on specific target audiences that tend to be attractive to advertisers. The company's portfolio of websites includes many with social-media features, and has a monthly average of 112 million unique visitors (up from 70 million at the end of 2010), and 805 million page views; the company has more than 40,000 direct advertisers. The English-language version of WikiTravel.org consists of some 25,000 freely licensed articles.

Early discussions

The initial proposal was backed by many volunteer editors at Wikitravel.org, including project founders Evan Prodromou and Michele Ann Jenkins, as well as Stefan Fussan, chairman of the board of the German non-profit Wikivoyage Association. The association and its project Wikivoyage—a long-standing fork from Wikitravel.org, run by a mainly German volunteer community with some input from Italian volunteers—formally joined the proposal in June 2012, when the association's general assembly unanimously endorsed it (Signpost coverage). The association offered the domain Wikivoyage.org, and is currently seeking recognition by Wikimedia as an independent thematic organisation. The travel-guide proposal, for which Doc James was a key advocate, quickly gathered support among editors.

RfC Mark 1

The RfC, conducted in several stages, focused on issues such as whether travel content can be regarded as educational, potential conflict-of-interest issues, and how the new project would interact with other Wikimedia projects and with those hosted by third parties.

Proponents have argued that starting with the existing CC-by-SA freely licensed travel content and giving existing volunteer communities a new home would bring significant benefits to those communities, to readers, and to the Wikimedia movement. Editors and readers of travel content would gain advantages from being part of a large and powerful non-commercial movement, and Wikimedia would be able to broaden the scope of its free educational material. The ability of the current travel-content communities to create a properly functioning new project would be facilitated by the improved software available from being hosted by the Wikimedia Foundation (the current travel-guide content is run on an older version of MediaWiki).

Opponents of the move have argued that travel content is not sufficiently educational and is therefore inconsistent with Wikimedia's mission, that setting up a wiki for travel-guide content would offer no conceivable benefit to anyone, that other Wikimedia projects could be disrupted by a potentially resource-intensive move, that travel content involves inherent conflicts of interest, and that there could be technical problems such as the transfer of page histories.

Wikimania 2012—the debate crystallises

In the run-up to Wikimania 2012 last month, the ayes in the Meta RfC had taken an early lead, although several contested issues remained unresolved and there were concerns about limited participation in the debate (Signpost coverage). On 11 July there were 107 ayes, with 11 nays focusing on unresolved issues. Within two days the WMF board, meeting at Wikimania, had examined the proposal and issued a letter to the community, stating the board's opinion that several free travel-content projects can coexist and emphasising the value the Wikimedia movement places on community-consensus decision-making. At the same time, the board announced that it wanted to see an extension of the RfC for at least a further six weeks before looking at the possibility of limited technical support for the community-led initiative. During Wikimania, interested community members met in person for the first time to chart a way ahead.

RfC rebooted

To tackle the participation issue, the community set up a globally displayed notice on Meta in the second half of July (after just a side-notice in April), significantly boosting involvement. At the same time, Internet Brands increased its engagement in the debate through the participation of IBobi, one of its community managers. On 13 August, IBobi issued a company response to the proposal, pointing to the results of its reader survey as evidence that the project has been working well under its stewardship. IBobi proposed that Wikitravel.org could become an Internet Brands–hosted Wikimedia sister project, as long as Wikimedia refrained from setting up a new travel-guide project. However, community members disputed the neutrality of the survey questions, among other issues raised by the company.

A senior Wikimedian volunteer who supports the creation of the new project told the Signpost that Internet Brands nevertheless has a perfect right to put its case as to why the status quo should be maintained. Indeed, Wikitravel's Terms of use clearly states that "if you continue to use the service against our wishes, we reserve the right to use whatever means available—technical or legal—to prevent you from disrupting our work together." On 21 August, Internet Brands' legal department set up an account on the site and issued this warning to eight volunteer editors: "Please be advised that your recent actions communicating directly with members of Wikitravel could put you in violation of numerous federal and state laws. We strongly urge you to cease and desist all action detrimental to Wikitravel.org. If you persist in this course of conduct, you will potentially be a named defendant, and therefore liable for any and all resulting damages."

Community consensus and the way forward

On 23 August, the RfC ended with 78% support for setting up a Wikimedia travel-guide project (540 ayes to 152 nays). A member of the WMF board has said "the board is reviewing the RfC and its talk page over the next week. We are going to share our thoughts with you soon on the RfC's talk page. Please feel free to leave comments there, that's still possible and will be read."

Meanwhile, the Commons community has established a task force to manage the transfer of freely licensed files in any event, and Wikivoyage has moved to import and host freely licensed Wikitravel.org articles. Wikivoyage has also begun a clean-up of its own policies to align itself with Wikimedia standards.

The Signpost invited Internet Brands to put its views on editors' complaints that the version of wiki software used on wikitravel.org is outmoded, and that there has been an intensification of advertising on the site that may undermine neutrality. We also asked about the strategy behind the talk-page warnings in the light of the company's stated desire to "bring back old non-admin regulars". Although we responded to Internet Brands' subsequent request that we clarify the Signpost's affiliations, at the time of publication we have received no reply to our questions.

A Turkish field gun in the Gallipoli gallery of the Australian War Memorial

In brief

Participating countries in Wiki Loves Monuments 2012 are in red
  • Wikimedia South Africa meeting: The South African chapter will host its first annual general meeting on 1 September to look at its own institutional structure and to discuss projects such as Wiki Loves Monuments in South Africa. Details have been published on Meta.
  • Wikipedia Takes the Australian War Memorial—results: On 25 August, Wikipedians took on Australia's national memorial to the members of all its armed forces and supporting organisations who have died or participated in war. The results are being published on Commons. Users interested in Wiki Loves Monuments might also look at the upcoming Wikipedia takes America events in September and other parts of the global photo contest.
  • English Wikipedia Arbitration report: there continue to be no open cases before the committee. Five clarifications and requests for amendments are currently open, which arose from decisions reached in the Sathya Sai Baba (2006), The Troubles (2007; three separate requests), and Race and intelligence (2010) cases.
  • Steward policy reform debate extension: There has been a proposal to extend the RfC on whether to adopt proposed changes to the policies governing the actions of Wikimedia stewards—users with complete access to the wiki interface on all Wikimedia wikis. The RfC aims to gather more community feedback and to resolve several issues.
  • Main-page redesign competition: 17 proposals have been lodged, and discussion about the issue continues on the competition talk page. Editors are welcome to submit their own proposals until 30 September.
  • Editor survey gears up: Preparations for this year's editor survey are in their final stages on Meta. The survey will be conducted for up to 10 days from late August, and results will be publicly shared.


2012-08-27

New influence graph visualizations; NPOV and history; 'low-hanging fruit'

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, edited jointly with the Wikimedia Research Committee and republished as the Wikimedia Research Newsletter.

Wikipedia-based graphs visualize influences between thinkers, writers and musicians

A visualization of musical genres related to psychedelic music, based on DBPedia data.

In a blog post titled "Graphing the history of philosophy",[1] Simon Raper of the company MindShare UK describes how he constructed an influence graph of all philosophers using the "Influenced by" and "Influenced" fields of Template:Infobox philosopher (example: Plato). This information was retrieved using DBpedia with a simple SPARQL query. After some cleanup, the result, consisting of triplets in the form <Philosopher A, Philosopher B, Weight>, was processed using the open-source graph visualization package Gephi to create an impressive overview of the philosophers within their respective spheres of influence.
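As a rough illustration of the triplet format described above, the following Python sketch collapses a list of influence pairs into <source, target, weight> triplets. The sample pairs are hypothetical, and the weighting rule (counting how often a pair is reported) is an assumption made for illustration; the blog post's actual cleanup and weighting steps may differ.

```python
from collections import Counter

# Hypothetical (influencer, influenced) pairs, as might be extracted from
# the "Influenced by" / "Influenced" infobox fields; not real query output.
pairs = [
    ("Socrates", "Plato"),
    ("Plato", "Aristotle"),
    ("Socrates", "Plato"),  # the same link reported by both infoboxes
    ("Plato", "Plotinus"),
]


def build_triplets(pairs):
    """Collapse duplicate pairs into (source, target, weight) triplets.

    Assumption: the weight is simply the number of times a pair occurs.
    """
    counts = Counter(pairs)
    return [(a, b, w) for (a, b), w in sorted(counts.items())]


def out_degree(triplets):
    """Count how many distinct people each person is listed as influencing."""
    degree = Counter()
    for source, _target, _weight in triplets:
        degree[source] += 1
    return degree


triplets = build_triplets(pairs)
degrees = out_degree(triplets)
```

A list of triplets in this shape can be written out as a CSV edge list, which is one of the input formats graph tools such as Gephi accept.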

Brendan Griffen extended the idea to "everyone on Wikipedia. Well, everyone with an infobox containing ‘influences’ and/or ‘influenced by’", arriving at a huge, far denser "Graph Of Ideas" including not only philosophers, but also novelists, fantasy and science fiction writers, and comedians.[2] In another blog post,[3] Griffen added transitive links as well – so that each person is considered to be influenced both directly and indirectly. The most connected people in the graph were ancient Greek thinkers, with Thales, Pythagoras and Zeno of Elea occupying the top three spots. Griffen remarks that this vindicates a statement in Bertrand Russell's History of Western Philosophy (1945): "Western Philosophy begins with Thales".
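The idea of counting indirect as well as direct influence can be sketched as a reachability computation over the influence graph. The Python example below is only an illustration: the edges are hypothetical, and the breadth-first traversal is one plausible way to add transitive links, not necessarily the method used in the blog post.

```python
from collections import deque

# Hypothetical direct-influence edges: influencer -> people influenced.
direct = {
    "Thales": ["Anaximander"],
    "Anaximander": ["Pythagoras"],
    "Pythagoras": ["Plato"],
    "Plato": [],
}


def all_influenced(graph, start):
    """Return everyone reachable from `start` by following influence links."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        person = queue.popleft()
        if person in seen:
            continue  # already visited via another path
        seen.add(person)
        queue.extend(graph.get(person, []))
    return seen


# Number of people each thinker influenced, directly or indirectly.
reach = {name: len(all_influenced(direct, name)) for name in direct}
```

Under this kind of counting, early figures at the head of long chains accumulate the largest totals, which is consistent with ancient thinkers ranking as the most connected.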

Also inspired by Raper's posting, Tony Hirst posted a number of visualizations of the Wikipedia link and category structure (likewise using DBpedia and Gephi, queried via the Semantic Web Import plugin) to visualize related entries and influence graphs in the English Wikipedia. The blog posts (all of which include detailed step-by-step tutorials) examine the related graph of philosophers,[4] and also visualize an influence graph of programming languages[5] and one of musical genres related to psychedelic music.[6] All these visualizations and blog posts by Hirst are released under a Creative Commons Attribution license.

Hirst also mentioned a related tool called "WikiMaps", the subject of a recent article in the International Journal of Organisational Design and Engineering.[7] As described in a press release, the tool provides a "map of what is “important” on Wikipedia and the connections between different entries. The tool, which is currently in the “alpha” phase of development, displays classic musicians, bands, people born in the 1980s, and selected celebrities, including Lady Gaga, Barack Obama, and Justin Bieber. A slider control, or play button, lets you move through time to see how a particular topic or group has evolved over the last 3 or 4 years." A demo version is available online.

See also the recent coverage of a similar visualization, based on wikilinks instead of infoboxes: "The history of art mapped using Wikipedia"

Information retrieval scientists turn their attention to Wikipedia's page view logs

Found to be connected to the "#euro2012" hashtag by analyzing Wikipedia pageviews: Euro 2012 football championship

The Time-aware Information Access workshop at this year's SIGIR (Special Interest Group on Information Retrieval) conference brought a wave of attention to Wikipedia's public page-view logs. Detailing the number of page views per hour for every Wikipedia project, these files figure prominently in a variety of open-source intelligence applications presented at the workshop.

A group of researchers from ISLA, University of Amsterdam created an API providing access to this data and performing simple analysis tasks.[8] Though the site appears to be down at the time of writing, the API supports retrieving a particular article's page-view time series as well as searching for other Wikipedia articles based on the similarity of their time series. In addition to machine-readable JSON results, the API will supply simple plots in PNG format. While the idea of providing page-specific time series is not new, support for finding other pages with similar viewing patterns highlights a fascinating new use for Wikipedia page views.
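To make the notion of searching by time-series similarity concrete, here is a minimal Python sketch. It assumes Pearson correlation as the similarity measure and uses made-up page-view counts; the measure actually used by the API is not stated here, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is flat."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def most_similar(target, series_by_title, top_n=2):
    """Rank other articles by how closely their views track the target's."""
    query = series_by_title[target]
    scored = [
        (pearson(query, series), title)
        for title, series in series_by_title.items()
        if title != target
    ]
    scored.sort(reverse=True)
    return [title for _score, title in scored[:top_n]]


# Made-up hourly page views for three articles.
views = {
    "UEFA Euro 2012": [10, 12, 90, 80, 15],
    "Spain national football team": [5, 6, 40, 42, 8],
    "Plato": [30, 31, 29, 30, 31],
}

nearest = most_similar("UEFA Euro 2012", views, top_n=1)
```

Correlation ignores the absolute traffic level, so a small article can match a large one as long as the two rise and fall together.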

Two other papers combine Wikipedia page-view information with external time-series data sets. On the intuition that Wikipedia page views should have a strong correlation with real-world events, researchers from the University of Glasgow and Microsoft built a system to detect which hashtags frequently queried on Bing Social Search were event-related.[9] For example, the hashtag "#thingsthatannoyme" doesn't clearly correspond to an event, whereas a hashtag like "#euro2012" is about the UEFA European Football Championship. After tokenizing the hashtags into a list of words, the researchers queried Wikipedia for those terms and correlated the time series of hashtag search popularity with the page-view time series for the articles which are returned. This correlation score can be used to indicate which hashtags are likely to be about events, a useful feature for web searches and any other temporally aware zeitgeist application.
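The tokenization step mentioned above can be illustrated with a small dictionary-based segmenter. This is a sketch under stated assumptions: the vocabulary is hypothetical and the dynamic-programming approach is a common technique for splitting run-together words, not necessarily the one the researchers used.

```python
def segment(tag, vocabulary):
    """Split a hashtag into known words; return None if no split exists."""
    text = tag.lstrip("#").lower()
    # best[i] holds one valid segmentation of text[:i], if any was found.
    best = {0: []}
    for end in range(1, len(text) + 1):
        for start in range(end):
            if start in best and text[start:end] in vocabulary:
                best[end] = best[start] + [text[start:end]]
                break
    return best.get(len(text))


# Hypothetical vocabulary; a real system would need a far larger word list.
vocabulary = {"things", "that", "annoy", "me", "euro", "2012"}

words_a = segment("#thingsthatannoyme", vocabulary)
words_b = segment("#euro2012", vocabulary)
```

The resulting word lists could then serve as search terms for finding candidate articles whose page-view series are compared with the hashtag's popularity series.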

In a similar vein, researchers from the University of Edinburgh and University of Glasgow used the Wikipedia page-view stream to tackle the problem known as first-story detection (FSD), which aims to automatically pick out the first publication relating to a new topic of interest.[10] While traditional techniques primarily focus on newswire or Twitter, the authors used a combination of Twitter and Wikipedia page views to construct an improved FSD system. To improve on state-of-the-art Twitter-only FSD systems, the authors aimed to filter out false positives by checking that the Twitter-based first stories corresponded to a Wikipedia page that was also experiencing heightened traffic during the same period.

Using a simple outlier detection method, the authors created a set of Wikipedia pages with unexpectedly high page views for each hour. Each Twitter-based first story (tweet) was then matched against the corresponding collection of Wikipedia outliers, employing an undisclosed metric of textual similarity that uses only the Wikipedia page titles. If the tweet failed to match any spiking Wikipedia page, it was down-weighted as a first-story candidate. The authors showed that this combined approach improves FSD precision in comparison to a Twitter-only baseline for all but the most popular Twitter-based stories. Though this research makes advances on the difficult task of first-story detection, perhaps the most immediately useful finding is that Wikipedia page views appear to lag behind Twitter activity by roughly two hours. In general, we can expect to see an increasing number of joint models over various open-source intelligence streams as we learn exactly what each stream is useful for and the relationships between the streams.
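The filtering pipeline described above can be sketched in a few lines of Python. Every specific here is an assumption for illustration: the z-score test stands in for the paper's unspecified "simple outlier detection method", the word-overlap check stands in for the undisclosed textual similarity metric, and the threshold, penalty and sample data are hypothetical.

```python
from statistics import mean, pstdev


def spiking_pages(history, current, threshold=3.0):
    """Return titles whose current hourly views exceed mean + threshold * std."""
    spikes = set()
    for title, past in history.items():
        mu, sigma = mean(past), pstdev(past)
        if sigma > 0 and (current[title] - mu) / sigma > threshold:
            spikes.add(title)
    return spikes


def rescore(tweet, score, spikes, penalty=0.5):
    """Keep the score if the tweet mentions a spiking title, else reduce it."""
    words = set(tweet.lower().split())
    for title in spikes:
        # Assumed matching rule: every word of the title appears in the tweet.
        if set(title.lower().split()) <= words:
            return score
    return score * penalty


# Made-up hourly view counts for earlier hours and for the current hour.
history = {
    "Neil Armstrong": [100, 110, 90, 105, 95],
    "Plato": [30, 31, 29, 30, 31],
}
current = {"Neil Armstrong": 5000, "Plato": 31}

spikes = spiking_pages(history, current)
kept = rescore("neil armstrong has died", 0.8, spikes)
reduced = rescore("my cat is cute", 0.8, spikes)
```

In this toy run only the first tweet matches a spiking page, so its score is unchanged while the second is halved.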

See also the Signpost coverage of a small study of the highest hourly page views on the English Wikipedia during January-July 2010, and their likely causes: "Page view spikes"

The limits of amateur NPOV history

In "The inclusivity of Wikipedia and the drawing of expert boundaries: An examination of talk pages and reference lists"[11], information studies professor Brendan Luyt of Nanyang Technological University looks at History of the Philippines, a B-class article that had featured article status from October 2006 until it was delisted at the conclusion of its featured article review in January 2011.

Luyt argues that talk-page discussions, the types of sources cited, and the organization of the article itself, all point to a very traditional view of what constitutes history: in short, great man history concerned mainly with political and military events, and the actions of elites. This style of history does not capture the breadth of approaches used by professional historians, so does not live up to the ideal of NPOV in which all significant viewpoints published in reliable sources are represented fairly and proportionately. In practice, Luyt shows, editors (lacking sufficient knowledge of the relevant professional historical literature) end up using arguments over bias and NPOV to construct a limited and conservative historical narrative—for this article at the least, although a similar pattern could be found for many broad historical topics.

The sources cited are primarily what Luyt calls "textbookese" summaries, easily available online, which focus on bare facts without the historical debates that surround them. Between the valid sources and experts recognized by Wikipedia editors and the good-faith use of the NPOV principle to limit other viewpoints, Luyt concludes that—rather than being more inclusive of diverse views and sources than the typical "expert" community—Wikipedia in practice recognizes a considerably narrower set of viewpoints.

Three new papers about Wikipedia class assignments

An article titled "Assigning Students to edit Wikipedia: Four Case Studies"[12] presents the experiences of four professors who participated in the Wikipedia Education Program, in a total of six courses (two of the four instructors taught two classes each). The lessons from the assignments included: 1) the importance of strict deadlines, even for graduate classes; 2) having a dedicated class for acquiring skills in editing and for understanding Wikipedia policies, or spreading this over segments of several classes; 3) the benefits of having students interact with the campus ambassadors and the wider Wikipedia community.

Overall, the instructors saw that compared with their engagement in traditional assignments, students were more highly motivated, produced work of higher quality, and learned more skills (primarily, related to using Wikipedia, such as being able to better judge its reliability). Wikipedia itself benefited from several dozen created or improved articles, a number of which were featured as DYKs. The paper presents a useful addition to the emerging literature on teaching with Wikipedia, as one of the first serious and detailed discussions of specific cases of this new educational approach.

"Integrating Wikipedia Projects into IT Courses: Does Wikipedia Improve Learning Outcomes?"[13] is another paper that discusses the experiences of instructors and students involved in the recent Wikipedia:Global Education Program. Like most existing research in this area, the paper is roughly positive in its description of this new educational approach, stressing the importance of deadlines, small introductory assignments familiarizing students with Wikipedia early in the course, and the importance of close interactions with the community. A poorly justified (or explained) deletion or removal of content can be quite a stressful experience to students (and the newbie editors are unlikely to realize that an explanation may be left in an edit summary or page-deletion log). A valuable suggestion in the paper was that instructors (professors) make edits themselves, so they would be able to discuss editing Wikipedia with students with first-hand experience instead of directing students to ambassadors and how-to manuals; and to dedicate some class time to discussing Wikipedia, the assignment, and collective editing.

A four-page letter[14] in the Journal of Biological Rhythms by a team of 48 authors reported on a similar undergraduate class project in early 2011, where 46 students edited 15 Wikipedia articles in the field of chronobiology, aiming at good article status. After their first edits, they were systematically given feedback by one "Wikipedia editor and 6 experts in chronobiology" before continuing their edits (in the paper's acknowledgements the authors also thank "innumerable Wikipedia editors who critiqued student edits"). Because of the high visibility of the results – most of the articles were ranked top in Google results – students found the experience rewarding. Topics were selected collaboratively by the class, and because students came up with a relatively small number of suggestions, one concern was that the project might, if repeated, run out of article topics in the given subject area.

A literature review presented at July's Worldcomp'12 conference in Las Vegas about "Wikipedia: How Instructors Can Use This Technology As A Tool In The Classroom"[15] also recommended having students actively edit Wikipedia (as well as practice reading it critically), and concluded that "it is time to embrace Wikipedia as an important information provider and one of the innovative learning tools in the educators' toolbox."

Substantive and non-substantive contributors show different motivation and expertise

"Investigating the determinants of contribution value in Wikipedia"[16] reports the results of a survey of Wikipedians who were asked their opinion about the "contribution value" of their edits (measured by agreement to statements such as "your contribution to Wikipedia is useful to others"), which was then related to various characteristics.

The researchers used Google to obtain a list of 1976 Wikipedia users’ email addresses (using keywords such as “gmail.com” or “hotmail.com”). They sent invitation emails that provided the URL to the online questionnaire. In six weeks, 234 editors completed all the questions. Of these, 205 – nine females and 196 males – supplied a valid user name and were considered in the rest of the analysis (anonymous editors were removed).

A content analysis was performed of 50 randomly selected edits by each respondent (or all, if the user had fewer than 50 edits), classifying them as "substantive" changes (e.g. "add links, images, or delete inaccurate content") and "non-substantive changes" (e.g. "reorganizing existing content [or] correcting grammatical mistakes and formatting texts to improve the presentation"), corresponding to "two [proposed] new contributor types in Wikipedia to discriminate their editing patterns."

An attempt was made to relate this to the "contribution value" the respondents assigned to their own edits, and to their responses in two other areas:

  • "interests" (measured by respondent ratings of a variety of different motivations to contribute to Wikipedia on how well each applied to themselves, e.g. "Enhancing your learning abilities, skills and expertise"); and
  • "resources" (meaning expertise based on education, profession and hobbies, measured by respondent ratings of their expertise in a variety of fields within each, e.g. "Hospitality and tourism").

The "breadth" of interests and resources was defined as the number of ratings above a certain threshold in each, and the "depth" as the highest rating assigned in each.
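The two measures defined above are simple to compute once the ratings are available. The Python sketch below shows the arithmetic only; the rating scale, the threshold value and the sample respondent are hypothetical, since the survey's actual scale and cut-off are not given here.

```python
def breadth(ratings, threshold):
    """Number of items the respondent rated above the threshold."""
    return sum(1 for rating in ratings.values() if rating > threshold)


def depth(ratings):
    """Highest rating the respondent assigned to any item."""
    return max(ratings.values())


# Hypothetical expertise ratings on an assumed 1-7 scale.
resources = {
    "Hospitality and tourism": 6,
    "Computer science": 7,
    "History": 3,
    "Medicine": 5,
}

resource_breadth = breadth(resources, threshold=4)
resource_depth = depth(resources)
```

A respondent with many moderately high ratings scores high on breadth, while a single very high rating is enough for high depth, which is why the two measures can point to different kinds of contributors.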

In an "important consideration for practitioners", the authors wrote that:

"[T]o produce valuable contributions, users with high depth of interests and resources should be encouraged to concentrate their efforts on substantive changes. Meanwhile, for users with high breadth of interests and resources, wiki practitioners should advise them to pay more attention to nonsubstantive changes. The findings imply that practitioners can try to identify two distinct types of users. To achieve this objective, they may develop certain algorithms in wikis to automatically detect the frequencies of substantive/non-substantive changes of users. ... For example, notification messages about wiki articles that need substantive changes can be sent to users who have high levels of depth of interests and resources. Similarly, well-prepared messages about articles that need non-substantive changes can be delivered to users who have high levels of breadth of interests and resources."

Is there systemic bias in Wikipedia's coverage of the Tiananmen protests?

Remembrance of the 20th anniversary of the June 4 events in Hong Kong (replica of the "goddess of democracy" statue)
Remembrance in the West (replica of the same statue at the University of British Columbia, Canada)

Wikipedia: Remembering in the digital age[17] is a master's dissertation by Simin Michelle Chen, examining collective memories as represented on the English Wikipedia; she looked at how significant events are portrayed (remembered) on the project, focusing on the Tiananmen Square Protests of 1989. She compared how this event was framed in articles by The New York Times and Xinhua News Agency, and in Wikipedia, where she focused on the content analysis of Talk:Tiananmen Square protests of 1989 and its archives.

Chen found that the way Wikipedia frames the event is much closer to that of The New York Times than the sources preferred by the Chinese government, which, she notes, were "not given an equal voice" (p. 152). This English Wikipedia article, she says, is of major importance to China, but is not easily influenced by Chinese people, due to language barriers, and discrimination against Chinese sources that are perceived by the English Wikipedia as unreliable – that is, more subject to censorship and other forms of government manipulation than Western sources. She notes that this leads to on-wiki conflicts between contributors with different points of view (she refers to them as "memories" throughout her work), and usually the contributors who support the Chinese government POV are "silenced" (p. 152). This leads her to conclude that different memories (POVs) are weighted differently on Wikipedia. While this finding is not revolutionary, her case study up to this point is a valuable contribution to the discussion of Wikipedia biases.

While Chen makes interesting points about the existence of different national biases, which impact editors' very frames of reference, and different treatment of various sources, her subsequent critique of Wikipedia's NPOV policy is likely to raise some eyebrows (pp. 48–50). She argues that NPOV is flawed because "it is based on the assumption that facts are irrefutable" (p. 154), but that those facts are based on different memories and cultural viewpoints, and thus should be treated equally, instead of some (Western) being given preference. Subsequently, she concludes that Wikipedia contributes to "the broader structures of dominance and Western hegemony in the production of knowledge" (p. 161).

While she acknowledges that official Chinese sources may be biased and censored, she does not discuss this in much detail, and instead seems to argue that the biases affecting those sources are comparable to those affecting Western sources. In other words, she is saying that while some claim Chinese sources are biased, others claim that Western sources are biased, and because the English Wikipedia is dominated by Western editors, their bias triumphs – whereas ideally, all sources should be acknowledged, to reduce the bias. The suggestion is that Wikipedia should reject NPOV and accept sources currently deemed unreliable. Her argument about the English Wikipedia having a Western bias is not controversial and has been discussed by the community before (although Chen does not seem to be aware of it, and does not use the term "systemic bias" in her thesis), and reducing this bias (by improving our coverage of non-Western topics) is even a goal of the Wikimedia Foundation. However, while she does not say so directly, it appears to this reviewer that her argument is: "if there are no reliable non-Western sources, we should use the unreliable ones, as this is the only way to reduce the Western bias affecting non-Western topics". Her ending comment that Wikipedia fails to live up to its potential and to deliver a "postmodern approach to truth" brings to mind the community discussions about verifiability not truth (the existence of these debates she briefly acknowledges on p. 48).

Overall, Chen's discussion of biases affecting Wikipedia in general, and of Tiananmen Square Protests in particular, is useful. The thesis however suffers from two major flaws. First, the discussion of Wikipedia's policies such as reliable sources and verifiability (not truth ...) seems too short, considering that their critique forms a major part of her conclusions. Second, the argumentation and accompanying value judgements that Wikipedia should stop discriminating against certain memories (POVs) are not convincing, lacking a proper explanation of the reasons why the Wikipedia community made those decisions favoring verifiability and reliable sources over inclusion of all viewpoints. Chen argues that Wikipedia sacrifices freedom and discriminates against some memories (contributors), which she seems to see as more of a problem than if Wikipedia were to accept unreliable sources and unverifiable claims.

"Low-hanging fruit hypothesis" explains Wikipedia's slowed growth?

A student paper titled "Wikipedia: nowhere to grow"[18] from a Stanford class about "Mining Massive Data Sets" argues for the "low-hanging fruit hypothesis" as one factor explaining the well-known observation that "since 2007, the growth of English Wikipedia has slowed, with fewer new editors joining, and fewer new articles created". The hypothesis is described as follows: "the larger [Wikipedia] becomes, and the more knowledge it contains, the more difficult it becomes for editors to make novel, lasting contributions. That is, all of the easy articles have already been created, leaving only more difficult topics to write about". The authors break this hypothesis into three smaller ones that are easier to test – that (1) there has been a slowing in edits across many languages with diverse characteristics; (2) older articles are more popular to edit; and (3) older articles are more popular to read. They find support for all three of the smaller hypotheses, which they argue supports their main low-hanging fruit hypothesis.

While the overall study seems well-designed, the extrapolation from the three subhypotheses to the parent hypothesis seems problematic. The authors do not provide a proper operationalization of terms such as "novel", "lasting", and "easy/difficult", making it difficult to enter into a discourse without risking miscommunication. There may be at least four main issues in the work:

  • (a minor but annoying issue): hypothesis II is incorrectly and confusingly worded in the section dedicated to it: "Older articles (those created earlier) will be more popular to read than more newly created articles"; however, their study of hypothesis II is based on the number of edits to the article, not the number of page views (those are analyzed in the subsequent hypothesis III);
  • regarding the claim "all of the easy articles have already been created, leaving only more difficult topics to write about", it is true that the majority of vital/core articles are developed beyond stub, and their subsequent expansion is more difficult (it takes more and more effort to move the article up through assessment classes). However, while the older articles are more popular, they are not necessarily easier to edit, as Wikipedia:The Core Contest illustrates. While almost everyone may be able to quickly define (stub) Albert Einstein, it is questionable whether 1) developing this article is easier than developing an article on a less well-known subject, where fewer sources mean the editors need to do less research, and 2) while almost everyone knows who Einstein was, everyone also has knowledge of at least some less popular subjects. As Wikipedia:Missing articles illustrates, there are still many articles in need of creation, and for a fan/expert, it may be easier to create an article on an esoteric subject than to edit the article on Einstein.
  • The claim that "[it is more difficult] for editors to make novel, lasting contributions" is difficult to analyze due to the lack of operationalization of those terms by the authors, but 1) regarding novel, if it means new, see the Missing articles argument above – there is still plenty to write about; and 2) regarding lasting – the authors do not cite any sources suggesting that deletionism on the English Wikipedia may be on the rise.

Overall, the paper presents four hypotheses, three of which seem to be well supported by data, and contribute to our understanding of Wikipedia, but their main claim seems rather controversial and poorly supported by their data and argumentation.

See also the coverage of a related paper in a precursor of this research report last year: "IEEE magazine summarizes research on sustainability and low-hanging fruit"

Briefly

  • Barnstars at ASA annual conference: Two Wikipedia papers were presented at the 2012 annual meeting of the American Sociological Association last week, both focusing on "barnstar" awards on Wikipedia.
    Michael Restivo and Arnout van de Rijt presented their research on the effect of barnstars, titled "Experimental Study of Informal Rewards in Peer Production", which had found that assigning "editing awards or 'barnstars' to a subset of the 1% most productive Wikipedia contributors ... increases productivity by 60% and makes contributors six times more likely to receive additional barnstars from other community members", as stated in the abstract. See the review in the April issue of this report: "Recognition may sustain user participation".
    Benjamin Mako Hill, Aaron Shaw, and Yochai Benkler presented "Status, Social Signaling, and Collective Action: A Field Study of Awards on Wikipedia", with a more skeptical look at the effect of barnstars. According to the abstract, "Willer has argued for a sociological mechanism for the provision of public goods through selective incentives. Willer posits a "virtuous circle" in which contributors are rewarded with status by other group members and in response are motivated to contribute more. [... But] there is reason to suspect that not all individuals will be equally susceptible to status-based awards or incentives. At the very least, Willer's theory fails to take into account individual differences in the desire to signal contributions to a public good. We test whether this omission is justified and whether individuals who do not signal status in the context of collective action behave differently from those who do in the presence of a reputation-based award. [Analyzing barnstars on Wikipedia,] we show that the social signalers see a boost in their editing behavior where non-signalers do not."
  • How high school, college and PhD students evaluate Wikipedia quality: "Trust in online information A comparison among high school students, college students and PhD students with regard to trust in Wikipedia"[19] is a master's thesis that looks at how these three groups judge the trustworthiness of Wikipedia articles, based on the "3S-model" by the advisors of the thesis (Lucassen and Schraagen (2011), Factual Accuracy and Trust in Information: The Role of Expertise. Journal of the American Society for Information Science & Technology, 62, 1232–1242). Unsurprisingly, the more educated the group is, the more detailed their analysis will be. High school students usually focus on accuracy, completeness, images, length, and writing style. College and PhD students go beyond those five elements, also looking at authority, objectivity, and structure. Interestingly, the differences between college and PhD students were much smaller than those between high school students and the other two groups. Another important finding of the study was that the less educated the group, the less likely they are to be aware of Wikipedia being open source and open to editing by anyone. Further, high school students seem to have much more difficulty in distinguishing between a high and low quality article, and overall, seem much more likely to simply not question the trustworthiness of the sources given.
  • Doctors widely use Wikipedia as a reference: A literature review of 50 articles about the use of social media by clinicians[20] found that "Wikipedia is widely used as a reference tool" among them, despite concerns about its accuracy. The authors remark that "we found multiple projects that sought to emulate Wikipedia's success in crowd-sourcing useful medical content, while additionally emphasizing editorial credibility by verifying credentials of contributors. These include RadiologyWiki, announced in 2007 and currently dormant, and Medpedia, which launched in 2009 with substantial institutional backing. We did not find articles reporting success metrics for these projects or similar ones."
  • Predicting quality flaws in Wikipedia articles: A notebook paper to be presented at the annual PAN workshop at the Conference and Labs of the Evaluation Forum meeting (CLEF '12) introduces FlawFinder, a toolset to predict quality flaws in Wikipedia articles.[21] The paper is one of the winning entries in a Competition on Quality Flaw Prediction in Wikipedia. The paper defines 11 types of quality flaws, spanning low-level issues (such as orphaned or unreferenced articles) and high-level quality flaws (such as notability or original research). It uses a corpus of articles tagged with cleanup templates (154,116 articles from a January 2012 dump of the English Wikipedia) as a training set to predict whether articles in a separate, uncategorized set suffer from the same flaws. The model uses a variety of features of the training set based on revision data, lexical properties, structural properties of the article and the reference section, and network properties of the link graph. The results suggest, among other things, that the strongest non-lexical features for the advert flaw are links pointing to external resources, while the number of discussions on an article's talk page is the strongest feature to predict original research.
  • Quality of text and quality of editors: A poster presented at the 2012 ACM Conference on Hypertext and Social Media (HT 2012) describes a method to measure the quality of Wikipedia articles by combining text survival metrics and the quality of editors editing these articles, where editor quality is calculated recursively as a function of the quality of their contributions. The method claims to be "resistant to vandalism", but no empirical validation is presented in the poster.[22]
  • WikiSym 2012: WikiSym, the annual conference "dedicated to wiki and open collaboration research and practice" was happening in Linz, Austria as this issue of the research report went to press. Links to online versions of all conference papers have been posted in the program; expect fuller coverage in the September issue.
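The mutual dependence described in the "Quality of text and quality of editors" item above (article quality depends on editor quality, which in turn depends on how well the editor's text survives) can be illustrated with a small fixed-point iteration. This is only a minimal sketch under stated assumptions, not the method from the poster: the toy data, the function name mutual_quality, and the particular update rules are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (editor, article, share of the editor's text
# that survived later revisions, between 0 and 1).
contributions = [
    ("alice", "A", 0.9),
    ("alice", "B", 0.8),
    ("bob", "A", 0.2),
    ("carol", "B", 0.5),
]

def mutual_quality(contributions, rounds=20):
    """Iteratively estimate editor and article quality from text survival.

    Assumed update rules (illustrative only):
      article quality = survival averaged over its editors, weighted by editor quality
      editor quality  = mean of (survival * article quality), scaled so the best editor is 1
    """
    editor_q = {editor: 1.0 for editor, _, _ in contributions}
    article_q = {}
    for _ in range(rounds):
        weighted = defaultdict(float)
        weight = defaultdict(float)
        for editor, article, survival in contributions:
            weighted[article] += editor_q[editor] * survival
            weight[article] += editor_q[editor]
        # Guard against articles whose editors all have zero quality.
        article_q = {
            a: (weighted[a] / weight[a] if weight[a] > 0 else 0.0) for a in weight
        }

        scores = defaultdict(list)
        for editor, article, survival in contributions:
            scores[editor].append(survival * article_q[article])
        raw = {e: sum(v) / len(v) for e, v in scores.items()}
        top = max(raw.values())
        editor_q = {e: (q / top if top > 0 else 0.0) for e, q in raw.items()}
    return editor_q, article_q

if __name__ == "__main__":
    editors, articles = mutual_quality(contributions)
    print(sorted(editors.items(), key=lambda kv: -kv[1]))
    print(articles)
```

In this toy setup, an editor whose text is quickly removed pulls an article's score down less and less as their own score falls, which is one plausible reading of how such a scheme could dampen the effect of vandalism; whether that holds in practice is exactly the empirical question the poster leaves open.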

References

  1. ^ Raper, Simon: Graphing the history of philosophy. Drunks and Lampposts, June 13, 2012
  2. ^ Brendan Griffen: The Graph Of Ideas. Griff's Graphs, July 3, 2012
  3. ^ Brendan Griffen: The Graph Of Ideas 2.0. Griff's Graphs, July 20, 2012
  4. ^ Hirst, Tony (2012). Visualising related entries in Wikipedia using Gephi. OUseful.Info, July 3, 2012
  5. ^ Hirst, Tony (2012). Mapping how Programming Languages Influenced each other According to Wikipedia. OUseful.Info, July 3, 2012
  6. ^ Hirst, Tony (2012). Mapping related Musical Genres on Wikipedia with Gephi. OUseful.Info, July 4, 2012
  7. ^ "Wikimaps: dynamic maps of knowledge" in Int. J. Organisational Design and Engineering, 2012, 2, 204–224
  8. ^ Peetz, M. H., Meij, E., & de Rijke, M. (2012). OpenGeist: Insight in the Stream of Page Views on Wikipedia. SIGIR 2012 Workshop on Time-aware Information Access (#TAIA2012). PDF (open access)
  9. ^ Whiting, S., Alonso, O., & View, M. (2012). Hashtags as Milestones in Time. SIGIR 2012 Workshop on Time-aware Information Access (#TAIA2012). PDF (open access)
  10. ^ Osborne, M., Petrovic, S., McCreadie, R., Macdonald, C., & Ounis, I. (2012). Bieber no more: First Story Detection using Twitter and Wikipedia. SIGIR 2012 Workshop on Time-aware Information Access (#TAIA2012). (open access)
  11. ^ Luyt, B. (2012). The inclusivity of Wikipedia and the drawing of expert boundaries: An examination of talk pages and reference lists. Journal of the American Society for Information Science and Technology, 63(9), 1868–1878. doi:10.1002/asi.22671 (closed access)
  12. ^ Carver, B., Davis, R., Kelley, R. T., Obar, J. A., & Davis, L. L. (2012). Assigning Students to Edit Wikipedia: four case studies. E-Learning and Digital Media, 9(3), 273–283. PDF (closed access)
  13. ^ Patten, K., & Keane, L. (2012). Integrating Wikipedia Projects into IT Courses: Does Wikipedia Improve Learning Outcomes? AMCIS 2012 Proceedings. PDF (closed access)
  14. ^ Chiang, C. D., Lewis, C. L., Wright, M. D. E., Agapova, S., Akers, B., Azad, T. D., Banerjee, K., et al. (2012). Learning chronobiology by improving Wikipedia. Journal of Biological Rhythms, 27(4), 333–36. HTML (closed access)
  15. ^ Hogg, J. L. (2012). Wikipedia: How Instructors Can Use This Technology As A Tool In The Classroom. Worldcomp’12. PDF (open access)
  16. ^ Zhao, S. J., Zhang, K.Z.K., Wagner, C., & Chen, H. (2012). Investigating the determinants of contribution value in Wikipedia. International Journal of Information Management. doi:10.1016/j.ijinfomgt.2012.07.006 (closed access)
  17. ^ Chen, Simin Michelle (2012): Wikipedia: Remembering in the digital age. University of Minnesota MA thesis. June 2012. PDF (open access)
  18. ^ Austin Gibbons, David Vetrano, Susan Biancani (2012). Wikipedia: Nowhere to grow (open access)
  19. ^ Rienco Muilwijk: Trust in online information A comparison among high school students, college students and PhD students with regard to trust in Wikipedia. University of Twente, February 2012 PDF (open access)
  20. ^ von Muhlen, M., & Ohno-Machado, L. (2012). Reviewing social media use by clinicians. Journal of the American Medical Informatics Association : JAMIA, 19(5), 777–81. doi:10.1136/amiajnl-2012-000990 (open access)
  21. ^ Ferschke, O., Gurevych, I., & Rittberger, M. (2012). FlawFinder: A Modular System for Predicting Quality Flaws in Wikipedia. Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN) Workshop (PAN @CLEF 2012), Rome. PDF (open access)
  22. ^ Suzuki, Y., & Yoshikawa, M. (2012), QualityRank: assessing quality of wikipedia articles by mutually evaluating editors and texts. 23rd ACM Conference on Hypertext and Social Media (HT 2012). DOI (open access)



2012-08-27

Just how bad is the code review backlog?

Code review backlog plotted over time

The number of unreviewed changesets seems to have peaked last month.

Developers were left one step closer to an understanding of the code review outlook this week after the creation of a graph plotting "number changesets awaiting review" over time (wikitech-l mailing list). The chart, which also shows the number of new changesets created on a daily basis, reveals a peak in the number of unreviewed changesets in mid-July, followed by a short drop. The current figure stands at approximately 219 unreviewed changesets.
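For readers curious how a figure like "changesets awaiting review" can be derived, the following is a minimal sketch, assuming each changeset is known only by its upload date and (if any) the date it was reviewed. It is not how the graph posted to wikitech-l was produced; the sample data and the function name backlog_by_day are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical changesets: (date uploaded, date reviewed or None if still waiting).
changesets = [
    (date(2012, 7, 1), date(2012, 7, 3)),
    (date(2012, 7, 2), None),
    (date(2012, 7, 2), date(2012, 7, 5)),
    (date(2012, 7, 4), None),
]

def backlog_by_day(changesets, start, end):
    """Map each day in [start, end] to the number of changesets that had been
    uploaded by that day but were not yet reviewed at the end of it."""
    series = {}
    day = start
    while day <= end:
        series[day] = sum(
            1
            for created, reviewed in changesets
            if created <= day and (reviewed is None or reviewed > day)
        )
        day += timedelta(days=1)
    return series

if __name__ == "__main__":
    for day, waiting in backlog_by_day(changesets, date(2012, 7, 1), date(2012, 7, 5)).items():
        print(day.isoformat(), waiting)
```

A peak followed by a drop in such a series means only that reviews briefly outpaced new uploads; it says nothing about how long any individual changeset has been waiting.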

Non-technically-inclined users may question the relevance of a statistic that is apparently little more than a number. By contrast, many developers – particularly volunteer developers – care greatly about its implication for the time it will take for their code to gain the attention of senior developers. As volunteer Brian Wolff wrote this week in his comprehensive roundup of his Git and Gerrit experience five months on, "[now we're using Gerrit] it requires someone to approve your commit, as opposed to merely someone not finding an issue with it. Thus if nobody cares, your commit could sit in limbo for weeks or even months before anyone approves it ... [the result is] less instant gratification".

Nevertheless, others may wonder if code review has ever really been that big of a problem, given that the situation is apparently always bad, and yet has seemingly never reached crisis point. To do so would be to forget the forced surges of review activity before previous release deadlines that left many major development projects behind schedule. Of course, such surges will no longer be prompted in the same way for MediaWiki 1.20 and beyond; they are replaced by what is likely to be a perennial concern for review timeliness that will only ease slightly on the back of these figures.

In brief

Signpost poll
Lua
You can now give your opinion on next week's poll: Do you care about code review?

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.

  • Busy week for Lua: Last week's deployment of Lua to test2wiki, combined with this week's deployment to MediaWiki.org (wikitech-l mailing list) conspired to create a buzz around the template programming language that included the "Luafication" of several prominent citation templates. Predictably, widespread excitement was tempered by specific concerns, many of which were referenced in last week's "Technology report"; nevertheless, few if any commentators felt that the scheme ought to be abandoned and the French Wikisource has already put its name forward to be the first user-facing wiki to trial the software. Likewise, this week's poll results suggest that there could eventually be a significant body of Lua-competent editors, on the bigger wikis at least (example code editor). A full introductory video is also available as of this week, being the video of a presentation given by Lua lead Tim Starling at this year's Wikimania conference.
  • Google Summer of Code: pencils down please: The Google Summer of Code programme has now officially drawn to a close, with all eight of the nine MediaWiki participants who made it to the halfway point seemingly passing the final evaluation. Each has submitted (or will be submitting) a final wrap-up report; many will also choose to be profiled in the pages of the Signpost. Students now have a fortnight to confirm their award by submitting code samples to Google, before beginning the long, hard road to either an eventual deployment (for those with WMF-suitable projects) or widespread adoption. Late on Monday, serial commentator MZMcBride enquired about which metrics might be used to justify the benefit of MediaWiki's annual participation in the scheme, receiving several responses.
  • Planet software upgraded: Planet, the software that powers the language-specific subdomains of planet.wikimedia.org, was upgraded this week (wikitech-l mailing list). Wikimedia had previously been using an unmaintained version of the software, which collates entries from multiple blogs into a single feed; it is now running Venus, a maintained fork of the original project available from the Debian package repository. The list of feeds included in the English-language version has also been trimmed and revised to reduce the number of errors and warnings generated during the site's daily update cycle.
  • Appreciation thread: In overt contrast to last week's negativity on the developer mailing lists, WMF Engineering Community Manager Sumana Harihareswara began a thread entirely devoted to spreading "appreciation" for other developers' effort (wikitech-l). One of the most popular threads of the week, the topic has provoked a dozen posts naming some 25 technical contributors, including some seemingly deliberately targeted at smoothing over some of the previous controversies. Harihareswara also posted a directory of mailing lists developers may have been unaware of.
  • Wikimedia, thumbnails and running out of space: Software developer Ariel Glenn this week blogged about Wikimedia's uncertain stance on thumbnail caching and the implications of this for its long-term hardware demand. "How do we keep our pile of scaled media from expanding at crazy rates? ... Should we limit thumb generation to specific sizes only?" asks Glenn, suggesting a future debate on the topic, which has been influenced by the introduction of new distributed storage system Swift after initial difficulties.
  • Six bots approved: Six BRFAs were recently approved for use on the English Wikipedia:
At the time of writing, twelve BRFAs are active. As always, community input is encouraged.


2012-08-27

Wikipedia rivals The New Yorker: Mark Arsten

This edition covers content promoted between 19 and 25 August 2012
This week the Signpost interviews Mark Arsten, who has written or contributed significantly to ten featured articles; most have related to new religious movements (NRMs), and some have touched on other controversial or quirky topics. Mark gives us a rundown on how he keeps neutral and what drives him to write featured content; he also gives some hints for aspiring writers.

On editing and featured content
Like most of us, I read Wikipedia articles for some time before I began editing. In my view, one of the best parts of Wikipedia is accessibility. I recall digging around the library back in college having a really hard time trying to find references for the papers I was writing. Having easy access to a quality article with a solid reference section would have made things much easier for me. I became fascinated by the unusual articles on Wikipedia, and that's what led to my registration and writing for the project. It's great to be able to read a comprehensive, well-written article on a strange topic without having to buy a copy of The New Yorker or Harper's. One example is the Museum of Bad Art, which I visited after learning about it through Wikipedia. I've been able to write some decent articles on unusual topics myself, like the Voluntary Human Extinction Movement, and it feels great to have been able to give other readers that same experience.

If I try to work on a boring topic, I'll never be able to finish the project. Take sand for example—I'd hate to do hours and hours of research on a sandy topic. To bring an article up to featured status takes a long time, so when I started attempting to bring articles to featured status I realized I'd have to find subjects that would keep my interest. I settled on NRMs, which I find very interesting. There's usually a mix of heroes, villains, and mystery in them—and in many cases the founders are morally ambiguous. In addition, several NRMs have cosmologies that make most science fiction seem unimaginative; Martin Gardner once said that The Urantia Book (whose publisher I've written about) is a work that "outrivals in fantasy the cosmology of any science-fiction work known".

On neutrality

Writing about the lynching of Jesse Washington was an occasion that required strict adherence to Wikipedia's neutrality policy, Mark says.

When writing about NRMs, you have to avoid the temptation to try to make a point about the group. With most NRMs there are some people who want to make a point that the group is an evil cult or a scam, while there are others who want to communicate that it's objectively no stranger/more evil than established religions or that it genuinely helps people. So you have to strike a balance between the Rick Ross or South Park viewpoint on one side, and the militant universalist viewpoint or the public relations people on the other.

In general, the best way to stay neutral is to use neutral sources. Ensuring that an article is well cited to clear, unbiased sources is the foundation. After that, feedback is very important. It's hard to realize all of your mistakes but somewhat easy to notice the mistakes of others. I'm always surprised by how many issues people can find in what I thought was the "perfect version" of an article. In many cases, others will notice small issues with my wording that never would have occurred to me. Also, an important consideration when working with controversial topics is whether you can achieve a sort of distance from the issue. For example, it would have been harder for me to write neutrally about Trayvon Martin than Jesse Washington, even though Washington's death was far more barbaric.

There are definitely some topics I consider to be "untouchable". The primary reason I'd avoid a topic is the involvement of other editors that would make it difficult for me. My goal is usually to improve the sourcing/comprehensiveness/prose of an article and bring it to GA or FA. There are some editors whose goal is to make sure the article exactly matches their point of view. Having dealt with some of them, I've realized that life is easier and more enjoyable when I stay away from such people. A general rule of thumb is that if an Arbcom case has been named after the article, you want to keep your distance.

On participating at FAC
The most important thing a newcomer to the featured article process should do is to get help from others. A lot of the time in the featured article candidates (FAC) forum, we see articles with issues that should have been taken care of before their nomination. These articles are time-consuming for reviewers; but more importantly, the nominator often becomes discouraged by seeing their article fail. What newcomers need to do is approach users who have experience with the FA process and ask for help. It's not fun to beg for help, but it's more fun than watching an article fail. In my experience, most people who take part in the FA process are very relaxed and good-natured, and I think some of the nicest people on the project work at FAC: a community of brilliant people who are interested in producing quality work. So I recommend you find active reviewers and writers, and harass them mercilessly until they help you—get advice on sourcing, neutrality, prose, MOS, everything. A lot of people think that the featured article standards are too difficult and don't make an effort to get involved with the FA process, but with enough help, almost any committed writer can produce a featured article.


SMS König Albert
British actor Peter Sellers
American singer Kelly Clarkson; her discography is now featured.
A European Robin
An Aporia crataegi specimen

Eight featured articles were promoted this week:

  • SMS König Albert (nom) by Parsecboy. SMS König Albert was a battleship of the German Imperial Navy which served during World War I. Laid down in 1910, König Albert was commissioned in 1913 and participated in most of the major fleet operations of the war. After the war and shortly before the signing of the Treaty of Versailles in 1919, the fleet's commander ordered that the ships – including König Albert – be scuttled in Scapa Flow.
  • Osiris myth (nom) by A. Parrot. The Osiris myth, concerning the murder of the god Osiris and its aftermath, is the most elaborate and influential story in ancient Egyptian mythology. The myth, integral to conceptions of kingship and succession, conflict between order and disorder, and death and the afterlife, reached its essential form in or before the 25th century BC. It is still known, although no ancient Egyptian sources give the full myth.
  • Rex Ryan (nom) by The Writer 2.0. Ryan (b. 1962) is an American football head coach for the New York Jets of the National Football League. The son of another coach, he became an assistant coach after graduating from university and, in 2009, was signed to the Jets. He is known for his outspoken manner, boisterous attitude and successful coaching; players under Ryan consider him friendly.
  • "Say Hello to My Little Friend" (nom) by TBrandley. "Say Hello to My Little Friend" is a 2012 episode of the American television series Awake, in which the main character switches back and forth between two similar realities, struggling to figure out which world is "real". Directed by Laura Innes, the episode generally received positive reviews and was seen by 2.51 million viewers.
  • Louie B. Nunn (nom) by Acdixon. Nunn (1924–2004) was the 52nd governor of Kentucky and first Republican in twenty years. Starting his political career as a judge and working on several national campaigns, in 1967 Nunn defeated Henry Ward to become governor. Nunn enacted many legislative changes but faced race riots and anti-war protests.
  • Sons of Soul (nom) by Dan56. Sons of Soul, the third studio album by American R&B group Tony! Toni! Toné!, was released in 1993 and paid homage to the group's musical influences. Recording mostly took place in Trinidad, which influenced the sound of the album. Sons of Soul was a critical and commercial success, being named the best album of 1993 by Time magazine.
  • Peter Sellers (nom) by SchroCat and Cassianto. Sellers (1925–1980) was a British film actor, comedian and singer. He began his stage career as an infant, accompanying his parents in a variety act. He became a regular on BBC programming after World War II and became a film actor in the 1950s. He has been described as "one of the most accomplished comic actors of the late 20th century".
  • Trait du Nord (nom) by Dana boomer and Tsaag Valren. The Trait du Nord is a breed of heavy draft horse developed in the area of Hainaut. Originally meant to work on farms, the horses were later bred for size and for their meat. Weighing upwards of 1,000 kilograms (2,200 lb), the horses are currently used for recreation, and are considered endangered owing to a low number of foal births.

Five featured lists were promoted this week:

  • List of Grey's Anatomy episodes (nom) by TRLIJC19 and Jonathan Harold Koszeghi. The American medical drama Grey's Anatomy has broadcast 172 episodes since "A Hard Day's Night" on March 27, 2005. The series is scheduled for another season starting in September.
  • List of international cricket centuries by Inzamam-ul-Haq (nom) by Sahara4u. The Pakistani cricketer Inzamam-ul-Haq scored 25 centuries in Test matches and 10 in One Day International matches during his 15-year career. He was the tenth player to score 25 or more centuries in Test cricket.
  • 2008 Summer Paralympics medal table (nom) by Miyagawa. The 2008 Summer Paralympics in Beijing saw 146 National Paralympic Committees (NPCs) send 3,951 athletes. A record-breaking 76 NPCs won medals, with five winning their first golds.
  • Kelly Clarkson discography (nom) by Woofygoodbird. American recording artist Kelly Clarkson has released five studio albums, two extended plays, two video albums, twenty-one singles and twenty-two music videos since winning American Idol in 2002. Her best-performing album was 2004's Breakaway.
  • List of Guillemots songs (nom) by A Thousand Doors. The multinational indie music band Guillemots have recorded more than 80 songs since the band's formation in November 2004. Several have not seen official release, instead being leaked or played in concert.

Three featured pictures were promoted this week:

One featured portal was promoted this week:

  • Maryland Roads (nom) by Dough4872. The Maryland highway system consists of roads in the US state of Maryland that are maintained by the Maryland State Highway Administration, with three main systems: Interstate Highways, US Highways, and Maryland state highways.

One featured topic was promoted this week:

An Italian Tree Frog, a new featured picture



2012-08-27

From sonic screwdrivers to jelly babies: WikiProject Doctor Who

The Doctor travels through space and time in his TARDIS, disguised as a blue police box
Matt Smith plays the current incarnation of the Doctor
Davros with his creations, the Daleks
Captain Jack Harkness, an occasional companion of the Doctor and a lead character in the spinoff Torchwood
A Weeping Angel
The Ood at a Doctor Who exhibition
A cosplayer recreating the Fourth Doctor's signature scarf

This week, we hopped in a little blue box with a batch of companions from WikiProject Doctor Who. Started in April 2005, the project has grown to include about 4,000 pages about the world's longest-running science fiction television show, its spinoffs, and various related material. The project is the parent of the Torchwood Taskforce and a child of WikiProject British TV and WikiProject Science Fiction. With new Doctor Who episodes airing this week and a 50th anniversary celebration around the corner, we thought now would be a good time to inquire about the famed Time Lord. We interviewed members representing a variety of generations and national origins: Redrose64, SoWhy, Glimmer721, Sceptre, and MarnetteD.

What motivated you to join WikiProject Doctor Who? Which incarnation of The Doctor is your favourite? Do you have a favourite companion?

Redrose64: I've been aware of Doctor Who from summer of 1971, when (age 6, nearly 7) I saw an episode of The Dæmons at a friend's house (that friend's father was Spencer Chapman, for those in the know). Colour TV was still rare then, and my parents only had b/w until the mid 1970s. From about 1975 (The Ark in Space) I really got into DW, and in 1976 started buying the Target novelisations, not as they appeared, but when I could afford them. I have also amassed a stack of reference books; and, from about 2005, the DVDs. (ii) Jon Pertwee; Sylvester McCoy; David Tennant. Definitely not Colin Baker. (iii) Sarah Jane Smith; Ace; Rose. Ian Chesterton was always dependable, like an uncle.
SoWhy: I discovered Doctor Who relatively late, when German TV started airing the new Series in 2007 but have then caught up on all the info there is and decided to join the WP to help with those articles, especially those concerning the 2005 series which I am most familiar with. My favorite Doctor is the Tenth (played by David Tennant), followed by the Ninth, Fifth and Fourth. My favorite companion is Jack Harkness even though he only spent little time with the Doctor.
Glimmer721: Doctor Who was brought to my attention by seeing it around on the Internet (including Wikipedia), and as I love time travel I decided to check it out, remembering that we had the channel BBC America. On April 23, 2011 I watched the series 6 premiere "The Impossible Astronaut" and was hooked. Since then I've watched all of the new series (except I haven't gotten around to "Planet of the Dead" and The End of Time yet) and I've been bouncing around the classic series. After a rough first encounter with an editor who has since been banned, I did some minor work with the project, eventually joining. I began editing majorly about a year ago. Matt Smith (11) will always be my Doctor, but I enjoy them all and like embracing the show as one big anthology. My favorite companions are Sarah Jane, Romana, and Amy and Rory.
Sceptre: I was brought up watching reruns of Doctor Who on what was then UK Gold. I got into the new series about halfway through Eccleston's run and, I think, started contributing to the Wikipedia articles after every Saturday evening during the second series led to me often semi-protecting the articles from IP edits. My favourite Doctor would probably be McCoy. He had, to borrow a phrase from Gallifrey Base, the gravitas to carry the role but was bogged down by the same poor writing that bogged down Colin Baker. Eccleston and Smith would take second and third, respectively. As for companions, no-one will ever live up to Sarah Jane for what she did for the show, but Rory takes a close second.
MarnetteD: I stumbled on the show on PBS one Sunday in 1981. The story was The Keeper of Traken and I was hooked after the first 5 minutes. Our PBS station aired all the episodes for a given story together so I was already spoiled in the way I got to see the show. I remember being rocked the next week when Logopolis aired. I didn't know anything about "regeneration" and I was bummed that this actor (Tom Baker) I had just been amazed by wasn't going to be on the show any more. Little did I know that I would be seeing his serials for about 18 months before any Davison eps aired. I enjoy all 11 actors' takes on the character (and there are even some fun things about Peter Cushing) but I had admired Pat Troughton since seeing his performance as the Duke of Norfolk in The Six Wives of Henry VIII and I enjoyed learning that was the first role he performed after leaving Dr Who. It would be difficult for anyone to top Sarah Jane. I feel like the 4th Dr and Leela were a lethal brains/brawn combination. Ian and Barbara helped make the show work in the beginning. As I rewatch the first two seasons of the new series my admiration for Noel Clarke grows. Even in his early episodes he is making Mickey into something more than a one-note character and Arthur Darvill has done the same with Rory.

Interest in the Doctor Who franchise has grown beyond Britain, particularly in recent years. Do you tend to see more editors from inside or outside the UK working on Doctor Who articles? Are there still some cultural differences or language barriers that get in the way when working with these other editors?

Redrose64: It's not always obvious where somebody comes from. We've tried to harmonise (and not as an obsessive adherence to British terminology) right down to terms such as "season" and "serial" by producing WP:WHO/MOS.
Glimmer721: I am an American, but I have a weird trait where I insist that I call everything by its proper name. I have quickly mastered British terminology and sometimes I even spell "realize" or "civilization" the British way in my everyday life. The only problem with being in America is that some resources are unavailable, especially video or audio commentary on the official website.
Sceptre: Since the gap between transmission in the U.K. and in North America has been cut to hours recently, there have definitely been more American contributors to Doctor Who articles. I think the project is better for it; there are few British editors who would contribute to an article about a Lost episode, for example.
MarnetteD: First I would be remiss if I didn't point out that interest in the show outside the UK has been strong for decades. In the mid 80s the Dr Who Fan Club of America was the largest in the world and I was lucky enough to live in Denver where it was based. The planners for the 20th anniversary convention in Chicago had prepared for two or three thousand attendees but wound up with over 10,000. My feeling is that we have always had a mix of editors over the years. When I joined the project User:Khaosworks was one of the driving forces in keeping things in order and he is from Singapore. Later User:Edokter took a major interest in the project and he is from the Netherlands. Now we have editors like Redrose64 and Don Quixote who are active in keeping things going. They are good at helping enthusiastic newbies understand WP:ENGVAR and I have never noticed big problems or at least sustained ones with Dr Who articles.

Are some subjects or periods in the franchise's history better covered than others? Are there any notable gaps in Wikipedia's coverage of Doctor Who? Have the lost episodes from the earliest seasons complicated matters?

Redrose64: The lost episodes do attract WP:HOAX postings along the lines of "A copy of Rider From Shang-Tu on Kodak Super 8 has been found in a cave in Antarctica and will be released on DVD next week in full colour and Dolby 5.1 surround".
SoWhy: Since the 2005 series started airing at a time that Wikipedia existed, articles concerning this period are mostly written more quickly and with many more editors taking an interest. For example: Category:FA-Class Doctor Who articles contains 3 episodes and 1 person related to the new Series out of 7 in total. Similarly, Category:GA-Class Doctor Who articles contains almost exclusively articles related to the 2005 series. On the other hand, unlike other areas, there are no notable gaps (to my knowledge) in the coverage and every part of the Classic series is covered extensively as well.
Glimmer721: The new series has more online sources which are obviously more accessible. I have been working almost exclusively in Series 5 and 6 because those sources are more available to me, although I am open to collaborating with the earlier episodes and even the Classic series (City of Death is the best of these articles by far).
Sceptre: I think the Baker and McCoy eras have the problem of being a time when Doctor Who was the laughing stock of the BBC, so coverage other than in retrospective books is actually more scant than the days of Tom Baker, when the show enjoyed about 14 million viewers each week. Other than that, the missing episodes also lead to a gap in coverage, although recent discussion of the lost episodes in Doctor Who Magazine mitigates this. That the audio for all episodes exists helps too. Conversely, the new series has a lot better coverage. I was able to get the article for the episode "Partners in Crime" to featured status within two weeks of its transmission due to such a preponderance of sources. And the attention given to Russell T Davies allowed me – although this time over eighteen months – to get that article featured too.
MarnetteD: I would agree with SoWhy that there are not any major gaps in coverage of either the Classic or the New series. We did lose an article that described the naming conflict/confusion of the serials through the first three years of the show but we had to since its source was self-published. It was correct in its research though and if it could ever be resurrected with better sourcing I think it would be helpful but that is just one editor's opinion. I haven't noticed any problems with the "lost episodes" perhaps because coverage of them is still pretty thorough, especially as the soundtracks have been made available on CD over the years.

Does the project deal with a lot of fancruft? What elements of the Doctor Who canon make it on Wikipedia and what elements are cast aside? Is the project in contact with the Doctor Who wiki?

Redrose64: Fancruft? I'll say. Some people just don't get the idea of WP:V and WP:NOR - "just watch this episode, then that one, and you'll see that Martha has clearly used exactly the same two words that were spoken by Barbara in 1965".
Glimmer721: The continuity sections can be a problem and hinge on trivia. I generally try to weed it down to only things found in reliable sources (reviews, official website, books), although sometimes things deal with the main story arcs or are a deliberate nod noted by writers, in which case it can be moved to the "writing" section.
Sceptre: Continuity sections are a problem that the articles have always had. There's a shift among WikiProject editors against these articles, and I for one have been vocal about not using them, and the featured articles don't use them, instead dividing the content throughout the article.
MarnetteD: Fancruft, or trainspotting as one editor called it (I know I used that term for a bit until asked not to but it is still apropos), is always a dilemma. Lots of editors show up with "I just noticed this cool fact" and we have to explain the problem of it not being proper for an encyclopedia. Many years ago these were a feature of WikiP but the BLP problems that came to the fore over six years ago, and led to our WP:RS and WP:V policies, meant, rightly, that they had to go. I have sometimes wished that there was a WikiFiction or Fictionpedia that we could send editors to so they could play to their hearts' content. I can remember two+ years ago there was one newbie who wanted to add the term "base under siege" to various articles. We couldn't find any reliable sources to back up the use of the term at the time. Now the term is used numerous times in the documentary about the 2nd Doctor's era of the show that is an extra on the DVD for The Krotons. If a Wikifiction existed the editor could have entered it there and then it could make its way back here when the sourcing problems were solved. I don't know if there are many (or any) editors who work both here and at the Dr Who wiki.

Has the project had any difficulties acquiring images for articles? What are the copyright implications of posting fair use images of British programs on the US-based servers of Wikipedia?

Sceptre: We used to have a problem with non-free screenshots being used in articles about yet-to-be-transmitted episodes. The scrutiny given to non-free images at FAC has trickled down and editors are more aware of those issues. Because people are very good at crowdsourcing filming locations, we have been able to find images of the show in production (for example, this photo of "The Eleventh Hour" being filmed), something I don't believe we have for other shows.
MarnetteD: We do still have several articles of the Classic series (and even a few for the current one) that do not have images in their infoboxes but I understand so little about the ins and outs of usage that I do not know if that can ever be solved.

The project has built a collection of 66 Good Articles and 7 Featured Articles. Have you contributed to any of these articles? Why do the Good Articles outnumber the Featured Articles by such a wide margin? What are the most difficult aspects to improving a Doctor Who article to GA or FA status?

Glimmer721: I have promoted 22 episodes from series 5 and 6 to GA status, as well as "Fear Her" and Doctor Who (series 5), and I have collaborated with User:Eshlare on some of the character articles. Series 5 is currently at WP:GTC in hopes of becoming the project's first GT. I do not have much experience with FA, but I hope to bring "The Eleventh Hour" and Doctor Who (series 5) up to that status sometime; my only concern is some questionable sources. The FAs are most likely outnumbered by GAs because many of them are episodes, and not all episodes have enough information to bring them up to FA. The most difficult parts for me, as I noted earlier, are continuity sections and sources, especially for the earlier episodes and even the classic serials. Many of the earlier articles consist of bulleted lists, which I am working on removing. These articles must be reformatted before any significant improvement can be made.
Sceptre: I think the main reason is that I've had a case of burnout relating to the Russell T Davies article, having been working for months to get it featured, and I don't really want to go to FAC so soon. For the past four years I've been meaning to make Doctor Who (series 4) a featured topic, but a lack of time or motivation is the major obstacle. I would believe, though, that most post-revival episodes have the potential to become featured, if any editor was interested enough. The controversial cancellation of Doctor Who Confidential may have an effect on future series becoming featured, as that show was valuable for the amount of information about the show's production it provided.

How does WikiProject Doctor Who compare to other science fiction projects, like WikiProject Star Trek and WikiProject Star Wars? Is there any overlap in membership between the projects? Have there been any efforts to collaborate with these projects?

Sceptre: I would assume that there was overlap, but I'm not actively aware of such. Seeing as there will probably never be a crossover between the shows, I don't anticipate inter-project collaboration, but I think those projects' editors may be able to provide us with help as Doctor Who fans.
MarnetteD: I have also noticed very little overlap between projects and I know of no efforts to collaborate. While there might have been some crossover when the Star Trek franchise was still producing television episodes that was in the early days of WikiP and Dr Who was in hibernation at the time.

With the seventh series scheduled to premiere this Saturday, what are the project's most urgent needs? How can a new contributor help today?

Glimmer721: I think the most pressing matter now, with the new series coming up, is that we make sure we are up-to-date on information but do not include rumors or info that are published on fansites and not officially confirmed (such as episode titles or details). We have been doing a good job so far. The new companion (played by Jenna-Louise Coleman) is also something to watch out for, as according to reports little information will be released on her before her debut at Christmas. I also have a goal to bring the newly created episode articles to DYK. Any new contributor is welcome to drop a message at WT:WHO detailing what he or she would like to do and what sources he or she has.
Sceptre: I'm more concerned with the upcoming fiftieth anniversary next year; I anticipate that a Doctor Who article will be featured on November 23 next year, for example. I think general maintenance work to ensure the articles are reliably sourced and decently written is the top priority, as well as some vigilance to ensure rumours aren't reported as fact; Wikiality is a real concept, as I can testify.
MarnetteD: Due to the way that I like to experience the show I stay away from the articles about upcoming episodes so that I will know as little about them as possible before viewing so apologies for being of no help there. I would agree with Sceptre that anything that we can do to prepare for the 50th anniversary (due to the quirks of our calendar it is going to fall on a Saturday which I think is kind of a fun coincidence) would be a benefit to the project and to WikiP in general.


Next time, we'll check out the fungus among us. Until then, pretend you're a mycologist in the archive.


2012-08-27

Sidebar and main page alterations; Recent Deaths; Education Program extension

Proposals

Additions to the sidebar
There has been a suggestion to add links for the Article creation wizard and the Teahouse to the sidebar.
Adding broad topic box to main page
A new section on the main page has been suggested to encourage editing of more topics. This box would contain broad topics that may persuade a reader to become an editor.
Closing Wikiquette assistance
Wikiquette assistance is a place where users who feel they are being treated uncivilly can request assistance in order to come to a mutual solution.
Proposal to promote to guideline
Wikipedia:Official names is an essay that has been cited in many instances including requested moves. Is it time to promote this essay to a guideline or merge it into existing guidelines?
Main page redesign
A competition has begun to find a better alternative to the current main page, which was last redesigned in March 2006.

Requests for comment

Categorization of persons
Should biographies be further categorized to include genetic and cultural heritage, faith or sexual orientation?
Changing the Mediation Committee
Users are requesting input on how the Mediation Committee should be changed or if it should be closed.
De-adminship
There are currently five ways in which administrators can lose their rights, three of which are controlled by the administrator. Should more community-based mechanisms be available to remove administrator rights, or is an administrator, once elected, one for life?
Global ban discussion
As mentioned in previous Discussion reports, the community is still trying to work on the details of the global ban policy regarding problematic editors.
Recent Deaths blurb discussion
Should the recent deaths section of In the news be replaced with names or should more be added?
Turning on Education Program extension
Should the Education Program extension be turned on to make it easier for the community to track assignments?
