Wikipedia talk:Wikipedia Signpost/2022-11-28/Op-Ed

Discuss this story

  • Lots of interesting observations in this op-ed but I disagree with the author's conclusion that "The day Wikipedia is replaced, it will likely be by something completely different that didn't even set out to compete with the Wikimedia wikis." I have previously written about 3 scenarios I see for how something fundamentally similar to Wikipedia could usurp us. In particular one of the advantages Julle identifies - the years of institutional work - doesn't strike me as an insurmountable obstacle given the ability to port Wikipedia and use that as a basis, especially if it could be accompanied by successful AI writing which could outproduce human editors. Best, Barkeep49 (talk) 16:00, 28 November 2022 (UTC)
    Thanks, Barkeep49, for reading and commenting. Indeed, there is even a comment in the technology report of this very Signpost issue on attempts to do so. Should machine learning get to that point where it becomes significantly better than our efforts, so that readers find their way to something new simply because we're lacking content they are looking for, we might perhaps also want to try to use similar tools if they were available to us – not necessarily automatically but as a tool for the writer, with the added benefit of manual oversight. As a movement, we have tried various ways to include content in a semi-automatic manner: bot creation, mw:Extension:ArticlePlaceholder, now the work on Abstract Wikipedia for languages which lack a lot of content. I guess this would most likely be first adopted by smaller languages, always more desperate for article content, and then move to bigger ones if it proved superior to what we are currently doing.
    But I think English Wikipedia is comprehensive enough that it would be difficult to compete on mainstream topics, because of the inherent inertia in people's habits. Maybe – thinking out loud here – the real threat to our existence from AI would be if it were used to create content to lock people into the platforms they are already using? As a way to keep people from having to leave Facebook, TikTok or other platforms, because the information was already there. /Julle (talk) 02:07, 30 November 2022 (UTC)
    And to clarify: When I talk about the years of institutional work, I don't necessarily mean the content. I primarily think of name recognition, links, reader patterns and processes. /Julle (talk) 09:44, 30 November 2022 (UTC)
    @Barkeep49: Interesting scenarios. They remind me of these ones from the WMF's 2018 revenue strategy - though those were attempting a slightly different scope but seem to hit some similar themes. T.Shafee(Evo&Evo)talk 11:24, 28 December 2022 (UTC)
    The obvious AI/LLM advantage, looking at the last few months of development, isn't quality or even amount of information, but rather convenience. We could very well be replaced by someone being able to ask a simple question and get an answer, instead of having to read – few want a long article if they're just looking for a brief answer to a question. /Julle (talk) 21:55, 15 April 2023 (UTC)
  • Some insightful commentary here worth thinking about - thanks for writing this Op-Ed! --Ganesha811 (talk) 18:24, 28 November 2022 (UTC)
    Thanks! /Julle (talk) 02:08, 30 November 2022 (UTC)
  • Following on from Barkeep49's comment, my thinking is that the next step would be automated information synthesis via AI, but would this replace Wikipedia or transform it? As I understand it, there are some article types, like place articles, that have been auto-generated en masse via interpretation of GIS data via an algorithm, but this is a far cry from AI article creation. One thought is to proactively consider what policies and guidelines would need revision in the face of truly competent AI authors. Would they be treated and held to the same standards as human authors? Or would AI contributors start as vandalism eradicators and new article patrollers - which comes with its own set of policy and guideline needs.
    As for the notion that a Wikipedia replacement might come from an unanticipated quarter, that is certainly a possibility. Consider the Cochrane Library; this is a medical knowledge repository consisting of detailed reviews which are frequently updated. The content of this resource is used as source material for more general Wikipedia articles, but if either AI or other human collaborative technologies could compose meta-reviews based on these detailed reviews, then much of the medical article content of Wikipedia could be supplanted. This would be 'replacement by evolution' and is consistent with the notion that particular content domains covered by Wikipedia could be addressed more thoroughly and authoritatively by focused repositories (e.g. Wookieepedia, Memory Alpha). --User:Ceyockey (talk to me) 03:07, 29 November 2022 (UTC)
    Thanks! I took this comment into account when replying to Barkeep49 above. /Julle (talk) 02:08, 30 November 2022 (UTC)
  • Wikipedia has done what it needs to do to become the top product in its field, but that's why it's so important that we keep improving article quality. Wikipedia is a public service, and its position as the most popular encyclopedia makes it that much more important that information in Wikipedia is verifiable and neutral, not to mention as comprehensive as possible. Thebiguglyalien (talk) 16:33, 29 November 2022 (UTC)
    Thanks for reading and commenting! I appreciate it.
    I don't think anyone disagrees – as the penultimate sentence states, "[a]rticle quality is important, as a method to achieve our mission". To clarify, the main point isn't that quality is irrelevant, or that it's pointless to have better articles than we have now, but only that we are probably way past the point of what is good enough to persuade readers to use or not use something as a source of information, and to think about what this means for others. In short – not "article quality is not important" but "article quality won't save us".
    There is also an inherent conflict between being as verifiable and as comprehensive as possible. Look at the discussion around Olympic athletes, for example, where English Wikipedia has decided to require more and better sources for inclusion. This, one could argue, makes the individual articles better, but comes at a cost when we're looking at a comprehensive encyclopedia.
    As I talk about in the "Quality and the reader" section, it's not obvious that what we define as quality corresponds to reader needs. The more work we put into having the best sources, into making sure our articles match reality as described by the best works we can find to build on, the better, but I'd argue our definition of "quality" also comes with assumptions regarding what makes a good article which might or might not live up to what our readers are looking for. For example, we celebrate length, in our FA or GA articles, because we serve the reader with more information, but someone looking something up in an encyclopedia might often be looking for a briefer, quicker overview than we're giving them, somewhere between the article as it is and our summary in the lead section. /Julle (talk) 01:49, 30 November 2022 (UTC)
  • Regardless of what an article concludes, I personally still am interested in improving the quality of content on the encyclopedia. It also helps me improve my writing skills, I think. casualdejekyll 14:11, 30 November 2022 (UTC)
    Hey casualdejekyll, thanks for reading. As the article states, "[a]rticle quality is important, as a method to achieve our mission". The text doesn't say we shouldn't improve our articles – of course we should. It just comments on how article quality, past a certain point, relates to reader retention. /Julle (talk) 08:23, 2 December 2022 (UTC)
  • Dear Julle, your essay above is great, a well-structured text with a thesis (we should scrutinize our concepts of "quality"!) that hits the bull's eye. Even in a well-maintained context like The Signpost this op-ed excels. And as a German I may add that it's much better than almost all contributions in the German Wikipedia equivalent, "Kurier", where even the debates tend to be argy-bargy and tedious.
    As for the content, I want to comment that I've often compared Wikipedia's decorated articles with classical Encyclopedia Britannica articles. The latter were mostly much better structured, had good subtitles and an optimized length. And I've asked myself: why couldn't we use those well-known techniques to collapse the abundance of details in order to get an optimized length, at least in our "excellence" decorated articles? But I'm afraid there's no willingness in broad parts of our community to enter a path of innovation like this.
    Last but not least, I'd like to emphasize the broader context of Wikipedia in an increasingly messed up Western society. The public educational system in America is not very good; even in rich Germany it's not good and is underfinanced. Maybe Scandinavia is better off in that respect compared to most of the world. The cost of living crisis (inflation of 7-12% in core Europe) we are facing doesn't make it easier to appreciate classic standards of text quality. Isn't the TikTok mania, like other hypes before it, also evidence that most individuals nowadays are psychologically struggling to attract notice instead of fulfilling standards of quality like pre-neoliberal generations before? Only in older (1960-1980) feature films or literature can we find a much slower pace of everyday life. But the postmodern lifestyles nowadays leave the good old path of the achievements of the Enlightenment. --Just N. (talk) 16:16, 1 December 2022 (UTC)
    Thank you for the kind words. /Julle (talk) 08:23, 2 December 2022 (UTC)

  • To the point that convenience often supersedes quality, look no further than the shift from land lines to cell phones. AT&T (COI note, I worked for that company for 15 years) put a lot of effort into improving long distance & land line quality, only to be thwarted by the adoption of the more convenient but lesser quality audio of cell phone calls. It is true that cell phone quality has improved over the years, but this was not much of a factor in general adoption of the technology. Peaceray (talk) 22:11, 2 December 2022 (UTC)
    And quality can continue to improve long after adoption – as in the case of Wikipedia. /Julle (talk) 23:26, 2 December 2022 (UTC)
  • Really interesting and much needed! Thank you for this and especially the note on mobile editors (as I type this now on my phone it feels particularly pertinent). I think something that is at the periphery of this discussion is the 133,000-odd articles tagged as unreferenced which is a) an undercount because there are many more not tagged and b) a significant decrease from about a year ago when we were at 145,000 or so. While this is a relatively small percentage of English Wiki articles, if combined with the others that are poorly, insufficiently, or questionably sourced, this is a serious weakness in what has been correctly described as one of the main principles of Wikipedia that must survive beyond any expansion, technology change, or other adaptation: how did we get this information and where did it come from? (I think this is a major weakness of many Fandom Wikis that cite their subjects recursively.) Article quality is important; content verifiability is a major component of that quality we must uphold. What are the consequences to the health and longevity of this project of having junk articles that throw sand in the gears? While I am neither a deletionist nor an inclusionist, it seems like the only wrong answer is to do nothing. I imagine other projects that have tried to walk this line of breadth vs quality have failed by tilting too far in either direction and we are on a narrow and possibly fraying tightrope. And yet the lack of a better alternative is possibly what’s keeping us going. Late night food for thought. Kazamzam (talk) 02:53, 4 December 2022 (UTC)
    Thanks for your kind words. (: /Julle (talk) 15:30, 7 December 2022 (UTC)
  • Wikipedia used to be really popular for most of 2000s, but once we enter 2010s whatever visits and editors we had at the early times of Wikipedia mostly lost. As a result, article quality mostly suffered. Not just that, but my language's Wikipedia also suffered from this: while there has been some attention in recent years, article quality does not seem to be improving. Also, Google really seems to be moving away from Wikipedia, as my searches don't show Wikipedia at the top. Don't forget that hobbyist wikis, (mostly) catering to the gaming space and Internet culture, are where Wikipedia is not competitive. To compete with today's internet, we should strive to increase the coverage and quality of our articles, while keeping our principles and regaining enough trust that Wikipedia can be used as a trusted source. MarioJump83 (talk) 12:24, 6 December 2022 (UTC)
    @MarioJump83: "Wikipedia used to be really popular for most of 2000s, but once we enter 2010s whatever visits and editors we had at the early times of Wikipedia mostly lost." The statistics don't support that characterization, at least on the English Wikipedia.
    Monthly pageviews stats for July 2015 – November 2022 show an increase from the stats for December 2007 – July 2016, despite the fact that we know more people are accessing Wikipedia information via other means (like Google search results pages or YouTube fact-checking boxes, as two examples) that wouldn't be counted in those stats.
    Even the worst month of 2022 exceeds the best month for all of 200x. The peak for pageview counts here occurred in the period from 2010 – 2015, when monthly views would occasionally spike over 10 billion; since July 2015 we've only broken 9 billion once (April 2020) and rarely exceed 8 billion. But the raw readership numbers, at least, trend markedly upward from 2007 – 2014, then slightly dip over 2015/2016 before turning effectively flat since 2017.
    Editor counts, I can't speak to. The current 30-day active user count is ~121,000, up insignificantly from ~118,000 in January (found in some random article written then), but if there are any historical archives for that figure I don't know where they live. FeRDNYC (talk) 03:09, 7 December 2022 (UTC)
    The number of active editors on English Wikipedia started falling in 2007 (the most cited article discussing this is "The Rise and Decline of an Open Collaboration System: How Wikipedia’s Reaction to Popularity Is Causing Its Decline", by Halfaker, Geiger, Morgan and Riedl), but the curve flattened out and stabilized in 2014. Other wikis follow similar patterns – around 2007 or 2008, what had been a steady growth of active editors turns into a net loss. A more likely connection to quality seems to be that as article quality grew, it became more difficult to add to them; many of the older and larger wikis gradually shifted focus from semi-desperately wanting new content to being more concerned about the quality of the content they had, partially fuelled by a number of unfortunate statements in biographies of living people. The time and effort involved in adding new content to Wikipedia increased. /Julle (talk) 15:29, 7 December 2022 (UTC)
    I don't think it's really about the difficulty of contributing to Wikipedia so much as the strong negative reaction that newcomers received in response to their contributions. In that paper, we include an analysis of the quality of contributions of newcomers and note that there is no meaningful transition that corresponds to the sudden drop in newcomer retention. Instead, we see that the negative feedback to newcomers was the strongest predictor. This also bears out in a multivariate statistical model that accounts for other potential factors. Talking to people about their experiences as a newcomer in Wikipedia supports this conclusion. This result has also been replicated across a wide range of hobby wikis. See Revisiting “The Rise and Decline” in a Population of Peer Production Projects. When quality control activities increase and it starts biting newcomers more, retention rates go down. It's not about the difficulty of contributing, but rather how we treat good-faith newcomers when they find somewhere to contribute. --EpochFail (talkcontribs) 17:22, 7 December 2022 (UTC)
    One of my underlying assumptions is that quality concerns are a factor in how we treat newcomers. I hypothesized a bit more about this in m:User:Julle/Essays/The Patroller's Dilemma back in April, on how quality concerns mean that patrolling gets more difficult because you can't just fix something without significant effort, which means that it becomes more of a binary "yes or no" thing even without semi-automatic tools (since we see the same development on wikis which didn't mirror English Wikipedia's technical development in that area). /Julle (talk) 23:08, 8 December 2022 (UTC)
    (Note: All of my "million"s above should've been "billion"s; I've corrected them inline with the previous text struck out. Sorry about that.) FeRDNYC (talk) 19:10, 8 December 2022 (UTC)
  • I am a bit late to the party that is this comment page, but I just wanted to plus-one the comments elsewhere herein, affirming that this is a great and useful essay, displaying the kind of great analytical thought that humans will always need if they are to keep themselves out of the ditch regarding quality of life, quality of epistemology, and quality of civility among themselves. Great work. Also, kudos to the several commenters who had useful and interesting thoughts to add as well. Quercus solaris (talk) 23:46, 9 December 2022 (UTC)
    Thank you! And yes, it's a privilege to be blessed with a good comment section. /Julle (talk) 01:33, 10 December 2022 (UTC)
  • I think that this is an excellent essay, like others have said. I personally think of one other scenario in which something like this could happen: a company producing 1000 featured-quality articles, written from scratch and drawn from WP:Vital articles, and then making them widely accessible for readers. They don't need to replicate Wikipedia, they just need to write it themselves, look at Wikipedia's sources and ask experts to make sure that the info is correct. You don't even need AI to do this, though it would greatly accelerate the cause. Because the content was written from scratch, they can just sue Wikipedia if we borrowed information from them, but more likely we won't catch up in time to compete with them as they would've produced even more quality articles. The company would be heralded as the conqueror of Wikipedia, just like when Wikipedia was heralded as the conqueror of Britannica. CactiStaccingCrane (talk) 14:46, 10 December 2022 (UTC)
    @CactiStaccingCrane: Well, it's a funny thing, that. To look at https://www.britannica.com/, you'd be forgiven for thinking that if they've been conquered, nobody told them! Wikipedia has forced Britannica to pivot heavily away from stodgy encyclopedia-writing and diversify into a more general publishing company, but they've done that successfully and don't appear to be going anywhere. (Which, don't get me wrong, is great. I don't know about Jimbo Wales, but personally I've never been here to be a giant-slayer. I'm here to build an encyclopedia.) The thing that "conquers" Wikipedia may ultimately just force Wikipedia to reevaluate how it operates and what it represents, and that's rarely a bad thing. FeRDNYC (talk) 10:31, 12 December 2022 (UTC)
  • I found it interesting that most threats to Wikipedia's survival and domination are identified within the same perspective where Wikipedia excels: the supply side. WP has massive and obvious technological limitations on the demand side, which already place it way behind the technological development curve. Think of the capacity to gauge readers' use and feedback (it takes a full team of academics writing a journal paper to shed some light, whereas other social media know instantly what I am doing), or to integrate multimedia or external data sources. As was said well long ago, you can't fix a problem with the same mindset which generated it. Tytire (talk) 13:46, 18 December 2022 (UTC)
  • What, if anything, does the upper bound on user quality expectations mean in the graph adapted from Clayton Christensen? - Jmabel | Talk 22:38, 19 December 2022 (UTC)
    Jmabel: It merely indicates that quality expectations differ a bit from person to person, so even if we're ignoring outliers, having one line saying "this is what people expect" would be a bit too far from the truth. But – yet again, adapting from Christensen – the majority of users will have roughly similar quality expectations. /Julle (talk) 17:56, 20 December 2022 (UTC)