Wikipedia talk:No original research/Archive 64


Cut the second sentence

The second sentence of paragraph 1 of the lede was particularly bunk. Removed here. However, the whole of the five sentences of the lede is poor. I think it should be tossed for a major rewrite, but concisely. SmokeyJoe (talk) 03:00, 28 December 2022 (UTC)

The words have been there a long time. That doesn’t mean they were ever good words. SmokeyJoe (talk) 23:37, 28 December 2022 (UTC)
It does mean you must convince other editors before you take action. That applies to any lead, especially the lead of a policy.
You also need to be specific: Start a section and copy an existing phrase or sentence, explain how it doesn't properly summarize content in the body, and then show us your suggested improvement. Repeat ad libitum.
Create a consensus with others. Seeing you hold a conversation with yourself is painful. -- Valjean (talk) (PING me) 23:52, 28 December 2022 (UTC)
Status quo:

Wikipedia articles must not contain original research. The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist.[a] This includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources.

Footnote [a]

By "exist", the community means that the reliable source must have been published and still exist—somewhere in the world, in any language, whether or not it is reachable online—even if no source is currently named in the article. Articles that currently name zero references of any type may be fully compliant with this policy—so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source.

Proposed version:

Wikipedia articles must not contain original research. "Original research" (OR) includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources.

SmokeyJoe (talk) 00:08, 29 December 2022 (UTC)
What I particularly dislike about this definition is that it's not a definition. By relying on this "includes" [but is not limited to] language, it says, effectively, that OR is anything I want to claim. Right now, if I want to say that it's OR, I have to write a sentence in which the old definition makes some sense. If I'm going to object to someone using the letter C at the start of a paragraph, I can't say "using the letter C is OR", because I'd have to say "using the letter C is material—such as facts, allegations, and ideas—for which no reliable, published sources exist", which wouldn't make sense.
With this proposed definition, I could insist "Oh, sure, OR includes any analysis or synthesis of published material, but it also includes other things, like using the letter C at the start of a paragraph, and citing Twitter, and adding anything about COVID that I personally disagree with."
This change would stir up disputes. Policies should not do that. WhatamIdoing (talk) 23:26, 4 January 2023 (UTC)
Also, this changes the definition of OR to omit unpublished sources. So if – as has happened in the past – someone working for a company updates the Wikipedia article to provide up-to-the-second information about the company's revenue on the basis of a confidential spreadsheet they have access to at work, this would stop being a NOR violation and "only" be a WP:V violation. WhatamIdoing (talk) 23:37, 4 January 2023 (UTC)
Thank you User:Valjean. I certainly don’t want to cause pain by talking to myself.
The bold edit clearly defines the intended action. The BRD thing says to do the edit and draw in someone else who cares. BRD is a dubious method, ok.
After being reminded by WAID of the details in the lede, I was reminded that the lede is displeasing to read. The five paragraphs contain things of dubious merit to be in the lede. The lede contains too many words and ideas. Too much makes it difficult to read and more difficult to distill meaning. I believe that most people, like me, when reading weird verbosity begin to skim read. On having the computer read it aloud, it sounds like a person talking to themself at the end of the day after a good meal.
On close examination, I identify the second sentence, the very sentence that WAID identifies as generating conflict with PSTS, as the biggest stumbling block.
Without debating whether the second sentence is true, I propose to cut it, and to let the current third sentence give the working definition of OR. It did that already, but in confusion if not contradiction with the second sentence. I think the proposed sentences 1 and 2 flow very easily. SmokeyJoe (talk) 00:19, 29 December 2022 (UTC)
Also, I would add that having the second sentence implicitly reference WP:V is not good. Either do it explicitly, or start off talking about what this policy says. SmokeyJoe (talk) 00:24, 29 December 2022 (UTC)
I think this proposed change doesn't actually define original research. Instead it merely says here are some other things that are also original research. I think the first sentence does accurately define OR as it is used on Wikipedia and so if we're looking to clean up the LEAD that's not where I would cut. Best, Barkeep49 (talk) 00:38, 29 December 2022 (UTC)
I think that references to how Wikipedia uses a term, and what the community means, need verification. It’s displeasing to read core policy on sourcing for content that itself asserts facts without references.
On the definition of original research, that is a good question. SmokeyJoe (talk) 00:44, 29 December 2022 (UTC)
User:Barkeep49, in looking for a rigorous definition of “original research”, it eludes me. I think outside Wikipedia it is a rare and merely descriptive term. The mainspace entry and its references are unimpressive. As a term of art, I think it is a Wikipedia neologism.
Looking at its use on Wikipedia, I think it is used loosely, and mostly consistent with editor synthesis in combining multiple primary sources. This is squarely classified as the creation of a secondary source, historiographically speaking, as well as journalism and science. This is in stark and awkward contrast to the mainspace entry reference.
I think it is fine to have this policy define it as a Wikipedia term of art, but then it should do so explicitly, not by inference to what the “community means” without linking to evidence of that. And in any case, this is quite a distraction of a rabbit hole for the second sentence of a core content policy.
I don’t think “this proposed change doesn't actually define original research” is a serious criticism, and I think you are wrong in your use of the word “accurately”.
Do you offer a hint of agreement that the lead of WP:NOR fits poorly with WP:LEAD, if allowing for the principles of a good article to correspond to the principles of a well written policy page?
- SmokeyJoe (talk) 22:10, 29 December 2022 (UTC)
I'm sorry you don't think that my statement is serious criticism. So let me take this opportunity to note that I am quite serious when I say that I do not support the change because you are removing a definition of Original Research, which as you note is a Wikipedia term of art. As for the LEAD needing a rewrite, I haven't given it much thought but am happy to accept it needs a change. Following the idea of LEAD makes some sense to me and LEAD would certainly support defining what it is we're talking about in the opening sentences and paragraph. Best, Barkeep49 (talk) 00:13, 30 December 2022 (UTC)
OK. It should begin with a definition of OR. I don't think the definition should boil down to: Anything that fails WP:V.
I would define: "Original Research" is the interpretation or combination of facts without guidance from at least one reliable secondary source. SmokeyJoe (talk) 01:09, 30 December 2022 (UTC)
I agree with edits in this direction. I see no reason why sources that "exist" independently of anyone in a given dispute reading or referencing them makes a bit of difference. We don't write articles based on the idea that sources merely "exist" somewhere on the planet for what we are writing. I'm aware though that this has been discussed before not long ago and some editors like it, so I would suggest preparing an RfC. Crossroads -talk- 00:18, 29 December 2022 (UTC)
IMO, if we stop explaining the difference between verified and verifiable, we'll end up with more people removing unsourced but perfectly reasonable stuff. DFlhb (talk) 20:07, 30 December 2022 (UTC)
It's always good to discuss, even for policy changes. Still, the policy is made clearer by the removal. The lead explains the main idea, and leaves the detail for later. Shooterwalker (talk) 01:12, 30 December 2022 (UTC)

If our policy was "every statement in an article should be cited to a reliable source," we wouldn't have these problems. But that aside, if OR is not "anything that fails V", then I can't imagine what OR could possibly be. In other words, what's the other potential definition, besides "anything that fails V"? Levivich (talk) 16:16, 30 December 2022 (UTC)

"Original Research" is the interpretation or combination of facts without guidance from at least one reliable secondary source.
- SmokeyJoe (talk) 21:34, 30 December 2022 (UTC)
So, all use of primary sources is original research? WhatamIdoing (talk) 22:53, 4 January 2023 (UTC)
“The *exclusive* use of primary sources is original research” is a good lie-to-children.
It may be possible to compile a list using only primary sources, but newcomers should be discouraged from doing this.
In WP:FRINGE topics, all primary sources is a huge red flag.
In medical science, it is very likely that expert editors will argue that all these sources are primary, and I might argue that their primary source definition is affected by the different science definition, and the question devolves to subtleties of source typing. WP:NOR is not written to restrict expert editors, but to guide newcomers. SmokeyJoe (talk) 00:15, 5 January 2023 (UTC)
It is possible to start quite a lot of good articles on notable subjects using only primary sources, without violating NOR at all. Consider:
Do you see any NOR violations? I don't. Do you see anything that is a secondary source? I don't.
I don't think it would be possible to write a high-quality article with only primary sources, but that's not really NOR's problem per se. WhatamIdoing (talk) 03:18, 5 January 2023 (UTC)
Yes. Quite right. What this means is that a NOR violation is not a sufficient reason for deletion. It is a reason to look to improve the article. SmokeyJoe (talk) 03:31, 5 January 2023 (UTC)
I don't agree with that at all. In fact, I think all four examples should be considered policy violations:
  • "a document signed in 1787 that describes the organization of the US federal government" is wrong because while it was signed in 1787, it wasn't ratified until 1788 and didn't go into effect until 1789. It did not describe the organization of the US federal government in 1787 -- the document that described the organization of the US federal government in 1787 was the Articles of Confederation, from 1777. And the document itself isn't a valid source to say it describes the organization of the US federal government today -- or at any point in time, because the document doesn't say whether or not it was ratified. It's easy to make this statement about the US Constitution, but suppose it was "The Constitution of Fiji is a document signed in 1874 that describes the organization of the Fiji federal government" cited to some document from 1874 that said "Constitution of Fiji" at the top and recited 1874 as the date--that shouldn't be enough to publish the statement in wikivoice, because we don't know if the document is genuine, if it went into effect, etc.
  • Like a document, a video of an Oscar ceremony also has authentication problems. How does one distinguish a video clip of an authentic Oscar acceptance speech from a video clip of a staged one from a movie? A video clip of the ceremony is a terrible source to verify that someone won an award. Aside from that, it's not really a primary source for the award itself...when they announce the award, they're announcing the results of the primary source, which is the tally of the ballots itself; so the announcement is an interpretation of a primary source (the tally), and thus the announcement is a secondary source.
  • Same as above; the announcement is a secondary source. An announcement of a death is not the contemporaneous recording of the fact of death--that would be the death certificate. We shouldn't announce the Queen's death based on a scan of a death certificate, as I'm sure we all would agree. But the point is: you can't start an article about someone's death based on a primary source (a death certificate), although you could based on a proper secondary source, like an official announcement.
  • I've long been opposed to the use of case law as sources, but they're not primary sources, either. They're primary sources for their holdings and dicta, but not primary sources for their summary of the procedural and factual history of the case. They're mixed primary/secondary sources, as are almost all sources. But it's not a purely primary source, like a Constitution, a ballot tally, or a death certificate.
That aside, I acknowledge Joe's point that WP:SYNTH is also OR, but I would say that SYNTH, or "the interpretation or combination of facts without guidance from at least one reliable secondary source", also fails V. So I'm afraid neither Joe nor WAID's examples convince me that there exists anything that complies with V but fails OR. IMO, SYNTH should just be moved to V and we should downgrade NOR to an explanatory supplement to V. Levivich (talk) 00:19, 6 January 2023 (UTC)
Your points are correct if the sources are historical, but are not correct if the sources are for today’s news. It is ok to use today’s primary source for an event that happened today.
"SYNTH should just be moved to V and we should downgrade NOR to an explanatory supplement to V." How is that different to merging V and NOR into WP:ATT? I think this might be the most viable option. SmokeyJoe (talk) 04:35, 6 January 2023 (UTC)
I don't think any of these merges should be done. Original research and verifiability might have some overlaps but they are not the same thing. And I don't agree with Levivich about "mixed primary/secondary sources, as are almost all sources". Many sources are purely primary, many sources are purely secondary.
  • "authentication problems". This, to me, is about the "definition of published." Sources have to be reliable and reliably published. Method and form of publication creates a database paper trail and a record that is verifiable. A video clip may or may not have that record - some do, like the Library of Congress or whatever, but any old clip on the web does not.
  • "And the document itself isn't a valid source to say it describes the organization of the US federal government today -- or at any point in time, because the document doesn't say whether or not it was ratified": that kind of logic is what we're trying to prevent with the idea of original research. What you're doing is interpreting a primary source. You can interpret a primary or a secondary source. But we need to get interpretations from secondary sources and not interpret them in a novel way on our own. Original research is about doing your research, and letting that research tell you how to think about things, rather than thinking about them yourself.
  • Original thought can sometimes be very verifiable. It's just a novel arrangement of fully verifiable information. Particularly a fringe theory. If I give you a ton of really obscure papers that are fully verifiable and I've decided to arrange them in such a way to suggest something that isn't the academic consensus of reliable secondary sources, that's original research. Andre🚐 04:40, 6 January 2023 (UTC)
@Levivich, as I have said before, all original research fails WP:V; however, some things pass NOR but don't pass WP:V. The overlap between the two is acknowledged to be substantial; that overlap was the basis of the Wikipedia:Attribution merge proposal in 2007. WhatamIdoing (talk) 01:36, 9 January 2023 (UTC)
If OR was as simple as "anything that fails V", then we wouldn't need two separate policies. This policy page describes things like synthesis, where people use verifiable sources to imply their own conclusions or ideas. The policy against OR keeps articles to the summary style of an encyclopedia, instead of a research essay or opinion piece. There is a lot of overlap between NPOV, OR, and V, because they all combine to achieve some of the same goals. But we've kept each policy focused on different aspects of the same problem, so it's easier to point those problems out. Personally, I think that we should try to avoid so much repetition between the policy pages, and refer people between policies if they need more information. Shooterwalker (talk) 21:39, 30 December 2022 (UTC)
The claim "'Original Research' is the interpretation or combination of facts without guidance from at least one reliable secondary source" is seriously incomplete. Relying on primary sources is not original research, although it may violate other policies. Reporting facts witnessed by the Wikipedia editor is original research, although it may not involve interpretation or combination. Jc3s5h (talk) 22:02, 30 December 2022 (UTC)
Jc3s5h, can you expand on “seriously incomplete”? Eg, what’s another violation of NOR that is not already a violation of V?
“Relying on primary sources is not original research” is only true depending on the detail of what you are doing with them. What other policy limits the use of primary sources?
Reporting facts witnessed by the Wikipedia editor is a straight WP:V failure. SmokeyJoe (talk) 22:36, 30 December 2022 (UTC)
The fact that much original research also fails the verifiability policy doesn't mean it isn't original research. Jc3s5h (talk) 00:59, 31 December 2022 (UTC)
To answer one of SmokeyJoe's questions, another guideline that limits the use of primary sources is Wikipedia:Notability. A topic for which only primary sources can be found is not a suitable topic for a Wikipedia article. Jc3s5h (talk) 01:06, 31 December 2022 (UTC)
All original research fails WP:V; some things pass NOR but don't pass WP:V.
For example:
  • My shirt today is off-white: There are no published sources anywhere; therefore it fails NOR and WP:V
  • My shirt today is off-white, plus I join some antisocial media and post that fact there: a published source, so it passes NOR, but it's self-published, and I'm not a relevant expert, so it fails WP:V.
The fact that there is substantial overlap between the two was the basis for the old WP:ATT proposal (i.e., to merge them). WhatamIdoing (talk) 23:00, 4 January 2023 (UTC)
The three core content policies are so intertwined and mutually reinforcing that any two of them fully imply the third. (I have seen it argued that WP:NPOV implies both of the other two.) “All original research fails WP:V” requires a thorough and deep understanding of WP:V. I don’t think bright-eyed newcomers should be expected to read it at that level. I think content policy should be written for the level of a young bright-eyed newcomer. SmokeyJoe (talk) 00:07, 5 January 2023 (UTC)
I don't think NOR was written for bright-eyed newcomers. I think that both the history and the use of the policy show that it exists for experienced editors to point at, when they're telling people to take their unwanted content and go away. WhatamIdoing (talk) 03:21, 5 January 2023 (UTC)
I think that some policies, especially NOR, were written as a high culture debating forum for leading Wikipedians, and that they are improved in utility by rewriting bits for easier comprehension by the newcomers. These are Wikipedia’s most important policies, which is not to say that they contain the deepest wisdoms, but that they are the first things to be read by a newcomer when they start to become serious Wikipedians.
I am also aware that project pages may serve as clubs to beat troublesome editors. WP:ENC is an example. It is unhelpful to point any well intending newcomer at WP:ENC. I suppose it is useful if the editor is entirely deaf to prior advice and they are a net negative to the project. SmokeyJoe (talk) 03:42, 5 January 2023 (UTC)
  • "a high culture debating forum for leading Wikipedians": I find this a puzzling construction. Wikipedia policies were written by many people over a long period of time on talk pages and mailing lists and discussed on emails and IRC channels. Some of those discussions may have been contentious but many were not. WAID and others know the history better about certain things but the lore aside, some of the policies haven't changed that much since 2003, in principle.
  • "project pages may serve as clubs to beat troublesome editors": That would be battlegrounding but policy can and should be used as a way to express logical arguments. Policy is the law, and we don't wikilawyer the letter of the law. Wikipedia:The rules are principles. So you need conceptual clarity to see why policy was made the way it was and what it was for, to be able to apply it in a logical discussion in a constructive way. In other words, apply spirit and intent, not the wriggly wiggly gray areas like a math problem. The policies are meant to be interpreted and applied, not debated. But we have policy shortcuts so they can be used defensively, so to speak, to defend those principles in the fracas and the day-to-day messy business of collaborating online, where there is asynchronous information and we need a basis for collaboration: shared understanding of our goals, vision, and values. Andre🚐 04:48, 6 January 2023 (UTC)
    Yes, but you started editing when Wikipedia:Product, process, policy was taken seriously. WhatamIdoing (talk) 01:39, 9 January 2023 (UTC)
    That's a good read and pretty instructive of a lot of thinking I still agree with. Radiant was largely responsible for mergism and a lot of pragmatic thoughts from that time. Andre🚐 01:53, 9 January 2023 (UTC)
  • I think that "original research" is when someone publishes a theory or contention that hasn't previously been published. If it's previously published then it's not OR, regardless of whether the previous publication was a primary, secondary or tertiary source, and regardless of whether the source it was published in is reliable.
The ancient astronaut theory isn't OR. It's fringe and bad science and so on, but it's not original.—S Marshall T/C 22:50, 2 January 2023 (UTC)
I agree with: "original research" is when someone publishes a theory or contention that hasn't previously been published.
I think that on Wikipedia, the term “original research” is only well used for describing WP:SYNTH where there are multiple reliable sources. I think it would be helpful to formally exclude from the Wikipedia definition of “Original research” things that are more simply described as violations of WP:V. Thus: “false facts are not OR”, they are WP:V fails. I think that most uses on Wikipedia of the term “original research” do exclude simple WP:V fails, and that all meaningful uses do exclude simple WP:V fails. WP:NOR is a level of complexity above that of WP:V.
Primary vs Secondary source classification is extremely useful, for all levels of editor from child up, because verifiability in a secondary source easily distinguishes Original research from source-based material, even in the presence of a myriad of reliable primary sources. Without WP:PSTS leading WP:SYNTH, WP:SYNTH would be very hard to work with. It is harder to explain WP:SYNTH than to ask the enthusiastic editor “where are some secondary sources?” SmokeyJoe (talk) 23:18, 2 January 2023 (UTC)
  • No, whether it's primary or secondary or tertiary is quite irrelevant to this. It's OR if not previously published. That's all OR means.
    Please don't confuse this with PSTS, SmokeyJoe, because PSTS has nothing to do with it.—S Marshall T/C 01:10, 3 January 2023 (UTC)
    No, that’s not all OR means. There is more subtlety to NOR than just V.
    The subtle difference happens to align exactly with the historiographical distinction between primary sources and secondary sources. Note the linking to mainspace articles, not to Wikipedia essays. If articles are wrong or lacking, then fix them!
    A theory is necessarily secondary. The need for secondary source for a theory is unavoidable.
    The historiographical approach, like all approaches, will have differing value in different applications, weaker with cutting edge medical science, but very strong with anything historical. The approach, I note, has difficulty applying to many articles on sportspeople, and current highways, which I think is, appropriately, interesting.
    Wikipedia is a tertiary source. Does this have any implications to anything?
    There is nothing of substance in NOR for which PSTS is not a foundation. The motion to excise PSTS is a motion to devalue NOR. SmokeyJoe (talk) 05:59, 3 January 2023 (UTC)
    In 1905, Albert Einstein published the Special Theory of Relativity. It was original research, and his paper was a primary source. Would you agree with those statements?—S Marshall T/C 09:27, 3 January 2023 (UTC)
    Well, no. If we can please be precise in a policy discussion about policy being precise.
    Einstein did not do “original research” because “original research” is a Wikipedia term of art. If you want to use the term retrospectively for historic people, you’ll have to define your use of the term. I would say it is an inappropriate use, an anachronistic use, out of context, for the Wikipedia term of art.
    Source categorisation is fuzzy, and it depends on how the source is used. His paper is a primary source for his paper, yes, trivially. His paper is a primary source for his theory as described in his paper, yes. His paper is not a good source, primary or secondary or otherwise, for the importance or validity of his paper. SmokeyJoe (talk) 11:45, 3 January 2023 (UTC)
    If that's to be dismissed as anachronistic then let's do a modern publication. This article on nuclear physics, accepted for publication on 2 Jan 2023, literally has "ORIGINAL RESEARCH article" at the top left. Would you deny that that's a primary source?—S Marshall T/C 18:06, 3 January 2023 (UTC)
    Do you want to talk about meanings of the term “original research”? The journal and Wikipedia use it differently. Differently, but with a large degree of overlap.
    Or do you want to talk about source typing? Source typing is not absolute, but depends on the use you’re putting the source to. Both sources, the 2023 paper and Einstein’s, can be primary for some things and secondary for others. But for most purposes, they would be primary sources, speaking in terms of historiography. In science, they are not primary sources, because neither are experimental reports. SmokeyJoe (talk) 21:02, 3 January 2023 (UTC)
    I want to kill off this idea that original research has anything to do with primary, secondary or tertiary sources, because it doesn't. You said "A theory is necessarily secondary", and that's what's got me engaged with this: my position is that a theory can and often does originate from a primary source such as a scientific paper.—S Marshall T/C 21:18, 3 January 2023 (UTC)
    The theory paper is a primary source for the existence and wording of the theory, but the theory is a secondary source for the analysis of the topic that it applies to.
    WP:PSTS is an excellent framework for analysing the development of information, and in historiography it is mature and robust, and an encyclopedia is best treated as an historiographical document.
    PSTS is an excellent analysis method for original research, the Wikipedia term of art. For example, you can’t be doing WP:OR if your information comes from a secondary source. If all your sources are primary sources, you are probably doing WP:OR.
    What exactly do you want to kill, and why? SmokeyJoe (talk) 21:41, 3 January 2023 (UTC)
    I want to kill the idea that original research is connected with primary, secondary or tertiary sources, because it's muddled and confusing, and because it's wrong. Original research is ideas that haven't been previously published elsewhere, and that's all it is.
    You've just said to me that "the theory is a secondary source for the analysis of the topic that it applies to", and I have absolutely no idea what you mean?—S Marshall T/C 22:15, 3 January 2023 (UTC)
    Original research, as coined on Wikipedia, particularly the Jimbo mailing list quote, was describing PSTS without actually using the terms.
    I disagree that it is any of muddled, confusing, or even partly wrong. Have you read primary source, secondary source, or historiography? Are you onboard with that information?
    It sounds to me that you are re-articulating the reason to merge WP:V and WP:NOR into WP:A. PSTS is real-world academia; NOR is a Wikipedia term of art, a jargon, and jargon is a barrier to newcomers. WP:A was a good idea, but was implemented with a small failure in change management.
    RE "the theory is a secondary source for the analysis of the topic that it applies to": I do not believe that you have no idea, but if not, please read the linked articles on sources. Einstein’s theory on special relativity is a secondary source for the topic of special relativity. It contextualises special relativity, builds on Maxwell’s equations, etc. Special relativity is a fundamental topic of nature, and the theory is a human analysis of that topic. Don’t confuse the theory on the topic with the topic itself. This might be easier to understand for a topic on which there have been multiple competing theories. - SmokeyJoe (talk) 22:54, 3 January 2023 (UTC)
    Well, I was familiar with these concepts before but I've just been through and re-read primary source, secondary source and historiography to make sure I've got it. I think you're saying that Zur Elektrodynamik bewegter Körper is a secondary source for special relativity? If so, then I'm utterly bewildered. I can make absolutely no sense of that at all. I think it's a primary source, for the topic as well as the article, and I don't see any connection whatsoever with the originality or otherwise of the research. We might need a third opinion here.—S Marshall T/C 23:21, 3 January 2023 (UTC)
    Sure, but I ask you to specify, the definition of primary and secondary sources. Are you following the historiographical definition? Or the science definition? Or the journalism definition? SmokeyJoe (talk) 23:36, 3 January 2023 (UTC)
    I'll follow whichever one you want, if you can make an intelligible connection between PSTS and original research.—S Marshall T/C 00:10, 4 January 2023 (UTC)
    Most certainly I want to use the historiographical definitions.
    PSTS is well defined. The Wikipedia articles are quite good.
    The term “original research” is not well defined. The content and references at original research are poor. WP:NOR has long since defined it by vague reference to what Wikipedia and the community mean. I think the first step is to agree on a definition of WP:OR. I reject yours as simplistic, and fully redundant to WP:V, and if yours is accepted it would amount to the culling of WP:NOR. If that is the path, I point to WP:A.
    I offer the definition: "Original Research" is the interpretation or combination of facts without guidance from at least one reliable secondary source. SmokeyJoe (talk) 00:28, 4 January 2023 (UTC)
    But that's not what it means. You agreed with me above that original research is when someone publishes a theory or contention that hasn't been previously published. Let's stick to that please.—S Marshall T/C 00:39, 4 January 2023 (UTC)
    I agree that original research is when someone publishes a theory or contention that hasn't been previously published.
    A theory or contention on some topic is necessarily a secondary source on that topic. (Do you agree?)
    - SmokeyJoe (talk) 01:02, 4 January 2023 (UTC)
    I don't agree. I believe that original research is when someone publishes something in Wikipedia that hasn't previously been published elsewhere. My definition includes your unpublished "theory or contention", but it also includes anything else that is previously unpublished. WhatamIdoing (talk) 23:15, 4 January 2023 (UTC)
    Ok. Your definition is quite reasonable. I would like the WP:OR definition to focus on the subset of OR that is not already a simple reading WP:V fail. SmokeyJoe (talk) 00:00, 5 January 2023 (UTC)
    I don't agree with it. Original research doesn't have to be published on Wikipedia. Good OR is published in scholarly journals. And OR isn't Wikipedia jargon. It's an academic term for the process in science that advances human knowledge. And it comes from repeatable experiments or checkable observations published in primary sources.—S Marshall T/C 11:59, 5 January 2023 (UTC)
    But here we are discussing the on-wiki, by-editor-only type of OR. WhatamIdoing (talk) 01:40, 9 January 2023 (UTC)
    Yes we are, but I don't think it's right to say that "original research" is Wikipedia jargon. I think that when we use it, we mean the same thing that scholars use when they say it off-wiki. WAID, you say that NOR should apply to things that aren't theories or contentions. Could you give me an example?—S Marshall T/C 17:53, 9 January 2023 (UTC)
    Scholars don’t use it, off-wiki, with any real definition. “Original” is straightforward, “research” is not. The term appears in places, but not with any real definition. SmokeyJoe (talk) 22:07, 9 January 2023 (UTC)
  • Yes they do, and it has a clear meaning. The University of North Florida defines it here. Let me quote them in full.

Original research is considered a primary source.

An article is considered original research if...

  • it is the report of a study written by the researchers who actually did the study.
  • the researchers describe their hypothesis or research question and the purpose of the study.
  • the researchers detail their research methods.
  • the results of the research are reported.
  • the researchers interpret their results and discuss possible implications.
Universities need to define "original research" clearly because in the sciences, a doctoral thesis usually needs to be original research.—S Marshall T/C 08:59, 10 January 2023 (UTC)
Some academic journals have a category of papers called "original research". This distinguishes it from opinion pieces, review articles, book reviews, etc. I particularly appreciate this label when I find it on an article in a medical journal, as it simplifies the analysis wrt WP:MEDPRI.
As for an example of something that isn't a theory or contention, but should be covered by NOR: IMO anything that is unpublished is OR if you stick it into a Wikipedia article. For example:
  • Big Corp will be hiring soon. Source: E-mail message received at work today by a Big Corp employee, who also happens to edit Wikipedia.
  • Joe Film's birthday is tomorrow. Source: Birthday party invitation I received.
  • Nell and Ned Notable got married in Las Vegas today. Source: She posted that they got married this morning, and he tweeted that they were in Las Vegas.
None of these are theories, and none of them are contentions. But they all fail NOR. WhatamIdoing (talk) 00:12, 11 January 2023 (UTC)
Universities need to define "original research" clearly because in the sciences …. I considered that source, which is a teaching instruction for postgraduate science theses, to be too different in context to be useful for WP:NOR. If the point is to distinguish experimental from opinion, it completely removes the useful purpose of WP:NOR, which is to exclude editorial opinion. If NOR covers anything that is unpublished, then it covers everything covered by WP:V. —SmokeyJoe (talk) 00:27, 11 January 2023 (UTC)
  • So, the reason I want policies to talk about claims, contentions, theories, thoughts, and ideas, is because I've been confronted with edits like this, where a user wanted to apply WP:V and WP:OR to individual words. My position is that paraphrasing a source isn't OR. Our role is to produce an accessible summary of what the reliable sources say, and sometimes that means using words the sources don't use. My position is that doing this isn't OR.
Following the sources' language leads to horrible articles, and particularly medical ones. Many or most of our articles about medical conditions are inaccessible to a lot of the people who actually have the condition. (My favourite example of this is attention deficit hyperactivity disorder, which is utterly inappropriate to be read by sufferers, and clearly aimed at the medical professionals who you'd think might have better places to read about ADHD than Wikipedia.)
If my wording is problematic then please can we devise alternative wording that has a similar effect?—S Marshall T/C 09:13, 11 January 2023 (UTC)
User:S Marshall, I think the threading has become horrible. I’m not sure who you are replying to, and I am sure that the post below (S Marshall T/C 02:04, 4 January 2023) was not replying to you.
Re your first paragraph “So, the reason I want policies…”, I strongly agree. Content ideally is based on multiple similar sources, each independent of the others, and it is to be expected that they are in different styles, and Wikipedia style is yet another thing. WP:V might mean that some individual clauses should be cited, but since I view WP:OR as a higher-level, more nebulous version of attribution of transformative information, I think it is either a mistake, or at least dangerous, to try to apply it below the paragraph level.
Re your second paragraph, I think that’s completely right. Although, one might be surprised at the frequency that experts make use of Wikipedia for things supposedly in their expertise. Thank goodness for easy IP editing.
Re your third paragraph “If my wording is problematic then…”, I don’t know what you are talking about. What wording? I hope you’re not asking me to improve your wording, you are extraordinarily clear in your writing. Even in a case where we have different perspectives. SmokeyJoe (talk) 10:07, 11 January 2023 (UTC)
Yes, the threading has become horrible! My remarks above are addressed to everyone (because I'm talking on a policy page), but they are in response to WAID. The flow is:
S Marshall: You agreed with me above that original research is when someone publishes a theory or contention that hasn't been previously published. Let's stick to that please.
SmokeyJoe: I agree that original research is when someone publishes a theory or contention that hasn't been previously published.
WAID: I don't agree. I believe that original research is when someone publishes something in Wikipedia that hasn't previously been published elsewhere. My definition includes your unpublished "theory or contention", but it also includes anything else that is previously unpublished.
S Marshall: WAID, you say that NOR should apply to things that aren't theories or contentions. Could you give me an example?
WAID: As for an example of something that isn't a theory or contention, but should be covered by NOR: IMO anything that is unpublished is OR if you stick it into a Wikipedia article.
S Marshall: (The post above.)
My remarks immediately below, though, are in response to you. The flow there is:
SmokeyJoe: A theory or contention on some topic is necessarily a secondary source on that topic. (Do you agree?)
S Marshall: No, that doesn't follow. S Marshall T/C 16:17, 11 January 2023 (UTC)
So, going back to your post S Marshall, I agree that a paraphrase or a reasonable attempt at synthesizing and summarizing is not necessarily OR. It relates to an archived thread I brought up 5 months ago: Simple synthesis is not original research Andre🚐 17:00, 11 January 2023 (UTC)
  • If NOR covers anything that is unpublished, then it covers everything covered by WP:V.  Yup.  NOR is a subset of WP:V.  It goes into some additional and important detail about material that can't be found in a source, especially including SYNTH, but as I have said before, everything that is banned by NOR is also banned by WP:V. If you prefer to think of it the other way around, WP:V is additional restrictions piled on top of NOR's basic rule that everything must come from a source instead of from a Wikipedian's head. Either way you look at it, the two policies overlap. This overlap was the primary basis for the old Attribution proposal.
  • where a user wanted to apply WP:V and WP:OR to individual words.  Yup, that happens.  Some editors do not want us to WP:Use our own words.  We see this frequently in "hot button" areas (how dare you just go and label them, just as if murdering someone makes you be a murderer!), or if someone tries to re-write a sentence to use (what they believe is) less judgmental language (which we generally ought to do, because NPOV includes tone, but not if, e.g., it means whitewashing; some people should be called "murderers", and nobody should be called "involuntary life termination technicians"). We also see this with editors who attach some odd meaning or subtext to a word, or whose grasp of English (or of language in general; we have lots of editors with medically diagnosable communication impairments) is weak. But, of course, the opposite happens as well: sometimes words that seem interchangeable really aren't, and sometimes we (and our sources) accidentally mis-describe things. As one example, Wikipedia:Two times does not mean two times more. Two times more means three times as many. Sometimes a single word could be a problem for NOR.
WhatamIdoing (talk) 03:22, 14 January 2023 (UTC)
FWIW, IIRC, NOR predates V and V was an attempt at unifying other policies and putting in place sourcing standards. Andre🚐 03:27, 14 January 2023 (UTC)
According to the page history, it's the other way around. WP:V was started in August 2003 and NOR was started in December 2003. Of course, back then, everyone could have agreed to something on the mailing list or IRC for months before anyone bothered to write it down on wiki. Off-wiki communication channels played a huge role back then. WhatamIdoing (talk) 03:46, 14 January 2023 (UTC)
Hmm, yeah, you should find the mailing list post though where Jimbo said that originally. In this version of the V page[1] it already has several mentions of "original research." Maybe it existed on another page other than the page it's on today. Anyway, the mailing post should be from July 2003 and I believe it was entitled, "Crackpot articles." Andre🚐 03:52, 14 January 2023 (UTC)
Here it is [2] and he uses the term "confirmability" back then Andre🚐 03:54, 14 January 2023 (UTC)

Post hoc convenience break

  • No, that doesn't follow.—S Marshall T/C 02:04, 4 January 2023 (UTC)
    Ok. We are speaking different languages. The FAQ displayed at the top of this page may be relevant. I think we need a third opinion. The question is whether “A theory or contention on some topic is necessarily a secondary source on that topic” is true. I don’t know what you mean by “follow” and suspect it relates to the confusion. The statement is true according to the definition of a secondary source, independent of the preceding statement. SmokeyJoe (talk) 02:23, 4 January 2023 (UTC)
    I contend that the number of English Wikipedia editors will go down this year, compared to 2021. Is my comment here a secondary source on the number of editors? If not, then it is not true that "a theory or contention on some topic is necessarily a secondary source".
    Is Darwin's 1859 On the Origin of Species a secondary source on evolution? The book definitely lays out two scientific theories. If not, then it is not true that "a theory or contention on some topic is necessarily a secondary source".
    Is Watson and Crick's 1953 "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid" paper a secondary source? The paper concisely lays out their revolutionary conclusion that DNA (rather than protein) is the storage mechanism for genetic information. If not, then it is not true that "a theory or contention on some topic is necessarily a secondary source". WhatamIdoing (talk) 23:11, 4 January 2023 (UTC)
    You may like to take source typing analysis down to the sentence level, but I do not. Source typing requires input information: the source, how used, and what for. More context is better. At the sentence level, one can construct apparent contradictions. Source typing is really about analysing the provenance of information. A sentence is too brief to carry information. Information really needs to be minimally at the paragraph level. SmokeyJoe (talk) 22:13, 9 January 2023 (UTC)
    Earlier in this debate I've literally linked a piece of original research published in 2023. It's self-evidently a primary source.—S Marshall T/C 02:36, 4 January 2023 (UTC)
    RE “Nuclear Fission Properties of Super Heavy Nuclei described within the four-dimensional Langevin model” [3]
    That paper is not mainly theory or contention. It’s a modelling outputs paper. It is not meaningful to declare something a type of source without saying what for and how used. The paper looks to be at least in part a secondary source on their four-dimensional Langevin model. If it “analyzes, assimilates, evaluates, interprets” etc, it’s a secondary source. SmokeyJoe (talk) 03:56, 4 January 2023 (UTC)
    It's the second paper on their new mathematical model, which is claimed to be able to predict the outcomes of experiments. That would amount to a theory, right?—S Marshall T/C 10:46, 4 January 2023 (UTC)
    I'm not sure if this helps. But I think of WP:OR / WP:V / WP:NPOV as covering the same thing whenever there's a flagrantly false claim. That's easy and we hardly need all three policies. But you need the WP:OR policy when someone tries to manipulate WP:V to imply things not explicitly stated. For example:
    • "Source A" defines a "fad" as a product that had a burst of popularity followed by a collapse in less than two years.
    • "Source B" says NFTs had a burst of popularity followed by a collapse in 2022.
    • An editor puts those two sources side by side, and says "NFTs are a fad." But no source explicitly states that.
    Whatever your feelings are on the truth of that matter, that would be WP:OR. Some aspect of it is verifiable, but the editor is interpreting it to draw their own conclusion. Shooterwalker (talk) 22:44, 4 January 2023 (UTC)

Something has to be done. A bunch of editors are claiming that the simple reading of a map is original research. Elsewhere an editor has claimed that even reading charts and graphs is OR. [4] But quite frankly, the definition that SmokeyJoe has given above that is restricting to secondary sources only would mean throwing away 90% of all newspaper citations across the site, including for every item of WP:ITN. --Rschen7754 03:36, 5 January 2023 (UTC)

Well, none of us want that. As I wrote similarly earlier, WP:NOR almost never amounts to a reason to delete, but is a reason to find more sources and to add. NOR was never “restricting to secondary sources”; PSTS emphasises the desire for a balanced mix of primary and secondary sources.
WP:ITN is a special challenge. No one wants Wikipedia to not be up to date, and for current events, all sources will be primary, as judged from a year or more later. A good secondary source is at arm's length from the event, both in time and space. Obviously, this can’t be a requirement for starting coverage on a current event. SmokeyJoe (talk) 03:49, 5 January 2023 (UTC)
In the above discussions and others though, NOR has been used to invoke WP:BURDEN and subsequently WP:DISRUPT has been used to intimidate editors into compliance. --Rschen7754 03:52, 5 January 2023 (UTC)
Wait a sec, NOR is totally a reason to delete. OR most commonly in my wiki-experience takes the form of someone who has read sources and has a novel interpretation of how they fit together. There is definitely OR that should be removed or rewritten because it's a new spin on the accurate information in sources. Andre🚐 04:01, 5 January 2023 (UTC)
Yup. If you get some editor saying that Wikipedia should be the first publisher to host their ground-breaking proof that Einstein was wrong about general relativity, then OR is totally a reason to delete that nonsense. WhatamIdoing (talk) 01:42, 9 January 2023 (UTC)
If memory serves (please correct me if I'm wrong), SmokeyJoe's definition of secondary source is something closer to "any source containing at least one adverb or adjective", or, perhaps more precisely, any expression of opinion or subjective claim, so most WP:PRIMARYNEWS wouldn't be a primary source according to him. Also, it would be impossible for an opinion piece or a book review to be a primary source. This is not the definition we normally use. WhatamIdoing (talk) 01:47, 9 January 2023 (UTC)
No, not correct, not a definition, but a working methodology. I think I remember the conversations, about scratching the bottom for secondary sources for the WP:GNG. I was probably unclear, but to try again: When looking to analyse secondary source content in a source, if it is thin, I search out adverbs and adjectives looking for it. If a source has no adverbs or adjectives, it is very probably something like a facts-only press release.
I think I am completely onboard with WP:PRIMARYNEWS. SmokeyJoe (talk) 02:32, 9 January 2023 (UTC)
  • I think SmokeyJoe is saying OR isn't a reason to delete articles. I don't think he opposes removing disputed text.—S Marshall T/C 12:03, 5 January 2023 (UTC)
    Now that I've read up a little bit on the map discussion I understand a bit better, so my apologies. Combined dataset maps aren't SYNTH. However I will add that it is definitely possible that an article can be deleted because it is original research. The article could be referenced and it could be a long article about a theory that puts too much weight and emphasis on certain things. Andre🚐 05:08, 6 January 2023 (UTC)
    There are really two sets of issues related to maps. The other is whether maps can be used as a source for geography articles. Some are saying that doing so is original research, and then invoking BURDEN to delete such statements. (And, some are also declaring that all maps are primary sources to attempt to further the argument). --Rschen7754 00:17, 8 January 2023 (UTC)

Second sentence - arbitrary break

  • The key reason why I think PSTS is a distraction in the context of NOR is this: NOR is not limited to using Primary sources… it is quite possible to engage in Original Research using secondary or tertiary sources.
In NOR situations, it is the analysis or conclusion that is Original. The primary/secondary/tertiary nature of the source material that was used to form an original analysis/conclusion does not matter.
Yes, it is easier to create an Original analysis or conclusion when using Primary sources… but it is possible to do so using ANY source type.
Original Research occurs when an editor goes beyond what is explicitly stated in the sources, not what sources one uses. Blueboar (talk) 12:43, 5 January 2023 (UTC)
  • Hmm. If Blueboar is saying this too, then I'm starting to wonder whether this view that OR and PSTS are inextricably linked might be unique to SmokeyJoe?—S Marshall T/C 12:48, 5 January 2023 (UTC)
    That wasn’t my view. Not “inextricably” linked. My view has been that it is very good to view OR via the PSTS framework. I admit that it is possible to have all primary sources and it not be OR, and to have OR even using primary sources as Blueboar explains. Blueboar might be right, but are there any examples of OR where PSTS was a distraction? SmokeyJoe (talk) 13:20, 5 January 2023 (UTC)
    It was a distraction in this very conversation… the question we were asked to consider was whether to cut (or amend) the second sentence - which attempts to define what we mean when we say “Original Research”. That sentence does not even contain the words “primary” or "secondary" (or tertiary) … they are irrelevant to the question… and yet we have spent the last two days distracted by a discussion about those words. Why? Blueboar (talk) 14:27, 5 January 2023 (UTC)
    I guess it relates to my contention that “original research” (like all research) necessarily involves the generation of secondary source content. And that we should exclude fake data from our definition of original research, and keep “original research” confined to information as opposed to facts and data. SmokeyJoe (talk) 23:41, 5 January 2023 (UTC)
    I'm confused by this idea that you can "generate" source content. If you're generating content, that isn't a source. That is the article text. The article text by definition is not the source. It's one layer up from the source content. Article text is never primary or secondary, but always at least tertiary since this is an encyclopedia. Andre🚐 04:58, 6 January 2023 (UTC)
    Generating content, I prefer to say “generate secondary source content”, but others include “generate primary or secondary source content”, is what WP:NOR says not to do. SmokeyJoe (talk) 08:34, 6 January 2023 (UTC)
    At the risk of making consensus harder, PSTS would probably be more at home at WP:V or WP:RS. But it's hard to get a consensus to change policies that have been in place this long. Shooterwalker (talk) 01:06, 6 January 2023 (UTC)
    Does Wikipedia:Attribution#Primary and secondary sources provide a nice model of PSTS at home? It’s a subsection under Wikipedia:Attribution#Reliable sources.
    If NOR is not to be considered different to WP:V, then that means going back to WP:ATT. A lot of work went into WP:ATT to solve these NOR confusions. SmokeyJoe (talk) 04:41, 6 January 2023 (UTC)
    I'm going to say that I will lean toward opposing any such change and I expect many other old line wikipedians will as well. Attribution and verifiability are closely related, but original research is not the same as either of those, though of course it has overlapping aspects. The concepts of what sources are primary or secondary is related, but is a bigger thing on its own as WAID said earlier. Primary and secondary sourcing has to do with the nature of the information in the source relative to verifying that information and attributing it, but the information itself is a more fundamental layer. It is the content itself. Original research also exists in the realm of content. It's about thinking and the construction of the boundaries of the topic.
    I'll give you an example. For years, for some reason, Wikipedia considered Democratic-Republican Party and Democratic Party (US) to be one and the same. However most American history specialists consider those to be distinct. That was just some original sauce that someone wrote in, based on overweighting fringe thinking. Stuff like this is still around all over the place, and folks pushing it and trying to change the way we organize the content. Andre🚐 04:55, 6 January 2023 (UTC)
    Not a good example. The idea that the modern Democratic Party is a direct continuation of the Jeffersonian era Republican-Democratic Party may be flawed, but it is hardly Original Research, nor is it fringe… it is how most high school level US history textbooks present them. Sure, it is a simplistic view of political history - and yes, higher level academics see them as two distinct things - but the idea did not originate in the mind of a Wikipedian. It isn’t OR. Blueboar (talk) 14:16, 8 January 2023 (UTC)
    I think it is important that a page (any page) that is marked as a "content policy", contain a definition of the wikijargon "original research". I don't much care what the title of that page is, or how much (or little) other content that page contains. I also don't care whether that definition is the second sentence of this policy (the current state) or whether that definition is placed in a section called ==Definition==. I just want an official definition, so that when someone says "It's original research to say ______ in an article", I can write "Okay, the WP:POLICY says that 'Original research' is the whatsihoozit of thingamabubs'. Are you saying that it's a whatsihoozit of thingamabubs to say ______ in an article?" The reason I want to be able to ask that is that, in a surprising number of disputes, when you ask that question, the response is "Of course not! I'm not an idiot! I'm saying that it's self-promotional POV pushing." Yeah, well, then don't say that it's a NOR violation. See also Wikipedia talk:No original research/Archive 63#The table for one of the recent efforts to get editors to say what they mean, instead of claiming Policy X when they are actually complaining about violations of Policy Y.
    As a completely separate point, I do think that PSTS should be separated from NOR. I lean towards having it on its own separate page, but it actually has a lot to do with NPOV (especially due weight) and Notability (especially GNG). WhatamIdoing (talk) 01:55, 9 January 2023 (UTC)
    I agree that it’s important to define the WikiJargon. I am skeptical that the WikiJargon “original research” has ever been well defined. A root cause of the difficulty is the definition of “research”.
    I am also skeptical that separating PSTS will be a net positive change. Maybe it will be. I don’t feel that you have explained why you think PSTS should be separated from NOR. The likely largest cost, in my opinion, is the distraction. I continue to suggest that we move to a subpage to discuss the drafting of the RfC. SmokeyJoe (talk) 02:49, 9 January 2023 (UTC)
    I think that PSTS should be moved out of NOR because it has nothing specific to do with NOR, especially SYNTH. PSTS is fundamental to writing good encyclopedia articles, but nobody with any sense is going to have these conversations:
    • A: That editor made stuff up that hasn't ever been published in any reliable source. That's OR and is banned.
    • B: Oh, no, you can't say that so easily. First tell me if he made stuff up that hasn't ever been published in any reliable primary source or any reliable secondary source. PSTS is fundamental to NOR!
    • A: It's no source at all. This is "material—such as facts, allegations, and ideas—for which no reliable, published sources exist." That's OR.
    • B: But you really can't figure out whether it's OR until you know whether the non-existent source was primary or secondary.
    • A: The source doesn't exist! He made this stuff up out of his own head! Non-existent sources can't be either primary or secondary!
    • B: Well, I still think you need to settle PSTS classification before you can decide whether this is material—such as facts, allegations, and ideas—for which no reliable, published sources exist. Obviously, if you don't know whether it violates PSTS, then you can't know whether it violates NOR. Maybe if you go ask at NORN, they can tell you whether an editor's head is primary, secondary, or tertiary. Once we have that settled, we can get back to your question of whether the editor's head exists, is published, and is reliable.
    I also think that PSTS should be moved out of NOR because it's fundamental to other policies and guidelines. Putting it in NOR has the effect of implying that PSTS is only about NOR, which is certainly wrong. WhatamIdoing (talk) 03:00, 9 January 2023 (UTC)
    OK. How would you like us to proceed? SmokeyJoe (talk) 03:18, 9 January 2023 (UTC)
    Here's a suggestion, IMHO, WAID, Smokey, et al. Just make a new essay page explaining PSTS. Then once a lot of people have endorsed that you can see if there's consensus to make a guideline. No reason to "move out" or "remove" the current policy in effect. It sounds like y'all don't even agree what a primary source is. For example, current guidelines include contemporaneous news coverage as part of a primary source. There's no bright line when "contemporaneous" has ended for an ongoing event, is there? Andre🚐 03:44, 9 January 2023 (UTC)
    No, WAID's right. PSTS is wrongly placed in NOR, and leaving it there creates more confusion than it resolves.—S Marshall T/C 17:34, 9 January 2023 (UTC)
    That essay was Wikipedia:Wikipedia is a tertiary source. SmokeyJoe (talk) 22:23, 9 January 2023 (UTC)
    Yes, I suppose it is, but it appears to have gone down in flames because of the image usage portion. Andre🚐 22:28, 9 January 2023 (UTC)
    I believe that the simplest thing to do is to split PSTS to its own page, while retaining its current status as a policy. Same words, just on a different "physical" page. WhatamIdoing (talk) 00:19, 11 January 2023 (UTC)
    when "contemporaneous" has ended? Somewhere between one year and five years after the last death of all involved. SmokeyJoe (talk) 22:25, 9 January 2023 (UTC)
    BTW, Blueboar, I'd love to see the textbook cites so we could improve that pair of articles by explaining the above point if you have them Andre🚐 03:55, 9 January 2023 (UTC)

Returning the key point

  • Whether we split off the PSTS section or not, I do think we should return the point that gave rise to it: That we need to avoid adding anything that would make Wikipedia a Primary source for information.
It was this point that originally caused us to explain what a primary source is (and subsequently what Secondary and Tertiary sources are), and when we removed that point we removed what directly tied PSTS to the broader point behind this policy. Blueboar (talk) 19:23, 25 December 2022 (UTC)
When was this? Wikipedia should not be used as any kind of a source for Wikipedia. SmokeyJoe (talk) 22:08, 25 December 2022 (UTC)
NOR began life to deal with the same question as Wikipedia:What Wikipedia is not#Wikipedia is not a publisher of original thought. NOR's first mention of primary sources was to say that Wikipedia itself isn't (meant to be) a primary source. This was a couple of years before you started editing, and there had been a problem at the time with people thinking that they could publish their own actual research papers here. People were literally posting papers they'd written for school, or had rejected from academic journals, and they thought they'd just put it on this Wikipedia site they'd just heard about so the world could find out about it. Quaint, right? But Wikipedia wasn't well known back then, Wikipedia:Five pillars still hadn't been written, and folks were still sorting out what belonged and what didn't. This was a really l-o-n-g time ago. We actually did need a rule that said Wikipedia is not meant for any facts/information/material/content/or anything else that hadn't already been published elsewhere. That's why this policy exists. That's why the definition is about "material—such as facts, allegations, and ideas—for which no reliable, published sources exist", rather than about transforming "facts" into "information", or about getting the correct balance between secondary and non-secondary sources. This policy's raison d'etre is to stop people from writing Wikipedia articles about how they've single-handedly disproven modern physics, even though the evil physics cabals refuse to let them publish their proofs in peer-reviewed scientific journals.
(As for Wikipedia being an invalid source for Wikipedia... If you believe that, then Category:Wikipedia is awaiting your clean-up efforts.) WhatamIdoing (talk) 23:05, 27 December 2022 (UTC)
I find the quaint history interesting, fascinating. I know about the 1990s tendency of amateur wannabe science philosophers posting manifestos to the local university. Wikipedia appeared as an outlet for them, and I completely understand Jimbo’s quote. I think the tendency was cured not by Wikipedia’s improvement in articulation of purpose, but by the improvement of Wikipedia search engines, which allowed kooks to self-educate and then to find their theories were not new.
The Jimbo quote makes perfect sense in the PSTS lens, even if Jimbo himself couldn’t define a secondary source. WP:5P1, Wikipedia is an encyclopedia. And an encyclopedia is a tertiary source.
You think that we should return to the point that gave rise to PSTS? Ok, I am keen to listen. However, before any action, I think we should also consider the question: what is the message to be given to the newcomers. Core content policies get used for high-language debates amongst old Wikipedians, but their real purpose is starting information for the newcomer who is starting to become serious about contributing to Wikipedia.
The rise of PSTS? Please, share your perspective.
- SmokeyJoe (talk) 23:58, 28 December 2022 (UTC)
Might I suggest that we document this history in a collapsible box on this talk page to help future editors know its roots? (We really should do this on other core P&G pages) Masem (t) 00:02, 29 December 2022 (UTC)
I like this idea, but suggest doing so in a separate page. Self referencing is poor. No document should attempt to describe its own history. SmokeyJoe (talk) 00:27, 29 December 2022 (UTC)
@Masem, there's a FAQ transcluded at the top of this page; it's a separate page, but would be relatively visible. A history of NOR was written in 2007 by SlimVirgin, after the ATT merge failed. That has since been merged to Wikipedia:Core content policies. If you added a question and a brief answer, you could point to the longer story.
I notice that words such as primary and secondary do not appear in SV's telling of the history, even though by that point, the PSTS section looked much like it does today. This does not surprise me, given that PSTS is central to writing a decent encyclopedia article, rather than central to keeping pseudophysics out of Wikipedia. WhatamIdoing (talk) 22:49, 4 January 2023 (UTC)
SV was not comfortable with the historiographical definitions. She seemed to prefer the journalism definitions. I recall challenging her on her choice of definitions, and she said (my loose recollection) that good journalism standards (like good history writing) supports good Wikipedia content. SmokeyJoe (talk) 00:34, 5 January 2023 (UTC)
That definition also allowed her to conflate secondary with independent, and thus claim that a breaking news story, or eyewitness journalism, should be counted as "secondary" for the purposes of WP:GNG. It is one of the few points that she and I completely disagreed with. But overall, I'd say that PSTS is central not to NOR, but to something much bigger than just NOR. WhatamIdoing (talk) 02:53, 5 January 2023 (UTC)
PSTS is central not to NOR, but to something much bigger than just NOR?
Very interesting. SmokeyJoe (talk) 03:28, 5 January 2023 (UTC)
Yes. I call that bigger thing "writing an encyclopedia article". Not making up stuff, whether the made-up stuff is about how you're smarter than Einstein or about why you think the local mayor lost the election or about why you believe that mass murder victims are all crisis actors, is important to that bigger goal, but PSTS goes well beyond not making stuff up. PSTS is how you (should) decide what you can say and how much you should say about it. WhatamIdoing (talk) 02:31, 24 February 2023 (UTC)

Theoretical question on if someone published their own secondary source

Sorry if this was asked before or if I missed it mentioned here, but I didn't catch it if it was. Let's say a PhD student decided to write and publish their own paper in an academic journal, then used that as a source on Wikipedia.

Would this count as a loophole (the editor being known as the publisher), or would it be accepted provided the paper doesn't get retracted later on? DarmaniLink (talk) 12:54, 8 February 2023 (UTC)

If it isn't on the page and I didn't miss it, we should add something brief about the unlikely-but-possible scenario DarmaniLink (talk) 12:58, 8 February 2023 (UTC)
Meh… see WP:COI. While we don’t disallow it, we also recognize that there is a conflict of interest with a Wikipedian citing their own work… so, at a minimum the editor should disclose that he/she is the author of the paper (not the publisher - that would be the journal) … and, ideally, they would wait and let someone else cite the paper. Also, a lot depends on the specific journal that published the paper and its reputation. Not all journals are equal. Blueboar (talk) 13:27, 8 February 2023 (UTC)
Disclose would require that they out themselves. North8000 (talk) 13:52, 8 February 2023 (UTC)
I presume North8000 is referring to the "General COI" section of the "Conflict of interest" guideline. That section describes possible ways to disclose a general COI, but does not explain how to decide if a COI exists.
I think the relevant section to decide if a COI exists when a subject matter expert cites a paper written by the expert is the "Citing yourself" section of the guideline. As I read that section, having your username in the page edit history is sufficient if the edit is "within reason, but only if it is relevant, conforms to the content policies, including WP:SELFPUB, and is not excessive." Of course, if one of the other sections about various kinds of COI apply, and are more stringent about disclosure, the other section should be followed. For example, if the editor believed that citing the paper would lead to it being cited in journals, and improve the editor's Author Impact Factor and thus improve the editor's likelihood of obtaining a desirable job, a clear conflict of interest would exist. Jc3s5h (talk) 18:47, 8 February 2023 (UTC)
Thanks, I thought this might have already been covered, but now I know for sure and know more too. DarmaniLink (talk) 19:12, 8 February 2023 (UTC)
Actually I was just making the observation that an editor disclosing that they are the author is the editor outing themselves. North8000 (talk) 19:51, 8 February 2023 (UTC)
@DarmaniLink, you might be interested in Wikipedia:Party and person, Wikipedia:Identifying and using primary sources, and Wikipedia:Identifying and using self-published works. It is possible to write a secondary source, self-publish it (e.g., on a blog – this does happen), and then come to Wikipedia and try to cite it under WP:SELFCITE. It is uncommon, but not impossible, and sometimes (rarely) it's a valid source. WhatamIdoing (talk) 02:41, 24 February 2023 (UTC)

Turning one policy page into two policy pages

The Wikipedia:No original research#Primary, secondary and tertiary sources section is important, but developed on this page almost accidentally, rather than through deliberate intention. Should this section of the policy be split to a separate policy page, with no other changes made to either (except as necessary to fix links, grammar, etc.)?

Should this policy be split into two policies?
Yes, this is a good idea.
  • "Original research" is about editors making up stuff that isn't in the published reliable sources.
  • Whether the published reliable sources are primary, secondary, or tertiary ("historiographical classification") is not essential to the concept of original research. It doesn't even get mentioned outside of the specific WP:PSTS subsection.
  • The definitions of primary, secondary, and tertiary sources appear in multiple policies and guidelines. Although not closely tied to original research, it is a core concept for Wikipedia:Notability.
  • Sources that are used to claim something in a Wikipedia article, when the sources don't directly support that claim, are NOR violations. It does not matter whether the misused sources are primary, secondary, or tertiary. Misuse of sources will remain 100% banned, just like it is today.
  • Splitting the page will have the effect of making the key WP:SYNTH section more prominent in this policy.
No, we should not add this.
  • NOR is about knowledge production, in particular about proper knowledge production for our encyclopedia. Wikipedia is generally text-based knowledge. Authors outside Wikipedia produce knowledge through published primary, secondary, and tertiary sources, and in the latter two mixing these types of sources together. Wikipedians must generally create text that is original in the authorial sense (no copyright violations and no plagiarism), but not original knowledge. We do this by properly adhering to the three types of sources, prizing secondary sources above the others, so it is vital we have some sense of what they are.
We prize secondary sources because that is where analysis happens. We must understand primary sources because that is the basis for secondary sources (and we sometimes use primary sources ourselves). We must understand tertiary sources because that is the template for our articles' collectionary and summary purpose (and we sometimes use them ourselves), although our articles cannot contain anything "new". By understanding and properly using these three, we avoid original production. These concepts are central to this policy; we must understand this world of knowledge outside the pedia, and how we are the same but also different in the process of knowledge production (careful research in the three types, mixing them together appropriately, creation of original writing based on them, but not original knowledge).-- Alanscottwalker (talk) 14:18, 24 December 2022 (UTC)
  • The rest of the policy does refer to the three types every time it mentions sources; our whole process is the labor of putting together (and omitting) and representing together (and omitting) the three types appropriately to make good encyclopedia articles (original in writing but not original in substance). -- Alanscottwalker (talk) 15:57, 24 December 2022 (UTC)
  • This text developed organically here, so we just shouldn't change its location. Idea: dedicate a page for this, and transclude it here. —SmokeyJoe (talk) 01:57, 23 December 2022 (UTC)
  • We do ourselves no favors by further balkanizing policy. The three policies V, NPOV and NOR must be read together, we say in each policy, but too often they are already balkanized. The last thing Wikipedia needs is yet another policy. -- Alanscottwalker (talk) 14:18, 24 December 2022 (UTC)

Questions

Frequently asked questions:

How did this happen?
In 2004, after a discussion on the mailing list about an editor who wanted to use Wikipedia to promote his new idea about physics, this policy was updated to say "Wikipedia is not the place for original research such as "new" theories. Wikipedia is not a primary source." Later, someone thought it would be helpful to have a definition of "primary source", and eventually instruction creep took over, and the one sentence has turned into 800 words with 7 explanatory footnotes, and 7 sources.
Won't splitting the content between two pages gut the policy?
No. Every sentence currently part of this policy will continue to be part of a policy.
Won't this make PSTS stop being a policy?
No. Both pages will have the policy tag at the top.
Won't this make SYNTH stop being a policy?
No. Both pages will have the policy tag at the top.
Do you seriously mean just cutting and pasting some text to a different page, and nothing else really changes?
Yes, with the caveat that the new page will need a basic introductory sentence, and we'll need to fix a few links or similar details. The WP:PSTS and other shortcuts will point to the new page. If you are interested in the details, see this sandbox.
What would the new policy page be called?
The new page could be located at Wikipedia:Primary, secondary, and tertiary sources (an existing redirect to the current section). If you have better ideas, please add them in your comments.

Your questions:

Comments

  • RFCs are discussions, not votes. Add your comments here!
  • Oppose. WP:PSTS is the intellectual basis for WP:NOR and its removal would leave WP:NOR with a hole at its centre. If *any* restructure of core content policy is a good idea, it is the merging of WP:NOR and WP:V (see WP:A), which preserves WP:PSTS as the intellectual foundation of the combination. No good content doesn’t have both primary sources and secondary sources, and in balance. Trying to explain WP:NOR without strong emphasis on the need for secondary sources (as defined in the historiological field) is flawed and confusing. —SmokeyJoe (talk) 00:39, 23 December 2022 (UTC)

    Later, someone thought it would be helpful to have a definition of "primary source", and eventually instruction creep took over, and the one sentence has turned into 800 words with 7 explanatory footnotes, and 7 sources.

    Instruction creep, and general bloat, is a problem in any instruction. Concise is good. Redundancy in instructions is bad. The more words, the more likely none will be read.
    Regarding the “primary source” definition bloat, the answer is to strip it back, remove anything redundant with the article primary source. Keep core policy concise, do not fork and pander to the bloat. SmokeyJoe (talk) 00:45, 23 December 2022 (UTC)
    On re-reading the policy, I disagree that there is too much instruction creep in creating a new definition of "primary source" and "secondary source". WP:PSTS prominently links to the mainspace articles, and paraphrases from the articles. SmokeyJoe (talk) 22:56, 24 December 2022 (UTC)
WAID, please sign your posts. —SmokeyJoe (talk) 00:47, 23 December 2022 (UTC)
If you want this RFC to be a private draft for now, do it in a userspace subpage. If it is here on the policy talk page, it is open, now. —SmokeyJoe (talk) 00:48, 23 December 2022 (UTC)
@SmokeyJoe, please look at the coffeeroll-colored tag at the top of this section that says "This draft RfC is not yet open for comments. Please discuss changes to the format of this RfC on the talk page, but do not comment on the topic of the RfC itself until it opens".
The reason the community created this tag is so that RFCs could be drafted in public. That's because it's hard to draft a discussion privately and still follow the advice at WP:RFCBRIEF that says "It may be helpful to discuss your planned RfC question on the talk page before starting the RfC, to see whether other editors have ideas for making it clearer or more concise." Please remove your comments for now (and this one, too, if you'd like), and wait until we have a clear question. If you have advice on the question itself, then please scroll up one section and join Blueboar and me on whether this question could be improved. If you just want to talk about the subject itself, you're always welcome on my own talk page. WhatamIdoing (talk) 00:52, 23 December 2022 (UTC)
If this draft is not open for comments, take it off this page. Put it on a subpage, preferably in your userspace. Possibly make it a project page so that it can have its own talk page.
I don’t respect your right to put that tag on a section and have it respected. No ownership of talk page or talk page sections.
I don’t agree to remove my comments, and would object to you removing yours. Instead, put the owned draft on its own page, with its own watchlisting and history of edits. In general, I think if something is worth an RfC, it is worth its own page.
You could link section to talk page threads.
I’m all for refining a question to be a better fairer question. SmokeyJoe (talk) 01:28, 23 December 2022 (UTC)
@SmokeyJoe, WP:RFCBRIEF says to "discuss your planned RfC question on the talk page before starting the RfC". Please tell me how I am to discuss my planned RfC question on the talk page if I take it off this talk page. WhatamIdoing (talk) 01:50, 23 December 2022 (UTC)
By putting the draft RfC on the page WP:No original research/2022 RfC to spinout WP:PSTS. Use WT:No original research/2022 RfC to spinout WP:PSTS to discuss it. I can also think of other possible ways that don’t involve implied ownership. SmokeyJoe (talk) 01:54, 23 December 2022 (UTC)
  • I'm very open to this idea and I expect to be supporting this proposal when the drafting is done.—S Marshall T/C 00:31, 24 December 2022 (UTC)
  • Compare the sentence "Original research" is about editors making up stuff that isn't in the published reliable sources. with Readers must be able to check that any of the information within Wikipedia articles is not just made up. If that is the purpose of NOR, then what is the difference between NOR and V? I think this sentence is somewhat confusing.--Paul Siebert (talk) 04:27, 24 December 2022 (UTC)
    • I think OR is a spinout of V. It's encountered so often and has so many facets that it has its own page but actually it's a case of V, not a truly separate thing. An attempt to merge the two at WP:ATT was historically unsuccessful.—S Marshall T/C 10:39, 24 December 2022 (UTC)
      You think OR is a spinout of V? That’s an odd thing to think. Why do you think it? I think quite differently. WP:V is about facts. WP:NOR is about information, and whether others have thought the information was worth publishing in reliable sources. WP:V is the simple requirement that Wikipedia has its facts right. WP:NOR is about ensuring that editors are not weaving new information, by synthesising new information from selected facts. This is no subset of WP:V. In historiography, this is an old and mature discipline of source typing, and an encyclopedia is well considered squarely an historiographical document. It certainly is not well considered science or journalism.
      WP:ATT was pretty good, but there was a failure in change management. SmokeyJoe (talk) 21:26, 24 December 2022 (UTC)
      Actually, if you trace the history of the two policies you will find that it happened the other way around... NOR came first, and WP:V was a spin out from that. Not that it matters... the two are intimately linked. Blueboar (talk) 23:54, 24 December 2022 (UTC)
        • Yes, that's what I think. They're two sides of the same coin. If I was Ruling Tyrant of Wikipedia, then I'd reduce both policies to a single line each:
          WP:V: Make sure that each article says the same thing that the sources say. Use citations to show how you've done this.
          WP:NOR: Make sure that each article doesn't say something the sources don't say.
          There would be supplementary guidelines that explain everything else, and all wrangles about primary, secondary and tertiary sources would be banned until Wikipedians can coalesce on a single definition of each of those words and then give a short, clear explanation of why they matter.—S Marshall T/C 23:17, 24 December 2022 (UTC)
          Pithy.
          Your approach has merit. I think WP:5P is a celebrated success of that approach. SmokeyJoe (talk) 23:45, 24 December 2022 (UTC)
          I would support this, if anyone can be bothered to propose it. BilledMammal (talk) 02:51, 21 March 2023 (UTC)
  • I am opposed, and I have added some reasons in the draft template above. Alanscottwalker (talk) 14:28, 24 December 2022 (UTC)
  • Oppose this approach to the draft. The question should be simple, and the exact changes to policy (given how major a change is being asked for) should be avoided. If there is general support for making PSTS a core content policy, then a separate discussion can be had to discuss wording. --Masem (t) 00:54, 25 December 2022 (UTC)
  • Oppose. But the definition of Primary Sources clearly needs to be improved. Somebody above complained that an earlier attempt at this ballooned into 800 words. My guess is that this is where the footnote of 150+ words at the end of the sub-section's opening paragraph came from. Buried in this note are fundamental examples: philosophical and religious works, poems, scripts, screenplays, novels, motion pictures, videos, television programs, etc., several of which are mentioned twice in the footnote. Little in the current definition accounts for these works. My plaint, by the way, stems from an extended argument recently resulting from the current definition's emphasis (its opening words) on "original materials that are close to an event, and are often accounts written by people who are directly involved". This phrase does not directly apply to any of the examples in the footnote, none of which would be considered "events" by most people. In any case, please do something to improve this definition.

RFC plans

I've started drafting an RFC below(this is not even an archiving-robust cross-reference SmokeyJoe (talk) 01:59, 23 December 2022 (UTC)). I'm struggling to articulate a logical reason (beyond a bias against change, which is not unreasonable) for having PSTS remain on the NOR page. If anyone has an idea, I'd be happy to see it. WhatamIdoing (talk) 00:00, 23 December 2022 (UTC)

  • There is one part of PSTS that I think could remain: the warning that Primary sources should be used with caution. This is the only part of PSTS that directly addresses the concept of original research - since it is very easy to (perhaps unintentionally) misuse primary sources in ways that result in original research. Blueboar (talk) 00:35, 23 December 2022 (UTC)
    We had talked above about adding a summary of PSTS to the Wikipedia:No original research#Related policies. This would be an easy way to duplicate that reminder. WhatamIdoing (talk) 00:40, 23 December 2022 (UTC)
@SmokeyJoe, we're discussing the potential RFC question up here. Your comment that Trying to explain WP:NOR without strong emphasis on the need for secondary sources (as defined in the historiological field) is flawed and confusing. might give me something to put in the other column, which would make me feel better. To make sure I've got this right, you're saying that:
  • The definition of original research is: "material—such as facts, allegations, and ideas—for which no reliable, published sources exist."
  • It is wrong to tell someone that some bit of "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" is original research unless you first explain to them what a secondary source is (as defined in historiological terms).
Right? So you can't just say "Alleging that Queen Elizabeth was a reptilian alien is what we call 'original research', because no reliable published sources say that she was either reptilian or an alien". You first have to say "Articles must overall use more secondary sources than primary sources", and then go on to say that.
Looking at that, I'm pretty sure that one of us is wrong. WhatamIdoing (talk) 01:01, 23 December 2022 (UTC)
1st dot point No. Original research does not include fake facts. The word choice of “material” is poor. Original research is about information and knowledge.
2nd dot point. To understand original research, one must first understand primary and secondary source distinction. It is not wrong to tell someone things in a confusing order, but it is not good.
On the alleged alien nature of the queen, discussing original research is confusing because WP:V is not even met. There is not a single reliable primary source for this. There is not a single source. Getting into source typing with zero sources is silly. Start with the sources, and then we can discuss whether your article writing is derived from these sources, or is original research on your part. SmokeyJoe (talk) 22:20, 24 December 2022 (UTC)
  • The word choice of "material" might be poor, but it is the word that is in the second sentence of the policy. OR does include fake facts. OR == "material for which no reliable, published sources exist". Fake facts are one type of "material for which no reliable, published sources exist".
  • Why? What exactly do I need to know about primary and secondary sources distinctions before I can understand that "Queen Elizabeth was a reptilian alien" is an example of "material for which no reliable, published sources exist", which the second sentence of this policy says is called "original research"?
  • The whole point of NOR is that it's stuff that can't meet WP:V. There is not a single primary source; there is not a single secondary source; there is not a single tertiary source. That's the point. NOR == material for which no source exists [i.e., in the real world]. That is the literal definition of OR in the second sentence of the policy. The actual definition of OR is given in the second sentence. The actual definition of OR does not mention historiography at all.
WhatamIdoing (talk) 22:36, 27 December 2022 (UTC)
You’ve criticised the policy for having bloat like the redundant definition of primary and secondary sources. I disagree with that, as I think it does a good job of paraphrasing the articles. My criticism of WP:NOR is its lead, including the introduction of the vague word “material”. I actually think the policy would be improved by deleting the entire five-paragraph lead. Possibly, we are in agreement that there is a problem of bloat? The answer to bloat is to cut the bloat, not to WP:SPLIT as that would encourage worsening bloat. SmokeyJoe (talk) 02:40, 28 December 2022 (UTC)
OR does include fake facts? I disagree. The meaning of OR does not include fake facts. OR is about an editor being their own secondary source for content writing. A fake fact is a much simpler problem, not merely of something being unverifiable, but of being wrong. I think you have recently adopted a peculiar perception of OR. I would like to know why. I suspect that it has to do with a lax application of WP:NOR to medical articles, and I think this happens because Wikipedia has become very close to the cutting edge of medical science. It’s harder to follow when you are close to the cutting edge. “Harder” does not mean “wrong”. SmokeyJoe (talk) 02:48, 28 December 2022 (UTC)
the second sentence of this policy says is called "original research". Yeah. The second sentence is bunk. The whole five-paragraph lead is poor. The problem you seem to be seeing is the lead, not PSTS, not the policy structure NOR-PSTS-SYNTH. SmokeyJoe (talk) 02:50, 28 December 2022 (UTC)
Are you interested in proposing the removal of the definition of OR from this policy? WhatamIdoing (talk) 22:18, 4 January 2023 (UTC)
User:WhatamIdoing, this opened an interesting can of worms. I see pros and cons to a few options:
1. As far as possible, use real world definitions (but they aren’t very good, or in the applicable context)
2. Remove the definition (but there is User:Barkeep49‘s respectable objection below, and failure to define a page title is a page failure. )
3. status quo (it uses poor vague waving to define, it is wordy, it is confusing)
4. Move the definition to its whole section at the bottom
5. Change the title, eg by merging to WP:ATT, so that core content policy *is* written in plain English.
I don’t think the removal of the definition, entirely, is a good idea, if it is the title. I think we should improve the definition, as a Wikipedia term of art, because, vague as it is, it is very deeply entrenched in Wikipedia culture. I think we can all agree on the meaning of “original”. I think more focus should be drawn to the definition of “research”, which points immediately to knowledge, which is above facts, pointing immediately to epistemology. SmokeyJoe (talk) 00:54, 5 January 2023 (UTC)
The whole point of NOR is that it's stuff that can't meet WP:V. Disagree again. We are not even agreeing on what we disagree about. “The whole point of” is a sweeping construct. I think you should be more precise, rather than I try to falsify your statement by pointing out points of NOR that are different to WP:V. SmokeyJoe (talk) 02:53, 28 December 2022 (UTC)
  • I'd keep it simple and neutral. A question along the lines of the following should suffice (everything else, including the table and the FAQ, should be moved to the discussion section): should the WP:PST section of the policy be moved to its own policy page, WP:Primary, secondary, and tertiary sources (currently a redirect), with no other changes made to either (except as necessary to fix links, grammar, etc.)? M.Bitton (talk) 02:00, 23 December 2022 (UTC)
    I'd like to, but the previous discussion showed that people really struggle with the very simple question. It took a while for people to grasp that we were starting with one page that says "{{policy}} NOR – PSTS – SYNTH" and that we would end up with two pages, one of which said "{{policy}} NOR – SYNTH", and another of which would say "{{policy}} PSTS". I think there was a fear that PSTS was somehow being demoted, or that it wouldn't really be a policy if it wasn't part of this specific policy. WhatamIdoing (talk) 22:40, 27 December 2022 (UTC)
    It is not sensible to suggest a reading of SYNTH without having first read PSTS. It is secondary sources that make information out of facts, not editors. SmokeyJoe (talk) 02:38, 28 December 2022 (UTC)
    So if I have a source that says:
    The United Nations' stated objective is to maintain international peace and security.
    and another source that says:
    Since the creation of the UN, there have been 160 wars throughout the world.
    then I'm not going to be able to understand that it would be a SYNTH violation for me to write:
    The United Nations' stated objective is to maintain international peace and security, but since its creation there have been 160 wars throughout the world.
    unless I first read PSTS?
    Do I understand your view correctly? WhatamIdoing (talk) 22:34, 4 January 2023 (UTC)
    I’ll admit that it is possible to understand WP:SYNTH violations without using PSTS, but I wouldn’t call your red text example a SYNTH violation. SmokeyJoe (talk) 08:22, 6 January 2023 (UTC)
    You wouldn't? That's rather dismaying, because I copied that red text straight out of WP:SYNTH. That sentence has been given as the first, simple example of SYNTH since mid-2009. WhatamIdoing (talk) 01:30, 9 January 2023 (UTC)
    I think the example tends a bit stringent. I would prefer to consider WP:SYNTH at the level of paragraphs. It certainly isn’t a “good” combination. SmokeyJoe (talk) 02:19, 9 January 2023 (UTC)
    A lot of misinformation can be written in a single sentence, or even less. SYNTH can happen in very small pieces. WhatamIdoing (talk) 02:33, 9 January 2023 (UTC)
    @Alanscottwalker, I appreciate your addition at Wikipedia talk:No original research#c-Alanscottwalker-20221224141800-Turning one policy page into two policy pages. I particularly want to have sound reasons in that column.
    That said, I'm not sure that anything you've written there is unique or specific to NOR (or if it does, it doesn't explain how). I therefore wonder if it actually supports your view as strongly as you hoped. For example:
    • NOR is about knowledge production, in particular about proper knowledge production for our encyclopedia.
    • NPOV is about knowledge production, in particular about proper knowledge production for our encyclopedia.
    • WP:V is about knowledge production, in particular about proper knowledge production for our encyclopedia.
    Those are all true, right? They seem equally true to me, or perhaps it is an even bigger point for NPOV. Do you think that the other content policies aren't about producing proper knowledge in the encyclopedia? If not, then it's probably not a good idea to implicitly claim that as a unique characteristic for NOR, or as a reason why proper knowledge production needs to be in NOR instead of in some other page.
    I also have some concerns about this:
    • Authors outside Wikipedia produce knowledge through published primary, secondary, and tertiary sources, and in the later two mixing these types of sources together. Wikipedians must generally create text that is original in the authorial sense (no copyright violations and no plagiarism), but not original knowledge. We do this by properly adhering to the three types of sources, prizing secondary sources above the others, so it is vital we have some sense of what they are....By understanding and properly using these three, we avoid original production.
    Tertiary sources don't have to mix the types together (it's not a case of primary+secondary=tertiary, like 1+2=3; many tertiary sources, such as textbooks for children, are written entirely from secondary sources), but leaving that correctable detail aside, it's unclear how it relates to NOR. How does properly prizing secondary sources above the other types help editors avoid adding material that isn't contained in any existing source at all? I can easily see how properly prizing secondary sources above the other types helps us avoid non-neutral, unbalanced articles, helps us avoid giving equal validity to unequal POVs, etc., but how does prizing a secondary source's analysis help us stop making up stuff that no source says? It seems to me that the real value in prizing secondary sources is in NPOV, not in stopping editors from adding "material—such as facts, allegations, and ideas—for which no reliable, published sources exist". No sources means no existing sources at all – not just no secondary sources.
    WhatamIdoing (talk) 02:50, 9 January 2023 (UTC)
1) Knowledge production: That phrase is meant to convey creating knowledge. NOR emphasizes that Wikipedia is aimed at communicating knowledge through original writing, not at creating it new. Wikipedia does that by summarizing sources, but not all sources are the same, so their use and usefulness in our summarization writing differs depending on the source, and on how we regularly combine sources of different types, with the continuing aim of not being original. The three types of sources are a categorization that is widespread outside Wikipedia; it is not something Wikipedia invented. It is a conceptualization that is useful in research-based expository writing, when one is putting sources together so as not to create new knowledge. They serve both as exemplars of writing to be mimicked and as a warning of what not to do, e.g., you should not be a primary source. All three policies have some overlap and are designed to be read together, but their emphasis is different. NPOV is not emphasizing the handling of new knowledge creation; its focus is taking already existing sourced knowledge. V is not emphasizing creating new knowledge; its focus is about taking a single source.
2) Your phrase, "textbooks for children", clues the clued-in researcher/writer in on its usefulness in our writing (not much), and being clued-in involves dealing in the three types of sources. Anyone who has seen textbooks for children will also often see primary source material in them too, even if it is just a phrase or sentence of an original document (or a pull-out box of someone's quote). But even where a tertiary source only refers to secondary source material, it subtextually encompasses the primary source material of the secondary source material. A secondary source, being an exemplar of analysis, gives lines our writer is not to cross: not your own analysis, their analysis is what you are to convey. Alanscottwalker (talk) 22:43, 9 January 2023 (UTC)
  1. As a point of practical politics, editors are generally very suspicious of anyone who says he's creating knowledge on wiki. I think I know what you mean, but the very idea of producing knowledge here is going to make editors' skin itch.
    I'd like to know more about your implicit statement that NOR's focus is not on existing sourced knowledge. You say that NPOV is about "already existing sourced knowledge". NOR prohibits all content/knowledge that isn't "already existing" in the real world. If NOR prohibits "material—such as facts, allegations, and ideas—for which no reliable, published sources exist", then how is NOR not about making sure that Wikipedia contains only "existing" knowledge?
    And again: What does the requirement that Wikipedia articles contain only knowledge that already exists in the real world have to do with PSTS? Yes, we've borrowed and adapted the three categories of sources from the real world. But OR is banned with primary sources exactly as much as it's banned with secondary and tertiary sources. So why does PSTS need to be explained specifically in the context of "material—such as facts, allegations, and ideas—for which no reliable, published sources exist", instead of having it explained in the context of writing a decent encyclopedia article?
  2. Using a textbook written for children (e.g., for 12 year olds) would not violate NOR. I don't remember whether you were involved in the Wikipedia:Identifying reliable sources (history) proposal, but one of the reasons it failed was because editors refused to agree that a textbook written for even young children was an unreliable source. You'd be better off having most of the article WP:Based upon scholarly books and upper-university textbooks, but a textbook for 12 year olds can be relied upon to correctly report simple facts, such as that Guy Fawkes didn't blow up Parliament, that Abraham Lincoln was the 16th president of the US, and so forth. I'm not sure why straightforward, simple NOR operations ("Don't write that Guy Fawkes blew up Parliament in an article, because that's "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" and thus a NOR policy violation") require us to prize secondary sources. Do they really? Or is prizing secondary sources less about NOR per se, and more about writing a decent encyclopedia article?
WhatamIdoing (talk) 22:36, 15 January 2023 (UTC)
@Alanscottwalker, I wonder if you are interested in continuing this conversation. WhatamIdoing (talk) 02:44, 24 February 2023 (UTC)
I think I have already addressed your points; the clue to the focus is the word "No" in the title of this policy, as it has pride of place. And as I already indicated, misusing primary, or secondary, or tertiary sources likely leads to publishing original research, which is a "No" - to even begin to not misuse them, you have to have some understanding of them. -- Alanscottwalker (talk) 04:09, 24 February 2023 (UTC)
@Alanscottwalker, that seems unlikely to me. Did you need to read PSTS before you were able to figure out how to handle sources without introducing "material—such as facts, allegations, and ideas—for which no reliable, published sources exist"? For example, your first edit cited a source. Did you need to study PSTS to know whether you were using the source correctly? WhatamIdoing (talk) 02:26, 26 February 2023 (UTC)
Personal questioning? As should be apparent, without getting personal, writing based on research in sources was not invented by Wikipedia. The conceptualizations of primary, secondary, and tertiary predate Wikipedia. As already established above, Wikipedia did not invent these concepts, nor did Wikipedia invent the processes of writing - these processes and conceptualizations were already within standard educational models for writing. To the extent there is something new here, it would be necessarily communicating to each other the encyclopedia process, and having it be replicable and replicated, mimicable and mimicked, in a public wiki for all to see and do, at the same time. -- Alanscottwalker (talk) 08:33, 26 February 2023 (UTC)
On the one hand, you say that "to even begin to not misuse them, you have to have some understanding of them", but on the other hand, you say that you didn't actually have to read PSTS to avoid OR. So why is it essential, in avoiding OR, for editors to read the thing that neither you nor I needed to read? WhatamIdoing (talk) 02:48, 21 March 2023 (UTC)
It's essential because doing reading and research writing is using PST. -- Alanscottwalker (talk) 06:11, 21 March 2023 (UTC)
@Alanscottwalker, I have a question about a different statement you've made in support of keeping PSTS inside NOR, rather than splitting it out to its own independent policy. You say: The rest of the policy does refer to the three types, everytime it mentions sources
I would like you to open the policy page, find the PSTS ===subsection=== and blank it. Do the same for the ==See also== section. Then count up how many times you find these words in the text of the policy:
  • primary
  • secondary
  • tertiary
I'm not sure how to understand your claim. Do you mean to say that there is a secret, unwritten mention of primary, secondary, and tertiary sources, so that where the policy says "independent sources" or "reliable sources" or just plain "sources", we are expected to read "independent primary, secondary, and tertiary sources" or "reliable primary, secondary, and tertiary sources", etc.? WhatamIdoing (talk) 02:55, 21 March 2023 (UTC)
They are upfront, so it is no secret, they are the 3 categories of sources, encompassing all sources. -- Alanscottwalker (talk) 06:24, 21 March 2023 (UTC)

RFC on clarification to this proposal

I have started a RFC at WP:VPP asking for clarification of the OR policy regarding the use of maps and charts. This is related to a couple of threads that are already on this talk page where such clarifications were discussed. Dave (talk) 05:51, 19 March 2023 (UTC)

This is now at Wikipedia:Requests for comment/Using maps as sources. So far, 52 editors have made a total of 348 comments in this three-part RFC. Your participation would be welcome. WhatamIdoing (talk) 02:58, 21 March 2023 (UTC)
The RfC has been expanded since announced here. The proposals are now:
New proposals are marked in bold. BilledMammal (talk) 23:47, 27 March 2023 (UTC)

Implicit synthesis

The policy clearly states that implicit synthesis should not be made by editors. But what about cases where there is implicit synthesis in the source? TFD (talk) 21:21, 7 January 2023 (UTC)

Any synthesis in a source (whether implied or implicit) would not be a violation of our NOR policy… as the synthesis does not Originate with a WP editor. NOR is about what we write, not about what the sources write. Blueboar (talk) 22:11, 7 January 2023 (UTC)
I have had a change of mind… because an implied synthesis could be unintentional. We might see a connection between things in a source that the author didn’t mean to be connected. If so, then we are the ones connecting the dots, not the author of the source. And that is indeed OR. Blueboar (talk) 22:21, 7 January 2023 (UTC)
Specific examples would be helpful. M.Bitton (talk) 22:28, 7 January 2023 (UTC)
A good secondary source should be explicit in the synthesis. It may not be a good source. I’m imagining an example where examples are grouped, and it is implied that grouped examples are similar, but it is never stated why the examples are grouped together. SmokeyJoe (talk) 22:52, 7 January 2023 (UTC)
Not sure… if an author states that A, B and C are all examples of X, wouldn’t the synthesis be explicit? The author is explicitly linking them as examples, even if the author does not explain why they are examples. Blueboar (talk) 23:07, 7 January 2023 (UTC)
An author stating A B C are examples of X is explicit, even if weak on why. An author might put Q R S together, maybe after previous groupings with explicit reasons, and it may be that they are similar, or it may be that they are the leftovers. A reader may infer a reason for the grouping, which would be synthesis by the reader.
Specific examples, would be helpful. SmokeyJoe (talk) 23:19, 7 January 2023 (UTC)
  • I think this is about this (permalink). The disputed edit is whether to say "Four officers who responded to the attack killed themselves within seven months", on the basis of this source (archive link). My take is that yes, the Reuters source is reliable for the claim that four officers who responded to the attack died by suicide, and to say so doesn't violate WP:SYNTH because the inference there is in the source. This is because SYNTH stops Wikipedians from reaching novel conclusions, but it doesn't stop sources from reaching them. However, I think it's a problematic edit to make for other reasons, and I'd draw your attention to this essay and the discussion with User:WhatamIdoing at User talk:S Marshall#Disapproving tone which gave rise to it.—S Marshall T/C 00:13, 8 January 2023 (UTC)
    The source makes a primary source observation connecting an event to a later cause of death. This is a primary source in demography. The source, Reuters, and the named author, is not a reliable source for demographic synthesis. Reuters, in my opinion, is a leading quality news reporter for primary source information and standing back from opinion and bias and any other form of secondary source content.
    Statistics is a mature robust academic discipline. Four deaths is not statistically significant.
    The making of a connection, four counts, a demographic connection between an event and later deaths, might not be WP:OR due to a source having done it, but the source is unreliable for any judgement on the reliability of there being a connection.
    # Primary sources that have been reputably published may be used in Wikipedia, but only with care, because it is easy to misuse them.[e]
    Any exceptional claim would require exceptional sources.
    WP:EXCEPTIONAL applies.
    - SmokeyJoe (talk) 01:18, 8 January 2023 (UTC)
  • Sorry, SmokeyJoe. I know I keep disagreeing with you and I swear it's not personal. We've known each other on-Wiki for a long time and agreed a lot over the years. But I can't accept what you say here.
    On your first point, Reuters is reporting figures from the District of Columbia Police Department. The District of Columbia Police Department is a reliable source for the cause and number of deaths of its own staff, and Reuters is an editorially independent secondary source.
    Counting suicides is not demographic synthesis.
    Around 2,000 police officers were deployed on 6 January 2021. Between that date and the Reuters report on 3 August 2021, four of those police officers killed themselves. Well, age-standardized suicide rates in the US for a comparable period are 14.5 per 100,000 (according to List of countries by suicide rate which is in turn based on WHO data). So contrary to your statement that "four deaths is not statistically significant", it is in fact quite a few standard deviations from the mean.
    The problem is that the Reuters article doesn't go further: it just counts the four suicides. In doing so it is blatantly inviting the reader to draw conclusions about the effect of the capitol attack on police officers' mental health, but it doesn't provide the proper paper by an academic statistician that we would consider ideal.
    But we can't rule out Reuters as a news source for articles about law and order in the US.—S Marshall T/C 12:30, 8 January 2023 (UTC)
    With such determined different perspectives from amateurs, there’s no way Wikipedia can work. SmokeyJoe (talk) 12:38, 8 January 2023 (UTC)
  • And yet somehow Wikipedia does work… amazing! Blueboar (talk) 13:13, 8 January 2023 (UTC)
    That's not statistically significant? Says who?
    In the US, a little more than four out of each 30,000 people died of suicide over the course of the 12 months of 2020. Among the officers surviving the attack, four out of less than 1,000 died of suicide over the course of just 7 months. That's about a 50x difference. I haven't actually calculated the p-value, but I'm confident just from a glance that a 50x difference on these numbers is statistically significant.
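    The arithmetic in the paragraph above can be sketched as follows. This is a minimal illustration, not a proper statistical analysis: it takes the rounded figures exactly as quoted (4 per 30,000 over 12 months; 4 per 1,000 over 7 months, although the comment says "less than 1,000"), and it assumes a simple Poisson model with independent deaths, which is an assumption added here and not something stated in the discussion. The function names are hypothetical. It does not address selection effects or causation.

```python
import math


def annualized_rate(deaths, population, months):
    """Deaths per person per year, scaling the observation window to 12 months."""
    return deaths / population * (12 / months)


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected) (assumed model)."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Rounded figures as quoted in the comment; 1,000 is an upper bound there.
baseline = annualized_rate(4, 30_000, 12)  # general US population, 2020
officers = annualized_rate(4, 1_000, 7)    # responding officers, 7 months
rate_ratio = officers / baseline

# Deaths expected among 1,000 people over 7 months at the baseline rate.
expected_deaths = 1_000 * baseline * (7 / 12)
p_value = poisson_tail(4, expected_deaths)

print(f"rate ratio: {rate_ratio:.1f}")
print(f"expected deaths at baseline: {expected_deaths:.3f}")
print(f"one-sided Poisson p-value: {p_value:.2e}")
```

    Under these assumptions the annualized rate ratio comes out close to the "about 50x" figure, and the one-sided tail probability is very small. Using a smaller officer population than 1,000 would only make the ratio larger.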
    More to the point, if a source says something, then it's not OR. It could be UNDUE; it could be a NOT violation; it could be an inappropriate source; it could be the kind of source that isn't really usable in practice; it could be a source that is misused (e.g., it needs WP:INTEXT attribution), but the one thing that content taken directly from a generally reliable, published source is not is "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" – and, for better or worse, that is exactly the definition of OR. WhatamIdoing (talk) 02:30, 9 January 2023 (UTC)
    Four is a small number.
    Using Google Scholar to search: capitol attack and police suicide, since 2022, yields many results. This is a serious matter, and Wikipedia should proceed carefully. It is not an OR matter. SmokeyJoe (talk) 03:13, 9 January 2023 (UTC)
    This is why no original research is foundational to Wikipedia and why it predates even the requirement for referencing. Before we had developed the ideas behind ref templates etc., Wikipedia still knew that it was not an outlet for original thought. That means we have to accept the conclusions of reliable sources. If the reliable sources ascribe event A to cause B, in the preponderance of their weight and reliability of course, we should also accept that as fact. Original research is original speculation, or connecting dots without a foundation, or creating a chain of implications. It also could be trying to limit the boundaries of what is reliable in a novel way. For example, in the situation where someone has conducted a statistical analysis to determine whether the capitol police suicides, we have to accept what sources ascribe that to. If there are notable minority viewpoints we may consider those in an attributed, contextualized way. But doing math to verify the claims made by reliable sources for editorial evaluation and using that to influence the article is a kind of synthetic thought that goes beyond the bounds of encyclopedic summarization. We simply need to present what the scientific, or journalistic authorities have reported. Andre🚐 03:53, 9 January 2023 (UTC)
  • Yes, exactly, but what's interesting in this case is that the source doesn't explicitly reach a conclusion. It gives you the numbers and implies the conclusion that these suicide rates are abnormal. That conclusion is not unreasonable because mathematically they are abnormal (as verifiable by Wikipedians doing basic arithmetic that would pass WP:CALC); so I would say we can give the numbers from the source.—S Marshall T/C 08:45, 9 January 2023 (UTC)
    I concur. Andre🚐 22:33, 9 January 2023 (UTC)
    I concur with what you have written. Reporting reported data is absolutely fine. Reuters is somewhat a secondary source because it contextualises the data, even if it doesn’t do much more.
    What I mean is that the source is unworthy to claim a connection. It may be random. The p-value is not reported, and even then it may be cherry picking of extraordinary data from a large data set (a selection bias). There may be a non-causal correlation (eg maybe the capitol attacks happened in part due to a recent decline in respect afforded to police). There are indeed quality secondary sources addressing the connection, but the Reuters report is not one of them. The unusual slip towards comment by Reuters might be attributed to Reuters staff and the author being well aware of the quality secondary sources. SmokeyJoe (talk) 00:38, 10 January 2023 (UTC)
  • The Reuters report definitely is an example of implicit synthesis (connecting these police suicides with the Jan 6 riot). However, it does not violate our WP:NOR policy because it is Reuters making that implication, not a Wikipedian. We can talk about other reasons to omit/keep the information, but NOR isn’t in play here. Blueboar (talk) 13:19, 9 January 2023 (UTC)
  • I agree what Reuters did doesn't violate Wikipedia's SYNTH rules since the implied causal relationship comes right from the source. However, I also agree that this is a great example of an implied claim that is also an exceptional claim. Per WP:V, exceptional claims require multiple high-quality sources. In this case, the implied claim would need exceptionally high-quality sources, and a normal news source, even a good one like Reuters, isn't at that level. It's OK for a news source to beg a question (or conclusion) like this, but an encyclopedia shouldn't. Springee (talk) 18:37, 9 January 2023 (UTC)
    But the majority of the reliable sources do make the same conclusion. Andre🚐 01:56, 10 January 2023 (UTC)
In which case, we probably should cite those sources instead of Reuters. Blueboar (talk) 02:39, 10 January 2023 (UTC)
Additionally, we should check to see if any sources dispute the association or note that no causal relationship has been established. Springee (talk) 03:49, 10 January 2023 (UTC)
This factcheck.org article suggests that two of the 4 suicides are likely causal (and if I read correctly, later declared deaths in the line of duty) while the other two were much later and may not be causally linked. [5] Springee (talk) 04:00, 10 January 2023 (UTC)
@TFD The policy, WP:NOR, as currently written, is clear that the source must explicitly support the article content:
The best practice is to research the most reliable sources on the topic and summarize what they say in your own words, with each statement in the article being verifiable in a source that makes that statement explicitly.;
Even with well-sourced material, if you use it out of context, or to reach or imply a conclusion not directly and explicitly supported by the source, you are engaging in original research;
Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source. Similarly, do not combine different parts of one source to reach or imply a conclusion not explicitly stated by the source.;
A source "directly supports" a given piece of material if the information is present explicitly in the source so that using this source to support the material is not a violation of this policy against original research.
Rotary Engine (was Ryk72) talk 22:43, 13 January 2023 (UTC)
WP:SYNTHNOTSUMMARY. The quoted policy says that the best practice is to do X, and says that reaching a conclusion not directly supported is OR. It does not say that simply summarizing conclusions made by sources is OR. In the case of the suicides, the sources aren't saying; they are simply stating statistics (WP:CALC), and we summarize their descriptions. Sources are perhaps also inferring or implying stuff, but we are not. We are simply summarizing. Andre🚐 22:55, 13 January 2023 (UTC)

One way to look at it is that making the (implied) statement needs to be directly supported by the source. The source did not directly make that statement and so is not sufficient to support including that implied statement. North8000 (talk) 01:51, 10 January 2023 (UTC)

That is contrary to my understanding. Andre🚐 01:57, 10 January 2023 (UTC)
Well, if there are multiple sources that say something similar, isn't there one that says it more directly? (In which case my point would be moot.) Or even a source that reports that others postulate a connection. North8000 (talk) 02:09, 10 January 2023 (UTC)
If there are a series of reliable sources that all make the same implied connection, and none that say otherwise, that is sufficient to state it as fact in Wikivoice. Andre🚐 02:52, 10 January 2023 (UTC)
Per my comment above, Factcheck.org only links the two suicides that occurred within a month of Jan 6. Springee (talk) 04:02, 10 January 2023 (UTC)
You don't need to make the comment twice. I don't see how the Factcheck.org link is germane to the discussion of original research. That probably belongs on an article talk page. Andre🚐 17:56, 10 January 2023 (UTC)
My sincere and deepest apologies for thinking to post the comment twice. I'm sorry you had to take the extra time to read it a second time. The reason why it is relevant is you suggested that if no sources disagree... well at least one does. Springee (talk) 04:10, 11 January 2023 (UTC)
I don't mind reading it but it fragments the discussion. That source doesn't really say what you say it says. It's older than some sources, and it was updated later with a note on the bottom, and it is presenting a nuanced point that really doesn't contradict the other sources. In fact, it specifically avoids taking a position: We take no position in the debate over whom to include in the deaths from the riot Andre🚐 04:21, 11 January 2023 (UTC)
The discussion was already fragmented so I decided to reply to both fragments. Since we are quoting things, "On Aug. 2, the Washington Post reported: “Authorities drew no connection between the riot and his death. An official familiar with the investigation said Hashida had struggles beyond Jan. 6 that could have played a role.”" The quote you include actually applies to all the deaths they discuss, including Ashli Babbitt and not just the suicides. Springee (talk) 04:34, 11 January 2023 (UTC)
That was outdated, and it's just a reference to another Washington Post article that is outdated. This has nothing to do with original research at all. Nor does it contradict the claims of other sources, also reliable, that Jan 6 played a role in the suicide. Andre🚐 04:43, 11 January 2023 (UTC)
Outdated? So what newer articles make it clear there was a causal relationship, and how did they prove it? I guess that doesn't need to be answered here vs on the article talk page, but it seems you are accepting of questionable correlation. Incidentally, we certainly can say that some politicians associated the suicides with the attack, but that was likely for political rather than evidentiary reasons. Why Wikipedia editors would want to put an implied conclusion in our article, I don't know. I'm sure we can find sources that say who made the associations instead. Springee (talk) 05:08, 11 January 2023 (UTC)
It's not original research to not check the "proof" of what sources did. They just say stuff and we say the same stuff. We do not need to check their reasons or their conclusions or implying conclusions or making any conclusions. It's simple. The sources say X number of people died from Y, we just say the same thing. We are not implying a conclusion. Andre🚐 22:57, 13 January 2023 (UTC)
The Reuters example is an edge case, and edge cases make for bad policy. This is an example where we would need to discuss. "The source connects these facts." "No, it doesn't explicitly do that, read carefully." If we treat this connection as a more extraordinary claim, then we'd expect more than one source to say it. Discussion is always going to be important, and it's impossible to create a policy that settles every argument in advance. Broadly speaking, it's not WP:OR when a reliable source says it, but it might be WP:OR if I draw a controversial interpretation from that source. Shooterwalker (talk) 15:36, 10 January 2023 (UTC)
  • This does not seem that hard to me, the primary source is things like the death certificate, statements made by people with personal knowledge (decedent, family, doctors) etc.; the secondary, here, is the report not from someone with personal knowledge of the primary fact (but with expertise in reporting events), that placed the fact within the topic (should we decide to use it, the matter is a V issue of sticking to the source, don't misrepresent it); and the decision remaining is an issue of basically NPOV, comparative use of source material. -- Alanscottwalker (talk) 16:01, 10 January 2023 (UTC)
    WP:Secondary does not mean secondhand. For most of Wikipedia's purposes, most news is WP:PRIMARYNEWS. WhatamIdoing (talk) 00:25, 11 January 2023 (UTC)
    As I alluded to earlier, while I agree with this, I think it's worth noting that it is specifically breaking news reporting that should be considered purely primary. As time goes on, major news media and editorial outlets start taking on secondary analysis. It seems to happen rather quickly these days. And many political articles would be lightly sourced indeed if we didn't start considering articles in major media outlets as secondary sources after a while. Andre🚐 00:36, 11 January 2023 (UTC)
    I agree with this. Quickly? I’d say they start in about a week, are well into it in a month, but are not good secondary sources until about a year later.
    In any case, a breaking news topic is the sort of topic that should never be deleted, or AfD-ed, on the basis of being a NOR failure, but should be updated daily with better sources according to the advice of PSTS. (Is this a reason for PSTS to be a mere guideline??). Note here WP:DRAFTIFY#2c specifically excludes “a new topic likely to be of interest to multiple people (such as current affairs topics)” from back door deletion. It is a great strength for Wikipedia, and a means for new editor recruitment, that Wikipedia is up to date, within minutes of release of the first reliable source on a breaking news topic. SmokeyJoe (talk) 04:50, 11 January 2023 (UTC)
    I agree with most of that Andre🚐 04:54, 11 January 2023 (UTC)
    How unconvincing your rote employment of such shibboleths is. Examining the actual information in issue, examining the context in issue, examining the actual sources in issue, is the only useful path, here. Distance matters and always will (distance gives a wider view). While it is the case that newspaper articles from a historical time period are viewed as primary sources, that is still a function of distance (but your analysis would rate historians as mere second handers); it is also the case that many newspaper articles of recent events are the only thing providing context for now (and doing it much better than any Wikipedia editor could possibly do it). Newspaper articles may be the first draft of history but that's fine, because in our recent events articles, that is what we are doing. The unfinished work-in-progress pedia is what we are. (And anyone who is knowledgeable of the state of Wikipedia knows by now, no matter how much anyone may decry it, we are not going to stop the flood of recent events articles in the pedia; our only way forward is finding the best sources that exist now, those that give context.) -- Alanscottwalker (talk) 01:29, 11 January 2023 (UTC)
    I do generally agree with most of this - funny that I'm ending up here at the end of this with this dovetailing nicely with the prior thread Andre🚐 04:57, 11 January 2023 (UTC)
    Secondary sources contain some sort of analysis ("A review of previous reports of house fires in our town indicates that this has been the biggest fire in town for at least the last 20 years"). Secondhand reports may or may not contain any secondary contents. "My neighbor told me that she was afraid that he'd kill her" is secondhand. It is not secondary. WhatamIdoing (talk) 03:34, 14 January 2023 (UTC)
    But that's the point, there's breaking news reporting close to an event, and then we reach a point where they start analyzing and reaching conclusions about what's happening probably about a week later or less. The breaking of breaking news is a pure primary source. The analysis articles may be a mix of primary/secondary and becoming more and more secondary over time until they are mostly. Andre🚐 03:37, 14 January 2023 (UTC)
    I agree. The contents of a newspaper cannot be accepted as pure secondary ("but it's all secondhand information, and the author has expertise in reporting events, and there are journalistic ethics, and..."). They can also not be dismissed as pure primary ("but 19th-century newspapers get treated as primary sources by historians, so why not 21st-century ones, too?!"). Most of what appears in my local newspapers are primary sources and should be treated as such by Wikipedians (also, they should mostly not be used at all, because "New store opened" or "Mayor presided at city meeting" or "Police arrest drunk drivers on New Year's Eve" are not usually suitable for an encyclopedia article). However, that's only most, and some of it is secondary sources. WhatamIdoing (talk) 03:42, 14 January 2023 (UTC)
  • Synthesis only applies when an editor combines two different sources to reach an implication not present in either. If a WP:RS itself combines two datapoints to make an implication, then our default stance should be to assume that that implication has been through their standard fact-checking and accuracy. Editors might object to it because they feel it is undue (especially if it is just a passing mention), but I think it is generally inappropriate to object to it because an editor feels the source is performing invalid synthesis, since that is functionally no different from "well, I think the source is wrong" or "well, the source hasn't convinced me, personally." Synthesis, like original research, is something we are not permitted to do ourselves but which we are supposed to rely on sources for - saying "the implication of these two things is significant and meaningful" is the kind of thing a secondary source is for. --Aquillion (talk) 13:26, 30 March 2023 (UTC)

Is constructing lists from multiple unrelated sources WP:OR?

With this I am referring to two types of lists:

  1. Lists with objective inclusion criteria, but which contain subjective information
  2. Lists with subjective inclusion criteria

For an example of #1, List of wars by death toll. It uses different sources for each entry, and each of the different sources uses a different method; some rely on ancient sources which typically provide inflated counts, while others rely on more modern sources - but even more modern sources can differ in methodology. Is it synth to use these different sources, with wildly different methodologies, to tell the reader that more people died in the Greco–Persian Wars than in the Wars of the Roses?

Another example of #1 that is possibly less clear would be List of wealthiest religious organizations. It uses more modern sources, whose methodologies won't differ so wildly, but the sources are still unrelated. Is it synth to use different articles, from different publishers, to tell the reader that the Catholic Church in Germany is wealthier than the Catholic Church in France?

For an example of #2, List of massacres in France. It uses a variety of different sources to determine whether an event should be called a massacre; is it synth to use different sources to tell the reader that Siege of Avaricum, Massacre at Béziers, St. Bartholomew's Day massacre, and Marseille bar massacre are all comparable and classified under the same definition of "massacre"?

BilledMammal (talk) 02:04, 20 March 2023 (UTC)

There's generally no problem with using multiple sources to build out a list, as long as all the sources are generally reliable. Eg: there's no issue with using an academic journal to list the death toll in one war, a book for another, and a modern-day newspaper article for yet another, as long as all three are generally reliable. I think there can be an issue with the last example, where the definition of the list includes terms that can be taken subjectively, and that's where there must be clear reliability on the sources. Masem (t) 02:09, 20 March 2023 (UTC)
For #1, what if the academic journal, the book, and the modern-day newspaper all use different methodologies? If there is a widely agreed upon methodology to calculate a number then I would consider that similar to a list of French writers - an objective list that a reliable source would assemble, if they had an interest in doing so - but if there is not, if different reliable sources have different methods of coming to their own conclusion, then I am concerned that we are producing a list that no reliable source would ever assemble and that makes statements that no reliable source would ever make.
The same goes for #2; when no reliable source has placed two items in the same subjective categorization, is it appropriate for us to do so? I see that List of video games considered the best takes a novel approach to this; they require that six reliable sources consider a game to be the "best/greatest of all time" - in other words, it appears to require that there is a consensus among reliable sources that the included game is very good. It might be a good idea to apply this requirement to all subjective categorization lists; require that for a topic to be included in any such list there must be a consensus among reliable sources that the topic belongs in such a list. BilledMammal (talk) 01:46, 21 March 2023 (UTC)
If there is a list definition that is decidedly more subjective or requires more than simple factual statements, it does seem reasonable to ask list editors to require multiple sources, so that one source doesn't create UNDUE inclusion. Masem (t) 02:16, 21 March 2023 (UTC)
I've opened a discussion on that; Wikipedia talk:Stand-alone lists#Creating minimum inclusion criteria for lists involving subjective categorization.
I'll think more on #1, as I'm still concerned that we're engaged in WP:SYNTH and in the process making incorrect statements that no reliable source would make. BilledMammal (talk) 02:40, 21 March 2023 (UTC)
I think your description of these as containing "subjective information" does not represent your examples. Subjective means that your view depends on your personal experiences/beliefs/values. For example: Is it good or bad to have teenagers wear school uniforms? One person will say "It's good, because then rich kids aren't showing off so much, which made me feel like we were all equal in the classroom." Another person will say "It's bad, because my school uniforms were always ugly and I wanted to be able to express my individuality." Neither of them are wrong; it depends on what "the subject" thinks.
In the lists you've mentioned, we're not talking about subjective information. We're talking about different sources counting to the best of their objective abilities. We're not going to get differences based on personal experience or identity; we're going to get differences based on newly discovered information or specific limitations (e.g., only battlefield casualties vs population-wide excess mortality attributable to the war). Some of this can be handled by providing a variety of estimates (high, middle, low numbers) or by adding a note ("called the best, but only on his mom's Facebook page").
Overall, I offer this advice: Whatever you do, try not to break pages like List of alternative rock artists and List of anti-war songs. Musical genres are blurry (which is not quite the same thing as being subjective), and whether a song like "Turn! Turn! Turn!" is sufficiently anti-war to be included could be debated. WhatamIdoing (talk) 03:42, 21 March 2023 (UTC)
"Subjective" might have been the wrong word, but the differences aren't just due to newly discovered information or specific limitations (e.g., only battlefield casualties vs population-wide excess mortality attributable to the war), but also due to different methodologies - this is particularly clear in List of wealthiest religious organizations where the wealth of the Sree Venkateswara Swamy Temple is determined through public declarations by the temple, the wealth of the Catholic Church in Australia is determined by assessing property values in New South Wales and extrapolating the value across the country, the wealth of The Church of Jesus Christ of Latter-day Saints is determined by a whistleblower, the wealth of the Seventh-day Adventists is determined through an undisclosed method of estimation, etc.
It is this difference in methodologies that I am concerned causes WP:SYNTH issues, because no reliable source has said that the Sree Venkateswara Swamy Temple is one and a half times as wealthy as the Catholic Church in Australia, and I don't believe any reliable source would unless they have used the same methodology to estimate their wealth.
I don't know enough about pages like those to comment, and will leave them to more educated editors. BilledMammal (talk) 03:54, 21 March 2023 (UTC)
This last comment is very helpful for me, because I don't think "subjective" is in any way the "right word" here. I would observe the following:
(1) a list (or in-text comparison) that brings together sources that use multiple, different methodologies would generally fall afoul of WP:SYNTH;
(2) a list (or in-text comparison) that brings together multiple sources using essentially similar methodologies would in general not represent SYNTH;
(3) reading sources to determine how they report their own methodologies is an appropriate role for editors as they collaborate to establish a page-level consensus (which, in this context, could be "the sources are essentially coherent", "the sources are essentially incompatible", "a subset of sources are coherent" or "the domain is too fundamentally contested to assess the available sources", among other possibilities).
These observations are entirely orthogonal to whether or not "subjective" criteria are used - public opinion polling, for example, can reach findings about "subjective" beliefs that are in themselves somewhat robust, based on consistent methodologies. Meanwhile, competing and incompatible estimates can be developed - using differing methodologies - of phenomena that in themselves are not "subjective" at all, such as the mass of astronomical objects. Newimpartial (talk) 18:34, 21 March 2023 (UTC)
You've raised a concern about making unsourced statements, and I don't think that's what's happening here, at least not in the list that I looked at (List of wealthiest religious organizations). That list is a collection of reliably sourced information that could exist anywhere in the encyclopedia, and doesn't draw any conclusions other than the data provided. For example, if a reader can make a conclusion from the list that church A is twice as wealthy as church B, they can make the same conclusion from the individual articles as well. So are those wealth estimations reliably sourced enough to be included in their respective articles? If so, then I don't see much of a problem including them in other articles, including lists. If they're not reliable, then inclusion is problematic everywhere, not just in that list. Orange Suede Sofa (talk) 23:04, 21 March 2023 (UTC)
The difference I see is that with the inclusion of the estimates in the individual articles we are not inviting and encouraging comparison; we are not saying that these figures are comparable. When we put them in a list that is what we are saying. BilledMammal (talk) 12:42, 22 March 2023 (UTC)
This type of list is common, and what should be done in the lede or pre-list is to explain, to the reader, how entries are included and in such a case, while there is no common scale or calculation used, the RSes that support the number are given within the list table. If there is one most authoritative source but known to have gaps, the list can explain that most entries are to that authoritative source while other entries are slotted appropriately using data from other RSes. Some brief explanation of how the list is assembled, set at the reader level. One example of such is the prose right before the table on List of biggest box-office bombs. Masem (t) 12:47, 22 March 2023 (UTC)
I think that differing methodologies count as "specific limitations". The way to present that is to add explanatory notes. It's fine for someone to see "Alice Church 100 (extrapolated from real estate prices)" next to "Bob Church 150 (per disgruntled ex-employee)". What you don't want to see is "Alice Church 100 – Bob Church 150", with no hint that these estimates are not comparable. WhatamIdoing (talk) 02:34, 23 March 2023 (UTC)

Most lists are technically WP:OR taking the policy literally and by itself. The creator has created a topic which (particularly for compound criteria lists) is per se not covered by RS's. Then, the applicability of the title of the list to the entry has no straightforward sourcing. But the overall Wikipedia system does not treat them as OR, so they aren't. North8000 (talk) 19:08, 21 March 2023 (UTC)

  • As is often true on WP… there is no “one-size-fits-all” answer to the question. Such lists certainly CAN be Original Research… but they are not always Original Research. You have to look at the specific list (and its sources) to determine whether it is OR or not. Blueboar (talk) 19:39, 21 March 2023 (UTC)
  • If no reliable source has ever included such a list, it is OR to generate the list. A justification that might work is if the entries are all bluelinks and your list is a navigation aid. —SmokeyJoe (talk) 23:09, 21 March 2023 (UTC)
  • I agree with Blueboar that there's no single answer to this question. In relation to List of wealthiest religious organizations, there are issues that should be addressed, including the methodology used to assess wealth and the date on which the wealth was assessed. My concern is not so much OR as whether the list is accurate.
    In practice, we have sometimes distinguished between list articles and categories, with a less strict approach to the latter. Whether this is justified is another matter: is Category:Anti-war songs any different from List of anti-war songs? Peter coxhead (talk) 11:30, 22 March 2023 (UTC)
  • We expect people to base Wikipedia articles on multiple unrelated sources, and we usually require it. Where all the sources are related, that's potentially a POV problem. In fact, writing proper Wikipedia content involves properly educating yourself about the topic by reading a variety of good sources in a critical and reflective way. We research the topic. The reason it's not original research is that this process is entirely derivative of published work by others.
    This train of thought gives me a set of principles for the construction of lists. To me, it's clear that you can base lists, or any other content, on multiple unrelated sources and in fact wherever possible you should. There's a specific challenge with comparing numbers because we want to know those numbers are properly comparable -- were they calculated using the same methodology? Did the studies cover the same period? The same geographical area? The same population? Where you don't have that information, but you still want to construct a list, I think you need to disclose all the potential problems and inconsistencies as clearly as possible.—S Marshall T/C 10:09, 23 April 2023 (UTC)
    One way to construct lists based on multiple sources can be seen at List of countries by GDP (nominal) per capita; rather than merging the figures together and suggesting that the value for Cuba from the UN is directly comparable to the value for Montenegro from the IMF it makes it clear that the sources are different. I think this is what you are suggesting; when the sources aren't equivalent we make it clear to the reader that they are not. I think it would be worth adding a paragraph to Wikipedia:Stand-alone lists#Content policies requiring this?
    However, I feel this is only applicable to numerical lists; for categorical lists where the criteria for inclusion is debated by reliable sources I still believe we should follow WP:DUE; further discussion can be seen here. BilledMammal (talk) 14:15, 23 April 2023 (UTC)
    Categorical lists are for helping encyclopaedia users find content. They have their own rules at WP:CLN, but the key point is that they're doing the job of our encyclopaedia's index. You say "the criteria for inclusion", and I don't think that's the right way to think about them at all, because these subjective criteria that we're talking about will very rarely be mutually exclusive. So you allow all the different lists with different inclusion criteria. To take a trivial example: I could make a List of Star Wars movies with nine entries, and I could make another Complete list of Star Wars movies with however-many entries, and each would meet some users' needs, so both should exist.—S Marshall T/C 22:58, 23 April 2023 (UTC)

Social media account

If a person has a well-known social media account, does it count as "original research" to refer to that account about a claim regarding themselves? Felixsj (talk) 14:27, 24 April 2023 (UTC)

  • It depends on what you want to say. Are you talking about a direct quote? Blueboar (talk) 14:59, 24 April 2023 (UTC)

Translation (?) of song lyrics

Please see this discussion and chime in if you please! Do we need a clearer guideline re: song lyrics? --SergeWoodzing (talk) 20:11, 1 May 2023 (UTC)

RfC on clarification of WP:CALC for costliest tornadoes

The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
Option 2; there is a rough consensus that calculating tornado costliness based off of NOAA, generally due to issues with NOAA itself, does not fall under WP:CALC but more so WP:OR. Costliness of a tornado must have a reliable secondary source attributed to the fact. InvadingInvader (userpage, talk) 06:35, 3 May 2023 (UTC)

Clarification as requested by Elijahandskip: Most editors seem to think that due to data issues with NOAA itself, that calculating ranks of tornado damage within a year without a non-NOAA source would violate WP:OR. Editors should reference a non-NOAA secondary source when claiming a tornado as the Xth-costliest. InvadingInvader (userpage, talk) 06:58, 3 May 2023 (UTC)

There has been disagreement between editors on multiple occasions whether or not the following situation is original research (not allowed) or if it falls under basic and routine calculations:

Most tornadoes in the United States are given a damage total provided by the National Oceanic and Atmospheric Administration (NOAA). Based on those damage totals, a list of the top ten costliest tornadoes of that year is created. An example would be Tornadoes of 2022#Costliest United States tornadoes. NOAA does not provide a straight list of the costliest tornadoes of the year. This means there is no explicit source saying what the top ten costliest tornadoes of the year are as it was derived from the provided damage totals. Are Wikipedia articles allowed to say X tornado was the (1st/2nd/3rd etc.) costliest tornado of the year under a basic and routine calculation (looking at which numbers are larger than other numbers) or does it fall under original research as no source explicitly states the list?

  • Option 1: It falls under WP:CALC as a basic and routine calculation.
  • Option 2: It falls under original research as no source explicitly states the list.
  • Option 3: Other - Should be described in detail by the editor.

Elijahandskip (talk) 23:06, 19 March 2023 (UTC)

Discussion (tornadoes)

  • Option 1 - The way I look at this is similar to how Wikipedia charts are able to sort information. A really good example is List of the deadliest tropical cyclones. The charts allow the reader to sort the chart in increasing or decreasing order in terms of how many deaths occurred. Each of the deaths are sourced, but one source does not specifically state whether X event was deadlier than Y event. The computer just sorts the numbers as asked to by the reader. In this circumstance, no source says whether X damage total was higher than Y damage total, but the reader can visually see that $5 is greater than $1. So I believe this falls under routine calculations. Elijahandskip (talk) 23:06, 19 March 2023 (UTC)
  • Option 2/3 - But in a way that can be fixed through the in-article wording. If the NOAA doesn't provide a damage estimate for all tornadoes, and doesn't itself publish a list of costliest tornadoes, it would be OR to produce a ranking that makes it seem like there are estimates assigned to all tornadoes. It could be fixed, however, by framing it as something more like "among the tornadoes NOAA published a damage estimate for, this was the second costliest of 2022". — Rhododendrites talk \\ 00:09, 20 March 2023 (UTC)
What about in infoboxes like Tornado outbreak of March 5–7, 2022#Macksburg–Winterset–Norwalk–Newton, Iowa, which was the costliest tornado of 2022 and has a news article backing that up as well. Instead of it being mentioned in the article, it is a comment following the damage total in the infobox. Elijahandskip (talk) 01:01, 20 March 2023 (UTC)
Looks fine to me. If a secondary source has gone through NOAA data and made the superlative claim themselves, I don't think anyone here would complain about including it? — Rhododendrites talk \\ 01:38, 20 March 2023 (UTC)
Also a side note: Rhododendrites all tornadoes technically get a damage estimate, but some are left as "$0" which obviously isn't accurate. A good example is the 2021 Western Kentucky tornado. NOAA breaks up their tornado reports per county. The tornado was on the ground for 165 miles, so it crossed through a lot of counties (I think 15 in total). In the tornado reports, only 1 county has a damage total marked for the tornado that isn't "$0". So its official damage total is >$25,000. Obviously the tornado caused way more than $25,000 (destroyed thousands of buildings), but the damage total hasn't been formally finalized yet. That's the problem. Your wording 100% fixes the error that can be given to a reader, especially since not every tornado will have a true damage estimate. I will note though, per US Law, NOAA is the only source of official US weather data, meaning even if the damage total is marked at "$0", that technically still is an official damage estimate. But I do agree saying something that lets the reader understand that not every tornado has a finalized damage total is the best course of action. Elijahandskip (talk) 01:21, 20 March 2023 (UTC)
  • Question Does the NOAA have a set of guidelines they use when determining what is/is not likely tornado damage? Alternatively do they at least have a dedicated division to compute these figures so the same people are likely to compute these figures for all the tornadoes in a year? IMHO that's the key. If there's something in place to ensure that the same criteria was used to compute damages from a tornado in Florida as there is from one in Minnesota that happened 6 months apart, I don't see the problem. However, for example, if the NOAA were to simply ask the individual states to provide these figures, there's likely not consistency in how those figures were computed. Dave (talk) 00:26, 20 March 2023 (UTC)
Moabdave, let me try to answer some of your question. Yes, NOAA surveys storm damage to determine what is from a tornado or from straight-line winds and such. All the tornado reports are put together by the National Weather Service's local branch for that area and every tornado report can be found on the Storm Event Database (currently contains data from January 1950 to December 2022). For a reference of how a typical tornado report looks, here is one for the deadliest tornado of last year ([6]). Damage totals are split by "Property damage" and "Crop damage". In this case, the report says the tornado caused $75.00M or $75 million in property damage. That is the same for tornadoes dating back to 1950, which is when official US Government records began on tornadoes. Wikipedia has articles for every year (Tornadoes of [year]) with information going back through 1950, with articles being made for years before 1950 as well. Hope that answers your question and helps answer anyone else who had a similar question. Elijahandskip (talk) 01:13, 20 March 2023 (UTC)
That's the only concern I see with it. IMHO, Option1 is fine provided there is consistency in how the figures are computed. I read the concerns so far raised by others and accept those as well, but from what I read they can be addressed by a footnote to clarify the scope of the data. Dave (talk) 16:44, 20 March 2023 (UTC)
@Moabdave: the answer isn't actually that clear. While the reports may have damage totals listed on the surface, the summary text (if one is even given) may say that the damage encompasses the entire event. In some cases, the damage might be tabulated in another listing's summary but not added to the specific event file. The database seems clear-cut at a glance, but there are nuances to understanding the human errors it is riddled with. When it comes to total damage from an event, namely the billion-dollar disaster list, everything is lumped together. ~ Cyclonebiskit (chat) 02:52, 21 April 2023 (UTC)
  • It's not original research to say that one number is bigger than another number. We should of course make certain that all the sources use the same methodology and cover comparable areas and time periods. When that's done, WP:CALC is satisfied. But there are other policies and guidelines that might argue against including this information on Wikipedia.—S Marshall T/C 00:38, 20 March 2023 (UTC)
  • Option 3, per Rhododendrites. The information itself might be fine, but the presentation needs clear explanations. As an example, it seems apparent from the linked list that these are very broad estimates. A secondary source talking about how those values are determined would be good, as well as a list clearly saying it's a list of NOAA estimations rather than presenting itself as the actual cost. CMD (talk) 02:31, 20 March 2023 (UTC)
  • WP:CALC permits editors to say that one number is bigger than another number. Having endorsed #1, I add that outside of the specific, simple context here, there are many pitfalls waiting to trip up the unwary editor. One would need to exercise caution when comparing numbers from different sources, different methodologies, or that might otherwise be non-comparable. It's okay to say that NOAA says this one in 2022 is $25K and that one in 2022 is $24K, so this one is bigger. It's not okay to say that NOAA gives US$25K for a tornado in Texas in 2009, which is bigger than the estimate from a private insurance company of NGN$11.5 million for a tornado in Nigeria in 2023. WhatamIdoing (talk) 03:09, 21 March 2023 (UTC)
  • Option 1/3, an editor can absolutely state that one number is bigger than another, but it needs to be framed correctly as per comments from Rhododendrites and WhatamIdoing. -- LCU ActivelyDisinterested transmissions °co-ords° 19:59, 24 March 2023 (UTC)
  • Option 1 with the caveat that we don't mix sources using different methodologies. If we are to create the list ourselves from damage estimates, we need to be explicit that all of the estimates come from one organization, and use them consistently, and be explicit in the text what that singular source is. --Jayron32 12:30, 27 March 2023 (UTC)
  • Option 2. There are multiple issues here (some of which can be fixed by tweaks as described above, and some of which can't.) One issue is that not every tornado is tracked by NOAA. In practice I suspect that it is unlikely they would miss a tornado that would be in the list, but it's still possible. A more serious issue is that a list like this carries the implication that the "most damaging tornado" or "Xth most damaging tornado" is a significant and recognized status, and that these rankings are a useful and meaningful way to examine them. WP:CALC is for mathematical calculations, not for how information is presented, and in particular I'm skeptical about using it for comparisons, which inevitably carry implications related to the items chosen for comparison. (To respond to a comment above, I absolutely do believe that in many situations it is OR / SYNTH to say that one number is bigger than another, since the selection of numbers to compare can be a form of research or synthesis and can carry unsourced implications.) --Aquillion (talk) 09:29, 30 March 2023 (UTC)
Just as a comment: Speaking honestly, yes, the “Xth most damaging tornado” actually does carry a recognized significance that is mentioned both by NOAA and media RS as well as academically published papers. For example, NOAA has a page specifically on the 10 Costliest U.S. Tornadoes. One of the NWS branches, NWS in Norman, Oklahoma, had a list of the Top Ten Costliest Oklahoma Tornadoes. Numerous RS contain similar things as well. The Weather Channel, NBC News, & a key one being KCRA, which specifically states the costliest tornado of 2022. So in terms of coverage, the costliest nature is 100% a factor used for comparison. Elijahandskip (talk) 12:55, 30 March 2023 (UTC)
If secondary sources cover the costliest tornadoes in a particular way, then we should use those sources with the timeframe and categorization that they use. But I'm skeptical about editors pointing to one arrangement of data to justify a different arrangement of data with potentially different implications. It looks to me like what most RSes cover is the costliest tornadoes of all time, so why not stick to that, and only list costliest tornadoes per year when we have a secondary source showing that a list for that year is relevant? --Aquillion (talk) 13:21, 30 March 2023 (UTC)
NOAA is the source for those secondary sources. A secondary source saying the costliest tornado of X year is using data from NOAA. Why would it not be ok to just use the NOAA data, rather than wait for a source to use it? Everyone (including the general public) has access to the data through the Storm Event Database. That is where RS get the data. That is where finalized tornado reports go as well. Without really saying it, your reasoning somewhat would deprecate NOAA finalized tornado reports, since 99% of the report would be usable on Wikipedia, but this singular section about the tornado’s damage total would not be usable. For instance, in the database, anyone can sort it however they want. In this example, I sorted it to be all time. Would that count as a source for the costliest tornadoes of all time? One could argue that based on the specifications, the U.S. Government just specifically said this tornado’s report caused more damage than this tornado’s report, since it is arranged in that order by their computer system. Where do you think the media get their data? They go to the interactive database, hit sort based on what specifications they want, and boom, they have their data. If an editor needs to wait for RS confirmation, then effectively, NOAA tornado reports would need to be deprecated, aka the U.S. government be deprecated. Elijahandskip (talk) 13:44, 30 March 2023 (UTC)
Why would it not be ok to just use the NOAA data, rather than wait for a source to use it? Because primary data must be contextualized by a secondary source for us to extrapolate meaning from it, because content must be not only verifiable but also comply with NOT, and because we cannot imply a conclusion not directly and explicitly supported by the source. JoelleJay (talk) 02:25, 20 April 2023 (UTC)
By that logic, the US Government (including the White House) can't be used as a source. NOAA is the US government, so we are effectively deprecating them by saying we cannot use them as a source and must wait for another source that isn't the US Government to say it. Elijahandskip (talk) 02:31, 20 April 2023 (UTC)
I mean, that depends on the current office /s It doesn't have to be that extreme... We can use the primary sources in order to present information, but we cannot use it to make our own categorizations (such as "list of costliest"). We can use NCEI sources and say "X tornado caused Y dollars in damage" because that's what the source says. We cannot use NCEI to say "Y tornado was the costliest in [year]" because the database does not say that, it simply lists data without providing context. Applying our own interpretation is the primary issue here, not the usage of NOAA sources. ~ Cyclonebiskit (chat) 02:19, 21 April 2023 (UTC)

Technically option #2. Maybe #3, except #3 is unclear. But I think math is not the main issue. NOAA is weather people, not economics people, and so making an unattributed statement on economics based on derivations from NOAA data is too much of a stretch. If it was attributed/explained like "According to NOAA damage estimates....." then it's down to just the math issue and I think that it would be OK. Maybe that's option #3, but option #3 is unclear. Sincerely, North8000 (talk) 15:26, 30 March 2023 (UTC)

  • Option 2, Aquillion makes an excellent point. JoelleJay (talk) 02:12, 20 April 2023 (UTC)
  • Option 2 – I've been on both sides of this for years and years, and Aquillion's point pushes me to one side. The recent influx of costliest tornado lists to the yearly articles is concerning. We can compile a list of certain values, but we can't impose meaning upon it that isn't presented. We had a rampant issue with "records" being interpreted by users through the National Hurricane Center's database (HURDAT) years ago when no context was given to many of these records. They were just cherrypicked pieces of information with no corroborating source outside of the user's interpretation of the database. Because it breached issues with OR we axed any mentions of records without accompanying text sources clearly stating the information. This same logic needs to be applied to severe weather.
    The NCEI database being brought up by Elijah has some quirks that need to be taken into account. For a large chunk of it in earlier years, there are not actual damage estimates; rather, there are categories for damage ranges. The automated system converted the damage category into the lower bound dollar value and lists that as the damage caused by a specific event. I'm uncertain off the top of my head when the switch to tabulated/estimated values took place, but the methodology within the database itself is inconsistent. The damage estimates are also not available for every event as the NCEI reports are published monthly on the period 3-4 months prior—these reports are imported from the 122 branches of the National Weather Service, not made by the NCEI itself. After the initial publication of NCEI reports, information is rarely, if ever, updated. There's nuance to using/understanding the database and where its information comes from, and making such broad claims of "costliest" is inappropriate without secondary sources. ~ Cyclonebiskit (chat) 02:19, 21 April 2023 (UTC)
  • Option 2 – Agree with reasoning given by Cyclonebiskit. Knowing the massive number of inconsistencies as a result of basic human error over the many years of maintaining NCEI Storm Data, I'd be reluctant to use it in so much as an elementary school research project. When we're talking about this costliest tornadoes list, I am in agreement that these lists are concerning given the lack of outside sources and knowing the likely errors in computing the NWS totals in the first place. United States Man (talk) 02:27, 21 April 2023 (UTC)
  • Important Comment: For those wishing to remove NOAA damage totals from NCEI, please remember that all tornado damage totals come from NCEI, therefore Wikipedia would no longer accept any information about a tornado's damage total since it all comes from NCEI. Elijahandskip (talk) 02:32, 21 April 2023 (UTC)
I don’t think this has anything to do with the issue at hand. The discussion is on whether it is okay to assemble a separate list from the totals that isn’t explicitly published, not whether the damage totals can be used at all. I believe the totals can be used no problem, but the issue is that you can’t take the totals and assemble your own list. United States Man (talk) 02:49, 21 April 2023 (UTC)
(edit conflict) I don't get where these extreme swings in commentary are coming from, but no one in this discussion has suggested removing usage of NOAA. They're saying to not apply meaning beyond the simple statement of "X is the damage total". And for what it's worth, we don't exclusively use NCEI for tornado damage. We're not bound to NOAA when it comes to sourcing information, we can use other reliable sources to expand upon the topic. ~ Cyclonebiskit (chat) 02:52, 21 April 2023 (UTC)
  • Option 2 I've never liked the idea of putting in a chart like this. Many tornadoes don't receive damage figures, while others have incomplete figures. There are too many holes to include it in the article. ChessEric 07:12, 22 April 2023 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Quibbles about the definition of primary sources

I've got nits to air about the definition of "primary sources".

First, we're defining it quite specifically in terms of reporting of events. The population of Brazil isn't an event, nor is the height of Mount Everest, nor are the identities of Laurence Olivier's parents, nor are the media used in a portrait painted by Titian. "Primary source" needs to be defined more broadly. It seems to me that it should be defined as first-hand observations about a fact, maybe what we could call an "observable", or even a "nullary source".

Second, I don't see how a work of fiction is a primary source. It's an observable! The book Things Fall Apart is an observable, a nullary source, as is the film Five Easy Pieces. If somebody writes a plot summary about either of those based on their own reading or viewing, what they write here is the primary source, a first-hand account of what's in the work. In contrast, if someone puts together a plot summary here based on descriptions of the plot written in published reviews by critics based on their reading or viewing, then the critics' writings are the primary sources, first-hand accounts of what happens in the work.

In an article here about a work of commentary, such as Africa Is Not a Country or All the President's Men, the author often draws from their readings of primary sources and secondary sources, as well as interviews with knowledgeable people serving as primary sources. So those works may serve as valid secondary and tertiary sources for all that information, and, if they are deemed reliable sources, citations of those works are valid for purposes of WP:V. But when the topic (such as in the articles linked at the beginning of this paragraph) is the work itself, then, once again, the work is the observable, so a description of what the work covers, if drawn only from an editor's own reading of the book, is first-hand observation, making the description here a primary source.

Third, the concerns I've been voicing here came up in the context of WP:PLOTSUM, which says "it is generally assumed that the work itself is the primary source for the plot summary". No! Not if we're being consistent about what the term "primary source" means! It amazes me that the guideline says that because it's a contravention of the general principles in operation here. If a scientist conducts an experiment that reveals interesting facts about members of a particular species in the presence of a particular environmental factor, then those findings are the observables and the scientist's notes and papers on those findings are primary sources for the information. That much seems to be agreed to on Wikipedia. If that scientist were to use Wikipedia as the forum in which to publish those findings, that would be a classic case of original research.

At Calabrian Greek, the article states (based on reliable sources) that that dialect of Greek is written in the Latin alphabet. Someone changed it to the Greek alphabet and argued on the talk page that they had themselves seen it written in Greek letters in one city in Calabria. Another clear example of original research.

So if I watch a film and then come to Wikipedia and write my observations about what happens in the film, how is that different?

In response to a much shorter comment by me elsewhere to the effect of the above, one person responded that if we had to depend on, say, critics' reviews and other works that have covered the plot of a film, then it would be too hard to write a decent plot summary here. Is that the benchmark here, that in topical areas where it's too hard to properly source content according to the usually applicable principles, then we'll just jettison the principles? PLOTSUM argues that the book or film itself satisfies WP:V because any reasonable person can see it and observe the same things. But that's also true of the outcome of a scientific experiment (to the extent that the nature of modern scientific methodology is to craft research to produce reproducible results), and anybody who wants to can travel to that town in Italy and see the same signs for themselves. Why do we accept "Anybody can buy and read the book" as an argument when we don't accept the counterpart arguments in other realms? Largoplazo (talk) 16:30, 1 July 2023 (UTC)

@Largoplazo, you might be interested in Wikipedia:Identifying and using primary sources, if you haven't seen it before.
We do accept "anybody who wants to can travel to that town in Italy and see the same signs for themselves"; you can {{cite sign}}. We pretty much accept any source that is in some sort of fixed media. (The problem with repeating a lab experiment is that "just anybody" isn't able to reproduce the results.) WhatamIdoing (talk) 16:53, 1 July 2023 (UTC)
{{cite sign}}? Are you sure about your conclusion as to the use of this template? Consistent with what I've said thus far, it seems to me that the template would be applicable at Calabrian Greek only if the editor had seen a sign that declared "Calabrian Greek is written in the Greek alphabet". In that case, the sign would be a primary source for that information and could be cited as such.
However, these were only street signs; they included the Calabrian Greek names of the streets, and those were written in Greek letters. Those are an observable, a nullary source, and the relating of that observation here by the editor was a primary source. (In the case of that discussion, there was also the matter of weight: a multiplicity of reliable sources versus this one person's observation of signs in one location—and, frankly, I don't know whether those signs were in Calabrian Greek; they could have been meant to be in standard Greek, as far as I know.)
The medium and the message aren't to be confused with each other. Largoplazo (talk) 17:12, 1 July 2023 (UTC)
Editors should only cite signs for the content that's plainly printed on the signs, so the sign would need to say something like "This is Calabrian Greek, which is written in Greek letters".
There's no such thing as a "nullary" source, and I wish you wouldn't make up words. An editor might read it and think it's a real concept. WhatamIdoing (talk) 15:29, 3 July 2023 (UTC)
I've innovated the term "nullary source" to encapsulate what it is relative to primary source (at the first level of remove from the thing being observed and about which observations are being made), secondary source (at the second level of remove), etc. So the thing itself is at the zeroth level of remove. "Nullary" is an extant word (as is "zeroth"), and a suitable adjective comparable to "primary" and "secondary". I'm very, very far from the first person who has ever devised novel terminology to make or clarify or illustrate a point. Largoplazo (talk) 23:21, 4 July 2023 (UTC)
In relation to the concerns about works of fiction (but applies in general), works can be both primary and secondary at the same time. A work of fiction that draws from research about a topic would be secondary to that topic, but the work itself is still primary. It all depends on what topic is to be sourced to the given work. Masem (t) 17:13, 1 July 2023 (UTC)
Yes, I wrote about that in my paragraph beginning "In an article here about a work of commentary". Largoplazo (talk) 17:15, 1 July 2023 (UTC)
User:Largoplazo writes in the language of journalism, and mentions science. Does he realise that different fields define the term differently, and that Wikipedia is neither journalism nor science? Wikipedia is historiography. With that in mind, User:Largoplazo, could you please restate your question? SmokeyJoe (talk) 22:39, 1 July 2023 (UTC)
I'm commenting on the definition given on this project page and inconsistencies in application across genres, not definitions as they might be given elsewhere in the context of specific domains. I really don't get your absolute statement, Wikipedia is historiography. The first sentence at Historiography is "Historiography is the study of the methods of historians in developing history." As I noted above, The population of Brazil isn't an event, nor is the height of Mount Everest, nor are the identities of Laurence Olivier's parents, nor are the media used in a portrait painted by Titian. Where does historiography come into those things? Historiography is basically about how people study events. My first point above is that, indeed, the definition given here of "primary source" is written as though everything we write about on Wikipedia is an event. American robin isn't a history article. Neither is Five Easy Pieces. So to say "Wikipedia is historiography" ignores everything written here that the definition given here for primary source ignores. And it doesn't explain to me why someone writing up the findings from their own experiment is original research, but someone writing up what they saw on TV isn't. Largoplazo (talk) 23:10, 1 July 2023 (UTC)
Wikipedia is about knowledge and where that knowledge comes from, and this is most compatible with historiography. What’s more important is that terminology from journalism and science causes trouble, unless you’re very clear as to what definition you are using.
My reading of the project page is that for a definition, it refers you to the article Primary source. The only thing I have thought was missing is that for Wikipedia, one should know to use the historiography definitions found at that article.
I think I understand what you’ve written, but I don’t understand what your question is. I don’t understand the importance to you of terms like “event” and “observable”. Maybe let’s start with, what is the definition of “primary source”, as you read it. SmokeyJoe (talk) 08:30, 2 July 2023 (UTC)
The population of Brazil isn't an event, but the publication of its census results is. I think Largoplazo is correct that WP:PRIMARY is not absolutely comprehensive, but it overall seems to be working out anyway. WhatamIdoing (talk) 15:36, 3 July 2023 (UTC)
WP:PRIMARY begins “Primary sources are …”. This, to me for sure, implies that for a definition, one should refer to the article Primary source, and the words following are a summary, intended for introducing the newcomer to this core policy.
At the article, Primary source, there is a very nicely written lede. It’s firstly concerned with history.
If anyone is concerned that the definition in the article is quibblesome, they should fix it, using quality sources.
Wikipedia Project space should avoid creating terms of art, and I think this project page doesn’t do that. SmokeyJoe (talk) 05:14, 4 July 2023 (UTC)
Quality sources don't dictate to Wikipedia how Wikipedia should choose to assess the verifiability of assertions made in its articles. I'm expressing my concern that, despite what's currently stated in the fiction-specific guidelines "I read this in the book" is just as much original research as "I saw this happen in my lab" or "I measured the Western Wall myself" and that it's just as deficient for the purpose of satisfying WP:V as those other sorts of assertions, and I've tried to set up a systematic way of expressing why that is. Largoplazo (talk) 23:18, 4 July 2023 (UTC)
I think you may be doomed to be disappointed with us, but we see a difference between situations that sound like:
  • "I read this in the book – and you can, too, because the words in your copy of the book will be exactly the same as the words in my copy of the book"
and situations that sound more like:
WhatamIdoing (talk) 08:26, 5 July 2023 (UTC)
  • Reading a book isn't research. Googling something isn't research. Research is gathering data as well as interpreting it.—S Marshall T/C 08:48, 5 July 2023 (UTC)
  • I agree that there is a difference between "I saw something and there are no corroborating witnesses" and "this summary is from a book I read that has been professionally published so anyone can buy a copy to verify the content of the summary".--65.93.194.183 (talk) 06:44, 11 July 2023 (UTC)

14 July Community Call

Edit Check prototype (mobile)

Hi y'all – as people motivated to ensure the content people contribute is sufficiently sourced, the Editing Team thought some of you might be interested in participating in the virtual meeting we're hosting this Friday, 14 July (15:30 to 17:00 UTC).

We'll use this time to discuss Edit Check, a new project that will present people with guidance while they're editing.

The first "check" we're building prompts people to add a reference when they don't think to do so themselves.

Regardless of whether you're able to make the meeting or not, we would value learning what you all think of the Edit Check prototype.

If the above brings any questions to your mind, please ping me so that I can try to answer.

In the meantime, this MediaWiki page should contain all the information you need to join Friday's conversation. PPelberg (WMF) (talk) 19:52, 11 July 2023 (UTC)

Recent addition - maps, charts, etc.

  • So… apparently there was an RFC (which I missed) concerning maps, charts and similar sources… and based on that RFC a new paragraph has been added. Having now read through it, I don’t object… BUT… I do have a concern:
This policy clearly tells editors NOT to analyze or interpret sources themselves… however the new paragraph says that “routine interpretation” of maps, charts, etc is OK. That is problematic. It sounds like we are carving out an exception to NOR for certain types of sources (and I don’t think that was the intent).
Could we either clarify what “routine interpretation” consists of, or use a different word than “interpret” to convey what we are trying to say. Blueboar (talk) 11:30, 17 May 2023 (UTC)
For the record, you did know about it. There's a comment signed by you on April 19th at the RFC at Wikipedia:Requests for comment/Using maps as sources. Do you really want to re-open this? It was debated ad nauseam by dozens of editors (including you) over the span of several weeks. The debate is over 400k of text and got quite spirited. That included arguments over the specific word interpret, with your exact argument being made and responded to. I don't think anybody got the exact outcome they wanted, but such is the joy of "governing by consensus". Dave (talk) 14:36, 17 May 2023 (UTC)
Um… having checked my contributions for that date (and double checked the RFC) I don’t see a comment by me. Perhaps you have me confused with someone else? Blueboar (talk) 17:08, 17 May 2023 (UTC)
My apologies, March 19th. My bad. Copy-paste from the RFC: Questions - are maps considered primary sources (with the restrictions and cautions of such), secondary sources or tertiary sources? Or some mix of all three? Does it depend on the specific map? Does it depend on the specific information WE are attempting to cite to the map? I’m not sure we can make blanket statements here. There is a LOT of nuance and grey zone when it comes to maps. Blueboar (talk) 13:03, 19 March 2023 (UTC) Dave (talk) 17:23, 17 May 2023 (UTC)
Huh! I actually remember writing that… I guess I did notice the existence of the RFC at one point!… anyway… my concern with the language still stands. I find the words “routine interpretation” confusing. We either need to use another phrasing, or we need further explanation. Blueboar (talk) 19:34, 17 May 2023 (UTC)
@Moabdave: Can you clarify if it was the intent to carve out an exception to NOR for certain types of sources? If not, then Blueboar's point has merit and perhaps we can tweak the wording. BilledMammal (talk) 14:44, 17 May 2023 (UTC)
I repeat, the appropriateness of the word interpret was debated ad nauseam. I stand by my comments made at the RFC, and you can read the argument I made as to why I think interpret is an appropriate word to use by searching my comments. I don't see how it would help to repeat the same arguments here, just because you want to accuse people who voted against you of bad conduct to try to re-open the litigation. Dave (talk) 15:29, 17 May 2023 (UTC)
Regarding the personal attacks, see your talk page. Regarding the rest, it was a yes or no question; an attempt to determine if it would be possible to find different wording that would meet your intent while addressing Blueboar's concerns. BilledMammal (talk) 15:37, 17 May 2023 (UTC)
The wording was taken right from the closing note of the RFC. I suggest talking to the closing admin if you don't like the verbiage. We've already been through the discussion so there's no need to start it over just because you don't like the outcome. –Fredddie 16:40, 17 May 2023 (UTC)
  • Simple solution. "Do not reinterpret sources". - Floydian τ ¢ 15:27, 17 May 2023 (UTC)
  • Would a phrasing like "Routine reading of maps, charts, etc." work? The intent is, if a map shows, say, a river called "Nile River", then reporting that fact is not WP:OR. I think the problem is the contradictory use of "interpret" here; a slightly different wording may help alleviate that and keep the spirit of the policy intact. --Jayron32 15:45, 17 May 2023 (UTC)
    Well I was hoping to avoid another 400k of text re-hashing the arguments made at the RFC. But I guess that's futile. My counterpoint to that is that WP:OR does not forbid interpretation, but rather puts limits on how to interpret. By other policies we are required to summarize sources. One must interpret a source to be able to summarize it. In fact, summarization is a form of interpretation. If we can't interpret, we can't summarize, only regurgitate. Dave (talk) 15:58, 17 May 2023 (UTC)
    Jayron's suggestion would resolve my concern. Blueboar (talk) 17:13, 17 May 2023 (UTC)
    Not really; we're quite allowed to just make Wikipedia better without needing to ask permission, and that includes improving policy pages where the wording is confusing. --Jayron32 18:30, 17 May 2023 (UTC)
    Summarizing and interpreting are two different things. If I wanted to explain to someone how to create a neutral and accurate summary, I would tell them to avoid interpreting. An interpretation is something that could reasonably be disputed. A good summary should never be in dispute, even if you disagree with what is being summarized. Shooterwalker (talk) 03:21, 18 May 2023 (UTC)
    The policy now, by RFC, allows you to interpret a map: meaning you can, in a routine and uncontroversial way, read the scales and labels and keys of a map, and figure out what it actually means about spatial truth in the world, and then write that in your own words in Wikipedia. Andre🚐 03:34, 18 May 2023 (UTC)
    The text added is "Routine interpretation of such media is not original research provided that there is consensus among editors that the techniques used are correctly applied and a meaningful reflection of the sources", not "read the scales and labels and keys of a map, and figure out what it actually means about spatial truth in the world". If you want that specific wording you'll need another RFC. Given the way 2a and 2b both failed, I doubt that is the community's interpretation of the wording. But such things can be worked out at WP:NORN on a case by case basis. -- LCU ActivelyDisinterested transmissions °co-ords° 23:53, 18 May 2023 (UTC)
    I'm sorry, I didn't put my text in quotes or mean it verbatim. "interpretation" of "such media," to me, means exactly what I just said. But I am indeed making the argument that we should not mess with the RFC closure without a new RFC. Andre🚐 00:11, 19 May 2023 (UTC)
    Agreed the wording should stay, unless very similar but clearer language can be agreed upon (per SFR's comment below). I very much doubt that will happen now though, and it will probably be left to a later date once things have calmed down. However my point was that the agreed upon wording is not a carte blanche; further discussions on detail will be inevitable. -- LCU ActivelyDisinterested transmissions °co-ords° 09:48, 19 May 2023 (UTC)

The strictest enforcement of the most literal interpretation of WP:NOR would require the deletion of the majority of Wikipedia. Things like the RFC result give guidance on the practical interpretation of how we operate. As a side note, the location of the RFC can make it easy to miss. When an RFC is put elsewhere that is about a specific policy page, we probably need multiple prominent notices on the subject policy page. North8000 (talk) 17:29, 17 May 2023 (UTC)

  • Having just added this wording as a result of the RFC, if there's a need to change the wording, it would need a strong enough consensus to override the consensus that was obtained in the RFC. And it's not a great look to say "huh, I didn't know about the RFC" when you did know about it and even participated. Andre🚐 03:29, 18 May 2023 (UTC)
    Meh… My participation consisted of one post - written two months ago - that asked a few questions… but yeah… I freely admit that I was mistaken when I said I was unaware of the RFC. A more correct comment would be that I had long since forgotten that the RFC existed. My bad, but not a big deal.
Anyway… moving forward… what do people think about Jayron’s suggestion of simply changing one word: from “routine interpretation” to “routine reading”? This would absolutely resolve my concerns. Do we really need a follow up RFC to change one word? I am willing to go that route, but it shouldn’t be necessary. Blueboar (talk) 13:26, 18 May 2023 (UTC)
I think it is a good change; it matches what I understand the intent of the proposal to be (ensuring that editors are permitted to summarize maps) and the beliefs behind many of the support !votes; for example, Jc3s5h said I believe the kind of interpretation intended by the proposed addition is routine map reading, and fits into the kind of statement that can be verified by any education with access to the map. BilledMammal (talk) 13:33, 18 May 2023 (UTC)
I still believe what I wrote earlier, but wish I had written it "...can be verified by any educated reader with access to the map." I concur with the change. Jc3s5h (talk) 13:38, 18 May 2023 (UTC)
That verbiage was the result of a well advertised RFC that was open for 2 months where dozens of editors participated. You want to override it with an unadvertised discussion between a half dozen people who didn't like the outcome? In other words RFCs are now meaningless and governing by consensus is replaced by governing by which side refuses to concede. I'm not 100% happy with the results of the RFC either, but I accept it. Dave (talk) 15:46, 18 May 2023 (UTC)
I agree with Dave, the change suggestion is out of order and I oppose it on principle as it would violate the consensus obtained by the RFC. Andre🚐 15:53, 18 May 2023 (UTC)
The question remains if the specific verbiage was approved, or merely a concept that could be expressed by many different possible expressions was approved. I see people asserting that the specific wording is sacrosanct and cannot be changed because everyone voted on it intending it to be exactly as written, but that wasn't how I read the RFC. The RFC reads to me like it's to confirm that simply reading and reporting what a map says doesn't violate WP:NOR policy. There are many different ways the same concept can be expressed, and if one of them causes confusion, it's better to use words that don't cause confusion. If we can make the wording less confusing and still maintain the meaning of the RFC, why not? --Jayron32 15:58, 18 May 2023 (UTC)
For the record - no, I DON’T want to override the RFC. I want to clarify it by changing one single word. Blueboar (talk) 16:01, 18 May 2023 (UTC)
Two rival wordings were proposed, one with and one without the phrase "routine interpretation". That debate happened at the RFC. Dave (talk) 17:50, 18 May 2023 (UTC)
I am the only editor here who opposed proposal one (although I supported the similar alternative one); everyone else either supported it or didn't !vote. I don't think it is fair to characterize this as people who didn't like the outcome. BilledMammal (talk) 16:06, 18 May 2023 (UTC)
Considering I supported the original proposal, I find Dave's characterization of me as being on the "side refuses to concede". Insofar as I was on a "side", my side already won. Still, I don't think in terms of winning, never have, never will, and Dave's accusation against me is demonstrably false, and quite frankly, insulting. --Jayron32 16:16, 18 May 2023 (UTC)
Charitably I think he was referring to Blueboar. But in terms of the wording and the process, since this wording comes pretty closely out of the RFC close, a new RFC should be started to change the wording unless it's an uncontroversial change. Andre🚐 16:18, 18 May 2023 (UTC)
Well, given that my sole contribution to the RFC was a single post to ask some questions (which I promptly forgot I had even asked)… I don’t think you can accuse me of taking a “side” either… but I’m not upset about it. I try to look forwards, not backwards. Blueboar (talk) 17:23, 18 May 2023 (UTC)
No insult was intended. An alternative proposal was created which removed the phrase "routine interpretation". This proposal was called proposal 1a in the RFC. Again, all this was already litigated in the RFC. The consensus can be judged by comparing the comments from proposal 1 to proposal 1a. This discussion is arguably to nullify the results on proposal 1 and re-instate proposal 1a of the RFC (or perhaps retroactively create a proposal 1b now that it had been declared that prop1 passed, not prop 1a). So I don't think a characterization of "not accepting the results" is inaccurate. Dave (talk) 16:29, 18 May 2023 (UTC)
I'm still bothered by the whole WP:NOTBURO ickiness this "WE MUST OBEY THE RFC AT ALL COSTS" gives me. We must do what is best for the encyclopedia, and if the existing wording is a problem, we shouldn't feel the need to not fix it because there was a vote, which may or may not have even considered the issue being raised today. --Jayron32 16:42, 18 May 2023 (UTC)
I understand your point. I just disagree with it. Literally everything being said here, including the debate over the phrase "routine interpretation" was brought up at the RFC and voted upon. The dislike of the phrase "routine interpretation" was one of the primary reasons for creating proposal 1a. That debate occurred, and the results are in. Dave (talk) 17:10, 18 May 2023 (UTC)
I think that's also an uncharitable straw man. We don't have to obey an RFC at all costs but you need a better reason to IAR than just, "the wording is unclear." I don't think everyone agrees it's so unclear. Andre🚐 17:58, 18 May 2023 (UTC)
Not everyone, in a unanimous sense, but most people in this discussion apparently do. --Jayron32 18:08, 18 May 2023 (UTC)
WP:Consensus can change, even shortly after an RFC. If people are finding it self-contradictory, then there's nothing inherently wrong with asking for clearer wording. WhatamIdoing (talk) 04:56, 21 May 2023 (UTC)
No, but it does feel disingenuous to do it immediately after the RFC closes. Roughly the same thing happened ("my side didn't 'win' so I'm going to complain like hell") in the Olympian RFC enough that the closing admin caved and reversed his decision. –Fredddie 17:33, 21 May 2023 (UTC)

Comment - what strikes me about this situation is that, if an RfC reached consensus for language that is in some important respect ambiguous, then any attempt later to "clarify" that language cannot take its mandate from the result of that RfC. "Interpretation" is a word that is itself susceptible to multiple interpretations, and to select one more specific synonym on the basis that one or two RfC participants actually meant to restrict the meaning of "interpret" in that way does not seem in line with the RfC process or result.

(This spoken by someone who was uninvolved in that RfC, and whose views on "interpretation" are far too complicated to factor into this particular discussion. As a more general observation, I have found many editors to use a definition of "interpret" that means something like, "read something into a source that I don't", which differs from the way the term is used in the top shelf of the human sciences.) Newimpartial (talk) 16:19, 18 May 2023 (UTC)

  • I think this hinges on whether the closer of the RFC found consensus for specific language or consensus for a broader concept when closing. So… I have asked the closer to comment here. Blueboar (talk) 17:15, 18 May 2023 (UTC)
    • The consensus was for the specific language as that was the question of the RFC, however I didn't read that consensus as mandating "interpretation" in the Wikipedia legalese sense which matches the second definition here, as much as the first definition which is plain explaining what the source says. I don't believe it would require a new RFC to adjust that language to something that doesn't have a specific on-wiki meaning or is causing confusion, but I also don't see a need to get hung up on the word interpretation. One word can have two meanings where one doesn't communicate the right thing and the other is okay. What I'm saying is feel free to change it to read or something if you think that will lessen confusion. ScottishFinnishRadish (talk) 18:02, 18 May 2023 (UTC)
      Thanks… that helps. Blueboar (talk) 19:24, 18 May 2023 (UTC)
      This is something I've tried to get across to no avail. Interpret has two distinct meanings. You interpret glyphs on a page when you read a book. You interpret something when you read a news story and put your own broken-telephone spin on it while recounting. - Floydian τ ¢ 01:11, 19 May 2023 (UTC)
      Thank you for that clarification; since you did not see a consensus for interpretation in the Wikipedia legalese sense some clarification is required, as otherwise at least some editors will interpret it in that sense. I've implemented Jayron's suggestion for this. BilledMammal (talk) 02:52, 19 May 2023 (UTC)
      I don't think that was appropriate, for two reasons. First, this discussion has only been open for a few hours and fewer than 10 people have opined. The RFC was open for 2 months and had dozens of people opine. It was also advertised in at least a dozen places. By contrast, only the people who watch WT:OR know about this debate. Second, it was you who proposed the alternative wording specifically to remove the phrase "routine interpretation", which did not pass. So it's not a good look for someone to re-instate their own failed proposal the day after the vote ended. Even if consensus here is overwhelming that the phrase "routine interpretation" should go, neither you nor I should be the judge to decree one side has won the debate. Very few venues allow someone to be advocate, judge, and sentence executioner at the same time. Dave (talk) 04:07, 19 May 2023 (UTC)

From a procedural side IMO having the RFC result say to put that text in doesn't necessarily include "don't change a single word for a long time unless there is a new similar scope RFC". So IMO that would not categorically preclude minor changes even shortly afterwards without a similar RFC. But IMO the discussed word change is a major change, not a minor one. "Interpret" is a farther reaching word than "read" and is at the core of what the meaning of the decided RFC wording says. Also, the RFC was very recent. IMO at this point it would take another RFC to make that change. Sincerely, North8000 (talk) 18:55, 18 May 2023 (UTC)

  • I'll note that the single most common objection during the RFC was in regards to this exact issue, so I don't think it would necessarily be contradicting the RFC to change the wording slightly to accommodate that fear. I don't think we need to re-litigate the whole discussion over a nitpick. Obviously it being part of my !vote on the matter biases me slightly here, but still --Licks-rocks (talk) 21:55, 18 May 2023 (UTC)
    • IMO it's pretty common to have minor changes to the wording which don't actually change the meaning/spirit of the consensus following a major RfC as the specific language is more or less locked in at the beginning of the RfC which could be weeks before the close. Horse Eye's Back (talk) 21:00, 12 July 2023 (UTC)
      Query: are you happy with the proposed amendment on this issue, most recently formulated part-way through this section? Newimpartial (talk) 21:22, 12 July 2023 (UTC)

Clarification of concern (maps)

I have been thinking further about my concern with this addition. In some ways, those who support the current language are correct… The NOR policy does not actually ban “interpretation”… However, it clearly bans novel interpretations, analysis and conclusions. That goes to the heart of “No original research”. The problem with interpreting maps is in determining whether the interpretation is novel or not. I think the new addition needs to give some guidance on that. Something a bit more than “routine”. Blueboar (talk) 13:38, 21 May 2023 (UTC)

That's a good point. I think the interpretation was meant to be the routine and uncontroversial interpretation that any reasonable person would make. Andre🚐 01:46, 22 May 2023 (UTC)

Dynamic maps

I have to strongly object to further modifications of the text here. Neither Proposal 2a or 2b had consensus, but that does not mean that there was consensus to disallow the use of dynamic maps or referring to the satellite layer (though admittedly, by my own reading of 2b, there was enough negative commentary that disallowing the satellite layer would be a reasonable finding). --Rschen7754 03:06, 19 May 2023 (UTC)

The issue is that there was a consensus not to allow them; given that the current wording could be interpreted as allowing them we need to clarify that it doesn't. I believe my wording does this, with the use of "excludes" rather than "disallows".
If you believe that this wording is unclear how do you suggest clarifying it? BilledMammal (talk) 03:10, 19 May 2023 (UTC)
The issue is that there was a consensus not to allow them [citation needed] --Rschen7754 03:14, 19 May 2023 (UTC)
(Ec) So are we gonna allow a single user to dictate one of the most important policies on Wikipedia? I for one suggest representing it exactly as the discussion panned out. No consensus = no change. If you disagree, how do you suggest strong-arming your position? - Floydian τ ¢ 03:17, 19 May 2023 (UTC)
For proposal 2a and 2b there was a consensus; There is consensus against both proposal 2a and 2b? BilledMammal (talk) 03:21, 19 May 2023 (UTC)
Proposals 2a and 2b were to add text to policy. That they failed means the status quo remains. Dave (talk) 03:22, 19 May 2023 (UTC)
Proposals 2a and 2b were to change policy to allow something. There was a consensus against allowing it, and since the text implemented as part of proposal 1 could be interpreted as allowing this we need to provide clarification to ensure that policy reflects consensus. BilledMammal (talk) 03:32, 19 May 2023 (UTC)
Proposals 2a and 2b were to change policy to allow something [citation needed] - in order to use that as an argument you have to show that it was disallowed before. Given that many FAs use Google Maps, it clearly was not disallowed before. There was a consensus against allowing it also [citation needed], at least with 2a there were many different reasons why it failed, including those who thought it was redundant to other policies. --Rschen7754 03:37, 19 May 2023 (UTC)
I think this is where the misunderstanding is coming from; my edit isn't attempting to disallow it. It is attempting to clarify that this line should not be interpreted as allowing it because the RfC it was added in rejected allowing it. In other words, I am attempting to maintain the status quo of this policy not commenting on the use of satellite imagery and dynamic map applications.
I understand that you feel my proposed wording could be misinterpreted, which is why I am asking you to propose alternative wording which does this. BilledMammal (talk) 03:43, 19 May 2023 (UTC)
Why do you care so deeply about this? NOR only affects editors who research topics and create articles based on that research. Judging by your contributions, that does not apply to you. So why do you care? –Fredddie 04:26, 19 May 2023 (UTC)
Agreed with BilledMammal that this should be clarified. Perhaps stating explicitly that "there is consensus against including dynamic maps and satellite layers as RS". JoelleJay (talk) 17:29, 19 May 2023 (UTC)
Again, your assertion is without evidence. Rschen7754 00:03, 20 May 2023 (UTC)
Also against the addition of the text. Consensus against the addition of text is not an explicit consensus against the practice, however the fact that neither received explicit community support should be taken into account when using maps. The idea of allowing very open interpretation of maps was rejected, the details of where the line is is probably best handled by discussions at WP:NORN. There examples can be discussed, hopefully with input from other editors. -- LCU ActivelyDisinterested transmissions °co-ords° 10:00, 19 May 2023 (UTC)

Interpret or reading

Newimpartial, in the above discussion most editors endorsed the change permitted by the closer, and most agreed that it would reduce confusion.

Further, I consider it obvious that using "interpret" here in a different manner to the common meaning used on Wikipedia - without any clarification that it is being used differently - will lead to confusion. If you disagree with the specific change of "interpret" to "reading", what alternative would you propose? BilledMammal (talk) 09:16, 27 June 2023 (UTC)

I don’t have my own proposal yet, but I have explained my objection to the proposed phrase (using "reading") on the closer's Talk page. Newimpartial (talk) 09:22, 27 June 2023 (UTC)
@Newimpartial:, any further thoughts on this? I would add that above I am seeing an agreement to change this to reading; while a few editors disagreed most supported such an act. BilledMammal (talk) 01:41, 5 July 2023 (UTC)
Just to be clear, my objection is to the verb-like use of "reading" (as an activity) in that policy text; I would have no objection to using the word as a substantive, e.g. "a reading of". Newimpartial (talk) 01:45, 5 July 2023 (UTC)
Sorry, it's not obvious to me where that would fit in. Can you put it in the full sentence? BilledMammal (talk) 01:53, 5 July 2023 (UTC)
Well, since you're asking me, I'd propose something like "Any routine reading of such media is not original research ...", and I actually think "conventional" or "straightforward" would work better than "routine" in this context. But anyway, I believe that gets the nuance across. Newimpartial (talk) 02:05, 5 July 2023 (UTC)
That would address my concerns; I also have no objection to substituting "conventional" or "straightforward" for "routine". Perhaps Any straightforward reading of such media is not original research provided that there is consensus among editors that the techniques used are correctly applied and a meaningful reflection of the sources? BilledMammal (talk) 02:10, 5 July 2023 (UTC)
That language looks fine to me, thanks. Newimpartial (talk) 03:05, 7 July 2023 (UTC)
I cannot understand "Routine reading of such media is not original research" on logical grounds. Reading something refers to an input operation but what counts is the output from reading the source (for us, the output is an edit based on the source). I can't just "read" a map—that's meaningless unless I interpret what I see and deduce, for example, that X is closer to Y than it is to Z. It's like WP:CALC where routine calculations are fine but routine is in the eye of the beholder and a discussion has to decide whether text in an article is a reasonable interpretation of the source. Johnuniq (talk) 03:23, 5 July 2023 (UTC)
In the case of maps, I firmly reject the notion that "map reading" is original research. See, for example, [https://www.nwea.org/map-reading-fluency/ "map Reading Fluency"] by NWEA, an organization that develops standardized tests for US K-12 education. "Map reading" means using maps for the purpose intended by their authors, such as finding the way to a hotel, or walking through the woods. This is equivalent to reading text material. Saying that reading a map is original research would mean that editors could only display maps in articles, not read them before writing an article. It is the same as if we said articles must be composed of a series of cuts and pastes from reliable text sources; actually reading the text sources and paraphrasing them would be forbidden. Jc3s5h (talk) 10:17, 7 July 2023 (UTC)
I would agree, which is why we are saying that it is not; the trouble is that interpreting a map, depending on your definition of "interpret", can involve OR. BilledMammal (talk) 08:21, 9 July 2023 (UTC)

Misunderstanding

 
Outside Wikipedia, original research is a key part of scholarly work. However, Wikipedia editors must not base their contributions on their own original research. Wikipedia editors must base their contributions on reliable, published sources.

The explanation "for which no reliable, published sources exist." for the so-called "original research" is extremely confusing, because logically, any published study is "original research", if it really deserves the title "research" and/or does not contain plagiarism. We urgently need a better term conveying what is really meant here - distinguishing facts and views. Thanks for any attempt! HJJHolm (talk) 13:21, 14 June 2023 (UTC)

  • Wikipedia doesn't publish original research. We allow (and require) citations to studies published by others.—S Marshall T/C 15:45, 14 June 2023 (UTC)
  • That statement is in the context of Wikipedia editors, not the sources. North8000 (talk) 17:32, 14 June 2023 (UTC)
    Does that need to be clearer? It seems this is a misunderstanding that comes round too often. -- LCU ActivelyDisinterested transmissions °co-ords° 18:33, 14 June 2023 (UTC)
    HJJHolm, did you notice the image towards the top (which I've copied here)? External sources are allowed to do all the original research they want. It's only Wikipedia editors who shouldn't be posting their own original research/ideas on wiki. Also, you might be interested in reading Wikipedia:Core content policies#cite note-1. Basically, this policy was created to deal with a certain Usenet personality who thought he had disproven Einstein's theories, and wanted to post it on Wikipedia because none of the physics journals would publish his nonsense. That is the kind of "no original research" we're banning. WhatamIdoing (talk) 01:12, 15 June 2023 (UTC)
You miss the point. Please read again, and exactly, what I wrote. I did not object to the policy but to the term used for this policy, which leads to misunderstandings. E.g., your example refers to a paper (?) which was not really scientific research. HJJHolm (talk) 05:52, 16 June 2023 (UTC)
I personally think that the policy could usefully be renamed WP:No original research by Wikipedians, but my point is that the policy already explains that we're banning OR by editors, not OR by sources. WhatamIdoing (talk) 00:32, 17 June 2023 (UTC)
Already somewhat better. And still insufficient. Nobody seems to be able to clearly define that rule. E.g., if an editor cites his own research from a - naturally - external, peer-reviewed source, it is both by an editor AND a "source". Try to make it better! Sorry, is this too difficult for wiki admins? HJJHolm (talk) 04:46, 29 July 2023 (UTC)
An editor is an editor of Wikipedia; if you have a paper in a peer-reviewed journal it's not something you did as an editor of Wikipedia. It would not be OR, this is covered under WP:SELFCITE. -- LCU ActivelyDisinterested transmissions °co-ords° 10:07, 29 July 2023 (UTC)
Also to be clear no-one in this thread is an admin. -- LCU ActivelyDisinterested transmissions °co-ords° 10:12, 29 July 2023 (UTC)
Completely irrelevant. Admins don't have more say in these discussions than other editors, nor do they have more of the applicable knowledge.  — SMcCandlish ¢ 😼  18:52, 29 July 2023 (UTC)
The OP specifically mentioned admins, and since he mostly edits at dewiki, it's possible that he's accustomed to a system in which admins are treated differently. WhatamIdoing (talk) 03:03, 30 July 2023 (UTC)
Regarding "Nobody seems to be able to clearly define that rule. E.g., if an editor cites his own research", there is a guideline at WP:SELFCITE which may help. There is also some recent discussion about how this works in practice which you may be interested in participating in. Orange Suede Sofa (talk) 03:59, 30 July 2023 (UTC)

Comparisons of statistics

What's the difficulty with comparisons of statistics from sources that use different methodologies? Obviously they shouldn't be treated as equal, but it seems to me that such comparisons can be quite useful for maintaining NPOV in some situations. Imagine that Israel and the Palestinian Authority both run a census of the same town and get different results; if we use just one of them, readers will suspect a POV favouring one side or the other. But if we say "Israel reported that the town has 1000 residents, and the Palestinian Authority reported that it has 1,100 residents", and we're citing sources published by the census agencies, we're being neutral; and by merely presenting the numbers without evaluating them, we're also avoiding the WP:SYNTH problem that might arise if we attempted to say that one methodology was superior to the other. Nyttend (talk) 01:52, 15 August 2023 (UTC)

Not sure that's really the kind of "comparison of statistics" that is meant, and maybe the guideline needs clarified wording.  — SMcCandlish ¢ 😼  04:55, 15 August 2023 (UTC)
I'd be happy to propose alternate wording, but I can't imagine what it's really intended to mean. Anyone else have an idea? Nyttend (talk) 06:34, 15 August 2023 (UTC)
  • Example: A high-quality university sociological study in 1990 finds that 37% of a carefully selected sample of households say they attend church in the last month; in 2007, 52% of respondents to a voluntary web survey said the same. Therefore (the reader is presumably meant to conclude) church attendance is increasing. EEng 06:45, 15 August 2023 (UTC)
    ^ This. "A says 1000, and B says 1100" is not a comparison. "A says 1000, and B says 1100, so it [i.e., reality] has changed by 10%" is an inappropriate comparison.
    Normally, we would accept such calculations as being routine calculations using basic arithmetic. "A said 1000 in 2010, and A also says 1100 in 2020, so it's gone up 10%" is fine.
    Yes, if you provide non-comparable statistics, the reader might make their own assumptions. That's not really our problem. Our job is to not make those assumptions ourselves, and to provide gentle hints that these are not entirely interchangeable. A reader who is engaged enough in the subject to make the comparison might also be engaged enough in the subject to realize that "A" and "B" are not the same and might not be comparable. WhatamIdoing (talk) 00:32, 17 August 2023 (UTC)
    By putting "A says 1000 in 2010 and B says 1100 in 2020" you could run foul of SYNTH: "Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source." It would depend on what is implied by stating that there has been an increase. -- LCU ActivelyDisinterested transmissions °co-ords° 00:55, 17 August 2023 (UTC)
    "Source A says 1000 in 2010. Source B, using different methodology, says 1100 in 2020." This doesn't, and couldn't ever, run foul of SYNTH.—S Marshall T/C 09:31, 18 August 2023 (UTC)
    Yep.  — SMcCandlish ¢ 😼  11:21, 18 August 2023 (UTC)
  • Example: One source says the population of the City of London is 8,600, and another says the population of the city of London is 8,797,000. Both are, in fact, roughly correct for 2021. The discrepancy is because the City of London isn't the same geographical area as the city of London. (Capitalization really matters!)—S Marshall T/C 08:20, 15 August 2023 (UTC)
    Yes, the wording needs to say something about comparability of data sets and of quality of data, but someone from a stats background should probably wordsmith it.  — SMcCandlish ¢ 😼  13:30, 15 August 2023 (UTC)
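
The distinction drawn in this thread (a routine calculation on two figures from the same source is fine, while deriving a change from figures published by different sources is not) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the `Figure` record, its `source` label, and the rule "same label means comparable" are hypothetical simplifications, not anything defined by the policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    """A published number. `source` is a hypothetical label for who reported it."""
    source: str
    year: int
    value: float


def percent_change(old: Figure, new: Figure) -> float:
    """Return the percentage change between two figures from the SAME source.

    Assumption for illustration: figures with the same source label share a
    methodology; figures with different labels are treated as not comparable.
    """
    if old.source != new.source:
        raise ValueError(
            "figures come from different sources; report them side by side "
            "instead of deriving a change"
        )
    # Multiply before dividing so round inputs give an exact result.
    return (new.value - old.value) * 100 / old.value


# "A said 1000 in 2010, and A also says 1100 in 2020": a routine calculation.
print(percent_change(Figure("A", 2010, 1000), Figure("A", 2020, 1100)))

# "A says 1000, and B says 1100": no change is derived; the call is refused.
try:
    percent_change(Figure("A", 2010, 1000), Figure("B", 2020, 1100))
except ValueError as err:
    print("not compared:", err)
```

In the second case the article text would simply state both figures with attribution, as several commenters suggest above.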

Poor grammar

i witness a poor grammar called "to refer to". It sounds irritating while remind me of So Okuno. Please change if you can. Who someone invented the phrase "to <verb here> to" for this? 103.10.97.150 (talk) 15:22, 19 August 2023 (UTC)

Are you talking about where it says The phrase "original research" (OR) is used on Wikipedia to refer to material ...? That's perfectly normal usage, as is "to [verb] to" in general. "He asked me to try to do a better job." "I used to want to travel a lot but now I don't." "This is all the money I'm able to give to you." Largoplazo (talk) 15:58, 19 August 2023 (UTC)
Yes, and "i witness a poor grammar called" is a pretty strong indicator that the anon is not a native English speaker, and in the throes of the Dunning–Kruger effect regarding the amount they've learned of the language so far.  — SMcCandlish ¢ 😼  18:33, 19 August 2023 (UTC)
OTOH, we could make the sentence WP:REFERS-compliant, if we wanted to. There's no compelling reason why that sentence needs to say The phrase "original research" (OR) is used on Wikipedia to refer to material... instead of something like For the purposes of this policy, "original research" (OR) is any material... or On Wikipedia, the phrase "original research" (OR) means any material... WhatamIdoing (talk) 20:36, 19 August 2023 (UTC)
Is it pedantic to point out that it's a process and not the product of the process? Probably. Should be: "original research" (OR) is the creation and addition of any material... DeCausa (talk) 20:43, 19 August 2023 (UTC)
While that should be true, I'm not sure that is true. Consider a sentence like "I've removed this section because it is original research", or the text of Template:Original research: "This article possibly contains original research." WhatamIdoing (talk) 21:26, 19 August 2023 (UTC)

Discussion at Wikipedia talk:What Wikipedia is not#RfC: Deprecating WP:NOTDIR

  Resolved
 – Discussion already closed in favor of "no change".

  You are invited to join the discussion at Wikipedia talk:What Wikipedia is not#RfC: Deprecating WP:NOTDIR. BilledMammal (talk) 12:32, 3 September 2023 (UTC)

WP:MEDRS link

In the "Reliable sources" section, a sentence was added in April 2022, saying "However, note that higher standards than this are required for medical claims." From looking at contribs and talk pages, I don't think this addition was discussed. However well meaning it was, and it may well represent a general view of some editors, I don't think it is helpful nor is it relevant to WP:OR. MEDRS has always been, in my opinion, the application of our core policy and key guidelines to the biomedical domain. The selection of sources it recommends follows naturally when one is trying to meet not just OR and V but probably most importantly, NPOV. The reason that MEDRS is so against citing primary research papers, for example, isn't that those papers unreliably report whatever study the researchers did or unreliably state the findings of the research (though some may). A Wikipedia article mentioning and citing as its source a pilot study of 12 volunteers that was published in The Lancet is likely to satisfy both OR and V, provided the editor doesn't get carried away with their interpretation of the result. But the weight of secondary literature may well ignore or reject that study when discussing the article topic.

The biomedical topics have particular issues because it is so easy to search PubMed for primary research papers and they superficially look very attractive and academic. And because newspapers daily output dubious wellness stories or oversell the latest discoveries, in a way they just don't do for quantum physics or woodwind instruments or any other topic you wouldn't even consider using a newspaper for. But that doesn't mean the standard is "higher". Just appropriate. So I propose that sentence is removed. -- Colin°Talk 16:37, 7 September 2023 (UTC)

MEDRS is a subpage of Wikipedia:Identifying reliable sources, which is linked under the header of the "Reliable sources" section. So some form of link to MEDRS is helpful in that it points out that determining sources in that area is particularly problematic. It could be that "higher standard" isn't quite the right wording, but some form of link is helpful. Maybe it could just be linked under the heading as with WP:RS. -- LCU ActivelyDisinterested transmissions °co-ords° 19:08, 7 September 2023 (UTC)
I've no problem with retaining a link, but "higher standard" is inaccurate. WhatamIdoing (talk) 21:59, 7 September 2023 (UTC)

Semi-protected edit request on 10 September 2023

जम्बुद्विप (Jambudwip) is not part of Ancient India. There was no Ancient India but a large number of small and big kingdoms in the territory. Kush.Khanal (talk) 10:26, 10 September 2023 (UTC)

  Not done: it's not clear what changes you want to be made. Please mention the specific changes in a "change X to Y" format and provide a reliable source if appropriate. Paper9oll (🔔📝) 11:15, 10 September 2023 (UTC)

Text/prose

@Mathglot: Re. [7]: my thinking was that "maps, charts, graphs, and tables" are all also text-based media. You'd be hard-pressed to extract any information at all from them if they consisted only of graphics. The distinction the section appears to be trying to make is that those media typically don't have continuous blocks of writing in ordinary language, hence prose. – Joe (talk) 09:28, 25 September 2023 (UTC)

@SMcCandlish: who apparently agrees. – Joe (talk) 09:31, 25 September 2023 (UTC)
It's not what first came to mind for me, but your point is a good one. Neither term seems ideal, but I'm not sure if there's a better term that avoids the problems of each; if not, yours is the better one. Mathglot (talk) 09:39, 25 September 2023 (UTC)
MoS, throughout it, routinely refers to running-text content material in the main body of the article as "prose". It's completely normal wording in our P&G material, and anyone bothering to read that material will already understand it (and we don't here care about anyone not reading such material since WP:NOR is such material). Around a decade ago, a single person showed up and tried to replace "prose" with other terminology, arguing that "prose" only means "not poetry", and then was shown a dictionary, and went away. :-)  — SMcCandlish ¢ 😼  09:54, 25 September 2023 (UTC)
FWIW it's also something I've encountered regularly in off-wiki writing, e.g. editors suggesting that a short table or insubstantial figure be "turned into prose". – Joe (talk) 10:08, 25 September 2023 (UTC)

Would WP:CALC apply to a tornado's forward speed?

There was a disagreement between editors as to whether or not calculating a tornado's forward speed would be classified as WP:OR or acceptable under WP:CALC using an online calculator as a reference (like this).

The reason this is a debate is because most tornado damage surveys include the start and end times (duration of the tornado) as well as the exact distance traveled. For almost all tornadoes, the forward speed is not relevant. However, a disagreement occurred on 1999 Loyal Valley tornado, which brought this question to light. For this specific tornado, multiple sources referenced the forward speed (i.e. a slow-moving tornado), but did not provide the average forward speed of the tornado by number. For the WP:CALC argument, it is because speed = distance / time, and given the exact distance and exact time is provided by sources, a simple and basic calculation could be performed to get the average forward speed. An online calculator, which can provide an exact URL for the problem (Example being 1 mile divided by 1 minute for 60 mph), can be referenced to satisfy a reference concern. The WP:OR argument is because no source specifically states a number, but rather just mentions the forward speed as being a key aspect of the tornado.

So, would situations like that be satisfied under WP:CALC, or does a source need to specifically state a number for a Wikipedia article to have an average forward speed? For the record, there are more articles besides the 1999 tornado example which have had disagreements in the past related to this question, so an answer here from an editor familiar with policy, outside of WikiProject Weather, would be appreciated. The Weather Event Writer (Talk Page) 01:08, 6 September 2023 (UTC)

If said calculations are being used to display forward speeds as records (either slow or fast), one would think a reliable source would be needed to confirm that. Cherry-picking certain tornadoes to list forward speeds (such as recently at Tornado records) seems entirely rooted in OR because no source explicitly states the tornadoes were the fastest. United States Man (talk) 01:22, 6 September 2023 (UTC)
I will note, user has recently gone on a mass deletion spree (see their contribution history) of several articles, including Tornado records, which seems to have been done mere minutes before their comment here. The Weather Event Writer (Talk Page) 01:28, 6 September 2023 (UTC)
How is that relevant to the discussion? United States Man (talk) 01:29, 6 September 2023 (UTC)
You are a member of WikiProject Weather and went on a deletion spree, minutes before commenting here, in what appeared to be an attempt to show that it must be WP:OR (as your comments removing it indicated), prior to a discussion from a non-weather person here. Things you just removed as being "OR" had been on the article for months (at least since March 2023), with numerous editors, including yourself, editing the article several times. All of a sudden, when a question was posed here, you decided to remove it as being OR, prior to a discussion even getting answered. Like what else are we to think, given the mass deletion spree, on top of removing it coincidentally just after this discussion was started, for the same argument used on the 1999 tornado article? The Weather Event Writer (Talk Page) 01:30, 6 September 2023 (UTC)
The one and only edit to that was made an hour prior to your initiation of this discussion upon discovery of the issue, which makes your accusation of me trying to skew the discussion here null because I couldn't have known about it. I merely commented here to provide more context that was left out of the original comment. United States Man (talk) 01:35, 6 September 2023 (UTC)

IMO it's an edge case, leaning towards OK with respect to that particular consideration. If there is a particular discussion about a particular entry, perhaps it's best to leave it out. But an edge case argument isn't enough to do unilateral categorical mass deletions or to dominate the results of debates. Sincerely, North8000 (talk) 01:51, 6 September 2023 (UTC)

  • I'm not so sure about this, I would have thought that most sources don't include this information because tornadoes don't travel in straight lines. So if a source says it travelled 10 miles in 60 minutes, that doesn't necessarily mean that it travelled at 10mph. It could have been moving a lot faster but not travelling directly point to point. -- LCU ActivelyDisinterested transmissions °co-ords° 12:21, 6 September 2023 (UTC)
Good point North8000 (talk) 13:35, 6 September 2023 (UTC)
  • As long as you differentiate between speed and average speed (see Speed#Average speed), this is high school mathematics and clearly WP:CALC compliant.—S Marshall T/C 14:30, 6 September 2023 (UTC)
    Better point. It's the language used that's important. -- LCU ActivelyDisinterested transmissions °co-ords° 15:35, 6 September 2023 (UTC)
    No, average speed is not based on the final position minus the initial position, but the total length of the path taken. So if there is any doubt as to the shape of the path, it is incorrect to assume it was a straight line. Zerotalk 23:50, 2 October 2023 (UTC)
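
For readers following the arithmetic in this thread, here is a minimal Python sketch of the calculation being discussed. It assumes the distance supplied is the total length of the path actually travelled (the point raised just above); the function name and parameters are hypothetical and chosen only for illustration.

```python
def average_speed_mph(path_length_miles: float, duration_minutes: float) -> float:
    """Average speed = total path length / elapsed time, in miles per hour.

    Assumption: `path_length_miles` is the length of the path actually
    travelled. If only the straight-line distance between the start and end
    points is known, the result is a lower bound on the average speed.
    """
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    # Convert minutes to hours by multiplying by 60 before dividing.
    return path_length_miles * 60 / duration_minutes


# The example from the opening post: 1 mile in 1 minute.
print(average_speed_mph(1, 1))

# The example from the straight-line objection: 10 miles in 60 minutes.
print(average_speed_mph(10, 60))
```

Whether the surveyed distance is a path length or a point-to-point distance is a question about the source, not something the calculation can settle.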

What if primary source contradicts secondary source?

BilledMammal made this edit (addition underlined):

    1. A primary source may be used on Wikipedia only to make straightforward, descriptive statements of facts that are not disputed by reliable secondary sources and can be verified by any educated person with access to the primary source but without further, specialized knowledge.

This addition would prevent correcting a typographical error in a secondary source by referring back to the primary source that it is based on. Also, among reliable sources, some are better than others. This would elevate the reliability of every secondary source (unless it's total crap) above every primary source. Finally, it doesn't address the situation — Preceding unsigned comment added by Jc3s5h (talkcontribs) 00:03, 2 October 2023 (UTC)

I think BilledMammal's addition would cause more trouble than it solves. The problem of contradictory sources is not restricted to primary versus secondary. The option of properly citing both sources, avoiding OR while providing the reader with a maximum of information, would be forbidden by the proposed text if one of the sources is primary. I don't think that's an improvement. Zerotalk 01:39, 2 October 2023 (UTC)
Reading your replies, I realize I was looking only at one half of the picture - I was looking at situations when secondary sources are discussing a primary source. In such a situation I feel we can't say "my interpretation is better than the secondary source's interpretation", even if the editor thinks that they are only presenting a straightforward, descriptive statement of facts; that is very obviously WP:OR and we should make it clear that we cannot do this.
I forgot the other half - when a primary and a secondary source are discussing a third topic. In such a situation, it is reasonable and compatible with the word and spirit of our policies to cautiously use the primary source for straightforward, descriptive statements of facts even when they are contradicted by the secondary source.
Considering this, my original proposed wording is not suitable, but if other editors agree that we shouldn't be preferencing our interpretation of primary sources over the interpretation of reliable and secondary sources, then I think we need to workshop wording that will prevent that without having unintended side effects. BilledMammal (talk) 01:59, 2 October 2023 (UTC)
Just wanting to note that I wrote an essay partially related to this topic (User:WeatherWriter/Verifiability, not truth in action), where primary sources from the US government contradicted outdated secondary reliable sources. It is a case-by-case basis, but for the real example used in the essay, the secondary sources trumped the primary sources. The Weather Event Writer (Talk Page) 02:08, 2 October 2023 (UTC)
Agreed; for a very obvious example, if a reliable primary source from 2010 says that the strongest recorded earthquake in Paris was 8 on the Richter scale in 1991, we should use that over a source from 1970 saying that it was 7 on the Richter scale in 1938.
However, these obvious exceptions don't apply to secondary sources that are discussing primary sources; if a secondary source says that section 81 of the Australian constitution means X, then we cannot say the secondary source is wrong and that it actually means Y - even if we believe that Y is a straightforward, descriptive statement of facts, even if we believe that the secondary source is wrong.
Perhaps if we add a seventh bullet point to that list, saying something along the lines of: When a primary source has been interpreted by a reliable secondary source, do not use the primary source for a statement of fact that is disputed by the secondary source.? BilledMammal (talk) 02:39, 2 October 2023 (UTC)
I think the addition is troubling for other reasons, despite its well-meaning intent. Various kinds of primary sources prove things with certainty, and secondary sources can get them wrong, e.g. by a misstatement/misinterpretation or even a simple typo, or by missing information; that error is then picked up by other secondary sources that "cannibalize" the earlier one without noting the error. This happens fairly often in poor reporting on complicated science. E.g.:
  • A journal paper (primary source) makes claim A (only), in no uncertain terms, but a weak science writer reports it as an "A+B" claim (a misinterpretation that takes the claim beyond what was actually stated), and this wrong interpretation of the claim is then picked up by other science-popularization publishers, until eventually it gets annoyedly corrected again by the original researchers (perhaps in another primary journal paper), but without that correction probably being widely reported in secondary material. The primary original research paper is a reliable source for what claim it actually did make.
  • In precisely the same way, a novel is a valid source for the plot points it actually does contain, even if a secondary writer gets one of them wrong.
  • A more specific case I can think of is that an early-20th-century writer on Gaelic surnames, Patrick Woulfe, made in 1906 a variety of highly speculative claims about original Old Irish versions of various anglicised Irish and Scottish names, and some of them cannot be attested in any surviving Old Irish to Modern Irish materials, meanwhile other spellings than the ones Woulfe preferred (and seems to have just made up out of nowhere) are regularly attested in the primary sources he sometimes consulted closely and sometimes did not. Yet numerous later writers, from Black's The Surnames of Scotland to the Library of Ireland's own website on Irish surnames, and now Wikipedia itself, are uncritically parroting Woulfe's fictions/errors, while the original forms of the names are easily provable with sources that pre-date him and which are "primary" in being old mss. originally, though of course re-published in modern critical editions, which some might consider a form of secondary source, but others would not since it's presenting the material in original and translated language without analyzing the historical claims made by the material.
  • Another example is a claim I've seen in secondary works about Highland dress, that a particular Highland regiment was not wearing the kilt until the 1780s, but their own log books prove 1750s; the log books are primary, and while published verbatim inside a modern source, that secondary source did not use them to contradict the incorrect secondary sourcing (which actually came out later), it just published them without specific comment on such a question; so the material is still primary in a sense. But as an encyclopedist, I cannot ignore that it conclusively disproves the 1780s claim in no uncertain terms. The writer of the incorrect 1780s claim was simply unaware of this source material (I know this for a fact because I contacted that still-living author and asked; after seeing the primary material, he agreed that it proved a 1750s date).
  • A more stark-obvious example is that lots of far-right media is secondary, but much if not most of what it says can be readily proven to be false, including by various primary sources. If Breitbart claims "Biden said X" in a particular speech but a publicly available, verifiable recording of the entire speech in question clearly shows that he instead said Y, then the X claim is disproven unequivocally, without having to resort to latecoming secondary-source analysis of the recording.
  • Similarly, we can report on the exact wording and origins of Godwin's law by citing original Usenet posts about it, and doing so is correct and proper, no matter how many poorly-researched secondary sources get wrong its wording, date of first posting, or other facts.
I can come up with examples like this all day long. In short, "secondary" does not mean "better" or "more accurate". It's simply a classification that entails a different kind of handling (and usually but not universally preferential treatment, as a class) by us than other classifications such as "primary" or "tertiary".  — SMcCandlish ¢ 😼  04:37, 2 October 2023 (UTC)
It is a mistake to declare that a journal paper is a primary source. Journal papers rarely make claim A only, devoid of comment on the claim. Journal papers analyse and contextualise their own data, as well as prior data. Journal papers are often well cited as primary sources for a claim, but they can also be cited for contextualisation of the claim. SmokeyJoe (talk) 10:42, 2 October 2023 (UTC)
I don't mean to deny that a typical journal paper can sometimes be a secondary source for some kinds of things; indeed, there is too much assumption going around that "source type A is always secondary, and source type B is always primary" as it is (e.g. incorrect assumption that everything in a newspaper is secondary, when much of it is primary, including editorials, op-eds, advice columns, and reviews, not to mention all the advertising). However, as addressed in more detail below, journal papers for the novel claims they are making about the data they are working with are primary, and in many cases are invalidated or at least superseded by later research, often quite quickly.  — SMcCandlish ¢ 😼  10:56, 2 October 2023 (UTC)
Yes. SmokeyJoe (talk) 10:18, 4 October 2023 (UTC)
The language in Wikipedia:Identifying reliable sources (medicine)#Respect secondary sources might be handy: you shouldn't use a primary source to de-bunk the secondary source. The problem, as these examples show, is that sometimes you should. Wikipedia:Secondary does not mean good could have "or up-to-date" added to the long list of desirable qualities that are not inherent to secondary sources.
I'm not sure what problem is meant to be solved, but it feels like a DUE problem rather than a "making stuff up" problem (except, of course, when it really is NOR, or at least NOR and DUE, like the guy who thought he'd disproven Einstein, and thus we ended up with this whole policy). WhatamIdoing (talk) 05:08, 2 October 2023 (UTC)
  • I don't support the proposed amendment. For example, when science becomes news, the journos have an unfortunate tendency to simplify the science for public consumption. Fairly often, a peer-reviewed article in a reputable journal ["primary source"] gets distilled into a BBC News article by Pallab Ghosh ["secondary source"]. Now, Pallab Ghosh is brilliant, and very careful, and we're lucky to have him, but if there was a contradiction between him and the peer-reviewed article, then I would advocate following the peer-reviewed article rather than the journo. In science and mathematics, I feel the pinnacle of the reliability hierarchy should be published, peer-reviewed material by university professors and scholars of equivalent renown, even if those sources are primary (with the caveat that to S Marshall, archaeology is science, and economics isn't.)—S Marshall T/C 08:11, 2 October 2023 (UTC)
    With the caveat that later scientific secondary sourcing like literature reviews, or even later but better primary sourcing, can trump the earlier primary sources as to the claimed facts. A tremendous amount of primary research published in reputable journals is later superseded or even invalidated. However, secondary journalism is never more reliable for what a scientific primary paper actually says; there can be no more reliable source for that than the paper itself, by definition, just as there can be no more reliable source than the film for what a film's dialogue is, even if a journo later misquotes a line from it.  — SMcCandlish ¢ 😼  09:05, 2 October 2023 (UTC)
  • Oppose the addition. A secondary source disputing a primary source sounds like a very difficult case of source typing for that secondary source. A secondary source may deny the relevance, significance, even reliability of a primary source, but to “dispute” a primary source fact requires reference to another primary source, and if the source is the very secondary source, then it is also a primary source. I can’t imagine a simple case of such a dispute, in historiographical terms. It sounds like BilledMammal is using the science terminology, which Wikipedia shouldn’t do. Is there an example to illustrate? —SmokeyJoe (talk) 10:34, 2 October 2023 (UTC)
    I'd guess that the example will be found in the vicinity of this comment, which begins by saying 'we're not allowed to say "that secondary source is wrong"'. (Is someone taking bets on how many more sentences will be posted before the old phrase "source-based research" gets mentioned?) WhatamIdoing (talk) 00:49, 3 October 2023 (UTC)
    BilledMammal wrote we're not allowed to say "that secondary source is wrong" because we have a different interpretation of the primary source that the secondary source is writing about. That’s not the reason. The reason is that secondary source material is not facts that are subject to being right or wrong, but are opinions of the author. The assertion “X is wrong” is the same as “X is false” (assuming “wrong” is not a moral comment), and “X is wrong” is a verifiable or falsifiable fact, and thus is primary source material. Find a more reliable primary source. BilledMammal may not have been using scientific meanings, it looks like legal meanings. Both are different to historiography. Wikipedia is an historiographical work, not science, not legal. SmokeyJoe (talk) 10:18, 4 October 2023 (UTC)
    I like the distinctions you're drawing, but I don't really buy that "we're not allowed to say 'that secondary source is wrong'". We pretty routinely do this, in one wording or another, especially with regard to sources that have not aged well, which is pretty common with things that depended initially on early news coverage in stories that considerably develop in complexity in later coverage with access to more facts; and with subjects where the research conclusions of some earlier secondary writers have been completely refuted by those of the preponderance of later writers, usually on the basis of more research materials coming to light, or new data and research based on it having been more recently published and disproving the earlier claims. (Though frankly it also often has to do with actual standards of research quality having improved, especially when it comes to the difference between modern source material and stuff from the 19th century and earlier.) E.g., in working on tartan and Highland dress articles, I find that many assumptions of and claims by late Georgian, Victorian, and even Edwardian writers have been consistently debunked by all experts in the topic since the mid-20th century to the present. While WP would not write something like "According to false information published by James Logan in 1831", we could certainly write something like "According to James Logan in 1831, [dubious claim here]; but modern scholarship does not agree with this.[1][2][3][4]" And that really does amount to WP telling the reader that the old source is wrong, just phrasing it nicely. We have to do things like this all the time in science, when research gets superseded, or when an old but once-popular view has drifted across the FRINGE line according to modern scientific consensus.
To return to the "crap news" kind of case, there are even instances where the faulty nature of the "breaking" news about an event becomes an integral part of the encyclopedic story of whatever the event was, but we don't use "false balance" rhetorical techniques to keep implying that the early secondary-but-wrong material was actually right or even an "equally valid point of view". If the current sources tell us that the earlier writers were mistaken, we make this clear to our readers.  — SMcCandlish ¢ 😼  13:28, 4 October 2023 (UTC)
    All this is why we had to spend so much effort removing "Verifiability, not truth" from WP:V. Sometimes things are verifiable but not true -- because some sources are wrong. And we can't, mustn't, publish known error. Particularly in articles about living people, medicines, or medical conditions.—S Marshall T/C 15:29, 4 October 2023 (UTC)
    We should at least use Wikipedia:Editorial discretion to omit information from a known-wrong source, even if we don't add the Truth™ to the article. WhatamIdoing (talk) 21:37, 8 October 2023 (UTC)
    That wouldn't quite be an example of what SmokeyJoe is asking for - it isn't a secondary source disputing a primary source, it's a secondary source saying that the meaning of a primary source is X. However, it is the example that prompted this; if secondary sources say that the meaning of a primary source is X, we shouldn't be saying "no, the meaning of the primary source is Y". BilledMammal (talk) 04:59, 8 October 2023 (UTC)
    Except see everything I wrote above already. We do it routinely, when a newer and better set of secondary sources says the meaning is Y while older and more dubious secondary sources said it was X. If all you're getting at is that WP editors using their own WP:OR shouldn't contradict a general view in secondary RS that the meaning was X, we all already know that. It's much of what the whole OR policy is about.  — SMcCandlish ¢ 😼  11:30, 8 October 2023 (UTC)
It's not that I disagree with the point. It's just a lot more complicated in practice, and each situation is different. Edge cases make for bad guidelines. We're better off confronting errors when they come up than being overly prescriptive.
The advice to do our own fact checking based on primary sources is potentially dangerous, because it opens the flood gates of "the sources are wrong, and my research from this primary source (non-neutral, unverified, and not fact checked) is correct". If anything, the solution to a bad secondary source is to take it to the reliable sources noticeboard, and/or look for a better secondary source. Shooterwalker (talk) 20:23, 4 October 2023 (UTC)
What we really need is better guidance on what it looks like when we "analyze, evaluate, interpret, or synthesize material found in a primary source" ourselves. Right now there's no clearly defined boundary between citing a primary source and interpreting it, and that makes it difficult to determine whether the use of a primary source is acceptable. The closest I've seen to this is WP:Writing about fiction, which is about a specific type of primary source. Thebiguglyalien (talk) 04:43, 8 October 2023 (UTC)

This seems like a harmful change. Generally agreed with the detailed comments by SMcCandlish above. – SJ + 02:33, 8 October 2023 (UTC)

This addition seems to presuppose that secondary sources are inherently more reliable than primary sources. That's not so - for instance, it is common for secondary sources to oversimplify things or introduce errors that weren't there in the primary source. I see that SMcCandlish has cited a number of examples of such unreliability. When the claim isn't analytic, evaluative, interpretative or synthesized the primary source is not only perfectly acceptable, but often more reliable. Jo-Jo Eumerus (talk) 11:03, 8 October 2023 (UTC)

uhh

"The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist" What if those facts, allegations, and ideas are examined and published within a source? Darbymarby (talk) 14:47, 11 August 2023 (UTC)

If the material is published in a reliable source, then, in Wikipedia jargon, it isn't original research. If the source is not reliable, then it is considered original research (in Wikipedia jargon). Jc3s5h (talk) 15:05, 11 August 2023 (UTC)
But keep in mind that if the material is the outcome of a single (possibly controversial) original study (even published in a high quality scientific journal) it would still be a primary source. Overreliance on, or overinterpretation of, primary sources is also to some extent captured under the larger original research umbrella (see the project page section on primary, secondary and tertiary sources). So even if published, handle with care. Arnoutf (talk) 18:30, 16 August 2023 (UTC)
Considering Wikipedia considers places like WSJ to be reliable sources, this basically amounts to "we'll allow or deny posting whatever we feel like". That notwithstanding, even at face value this rule just means that an expert's years of experience mean nothing compared to uneducated babbling of a journalist, for no other reason than that the journalist's words were published in some official capacity and the expert's words weren't. If you're an expert in anything, you know painfully well that when someone reports on your field, they virtually never convey anything correctly - not even in terms of details, but just in general. Worse yet, even citing peer-reviewed scientific papers doesn't automatically mean it's reliable information. If you know, you know.
I understand why this rule exists but it's the evil genie wish kind of deal. Wording matters, especially when it's enforced by power-tripping drones. You created an environment where educated people's input is worth less than that of someone who doesn't know what they're even talking about. It's why despite your best efforts, Wikipedia is rightfully a laughing stock in the realm of sources of information. 46.42.22.160 (talk) 01:35, 25 October 2023 (UTC)
We don't know who here is or isn't an expert. For example, is "46.42.22.160" renowned throughout the world for copious and solid contributions to our understanding of the universe by the great researcher of that name? Meanwhile, if what's on Wikipedia reflects what's in "the realm of sources of information" as it's supposed to, it can't be a "laughing stock" in comparison because it contains the same information.
What gets me are people (not necessarily you, I have no idea what your previous contributions to Wikipedia, if any, have been) who end up ranting about what a joke everybody thinks Wikipedia is—in response to their having been frustrated in their own attempts to contribute to this work that everybody purportedly considers a joke.
Considering the millions of visitors looking up information on this site, it's a stretch to make out that "everybody" considers it a joke. Largoplazo (talk) 02:18, 25 October 2023 (UTC)
Add to this the facts that a) experts in every field regularly contradict each other and prove each other wrong, like on a daily basis; and b) an expert in one narrow field rapidly loses relevance the further they wander from their speciality (or the further we cite their specialized material as allegedly relevant to something else); and c) even when we know for a fact that a particular author is a pre-eminent expert, we will not be copyright-violating them by dumping their material into our article, but will be (usually non-expertly) summarizing it, including how it interrelates with other research and source material. The anon above just has utterly unreasonable expectations.  — SMcCandlish ¢ 😼  02:31, 25 October 2023 (UTC)

Specifying useful and proper uses of primary sources and reference works

A number of recent conflicts and confusions seem to revolve around trying to apply a single gloss of "reliability as a source" to an entire category of sources (direct observations, primary sources, secondary sources, encyclopedic tertiary sources, catalogs of canonical reference observations, &c).

These discussions have covered everything from geo stubs and how OR relates to summarizing primary sources, to whether secondary sources are naturally better sources, to what can be inferred and summarized from books and maps.

Rather than trying to wordsmith further short paragraphs that try to convey similar universals (when there may not be universal lessons to draw about such broad source categories, and the categories are not always well bounded), perhaps there could be a more fully-detailed set of examples and analogies. These could include examples of excellent use of direct observations (illustrated by a Commons photo), primary sources (direct quotes and as indications of what a source was thinking), tertiary sources (as indications of commonly held overviews), and reference data (official names, measurements, statistics).

In the right context, each of these types of sources (observations or reference data, primary, secondary, tertiary) may be the best available source — which can be supplemented but not replaced by some of the others. – SJ + 02:33, 8 October 2023 (UTC)

In particular, the notion that notability is obliged to include 'noted in reliable secondary sources' and can never mean 'notable in reliable tertiary sources' or 'passing a statistical threshold in reliable datasets' seems unhelpfully narrow. Let's keep encouraging topic areas to develop their own thoughtful notability guidelines. Highlighting interesting variations and pointing to them (here) as positive examples may dissuade people in the future from trying to retcon all variations away through well-meant but lossy attempts to Stick To The Rules. – SJ + 02:47, 8 October 2023 (UTC)
I think people need to refrain from confusing reliability and secondaryness. Perhaps we need to say so. Jo-Jo Eumerus (talk) 12:12, 8 October 2023 (UTC)
Remember that we EXPLICITLY allow primary sources in this policy, but with a caution that says it is easy to misuse them. The key to understanding NOR is that the policy is not about the nature of the source material … it is about what WE do WITH the source material. Are we going beyond what is explicitly (directly) stated in the source material? If so, then we are engaging in OR. Blueboar (talk) 14:05, 8 October 2023 (UTC)
I feel like we've lost a sense of WP:RSCONTEXT since approximately when WP:RSP became popular. Back in the day, we'd have said, "Oh, sure, that's a state-controlled propaganda outlet, but it's fine to use for certain sentences (e.g., "He was promoted to Senior Assistant Under Junior Secretary of his apartment building")". Now the response is more categorical: "How dare you use that kind of source for anything at all!" I wonder whether the goal, in some editors' minds, is to avoid making detectable mistakes (e.g., excise anything highlighted by certain scripts), rather than trying to do something useful. WhatamIdoing (talk) 22:05, 8 October 2023 (UTC)
Categorically excluding particularly unreliable sources is good, actually. Sure, you could probably source RT for simple facts, that it gets right in basic reporting. But then what you're doing is driving more eyes and traffic to a state propaganda outlet, which will misinform and disinform our readers. RSP is a great service we provide. RSCONTEXT is still an extremely important concept. And you have to remember that the prescribed remedy for someone deleting a bunch of yellow source material in an uncontroversial usage is to revert and cite RSCONTEXT. Like anything else, consensus can always exclude a source in a particular usage/context but it is up to those who want to defend that usage (a little ONUS/BURDEN for ya there) to justify why the particulars would allow it, which they fully can. I've seen a fair bit of hand-wringing over "RSP but colors but people just do yellow green bad!" and I've never actually seen this happen. Andre🚐 22:14, 8 October 2023 (UTC)
In the instance I was thinking of, the enforcer didn't remove the material. Only the source was removed. Previously, we had material cited to a source that certainly passes RSCONTEXT; now we have uncited material about a BLP. I do not think that is an improvement. WhatamIdoing (talk) 01:07, 10 October 2023 (UTC)
IMHO, to the original post, writing articles based on datasets is WP:OR. I thought we had resolved the map issue that straightforward interpretation of maps isn't OR. We shouldn't be taking a bunch of tables and either just dumping them into Wikipedia (try wikidata) or writing articles based on trends or analysis of what the narrative that corresponds to the data is. However, that does not mean that valuable data should not be included in context where appropriate, as interpreted and contextualized by secondary sources. Tertiary is a dicier issue, ie other encyclopedias. But let's leave that aside because I think it has entirely orthogonal considerations. Andre🚐 22:19, 8 October 2023 (UTC)
User:WhatamIdoing/Database article is written entirely from a single database entry. Please identify the "material—such as facts, allegations, and ideas—for which no reliable, published source exists" in that page. If you can't, you may want to revise your opinion about whether "writing articles based on datasets is WP:OR". WhatamIdoing (talk) 01:08, 10 October 2023 (UTC)
Digression
Well, in the description it includes a lot of information that appears to fail verification. Which is to say that your background knowledge of relative fish size, and general knowledge of catfish, is OR, since you wrote it in based on this dataset and thinking about and analyzing this fish, but I don't actually see anywhere where it reads that this is a small fish. A source probably exists for this information. But your source when writing it, since you didn't cite it, was your own head. I have no idea, not being a fish expert, if this is a small fish or a big fish based on the information given. There's a size, but I have no idea if that size is relatively large or small. Therefore, I cannot verify it, it is not verifiable, with the source given. Andre🚐 05:17, 10 October 2023 (UTC)
I think that's a reasonable description of the facts. The fish is 7 cm (2.75 inches) long. If you're having trouble picturing the size, then the typical adult's index finger is about 3.25 inches. Think about the kind of WP:SKYBLUE information we can expect most people to have with fish (e.g., from eating fish sticks) and the general standards that we have for describing animal sizes (e.g., a mouse is small, an elephant is not). Do you think that anyone would describe a fish whose body is half an inch shorter than their fingers, and slightly shorter than a mouse, as "big"? WhatamIdoing (talk) 17:46, 10 October 2023 (UTC)
Well, like I said, I couldn't verify it. The sky is blue, because I just looked at it. SKYBLUE is for obvious stuff that anyone knows and could easily verify. But I tried to search Google for 2 minutes and do a bit of WP:BEFORE to determine if a 2.75 inch fish is a big, or a small fish, and whether therefore this particular species of catfish is generally categorized as big, small, or medium, which would require me to know something about fish in general not just what I can see with my eyes a la SKYBLUE. I have no idea. The last fish I saw was about 1 inch and it was a goldfish. I've seen a picture of a big bass. But again this is all not SKYBLUE, but OR. SKYBLUE doesn't cover common knowledge that anyone might know or have read that's out there - that is stuff we don't allow. (And, it's an essay that does not supersede OR and V which are some of the most important core content policies) I know a mouse is small to an elephant, but that isn't something I can just write in Wikipedia without a source, because I have no idea if there are dwarf pygmy elephants, or giant mice. Sweeping, essentially scientific taxonomic statements, must have a reliable source. Now, I might not delete that content out of hand because I think it can probably be sourced and verified, but that's a different point. Andre🚐 18:02, 10 October 2023 (UTC)
Describing the size of an object in plain English is not a "Sweeping, essentially scientific taxonomic statement".
I wonder what you would predict for the outcome of a straw poll via RFC. The question will be: "A reliable source gives the length of an animal in centimeters. Its body is shorter than your finger, and a photo in the source shows someone holding it in their hand, with the animal laying across the tips of two fingers. Is it fair to describe that animal as 'small', or should we leave open the possibility that an animal you could hold a dozen of in your hand is actually a large animal? A concern has been raised that describing an animal in plain English, when the source gives numbers instead of words, might constitute 'material—such as facts, allegations, and ideas—for which no reliable, published source exists'."
What's your prediction for the result? A WP:TROUT for the editor raising the concern, perhaps? WhatamIdoing (talk) 00:19, 30 October 2023 (UTC)
While I do appreciate the TROUT pun and maybe we should all be roundly trouted for discussing at the talking shops instead of making content. On the other hand, I've been making pretty legit content all day on JSTOR. "Could I hold this animal in my hand" might be a SKYBLUE statement, since as you say, I can verify that, but that wasn't the statement. "A large fish." How large is a large fish? How large is a small fish? That's scientific common knowledge, it requires a source. If I live in a landlocked state and I've never been to the aquarium, how SKYBLUE is the range of possible fish sizes? Sure, there might be a "photo in the source" that we could interpret but interpreting the relative size of things in a photo is sometimes a fraught thing. Even if there's something in the photo for scale, that doesn't necessarily tell us how big or small the object is relative to all other objects in its taxonomic category (i.e., all fish species, let's exclude non-true fish that are grouped commonly with fish like whales and dolphins). Also, I believe "no source exists" is ambiguous here: I think a source probably does exist, but does it actually exist for this particular catfish? Maybe a general source that generally says "small fish range is X to Y" would suffice; but does this constitute original research assuming such a source wasn't provided? "Hey that's OR - where's your source or did that come from your head?" The source might exist, but I think the OR policy is written wrong if you think it means that OR is only when I definitely believe a source could not exist. It's more about whether the source is possibly obtained. Andre🚐 00:29, 30 October 2023 (UTC)
The statement was that the fish is small.
The largest recorded specimen has a body length shorter than a typical adult's finger. A typical adult male could hold a dozen of them in a single hand. I really think you should take a look at the photo: https://fishbase.mnhn.fr/photos/ThumbnailsSummary.php?ID=47665 Look at the photo, then hold out your hand and try to picture the fish lying in your own hand. Would you – you personally, just your own opinion – ever call that a large fish?
I have never heard anyone claim that the size of an animal is scientific knowledge. Little children who aren't even old enough to go to school describe animals in terms of size. Most of them seem to know that a mouse is small and an elephant is big. Do you have a source that says that whether people would consider an animal that is much smaller than their hands to be "small" is scientific knowledge? WhatamIdoing (talk) 01:33, 31 October 2023 (UTC)
I agree it's a reasonable inference for a child with no scientific knowledge to say the fish is small, but when Wikipedia says the fish is small, it means that on average, this fish tends toward a small size as a species, which is different from a child's observation (which, again, is still original research according to the original meaning of that - not the "no source exists" part). I really don't know what the relative fish size groupings are in fish taxonomy. It's possible the sizes are "very small fish," "small fish," "medium fish," "large fish," and "very large fish." That's a pretty normal distribution, and I think it's fairly safe to assume that this fish is a small fish on that assumption, but I actually don't know. Sometimes the scientists who group and name things have meanings that differ from our common understanding of things. The point of original research is that you should write articles based on sources and things that are external to one's novel fact-finding. Andre🚐 01:52, 31 October 2023 (UTC)
I do not agree that when Wikipedia says the fish is small, it means that on average, this fish tends toward a small size as a species. We can use plain old English – the sort that a non-specialist, non-scientist typical person (of any age) would use.
Describing quantities in plain old English is not original research. It is not novel fact-finding. It is, in the words of this policy, "a meaningful reflection of the sources". WhatamIdoing (talk) 02:59, 3 November 2023 (UTC)
But why is it ok for someone to look at a pic of a small fish in someone's hand and write in the article, for that fish, "This fish is small." That statement does not, in the article on the species, refer to an individual representative of that fish. It could go either way. Is that individual fish in the pic actually small, relative to all other fish or all other members of the species? How do we know that? Maybe it's an adult or a baby or a runt or a mutant. Why is it ok for someone to look at a pic of a fish and conclude that "It is a small fish" belongs in the species article? How does that pic tell us that? Isn't that more something that comes from reading the stats on its size? Andre🚐 03:17, 3 November 2023 (UTC)
The editor is not looking at a picture; the editor is looking at the picture and reading the words in the source and actually understanding(!) the numbers in the source, and then thinking about how to summarize (=not "regurgitate a single isolated part of the source word for word") the information in that single source, and thinking about how to best describe the overall sense/meaning of the source for the reader.
Sometimes we have very precise information, but precise information isn't actually DUE. Sometimes what the article needs is "Elephants are big" or "Elephants weigh thousands of pounds" instead of "The median adult male African bush elephant weighs 6,000 kg". WhatamIdoing (talk) 19:36, 4 November 2023 (UTC)
OK, but I agree that we can read about the median weights of elephants, and summarize that to "Elephants weigh thousands of pounds." That's acceptable summarization and not original research. That's based on acceptable background information. But remember, the original question was how can we write an article without original research using only database entries. "Elephants are big," is common knowledge, almost to the level of SKYBLUE - but I still think if you simply had a database that said "average weight for elephants, 6000 kg" it wouldn't be appropriate for an editor to write "Elephants are big," because again that implies some knowledge about the relative size of an elephant. On a hypothetical Earth with only whales, wooly mammoths, and pygmy elephants, elephants are not that big. We only know they're big because of common background knowledge in our heads that most animals on earth are smaller - that's partly why it's too original. If we open the door to that, what's to stop us writing an article based on all the common knowledge and old urban legends that people think they know? Andre🚐 21:14, 4 November 2023 (UTC)
Youse guys have been arguing about this hypothetical damn' fish for almost a month straight. Please, please, either just drop it entirely or take it to user talk instead of hitting our watchlists over and over again with this silly pissing match.  — SMcCandlish ¢ 😼  22:58, 4 November 2023 (UTC)
We can certainly take it offline as I do not think anything in particular is at stake, here. The OP was trying to specify the use of primary sources. I think there have been a good number of discussions about primary sources for pretty much the entire last couple years that I've been back editing. This is hardly the longest discussion, although perhaps I have turned this one into a sidebar with WAID, whom I suspect I mostly agree with, so we're quibbling on a small fish point. But if you want to take this back to the general or for this thread to be closed, please feel free to do so. You can close it and write "turned into a quibble fest" and if anyone wants to start a new one, go ahead. As far as whether we'd find any area of meaning here, I think the question was the delta between what original research is, verbatim, versus what people mean by the principle of OR. Andre🚐 23:07, 4 November 2023 (UTC)
There might still be something to resolve in the main thread; maybe just collapse-boxing this will do it?  — SMcCandlish ¢ 😼  23:54, 4 November 2023 (UTC)

Publisher website links and WP:PRIMARY

– 15:34, 10 May 2021 (UTC)

Explicit and implicit synthesis examples.

I find the examples of synthesis near the top of the article very clear and helpful. However, I noticed there was a potential example missing that may complete the set and make an even clearer guideline. Here's my revision below.

Below are three example sentences demonstrating improper editorial synthesis. On their own, the two factual claims in each example may be reliably sourced. However, in the first example they have been joined into a new, synthetic statement with a conjunction such that the second clause explicitly questions the first clause, ultimately claiming that the UN has failed to maintain international peace. If no reliable source has combined the material in this way, it is original research.

 N The United Nations' stated objective is to maintain international peace and security, but since its creation there have been 160 wars throughout the world.


In the second example, the same sourced material is used to synthesize the opposite meaning by using a different conjunction and an additional qualifying term, illustrating how easily an article's framing can be skewed by sourced material used to make claims not actually made by any one source:

 N The United Nations' stated objective is to maintain international peace and security, and since its creation there have been only 160 wars throughout the world.


The previous two examples both explicitly compose material in an improper way, adding words with specific meanings to make an additional, synthetic claim. However, meaning is also implicitly created by juxtaposition alone: the very fact that two statements are placed side by side implies that one is somehow related to the other. The third example simply states the two factual claims sequentially with no additional language. Whether or not this is an implicit synthesis is dependent on further context, as the surrounding material can likewise clarify implications made by passages in isolation. Ultimately, it is synthesis of a more subjective kind, and arguably where original research overlaps with the distinct policy regarding undue weight. Regardless, care is warranted, as it is possible to imagine many contexts where an unspecified but likely negative implication would be obvious to many readers:

 * The United Nations' stated objective is to maintain international peace and security. Since its creation, there have been 160 wars throughout the world.



Included as well is a bit of copy-editing. Thoughts? Remsense 20:40, 2 December 2023 (UTC)

The central idea here might be reasonable, but the introductory text of the new, 3rd example is about 3× too long, especially given the nature of this section of the page and the examples in it.  — SMcCandlish ¢ 😼  11:18, 3 December 2023 (UTC)
I agree. It's an interesting example, and I even think it represents Wikipedia policy. But it might be better as part of an essay. Shooterwalker (talk) 21:15, 3 December 2023 (UTC)
That was my qualm, but I struggled to be brief while making the distinction clear. I gave it another shot, how's this?

The previous examples add specific words to explicitly create an improper synthesis. However, meaning is also created implicitly by juxtaposition alone. The third example simply states the two claims sequentially. Whether this is an improper synthesis depends on further context. It is possible to imagine many contexts where a likely negative implication would be obvious:

Remsense 21:45, 3 December 2023 (UTC)
I don't think it will reduce or resolve disputes. I would unfortunately expect it to increase disputes. Any two adjacent sentences could be claimed to be "implicit synthesis". Consider:
  • Treatment for lung cancer may include surgery, radiation therapy, and chemotherapy. The prognosis is poor, and most people with lung cancer die within a few years.
The wikilawyer will say that putting those two sentences together is 'implicit synthesis', and that it improperly implies that cancer treatment kills people (in his opinion, although probably no one else in the world would agree).
Unless you can reliably and usefully tell editors how to identify a problematic case, it's generally not helpful to mention it in a policy. It ends up backfiring, as editors make up their own, mutually incompatible definitions and proclaim that their interpretation is the true one. WhatamIdoing (talk) 22:07, 3 December 2023 (UTC)
I think you're right, I think I may stash it for an essay instead. Remsense 22:10, 3 December 2023 (UTC)
Well, if there were enough support to add something like this to the NOR page, that intro text could be squeezed further, since we don't need the text to tell us what was previous and what is following, to re-summarize examples we just read, or to count for us: "Meaning may also be created implicitly by juxtaposition alone. Whether simply stating two claims sequentially is an improper synthesis depends on further context. It is possible to imagine many contexts where a likely negative implication would be obvious:" Even with an intro that tight, I think WhatamIdoing raises a serious concern; something like it was tickling at my own brain about this material last night, which is much of why I didn't do a concision pass of my own on it. This strikes me as eminently sensible and has been my experience as well, as a long-term shepherd of various guidelines: Unless you can reliably and usefully tell editors how to identify a problematic case, it's generally not helpful to mention it in a policy. It ends up backfiring, as editors make up their own, mutually incompatible definitions and proclaim that their interpretation is the true one.  — SMcCandlish ¢ 😼  09:33, 4 December 2023 (UTC)
SMcCandlish, certainly. The purpose of a policy like this is to be iron-clad. Remsense 09:35, 4 December 2023 (UTC)
@Remsense, I like the idea of writing an essay about this. That could be very helpful, without the burden of trying to make it "iron-clad" on the first go, and with the benefit of having room for lots of examples and solutions. WhatamIdoing (talk) 16:39, 4 December 2023 (UTC)
I agree! Thank you to everybody for discussing the merits and potential placement of this subtopic! Remsense 18:40, 4 December 2023 (UTC)

The reality is that when it's an overreach or POV/contested we call it synthesis; when it's neither we call it summarization of what the source said. Your example points out one of the many ways of achieving overreach or POV/contested. Also of sneaking in a POV. But I think that being just 1 of many means that it's probably good to not add it to core policy wording. Sincerely, North8000 (talk) 18:53, 4 December 2023 (UTC)

Input requested

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.



There is a discussion at User talk:David Eppstein#Undue warning of a block regarding a warning about original research. Your input is appreciated. Note: I didn't come seeking input for the conduct part, but about the interpretation that I am pushing original research. Regards, Thinker78 (talk) 07:29, 10 January 2024 (UTC) edited 17:46, 10 January 2024 (UTC)

This talk page is for discussing improvements to the policy. It looks like you're looking for more input on a conduct dispute. This is not the venue. If you want input on the content dispute, you could try WP:NORN, but I'd recommend linking to the originating dispute, not the user talk page discussion. Firefangledfeathers (talk / contribs) 17:19, 10 January 2024 (UTC)
First, I have to point out I have had a dispute with you before so I don't consider you an uninvolved or unbiased editor in a case involving me.
Second, I am not seeking to discuss the issue here but I am requesting input from interested editors to continue the discussion in the talk page I linked. It is normal practice for editors to advertise in such a manner in related venues to seek uninvolved editors' input. Per WP:MULTI and WP:CONTENTDISPUTE (Related talk pages or WikiProjects, this is a related talk page). Thanks. Thinker78 (talk) 17:31, 10 January 2024 (UTC)
But you're not linking to a content dispute but a conduct dispute. None of the steps of WP:CONDUCTDISPUTE suggest referral to a policy talk page, and for good reason. If you would like input on the content dispute, don't link to a user talk page. Firefangledfeathers (talk / contribs) 17:38, 10 January 2024 (UTC)
I am seeking input regarding the interpretation of the editor I have a dispute with that I am pushing original research. I didn't come here looking for input regarding conduct. Thanks. Thinker78 (talk) 17:43, 10 January 2024 (UTC)
I would have taken Joe Roe's advice. -- LCU ActivelyDisinterested «@» °∆t° 17:48, 10 January 2024 (UTC)
Ditto. Firefangledfeathers (talk / contribs) 17:49, 10 January 2024 (UTC)
Why? The other editor did not clarify why they came to the conclusion I am pushing original research and when I attempted to explain they just took issue with the "walls of text". That's why I am trying to get input from uninvolved editors. Maybe there is some perspective about original research that I am missing? So I think clarification is in order. Sincerely, Thinker78 (talk) 17:56, 10 January 2024 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.