Talk:Star Wars: The Last Jedi/Archive 9


Presentation of user-generated scores

MOS:FILM#Audience response explicitly states, "Do not include user ratings submitted to websites such as the Internet Movie Database or Rotten Tomatoes, as they are vulnerable to vote stacking and demographic skew." In the "Audience response" section here, they are put at the beginning of the second paragraph without any criticism as part of it. The user ratings are reported, full stop, then criticism of the ratings is introduced as an afterthought. It needs to be more clear upfront that user ratings are inherently problematic, rather than introducing them as-is. Erik (talk | contrib) (ping me) 14:16, 5 December 2018 (UTC)

I've added "which lack controlled sampling" to give the correct picture of the user-generated scores, especially relevant following the first paragraph. We should not assume that readers know that user-generated scores are not proper metrics. That is why the MOS:FILM guideline has existed all this time. Erik (talk | contrib) (ping me) 14:19, 5 December 2018 (UTC)

Erik, regarding this and this, I see what you mean about using the sites as sources, but those sources can be easily replaced with WP:Reliable media sources about the sites' scores. When it comes to the other content about the user scores, we are reporting on what WP:Reliable sources have stated. Furthermore, the sources are specifically about Rotten Tomatoes and Metacritic. "Online film ratings" is vague. We went through an extensive RfC about that material. More than one RfC, in fact. You know this. Do you need to review those past extensive discussions? The content was thoroughly vetted. That you disagree with the WP:Consensus does not mean that you get to disregard it. And as for your addition of "which lack controlled sampling," that should be sourced. Flyer22 Reborn (talk) 14:30, 5 December 2018 (UTC)
I would be fine with including the user-generated scores as long as they are qualified. It took me aback to see them introduced at face value because there are still many people who think that they are credible measurements. To draw a comparison, if we editors cover works about pseudoscientific topics, we would indicate in "the same breath" that it is pseudoscientific. We would not say, "This documentary film is about extraterrestrials who visited Earth in the past. This premise has been rejected by professional..." We would say, "This documentary film is about the fringe belief that extraterrestrials visited Earth..." (or something like that). In this case, user-generated scores aren't suspect only here. They are suspect in the most general sense, which is why we almost never include them. When we do include them, we need language qualifying their use, in the same breath. I hope you're not asking for a source that has to apply to just this film, because that should not be necessary. In general, perhaps? (I've wanted to create a Wikipedia article about online film ratings for reasons like this.) Erik (talk | contrib) (ping me) 14:48, 5 December 2018 (UTC)
Someone other than me will ask for a source for "which lack controlled sampling." It's a statement that's unsourced. That the user scores are unreliable is made clear in one or more of the reliable media sources on the matter regarding the user score debate concerning this film. So we can use one of those sources for "which lack controlled sampling" or similar. Since reliable sources cite older user score data for the Rotten Tomatoes and Metacritic user scores, and new sources are not going to keep reporting on updates regarding those scores, we should probably remove "achieving ratings of 45% and 4.5/10, respectively" or replace it with something like "achieving ratings below [so and so]" when using a reliable media source in place of the sites as sources. And as for "upfront," it's not "upfront." That big RfC we had on which version to go with, now seen at Talk:Star Wars: The Last Jedi/Archive 4#RfC: Which version of the Audience response section should we go with?, specifically addressed whether or not the user scores should be upfront -- meaning in the first paragraph. Consensus was for moving the user score material to the second paragraph. We did. Flyer22 Reborn (talk) 14:54, 5 December 2018 (UTC)
I don't mind the other changes to the language. I'm using "upfront" to say that we should not write a sentence about user scores without automatically challenging the nature of the scores. Like we would say "the pseudoscientific belief of <whatever issue>". I know that comparison is not direct. I've added a general source per below. Erik (talk | contrib) (ping me) 20:53, 7 December 2018 (UTC)
I know what you meant by "upfront" above. I was referring to this edit summary, where you used "upfront" to seemingly mean "up front." You used it again here. Flyer22 Reborn (talk) 23:23, 8 December 2018 (UTC) Flyer22 Reborn (talk) 23:47, 8 December 2018 (UTC)

Comment – I would be fine with the addition, "which lack controlled sampling", if properly sourced. It's a claim about RT and MC that needs verification. It doesn't have to mention this film specifically, just RT and MC in general. --GoneIn60 (talk) 21:22, 5 December 2018 (UTC)

I've added a source. Erik (talk | contrib) (ping me) 20:53, 7 December 2018 (UTC)

As I noted in the original debate, I have strong reservations about including user scores here and would remove them if I could do so. Alas, consensus exists that they must be presented here. Toa Nidhiki05 23:36, 8 December 2018 (UTC)

Part II

I reverted this edit by Scoundr3l on the basis that further discussion is probably needed at this point. After reading the source cited in support of the RT/MC claim that they "lack controlled sampling", perhaps Scoundr3l has a point that we need to take another look to make sure this isn't a case of WP:SYN. The idea that RT and MC lack controlled sampling is implied but not explicitly stated. If a better source is needed, or if we need to rephrase it some other way, let's hash that out here instead of through back-and-forth editing in the article namespace. --GoneIn60 (talk) 05:32, 8 January 2019 (UTC)

Sorry guys, didn't see all this discussion or realize it may have been a compromise. But yeah, I think if we're going to discuss audience score (which, incidentally, is another thing worth discussing) I got no problem including the Vox source explaining these scores further, I just think it was presented dishonestly. There are three sources in the sentence: two discussing scores, and one discussing how those scores are arrived at. None of the sources were casting doubt on the TLJ audience scores but, by synthesizing their points, the sentence is creating an unsourced opinion that there's reason to doubt those scores. That's not to say there isn't cause or even sources to doubt the audience scores for this movie, it just isn't what was being sourced, so it's OR. Scoundr3l (talk) 16:24, 8 January 2019 (UTC)
The Vox source does refer to CinemaScore as a "fairly reliable source" despite its flaws, and in the same breath calls it the only such reliable measure especially when compared to RT, MC, and IMDB. It spends a good portion of the article explaining how the audience scores on these websites skew white male, have been exploited at times for films like Ghostbusters, and overall can often be an inaccurate representation of the true audience score. In a nutshell, it's telling its readers you simply can't trust the score you're seeing. It does talk about sampling, but it doesn't explicitly mention "controlled sampling". So we should definitely use a better phrase here or find another source altogether, but I wouldn't go as far as to say that casting doubt wasn't the article's intention, because that's not how I read it. --GoneIn60 (talk) 17:09, 8 January 2019 (UTC)
A couple notes: the Vox article does not say that audience scores skew white male. It actually contrasts audience scores to those of critics, who are generally white males (the critics). It even links to an article discussing diversity issues in professional film criticism. Vox also makes criticisms of Cinemascore and RT review aggregation, but is not being used to modify those scores. Scoundr3l (talk) 18:08, 8 January 2019 (UTC)
Regarding your first note, yes, thanks for the correction. I skimmed past that part and didn't catch the appropriate context. As for your second note, I'm not entirely sure what you are suggesting. Yes, the article criticizes every polling method to some extent and even professional critics, but its conclusion is crystal clear in this excerpt:
CinemaScore ... is, at least, a fairly reliable source of what it actually measures. And at present, it’s the only such measure – especially when compared to the other measure of audience opinion: the audience grade collected by sites like Rotten Tomatoes, Metacritic, and IMDB.
There is no question that RT, MC, and IMDB are being called out for being unreliable. If the contrast between a "fairly reliable" measure and Rotten Tomatoes is stark, then one can only conclude from that context that RT and the like are not reliable. Casting doubt on the online audience grades from the big 3 is a clear objective that the article wishes to impose on its reader. It even mentions on two occasions that CinemaScore has "some methodology", but the same can't be said about online scores that are susceptible to manipulation. To that point, it goes on and says, "But even if there’s no foul play, the randomness of the sampling is less rigorous than the information collected by CinemaScore". This is the minimal reference we get about sampling, and as you can see, there's nothing here about what "controlled sampling" is or how it's used/neglected by online aggregators. That's the only issue I have at the moment. I'm not worried about this source being used to disparage online aggregators, because to me, that's part of its intent. --GoneIn60 (talk) 22:17, 8 January 2019 (UTC)
You seem to be arguing for the quality of the original research, not that it isn't original research. I understand that Vox is being critical of audience scores, but that doesn't make it applicable to unrelated citations as we see fit. Vox said nothing about TLJ. If I find a reliable source which criticizes the methodology of, say, Gallup Polls, I don't get to add a modifier to every Gallup Poll citation "According to a Gallup Poll, which have been shown to be biased [1], ...". That isn't how it works. Per WP:SYNTH "If one reliable source says A, and another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned by either of the sources. This would be improper editorial synthesis [...] which is original research". If we're using the scores (which, for the record, I'm opposed to per previously cited guidelines) they should be presented as cited by RS, or not cited at all. Bringing in an unrelated source about the reliability of the scores is implying a conclusion not found in any of the cited sources. Although, again, I have no issue including the criticism as a separate point, as seen. It's valid criticism. Scoundr3l (talk) 02:32, 9 January 2019 (UTC)
SYNTH is a complex rule tucked within a major policy. As mentioned multiple times on WP:What SYNTH is not, "...everything under the sun can be shoehorned into a broad-enough reading of SYNTH". There are acceptable levels of SYNTH allowed on Wikipedia that do NOT violate WP:NOR. We have to be careful with making these kinds of claims, and that explanatory supplement lays out a ton of examples that are often mislabeled as SYNTH violations. In particular, this is similar to the scenario described at SYNTH is not obvious II:
An example of a perfectly valid citation is given in the guideline on citations, at WP:Bundling: "The sun is pretty big, but the moon is not so big.[1]" The bundled citation uses one source for the size of the sun, and another for the size of the moon. Neither says that the sun is bigger than the moon, but the article is making that comparison. Given the two sources, the conclusion is obvious. So a typical reader can use the sources to check the accuracy of the comparison.
Now let's look at the material you're challenging as SYNTH:
User-generated scores on Rotten Tomatoes and Metacritic, which lack controlled sampling,[155] were more negative, achieving ratings of 45% and 4.5/10, respectively.[156][157]
Here, you're arguing that since neither source directly ties the unreliability of RT and MC together with TLJ, we can't imply to the reader that the TLJ scores are potentially unreliable, but that would contradict the advice above. The first half of the sentence is properly sourced (assuming we rephrase/replace "controlled sampling" with something more in line with the Vox source), and so is the second half. By checking the sources, a reader would understand that the conclusion implied is obvious, even if that conclusion wasn't explicitly stated by any of the sources. The SYNTH policy was not meant to weed out this kind of valid synthesis.
Also keep in mind that "to claim SYNTH, you should be able to explain what new claim was made, and what sort of additional research a source would have to do in order to support the claim". I'm not convinced that this claim passes the litmus test. Perhaps we won't see eye to eye, but the compromise might be to focus on changing "lack controlled sampling" to a phrase we can agree on instead of going deeper down the rabbit hole on policy interpretation. --GoneIn60 (talk) 04:38, 9 January 2019 (UTC)
I don't think there's been any confusion over what the unsourced claim is: that the TLJ audience score is inaccurate. I said as much and I believe you've also said the claim is intentional and not explicitly stated in the sources. Isn't that what we're discussing? But we can bypass quibbling over valid and invalid synthesis with another simple litmus: why is this interjection here necessary? What, if not doubt, is being added? And as none of the sources are expressing doubt of the given scores, the source of the doubt is attributable only to the editor(s) who added the interjection. That's an NPOV issue. It'd be just as much a 'valid synthesis' to change it to "audience score on Rotten Tomatoes, which RT ensures is authentic [1], were more negative..." but our purpose here is to describe disputes, not to engage in them, so both are pretty encyclopedically inappropriate. I think the most appropriate thing to do is separate the claims from the criticism, as we usually do. There will still be plenty of room to pick apart audience scores in an impartial way. Scoundr3l (talk) 22:58, 9 January 2019 (UTC)
That's an NPOV issue. It'd be just as much a 'valid synthesis' to change it to "audience score on Rotten Tomatoes, which RT ensures is authentic [1], were more negative..."
It wouldn't be as valid, because the overwhelming consensus is that ALL audience scores on RT and MC are aggregated in an unreliable manner, and sources don't have to list every film on RT and MC for that to apply to every film. Also, that wouldn't be an accurate representation of what RT stated (their claim is that they didn't detect anything unusual, not that it isn't possible), but I get that you're just making a point. I just think it's a bit unreasonable to expect to see TLJ mentioned specifically in the cited source, but it looks like we're moving beyond that. Let's continue the discussion at the end of this thread, so we don't keep pushing the sidebar discussion below further down. --GoneIn60 (talk) 00:51, 10 January 2019 (UTC)
Here’s a reliable source going in-depth on the unscientific nature of audience polling. Toa Nidhiki05 18:25, 8 January 2019 (UTC)
I don't have any issue with the reliability of Vox. This source doesn't seem to add much that wasn't said in Vox, but it's at least on topic. Why not both? Scoundr3l (talk) 02:48, 9 January 2019 (UTC)
I introduced that source in an earlier discussion not long ago, and it was added as a citation in the first sentence of the "Audience reception" section. Past discussions have resulted in a preference to differentiate between reliable and unreliable polling methods described in this section, since we don't generally allow the unreliable audience scores to be mentioned per MOS:FILM#Audience response. --GoneIn60 (talk) 05:01, 9 January 2019 (UTC)
Yes, admittedly I haven't read all the previous discussion, but it looks to me almost like we've followed up one questionable decision with another. We really shouldn't be allowing audience scores for the same reasons listed in the MOS page and by sources such as Vox. They are prone to abuse. In the case of this movie, at least, I can see some value in discussing them as there was verifiable dispute over critic vs audience scores, so maybe just discussing them in that context is most appropriate. My late opinion. Scoundr3l (talk) 23:03, 9 January 2019 (UTC)
The decision was vetted pretty well in an RfC: Talk:Star Wars: The Last Jedi/Archive 2#Should we include an Audience response section?. It saw participation from over 50 editors and prompted a lengthy discussion. Ultimately, arguments citing WP:DUE as a result of the significant coverage the scores and controversy received prevailed. --GoneIn60 (talk) 00:51, 10 January 2019 (UTC)

Cont'd from above

As said, it's my late opinion. But you haven't answered the question as to why the interjection is necessary except to express doubt. And since, at most, we have vague accusations that the scores are unreliable and might be skewed, and a specific statement that these particular scores are accurate, it's really not an issue of due weight, it's an issue of no weight. At best, it's a controversy. The dispute should be discussed, not engaged in. The criticism of the score should be moved below where it can be presented impartially and not used to poison the well with editorial opinion, well intentioned though it may be. Scoundr3l (talk) 22:17, 10 January 2019 (UTC)

Apologies, as I haven't had much time the last couple of days. So let's get straight to the meat and potatoes (I disagree with some of what you just said, but that's not important). You don't necessarily object to a statement that discusses the unreliability of RT and MC audience scores; you just don't want it in the same sentence. Fine by me. However, your original proposal states "can not necessarily ensure contributing voters". I think we need to drop "necessarily" and possibly "contributing" before reinstating. Does that work for you as a compromise? --GoneIn60 (talk) 17:17, 14 January 2019 (UTC)
No sweat, the weekend is no time to be quibbling over Wikipedia articles. In my original proposal, I only added "necessarily" because Vox said it and I figured it would be less likely to upset the applecart. I have no problem changing the phrasing. I also think "scientific polling methods" poses a similar problem of being vague and value-laden. Would you object to changing that to something more specific like "exit polling"? Thanks for your collaboration. Scoundr3l (talk) 18:43, 14 January 2019 (UTC)
I want to throw my two cents in here: I object to any change that removes the immediate mention that the internet audience scores are unscientific. This wording was explicitly hashed out months ago and throwing it out despite multiple reliable sources backing the claim up doesn't make any sense. In fact I doubt you will find any reliable source arguing for the validity of online polling that lacks controlled sampling. It's not poisoning the well to mention that some scores use reliable polling methods and other polls use unreliable methods. It doesn't mean the scores are bad, it doesn't mean they are wrong - it just contrasts the two. An alternative might be to not immediately introduce the claims, something like this:

Audiences randomly polled by CinemaScore on opening day gave the film an average grade of "A" on an A+ to F scale.[117] Surveys from SurveyMonkey and comScore's PostTrak found that 89% of audience members graded the film positively, including a rare five-star rating.[115][153][154] User-generated scores on Rotten Tomatoes and Metacritic have ratings of 45% and 4.5/10, respectively.[156][157] Audience reception measured by scientific polling methods (like CinemaScore and comScore) was highly positive[152] while user-generated polls, which lack controlled sampling, tended to be more negative.[155]

But the fact that user-generated polls and scientific polls found different results is highly important to this section, and it has to be mentioned that not all polls use the same standards. Toa Nidhiki05 21:54, 14 January 2019 (UTC)
Although we've had "scientific polling methods" in the section for months, the "which lack controlled sampling" wording was only added a month ago. I saw the aforementioned edit by Scoundr3l, but I didn't mind it because I felt that it was worded okay and placed in a good spot. That's the best spot for it if we are not to use the "which lack controlled sampling" wording right in the first sentence about audience scores. I waited to see if anyone would object to Scoundr3l's edit. As for including audience scores, we could have "were more negative" without the "achieving ratings of 45% and 4.5/10, respectively" wording there, but then "more negative" would be vague. The fact that MOS:FILM prefers we don't include the scores and that the scores can change a bit over the years is why I suggested above to Erik that we remove "achieving ratings of 45% and 4.5/10, respectively" or replace it with something like "achieving ratings below [so and so]" and use a reliable media source in place of the sites as sources. Flyer22 Reborn (talk) 06:18, 15 January 2019 (UTC)
As for Toa Nidhiki05's suggestion, it's not bad either. But as seen at Talk:Star Wars: The Last Jedi/Archive 4#RfC: Which version of the Audience response section should we go with?, keeping the audience score material out of the first paragraph was judged as edging out the other options. So if Toa Nidhiki05's suggestion is to have some of the audience score material in that first paragraph, I point editors back to the "Which version?" RfC. Flyer22 Reborn (talk) 06:30, 15 January 2019 (UTC)
I get where you're coming from Toa and I agree with most of the criticism, so this definitely isn't about removing it, just rewriting it. The chosen modifiers are vague, mildly contentious, and seem to be chosen only to influence the reader's opinion. We might also have said "Rotten Tomatoes, which had a larger sample size", "CinemaScore, which lacks double-blind sampling", etc. but we don't write like that because that's not NPOV. No evidence of foul play was presented (barring someone on FB claiming they hacked the site), so it's just criticism of the methodology, which is fair but subjective. Forbes went so far as to call the accusations a conspiracy theory. While I agree that CinemaScore is probably more scientific, it isn't clear what's meant by the comparison, nor is it for a writer at Deadline Hollywood to decide what is and isn't science. CinemaScore certainly doesn't use 'controls' in the scientific sense of the word, so 'controlled sampling' is confusing if not misleading. And given RT's detailed response to the accusation, I don't think they'd agree with being called unscientific by comparison (amusingly, the first line of the Vox article says that measuring audience reaction is far from an exact science). As long as we've agreed to use the scores, I think the best course of action is to present the scores as they are, without modification, followed by relevant criticism. Short of that, I think the next best thing would be rephrasing it with additive, impartial details on how the scores were arrived at. I.e. "exit polling", "user-generated scores", etc. If we can achieve some impartiality there, I'm less concerned about where it's placed and defer to previous consensus. Thanks. Scoundr3l (talk) 22:02, 15 January 2019 (UTC)
I'm trying really, really, really hard not to engage in the petty back-and-forth on what makes a poll scientific and what it means to use controlled sampling. Some of the comments above are likely being made by individuals who never took a statistics course in college. For anyone who genuinely is confused and would like more information, you may find the sources below useful:
We should move on, however; there's no sense dwelling on the irrelevant details in this discussion. We've already found ourselves in agreement (for the most part anyway, minus possibly Toa) about the next course of action. Drop the "lack controlled sampling" phrase, which isn't properly sourced. The cited Vox article doesn't use or define the term "controlled sampling". As a compromise, relevant criticism of polling methodologies can be inserted into the same paragraph in the following sentence. We just need to agree on what that says. --GoneIn60 (talk) 06:15, 16 January 2019 (UTC)
GoneIn60, if your "Some of the comments above are likely being made by individuals who never took a statistics course in college." comment includes me, I'm not sure why. Flyer22 Reborn (talk) 20:24, 16 January 2019 (UTC)
I'd likewise take exception at the comment if it wasn't vague enough to ignore. The question is why CinemaScore is being called "scientific". I'd be happy to discuss why it doesn't qualify, based on my college education, but what'd really make me happy is better sources. Scoundr3l (talk) 23:30, 16 January 2019 (UTC)
Flyer, no, not at all. The comment was meant to be vague and ignored; it wasn't aimed at anyone directly. It was more out of frustration with the direction things were heading, more so than any particular comment or editor. Also, sources I posted following that comment were for anyone that genuinely needed more information on what makes a poll scientific. Hopefully they were helpful, but if not, exploring it further would be best saved for another time and another thread. This one doesn't need hijacking, which if you ask me, ended with a rather decent compromise. --GoneIn60 (talk) 03:39, 18 January 2019 (UTC)
GoneIn60, this is not the same section as the #Audience reception section copyedit section, obviously. That section somewhat focused on how statistics are to be seen. So I was wondering about your "never took a statistics course in college" statement in relation to this section. Anyway, moving on. Flyer22 Reborn (talk) 00:04, 19 January 2019 (UTC)
I reread your comments, and it's clear that my comment didn't apply to you. Regardless, it has been struck out. It was a poor choice on my part. --GoneIn60 (talk) 02:31, 19 January 2019 (UTC)
I understand that you clarified that, but I did wonder if you were including me somehow. Flyer22 Reborn (talk) 05:09, 19 January 2019 (UTC)
Scoundr3l, I think we need to save the "scientific polling methods" issue you brought up for another discussion. I advise that you read the entire Audience reception section copyedit above, and then if you have something new to bring to the table, begin a new thread. That's a separate beast in my opinion.
"In my original proposal, I only added "necessarily" because Vox said it..."
Well, not exactly. Vox did not say "Audience scores ... can not necessarily ensure contributing voters have seen the film". It says that every participant in a CinemaScore poll has seen the film, and then goes on to say that's not necessarily the case for audience score participants on RT, MC, and IMDB. A careful read here has the term "necessarily" implying that some participants may or may not have actually seen the film, but that doesn't change the fact that RT, MC, and IMDB have no way of ensuring that. You can't combine these and say "not necessarily ensure". The sites cannot ensure at all, period. The simple solution here is to drop the term "necessarily". As for "contributing voters", I think we should keep it simple and just say "participants". Anyone who participates in a poll is contributing to that poll's results, so calling them contributing is unnecessary fluff. --GoneIn60 (talk) 06:15, 16 January 2019 (UTC)
I believe I agreed with dropping "necessarily" several posts ago. As for changing "contributing voters" to "participants", I don't really see how that's necessary. Saying they contributed is synonymous with saying they participated. And removing "voter" just makes it less specific as to who we're talking about, but I'm not married to the phrasing. If that's what's holding up progress, go ahead and change it. As for the above thread, I've read it and I honestly don't see where anyone has answered Popcornduff on what is meant by "scientific polling" and "controlled sampling", just seems that everyone has taken those phrases for granted. A WP search for either term shows no results found and "controlled sampling" suggests "scientific control" which is not what's meant here. If we don't know what you mean, and WP doesn't know what you mean, surely readers won't know what you mean. So can we clarify the article and not the talk page? Thank you (Sincerely, I know this has been drawn out longer than it should) Scoundr3l (talk) 16:41, 16 January 2019 (UTC)
I think "contributing" is an unnecessary descriptor, but I'll compromise further and leave it for now only dropping "necessarily". I don't really have a lot of interest rehashing my stance regarding "scientific polling methods", but if this is still an issue you feel strongly about and want to discuss further, begin a new section referencing the previous section and see what kind of further discussion it generates. I may or may not jump in depending on where that goes. --GoneIn60 (talk) 20:29, 16 January 2019 (UTC)
Original proposal has been partially restored (diff). Had to change a bit of the grammar following the changes we discussed. I also removed IMDB, as there's really no reason to introduce it here. Hopefully that works for everyone. --GoneIn60 (talk) 20:38, 16 January 2019 (UTC)
Thanks, it looks good to me. As said, I don't see where anyone qualified 'scientific polling' but I'm happy to kick that issue further down the field, as well. It was a secondary issue that I'm sure will come up again. Scoundr3l (talk) 23:30, 16 January 2019 (UTC)

This says, "Certainly, there was legitimate debate among fans involving The Last Jedi’s characters and story lines. But the audience score was often cited in news articles as evidence of the film’s massive unpopularity—despite the fact that the metric is easily manipulated and not necessarily reflective of how most viewers felt about a movie." Thanks, Erik (talk | contrib) (ping me) 14:51, 5 March 2019 (UTC)

Yeah I think everyone here can agree that the scores are easily manipulated, but if it adds anything we didn't already know it might be worth adding. Scoundr3l (talk) 23:06, 6 March 2019 (UTC)

Discussion over 'Scientific Polling'

Update: A145GI15I95 changed "scientific polling methods" to "exit polls." Flyer22 Reborn (talk) 04:20, 29 March 2019 (UTC)

Hello, I was pinged. But why in the middle, not at the end, of this thread? The thread is hard enough to read as it is. Anyway, I changed "scientific polling methods" (which sounds like defensive weasel words, and which imparts no meaning to readers) to "exit polls" (which is a common phrase, and is the explanation given in the source). I saw this section was brutally discussed here, but discussion appeared to die several weeks ago, so I took this as a bold move to be possibly followed by a revert and discuss (as noted in my diff log). Best, A145GI15I95 (talk) 04:30, 29 March 2019 (UTC)
A145GI15I95, this section of the thread is specifically about the "scientific polling methods" wording (in part anyway). The last subsection of this thread is not about that. That is why I pinged you to this section. Flyer22 Reborn (talk) 05:21, 29 March 2019 (UTC)
Roger, roger. Force be with us. A145GI15I95 (talk) 15:58, 29 March 2019 (UTC)
The change of "scientific polling methods" to "exit polls" has now been reverted. May I ask specifically, please, is there really objection to this change, and why?
  • "Scientific" means "based on science" (NOAD), and "science" means "the intellectual and practical activity encompassing the systematic study of the structure and behavior of the physical and natural world through observation and experiment" (NOAD). The use of the word "scientific" here imparts no meaning to readers. It requires they check the source themselves to see what is meant. Its vague-but-important-sounding usage here (and in the source) is jargonistic, defensive, un-encyclopedic, and a weasel word ("words and phrases such as 'researchers believe' and 'most people think' which make arguments appear specific or meaningful, even though these terms are at best ambiguous and vague").
  • The phrase "scientific" is sourced from a single publication. And the publication is Deadline Hollywood, a former blog turned news site, which isn't the most esteemed.
  • The source explains its claim to "scientific" means simply exit polling. Exit polling is a common phrase with a specific definition in professional dictionaries like Oxford, Collins, American Heritage, and Merriam-Webster.
Considering "scientific" is vague and weaselly, and "exit polling" is accurate and neutral, is there really consensus against such a change? Thank you. A145GI15I95 (talk) 17:43, 30 March 2019 (UTC)
The term exit polling typically refers to elections. Scientific polling is far more accurate descriptor. Toa Nidhiki05 18:12, 30 March 2019 (UTC)
We could say "audience exit polls" or "theater exit polls" or "polls of audiences exiting theaters", etc, if there's a desire to qualify that this isn't regarding an official political election. I've presented above how "scientific" is definitively vague and therefore weaselly. It refers to such broad ideas as "the state of knowing" (Merriam-Webster) and "the study of the nature and behaviour of natural things and the knowledge that we obtain about them" (Collins). With no further explanation given, its usage here appears rather simply to mean "better". How is "scientific" far more accurate, please? A145GI15I95 (talk) 18:24, 30 March 2019 (UTC)
Your analysis also ignores the previous discussions on this page. If you're going to jump in, read everything. Additional sources have been provided above (I've included quite a few in my 06:15, 16 January 2019 (UTC) and 15:30, 4 December 2018 (UTC) posts). If you're challenging the existence of the term, those sources should help. Remember, this isn't an article about scientific polling. Using it in the appropriate context as the Deadline source does is fine, and the additional sources I've led you to show that this isn't some vague, weasel word. --GoneIn60 (talk) 19:26, 30 March 2019 (UTC)
And if you need more than just the Deadline source to justify "scientific" as a relevant description of polls like CinemaScore (or as an antonym of user-generated scores from sites like Rotten Tomatoes), then there are plenty more where that came from. Here's just a handful I found in less than 2 minutes:
How to help fix our terrible discussions of ‘Star Wars: The Last Jedi’ – The Washington Post
Captain Marvel Rotten Tomatoes May Be Right – Cosmic Book News
Bumblebee, Mary Poppins Returns And Aquaman All Have The Same CinemaScore – CinemaBlend
Hope that helps. --GoneIn60 (talk) 19:37, 30 March 2019 (UTC)
I read above and didn't see a complete explanation of why "scientific" isn't weaselly (vague and meant to sound superior), nor how it doesn't leave readers wondering what its actual meaning is. Nor did I see why succinct language to explain that these are polls based on audiences immediately exiting theatres, rather than reviews posted online by the general public at any time, wouldn't be more neutral and more accurate. Thanks for the three additional news links, but the first is behind a paywall, the second merely calls an unrelated Facebook poll unscientific, and the third describes theatre polls as more scientific (different saying they are scientific). These don't help the current wording of this article. Thank you, A145GI15I95 (talk) 20:04, 30 March 2019 (UTC)
The term "scientific poll" is defined very clearly at the sources I've directed you to. Some go pretty in depth and directly explain the contrast between scientific and uncontrolled, online polls. I've explained my stance thoroughly in the discussions above as well, and if there's something specific you disagree with, feel free to bring it up and I'll respond. However, I'm not going to repeat myself because of a lack of using the scroll bar.
As for your take on the most recent sources provided, the second actually states, "Obviously, this isn't a scientific poll or anything, and neither is Rotten Tomatoes..." In the third, it states, "Unlike scores on Rotten Tomatoes or ratings on IMDb, CinemaScore is a more scientific measure of audience opinion on a movie". Calling user-generated polls unscientific is a verifiable claim. Calling CinemaScore and PostTrak scientific is also verifiable (and whether or not it's highly scientific or partially scientific doesn't matter). Sources like these wouldn't define polls as scientific or unscientific if this was vague terminology, by the way, which is at the core of your argument. It's a distinction they've chosen, not us. --GoneIn60 (talk) 21:26, 30 March 2019 (UTC)
Whether certain outside news sources define the general word "scientific" to something more specific than the dictionary definition of the word is immaterial, as readers aren't reading those sources, they're reading this article. Your comment about "a lack of using the scroll bar" does not assume good faith. My interest is not to annoy you; my interest is to improve the readability and neutrality of this article, please. I read above. I saw arguments for and against "scientific", and arguments for "exit polling" but not against it.
May I ask about these ideas, to make the meaning clearer to first-time readers, and to appear less biased? Would these not improve the article, or at least not be of detriment?
  • Audience reception measured by scientific polling methods was highly positive. (current, provided for comparison)
  • A) Audience reception measured by exit polling was highly positive. (recently discounted by one editor above as appropriate only for political polls)
  • B) Audience reception measured by random samples exiting theaters was highly positive.
  • C) Random audiences polled upon exiting theaters received the film quite well.
Thanks again, 21:58, 30 March 2019 (UTC)
The "lack of using the scroll bar" comment can apply to anyone asking an editor here to repeat themselves. You may need to brush up on your understanding of bad faith. Also, there's no requirement to define the use of a term that is commonly used. You may argue that it wasn't common to you, but it is common in secondary sources (academic sources included, not just news). As for the proposals, I wouldn't be comfortable with either one unless a source that mentions their use of random sampling is cited. We already have one that calls them scientific, and it does so specifically in the context of Last Jedi. If you can find an alternate source that satisfies this concern, we can weigh the option of changing to "random samples" or "random audiences". I don't think either is really an improvement over "scientific", but at least a source would allow us to take it seriously. --GoneIn60 (talk) 22:43, 30 March 2019 (UTC)
I don't see evidence to assure that the definition of "scientific" used in this article's Deadline Hollywood source adheres fully to the definitions provided by the outside academic sources you listed far above.
The other concern (apart from weasel wording) is that the quote-unquote scientific polls only get immediate audience reaction, which can change overnight. To note this by the word "exit" or "exiting" would quickly and fairly make this clear.
Regarding the new (B) and (C) suggestions above using the word "random", please see the Deadline Hollywood source's explanation of how CinemaScore can be considered more scientific than MetaCritic, IMDB, and Rotten Tomatoes: Pollsters randomly choose six theaters in six cities (one theater in each city) to get to an ultimate goal of 400 to 600 ballots. It also describes their selection as "statistical", which could be incorporated perhaps as:
D) Statistically selected, random audiences polled upon exiting theaters received the film quite well.
E) Audience reception statistically polled by random samples exiting theaters was highly positive.
F) Statistically selected audiences polled upon exiting theaters received the film quite well.
G) Random audiences statistically polled upon exiting theaters received the film quite well.
H) Audience reception measured by statistical samples exiting theaters was highly positive.
A145GI15I95 (talk) 23:34, 30 March 2019 (UTC)

"I don't see evidence to assure that the definition of "scientific" used..."
So in all of your new proposals, which are becoming long-winded by the way, we see the audience being described as statistically selected, polled, and random. In the academic sources, that's precisely how a scientific poll is defined. Why that isn't evident to you is beyond me. Leaving "scientific" in place seems less convoluted.
"The other concern...is that the quote-unquote scientific polls only get immediate audience reaction, which can change overnight."
Do any sources share that concern? The point of any scientific poll is to identify patterns that are representative of a larger population within an acceptable margin of error. Here, the results are accurate enough to give analysts valuable insight on how a particular film will perform in the following weeks. If they didn't, then studios wouldn't pay top dollar for their services. They aren't always right, of course, but their continued existence and prominence in the industry have endured for a reason.
"Regarding...the word "random", please see the Deadline Hollywood source's explanation..."
Yes, it does have loose support for "random", although technically it's describing the theater selection process and not audience member selection (one does not necessarily translate to the other). I still think a better source is needed, because remember, CinemaScore is not the only poll that needs the support. PostTrak needs it as well. --GoneIn60 (talk) 03:30, 31 March 2019 (UTC)

I'm addressing one sentence: Audience reception measured by scientific polling methods was highly positive. It's cited by one source (Deadline Hollywood). The onus of finding more and better sources for this misleading and poorly sourced sentence lies with whoever defends it.
Its problems are: (1) The word "scientific" sounds weaselly. This could be improved with well-defined terms. (2) The polling only considers immediate audience reaction upon exiting theatres. This has nothing to do with "science", yet our wording implies it does. This could be improved with mention that these are exit opinions.
Regarding concern for brevity, suggestion (H) above is only two words longer than the current sentence (A). Here's yet another suggestion with the exact same number of words as the current:
  • I) Statistical polling of audiences exiting theaters showed positive reception.
A145GI15I95 (talk) 04:17, 31 March 2019 (UTC)
(edit conflict) You appear to have a fundamental misunderstanding of how consensus works on Wikipedia. One such way is through editing. The sentence you are challenging has been in place for over a year (diff). Not only does it have longstanding consensus through editing, but it was actually formed as a result of a drafting process, in which multiple editors participated (See Talk:Star Wars: The Last Jedi/Archive audience response). And if all that wasn't enough, consider that the phrasing withstood an RfC (see Talk:Star Wars: The Last Jedi/Archive 4#RfC: Which version of the Audience response section should we go with?); not one editor objected to the use of "scientific". This isn't some new addition that was immediately challenged. Considering that consensus has already been established, the onus for obtaining a new consensus is on those challenging existing consensus. I like how you dodged my direct answers above, though, and resorted to an incorrect interpretation of policy. Nevertheless, rest assured that your opinion has been duly noted. --GoneIn60 (talk) 04:49, 31 March 2019 (UTC)
Looks like you updated your post as I was responding. Your (I) suggestion is an improvement over the others in terms of grammar, but it drops a phrase that was discussed thoroughly in the past: "highly positive". That would need to remain in some form. Also, I still find "statistical polling" more vague than "scientific polling". I don't think you're going to sway me here, but if others think a change is needed, I'd probably concede to some version of (I). --GoneIn60 (talk) 04:55, 31 March 2019 (UTC)
Actually, we need to take SurveyMonkey into account as well. Their research team conducted the survey for Last Jedi and released its results to Mashable. I couldn't find verification that it was conducted on site as an exit poll. It may have been administered to respondents in a different fashion. So specifying "exiting theaters" wouldn't necessarily cover all 3 polling methods without additional sourcing. --GoneIn60 (talk) 05:03, 31 March 2019 (UTC)
When I earlier cited AGF, I should've cited Wikipedia:Civility; please make fewer insinuations (scrolling, long-winded, dodging, beyond me, rest assured). I'm following Wikipedia:BRD if these concerns and suggestions are new. The dictionary definition of "statistical" is more specific than "scientific". I'm not opposed to the word "highly". I'm unsure of Survey Monkey, as they're not mentioned in the problem sentence's source. I'd love to hear from more parties. Continuing to brainstorm with you:
  • J) Statistical polling of audiences exiting theaters showed highly positive reception.
  • K) Statistical polling of confirmed theater-goers showed highly positive reception.
Thanks, A145GI15I95 (talk) 05:34, 31 March 2019 (UTC)
Do yourself a favor and quit alluding that I'm somehow in violation of WP:CIVIL or WP:AGF. Be aware of WP:AOBF and WP:AAGF, and understand that false or questionable accusations can backfire. There is a proper venue to discuss those concerns, and this talk page is not one of them.
"Statistical" is defined to be "of, relating to, based on, or employing the principles of statistics" (courtesy of Merriam-Webster). Statistics, which can refer to a collection of data, doesn't always represent random, controlled sampling. Therefore, "statistical" is vague in this context and doesn't seem to sufficiently replace "scientific". One definition of "scientific" is "conducted in the manner of science or according to results of investigation by science: practicing or using thorough or systematic methods". The phrase "systematic methods" stands out here. Labeling a method "scientific" implies that it went through some systematic process to reach its conclusion. I'm glad you brought up dictionary definitions, because it helps paint a better perspective. (J) suffers from the same sourcing problem as I noted about (I), and (K) seems to be taking steps backward. I have no further comment at this time. --GoneIn60 (talk) 06:20, 31 March 2019 (UTC)
"Statistical" is a more accurate and meaningful term than "scientific", because statistics are definitively a specific branch of science and mathematics. See Oxford's "the practice or science of collecting and analysing numerical data in large quantities", AHD's "the mathematics of the collection, organization, and interpretation of numerical data", and Merriam-Webster's "a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data". Whereas "science" refers broadly to various states and methods of "knowledge".[1][2][3]
I don't see how suggestion (K) "steps backward". It says "statistical" instead of "scientific" and "confirmed theater-goers" instead of "audiences", both of which are more specific and accurate to our sources, and more meaningful to the common reader. And it's the same length as the current wording.
"Do yourself a favor": This is another example. I mean no allusion; I intend to make no report. I'm merely reminding you, please, to be polite, as printed at the top of this page. I'm not the first to question your word choice on this page. Please know we're all in this together. I haven't meant to beleaguer, but rather to answer your questions. This conversation would benefit from voices beyond yours and mine. May the Force be with us, A145GI15I95 (talk) 19:25, 31 March 2019 (UTC)
Our opinions differ about "statistical", and that's fine. Guess we'll have to agree to disagree. The phrase "confirmed theater-goers" grates very hard on the ears like fingernails on a chalkboard. In terms of grammar, it's a step backward.
Keep in mind that the act of reminding someone more than once about behavior is an insult in itself and does nothing but add fuel to the fire. In my 9-year tenure, I have yet to be reprimanded for anything behavioral, and by the looks of it, you've been here barely 8 months. I think that speaks for itself, nuff said. --GoneIn60 (talk) 20:40, 31 March 2019 (UTC)
It's by dictionary definitions (cited above), not opinion, that "statistical" has more specific meaning than "scientific". "Confirmed theater-goers" is a grammatically sound adjective and noun. The second word can be spelled sans hyphen, if preferred (AHD, Collins, M-W). Again, it's not my aim to insult or "fuel fire". Requests for politeness are permitted and preferable to reports; age of account isn't a factor in manners. I'd rather focus on content, and I'd rather hear a diversity of voices beyond yours and mine.
May I ask again, please, could others reply to this suggestion? It's for the Audience reception section, first sentence, with one source. Intention is to provide more clear, accurate, and neutral meaning to readers.
  • Audience reception measured by scientific polling methods was highly positive.
  • Statistical polling of confirmed theatergoers showed a highly positive reception.
Thank you, A145GI15I95 (talk) 16:20, 1 April 2019 (UTC)

Just reiterating my opinion from the previous discussion. If Deadline Hollywood is the only source that called CinemaScore a "scientific poll", we probably shouldn't be using that term. It's an exaggeration at best. While it's certainly more scientific than RT, CinemaScore is not a genuine scientific poll and I don't think they attempt to be. Scoundr3l (talk) 17:58, 8 April 2019 (UTC)

As I mentioned once upon a time in this wall of text, "scientific" probably isn't the best descriptor, and I don't think anyone is saying that it can't be replaced with something better. But for now, it appears to be the best option among those suggested, and it has a source to back it up. We can always take future suggestions (with sources) under consideration. The bottom line is that, "past discussions have resulted in a preference to differentiate between reliable and unreliable polling methods described in this section", so we need to maintain that distinction. Any change that removes or weakens that distinction should be avoided. --GoneIn60 (talk) 00:12, 9 April 2019 (UTC)
I think we're all trying to maintain the distinction, but a line can be drawn where it gets inaccurate and akin to puffery. I think that's where we are. CinemaScores is marketing research. By their own admission, their scores are high. They target a very narrow demographic in order to measure financial and advertising success, not overall quality. Surely we can find an accurate description which still differentiates it. Maybe "According to moviegoers polled at release" followed by a detailed description of the polling if necessary, which can even include "more scientific" than online polling if it helps. Another idea. Scoundr3l (talk) 19:00, 9 April 2019 (UTC)
As described in this source (that I've now listed for the 3rd time) as well as in other sources I've posted, there are two key factors that determine if a poll is scientific: the participants cannot be self-selected, and they should also be chosen at random. All three polls – CinemaScore, PostTrak, and SurveyMonkey – meet that criteria. While it can be debated as to whether or not sampling methods used can be more or less scientific, that's an opinion. I don't want to engage in opinions that are unsupported by reliable sources.
Also, there's another point that hasn't been considered. The gathering of data is only one aspect of a scientific poll. How that data is analyzed is a factor as well and key to producing accurate results. CinemaScore, for example, uses an "extremely complex statistical analysis" according to an LA Times article, that also gives some backstory on the company's founder and describes the sampling methods in detail. Another source confirms the same but also verifies that the poll has a margin of error of 6%. That's a key revelation, because scientific polls are often defined by an expected margin of error. That source also mentions PostTrak, verifying that it too goes through a similar, rigorous process that some in the industry prefer over CinemaScore. SurveyMonkey is fairly new to the game, but I found their self-described summary aptly fitting here: "SurveyMonkey employs a team of survey methodologists—scientists who study surveys, polling, public opinion, and data collection."
It is a fact that these polls are considered scientific, so I disagree that we are crossing a line into inaccurate, puffery territory. --GoneIn60 (talk) 13:12, 10 April 2019 (UTC)
If the main concern is that "scientific polling methods" needs more clarification, then we could try something like:
Audience reception measured by scientific polls, which draw conclusions based on statistical analysis, was highly positive.
We could cite both the NCPP and LA Times sources from above for that addition, and we could also change "analysis" to "methods" if needed. --GoneIn60 (talk) 13:44, 10 April 2019 (UTC)
...and actually, this NY Times source cited earlier in the discussion could be used for that as well. From the source:
"Professional pollsters use scientific statistical methods to make sure that their small random samples are demographically appropriate to indicate how larger groups of people think." --GoneIn60 (talk) 14:19, 10 April 2019 (UTC)
It appears there are more editors against the current "scientific" wording than for it. A145GI15I95 (talk) 18:34, 10 April 2019 (UTC)
2-1 in one of many discussions is not consensus if that's what you're implying (refer to my response above on 04:55, 31 March 2019 if needed). This is a discussion about finding something better, not simply abolishing "scientific". Has that been accomplished? I don't think so. --GoneIn60 (talk) 12:05, 11 April 2019 (UTC)
Gone, I know you know what Original Research is, so please stop waving a Word document around and telling us all why you think it qualifies as scientific. You're not a reliable source. You know that. Going to a movie release and surveying only the people leaving the theatre on that night is not a random, scientific poll. That's targeted marketing research, it skews high, and that's exactly what the source you provided says it does. But it doesn't matter what you and I think about it, because the entire basis for including this phrasing is on the strength of one source, Deadline Hollywood, which is an infotainment site. Other editors think it's a bold claim and it needs more support. If it's true, it should be easy to support. If we can't, we should reword it. Scoundr3l (talk) 21:00, 12 April 2019 (UTC)
First off, the document was published by the NCPP, and has been cited in other articles on Wikipedia (though it's not being used here). The only reason for its use in this discussion was to provide a definition for "scientific poll". It wasn't the only source provided either. The definition came into play, because several editors including yourself questioned whether or not the Deadline source was accurate in its claim of "scientific". Others even questioned if such a term existed. These discussion sources are not currently being cited in the article; they're only being used to discuss the claim. WP:OR doesn't apply here. --GoneIn60 (talk) 11:36, 13 April 2019 (UTC)
Here is a list of sources that support "scientific polling method":
  • "The scores dramatically improve when we get to the more scientific polling, such as those from comScore and CinemaScore..." – Boxoffice
  • "...with a Cinemascore rating of A...It's a very scientific approach..." – Southern California Public Radio
  • "Unlike scores on Rotten Tomatoes or ratings on IMDb, CinemaScore is a more scientific measure of audience opinion on a movie." – CinemaBlend
  • "It's a scientific method that CinemaScore has been perfecting for decades." – Syfy Wire
  • "Even if you ignore the critical response to The Last Jedi and look instead the aforementioned Cinemascore...or other scientific audience exit polls" – ComicBook.com
  • "But their low ratings don’t jive with other, more scientific data." (in reference to CinemaScore and PostTrak) – Associated Press
  • "The more scientific CinemaScore..." – Rutland Herald
  • "PostTrak and CinemaScore are the scientific polled exits to go by..." – Another one from Deadline
  • "Since CinemaScore tracks reviews from audiences as they literally exit the theaters, it's a much more scientific form of polling." – Insider
  • "...the randomness of the sampling is less rigorous than the information collected by CinemaScore." (in contrast to RT, MC, and IMDB audience scores) – Vox
And for good measure, an additional definition of a scientific poll:
  • "A scientific, nonbiased public opinion poll is a type of survey or inquiry designed to measure the public's views..." – Gallup
If the problem is that you're not comfortable with only having that single Deadline source cited, well there's plenty to choose from if you'd like one or two more. Cheers! --GoneIn60 (talk) 19:07, 13 April 2019 (UTC)
Seems like an obvious solution would be to change it to "more scientific" as the sources prefer. Or even "using scientific methodology", which I admit seems trivial, but that's perhaps what makes it a fair compromise. It still creates a strong distinction but leaves a healthy gap between an unbiased "survey designed to measure the public's views" as Gallup defines a scientific poll, and marketing research of "the people most eager to like a new film" used to measure movie performance, as Cinemascores defines itself. Either would be better, in my opinion, though not perfect. Scoundr3l (talk) 19:43, 15 April 2019 (UTC)
I don't see how we can use "more scientific" in the opening statement, and I'm not sure there's really a difference between "scientific polling method" and "scientific methodology". I'm open to change, but not just for the sake of change. There needs to be a noticeable improvement. --GoneIn60 (talk) 23:28, 15 April 2019 (UTC)
At this point I'd like please to restate my last proposal, as a possibly imperfect improvement: Statistical polling of confirmed theatergoers showed a highly positive reception. Thank you, A145GI15I95 (talk) 23:37, 15 April 2019 (UTC)
Statistical polling is not an improvement and violates what sources actually say - none of them use that term. Gone’s proposals are far superior, but the current wording is better than all of them. Toa Nidhiki05 23:47, 15 April 2019 (UTC)
"Statistical polling…violates what sources actually say - none of them use that term": This is incorrect. The single source for our "scientific" claim says:

These are scientific, statistically accumulated audience exit polls that … They literally have a Coca-Cola-like statistical formula that…

A145GI15I95 (talk) 03:22, 16 April 2019 (UTC)
Notice that when describing the exit poll, it calls it "scientific, statistically accumulated" not "scientific, statistical". It labels the formula statistical, but the poll itself is first and foremost labeled scientific. We also now have 9 more sources above to look at. It is predominantly described as "scientific" with no mention of the term "statistical" in any of those sources. Statistical isn't a bad descriptor necessarily, but it's too vague in this context to exist on its own. --GoneIn60 (talk) 08:11, 16 April 2019 (UTC)

The semantic hair-splitting seems to come and go as it's convenient. Are you seriously arguing that these polls aren't statistical? If you can see the difference between "statistical poll" and "polls using statistical formulas", then I'm sure you can see the difference between "scientific polls" and "polls using scientific methodology". Why is one an important distinction and the other a trivial change? 'Statistical' may be too vague but 'scientific' is too specific, so let's split the difference somewhere. Scoundr3l (talk) 20:52, 19 April 2019 (UTC)

Or we can use the term reliable sources use, which is scientific. Toa Nidhiki05 21:07, 19 April 2019 (UTC)
Reliable sources also call them 'exit polls', 'statistical', and 'more scientific' which are all alternatives on the table that have been rejected. Hence, discussion. Scoundr3l (talk) 21:18, 19 April 2019 (UTC)
No one rejected "more scientific". I said I don't see how it can be used. If you have a suggestion that doesn't make the opening sentence seem overly complex or convoluted, feel free to share. --GoneIn60 (talk) 14:23, 20 April 2019 (UTC)
Scoundr3l, the "semantic hair-splitting" goes both ways. Originally, you didn't like only having 1 source for "scientific". Now we have 10. Then you tried to split hairs by saying "more scientific" was a preferred choice over "scientific", maybe because you don't want to feel like you invested all this time for nothing. And now you're saying it's the opposition splitting hairs? I'm not buying it, and this moving of the goal posts is unreasonable.
By the way, your statistical vs scientific comparison above doesn't really make sense. The article states "scientific polling method", and in your comparison, this would be most equivalent to "polls using scientific methodology" or "polls using statistical formulas". So the contradiction you were trying to depict doesn't exist; I would tend to favor the same option in both scenarios (though I don't actually like the term "statistical" on its own). We have sources that directly call the polls scientific, and others that call the methodology scientific. We don't have any sources that directly call the polls statistical. However we do have 1 out of 10 that calls the methodology statistical. So at best, we could say "statistical polling method", but that would only have the support of a single source and, in my opinion, would make the statement more vague. --GoneIn60 (talk) 14:23, 20 April 2019 (UTC)
Regarding how to say "more scientific", we could change "Audience reception measured by scientific polling methods was highly positive…" to "Audience reception measured by more scientific polling methods was highly positive…" Regarding why: when we say one thing is scientific and don't say the same for the other, we imply the one is flawlessly scientific and the other is completely non-scientific. It's like saying someone is older rather than old. This would improve accuracy and reduce bias. We could also change the following "Audiences randomly polled by…" to "Confirmed audiences randomly polled by…" to improve further the meaning imparted to our readers. A145GI15I95 (talk) 15:50, 20 April 2019 (UTC)
I strongly disagree with this, as it de-emphasizes the fact that reliable sources state: RT and Metacritic are not scientific polls, and CinemaScore, SurveyMonkey, and PostTrak are. As far as I'm concerned, we have 10 reliable sources confirming "scientific" as wording used to describe them. It would be original research to downplay reliable sources because a few editors on the talk page don't like the word "scientific". The current wording is accurate and supported by reliable sources; none of the other proposals meet that. Toa Nidhiki05 15:54, 20 April 2019 (UTC)
The reader sees a claim that one group is scientific, and one group isn't, with no text to explain what merits the claim, and only a single, questionable source (a poppy movie blog) as the source. A145GI15I95 (talk) 16:01, 20 April 2019 (UTC)
If your concern is sourcing, pick three of the ten sources or so and add them. Not sure how you are still arguing about sourcing here. Toa Nidhiki05 16:03, 20 April 2019 (UTC)
Only one source in the article claims the second group is "unscientific", and it's a poor source. The suggestion to add "more" and "confirmed" would easily improve meaning and reduce bias, without drawback. A145GI15I95 (talk) 17:08, 20 April 2019 (UTC)
Are you seriously trying to argue RT and Metacritic are scientific polls because you don’t think sources say they aren’t? You really need to rethink that argument. Toa Nidhiki05 17:33, 20 April 2019 (UTC)
That's not my argument. A145GI15I95 (talk) 17:50, 20 April 2019 (UTC)
The use of "more" implies a modicum of science. Probably best not to use that word. DonQuixote (talk) 18:04, 20 April 2019 (UTC)
"Only one source in the article..."
You keep arguing along these lines, but it's unclear why. We have 9 other sources now in this discussion. OK, so they're not in the article yet, but so what? That doesn't mean you get to ignore them. Ignoring them is akin to putting on blinders and shows an intent to be disruptive.
"Regarding how to say "more scientific", we could change...scientific...to...more scientific..."
Really? You took that much time to point out the obvious? No, that doesn't work, and when I said I couldn't see how it could be used, yes I already did the obvious substitution in my mind. Please don't waste our time here. I'm asking politely. --GoneIn60 (talk) 20:17, 20 April 2019 (UTC)
Sources on a talk page are of no benefit to readers of the article. A145GI15I95 (talk) 21:38, 20 April 2019 (UTC)
If you aren't going to be serious, there is no reason to continue this conversation. Toa Nidhiki05 23:46, 20 April 2019 (UTC)
Well, duh! Sources can be added to the article at any time. We are discussing proposed changes, a future state version of the article that may or may not include the additional sources listed above. If you don't understand this, you may want to read over WP:TALK#USE, and if you're still not getting it, you may want to seek HELP. --GoneIn60 (talk) 14:29, 21 April 2019 (UTC)
Readers come to the article, find a vague claim with a single, poor source. We've stated this concern, offered suggestions to reword, and answered questions, genuinely and without sarcasm. A145GI15I95 (talk) 19:06, 21 April 2019 (UTC)
We've offered nearly a dozen more citations since then. If your issue is truly with sourcing, add three of them to the article and the problem is solved. Toa Nidhiki05 19:30, 21 April 2019 (UTC)
Indeed. WP:SOFIXIT applies. Either add it yourself, or quit complaining. oknazevad (talk) 20:16, 21 April 2019 (UTC)
"The WP:BURDEN to demonstrate verifiability lies with the editor who adds or restores material", not those who dispute the current wording, "and it is satisfied by providing an inline citation to a reliable source that directly supports the contribution". Also, Template:Collapse "should never be used to end a discussion". A145GI15I95 (talk) 17:49, 22 April 2019 (UTC)
WP:BURDEN has already been satisfied. As the other editors have pointed out, there are nearly a dozen sources presented in this discussion. If you're unhappy with the source that's currently cited, then it's up to you to change it to one that satisfies you out of the ones already presented. DonQuixote (talk) 18:20, 22 April 2019 (UTC)
A145GI15I95, I would advise being careful not to cross into WP:IDHT territory. A refusal or failure to "get the point" and heed advice from the community may be a sign that you need to disengage. I went ahead and added two more sources to the article for now, since it appears you didn't have any interest in doing so yourself. Time to move on. --GoneIn60 (talk) 03:51, 23 April 2019 (UTC)
No need to ping, I'm subscribed, and these concerns have come from the community. If we're recommending reading, I might also offer WP:BULLY. A145GI15I95 (talk) 16:42, 23 April 2019 (UTC)

Cinemascore calls themselves scientific, and some unscrupulous articles have picked up this phrase, but this is their marketing and not reality. They have not published their methodology in peer reviewed journals, and have not been subjected to any serious scientific scrutiny. I think A145GI15I95's proposals seem very reasonable. AfD hero (talk) 04:19, 24 April 2019 (UTC)

Back when we formed this section, you were OK with this version, which contained "scientific polling methods". So I just want to verify... Are you changing your stance, and if so, which of the 11+ proposals from A145GI15I95 do you favor? Also, I find it extremely odd that after being away from Wikipedia for almost an entire year, this is the very first contribution you make on your return. Interesting. --GoneIn60 (talk) 01:21, 25 April 2019 (UTC)

New information

This information added by Toa Nidhiki05 isn't at all needed. There are always going to be people who state that the film was a target of tampering with regard to its user scores. The bit that Toa Nidhiki05 added isn't new information. It's not like the source is reanalyzing what took place the previous year. The source is simply stating what others stated the previous year. So stating that "However, a year later, it was revealed that The Last Jedi had also been targeted.", as though it is some reanalysis of the matter that has taken the matter beyond speculation, is misleading. Instead of "Several reviewers speculated that coordinated vote brigading from internet groups and bots contributed to the low scores.", we could state "Several reviewers stated that coordinated vote brigading from Internet groups and bots contributed to the low scores." And then remove the redundant piece by Toa Nidhiki05. Toa Nidhiki05 doesn't like the material ending on the Rotten Tomatoes response, but there is really nothing else to state after that. UpdateNerd, your thoughts since you changed Toa Nidhiki05's wording? GoneIn60, your thoughts? Flyer22 Reborn (talk) 15:29, 8 March 2019 (UTC) Flyer22 Reborn (talk) 15:31, 8 March 2019 (UTC)

Did you even read the article? It’s not “people”. It’s a Rotten Tomatoes spokesperson saying their previous statement on whether the score was accurate or not is wrong and that multiple films, most prominently The Last Jedi and Black Panther, were targeted by serious vote brigading attempts. This is undeniably worth including, both on its own merit and because of the fact that it would be misleading for Wikipedia to have outdated information. Toa Nidhiki05 15:35, 8 March 2019 (UTC)
I was just about to state that since it's a Rotten Tomatoes spokesperson stating it, this does contrast the current Rotten Tomatoes statement and we should perhaps keep it. But we should go with the "a Rotten Tomatoes spokesperson" wording that you used instead of the WP:Weasel "it was revealed" wording. Flyer22 Reborn (talk) 15:37, 8 March 2019 (UTC)
I didn’t use weasel wording, that was from some other editor. My addition was:

A year later, following a review bombing campaign against Captain Marvel, a Rotten Tomatoes spokesperson said that The Last Jedi had been “seriously targeted” with a review bombing campaign.[1]

That was then modified by User:UpdateNerd into what is there now. I don’t like that user’s edits, frankly, but didn’t want to start an edit war. Toa Nidhiki05 15:42, 8 March 2019 (UTC)
Yeah, I saw what wording you used. That's why I stated that I prefer your "a Rotten Tomatoes spokesperson" wording. Sorry about my initial statement above. I've crossed through it. And I took the scare quotes out of the heading. Flyer22 Reborn (talk) 15:49, 8 March 2019 (UTC)
That’s alright, I can see why that might be confusing. I’d obviously agree to reinstate my own wording if that’s what you’d like, lol. Toa Nidhiki05 16:08, 8 March 2019 (UTC)
As long as it's properly sourced, the most balanced mention with no extraneous/editorial explanation is usually the best. I don't see how mentioning Captain Marvel or Thor or any other movie besides The Last Jedi is relevant, unless it's worded differently to explain how they are relevant. UpdateNerd (talk) 17:13, 8 March 2019 (UTC)
I liked the original wording, for the most part. I don't think mentioning Captain Marvel was a problem, but I agree that it doesn't need it. I would suggest changing it back to "Rotten Tomatoes representative" but dropping "However" (WP:HOWEVER), as that implies some relationship with their previous statement that isn't made clear in the source. Scoundr3l (talk) 18:25, 8 March 2019 (UTC)
Based on this source alone, I would agree with Scoundr3l's assessment that we should avoid wording that implies a relationship between the old and new statements. I'm also in favor of the original wording provided by Toa Nidhiki05, and it appears that "However" wasn't in there to begin with; "However" should be removed. In addition, I don't see an issue with including Captain Marvel (and Thor from a previous statement). Both provide context that is supported by their respective sources. Sure, this is an article about Last Jedi, but that doesn't mean we can't provide helpful context. --GoneIn60 (talk) 18:48, 8 March 2019 (UTC)
GoneIn60 got my idea right here. The reason to mention Captain Marvel is because that’s what sparked RT to say there was actually vote brigading at play here; I’d even go further and mention Black Panther, to give context that multiple films have been targeted, not just The Last Jedi. But I could see why we wouldn’t mention that narrowly here. Toa Nidhiki05 19:14, 8 March 2019 (UTC)
Basic copyediting words like "however" don't need to be pulled from the source; it's just there for flow purposes. This is indeed a reversal from when they said there was no tampering, and it's better than POV phrases like "in fact" or "actually".
Also, particularly in light of this new development, I don't see how the confusing statement about Thor: Ragnarok has any relevance now. All the specific titles could do with a mention on the vote-brigading article, but this is about the Star Wars movie. UpdateNerd (talk) 19:22, 8 March 2019 (UTC)
Well, "in fact", "actually", and "however" are all basically discouraged. In this case, the problem with 'however' is that it uses two different statements to imply a reversal that isn't sourced. While it does seem to contradict their previous statement, it's not our place to try to characterize their position or imply parity between the two statements. Verge doesn't tell us who said it, what specifically was said, if they are aware of the other claim, or how their statement is meant to be taken in conjunction/rebuttal to their previous statement. It may be a reversal, a rumor, or they may even be talking about two different things. Only they can really clarify that, not us (see conflict between sources). Besides, I think it flows just fine without it and allows the reader to make up their own mind. Scoundr3l (talk) 21:59, 8 March 2019 (UTC)
Again, well said. I couldn't agree more. I think it's pretty clear there's adequate support for some form of the original wording, so here's what I'd propose:
A year later, in the midst of negative review activity on Captain Marvel, a Rotten Tomatoes spokesperson said that several films including The Last Jedi had been "seriously targeted" with review bombing campaigns.
or without mentioning Captain Marvel...
In early 2019, a Rotten Tomatoes spokesperson said that several films including The Last Jedi had been "seriously targeted" with review bombing campaigns.
In the first suggestion, I thought it was best to rephrase so that "review bombing" didn't appear twice. In the second, I left it pretty much intact without the Captain Marvel context, but I felt "In early 2019" was better than using an indefinite article twice in a row (i.e. "A year later, a Rotten Tomatoes spokesperson"). In both suggestions, I added the clarification of "several films", because this informs the reader that this wasn't some revelation focused keenly on Last Jedi. Instead, it was a culmination of activity on several films that prompted the statement.
Open to further suggestions of course! I prefer to retain the Captain Marvel context, but I'm willing to accept a version of either. Thoughts? --GoneIn60 (talk) 04:58, 9 March 2019 (UTC)
WP guidelines are just that, and not "policies" that must be followed. There are exceptions, and I think two contradictory statements are the perfect use-case. Otherwise it may be confusing and only make sense after reading twice (that's bad form if it can be avoided). Otherwise, placing the new information into a footnote would be more logical and self-consistent by actually acknowledging that two ideas, not one, are being presented. UpdateNerd (talk) 05:06, 9 March 2019 (UTC)
I think everyone here is well aware of the differences between guidelines and policies, so let's not drag this discussion into the weeds. Do you have any feedback on the suggestions above, or do you want to present one of your own? Trying to push this forward to a solution or compromise that most are comfortable with. --GoneIn60 (talk) 05:14, 9 March 2019 (UTC)
I was simply defending the current version. I don't assume everyone knows the nuances of how to balance guidelines, policies, etc. since it's all subjective, but in my view our goal should be to phrase things so a first-time reader understands more and isn't confused. UpdateNerd (talk) 16:40, 9 March 2019 (UTC)
(edit conflict) With this edit (followup fix here), I re-added Toa Nidhiki05's original wording, but without comparison to other films. Like I stated, using "was revealed" is wording that might get the sentence tagged with a Template:By whom tag. And we should be clear that it's a Rotten Tomatoes spokesperson who stated it. I took "review bombing" out of scare quotes per WP:Scare quotes. The sentence now reads as: "A year later, a Rotten Tomatoes spokesperson said that The Last Jedi had been 'seriously targeted' with a review bombing campaign." Feel free to tweak it, of course. I don't see that we need to state "In 2019" since "A year later" clearly means "In 2019." I also think that "A year later" flows better. Flyer22 Reborn (talk) 16:51, 9 March 2019 (UTC)
In light of the restoration of original wording, I would only suggest we replace the word "said" with "revealed", since it is now clear that it's attributed to a RT spokesperson, and that word would be more suited to the context of the information presented. UpdateNerd (talk) 16:56, 9 March 2019 (UTC)
I don't see that "In early 2019" or any indication that it was stated early in the year is needed either. Flyer22 Reborn (talk) 16:54, 9 March 2019 (UTC)
On second thought regarding "In 2019," I understand the need to use it since the previous sources are from December 2017. Of course, math-wise, people shouldn't think we mean a few months later in 2018, but people often use "a year later" broadly or rather in an "it's this year now" context. Flyer22 Reborn (talk) 17:02, 9 March 2019 (UTC)

Two thoughts. Firstly, using the 2019 date is choppy, but is needed for accuracy. Secondly, and far more importantly, the "however" should definitely be restored, because the entire point of the sentence is that RT now admits that the viewer score for TLJ was manipulated, whereas they previously denied that. Without the word to represent the contrast, the point is damaged if not lost. This is one place where an exception to the guideline needs to be invoked or else the encyclopedia is worse. oknazevad (talk) 17:49, 9 March 2019 (UTC)

This is where things get a little dicey. A closer examination of RT's statement on 12/20/17 literally has them saying that they "haven’t seen anything unusual" and "don’t see any unusual activity". In other words, they didn't detect anything. This is a carefully-worded, subtle denial only days after the film's release. For all we know, they may have detected it later on, or they uncovered a new method that allowed them to detect it. Either way, without further elaboration from RT, it's hard to say for sure if this is a direct or stark contradiction to what they're saying now. That's why "however" remains questionable in my mind. --GoneIn60 (talk) 18:04, 9 March 2019 (UTC)
Definition of 'however', from google: "used to introduce a statement that contrasts with or seems to contradict something that has been said previously."
I think this falls in the exact category. It's not a matter of using weasel words to tell the reader what to think; it's including a word that would be expected, and looks wrong without it. The guidelines allow exceptions as long as we have gone through all the alternatives and decide using the word improves readability. UpdateNerd (talk) 18:12, 9 March 2019 (UTC)
"it's including a word that would be expected..."
That's debatable. The more obvious the contrast, the more you would expect to see a word like "however". The point I've illustrated is that the contrast between statements isn't definitively obvious in the sources, so we shouldn't be implying it here. It's plausible that they didn't have evidence from the first few days, but they uncovered some in the weeks that followed. We simply don't have enough information to draw or imply any conclusions here. If the source would have focused on the contradiction, then I'd be more apt to support its inclusion, but that's not the case. --GoneIn60 (talk) 18:23, 9 March 2019 (UTC)
Yet the two sources clearly disagree, with one saying RT said there was no unusual activity, and the other saying they were targeted with "review-bombing". Polygon, written before either RT statement was made, makes it more clear there was an issue all along, so maybe RT just didn't want to get involved. I think we should replace the Screen Rant (comparatively poor source) with the Polygon one and include that perspective before mentioning RT's denial. UpdateNerd (talk) 18:36, 9 March 2019 (UTC)
Well, after reading the two sentences back-to-back several times, I'm beginning to see how it does seem to be missing a transition of sorts. I think your earlier suggestion to change "said" to "revealed" would soften the blow a bit. That might be a good start. I'm not sure about the impact of replacing the source, so let's see what others think. --GoneIn60 (talk) 18:44, 9 March 2019 (UTC)
Except that after finding a source that was claiming review-bombing all along, nothing was actually revealed per se. The word "admitted" would probably be more telling, but it's a weasel word not reflected in the source. UpdateNerd (talk) 18:53, 9 March 2019 (UTC)
Regarding the Thor: Ragnarok/Bleeding Cool info you just reverted my change to, my main problem is that it states that reviews "tapered off up to that point" without explaining when/what is meant by that. Can you explain or do you have a fix? My eyes get stuck on that every time I read it because of confusion, but it also really doesn't seem relevant. UpdateNerd (talk) 18:56, 9 March 2019 (UTC)
I rephrased both bits just now so let me know how that reads to you or if you see any outstanding issues. Thanks, UpdateNerd (talk) 19:02, 9 March 2019 (UTC)
(edit conflict) There were a lot of sources opining that review bombing or vote brigading was taking place (hence the presence of the Quartz and BleedingCool statements). These were opinions, however, based on circumstantial observation. Internal RT analysis has an advantage over these opinions. They have access to raw data (user accounts, IPs, history, etc.) that could expose fraudulent behavior that a simple observer wouldn't have access to. However, they are also a biased source, so anything they provide publicly should always be taken with a grain of salt.
Also it's important to note that past discussions decided to include RT's response at the time to provide balance (as this response was picked up and analyzed by a slew of sources as well). Before removing or seriously modifying this statement, it should be discussed beforehand. The 2019 statement, by the way, is still a revelation that RT eventually detected review bombing; there is just too much uncertainty about what that really means. Were they hiding it? Was it simply a new discovery after 12/20/17? Is this analysis based on the same set of data that the original analysis was based on? We just don't know. --GoneIn60 (talk) 19:12, 9 March 2019 (UTC)
I think it's good enough as is for now, with those minor changes I just made. UpdateNerd (talk) 19:18, 9 March 2019 (UTC)
I think your recent changes, which are essentially just light copyediting moves, are fine. However, I would just advise that we take our time and let other editors weigh in for more drastic changes on older content. I'm not saying you're incorrect, but we don't need to be in a rush either. There are quite a few editors that held a stake in this section's formation. Thanks. --GoneIn60 (talk) 19:21, 9 March 2019 (UTC)

____

References

  1. ^ Robertson, Adi (March 7, 2019). "How movie sites are dealing with review-bombing trolls". The Verge. Retrieved March 8, 2019.