Wikipedia talk:Articles with the most references

WikiProject Wikipedia (Project-class)
This page is within the scope of WikiProject Wikipedia, a collaborative effort to improve Wikipedia's encyclopedic coverage of itself. If you would like to participate, please visit the project page. Please remember to avoid self-references and maintain a neutral point of view, even on topics relating to Wikipedia.
This page does not require a rating on Wikipedia's content assessment scale.

Intervals

Just wanted to say my piece: I find the current interval labeling to be inelegant and unhelpful. The reference counts are not evenly distributed; the distribution looks more like the tail of a Poisson distribution. I attempted to remedy this but was reverted for a reason I do not find compelling. However, I am uninterested in a battle over this issue, so I'll simply leave at this point. A waste of time. Praemonitus (talk) 23:25, 22 April 2013 (UTC)

Stats on growth of references?

Hi there, very interesting list and Wow! A lot of references per article...  :-)

  • I am looking for a stat, report, or study about the growth of references in enWP in total, for the last year or something similar. I think I read somewhere about an evaluation of this kind, maybe for references or for external links or citations. Unfortunately I can't find it anymore and don't remember the place or exact subject. I would be very thankful if somebody could point me to something!
  • On German Wikipedia, users goiken and Svebert looked at the growth of "<ref" in the article namespace between 10 April 2011, 29 October 2011 and 13 April 2012 (based on dumps, reported here: [1], [2], [3]). They found that references grew by 36.4% (from 2.6 to 3.6 million), while the number of German WP articles grew by 13.5%. The average number of references per article was 2.1 in 2011 and 2.5 in 2012.
  • OK, wait, I found some very old stats from enWP of 2008/2009: User:Dr_pda/Article_referencing_statistics and User:WolterBot/Cleanup_statistics. "the average number of citations per paragraph was 2.07 for FA, 2.06 for GA, 0.87 for A class, 0.51 for B, 0.26 for Start, 0.14 for Stub and 0.15 for Unassessed articles. This was out of a total of 2,251,862 articles (disambig pages and obvious lists excluded); 1,625,072 (72%) had no refs" [4].
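The figures quoted above can be sanity-checked with a little arithmetic. This is only a rough sketch using the numbers as stated in this thread; the variable names are made up for illustration.

```python
# Figures quoted above for German Wikipedia (2011 to 2012).
ref_growth = 0.364       # references grew by 36.4%
article_growth = 0.135   # articles grew by 13.5%
refs_per_article_2011 = 2.1

# If references and articles grow at those rates, the average number of
# references per article scales by the ratio of the two growth factors.
refs_per_article_2012 = refs_per_article_2011 * (1 + ref_growth) / (1 + article_growth)

# Figures quoted above for English Wikipedia (2008/2009).
total_articles = 2_251_862
articles_without_refs = 1_625_072
share_without_refs = articles_without_refs / total_articles

print(f"{refs_per_article_2012:.1f}")  # 2.5
print(f"{share_without_refs:.0%}")     # 72%
```

Both results agree with the reported values (2.5 references per article in 2012, and 72% of articles without refs).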

It would be great to have some recent stats! Gathering them might be more complicated now, with RefToolbar cite tools and Lua-based templates. Any ideas or links to other evaluations? --Atlasowa (talk) 13:56, 23 April 2013 (UTC)

Australian Diarists Article

The article was split into three different pages and the references are now spread across those three, so its entry should either be updated to reflect the split or be removed from the list. -glove- (talk) 00:38, 27 October 2016 (UTC)

Propose consolidating the list and other section

It makes no sense to divide the articles into these two separate sections. We can retain a "type" indicator showing which entries are lists and which are other articles, but we shouldn't split them into two different sections. Terrorist96 (talk) 19:29, 10 July 2021 (UTC)

  • I think the current arrangement is fine and logical, but "Lists" should follow "Other articles". ili (talk) 15:57, 30 August 2021 (UTC)

MOSTREFS bot

Hello. I am encountering a few different scenarios while creating the required parameters. In the first scenario, if an article has 10 unique refs/cites that are used 20 times in total, the result comes out as 20 refs used in the article, with a margin of error of 2-5% (due to varied and inconsistent ref/cite formatting). In the second scenario, the same 10 refs/cites give a result of 10, with a margin of error of 10%. In both scenarios, sources without "cite", "ref", or other formatting are not counted, e.g. Khun Sa#Sources. What would you folks prefer? Courtesy ping to Sanglahi86. —usernamekiran (talk) 04:54, 21 December 2022 (UTC)
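The difference between the two scenarios above can be sketched roughly as follows. This is not the bot's actual code, only a minimal illustration under stated assumptions: the function name `count_refs` and the regular expressions are hypothetical, a reused named ref is assumed to count once towards the unique total, and every unnamed ref is assumed to be unique. Like the scenarios described above, it ignores sources that carry no ref markup at all.

```python
import re

# Opening <ref> tags, including self-closing reuse such as <ref name="x" />.
# The \b keeps <references /> from matching; closing </ref> tags never match.
REF_TAG = re.compile(r'<ref\b([^>]*?)(/?)>', re.IGNORECASE)
# The name attribute, written with double quotes, single quotes, or no quotes.
NAME_ATTR = re.compile(r'name\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s/>]+))', re.IGNORECASE)

def count_refs(wikitext):
    """Return (total_uses, unique_refs) for a piece of wikitext.

    Hypothetical helper: total_uses counts every <ref> use (scenario one),
    unique_refs counts each named ref once plus every unnamed ref (scenario two).
    """
    total = 0
    names = set()
    unnamed = 0
    for match in REF_TAG.finditer(wikitext):
        total += 1
        name = NAME_ATTR.search(match.group(1))
        if name:
            # Exactly one of the three alternatives captured the name.
            names.add(next(g for g in name.groups() if g is not None))
        else:
            unnamed += 1
    return total, len(names) + unnamed

sample = 'A<ref name="x">Src</ref> B<ref name="x" /> C<ref>Other</ref> <references />'
total_uses, unique_refs = count_refs(sample)
print(total_uses, unique_refs)  # 3 2
```

Template-based citations outside `<ref>` tags would need separate handling, which may be one source of the margin of error mentioned above.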

@Usernamekiran: Perhaps retrieving both the number of "References" and "Unique references" and placing the data in separate columns on this page would be better (and simpler) for standardization and statistical purposes. XTools lists both "References" and "Unique references". —Sanglahi86 (talk) 14:29, 21 December 2022 (UTC)
I will see if that can be done. And how many articles should be included in the list? I was thinking of a list of 500 articles, updated twice a month. That is very easy to configure, so please feel free to suggest a number and frequency. —usernamekiran (talk) 15:11, 21 December 2022 (UTC)
@Usernamekiran: This page currently separates articles from lists; personally, I would prefer dividing the number (500 is a good number) between non-list articles and lists. As for frequency, I think weekly updates might be better. —Sanglahi86 (talk) 17:16, 21 December 2022 (UTC)
  • @Sanglahi86 and Theknine2: Hello. I initiated the final scan a few minutes ago. It might take a few days to scan through all the ~6.5 million articles. I can also create a separate table for 500 list articles. I will let you guys know once the scanning is finished. I apologise for the delay. —usernamekiran (talk) 18:12, 18 February 2023 (UTC)

As we're now one month away from the tenth anniversary of when I created this page, it seems very apropos that a bot is going to start maintaining it. I'm excited to see how it goes!  — Scott talk 23:51, 19 February 2023 (UTC)

  • @Scott, Sanglahi86, and Theknine2: all done. You can see the results at User:KiranBOT/MOSTREFS. It is basically an empty page into which User:KiranBOT/MOSTREFS/articles and User:KiranBOT/MOSTREFS/lists are transcluded. Please feel free to make any changes or suggestions. Also, there are some kinks that need to be worked out, and I will be working on them. The only thing I can't work out yet is how to find the pages that have some transclusion in them (working on it). The bot will update these two pages on the fifth day of every month. I am also thinking about creating a new hidden category, which would improve the results in later runs. I'll post about it in detail tomorrow from my computer. —usernamekiran (talk) 19:22, 21 February 2023 (UTC)
    Transclusion isn't working. —usernamekiran (talk) 19:32, 21 February 2023 (UTC)
    The lists look perfect! I hope these can be transferred over to the main article soon. Theknine2 (talk) 07:04, 28 February 2023 (UTC)
  • The table of lists was saved as empty, even though the lists have been scanned. I will fix it tomorrow. —usernamekiran (talk) 19:36, 21 February 2023 (UTC)

April update

  • @Scott, Sanglahi86, and Theknine2: the bot is working as expected. It updated the report a few days ago. Since day 1, I was aware that scanning the dump would give inaccurate results, so I had been working on some different approaches as well. Scanning live Wikipedia is time-consuming as well as resource-consuming. With one method, it would take around 90 days to scan all of the roughly 6.5 million articles. I came up with a different approach which would take around 30 to 40 days to scan all the articles while being easy on resources and without overwhelming Wikipedia's servers. But given that it would take around (or more than) a month to get through all of mainspace, the task should be run only once every three months. If we update the live report quarterly and the dump report monthly, it will give you guys a very good idea of the articles with the most refs. So basically there would be four report pages (MOSTREFS/articles and MOSTREFS/lists, each in a dump version and a live version). Where should I save them? I have started the program for scanning live Wikipedia; 30 to 40 days is only an estimate, and I am not sure how many days it will actually take. Kindly let me know the page locations and any other suggestions. The program will take around a month to finish, so there is no hurry. The suggestions will be implemented after this run finishes. —usernamekiran (talk) 13:45, 11 April 2023 (UTC)
@Scott, Sanglahi86, and Theknine2: Hello. The run on live Wikipedia has finished, and the results are accurate. The lists can be seen at User:KiranBOT/sandbox1 and User:KiranBOT/sandbox2. They have the same accuracy as XTools. In this run, I excluded articles with fewer than 500 unique references. In the next run I will lower that to 400. I can also (automatically) create some other statistics based on the raw report generated by the bot. —usernamekiran (talk) 14:20, 7 May 2023 (UTC)
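The threshold filtering described above could look something like the following. This is only an illustrative sketch, not the bot's implementation: the function `build_report`, its parameters, and the sample titles and counts are all hypothetical, and the 500-row cap is an assumption taken from the list size discussed earlier in this thread.

```python
def build_report(counts, min_unique=500, limit=500):
    """Rank pages by unique references, dropping those below the threshold.

    counts maps a page title to a (total_refs, unique_refs) pair.
    All names here are hypothetical and only for illustration.
    """
    rows = [
        (title, total, unique)
        for title, (total, unique) in counts.items()
        if unique >= min_unique  # exclude pages with fewer unique refs
    ]
    # Most unique references first; ties are broken alphabetically by title.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows[:limit]

# Made-up sample data, not real article counts.
sample_counts = {
    "Example article A": (900, 620),
    "Example article B": (700, 450),
    "Example article C": (1200, 810),
}

report_500 = build_report(sample_counts)                  # this run's threshold
report_400 = build_report(sample_counts, min_unique=400)  # the planned next run
print([title for title, _, _ in report_500])
```

Lowering `min_unique` to 400 brings "Example article B" into the report, which is the effect the next run is meant to have.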

Propose consolidating lists

The lists of articles are very long. I propose limiting the "main articles" list to articles with at least 450 references, and the "lists" list to those with at least 550, which would also make both lists easier to maintain. Theknine2 (talk) 19:24, 15 February 2023 (UTC)