User:Lingzhi2/reviewsourcecheck

This editor uses reviewsourcecheck to find errors in the references and notes.

This script simplifies source reviews by flagging 16 types of errors in the references and/or notes. It is a fork of Ucucha's HarvErrors2 script; it produces no output if harv templates are not being used.

To check as many errors as possible, I recommend using this in conjunction with two others:

  1. First, copy/paste importScript('User:Ucucha/HarvErrors.js'); to Special:MyPage/common.js. This script by Ucucha is indispensable in its own right. In addition, my script relies on its output, and thus cannot function without it.
  2. On the same page and below that script, add one of these two variants, depending on where on the page you wish to add a link that toggles between "Hide ref check" and "Show ref check". Both versions should store the page state (whether reference errors/warnings are "hidden" or "shown"). The state persists between page loads and between the browser closing and reopening (unless cleared by the user, for example by deleting data in your browser's cache etc.). Huge thanks to User:Evad37 for much coding help:
    1. If you want a link added to the drop-down tab at the top, located between the 'watchlist star' and the search box (in the Vector skin), add importScript('User:Lingzhi2/reviewsourcecheck.js');
    2. If you want a link added to the Toolbar section on the left side of the page, add importScript('User:Lingzhi2/reviewsourcecheck-sb.js');
  3. Finally, go to Special:MyPage/common.css and add .citation-comment {display: inline !important;} /* show all Citation Style 1 error messages */.
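
Step 2 says the toggle link remembers whether the messages are hidden or shown between page loads. The following is a minimal sketch of how such persistence could work, assuming a localStorage-style storage object. The key name refCheckState and the helper names makeRefCheckToggle and memoryStorage are hypothetical and are not taken from the script itself.

```javascript
// Hypothetical storage key; the real script may use a different one.
const STATE_KEY = 'refCheckState';

// Builds a small toggle around any object with getItem/setItem
// (in a browser this would typically be window.localStorage).
function makeRefCheckToggle(storage) {
  // Anything other than an explicit 'hidden' is treated as 'shown'.
  const state = () =>
    storage.getItem(STATE_KEY) === 'hidden' ? 'hidden' : 'shown';
  return {
    state,
    // The link text offers the opposite of the current state.
    label: () => (state() === 'hidden' ? 'Show ref check' : 'Hide ref check'),
    toggle() {
      const next = state() === 'hidden' ? 'shown' : 'hidden';
      storage.setItem(STATE_KEY, next);
      return next;
    },
  };
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => {
      data.set(key, String(value));
    },
  };
}
```

Because the state lives in storage rather than in a page variable, a second toggle object built over the same storage (the equivalent of reloading the page) sees the value written by the first.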

When you've added those, go to an article to check for various messages in its notes and references. (You may need to clear your browser's cache first).

After the explanatory #Common sense and #Cascading errors sections, the #Descriptive table section below gives a short summary of the messages generated by this script (other messages that you may see are output from the other scripts, of course). Below that is a dummy article with some examples of errors in notes and references for you to examine.

OCLC, Archive links and odd, incongruous errors

  1. If you need to find an OCLC number, go to WorldCat.org and use the "Find items in libraries near you" search. From the results, click the link for the item you're interested in. The output of that search also includes the OCLC Number, a little bit down the page.
  2. Missing archive link: Archiving is not required, but is considered good practice. This is flagged even if an access date is present. To archive: go to an article's History and click "Fix dead links". On the "IABot Management Interface", find the small drop-down menu that says "Run Bot" and select "Fix a single page". Type the article's title on the line that says "Page title to analyze". Be sure to check the checkbox labeled "Add archives to all non-dead references (Optional)". Run time for the bot can be from a few seconds to several minutes.
  3. If the script displays some sort of an odd, incongruous error, such as "Missing pagenums for book chapter?" for a news article, check to see if you're using {{citation}} or {{cite book}} or whatever instead of {{cite news}}. Since this sort of template mismatch often may not change or disturb the output that the readers see, it may not be an urgent problem.

ISBN and/or OCLC for book sources

This is a straight copy/paste from WT:FAC discussing OCLC and ISBN guidelines. The relevant diff is here

Could somebody remind me - I'm sure I've read previously that giving both the ISBN and the OCLC numbers is deprecated. Is that right? Also, assuming it is, is there any preference for one over the other? Many thanks. KJP1 (talk) 08:27, 15 September 2019 (UTC)

Hi KJP, I'm not sure there is a hard and fast rule (if there is, I have missed it), but from the reviews I've done, I've seen people using the 13-digit ISBN wherever possible, with the OCLC for those publications without. Cheers - SchroCat (talk) 08:47, 15 September 2019 (UTC)
SchroCat - Many thanks. Shall follow this convention. KJP1 (talk) 08:49, 15 September 2019 (UTC)
Speaking primarily here as editor, what SchroCat says. Cheers, Ian Rose (talk)
Speaking as a nitpicky sources reviewer, ISBN and OCLC is overkill. Use the ISBN if one exists, otherwise the OCLC. Brianboulton (talk) 23:06, 17 September 2019 (UTC)
I've felt the same, that both are overkill, however a presentation at WikiConference USA in 2015 changed my opinion. The OCLC link goes directly to the Worldcat record for that number, while an ISBN link takes a reader to a page with various options, several of which may or may not be useful to actually finding a copy of the source. The OCLC therefore streamlines the search to finding a library with the source, while the ISBN link requires another click. As such, I always list an OCLC, adding an ISBN if one exists. Imzadi 1979  23:37, 17 September 2019 (UTC)
I think that in the absence of a set ‘rule’ on how to do it, as long as each article is consistent in the selected approach, then there should be no problems. - SchroCat (talk) 18:45, 18 September 2019 (UTC)

Common sense

Use these warnings and error messages with a spoonful of caution and common sense. For one thing, editors have some discretion in whether they wish to use some template parameters. A good example is |location= for books: many editors consider it mandatory for books, because different publisher locations may produce different formats with different pagination. However, the only unbreakable rule is that usage must be consistent. This script generates a running count of books with and without publisher location. If one or more Inconsistent Location warnings are displayed, go to the last instance and look at its totals. If most (but not all) of the books cited include a location, suggest that the editor(s) fill in the missing location(s) as often as possible. If there are many without a location and a few with, you could suggest removing those few. [Or just go ahead and suggest they fill them all in. Consider it a public service].

Another example is the messages regarding adding access dates and archive URLs/dates. Editors have even more discretion with these than they do with |location=. Per WP:CITE, |access-date= is only required for web sources without a known publication date. In practice, moreover, some relatively stable URLs (e.g., those from Google Books) may not need access dates either. In general, however, try to encourage consistency, because access dates are often very useful. Access dates aid automated tools (and humans) in locating the correct version of an archived web page, or in adding archive links in the future. Even with relatively stable resources like Google Books, there's no guarantee that the source will remain accessible, so the access date is an indication that, at least at one time, it was.
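
The access-date rule described here (and in the "Missing access date" row of the table) can be sketched as a small predicate. This is an illustration only, not the script's code: the function name isMissingAccessDate and the field names url, date, archiveDate and accessDate are hypothetical stand-ins for whatever the script actually reads from the rendered citation.

```javascript
// Returns true when a citation would plausibly earn "Missing access date".
// Assumption: the citation is a plain object with optional string fields.
function isMissingAccessDate(cite) {
  if (!cite.url) return false;        // not a web source: nothing to flag
  if (cite.accessDate) return false;  // access date already present
  // Not flagged when a publication date or an archive date is present.
  return !(cite.date || cite.archiveDate);
}
```

For example, a bare URL with no dates at all is flagged, while the same URL with either a publication date or an archive date is not.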

Note that P/PP error? is the check that seems to display the greatest number of false positives. There's simply no way to cover all the possible variations that could be displayed. Treat this as marking "something to check; it could well be wrong" rather than a definitive error.
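
To make the P/PP rule concrete, here is a rough sketch of the comparison as the descriptive table states it: a page specification containing a dash, hyphen or comma should use "pp.", and a single page should use "p.". The function name hasPagePrefixMismatch and the regular expressions are assumptions for illustration; the real script's matching is likely different, which is part of why false positives occur.

```javascript
// Returns true when the "p."/"pp." prefix disagrees with the page spec.
function hasPagePrefixMismatch(text) {
  // Capture the prefix ("p" or "pp") and whatever follows the period.
  const match = /^\s*(pp?)\.\s*(.+)$/i.exec(text);
  if (!match) return false; // no recognizable prefix: nothing to check
  const isPlural = match[1].toLowerCase() === 'pp';
  // Hyphen-minus, the Unicode dashes U+2010 to U+2015, or a comma
  // are taken to mean "more than one page".
  const looksLikeRange = /[-\u2010-\u2015,]/.test(match[2]);
  return isPlural !== looksLikeRange;
}
```

Under this sketch, "pp. 123" and "p. 323-4" are mismatches, while "p. 32" and "pp. 43–53" pass. A form such as "pp. 204 to 5" is also reported, because the rule only looks for dashes, hyphens and commas.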

Also note that sometimes you also need to apply a little knowledge of the outside world. A relatively amicable historian would be amused/bemused to see Missing first name for: Suetonius:

  • Suetonius (2008), Edwards, Catherine (ed.), Lives of the Caesars, Oxford University Press, ISBN 9780191609107
  • Bede (860). "St. Gallen Stiftsbibliothek Cod. Sang. 254. Jerome, Commentary on the Old Testament book of Isaiah. Includes the most authentic version of the Old English "Death Song" by the Venerable Bede". Europeana Regia. Retrieved 5 June 2013.

Cascading errors

Correct use of |location= requires consistency, while references such as {{cite book}} and {{cite web}} need to be sorted systematically if they are grouped together in their own dedicated section. Whether a given |location= is flagged depends on whether or not other sources on the page include that parameter, and correct placement of a reference within the sort order depends on the references above it. For this reason, the programming logic for both "Inconsistent use of Location" and "Sort error, expected: <identifier>" displays error messages that are cascading.

For the sort errors, the first reference listed out of sort order will often throw others below it out of sort order too, so that reference and perhaps many or all subsequent ones display the error message. If there are several, start at the first instance and work down when correcting. Fixing one or two might clear many errors.
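
The cascading behaviour can be illustrated with a small sketch. It assumes (this is a guess, not the script's confirmed logic) that each position in the list is compared with the entry a sorted copy would hold at that position. The function name findSortErrors and the sample surnames are hypothetical; Intl.Collator and Array.prototype.sort are standard JavaScript.

```javascript
// Compares each entry with the entry a correctly sorted list would have
// at the same position, and reports every position that differs.
function findSortErrors(names, locale) {
  const collator = new Intl.Collator(locale);
  // Sort a copy so the original order is left untouched.
  const expected = [...names].sort(collator.compare);
  const errors = [];
  names.forEach((found, index) => {
    if (collator.compare(found, expected[index]) !== 0) {
      errors.push({ index, found, expected: expected[index] });
    }
  });
  return errors;
}

// One misplaced entry ("Wiznewsky") shifts everything below it.
const sample = ['Adams', 'Wiznewsky', 'Baker', 'Clark'];
for (const err of findSortErrors(sample, 'en')) {
  console.log(`Sort error, expected: ${err.expected} (found ${err.found})`);
}
```

In this sample a single misplaced reference produces three messages, and moving "Wiznewsky" to the end clears all of them at once, which is why working from the top down is the efficient order.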

It is very strongly suggested that the |location= field be populated, but one could argue that it is not, strictly speaking, absolutely required. However, there is a requirement that it be employed (or not employed) consistently. As a case in point, this version of our article on "Slavery" has (looking at the last instance in the running total) "Inconsistent Location (16 with; 141 without)". If that article were in FAC it might be possible (but not very advisable) to delete the 16 locations.
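
The running total and the advice derived from it can be sketched as follows. This is an illustration under stated assumptions: the names tallyLocations and suggestLocationFix, the location field, and the returned labels are all hypothetical, and the tie-breaking choice (prefer filling in) is mine rather than the script's.

```javascript
// Walks the book list in page order and records the running totals that
// each "Inconsistent Location" message would show at that point.
function tallyLocations(books) {
  let withLocation = 0;
  let withoutLocation = 0;
  const running = books.map((book) => {
    if (book.location) withLocation += 1;
    else withoutLocation += 1;
    return { withLocation, withoutLocation };
  });
  return {
    running,
    final: { withLocation, withoutLocation }, // totals at the last instance
    inconsistent: withLocation > 0 && withoutLocation > 0,
  };
}

// Turns the final totals into the kind of suggestion described above.
function suggestLocationFix(final) {
  if (final.withLocation === 0 || final.withoutLocation === 0) {
    return 'consistent';
  }
  return final.withLocation >= final.withoutLocation
    ? 'fill-in-missing'
    : 'remove-or-fill-in';
}
```

With the totals quoted above (16 with, 141 without), suggestLocationFix returns 'remove-or-fill-in': removal is the smaller edit, but filling in all the locations remains the preferred outcome.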

Descriptive table

Error and description
Error message; Comments
P/PP error? very short snip; Page ranges for footnotes. In the footnote, if the page range given includes a dash or hyphen or comma, it should say "pp." and not "p.", and vice versa. As discussed earlier, this check is a time saver that is not foolproof. Produces some false positives.
Caution: No ref= anchor; If a source that is sitting within a large References section has no |ref=, it may be difficult to spot the fact that it isn't pointing to anything at all in the article's body text. This locates such instances. A slight drawback is that sometimes you don't want a source to link to anything (e.g., "Further reading"). If the resulting warning messages annoy you, use |ref=0 in the cite template.
Hyphen in pg. range; Use User talk:GregU/dashes.js to fix this error.
Missing identifier (ISSN, JSTOR, etc.); Only for periodicals
Warning: duplicate author/date; For example, if there were three instances of "Bhattacharya, Sanjoy (2002)", the second and third would be flagged (but not the first). Editors should modify all three to appear as "Bhattacharya, Sanjoy (2002a)", "Bhattacharya, Sanjoy (2002b)" and "Bhattacharya, Sanjoy (2002c)".
Missing first name for: Asimov; No check performed if the citation includes et al.
Inconsistent first name fmt.([some number] full, [some number] abbrev); Full or abbreviated first names, as for example "Wang, N." versus "Wang, Nancy". The respective numbers in "[some number] full" and "[some number] abbrev" indicate the running total of each; the underlined one indicates whether the current reference is full or abbreviated.
Inconsistent Location ([some number] with, [some number] without); Only for books. The respective numbers in "[some number] with" and "[some number] without" indicate the running total of each; the underlined one indicates whether the current reference is with or without a location. There's no requirement that the |location= field be populated, but there is a requirement that it be employed (or not employed) consistently... see this parameter as discussed in both #Common sense and #Cascading errors.
Caution: Missing pagenums for book chapter?;  
Missing Publisher; Only for books. I was unable to see anything in the page source which would facilitate flagging websites with no publisher.
Missing ISBN; Books, if publication year >= 1970
Pub. too early for ISBN; You do the math. This could be caused by a typo, but more likely indicates the source is a reprint (which therefore should include an |orig-year= parameter).
Missing OCLC; Books, if publication year < 1970
Missing Year/Date;  
Missing archive link;  
Missing access date; NOT flagged if a publication date or an archive date is present.
Sort error, expected: Wiznewsky; See #Cascading errors. Only checks within sections that have one of the following headings: "Biographies", "Bibliography", "References", "Citations_and_notes", "Literature_cited", "Works_cited", "Book_sources", "Primary_sources", "Secondary_sources", "Sources", or "Specialized_studies". Should be able to handle pages with more than one section of references (e.g., Primary and secondary) and sort each separately. Should not sort sections where references are listed in order of occurrence (and are not intended to be sorted). The sort uses JavaScript's sort() with Intl.Collator(), so it is relatively language-sensitive for Romanized scripts, but it is not foolproof. In fact it is probably adrift with non-Romanized text such as "สองศิษย์เอกธัมมชโยโต้ขอกล่าวหาวัดพระธรรมกาย". You'll just have to discuss those with other editors.
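
The three year-based identifier rows in the table above (Missing ISBN, Pub. too early for ISBN, Missing OCLC) can be read as one small decision. The sketch below assumes the 1970 cutoff applies to all three, which the table states for two of them and implies for the third. The function name identifierMessages and the fields year, isbn and oclc are hypothetical.

```javascript
// Year from which the table expects books to carry an ISBN.
const ISBN_CUTOFF_YEAR = 1970;

// Returns the identifier messages a book citation would plausibly earn.
function identifierMessages(book) {
  const messages = [];
  if (typeof book.year !== 'number') return messages; // no year: rules cannot apply
  if (book.year >= ISBN_CUTOFF_YEAR) {
    if (!book.isbn) messages.push('Missing ISBN');
  } else {
    if (book.isbn) messages.push('Pub. too early for ISBN');
    if (!book.oclc) messages.push('Missing OCLC');
  }
  return messages;
}
```

A 1969 book that carries an ISBN but no OCLC would receive both "Pub. too early for ISBN" and "Missing OCLC" under this reading, which fits the reprint scenario described in the table.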

Dummy article (example)

Lorem ipsum dolor sit amet,[1] consectetur adipiscing elit,[2] sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.[3] Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.[4]

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.[5] Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident,[6] sunt in culpa qui officia deserunt mollit anim id est laborum.

Notes

  1. ^ Trigger 1980, p. 32; Green 1969, pp. 123; Champion 2009, pp. 121.
  2. ^ Weigold 1999, pp. 43–53.
  3. ^ Kishimoto (2007). "Chapter 225". Naruto. Vol. 25. Viz Media. ISBN 978-1-4215-1861-9.
  4. ^ Nelson 2009, pp. 204 to 5.
  5. ^ Wang 2014, p. 323-4.
  6. ^ Wang 2014, p. 226–7.

References