Template talk:Taxobox/Archive 17


Template adds drafts or talk page uses to encyclopedia categories

Hi, this template seems to add articles to the encyclopedia categories even if used in user or talk space. For example see User:Klichka/testbard and Category:Domesticated animals. Jason Quinn (talk) 01:52, 12 April 2010 (UTC)

Fixed. Ucucha 02:11, 12 April 2010 (UTC)

unranked_superclassis

Hello. Is it possible to include unranked_superclassis (above superclassis) and move unranked_classis above classis? Thanks. --kupirijo (talk) 23:47, 20 April 2010 (UTC)

Could do, I suppose. What is the need? Hesperian 11:08, 21 April 2010 (UTC)
Makes more sense because our "unranked_X" ranks are usually directly above "X". We have some taxoboxes that use "unranked_classis" for a rank above superclassis, though. Ucucha 11:10, 21 April 2010 (UTC)
We should track them down and replace them with unranked_superclassis in order to be consistent. --kupirijo (talk) 12:03, 21 April 2010 (UTC)
The reason is the following: In the article Amniote, I would like the word Amniota to show below Tetrapoda in the taxobox. --kupirijo (talk) 12:02, 21 April 2010 (UTC)
There are about eighty of those [1]. The best thing to do I think is to first add the unranked_superclassis parameter, then replace all places where unranked_classis is used for a rank above superclassis with unranked_superclassis, then move unranked_classis to below superclassis. Ucucha 12:09, 21 April 2010 (UTC)
Probably even fewer. I also spotted some that should not be changed, e.g. Neotherapsida. (I just corrected a spelling mistake that kept Sauropsida from showing.) There are probably more cases like this involving Tetrapoda. --kupirijo (talk) 12:27, 21 April 2010 (UTC)
What a disaster of an article that is. Flags in the distribution... But I agree that 80 is probably a high estimate. Ucucha 12:31, 21 April 2010 (UTC)
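The three-step migration Ucucha outlines above (add the new parameter, move the affected uses over, then reorder) could be supported by a small script along these lines. This is only a rough sketch: the helper names are hypothetical, the regular expressions assume simple "| name = value" parameter syntax, and the rule "a page that sets both unranked_classis and superclassis needs review" is an assumption drawn from this thread, not from the template code. As kupirijo notes, some pages should not be changed, so the check only flags candidates for a human to confirm.

```python
import re

OLD_PARAM = "unranked_classis"
NEW_PARAM = "unranked_superclassis"


def has_param(wikitext, name):
    """Return True if the wikitext sets a template parameter with exactly this name."""
    return re.search(r"\|\s*" + re.escape(name) + r"\s*=", wikitext) is not None


def needs_review(wikitext):
    """Flag pages that set both unranked_classis and superclassis.

    Assumption: only these pages change their display order once
    unranked_classis is moved below superclassis.
    """
    return has_param(wikitext, OLD_PARAM) and has_param(wikitext, "superclassis")


def rename_param(wikitext, old=OLD_PARAM, new=NEW_PARAM):
    """Rename a parameter, leaving its value and the surrounding spacing untouched."""
    pattern = re.compile(r"(\|\s*)" + re.escape(old) + r"(\s*=)")
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), wikitext)
```

A reviewer would call needs_review on each page's wikitext, inspect the flagged pages by hand, and apply rename_param only where the unranked clade really does sit above the superclass.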

Not a MoS

How is this a MoS? It is a template's documentation and as such should be removed. Gnevin (talk) 11:57, 23 April 2010 (UTC)

It gives style advice about a particular issue of style—how to use the taxobox. Ucucha 12:03, 23 April 2010 (UTC)
No, it gives advice about how to use the taxobox, which has a particular style. There are no stylistic choices to be made here; the infobox, like all infoboxes, sets the stylistic choices, which is why we use infoboxes. Gnevin (talk) 12:31, 23 April 2010 (UTC)
In fact, there are stylistic choices to be made when using the taxobox—whether or not to include particular ranks in the taxobox, for example. Ucucha 13:01, 23 April 2010 (UTC)
The majority of boxes have fields that can be hidden. Stylistic choices are when to use capitals, when to use italics, and what words to use and not use. This is documentation. Gnevin (talk) 13:30, 23 April 2010 (UTC)
As in Template:Taxobox#Bold/italic markup, you mean? Ucucha 13:33, 23 April 2010 (UTC)
Then move that content to Wikipedia:Manual of Style (Taxonomy) like Wikipedia:Conservation status Gnevin (talk) 13:37, 23 April 2010 (UTC)
The page is a mix of template documentation and style guidelines. Style guidelines should be in the Wikipedia:Manual of Style or one of its subpages. Otherwise it's hard for new editors to find. Plus, sooner or later a conflicting recommendation will accidentally get slipped into the MoS. (This has happened many times for many topics.) I think it would be best to separate the documentation from the style guide. Ozob (talk) 01:59, 29 April 2010 (UTC)

Viruses - smallpox, polio

When the subject is a horrible virus that has no ecological niche other than to cause pain, suffering, and death, is the regular taxo box setup really desirable? It's not like the life form (if a virus can be considered "life") contributed to biodiversity or the functioning of the ecosystem. It is thus unlike other species that humans have eradicated. The genocide committed against the smallpox virus was undeniably beneficial to humanity, with no effect on the biosphere - unlike the Dodo, the Scimitar Oryx, the Guam Rail, and the Mason River Myrtle. Is there a property of the taxobox (or could there become one) to show an intentional, humanitarian eradication of plagues? Smallpox is extinct in the wild, but nobody's sorry about that (except perhaps a few would-be terrorists). This would also fit in the box for polio, where it would read something like "Critically endangered - but not yet accomplished." Z.S. ......(talk) 02:25, 27 April 2010 (UTC)

Template:Detestableorganismbox perhaps? In any case, the conservation status should be given by an external source, not our own interpretation. I doubt any list the smallpox virus. Ucucha 02:38, 27 April 2010 (UTC)

Change the Conservation Status link?

Resolved – No change needed, and Stemonitis (talk) answered my questions. --Modocc (talk) 16:39, 13 May 2010 (UTC)

Perhaps the link for Conservation Status should actually go to Wikipedia:Conservation status. I left a message (dif) with Gnevin (talk) explaining the need for this change, since I think he prematurely removed the page from the guidelines, probably due in part to the fact that this box does not have a link to Wikipedia:Conservation status, which would give the false impression that this page is not important. Furthermore, at the top of the Conservation Status article it states: See Wikipedia:Conservation status for the classifications of conservation status used in Wikipedia. It therefore appears to be a fairly important page, or it wouldn't exist, but it is in need of some attention, and with a direct link it would get some. --Modocc (talk) 06:30, 13 May 2010 (UTC)

I don't think this change would be a good idea. We should not link readers to pages in the Wikipedia: namespace if possible. The example which seems to have brought you here, king of herrings, does not link to a particular set of conservation statuses, because it has no conservation status. It is listed as "NE", presumably meaning "not evaluated" for the IUCN Red List. Were it to have a meaningful conservation status, including the status_system parameter, it would indeed be linked to an article like IUCN Red List, which would explain the meanings of the different statuses (see, for an example, Afzelia rhomboidea). --Stemonitis (talk) 07:57, 13 May 2010 (UTC)
You replied while I was revising my initial post, with no edit conflict for some reason. The problem with linking to the Conservation status article instead is that the user does not always know beforehand which specific system or systems are being referenced, and even if they do, the articles about these different systems are incomplete anyway and are not intended to be complete. Thus users may often need to go to Wikipedia:Conservation status for information that isn't in those other articles. --Modocc (talk) 08:49, 13 May 2010 (UTC)
That sounds like an argument for improving those other articles, rather than introducing cross-namespace links. (A better solution might be to link "conservation status" to the system in use, since any taxobox with a conservation status should have a status system. I can foresee problems with that approach, too, in that linking to articles like NZTCS could be somewhat unexpected for a link titled "conservation status".) I would go further, and say that the hatnote at conservation status is probably a bad thing, since it's chiefly intended for editors, whereas the encyclopaedic content should be aimed squarely at readers. The user should always know what system is in use, because it should be listed in the taxobox; with some exceptions, such as the "NE" you spotted, they appear in the cleanup category Category:Taxoboxes needing a status system parameter if no status system is given, and Category:Taxoboxes with an unrecognised status system if it isn't recognised by the template. --Stemonitis (talk) 09:16, 13 May 2010 (UTC)
Thanks for clarifying that for me. Given the hatnote, which I removed per BLD due to my experience and your input here, and this obtruse guideline missing its guideline template I was expecting encyclopedia content. The article I read should have left the status parameter blank when it was without a verified status system parameter, so I wouldn't have been sent on a pointless search in the middle of the night for what amounted to nonexistent content. Thanks again. --Modocc (talk) 15:30, 13 May 2010 (UTC)

NCBI Taxonomy IDs

I know this topic has been discussed before (e.g., Archive 5, Archive 10, Archive 11) but I'd like to reopen the case for adding NCBI Taxonomy IDs to the Taxobox.

Wikipedia's taxon pages have a huge web presence (see my blog post Google and Wikipedia revisited and Page, R. D. M. (2010). "Wikipedia as an encyclopaedia of life". Nature Precedings. doi:10.1038/npre.2010.4242.1.). If a taxon is in Wikipedia it is almost always the first search result in Google. Researchers in other areas of biology are making use of Wikipedia as a tool to annotate genes (Gene Wiki) and RNA families (Wikipedia:WikiProject_RNA), respectively. Pages for genes, such as Cytochrome_b, have numerous external identifiers in their equivalent of the Taxobox (the Pfam_box). I think we are missing a huge opportunity by not including NCBI taxonomy ids. The advantages would be:

  • It would provide a valuable service to Wikipedia readers by enabling them to go to NCBI to discover more about a taxon
  • It would help Wikipedia contributors by providing a standardised way to refer to NCBI (and enable bots to add missing NCBI taxonomy ids). Putting them in an External links section makes it harder to be consistent (there are various ways to write a URL linking to the NCBI taxonomy)
  • It would facilitate linking from NCBI to Wikipedia. A mapping of Wikipedia pages to NCBI taxonomy ids could be added to NCBI Linkout, generating more traffic to the Wikipedia pages
  • Projects that are trying to integrate information from different sources would be able to combine information of genomics from NCBI with other information much more readily

Note that I am not arguing that Wikipedia should "follow" NCBI taxonomy, merely that where the potential to link exists, the links would create value, both within and outside the Wikipedia community. rdmpage (talk) 09:48, 20 May 2010 (UTC)

I agree that links to the NCBI taxonomy would be very useful. Even though the NCBI taxonomy is not authoritative on the higher-level groupings of taxa (which are in flux, anyway), links on the level of individual species would be straightforward and very useful. MichaK (talk) 10:06, 20 May 2010 (UTC)
Agree. And this information can later be submitted to NCBI/LinkOut. --Plindenbaum (talk) 10:16, 20 May 2010 (UTC)
The idea seems good to me. How would this be implemented? Ucucha 12:51, 20 May 2010 (UTC)
There has been a sample implementation in a previous discussion. I'm not sure if it's easy to transplant into the current template, but should be doable. As Rod noted, the Protein Box also has links to external data sources. MichaK (talk) 14:00, 20 May 2010 (UTC)
Several things need to be done (in rough order of difficulty):
  • Agree on a modification to Taxobox. I propose adding "ncbi=xxx" where xxx is the NCBI taxonomy id. We also need to pick a standard NCBI URL pattern to convert the taxonomy id to a link to NCBI.
  • Create a bot to populate existing Taxoboxes. I can write this (I've some experience programming against the Mediawiki API), but I have no experience of the bot approval process. I'm hoping that somebody here can advise about this.
  • Decide how much of NCBI to map to. The higher up the taxonomic tree the more I suspect that there will be conflict between NCBI and Wikipedia, either in which name to use or what the name encompasses. A case could be made for restricting the mapping below a certain taxonomic level, although this would have to be done with care as many major groups in the tree of life are based entirely on molecular data and it would seem unwise to exclude them from the mapping.
  • Create the mapping between Wikipedia and NCBI. I could make a first go at this using local dumps of these two databases. I'd be interested in having a simple web tool that enabled some manual cleaning up of the mapping before a bot used it to populate Wikipedia. A local Semantic Mediawiki might be the simplest way to get this set up.
  • Maintain the mapping and add it to NCBI so that the links work in both directions.
  • Have the bot periodically look at new taxon pages and add the NCBI id where appropriate.
I'm happy to do most of this. What I'd really like at this stage is some indication from people involved in Taxobox and the Tree of Life project whether they think this is a good idea, and if so, advice on the mechanics of getting a bot approved. —Preceding unsigned comment added by Rdmpage (talkcontribs) 15:59, 20 May 2010 (UTC)
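To make the first item concrete, here is a minimal sketch of what the proposed "ncbi=xxx" parameter might involve on the bot side. Everything in it is an assumption for illustration: the URL pattern is just one candidate (the NCBI Taxonomy Browser address), since no standard pattern has been agreed in this thread, and the function names are hypothetical. It does not touch the MediaWiki API and only shows the string handling.

```python
import re

# Assumption: one candidate URL pattern (the NCBI Taxonomy Browser);
# the discussion has not settled on a standard pattern.
NCBI_URL_PATTERN = "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={taxid}"


def ncbi_url(taxid):
    """Turn an NCBI taxonomy id into a link, rejecting non-positive ids."""
    taxid = int(taxid)
    if taxid <= 0:
        raise ValueError("NCBI taxonomy ids are positive integers")
    return NCBI_URL_PATTERN.format(taxid=taxid)


def add_ncbi_param(wikitext, taxid):
    """Insert '| ncbi = <taxid>' after the opening of the first Taxobox.

    The text is returned unchanged if it has no Taxobox or already
    sets an ncbi parameter, so running the bot twice is harmless.
    """
    if re.search(r"\|\s*ncbi\s*=", wikitext):
        return wikitext
    opening = re.search(r"\{\{\s*taxobox\b", wikitext, flags=re.IGNORECASE)
    if opening is None:
        return wikitext
    end = opening.end()
    return wikitext[:end] + f"\n| ncbi = {int(taxid)}" + wikitext[end:]
```

For example, add_ncbi_param("{{Taxobox\n| name = Example\n}}", 12345) would place "| ncbi = 12345" on its own line directly below "{{Taxobox". The template itself would still need a matching change to render the parameter as a link.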
Agree. Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits 13:16, 20 May 2010 (UTC)
Agree. Happy to help out if the Gene Wiki crew can be of help, otherwise we'll just lend moral support... Cheers, AndrewGNF (talk) 16:20, 20 May 2010 (UTC)
Agree - the plan seems reasonable, and of benefit to all communities involved. - UtherSRG (talk) 16:22, 20 May 2010 (UTC)
Oppose -- genetic information is scant for a wide majority of species, and that information would be utterly incomprehensible to 99% of readers. Furthermore, this information is, I would argue, not critical to an understanding of the subject from the perspective of a general audience. I'm not opposed to adding NCBI links, but it should not be in the taxobox, which should be restricted to basic/critical species data.-- Yzx (talk) 18:42, 20 May 2010 (UTC)
And also, I don't like the idea of putting external links anywhere in an article but in an External Links section. -- Yzx (talk) 18:51, 20 May 2010 (UTC)
There are several issues here. I'll try to unpack them:
  • By my estimate there were about 110,000 taxon pages in Wikipedia as at 18 June 2009. The NCBI taxonomy has at least three times as many taxa as Wikipedia.
  • I dispute that genetic information is "utterly incomprehensible to 99% of readers". We live in an age of personal genomics and synthetic organisms. You might not think genetic information is critical to understanding a taxon, but I beg to differ. I'd argue that genomics is "basic/critical species data".
  • If we applied the comprehensibility criterion, Wikipedia should stop citing primary literature, as much of that will certainly be incomprehensible to many readers. The point is that Wikipedia provides both information about a topic and a starting point for people to explore further, if they wish.
  • I think treating identifiers such as NCBI taxonomy ids merely as links to a web page understates their significance. These taxonomy ids are widely used in many biological databases to link together information about a taxon. In this sense, they are not "merely" URLs to a web page. Wikipedia pages are full of identifiers in Infoboxes, such as:
      • ISO 4217 codes for currency (e.g. NZD for New Zealand dollar)
      • ISSNs for journals (e.g., ISSN 1175-5326 for the journal Zootaxa)
      • ICD-10 codes for diseases (e.g., A60., B00., G05.1, P35.2 for Herpes)
The list goes on. I'm arguing that NCBI ids are not the same thing as links to external web pages. They are identifiers.
--Roderic D. M. Page 21:56, 20 May 2010 (UTC)
  • There's a difference between knowing what a gene is and being able to interpret the information at NCBI, like the NADH dehydrogenase subunit 4 of Draco volans. What does this page actually say about the organism? How many Wikipedia users do you honestly think could answer that question? And mind you, NCBI doesn't include anything that would help a reader in this regard. I maintain that most readers would find the link useless, and worse such a link would put the onus of interpreting raw scientific data on people who aren't qualified to do it.
  • We cite primary literature but summarize the pertinent points in a way that's directed to general readers. If there was something meaningful to say about the species from the NCBI data, it would be in a publication and we would summarize it in the prose. Simply putting a link to the raw data at NCBI would not be the correct approach.
  • To me, basic/critical information is information that would be regularly covered by books and other published materials about species of organisms. All the current elements of the taxobox (common name, classification, distribution, etc) regularly appear in field guides and taxonomic references and such. I've yet to see a single general book on individual species of organisms that includes genomic and proteomic data. To me that says that this information is not crucial to an understanding of the subject.
  • ISO and ISSN codes are regularly included in published materials about those subjects, which reflects their ubiquity. What published materials on organism species include the taxonomic codes in NCBI? Certainly not field guides, or other encyclopedias, or taxonomic references that I've seen. How widely are these codes used, and who determines what they are? ICZN? If these codes are not ubiquitously used, then we should not be promoting them.
  • I maintain my objection to having external links anywhere other than the External links section.
-- Yzx (talk) 23:01, 20 May 2010 (UTC)
Just want to point out that external links appear in other general purpose infoboxes (e.g., Barack Obama and John McCain), and other scientific/medical infoboxes already use the identifier/external link model (e.g., Herpes simplex and TP53). Perhaps the external links in the "External links" section is not a hard-and-fast rule? And perhaps WP standards vary slightly depending on context? Cheers, AndrewGNF (talk) 00:14, 21 May 2010 (UTC)
Even so, I would argue that external links should only appear in an infobox if they're indispensable to the subject. I remain absolutely unconvinced that NCBI is such a link. NCBI is a tool for researchers. It only presents data, and nothing about what that data actually means. There would be no benefit in directing readers to it, and the vast majority of organism species literature does just fine without it. -- Yzx (talk) 00:48, 21 May 2010 (UTC)
One other point of clarification. I noticed your link to NADH dehydrogenase subunit 4 above. I don't think this is what rdmpage is proposing. Using your Draco volans example, I think he would propose adding ncbi=89032 to the template, which would then add a link in the template to the NCBI page on Draco volans. Cheers, AndrewGNF (talk) 01:07, 21 May 2010 (UTC)
The only useful parts on that page are the "Entrez records" links on the upper right, which lead to pages like NADH dehydrogenase subunit 4. It's already been acknowledged that NCBI is not authoritative for taxonomy -- it says so right on the page. -- Yzx (talk) 01:21, 21 May 2010 (UTC)
I realise that our perspectives on this may be rather different, but I would argue that genomics is a central part of biology, it plays a huge role in our understanding of the evolutionary history and taxonomy of organisms, and will play an increasing role in identification through DNA barcoding. NCBI is the point of entry into that information. Furthermore, NCBI often does a good job of providing links to further information. For example, take the excellent Porbeagle article that you've extensively edited. The NCBI page for this shark (Taxonomy ID 7849) includes links to further resources not mentioned on the Wikipedia page (such as pages from the Barcode of Life, EOL, and the World Register of Marine Species), as well as freely available literature on PubMed Central. The genomic data is, indeed, more specialised, but I for one am not going to assume that every Wikipedia reader will find this incomprehensible. Looking at the first nucleotide sequence linked to the Porbeagle (Lamna nasus), FJ519727, we find it has been published in a paper on barcoding sharks (Wong, Eugene H.‐K.; Shivji, Mahmood S.; Hanner, Robert H. (2009). "Identifying sharks with DNA barcodes: assessing the utility of a nucleotide diagnostic approach". Molecular Ecology Resources. 9 (s1): 243–256. doi:10.1111/j.1755-0998.2009.02653.x. PMID 21564984.). I think at least some Wikipedia readers will find this useful. I'm struggling a little to see why you are so adamant that they won't. Roderic D. M. Page 04:14, 21 May 2010 (UTC)
It's not about whether genomics is important, or whether someone finds NCBI useful. It's not about the external links on the NCBI page either, because we can link to the important ones directly from the article's External links section. It's about the taxobox, which is arguably the most visible component of the entire article. Everything in it needs to provide basic/crucial data, and I've yet to see any evidence that NCBI would help the average reader better understand the subject. Take my Draco volans example from above. Tell me what the average Wikipedia user is supposed to take away about the species from its NCBI page that would justify the inclusion of the link in the infobox. -- Yzx (talk) 04:43, 21 May 2010 (UTC)
Leaving aside the mythical "average Wikipedia user", the Draco volans page does the following:
  • Gives me a classification (more detailed than Wikipedia's)
  • Gives me links to EOL and other, more specialised resources.
  • Tells me how much we know about the genomics of this animal (i.e., not a lot)
  • Gives me access to information on those genes
Now, you might not find this useful, but I do. If a user goes to the NCBI page and can't make sense of it, they can simply go back to Wikipedia and try someplace else. But you seem determined to not give users that choice. And I'm struggling to think of many things more fundamental about an organism than its genome.
Perhaps another analogy might help. I regard NCBI identifiers as one of the key things that link biological information together. In the same way, things like ISBNs, DOIs, and PubMed ids are part of the way we link bibliographic information. The "References" sections of Wikipedia pages are littered with these identifiers that I'm sure mean little to most people. But they glue the citations to the web. NCBI taxonomy ids do the same thing in biology. Increasingly, if you want to know what we know about an organism you go to GenBank. --Roderic D. M. Page 05:46, 21 May 2010 (UTC)
None of those reasons justifies why this link needs to be in the taxobox, instead of in External links with all the other important sites. Again, I reiterate that the taxobox is for critical information. If genomics is so fundamental to the understanding of an organism, why do field guides, nature encyclopedias, and taxonomic references exclude it? And if NCBI identifiers are like ISBNs, why aren't they found across all biological sources like ISBNs are found across all books? Give me examples of NCBI codes being used like ISBNs in general biological literature. For an external link to be featured so prominently on every species article, it has to demonstrate significance above and beyond that of a regular external link. And I fail to see it. Tell me what the genetic and proteomic information on the Draco volans page, scant as it might be, actually says about the species. -- Yzx (talk) 06:15, 21 May 2010 (UTC)
Field guides and nature encyclopedias are typically printed books, which typically don't have URLs. Why judge what should be in an online encyclopaedia by reference to print publication? Many taxonomic references implicitly include NCBI taxonomy ids by virtue of listing GenBank sequence accession numbers, from which you can get back to the taxonomy id. What does a NCBI page tell me about an organism? Well, I can browse the sequences, which often have information on where they came from, I can find links to related sequences (which gives me clues about what taxa the species I'm interested in is related to), and there are links to publications, often expressly about that taxon.
I guess we're not going to agree on genomics being fundamental to the understanding of an organism. But I'd argue that the Internet and genomics are the defining technologies of the early 21st century, and it makes sense to explicitly link the two flagship resources, NCBI and Wikipedia, in a way that is clear, consistent, and easy to implement. If you look at some Wikipedia pages, such as for the gene BRCA1, the Infobox contains a wealth of information that you can choose to make use of, or not. I think it worthwhile to add similar value to Taxoboxes. You may see less value in genomics than I do, I accept that. But just because you don't see the value doesn't mean that there is no value, nor that there won't be Wikipedia users who will find the addition of NCBI ids to the Taxobox valuable. --Roderic D. M. Page 07:38, 21 May 2010 (UTC) —Preceding unsigned comment added by Rdmpage (talkcontribs)
Again, what does the NCBI page for Draco volans say about the species? There are 9 nucleotides and 2 proteins listed. What do those 11 pages teach the reader about Draco volans? -- Yzx (talk) 07:55, 21 May 2010 (UTC)
This depends on your definition of "reader." If the reader is not interested in genetics or the taxonomy of the species, then nothing. The taxonomy might not be authoritative, but it's still a useful starting point. E.g. you can also look at the genetic information of clades like Draconinae. As far as I'm aware, the NCBI taxonomy is the only one to assign numerical ids to "all" species, and they are therefore used by many projects (e.g. UniProt plus >1700 hits in Google Scholar). MichaK (talk) 08:24, 21 May 2010 (UTC)
That doesn't answer the question, which is, what do those 11 pages reveal about the biology of Draco volans? Don't worry about what the reader would be interested in. Just give an example. -- Yzx (talk) 08:29, 21 May 2010 (UTC)
That's like asking "what does a Latin binomial with authority teach the reader"? Most readers won't know how to interpret what it means, but some will. In the same way, the Apple_Inc page has an Infobox with the stock market symbol AAPL linked to NASDAQ. Most of that page is gibberish to me, but I accept that there's lots of valuable information there if you know how to interpret it, and I certainly wouldn't argue that it should be removed because I find stock market summaries to be impenetrable. I would argue that NCBI taxonomy ids are much the same thing. They are the currency of molecular biology and genomics, and including them in the Taxobox would be a great way to link those fields to Wikipedia at the level of organisms. Roderic D. M. Page 08:42, 21 May 2010 (UTC) —Preceding unsigned comment added by Rdmpage (talkcontribs)
Okay, so explain to me what those 11 pages say about Draco volans. Just one thing. -- Yzx (talk) 08:53, 21 May 2010 (UTC)
Browsing the taxonomy page, the nine linked sequences, and the page for the genus Draco, I discover:
  • There are only 9 sequences, so we don't know much about this lizard's genetics (we might wonder why that is, maybe it's rare)
  • The sequences come from papers with these titles:
      • Chameleon radiation by oceanic dispersal (interesting, chameleons can disperse over oceans)
      • Phylogenetic systematics of Southeast Asian flying lizards (Iguania: Agamidae: Draco) as inferred from mitochondrial DNA sequences (this sounds like an interesting paper, maybe we should add this to the Wikipedia page)
      • Re-evaluation of the status of Draco cornutus Gunther, 1864 (Reptilia: Agamidae) (unpublished, maybe we can find it, sounds like when somebody creates the Draco_cornutus page they should read this)
      • Phylogenetic relationships of the flying lizards, genus Draco (another unpublished paper, maybe we can find this)
  • Then, we look at the taxonomy page and go up a level to the genus Draco, and we see a list of species that is different from that on Wikipedia's Draco page (so we might start to investigate this further)
I think anybody genuinely curious about this lizard will discover a wealth of leads to further information, and anybody wanting to edit the sparse Wikipedia pages for this lizard will find some useful starting points here. --Roderic D. M. Page 08:58, 21 May 2010 (UTC)
I'm not talking about resources that are linked to from the NCBI page. We can link to those directly from our articles if we find them useful, and we should link Google or Web of Science if we want to give readers a list of papers. I'm talking about the substance on the NCBI page itself, which are those 9 nucleotides and 2 proteins. Tell me one thing they say about Draco volans. -- Yzx (talk) 09:03, 21 May 2010 (UTC)
The genetic sequences on their own don't tell you much, as they have to be placed in context (e.g. by using the "Run BLAST" or "Identify Conserved Domains" links). I think we have a disagreement on what the substance of the NCBI page is: For me, it is the taxonomy id, because that is a unique identifier for a particular species, which I can find in other databases. MichaK (talk) 09:07, 21 May 2010 (UTC)
Exactly. The sequences don't mean anything without interpretation. And the only ones qualified to interpret are trained scientists, not the readers of a general encyclopedia. Scientific conclusions are what an encyclopedia should reference, not the raw data. As for the IDs, if they were really standard species IDs akin to ISBNs, then they should be used by general taxonomic references, online and printed, as well as generally in biological literature. After all, the IUCN Red List assigns a unique number to every species in its database too. Give me evidence that NCBI codes are used this way. -- Yzx (talk) 09:19, 21 May 2010 (UTC)
But WP is not just a general encyclopedia. As Rod pointed out, it's the first Google hit for most species and it should thus also be a jumping board to further kinds of information. Second, look at other boxes, like the drug box (Fluoxetine, e.g. linking to MeSH, ATC, PubChem, IUPHAR, ...) or several versions of boxes on gene pages (NADH dehydrogenase or NDUFA1). Many of the links there are to scientific databases that, by your definition, shouldn't be there, yet they will be useful to some readers. Lastly, you picked ISBNs out of Rod's analogy on bibliographic info, where he also listed DOIs and PubMed ids. You won't see PubMed ids in many places, yet they are extremely useful to identify publications and might thus be "hidden" in links. MichaK (talk) 09:35, 21 May 2010 (UTC)
Ok, then demonstrate that NCBI IDs are used ubiquitously to identify species, like DOIs or PubMed ids are used to identify online articles. Almost every online scientific article database gives them. Does every online taxonomic database give or even use NCBI IDs? IUCN Red List doesn't. Encyclopedia of Life doesn't. GBIF doesn't. None of the FAO publications do. ITIS gives a Taxonomic Serial Number, which is different from the NCBI ID. -- Yzx (talk) 09:50, 21 May 2010 (UTC)
Actually, EOL does link to the NCBI Taxonomy page for each taxon that exists in GenBank. Asking for a ubiquitous identifier is a red herring -- the majority of published papers don't have DOIs or PubMed ids, but many do and we use them when available. Indeed, not all publishers use DOIs in their lists of literature cited; they are not always aware of the benefits of using these links. Enlightened publishers (and resources like Wikipedia) do. Why not do the same thing for taxa? Roderic D. M. Page (talk) 10:21, 21 May 2010 (UTC)
Linking to GenBank is not the same as using their IDs. That's like saying putting a direct link to a paper is the same thing as using a DOI. It's not. And what about all those other databases I mentioned? What about the fact that some of those databases give different IDs to those species? -- Yzx (talk) 17:08, 21 May 2010 (UTC)
I'm not at all clear what the sentence "Linking to GenBank is not the same as using their IDs. That's like saying putting a direct link to a paper is the same thing as using a DOI" means. Yes, different databases have different ids. This is pretty much a fact of life, just as we can have multiple identifiers for papers (DOIs and PubMed), books (ISBNs and OCLC numbers), etc. We make choices as to which ones are "best" using some criteria (such as longevity of the resource, coverage, use in other resources, etc.). It's clear from this thread that you're firmly in the "oppose" camp, so I suggest we simply agree to disagree. --Roderic D. M. Page (talk) 18:01, 21 May 2010 (UTC)
You're advocating giving a privileged link to a site containing genomic data that can't teach people about the species, because there's no associated interpretation, and an ID applicable only to a select subset of databases. I absolutely cannot comprehend why something of such limited utility needs to be in the taxobox. -- Yzx (talk) 20:39, 21 May 2010 (UTC)
While you judge it to be of limited utility, others may well see it differently. I can't comprehend why you seem unwilling to allow that others find this information useful, and would derive benefit from having this information easily accessible. --Roderic D. M. Page (talk) 21:34, 21 May 2010 (UTC)
I have asked you repeatedly to demonstrate the usefulness of the site by giving me a single example of what the NCBI genomic data on Draco volans reveals about the species. I'm still waiting. -- Yzx (talk) 21:54, 21 May 2010 (UTC)
Dear Yzx, I am not sure how much you worked in this area, and my experience is limited but existing; one thing that is pretty clear is that the NCBI species IDs are not only used by the NCBI database, but by many databases. I am working on metabolite databases (MetWare), and even there I am using NCBI IDs for reference to species. The NCBI database itself may not have the information you are looking for, but using the NCBI IDs you can certainly find that information if you really tried. As such, it is very much like the CAS number in the Chembox: it is a number specific for one database, to which most Wikipedia users have no access, hence no use, but is merely in the infobox because it is used by many, many databases. So, I see this proposal very much in line with existing Wikipedia practice, and strongly suggest the adoption of the NCBI identifier in the taxonomy infobox. EgonWillighagen (talk) 06:10, 22 May 2010 (UTC)
I don't think it's unreasonable to ask that a site linked to from Wikipedia teach the reader something about the subject, rather than simply linking to a database for the sake of linking to a database. Neither I nor any other reader should have to dig through the link to scrape some use out of it, but since you say it's there, I'll ask again. What does the data on the NCBI database say about Draco volans? -- Yzx (talk) 22:02, 22 May 2010 (UTC)
Personally, I find this previous answer to your question to be quite compelling. (But, of course I understand that reasonable people can disagree on this point.) Cheers, AndrewGNF (talk) 23:08, 22 May 2010 (UTC)
I assume you've already read my response to that above -- you don't link to a genomics database if you want a list of papers. I'm asking about what's actually on the NCBI database, i.e. the genomics. But really, I actually also want to know why this information is so vital that, not only should its link be mandatory on hundreds of thousands of articles, but that it should also be given special preference over all other external links in the article. For now though I'll just settle for a single piece of information about Draco volans, learned from its NCBI data. -- Yzx (talk) 00:45, 23 May 2010 (UTC)

Agree to place in "External links" section only: Per the guideline Wikipedia:External_links#External_links_section, external links can be placed only in the "External links" section. Placing a link to NCBI in the External links section is a good idea. For example, for gastropods it would be very useful (see Wikipedia_talk:WikiProject_Gastropods/Archive_3#Statistics_from_NCBI). There are 8227 records of gastropods at NCBI but there are 19500 articles on gastropods on Wikipedia. It should look, for example, like this: http://en.wikipedia.org/w/index.php?title=Quantula_striata&oldid=351448694

--Snek01 (talk) 20:00, 20 May 2010 (UTC)

As I discussed in my response to User:UtherSRG above, I don't think it is helpful to readers to treat NCBI taxonomy ids as external links; they are identifiers. Many Wikipedia pages incorporate standard identifiers in their Infoboxes, and I'm suggesting Taxoboxes would be improved by doing the same thing. I don't think adding them as external links is the answer. There's little standardisation in how the links would be described (for example, what URL to use and how to describe it in the text). --Roderic D. M. Page 22:07, 20 May 2010 (UTC) —Preceding unsigned comment added by Rdmpage (talkcontribs)
Comment I'm uncertain as to whether we shouldn't be linking to wikispecies instead for our authoritative taxonomic information. StuartYeates (talk) 00:09, 21 May 2010 (UTC)
NCBI is primarily a source of information on the genomics of organisms. My argument for linking to it is to connect information in Wikipedia to information on genomics, rather than taxonomy per se. --Roderic D. M. Page 03:43, 21 May 2010 (UTC) —Preceding unsigned comment added by Rdmpage (talkcontribs)
The WikiSpecies pages I saw don't link to the NCBI taxonomy either, so this wouldn't solve the problem. Of course, the taxobox could also link to WikiSpecies. MichaK (talk) 08:26, 21 May 2010 (UTC)
We could easily write a template to standardize that, which would turn {{NCBI Taxonomy|Mus musculus musculus|39442}} consistently into, for example, Mus musculus musculus (On NCBI). However, it would be very valuable to have machine-readable species identifiers embedded on species pages on Wikipedia, using the species microformat or another similar standard. The best place to do that would be in the Taxobox, since there's exactly one to a species page anyway, or in a species-identifier info box, which could list out any number of biodiversity database identifiers. There's a good example of the former in the archive, and I've mocked up an example of the latter. I'd support any scheme which would make it easier to link Wikipedia up with other databases -- Gaurav (talk) 18:45, 21 May 2010 (UTC)
Here's a screen shot of the User:Bilardi/taxoboxtestmockup mentioned, which illustrates a perhaps more elaborate version of a Taxobox. I favour the Taxobox approach as it is unique (one per page) and readily machine readable --Roderic D. M. Page (talk) 21:34, 21 May 2010 (UTC)
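
For illustration, the standardising template Gaurav describes above could behave roughly like the Python sketch below. This is only a rough model, not an existing template: the function name `render_ncbi_template`, the regular expression and the NCBI Taxonomy browser URL pattern are all assumptions made for the example, and a real implementation would be written in wikitext template syntax rather than Python.

```python
import re

# Assumed URL pattern for the NCBI Taxonomy browser (not taken from the discussion).
NCBI_URL = "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={taxid}"

# Matches a hypothetical {{NCBI Taxonomy|<name>|<id>}} template call.
TEMPLATE_RE = re.compile(r"\{\{NCBI Taxonomy\|([^|{}]+)\|(\d+)\}\}")


def render_ncbi_template(wikitext: str) -> str:
    """Expand every {{NCBI Taxonomy|name|id}} call into one consistent link format."""

    def expand(match: re.Match) -> str:
        name = match.group(1).strip()
        taxid = match.group(2)
        url = NCBI_URL.format(taxid=taxid)
        # Wikitext external link with the taxon name in italics.
        return f"[{url} ''{name}''] (On NCBI)"

    return TEMPLATE_RE.sub(expand, wikitext)


if __name__ == "__main__":
    print(render_ncbi_template("{{NCBI Taxonomy|Mus musculus musculus|39442}}"))
```

Because the editor enters only the name and the numeric id, the URL format lives in one place and could be changed for every page at once, and the id stays available in machine-readable form.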

Summary so far

This thread is getting long, so I thought I'd summarise the state of play:

Agree

Oppose

  • Yzx
  • Snek01 ("Agree to place in "External links" section only" is effectively oppose as the proposal is to put NCBI taxonomy ids in the Taxobox) --Roderic D. M. Page 09:32, 21 May 2010 (UTC) —Preceding unsigned comment added by Rdmpage (talkcontribs)
Presumably the alternate solution suggested is something like template:NRDB species as used in Daboia with the internal numbers in the template supporting any mashup requirements ? Shyamal (talk) 03:01, 24 May 2010 (UTC)

Ways to proceed

  • While we aren't voting here, overwhelming consensus does appear to be towards incorporating these ids at this time. If there are any administrators here, I'd suggest they Be Bold. I don't mean to close the discussion or ignore Yzx and Snek01's arguments: I'm just pointing out that it is extremely easy to remove the NCBI field later if needed. Being Bold here might also benefit the opposition, as making the change might demonstrate that the proposed taxobox would actually be confusing or ugly. If this page was not protected, I'd've begun BOLD, revert, discussing it at this point; I think that's the fastest way to sharpen the arguments and reach a consensus. -- Gaurav (talk) 05:53, 23 May 2010 (UTC)

Against guidelines

Unfortunately the newbie User:Rdmpage suggested something that is against guidelines. I would like to point to an example: Template:Infobox film used to have an Imdb parameter, but after a years-long discussion those links were moved to the External links section. Enforcing something against guidelines has no chance. Guidelines are long-used standards that we all should respect. --Snek01 (talk) 10:43, 21 May 2010 (UTC)

Sorry, but calling someone a newbie who has been registered for two years doesn't help. Can you please link to the exact guideline that links to external pages should only be under "external links"? Also, please comment on Template:Drugbox, Template:Chembox, Template:GNF_Protein_box and Template:Infobox_rfam, all of which contain links to external pages. MichaK (talk) 10:55, 21 May 2010 (UTC)
Looking at the discussion about Imdb it's clear that part of the issue concerned Imdb's reliability. NCBI is a curated scientific database. And as MichaK points out, many Wikipedia pages have Infoboxes replete with useful links to external pages. I realise that this suggestion has opened a can of worms and relates to larger debates (see Please_stop_the_infobox_disease for a view that I think would be an unmitigated disaster for Wikipedia) --Roderic D. M. Page (talk) 11:09, 21 May 2010 (UTC)
Snek01, you say the adoption of the NCBI IDs "is against guidelines". I am not sure that is true; as I discussed earlier and seeing MichaK also brings it up, the Chembox contradicts your observed guideline. Perhaps there actually is such a guideline, as you indicate, but then please point us to it. EgonWillighagen (talk) 06:22, 22 May 2010 (UTC)
Agree. It needs to be setup to show up in the infobox only if the NCBI Taxo ID parameter is entered. Ganeshk (talk) 11:28, 21 May 2010 (UTC)
Can we start an external resources section on the Infobox? Maybe we can add other links into this section in the future. Ganeshk (talk) 11:47, 21 May 2010 (UTC)
I'd be in favour of this. ITIS and EOL identifiers would seem obvious candidates (EOL especially given that EOL pages now incorporate Wikipedia pages). In revisiting the issue of NCBI ids I'd not tackled the question of adding other ids, for fear of aggravating those who want no truck with identifiers of any kind. --Roderic D. M. Page (talk) 13:37, 21 May 2010 (UTC)
I'd be wary that that'd lead to a lot of clutter, as well as difficulty in selecting which of the biodiversity databases is worth linking to directly from the infobox. I wonder if the best solution isn't the one used by the GeoHack tool on the Wikipedia toolserver: provide a single "Databases" link on the Taxobox which takes you to a page listing different websites where more information on this taxon can be provided. I know NCBI uses LinkOut to provide something like this, which is probably Good Enough for now. -- Gaurav (talk) 05:26, 23 May 2010 (UTC)
Yes, we could regard NCBI Linkout as a way to avoid clutter. The NCBI taxon page would be the point of entry into the world of biodiversity databases (in effect, being the taxonomic equivalent of the GeoHack tool). --Roderic D. M. Page (talk) 22:07, 23 May 2010 (UTC)
I tried searching for a guideline against external links in infoboxes, but couldn't find one. WP:EL cautions against putting external links into the body of the article, which makes sense to me, as users would expect links in a hyperlinked encyclopedia to point to other articles, not to external websites. I also agree with you that external links in infoboxes should be used sparingly: I'm worried about too many identifiers cluttering up the infobox, or about difficulties in ensuring that only relevant, informative resources are linked. However, I think NCBI identifiers count as identifiers, not external resources. We are not telling users that NCBI is an informative website where they can find technical information on the taxon in question (although this is true); we are telling users that NCBI identifies this taxon using this identifier. I think this is directly analogous with inserting geographical coordinates into articles: we're not telling users that they can find out more about Singapore by looking at 1N 103E on a map, but we're providing the map coordinates so that users might use them in any way they like. This is a powerful feature, allowing map providers to geolocate Wikipedia articles, providing a whole new way of browsing the encyclopedia. Thus, I see NCBI taxon ids as providing a clear benefit to the Taxobox. -- Gaurav (talk) 05:26, 23 May 2010 (UTC)
This is a nice analogy. Yes, NCBI taxonomy ids are best thought of as identifiers. Latitude and longitude provide a way to anchor information in geographic space; two seemingly unrelated data sources can be linked by virtue of being about the same place. In much the same way, NCBI taxonomy ids are used across many genomic and molecular databases to refer to the same taxon. They provide a way to join this information together. --Roderic D. M. Page (talk) 22:07, 23 May 2010 (UTC)
They are closer to PubChem numbers, no? And we do include PubChem numbers in infoboxes.©Geni 17:08, 24 May 2010 (UTC)

Another way of looking at it

The NCBI id is a resource, in that it can be used to look things up, but it is also a name.

Bird species taxoboxes routinely include at least two names, the binomen and the "officialized" English name (see Northern Mockingbird for an example). Plant species infoboxes much less often include common names. There are a number of reasons for this, but a salient one is that plant common names don't often have an accepted 1:1 correspondence with scientific names. Bird "common" names do.

I suggest that a taxobox for a species should contain every widely accepted name for that species, be it official English name, lsid, guid, NCBI id, or whatever. This seems only logical. And all these other names should be redirects in Wikipedia to the species article. This, too, seems only logical.

So to me, there are only two questions left to settle: (1) should names be links, (2) how should we handle names for higher taxa, since they are much more dependent on circumscription? --Curtis Clark (talk) 23:06, 21 May 2010 (UTC)

I think the main problem with your post is that the NCBI ID is not a widely used name—indeed, I don't think I've ever seen it mentioned in the scientific literature. Ucucha 09:17, 23 May 2010 (UTC)
I see that there are some contrary examples above; perhaps the ID is more widely used in databases and technical genetic fields, which I don't have experience with. Ucucha 09:19, 23 May 2010 (UTC)
I didn't necessarily mean "widely used", but rather "widely accepted". I don't think anyone is proposing that a different identifier be used to search for information in NCBI.--Curtis Clark (talk) 04:46, 24 May 2010 (UTC)
I think it's clear that the relative lack of visibility of NCBI taxonomy ids gives the (misleading) impression that they aren't widely used. These identifiers are the glue that binds many molecular databases together, and binds them to data about organisms via the NCBI taxonomy pages. In the same way, DOIs bind published literature together, but are often barely visible. For example, the paper Johnson, Jeff A.; Lerner, Heather RL; Rasmussen, Pamela C.; Mindell, David P. (2006). "Systematics within Gyps vultures: a clade at risk". BMC Evolutionary Biology. 6: 65. doi:10.1186/1471-2148-6-65. has the DOI 10.1186/1471-2148-6-65. This is the only DOI you see when you visit the web page for the paper. In the list of references at the end of the paper you see lots of links labelled "Publisher Full Text". These use DOIs to make the links, even if the DOIs aren't directly displayed (if you mouse over the link you should see the DOI). Just because DOIs (or, indeed, other bibliographic identifiers such as ISBNs) might not be directly displayed doesn't mean they are not at the core of identifying articles or books. This paper doesn't mention NCBI taxonomy ids directly, but does list the sequences in GenBank, and each of these is connected to a NCBI taxonomy id. Anybody wanting to find all the sequences from one of these vultures will use that taxonomy id to find them. They are at the core of binding together what we know about an organism. --Roderic D. M. Page (talk) 09:58, 24 May 2010 (UTC)

All reasons provided by rdmpage are wrong:

  • "It would provide a valuable service to Wikipedia readers by enabling them to go to NCBI to discover more about a taxon". Everybody agrees that NCBI can be a valuable source, but that does not mean that it has to be placed in the taxobox.
  • "It would help Wikipedia contributors by providing a standardised way to refer to NCBI (and enable bots to add missing NCBI taxonomy ids). Putting them in an External links section makes it harder to be consistent (there are various ways to write a URL linking to the NCBI taxonomy)". The standardized way of DISPLAYING external links in Wikipedia is in the "External links" section at the bottom of the page.
  • "It would facilitate linking from NCBI to Wikipedia. A mapping of Wikipedia pages to NCBI taxonomy ids could be added to NCBI Linkout, generating more traffic to the Wikipedia pages". Traffic is irrelevant to this decision.
  • "Projects that are trying to integrate information from different sources would be able to combine information of genomics from NCBI with other information much more readily". The purpose of Wikipedia is not to help "projects that are trying" to do something, but to provide all relevant information in one place. Generally there is no reason why a link to some source should be placed within the taxobox, especially when there is another place where such links are placed. --Snek01 (talk) 05:11, 22 May 2010 (UTC)
Sorry, but this seems like a non sequitur to my comments. Did you mean to put it in the section above? Are you opposed to including it in the taxobox as a name, rather than as a link?--Curtis Clark (talk) 13:50, 22 May 2010 (UTC)

A few questions

Okay, this is a bit outside my field and I haven't read all of the above, so please forgive me if these have already been answered.

  • 1) Are NCBI codes a free and open standard?
  • 2) How solid are they? What I mean is, once an NCBI code is set, does it continue to mean the same thing forever or do they shift as new information comes to light?
  • 3) How specific are they? We've hit issues with some chemical database systems including duplicates and the like.
  • 4) Does the field of biology have any competing systems and if so how many?

©Geni 11:21, 23 May 2010 (UTC)

  • 2) The ids are stable, in the sense that most of them do not change (especially at the species level). However, if new information leads NCBI to restructure the classification, it is possible that some higher level taxa will be removed. NCBI keeps a record of which nodes it deletes.
  • 3) Each id is intended to be unique. I have come across rare occasions when two NCBI taxon ids exist for the same species because the scientists depositing DNA sequences didn't realise their study organism was already in the database under a different name.
  • 4) It depends on the scope of your question (biology is a pretty big field). Within molecular biology pretty much all databases that include taxa use NCBI taxonomy ids. These include very large databases such as GenBank itself (over 100 million DNA sequences, see GenBank) and UniProt with over 500,000 protein sequences, as well as numerous databases derived from GenBank. The journal Nucleic_Acids_Research publishes an annual update of the databases in this field, many of which will make use of NCBI taxonomy ids. Within taxonomy there are numerous databases, each with their own identifiers. Indeed, this is one of the problems the field faces, a plethora of competing databases (see Thomas, Claire (2009). "Biodiversity Databases Spread, Prompting Unification Call". Science. 324 (5935): 1632–1633. Bibcode:2009Sci...324.1632T. doi:10.1126/science.324_1632. PMID 19556479.). None of these identifiers has anything like the widespread adoption of NCBI taxonomy ids, with the possible exception of ITIS. --Roderic D. M. Page (talk) 22:32, 23 May 2010 (UTC)
The paper you provided clearly states that there is currently no unified system of taxonomic identification tying together all, or even the major biological databases. Even if a single ID is used across the fields of molecular biology, it is still a far cry from being an "alternate name" for a species, so long as there is still no consistent system across the rest of biology. It would thus be disingenuous for Wikipedia to put the NCBI ID in the taxobox without also including all its alternatives, as that would imply a degree of universal adoption that does not exist. Perhaps when the "integrated database" spoken of in the paper is fully operational, then adding an ID number onto every species page may be considered. -- Yzx (talk) 00:24, 24 May 2010 (UTC)
Everyone seems to ignore my point about alternate names. English bird names have never been used for plants, and likely never will be, so they will never be consistent across the rest of biology. Nevertheless, they are included in taxoboxes. I don't see any reason not to include NCBI and other identifiers in taxoboxes. Your vehemence makes me wonder what your actual agenda is.--Curtis Clark (talk) 04:39, 24 May 2010 (UTC)
What? Of course a bird name wouldn't be used for a plant. I'm talking about if a bird species had five different IDs in five different databases (though the article mentions that there are 100+ biological databases and I'm not going to guess at how many systems they use), then Wikipedia should list all or none of them to avoid making judgment calls on which ones are more valid. This is specifically what I say above. -- Yzx (talk) 04:52, 24 May 2010 (UTC)
There isn't really any unified system of identifiers for organisms in biology. Taxonomic names (already used in Wikipedia) are not universal. There are different taxonomic codes, such as the International_Code_of_Zoological_Nomenclature for animals, the International_Code_of_Botanical_Nomenclature for plants, and the International_Code_of_Nomenclature_of_Bacteria (see Nomenclature_Codes). There are some taxa that have too many names, there are some that have none. Ambiregnal taxa are governed by both the Zoological and Botanical codes because in the past taxonomists have differed over whether the organisms were animals or plants (based on an outdated view of the evolution of Eukaryotes). These taxa may have both an animal name and a plant name. In some fungi different life history stages ( teleomorphs, anamorphs, and holomorphs) may have different names, each name attached to a different part of their life cycle. Many bacteria don't have formal taxonomic names because the code requires the bacteria to be cultured before it can be named. As a result, the number of formally described bacterial species bears little relation to the actual diversity of bacteria, and informal names are in widespread use. Then there are competing systems such as the PhyloCode.
The use of any naming system requires choices. Any system is flawed; the flaws may be less obvious in some cases because we are so familiar with the system (or at least parts of it). A key thing about NCBI taxonomy ids is that they span all of life, and any organism that has sequences in GenBank has one. They are probably the only identifier that is directly linked to data about the organism (i.e., the sequences). I think it is entirely possible to make an informed judgement about the relative value of adding different identifiers to the Taxobox. --Roderic D. M. Page (talk) 09:40, 24 May 2010 (UTC)
Multiple standards are not a massive problem. Common and toxic chemicals such as Sodium hydroxide can have as many as six separate ID numbers and we get by.©Geni 17:17, 24 May 2010 (UTC)
(edit conflict) We do have a unified system, it's called a binomial name. It is by far the most widely used system of identifiers across biology. The point is not that an ID number might be more suitable in some situations, it's that there is nothing approaching consensus across the many branches of biology about the use of any particular ID number system for species, as is stated by the source you provided. I realize it may be clear to you which system is the most valuable, but it is also clear from discussion that your views are not shared even by the very small sampling of biologists (and biology-minded individuals) here. It is not up to Wikipedia to make the judgment or choice to promulgate any system, if the current situation is that there is no coordination of ID numbers between biological databases as a whole. -- Yzx (talk) 17:23, 24 May 2010 (UTC)
As Rod pointed out, this is not even true. There are three widely used codes of nomenclature, with subtly different rules and overlapping scope. And even within a code, there is synonymy. I agree that these codes constitute the most widely used system(s), but they aren't the only systems (witness the codified English names for birds). You misunderstand the scope of the NCBI codes: anyone who uses Genbank or any of the related resources uses them, directly or indirectly. They are not intended to replace binomials; they are intended to provide a stable ID, directly tied to species, that is useful in an information system. If species names never changed, NCBI wouldn't need separate IDs. Arguably there are issues of circumscription and typification that NCBI skirts (and that the codes of nomenclature have grappled with for centuries), and arguably a single set of unique taxon IDs could have been shared among all data services (and someday might be), but the NCBI IDs are in use now.
I was ambivalent about including numeric IDs (just as I am ambivalent about IUCN status, fossil range, and synonyms), but the opposition I have seen here has pushed me into the "support" camp.--Curtis Clark (talk) 02:26, 25 May 2010 (UTC)
I think that User:Rdmpage writes serious inaccuracies here. For example, User:Geni asked "are NCBI codes a free and open standard?" I think that the correct answer is "no". The identifier (the number) called "Taxonomy ID" used by NCBI is unique in their database, but it is just a number referring to the scientific name of the taxon. So there is no need to display the "Taxonomy ID" in Wikipedia. --Snek01 (talk) 00:17, 25 May 2010 (UTC)
If it was just a number referring to the scientific name, it would change when the name changes, right?--Curtis Clark (talk) 02:26, 25 May 2010 (UTC)
That is a good question. I am sorry for my mistake (I am not a native speaker). I will try to correct myself: it is just an ordinary number referring to a certain taxon (which has usually been given some name). But in reality I do not know what will happen on NCBI when anything changes; you can probably read on their website about how they make changes if necessary. However, the important thing is this: the number is important for making a link, but the number is not important for the reader and it is not useful to display it on Wikipedia. --Snek01 (talk) 15:58, 26 May 2010 (UTC)

More votes

I've been lurking, sorta following the discussion. I'm opposed. Infoboxes are prime real estate. I don't believe these codes are important and useful enough to muscle their way in. Put them in the external links section. Hesperian 11:16, 24 May 2010 (UTC)

It always surprises me when I disagree with Hesperian. I agree that links should be in the external links section, but these are names. If the taxoboxes are overcrowded, it's because we've added things like IUCN status, fossil range, and synonyms, all of which could as easily go in the text. If I understand these identifiers correctly, they are just like nomenclatural synonyms, except that they are in current use.--Curtis Clark (talk) 02:05, 25 May 2010 (UTC)
That is the problem. You, Curtis Clark, do not understand these identifiers correctly. They are neither nomenclatural synonyms nor like nomenclatural synonyms. They are obvious numbers that are a part of a URL of a certain website. --Snek01 (talk) 13:00, 25 May 2010 (UTC)
I have nothing more to say to you about any issue, and I will no longer address your posts.--Curtis Clark (talk) 02:29, 26 May 2010 (UTC)
Before this escalates, can I suggest that we avoid making assumptions about what others do and do not understand? I think it often does not contribute to the discussion, and more importantly I think those assumptions are often wrong. Stepping back a bit here, I think we're all on the same team here -- everyone's trying to make WP better. Along those lines, I think MichaK has put together a concrete suggestion to move forward. Perhaps we should go comment on that proposal? Cheers, AndrewGNF (talk) 16:15, 25 May 2010 (UTC)

Strongly agree: I only just stumbled across this discussion. Often the only consistent hook for joining across multiple DBs is the ubiquitous NCBI Tax. ID. For example, in a recent paper I co-authored (snoPatrol), I joined the Pfam, Rfam and GOLD databases. The only reasonable way to do these sorts of things is using NCBI Tax. Ids. I could envisage the NCBI Ids in Taxobox playing an essential role in a lot of future research. --Paul (talk) 12:42, 24 May 2010 (UTC)

Note on above: the "strongly agree" refers to the proposal to include (not to the previous comment); → unindented by me. --G.Hagedorn (talk) 18:21, 24 May 2010 (UTC)
Thanks for clarifying :) --Paul (talk) 19:42, 24 May 2010 (UTC)
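
To make the kind of join Paul describes above concrete, here is a minimal Python sketch of linking records from separate sources through a shared taxon id. The source names and record strings are made-up placeholders, not real database entries, and `join_on_taxid` is a hypothetical helper. The id 39442 is the Mus musculus musculus example used earlier in this thread; 9606 and 10090 are assumed here to be the NCBI ids for Homo sapiens and Mus musculus.

```python
def join_on_taxid(*sources: dict) -> dict:
    """Inner join: keep only taxon ids that appear in every source.

    Each source maps an NCBI taxonomy id (int) to that source's own record.
    The result maps each shared id to the list of records, in source order.
    """
    if not sources:
        return {}
    shared = set(sources[0])
    for source in sources[1:]:
        shared &= set(source)
    return {taxid: [source[taxid] for source in sources] for taxid in sorted(shared)}


if __name__ == "__main__":
    # Placeholder records keyed by taxon id; the values are illustrative only.
    protein_db = {39442: "protein-record-A", 9606: "protein-record-B"}
    rna_db = {39442: "rna-record-C", 10090: "rna-record-D"}

    for taxid, records in join_on_taxid(protein_db, rna_db).items():
        print(taxid, records)
```

The join needs no name matching at all, which is the practical point made above: synonyms and spelling variants do not matter as long as both sources carry the same id.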

Agree: include NCBI taxon IDs in taxobox. Major points: Linking information from and to Wikipedia is an essential service to the public. I believe the issue of "external links" is misleading here. Separating external links is intended for links referring to related information pages, especially where the URL may be unstable. However, it is established practice in Wikipedia that identifiers, where they happen to be URLs, may be present in infoboxes. This is true for most organizations (homepage URL in infobox), but also for chemicals (see Sodium hydroxide mentioned above, 4 out of 6 IDs are URLs), or for web standards like XHTML, OWL. NCBI codes are identifiers that happen to be URLs. In the template, the identifier should be entered, not the URL (but the template may render it as a URL). The NCBI is a stable, public service and should not be compared with a commercial website like the Internet Movie Database. Is NCBI the only identifier system? No. But my understanding of the consensus of the experts in the discussion is that it is presently the best. And should a better one emerge, the NCBI identifiers will be a solid foundation for migration. --G.Hagedorn (talk) 18:21, 24 May 2010 (UTC)

Oppose like Yzx. Looking at Bilardi's example, this seems like something of little use to anybody that doesn't belong in the infoboxes of organisms. —innotata 00:48, 25 May 2010 (UTC)

Did you also look at Gaurav's example? The information in this example is more along the lines of what has been proposed here, namely identifiers from NCBI taxonomy and other relevant databases. MichaK (talk) 12:05, 25 May 2010 (UTC)
All of these links in "Gaurav's example" are ordinary external links. Their identifiers are not useful. And so they are not even mentioned in Featured articles. If they are not mentioned in featured articles, then they can be considered as not notable. Ordinary external links do not need any special placement. --Snek01 (talk) 12:52, 25 May 2010 (UTC)
Gaurav's example doesn't look better, and there is no indication one is favoured by those supporting this proposal. Where are we going to draw the line with these identifiers which would be mostly useless? And there are lots of other problems, such as taxonomies that don't match. —innotata 23:54, 26 May 2010 (UTC)
I'm not sure, but it seems that example is intended for the external links. If the question is whether we should place them there, I don't care—count my opinion as neutral. —innotata 14:18, 27 May 2010 (UTC)

Opposed. If we include NCBI codes, we should add other database IDs too, and the taxoboxes are crowded as they are. As one who primarily works with extinct species, I see little use for the parameter. By all means add it, but to the external links section. --Petter Bøckman (talk) 15:05, 25 May 2010 (UTC)

Extinct species are also in the NCBI Taxonomy, and we increasingly see genetic information from extinct species (e.g. Neanderthal and mammoth). Furthermore, there might (will?) be databases that contain your species of interest, and it would be of benefit to add them. MichaK (talk) 15:41, 25 May 2010 (UTC)
Genes for Synapsids and Labyrinthodonts? I don't think so, mate. When (if) the NCBI names become standard references, they should be included in the taxobox. I am working in a Natural History Museum where much of the activity is "barcoding of life" stuff, and I have yet to see NCBI names as standard. Keep them in the "external links" section, we can always make a bot to put them in the taxobox later should it become more appropriate. --Petter Bøckman (talk) 18:59, 25 May 2010 (UTC)
Petter, can you comment on how you feel about the examples in {{Taxobox/testcases}}? There are a few examples in there (e.g., Apatosaurus) for which no NCBI taxonomy ID exists. In those cases the Taxobox is unchanged. When the NCBI ID does exist, then it appears in a collapsed section. Thoughts? Cheers, AndrewGNF (talk) 19:36, 25 May 2010 (UTC)
Sure, the NCBI taxonomy does not go that far back. But there are dbs that do, like the Paleobiology Database, which has an entry for Dimetrodon that is not linked from the wiki page. Don't you think it would be useful to add this information (in a common place for all relevant species)? MichaK (talk) 20:27, 25 May 2010 (UTC)
Useful, yes, but not in the taxobox. And why just NCBI and paleodb? Why not ITIS? Why not Tree of Life and Animal Diversity Web? By including a single (though relevant) database, we are swapping from reporting facts to taking sides in who has the right of way. I still say hold it until NCBI is a de facto alternate name used throughout, or put it under external links. Petter Bøckman (talk) 09:14, 26 May 2010 (UTC)
Of course it should be inclusive. A db will only show up for relevant species, so someone working on living species has no reason to dispute the addition of paleodb. Realistically, biology is so large and dispersed that there will never be a unique identifier for all species that ever lived on this planet. MichaK (talk) 09:46, 26 May 2010 (UTC)
Well, my argument is still that NCBI numbers are not commonly used alternate names. When/if they become a ubiquitous signifier, I'm all for putting them in the taxobox. Until then, they are about as relevant as other databases like ITIS, ADW or TOLweb. Petter Bøckman (talk) 10:39, 26 May 2010 (UTC)
  • Oppose I'm late to the party; been traveling and didn't have the time to read the discussions. I agree in principle that linking to these databases is generally a good idea. I disagree that the taxobox is the place to do it. The only arguments for the specific placement in the taxobox that I see are cases of WP:OTHERSTUFFEXISTS. I don't see any reason why database links can't go in the already established and widely recognized external links section; readers expect our links to be there, not in the infobox (note: not a linkbox). The taxobox is cluttered as it is. Rkitko (talk) 02:26, 29 May 2010 (UTC)

Suggestion to move forward with prototyping

It sounds like there would be sufficient support to move forward in one form or another in principle, so perhaps the next step is to actually put together some prototypes. Maybe we can continue to build consensus on something tangible. There are already a {{Taxobox/sandbox}} and {{Taxobox/testcases}} set up. Who wants to take the first stab at it? Cheers, AndrewGNF (talk) 22:08, 24 May 2010 (UTC)

Some summary:
  • "Taxonomy ID" is not notable.
  • If a link were added to the infobox, it would be an exact duplicate of the link in the "External links" section.
  • Everything that is technically possible to do in the taxobox is also technically possible to do in the External links section.
--Snek01 (talk) 00:28, 25 May 2010 (UTC)
Thank you for the summary of your perspective. My hope in starting a new section was to suggest that the proponents of this proposal move forward to prototype something tangible. Perhaps when some users see the prototype they will think that the additional parameter makes a lot of sense. Perhaps others will agree with you that it should be in External links. Regardless, my suggestion was that we move toward something concrete so we all don't debate something amorphous indefinitely. Of course, since this template is protected, we'll have to get an administrator to gauge whether we reach consensus during the prototyping process. So I think everyone is motivated to continue to seek common ground. Cheers, AndrewGNF (talk) 00:38, 25 May 2010 (UTC)
I'm not convinced that everyone here is seeking common ground.--Curtis Clark (talk) 02:01, 25 May 2010 (UTC)
Well personally, I think the stated arguments are all, for the most part, pretty reasonable. My bet is still that we'll find some consensus that we all are comfortable moving forward with. I'm hoping having some concrete prototypes will help get us there quicker... Cheers, AndrewGNF (talk) 03:26, 25 May 2010 (UTC)

Auto-collapse identifiers

Many of the opposing votes are concerned about sacrificing screen real estate to subjectively irrelevant identifiers. I think many of the supporters are motivated by the fact that the infobox is a central, unique and structured place in the article. In contrast, the external links section is a free-format, unorganized mixture of links to various places, and thus not a good place to put identifiers. (As said elsewhere, it happens that the identifiers can be translated into links, but that does not make them the same as links.) I propose to keep the external identifiers section collapsed by default, so that screen real estate is not wasted for those who don't want to see the external ids, but there is still a central/unique/etc. place for them for everyone who wants to look for information beyond WP. Here's a first mockup using an edited template (the colors don't work yet, and I only added NCBI taxonomy). MichaK (talk) 16:04, 25 May 2010 (UTC)

I think this is an elegant compromise. --Roderic D. M. Page (talk) 16:23, 25 May 2010 (UTC)
I like this too. I've copied MichaK's sandbox to {{Taxobox/sandbox}}, so the side by side comparison can be seen at {{Taxobox/testcases}}. Cheers, AndrewGNF (talk) 17:37, 25 May 2010 (UTC)
Thanks Andrew. The main development should be in the sandbox, of course. MichaK (talk) 18:48, 25 May 2010 (UTC)
Nothing was really going on in {{Taxobox/sandbox}} recently, so I just nominated your work to be the "main development" branch... ;) Hope you don't mind... Cheers, AndrewGNF (talk) 18:57, 25 May 2010 (UTC)
Nope, I tried to agree with your actions. :) MichaK (talk) 20:08, 25 May 2010 (UTC)

Either they are relevant, in which case they should no more be collapsed than IUCN status or fossil range, or they are irrelevant, in which case they don't belong in the taxobox under any circumstance. I happen to think they are relevant, but I could probably be talked out of that by reasoned discourse. I'm concerned, though, about collapsible sections in the taxobox: they sidestep the question of whether the information really belongs there, and might lead to two different kinds of abuse: adding more and more information in collapsible sections, and taking existing elements and collapsing them because some editors don't find them useful.

How about an identifier template that would go under the taxobox? It could be collapsible or not, it could be created unilaterally, it could conceivably even take its width from the taxobox, it could contain links with far less controversy, and its inclusion in an article would become a question of its relevance to the article, not its relevance to all of biology. And in a year or so, editors might start demanding to know why it shouldn't be combined with the taxobox.--Curtis Clark (talk) 02:53, 26 May 2010 (UTC)

I definitely see your point that collapsed sections are a slippery slope. I got the idea from the GNF Protein Box, where the Gene Ontology section is collapsed by default (e.g. on NDUFA1). It's relevant to have it there, but as it can be quite long, it's also good to only show it if necessary. However, I think that an extra box just below the taxobox would just disperse this dispute over many individual species pages, as many of the arguments raised against the inclusion into the taxobox could also be raised against a separate box. :-/ MichaK (talk) 06:56, 26 May 2010 (UTC)
I'm all for an extra box, but I think it should go way down on the page, together with links to other external resources like commons and wikispecies. A formalized bottom box for organisms where one can put interesting parameters all organisms should have would be nice. The taxobox should in my view be reserved for the essential navigational information. Petter Bøckman (talk) 09:19, 26 May 2010 (UTC)
Well, perhaps this is the way to move forward. IMHO, external links are important enough to be near the top of the page, which is a main disagreement here. However, there seems to be more support for having a structured, consistent, bot-maintainable, machine-readable box (or part of a box). It is important to me that this information is easily accessible (i.e. "with one click"). Thus, it would have to have its own TOC entry, or be displayed prominently under "external links". As Curtis Clark suggested, this might be an intermediate solution and at some point there might be support to make this part of the taxobox. MichaK (talk) 09:44, 26 May 2010 (UTC)

Implementation

For ease of maintenance, I think the external ids should live on separate pages, much as is the case with Template:PBB. E.g. for NDUFA1, the protein box is included via {{PBB|geneid=4694}}, which fetches Template:PBB/4694 (which in turn populates Template:GNF_Protein_box). So I propose a similar indirect way, e.g. {{SpeciesID|species=Mus_musculus}}, which contains the relevant information. @Andrew, I guess this would also make it easier to re-use the protein box bot, no? MichaK (talk) 09:54, 26 May 2010 (UTC)

I'm not sure I understand. So there would either be a Template:TaxonIds/Mus which contains all the taxon information for the Mus genus, or a Template:TaxonIds/Mus_musculus page, which would be incorporated by the TaxonIds template? Why would this be easier than just incorporating all the ids into a single template in the external links section, such as {{TaxonIds|ncbi=1234|eol=1234|wikispecies=Alpha beta}} which produces the box you've designed? -- Gaurav (talk) 17:02, 26 May 2010 (UTC)
MichaK, I'm not entirely sure that the extra layer of abstraction is necessary here. The primary motivation for using that system for the gene pages is so that the ugly template code would not appear at the top of the page. That would be very intimidating for newbie editors (with which the biology community we were targeting is filled). Since the template you've designed is simpler and would appear at/near the bottom, I think this is less of an issue... Cheers, AndrewGNF (talk) 17:22, 26 May 2010 (UTC)
Ok, thanks for clarifying this. The more direct solution is totally fine. MichaK (talk) 18:43, 26 May 2010 (UTC)
But I do very much like the TaxonIds template that Gaurav proposes above to handle the rendering and presentation. It will keep the species pages much cleaner (making it much friendlier for both human editors and bots), and also make changes easier down the road... Cheers, AndrewGNF (talk) 17:27, 26 May 2010 (UTC)
Taking a page out of Snek01's book, I've put MichaK's template (with some changes) up at Template:TaxonIds. Let's wikiroll! -- Gaurav (talk) 17:54, 26 May 2010 (UTC)
Great! Thanks a lot. MichaK (talk) 18:43, 26 May 2010 (UTC)
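As a rough illustration of the direct approach agreed on above, here is a minimal Python sketch (not the actual template code) of how a call such as {{TaxonIds|ncbi=1234|eol=1234|wikispecies=Alpha beta}} could be split into parameters, and how a box could list only the databases for which an identifier was supplied. The function names and the display labels are hypothetical, and the parser is a deliberate simplification: it assumes no nested templates or piped links inside the call and trims whitespace from every parameter.

```python
def parse_template_call(wikitext):
    """Split a simple template call like {{Name|a=1|b=2}} into (name, params).

    Simplifying assumption: no nested templates or piped links in the call.
    Unnamed parameters are stored under "1", "2", ... in order of appearance.
    """
    inner = wikitext.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call: %r" % wikitext)
    parts = inner[2:-2].split("|")
    name = parts[0].strip()
    params = {}
    positional = 0
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
        else:
            positional += 1
            params[str(positional)] = part.strip()
    return name, params


# Hypothetical display labels; the parameter names come from the example call.
LABELS = {
    "ncbi": "NCBI Taxonomy",
    "eol": "EOL",
    "wikispecies": "Wikispecies",
}


def identifier_rows(params):
    """Return (label, identifier) rows, skipping databases with no value."""
    rows = []
    for key, label in LABELS.items():
        value = params.get(key, "")
        if value:
            rows.append((label, value))
    return rows


if __name__ == "__main__":
    _, params = parse_template_call(
        "{{TaxonIds|ncbi=1234|eol=1234|wikispecies=Alpha beta}}"
    )
    for label, value in identifier_rows(params):
        print(label, value)
```

Because empty or missing parameters produce no row, a taxon with only one known identifier would get a one-row box, which is the behaviour several editors asked for.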

Implementation is (very easy and) done: {{NCBI}} (created today).

Sample:

{{NCBI | 6526 | ''Biomphalaria glabrata''}}

Gives results

as any other normal external links templates:

These templates also use microformats, allowing automation, so I expect that, for example, User:Gaurav ("I'd support any scheme which would make it easier to link Wikipedia up with other databases") and hopefully User:Ganeshk will also support this. --Snek01 (talk) 11:45, 26 May 2010 (UTC)

This is a good suggestion, but I much prefer the External identifier box proposed by MichaK below. Another consideration is that {{NCBI | 6526 | ''Biomphalaria glabrata''}} introduces redundancy. If the accepted binomial name for the taxon changes, it also has to be changed here, as well as in the Taxobox. This problem would be multiplied for each template added. --Roderic D. M. Page (talk) 14:52, 26 May 2010 (UTC)
This also works without the name; the template then automatically uses the name of the article. --Snek01 (talk) 15:32, 26 May 2010 (UTC)
While this is a good start, this template only inserts a normal link, indistinguishable from all other links to external resources. The identifiers should be separated from other links, and a box in tabular form seems better suited to do this. Design-wise (not implementation-wise), it could look like this... though of course I'd rather have this up in the taxobox. MichaK (talk) 13:06, 26 May 2010 (UTC)
Yes, the purpose and intention is that they are "indistinguishable from all other links to external resources", because all links in the external links section are (or should be) valuable links and we cannot say that certain links are the best or better than others. It is not true that "identifiers should be separated from other links", because identifiers are not notable at all and nobody needs them. --Snek01 (talk) 15:32, 26 May 2010 (UTC)
Why can you not concede that identifiers may be useful to some people? Why do you think identifiers are shown for chemicals and proteins? MichaK (talk) 15:58, 26 May 2010 (UTC)
I am not familiar with identifiers in chemistry. The only thing I know is that the CAS registry number is really used, but I have no idea whether other identifiers in chemistry are in practical use or not. Maybe some others are added to Wikipedia redundantly, maybe not; I do not know. But here we are talking about one particular identifier in taxonomy. To simplify greatly: the creators of the NCBI taxonomy browser use that identifier because they are too lazy to provide a simpler URL like this http://www.ncbi.nlm.nih.gov/Taxonomy/Sarasinula_plebeia instead of this http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=157727 and nobody else uses their randomly generated number. --Snek01 (talk) 18:50, 26 May 2010 (UTC)
That is just factually wrong. NCBI taxon identifiers are used by third parties. Off the top of my head, I know UniProt, uBio and TreeBASE use them, but I am certain we'd find many more if we took a look.--Rvosa (talk) 08:56, 27 May 2010 (UTC)
Here's a couple more users of NCBI taxon IDs: Pfam, Rfam, TreeFam, ENSEMBL COMPARA, iTOL, EMBL, GOLD, MEROPS, InterPro, PDB, SILVA. Oh, and did someone mention Notability? Whilst it appears NCBI taxonomy doesn't have an individual citation, the NCBI database resources articles that bundle this resource are very well cited.--Paul (talk) 09:57, 27 May 2010 (UTC)
Please see my comment further below; I believe that the question is whether the ID numbers have significance beyond linking to databases, i.e. whether they are actually used in the scientific literature at large as identifiers. If they were, that would demonstrate their notability, but the evidence shows that, while NCBI resources are well-cited, the citations don't make use of the ID numbers. -- Yzx (talk) 18:17, 27 May 2010 (UTC)
Here are a few examples of the NCBI Taxonomy id being used in the scientific literature: a dissertation, see page 42 ("Tabelle 2.2"), a paper on recognizing biomedical names, a paper on reconstruction of phylogenetic relationships, a paper on proteome-wide analysis. Clearly, NCBI Taxonomy ids are used in the literature. Of course they are not as ubiquitous as binomial names, but they are in use whenever there's a need for a numerical identifier of a taxon. I'm not an expert on the other databases, but we started the discussion with proposing NCBI Taxonomy ids. MichaK (talk) 19:54, 27 May 2010 (UTC)
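To make the identifier-versus-link distinction in this thread concrete, the following Python sketch builds the NCBI Taxonomy browser URL from a bare numeric ID, using the URL pattern quoted in the comment above, and falls back to the article title when no display name is given, as described for {{NCBI}}. It is an illustration under stated assumptions, not the template's real implementation: the function names and the exact wikitext output format are hypothetical.

```python
from urllib.parse import urlencode

# URL pattern as quoted in the discussion above.
NCBI_BROWSER = "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi"


def ncbi_taxonomy_url(tax_id):
    """Build the taxonomy browser URL for a numeric NCBI taxon ID."""
    tax_id = str(tax_id).strip()
    if not tax_id.isdigit():
        raise ValueError("NCBI taxon IDs are numeric, got %r" % tax_id)
    return NCBI_BROWSER + "?" + urlencode({"id": tax_id})


def ncbi_link(tax_id, name=None, page_title=""):
    """Return a wikitext external link; the label defaults to the page title.

    Hypothetical output format: the real {{NCBI}} template may render
    its link differently.
    """
    label = name if name else page_title
    return "[%s %s]" % (ncbi_taxonomy_url(tax_id), label)


if __name__ == "__main__":
    print(ncbi_taxonomy_url(157727))
    print(ncbi_link(6526, page_title="Biomphalaria glabrata"))
```

Storing only the number and deriving the URL keeps the article source free of redundant names, which addresses the redundancy concern raised above, and lets the link format change in one place.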
I like the tabular External Identifier box a lot. I think this would be a great way forward, as it side-steps the conflict over what goes in the Taxobox, maintains the distinction between links and identifiers that some of us have argued for, and would be easy for a bot to populate. --Roderic D. M. Page (talk) 14:52, 26 May 2010 (UTC)
Agreed, especially considering Petter Bøckman's point above that - at a later date, when taxonomy ids are incorporated into the Taxobox proper - we could have a bot running around deleting the External Identifier boxes and incorporating the content into the taxon's own Taxobox. -- Gaurav (talk) 17:02, 26 May 2010 (UTC)
The text:
provide no useful information, because the displaying of the number is not useful at all. It is even better like this:
or
And that applies to all identifiers from those databases used in the taxonomy. --Snek01 (talk) 15:32, 26 May 2010 (UTC)
Sorry, the second example is not better than the first one. To begin with, NCBI is an institute that does many things, so it is mandatory that we refer to the "NCBI Taxonomy". Second, 10090 may mean nothing to you, but it does mean something to people working with these identifiers. Really, how is seeing a "4" worse than seeing a "10090"?!? The latter means something beyond the wikipage, while the former is just an arbitrary numbering within that page. MichaK (talk) 15:49, 26 May 2010 (UTC)
Could you provide evidence or an example for your statement? For practically all articles that exist on Wikipedia, the number is unimportant. You may perhaps find some exceptional articles on Wikipedia, but how many will there be compared with the ordinary articles using this link? I will give you a hint: no featured article displays such unimportant numbers. Do you think that featured articles are written by people so stupid that they have forgotten something like this? --Snek01 (talk) 16:32, 26 May 2010 (UTC)
Here's your example: Nicotinamide adenine dinucleotide. It is a featured article, and it contains no less than 7 identifiers right in the chembox. One of them is a CAS number which does have the most prominence. The other ones are from less accepted databases. I'm sure there are more. (Plus, what Andrew said below.) MichaK (talk) 18:39, 26 May 2010 (UTC)
This is not an article about a taxon. NOBODY HAS PROVIDED EVIDENCE OF PRACTICAL USE OF THE NCBI IDENTIFIER. --Snek01 (talk) 19:07, 26 May 2010 (UTC)
This discussion is about introducing identifier information for taxa. The fact that (unlike chemicals), taxa have no identifiers in featured articles (yet), is not a good argument. I think it is relevant that many other subject areas in Wikipedia do find the identifiers useful, just like the majority of people commenting here. --G.Hagedorn (talk) 21:38, 26 May 2010 (UTC)
Adding identifiers to something where they are not used is against the Wikipedia content policies Wikipedia:No original research and Wikipedia:Neutral point of view, because adding identifiers to an article adds a bias to the article. --Snek01 (talk) 03:05, 27 May 2010 (UTC)
Virtually all Chemistry and mineralogy featured articles have identifiers and links. Cheers, AndrewGNF (talk) 17:09, 26 May 2010 (UTC)
I believe the issue is whether NCBI ID numbers are identifiers, which are used in the scientific literature to clarify what is being talked about (like a scientific name), or whether it is a tool to facilitate linking between databases. For example, the CAS # for xenon is 7440-63-3, and Google Scholar and Book searches for xenon 7440-63-3 produce literature that use it as an identifier: [5] [6]. However, the NCBI code for the great white shark is 13397, and searches for Carcharodon carcharias 13397 produce [7] [8] (almost no results), while searching for Carcharodon carcharias GenBank produces [9] and [10]. Thus, scientists who cite GenBank data in their research don't use the IDs. The NCBI ID would be useful to the operators of different databases seeking to coordinate the entries in their databases, but that's not what we're doing. -- Yzx (talk) 17:41, 26 May 2010 (UTC)
Do I hear my name? :) I prefer MichaK's box: it's aesthetically more pleasing, provides a single source where all the taxonomic ids may be found, for anybody looking for them. It will also distinguish between links *to* the NCBI taxonomy database in any article (for which people can use Snek01's template) and boxes inserted on individual taxa linking to their taxon id, which are being "tagged" with the metadata we're adding. -- Gaurav (talk) 17:02, 26 May 2010 (UTC)
I think the box looks good. Would it be possible to make it the same width as the wikispecies/commons boxes? That would make the section look more tidy. Will it be possible to run a bot script that identifies all organism articles and adds a box with appropriate content? Petter Bøckman (talk) 17:33, 26 May 2010 (UTC)
I put the Wikispecies link in, but forgot about the Commons. I've created a first stab at the template (based on MichaK's code) at Template:TaxonIds, so I'll add it now. -- Gaurav (talk) 17:54, 26 May 2010 (UTC)
I think there was a misunderstanding; Petter was probably talking about the width of the Wikispecies box. I adapted the style of the new template to that of the Wikispecies box. I'm not sure if it's a good idea to link to Wikispecies twice, though. MichaK (talk) 19:09, 26 May 2010 (UTC)
Ah, oops, silly me :). I thought Petter meant to integrate the WikiCommons and WikiSpecies templates into {{TaxonIds}} to avoid repetition. It looks good to leave them separate on the example pages created so far, so I guess we'll be removing the WikiSpecies link from {{TaxonIds}} soon. -- Gaurav (talk) 17:13, 28 May 2010 (UTC)

For example, Encyclopedia of Life provides links to NCBI like this:

--Snek01 (talk) 19:02, 26 May 2010 (UTC)

    • Why not? I am not sure that the ID number needs to be exposed (only reason would be to make printout understandable). So I would endorse the text above on the link. However, I strongly favor a clean identifier-link place, which links to wikispecies and other relevant taxon IDs. This is relevant to readers. It communicates that these links are identifiable rather than related information (external links often contain the latter, see, e. g., Archaeopteryx, where the link "Journal of Dinosaur Paleontology" is NOT an eligible identifier for the genus Archaeopteryx...). --G.Hagedorn (talk) 21:38, 26 May 2010 (UTC)

I'm thinking that {{TaxonIds}} (and the example at User:MichaK/Sandbox/Mouse#External_links_and_sources) is looking very good right now. Kudos to Gaurav, Ganeshk, and MichaK for putting it together. When you all are happy with it, can you start another subsection here so that we can get people's updated opinions of this prototype? (Although we're no longer talking about taxobox, I think it makes sense to continue it on this discussion page...) I think we are moving toward consensus... Cheers, AndrewGNF (talk) 20:26, 26 May 2010 (UTC)

I don't think a small box on the side is the correct approach. Remember that if you exclude databases, most species out there probably wouldn't have any relevant external links, or at most 2-3. So I think the box should be wide and centered, like a navigation template, so that for most taxa you wouldn't get a lot of blank space. -- Yzx (talk) 23:11, 26 May 2010 (UTC)
Please check again. I have removed the older table format from the example. The blank space is gone. Ganeshk (talk) 23:17, 26 May 2010 (UTC)
I'm talking about if the mouse only had 2-3 other external links, or none at all. There would then be a huge white space to the left of the box. -- Yzx (talk) 23:41, 26 May 2010 (UTC)
Hmmm, good question... Just to see it in practice, I added another section to the example page with two external links. Personally, I don't think it looks too bad, and it will be hidden away at the bottom. Other thoughts? Or specific other pages we should have a look at? Cheers, AndrewGNF (talk) 23:52, 26 May 2010 (UTC)
How bad it looks would depend on how many links ultimately make it into the box, wouldn't it? -- Yzx (talk) 23:56, 26 May 2010 (UTC)
Yes, but because {{TaxonIds}} uses a small font, I think the imbalance would have to be pretty severe (huge number of taxonIDs, very few external links) for it to look really bad. And, IMHO, even if we have to tolerate a rare extreme bad case like that, the slight impact on presentation would be more than offset by the utility of the infobox. Anyway, is this a tradeoff that would make the difference between supporting and opposing for you, or something we should just look hard at optimizing? Cheers, AndrewGNF (talk) 00:02, 27 May 2010 (UTC)
I suppose it is premature to talk about cosmetic issues when there are still fundamental questions about the nature of this proposed box to be answered, i.e. what databases will it link to and the format of those links. -- Yzx (talk) 02:10, 27 May 2010 (UTC)
In terms of databases to link to, I'd suggest a core of 2-3 that will be near universal, but leaving open the option for more specialised or regional databases. The core, for me, would be:
  • NCBI taxonomy, with which we started this thread. Many, if not the majority, of extant taxa will be in this database (or will be in the future), as well as some extinct ones, including Tyrannosaurus rex (NCBI Taxonomy ID 436495).
  • EOL, which has the largest coverage of any taxonomic database, makes use of Wikipedia content, and includes Wikimedia Commons as a partner (see EOL partners)
  • Wikispecies
Then there are databases such as ITIS, which has something of a North American regional bias, and World_Register_of_Marine_Species which lists marine species. There will be others that are relevant to a subset of taxa, perhaps decisions as to which to be included should be the responsibility of those creating and editing pages for those taxa. --Roderic D. M. Page (talk) 09:20, 27 May 2010 (UTC)
Could you provide a reliable source that lists the most utilized and/or the most significant biological databases? For Wikipedia editors to decide what they are would be Original Research. -- Yzx (talk) 18:25, 27 May 2010 (UTC)
Would it be possible to make the box so that it can hold data from a number of databases but, where only the paleobase has an entry, displays just that one? The box should then scale to make room for just the data it has, and not have a large, empty field. Petter Bøckman (talk) 09:58, 28 May 2010 (UTC)
Actually, you can see that in action at the Sepsidae example mockup. At the moment, it slips off the end of the page, but doesn't look particularly bad (IMO). We could shrink that further by integrating the WikiSpecies and Wikimedia Commons links into {{TaxonIds}} itself, although it's aesthetically more pleasing as separate boxes I guess. In case of a page which has absolutely no external links, I guess it would be up to the article writer's discretion (as it should be). For instance, on the Sepsidae page, the {{TaxonIds}} could be moved up to the "Further reading" section instead. -- Gaurav (talk) 17:13, 28 May 2010 (UTC)

Trial

Snek01, Yzx, Petter: what do you think of the {{TaxonIds}} template? User:AndrewGNF has set up examples to see how it'd work in real life, but I'm eager to start putting the template up on actual pages. It'd give us an idea of how it works on different taxon types. Most of us pro-info-ers seem to like it (see straw poll below), but I'd like to make sure we've got some consensus across the board before being bold about it. -- Gaurav (talk) 04:18, 2 June 2010 (UTC)

I still have reservations because this proposal is not very specific. Is the idea to include any and all database links in the box (general ones like GBIF/Barcode of Life/IUCN, more specific ones like OBIS/WoRMS/ARKive, or even more specific ones like FishBase)? If not, what is the cap on the number of links it includes, and what are the criteria for which databases to include and which to exclude? Not all databases use ID numbers. How will the link formatting be made consistent between those that do and those that don't? And as I've mentioned before, aligning the box to one side creates unsightly white space for taxa with few or no other external links, a problem that will only get worse as the box gets bigger. -- Yzx (talk) 18:10, 2 June 2010 (UTC)
Agreed that it is time to really focus on a specific proposal. I suggest that the links and identifiers would be limited to the ones that currently appear in {{TaxonIds}}. Given that constraint, what do you think? Cheers, AndrewGNF (talk) 22:58, 2 June 2010 (UTC)
For me, a criterion would be whether the database explicitly states the identifier on the page. This is the case for NCBI taxonomy, ITIS, uBio NameBank and WoRMS, but not for EOL (where "Mus musculus Linnaeus, 1758" seems to be more appropriate?). I would argue that a database needs to encompass a certain number of species to be included there. Nonetheless, as only relevant databases are shown, it doesn't hurt if there are more databases. Regarding the design, this can easily be changed, as this is a template. So perhaps the next step is to populate the species pages and then it can be seen how it looks. MichaK (talk) 12:32, 3 June 2010 (UTC)
I like those criteria. I've added a section to the TaxonIds documentation to collect criteria on which databases to include. -- Gaurav (talk) 08:26, 4 June 2010 (UTC)
I've inserted the {{TaxonIds}} template into three live articles - House mouse, Sepsidae and Short-beaked_common_dolphin. Somebody else also added it to Amarinus lacustris. As you can see, the box looks fine in House mouse (in External Links) and Amarinus lacustris (where it's been added just below the TaxoBox, getting rid of the whole empty-space issue). Although there is some blank space at the bottom, I still think it looks better than a raw list of identifiers as proposed below, but maybe that's just me. Comments? -- Gaurav (talk) 08:13, 6 June 2010 (UTC)
Looks good to me. For consistency, I think it should always be in the same section, which currently means "External links". So I'm wondering what to do in cases like Amarinus lacustris where there is no such section yet. But I guess eventually there will be such a section in each article of relevance. MichaK (talk) 07:23, 7 June 2010 (UTC)
I created an empty external links sections and moved the boxes down. Was reverted. :) Ganeshk (talk) 12:18, 7 June 2010 (UTC)
As this discussion has become far too long already, I've raised the problem of placement at Template_talk:TaxonIds MichaK (talk) 16:15, 7 June 2010 (UTC)


Looks great to me! Cheers, AndrewGNF (talk) 15:40, 7 June 2010 (UTC)

Original research relevance

Yzx and Snek01 have raised the issue of whether selective inclusion/exclusion of identifiers constitutes original research. Personally, I don't see how that rule applies at all. The examples at Wikipedia:No original research/Noticeboard seem to be much more relevant to my understanding of the rule. Can anyone point to some precedent where the NOR rule applies to a case like this? Cheers, AndrewGNF (talk) 01:08, 28 May 2010 (UTC)

If one were to sling a policy, it would be NPOV, since one would be including some but not all referenced identifiers. But that would only be applicable to editors who tried to prevent other identifiers from being added. Every article starts out with limited information.--Curtis Clark (talk) 03:50, 28 May 2010 (UTC)

Latest prototype straw poll

Just so we can get a rough feeling for what people think of the latest {{TaxonIds}} template, maybe we can take a quick straw poll of the users who are still here? Example usages on copies of real pages are linked at {{TaxonIds/examples}}. Cheers, AndrewGNF (talk) 23:23, 28 May 2010 (UTC)

Support

Oppose

Prior 'ART'

I did something along these lines a while ago, but never had time to follow up; see my template {{taxon}}. I support all efforts to link Wikipedia with other databases as long as they are 'decent'. My template is fish/shark focused, so some of the parts shown are not applicable to all species. But I did try to build it in a way that can scale to plants and other types of organisms, although template coding is a mess, so it is not very pretty code :-) An example usage of my template is shown below; it is used in the General references section

{{Taxon|type=fish|displayName=Oceanic whitetip (''Carcharhinus longimanus'')|name=Oceanic whitetip shark|catlifeid=1280961|biolibid=138579|fishbolid=366|genus=Carcharhinus|species=longimanus|gbifid=13543768|iucnid=39374|eolid=17058563 }}

--Stefan talk 03:56, 29 May 2010 (UTC)

The template {{MPLinksUK}}, which performs a similar function for UK MPs, may also be of interest. It is intended to be used in External links sections. Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits 10:12, 4 June 2010 (UTC)