Wikipedia:Reference desk/Archives/Computing/2021 May 31


May 31

DOCTYPE Puzzle

Consider the following HTML DOCTYPEs:

 <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

Looking at Document type declaration, Document type definition and Formal Public Identifier, I can pretty much figure out what each part is for and why it is there, with the exception of the "//EN". Why is it there? Is it a language? Could there be a DTD in French with "//FR" at the end?

I would like to update the above pages to cover this detail. --Guy Macon (talk) 17:19, 31 May 2021 (UTC)

I did some research, and it looks like "//EN" is a two-letter code for the language the DTD is written in (see this source). That source doesn't say which standard the language codes follow; it just vaguely refers to "the two letter identifier for the language". This source says, under the "ISO 8879:1986" section, that "public_text_language = code from ISO 639", and since the rest of that page is about formal public identifiers, public_text_language appears to be the "//EN" at the end of a formal public identifier. Here's a link to ISO 8879:1986, which that source mentions, and ISO 639. Another source confirms that "//EN" is a language code, and yet another confirms that ISO 639 is used. So to answer your question: yes, a DTD in French could have "//FR" at the end, as long as FR is the correct language code in ISO 639. Hope this helps. HoneycrispApples (talk) 18:13, 31 May 2021 (UTC)
Thanks! Very helpful. --Guy Macon (talk) 21:11, 31 May 2021 (UTC)
FR is indeed the ISO 639 code for French; see here. We also have a full list of ISO 639-1 codes. --Lambiam 10:36, 1 June 2021 (UTC)
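For readers who want to pick these identifiers apart mechanically, here is a minimal Python sketch that splits a formal public identifier on its "//" separators. The names FPI and parse_fpi are made up for this illustration and come from no standard or library. The field layout it assumes (owner, text class plus description, language, optional display version) is the one discussed in this thread, and the sketch does not try to cover every production in ISO 8879.

```python
from typing import NamedTuple, Optional


class FPI(NamedTuple):
    """Fields of a formal public identifier (hypothetical helper type)."""
    registration: str               # "+" or "-", or "" when there is no such prefix
    owner: str                      # e.g. "IETF", "W3C", "IDN python.org"
    text_class: str                 # e.g. "DTD"
    description: str                # e.g. "HTML 2.0"
    language: str                   # e.g. "EN"
    display_version: Optional[str]  # e.g. "XML", or None when absent


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI on "//" and label the pieces (illustrative only)."""
    parts = fpi.split("//")
    # Identifiers starting with "+//" or "-//" carry a registration indicator;
    # ISO-owned identifiers such as "ISO 8879:1986//..." have no such prefix.
    if parts[0] in ("+", "-"):
        registration, parts = parts[0], parts[1:]
    else:
        registration = ""
    if len(parts) < 3:
        raise ValueError(f"not enough fields in FPI: {fpi!r}")
    owner, text, language = parts[0], parts[1], parts[2]
    display_version = parts[3] if len(parts) > 3 else None
    # The text field is a class keyword followed by a free-form description.
    text_class, _, description = text.partition(" ")
    return FPI(registration, owner, text_class, description, language, display_version)
```

With the first DOCTYPE above, parse_fpi("-//IETF//DTD HTML 2.0//EN").language gives "EN"; a DTD written in French would simply carry a different code in that field.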
I did a search and could not find any DOCTYPE that specifies any language other than EN. It would be an interesting addition to the articles if we could find one, no matter how obscure, that uses another language code.
I am also interested in any DOCTYPE that uses a registered name ( +// instead of -// ), or one that uses the syntax mentioned in Formal Public Identifier:
"Registered domain names may be used as owner identifiers. For example, the owner of example.net could issue FPIs using the owner identifier "+//IDN example.net"."
...which references [ https://www.ietf.org/rfc/rfc3151.txt ].
That RFC in turn uses this example:
  +//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML
I also found a page at [ https://github.com/ImageMagick/glib/blob/main/glib/tests/bookmarks/fail-09.xbel ] which uses this syntax:
 <?xml version="1.0" encoding="utf-8"?>
 <!DOCTYPE xbel
   PUBLIC "+//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML"
          "http://www.python.org/topics/xml/dtds/xbel-1.0.dtd">
That "+//IDN python.org" intrigues me. Did python.org get an ISO registration, or can anyone who owns a domain name use the "+//"? Is there a list of those ISO registrations anywhere? --Guy Macon (talk) 22:42, 1 June 2021 (UTC)
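For anyone hunting for more DOCTYPEs of this kind, a rough Python sketch like the following could pull the public identifier out of an XML file and report which prefix it uses. It relies on the standard library's xml.parsers.expat and its StartDoctypeDeclHandler callback. The function names and the one-element document body are assumptions made for illustration; only the DOCTYPE itself is taken from the XBEL example above. It only reads the prefix and says nothing about whether the owner is actually registered anywhere.

```python
import xml.parsers.expat

# DOCTYPE copied from the XBEL example; the <xbel/> element is a stand-in body.
XBEL_SAMPLE = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<!DOCTYPE xbel\n'
    b'  PUBLIC "+//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML"\n'
    b'         "http://www.python.org/topics/xml/dtds/xbel-1.0.dtd">\n'
    b'<xbel version="1.0"/>\n'
)


def doctype_public_id(document: bytes):
    """Return the public identifier from the DOCTYPE, or None if there is none."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat when it reaches the document type declaration.
        found["public_id"] = public_id

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    # The external DTD is not fetched; expat only reports its identifiers here.
    parser.Parse(document, True)
    return found.get("public_id")


def owner_prefix_kind(public_id: str) -> str:
    """Classify an identifier by its leading registration indicator."""
    if public_id.startswith("+//"):
        return "registered"
    if public_id.startswith("-//"):
        return "unregistered"
    return "no +// or -// prefix"


if __name__ == "__main__":
    pid = doctype_public_id(XBEL_SAMPLE)
    print(pid, "->", owner_prefix_kind(pid))
```

Run over a directory of files, the same two functions could be used to search for identifiers that start with "+//" or that end in a language code other than EN.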