Template talk:Lang/Archive 3


List of most-transcluded nonexistent ISO 639 templates

Following up on the section above, here's the list of the most-transcluded ISO 639 templates. To see the full list with the links to articles, visit Special:WantedTemplates and search for "ISO 639". All of these templates were redlinked when I posted them here; the blue-linked ones have since been created.

Some are typos. Some may need to be created. Others need to be fixed. An error category would make it easier for us to find and fix articles that are using these nonexistent templates. – Jonesey95 (talk) 04:00, 29 September 2016 (UTC)

Thank you for compiling this list, Jonesey95. I'll slowly be making my way through parts of it. Help from other editors will be more than welcome. Uanfala (talk) 11:36, 29 September 2016 (UTC)
Hey! I recently implemented a few hundred {{lang}}s, and I'm pleased to say that I don't think I'm responsible for much of the above. However, I probably did most or all of the ja-latns. My intention was to tag them as being in Japanese, but transliterated into Latin characters, not in the native script. I thought ja-latn was the way to show this. How should I have tagged these? Thanks. Phil wink (talk) 23:56, 30 September 2016 (UTC)
Script codes normally have initial capitals: so it's ja-Latn (and the corresponding template is {{ISO 639 name ja-Latn}}). Uanfala (talk) 00:02, 1 October 2016 (UTC)
Thanks. I've fixed the 14/16 ja-latn which were mine, plus one more I found (Haedong, a disambiguation page which also contained a few other -latns), but I was unable to find the 16th case. Cheers. Phil wink (talk) 17:35, 2 October 2016 (UTC)
There is a link above to Special:WantedTemplates that will take you to a page with links to all transclusions of each nonexistent template. I followed it and found that only this page and a user sandbox page use the ja-latn template now. Good work! – Jonesey95 (talk) 18:50, 2 October 2016 (UTC)
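The casing convention mentioned above (lowercase language code, script code with an initial capital) can be applied mechanically. The following is a minimal Python sketch, not anything the templates actually use; the function name `normalize_tag` is hypothetical, and the uppercase-region rule is the usual convention for tags such as nl-BE rather than something stated in this thread. It does not handle private-use or singleton subtags.

```python
def normalize_tag(tag: str) -> str:
    """Apply conventional casing to a hyphen-separated language tag."""
    parts = tag.split("-")
    out = [parts[0].lower()]              # language subtag: lowercase
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())      # script subtag: initial capital (Latn)
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())      # region subtag: uppercase (BE)
        else:
            out.append(part.lower())      # anything else: lowercase
    return "-".join(out)


print(normalize_tag("ja-latn"))
```

A cleanup script could run this over the tags found in Special:WantedTemplates to spot names that differ from an existing template only by case.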

Module solution

It seems odd to have so many templates just to return language names. It would be neat to have a module. Then all the codes and language names can easily be viewed and modified. I'm new to module coding, but it's such a simple thing that it shouldn't be too hard to make one. — Eru·tuon 00:20, 1 October 2016 (UTC)

Agreed, this will be much better. However, writing a module to look up the codes might be trivial, but rewriting all the templates that depend on that definitely isn't. Uanfala (talk) 00:29, 1 October 2016 (UTC)
I'm starting a module at Module:Language. It just has a small table of language data and a Wiktionary linking function so far. — Eru·tuon 05:04, 1 October 2016 (UTC)
You can perhaps save yourself some work. Many language codes are already known to MediaWiki so a call to the Scribunto language library like this:
mw.language.fetchLanguageName('ar', mw.getContentLanguage():getCode())
can get you a code's matching language name. It isn't perfect because it doesn't include all of the ISO 639 codes. For example, the above returns 'Arabic' but the valid ISO 639-2 code 'ara' returns an empty string.
Trappist the monk (talk) 10:30, 1 October 2016 (UTC)
And you might save even more work by getting it straight from the source: Ethnologue has all the codes together with their meanings in a simple data dump on their website: https://www.ethnologue.com/codes. Uanfala (talk) 10:53, 1 October 2016 (UTC)
That source appears to be only ISO 639-3 codes that are not part of 639-1 or 639-2. This data file, for example, does not list code 'ara' which is both 639-2 and 639-3.
Trappist the monk (talk) 11:21, 1 October 2016 (UTC)
@Trappist the monk: Thanks for the suggestion! I wasn't aware that was possible. I just implemented it, though I replaced mw.getContentLanguage():getCode() with 'en'. Doing that might save a little processing time. — Eru·tuon 19:26, 1 October 2016 (UTC)

Now the module's Wiktionary function is pretty reliable, so {{wikt-lang}} now uses it. I'll see if I can make it perform the functions of {{lang}} next; as of yet it just adds language attributes. — Eru·tuon 22:34, 2 October 2016 (UTC)
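The gap described above, where a built-in lookup knows 'ar' but returns an empty string for 'ara', suggests a two-stage lookup. Here is a minimal Python sketch of that idea only; the two tables are tiny hypothetical excerpts and the function name `language_name` is an assumption, not the module's actual code.

```python
# Hypothetical, deliberately tiny tables; real data would be far larger.
NAMES_TWO_LETTER = {"ar": "Arabic", "de": "German", "ja": "Japanese"}
NAMES_THREE_LETTER = {"ara": "Arabic", "deu": "German", "jpn": "Japanese"}


def language_name(code: str) -> str:
    """Return the language name for a code, or '' if the code is unknown."""
    code = code.lower()
    # Try the primary table first, then fall back to the supplementary one.
    return NAMES_TWO_LETTER.get(code) or NAMES_THREE_LETTER.get(code, "")


print(language_name("ara"))
```

Returning an empty string for unknown codes mirrors the behaviour described for the built-in lookup, so callers can treat both stages the same way.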

More Lang template errors

I just learned that there is a list of articles with Lang template errors on the Tools server, with another one at this link. – Jonesey95 (talk) 13:33, 15 October 2016 (UTC)

Documentation inconsistency: COinS unsafe or not?

The {{COinS safe|n}} notice at the top is contradictory to the documentation text "To suppress this – e.g. when using {{lang}} within a wikilink or the title parameter of a citation – add the parameter |nocat=true.". Either "or the title parameter of a citation" needs to be removed, or the {{COinS safe|n}} notice needs to be relaxed accordingly.   ~ Tom.Reding (talkdgaf)  15:43, 2 November 2016 (UTC)

Rendering Hebrew

Forgive me if this should be addressed somewhere else, but this was the most fitting place I could find. It puzzles me that Hebrew text renders differently (at least in my browser) depending on whether it's formatted with {{hebrew}} or {{lang|he}}. Should the latter not imply the former? Take for instance the first words of the shema':

  • {{Script/Hebrew|שְׁמַע יִשְׂרָאֵל}} produces: שְׁמַע יִשְׂרָאֵל
  • {{lang|he|שְׁמַע יִשְׂרָאֵל}} produces: שְׁמַע יִשְׂרָאֵל

In the first case, it's rendered nicely in a suitable Hebrew-friendly font (SBL Hebrew, I think), whereas the latter example renders in some ordinary standard-font not well suited to Hebrew. Often in a given Wikipedia article occurrences of Hebrew words will be templated differently (and thus rendered in different fonts), which gives a very haphazard and inconsistent look to the reader. This can hardly be intended? Am I missing something? ——Pinnerup (talk) 17:00, 16 February 2017 (UTC)

Short answer: {{hebrew|...}} specifies a list of fonts for your browser to pick from; {{lang|he|...}} doesn't, it merely declares the enclosed text as being in the Hebrew language, and leaves it up to your browser to pick a suitable font from those installed.
Long answer: the emitted HTML is, respectively,
<span class="script-hebrew" style="font-size: 115%; font-family: Alef, 'SBL BibLit', 'SBL Hebrew', 'David CLM', 'Frenk Ruehl CLM', 'Hadasim CLM', Cardo, Shofar, David, 'Ezra SIL', 'Ezra SIL SR', 'Noto Sans Hebrew', FreeSerif, 'Times New Roman', FreeSans, Arial;" dir="rtl">שְׁמַע יִשְׂרָאֵל</span>
<span xml:lang="he" lang="he">שְׁמַע יִשְׂרָאֵל</span>
. In the first of these, the class="script-hebrew" attribute doesn't do anything special; the dir="rtl" specifies right-to-left text; and in the style="..." attribute, the font-size: 115%; declaration is self-explanatory, whilst the font-family: Alef, 'SBL BibLit', 'SBL Hebrew', 'David CLM', 'Frenk Ruehl CLM', 'Hadasim CLM', Cardo, Shofar, David, 'Ezra SIL', 'Ezra SIL SR', 'Noto Sans Hebrew', FreeSerif, 'Times New Roman', FreeSans, Arial; declaration specifies a list of fonts which might be suitable for the enclosed text - they are listed in descending order of priority, so if your browser is like mine and doesn't have any of the first thirteen installed, but does have Times New Roman, it will use that font.
In the second of these, the lang="he" attribute specifies the Hebrew language; the xml:lang="he" does the same and is in fact redundant. Notice that no font is specified, nor even the script direction, which is implied by context. The purpose of the {{lang}} template is to declare a language for the enclosed text, nothing more.
Please observe the warnings at Template:Hebrew#Usage about not using {{hebrew}} directly. --Redrose64 🌹 (talk) 08:56, 17 February 2017 (UTC)
Thank you for your very informative and comprehensive answer! That explains a lot. After reading it I found that with some fiddling I was able to configure my browser to display text marked with lang="he" in a suitable Hebrew font. I take it best practice is to only use {{lang|he}}? I notice, however, that {{lang-he-n}} calls {{hebrew}}. Should I avoid using that? —Pinnerup (talk) 13:54, 19 February 2017 (UTC)
Template:Hebrew#Usage says "do not include it directly" - if you are using {{lang-he-n}}, then you are not including {{hebrew}} directly. --Redrose64 🌹 (talk) 08:38, 20 February 2017 (UTC)
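To make the distinction in this thread concrete, here is a small Python sketch that builds the language-only markup quoted above: a span carrying lang and xml:lang and nothing else (no font list, no direction). The function name `lang_span` is hypothetical and this is not the template's implementation, only an illustration of the shape of its output.

```python
from html import escape


def lang_span(code: str, text: str) -> str:
    """Wrap text in a span that declares its language and nothing more."""
    safe_code = escape(code, quote=True)
    safe_text = escape(text, quote=False)
    return f'<span xml:lang="{safe_code}" lang="{safe_code}">{safe_text}</span>'


print(lang_span("he", "שְׁמַע יִשְׂרָאֵל"))
```

Font choice is then left to the browser or to the reader's own style sheet keyed on the lang attribute, which is what the reply above describes.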

Protected edit request on 3 March 2017

Please change the way that articles are categorized by the "unknown language" check by changing [[Category:Articles containing unknown ISO 639 language template]] to [[Category:Articles containing unknown ISO 639 language template|{{{1}}}]].

I believe that this will sort pages in the tracking category by the value of the language code that is used. This will help me (and other gnomes) identify groups of articles that have a common error or a common language for which a template needs to be created. Thanks. – Jonesey95 (talk) 07:16, 3 March 2017 (UTC)

  Done — Martin (MSGJ · talk) 09:07, 3 March 2017 (UTC)
MSGJ, thank you. That worked, and it will help a lot. – Jonesey95 (talk) 18:06, 3 March 2017 (UTC)

strange linebreak and space created for some languages

Is it possible that something in template:lang is broken?

See example : (တိုင်း in Burmese)

template:my which uses lang looks ok to me... --katpatuka (talk) 17:24, 22 April 2017 (UTC)

The above text looks fine to me. What are you trying to demonstrate? – Jonesey95 (talk) 22:45, 22 April 2017 (UTC)
It has already been fixed: there had been a linebreak in template:my that I didn't see. --katpatuka (talk) 04:59, 23 April 2017 (UTC)

Flemish (Belgian Dutch)

{{Lang-nl-BE}} doesn't work, but should, and should render as "Flemish (Belgian Dutch)" probably. Some linguists classify Flemish as a language (or even multiple languages), not a dialect or dialect continuum of Dutch (much the way Scots is considered a separate language from English, not a variant of it), but I think we're stuck with nl-BE for now. ISO defines separate codes for two forms of Flemish, West Flemish and Limburgish, but not the other two, nor Flemish as a whole. Many sources for this or that which we may need to mark up with a template are not specific and just say "Flemish", so it's going to be original research for a Wikipedian to try to use one of the more specific ISO labels. However, just using {{lang-nl}} is inaccurate and a disservice to readers. Ergo, we need {{lang-nl-BE}}.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  22:10, 21 August 2017 (UTC)

Does "nl-BE" conform to the existing template instructions at Template:Lang#Indicating regional variant? In other words, is "BE" the correct two-letter abbreviation? A sourced answer would be helpful. – Jonesey95 (talk) 23:16, 21 August 2017 (UTC)
nl is ISO 639-1 language code for Dutch; BE is ISO 3166-1 country code for Belgium. The 639 table also lists Flemish as a code nl language. See also code nld @ sil.org.
Flemish contributes extensively to the size of Category:CS1 maint: Unrecognized language because it is-a-language-that's-not-a-language. For cs1|2, we could modify Module:Citation/CS1 to accept |language=Flemish but we also require that there be a code from which we can render a language: |language=de → (in German). Code nl will always render as Dutch so we could, in lieu of a decision from recognized authorities, make up our own. nl-BE would work if there is nothing better.
Trappist the monk (talk) 23:37, 21 August 2017 (UTC)
I have created two templates and a category to begin support for this language/dialect. Let me know if more are needed. – Jonesey95 (talk) 01:06, 22 August 2017 (UTC)

FYI MediaWiki does not recognize dialects and character sets (e.g. Simplified Chinese (a character set), British English (a dialect)). Flow 234 (Nina) talk 15:37, 20 September 2017 (UTC)

Fraternities and this template.

Is there any guidance as to whether this template should be used for Fraternities? For example, should it simply be ΦΒΚ or should it be ΦΒΚ (from {{lang|el|ΦΒΚ}})? Naraht (talk) 08:56, 23 September 2017 (UTC)

Completely incorrect advice

Presently there is some text that says:

Do not use quotation marks in your user style sheet; they may be misinterpreted as wikitext. While they are recommended in CSS, they are only required for font families containing generic-family keywords ('inherit', 'serif', 'sans-serif', 'monospace', 'fantasy', and 'cursive'). See the W3C for more details.

This is wrong on either every detail or almost every detail.

Quotation marks are not needed around generic family keywords. They're only needed around actual font names that contain spaces or other non-alphanum characters, which is a lot of them, or start with digits. These quotation marks are not optional in such a case, though many browsers gracefully decline to choke to death if you leave them out. They're usually double-quotes not single-quotes, unless one is using inline CSS inside HTML, e.g. in <span style="font-family: 'Lucida Sans Unicode', sans-serif;">...</span> or whatever.

If it actually is true that "quotation marks in your user style sheet ... may be misinterpreted as wikitext", this is a very severe MediaWiki bug which needs to be addressed immediately. If this were the case, I think we would have heard about it by now.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  03:48, 29 September 2017 (UTC)

It goes back twelve years, to this edit at 09:04, 27 January 2005 (UTC) by Mzajac (talk · contribs). In those days, template documentation was on the talk page - we didn't use /doc subpages. Subsequent relevant edits include:
Maybe back in 2005 the MediaWiki software behaved differently to now. Or perhaps it was for browsers with an incomplete or improper implementation of CSS 2.1, such as Internet Exploder 6. CSS 2.1 is still largely current: the relevant document in CSS 3 is CSS Fonts Module Level 3 (3 October 2013), which being a W3C Candidate Recommendation is not yet a full W3C Recommendation. The section concerned is 3.1 Font family: the font-family property. --Redrose64 🌹 (talk) 11:15, 29 September 2017 (UTC)
Redrose64 has indicated the latest advice from W3C. A summary of that advice is:
  1. There are two types of font names: family and generic.
  2. Generic names are serif, sans-serif, cursive, fantasy, and monospace – these fonts are supplied by the user agent (browser) itself.
  3. It is recommended that the list of fonts supplied to font-family has a generic name as the last entry to allow a guaranteed fallback should none of the named family fonts be available to the user agent. The generic font name at the end of the list must not be quoted.
  4. It is recommended that family font names that contain spaces, digits or punctuation (other than hyphens) are quoted.
  5. Font family names that contain the following words must be quoted: inherit, serif, sans-serif, monospace, fantasy, cursive, initial and default.
That is the same as the current W3C recommendation from 7 June 2011. Hope that helps. --RexxS (talk) 13:32, 29 September 2017 (UTC)
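The five points summarized above can be turned into a small helper that builds a font-family value. This is a Python sketch under the assumptions of that summary only: the names `format_family` and `font_family` are hypothetical, single quotes are chosen because the value is meant for an inline style="..." attribute, and names that themselves contain quote characters are not handled.

```python
import re

# Generic names and reserved words as listed in the summary above.
GENERIC = {"serif", "sans-serif", "cursive", "fantasy", "monospace"}
RESERVED = GENERIC | {"inherit", "initial", "default"}

# A name made only of letters and hyphens is safe to leave unquoted.
PLAIN_NAME = re.compile(r"[A-Za-z][A-Za-z-]*")


def format_family(name: str) -> str:
    """Quote a family name if it has spaces, digits, punctuation, or is a reserved word."""
    if name.lower() in RESERVED or not PLAIN_NAME.fullmatch(name):
        return f"'{name}'"
    return name


def font_family(families, generic_fallback: str) -> str:
    """Build a font-family value that ends with an unquoted generic name."""
    if generic_fallback not in GENERIC:
        raise ValueError(f"not a generic family: {generic_fallback}")
    parts = [format_family(name) for name in families]
    parts.append(generic_fallback)        # the generic fallback is never quoted
    return ", ".join(parts)


print(font_family(["Alef", "SBL Hebrew", "Times New Roman"], "serif"))
```

A family that happens to be named like a generic keyword comes out quoted, while the final fallback stays bare, which is the distinction that the disputed documentation text got backwards.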

How to display a Japanese ellipsis without resorting to ・・・?

Hello, I would like to display … as wikipedia does here. Wikipedia uses a template with "span lang", so I've tried to do it the same way, but the result is this. Is there any way to display … as in the aforementioned article? Seelentau (talk) 20:30, 1 July 2017 (UTC)

@Seelentau: What is wrong with using {{lang|ja|…}} which produces , or indeed &#8230; which produces …? --Redrose64 🌹 (talk) 20:41, 1 July 2017 (UTC)
@Redrose64: I would like to use it in a wikia-wiki, where the lang-template doesn't exist. And &#8230; is displayed at the bottom of a letter height, whereas on said wikipedia page, the ellipsis is displayed in a similar fashion as ・・・ (which is just three ・). I would like to avoid using ・・・, but simply using &#8230; comes up as … for me. Seelentau (talk) 20:55, 1 July 2017 (UTC)
Have you tried using <span lang="ja">…</span> which is the HTML expansion of that template, and which produces ? Please note: span is an ordinary HTML element, lang is one of its permitted attributes. --Redrose64 🌹 (talk) 21:33, 1 July 2017 (UTC)
Yes, I've tried using span lang, and it does work here, but not over at wikia. Seelentau (talk) 21:40, 1 July 2017 (UTC)
Then all I can think of is that the fonts are different. --Redrose64 🌹 (talk) 21:55, 1 July 2017 (UTC)
Yup, was just about to write that. :) By changing it to sans-serif, it works: <span lang="ja"><font face="sans-serif">…</font></span> produces Seelentau (talk) 21:56, 1 July 2017 (UTC)
You can contract that - <span lang="ja" style="font-family:sans-serif">…</span> produces - another reason for doing so is that the font element is obsolete. --Redrose64 🌹 (talk) 22:03, 1 July 2017 (UTC)
Ah, okay, I will do that, thank you! :) But one more problem is that I can't use it in article titles. For example, the song name "UnknownDespaira Lost" has to be titled "Unknown…Despair…a Lost". Is there no character that is actually and not a modified …? Seelentau (talk) 22:15, 1 July 2017 (UTC)
Set the article title without attempting to style it. Then, at the top of the article, use {{DISPLAYTITLE:Unknown<span lang="ja" style="font-family:sans-serif">…</span>Despair<span lang="ja" style="font-family:sans-serif">…</span>a Lost}} - see mw:Help:Magic words#Technical metadata. --Redrose64 🌹 (talk) 22:28, 1 July 2017 (UTC)
Oh yes, displaytitle exist, completely forgot^^ I will do that, but do you know why the ellipsis is actually displayed this way? The background of all of this is Japanese, by the way, and their ellipsis is always displayed as , but when I copy it (for example, from here), it's simply …. Then again, for some, … is displayed as from the start... does Firefox not support that? Seelentau (talk) 22:36, 1 July 2017 (UTC)
@Seelentau: It looks like this has to do with the font that the browser chooses. In my browser (Chrome), the ellipsis character is displayed in the font Meiryo when marked as Japanese (), but in the font Arial otherwise (…). To avoid this inconsistency, you might be able to use the "midline horizontal ellipsis" character (U+22EF, ⋯) instead of the horizontal ellipsis (U+2026, …), which displays the correct way even in Arial. But note that that is probably technically incorrect because U+22EF is in the mathematical operators block and categorized as a symbol rather than a punctuation character (FileFormat.info page). — Eru·tuon 01:17, 11 October 2017 (UTC)
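The difference between the two characters mentioned above can be checked with Python's standard unicodedata module. This is just an illustrative sketch (the helper name `describe` is hypothetical); it reports the code point, the Unicode name and the general category, where categories starting with "P" are punctuation and "Sm" is a math symbol.

```python
import unicodedata


def describe(ch: str):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in ("\u2026", "\u22ef"):
    print(describe(ch))
```

The first character is reported as punctuation and the second as a math symbol, which is the reason given above for treating U+22EF as a technically questionable substitute.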

Recent change

WOSlinker has recently changed some (or all?) lang templates to use html for italics. Has that change been discussed anywhere? Does it improve anything? Because it has caused some problems: forms such as {{lang-it|'Livorno'}} now display as 'Livorno' instead of Livorno. Could someone please fix this (or, if the change is an important improvement, give some hint as to how the various affected pages could be tracked down and fixed)?

What we really need is a font style parameter for this template (yes, I know that bold is technically a font weight); while italics are commonly used for words in other languages, they are not used for proper names – and the templates are often used for proper names, which sometimes need to be bold-faced. Is there any reason why this couldn't or shouldn't be implemented? Justlettersandnumbers (talk) 10:21, 16 October 2017 (UTC)

I think there might be a broader problem with WOSlinker's changes, see User talk:WOSlinker#Why HTML for italics?. – Uanfala 11:00, 16 October 2017 (UTC)
Hmm, it looks as if this should have been discussed before it was implemented. May I suggest that, pending the outcome of such a discussion, someone with smart rollback and template editor permissions roll back these edits (which I think are these (about 369), plus one a little earlier and three the previous evening, two of them not to {{lang}}-foo templates)? That'd fix the errors for now, without prejudice to doing this properly if there's consensus that it is what's wanted. Justlettersandnumbers (talk) 13:27, 16 October 2017 (UTC)
I would do it, but I have been brought to ANI for unbreaking templates in the past, and there I was accosted by administrators who would not read, could not read, or both. I learned my lesson from that experience, which was "It's better to be happy than right". I support reverting these changes, however. I would like to hear back from WOSlinker, though, whose editing knowledge and skills I respect greatly. – Jonesey95 (talk) 14:07, 16 October 2017 (UTC)
I've changed all my edits on the lang templates back to the wiki style italics. There are two lang templates still using the html style but I've never edited those. -- WOSlinker (talk) 13:01, 17 October 2017 (UTC)
@WOSlinker: Would you please change Module:Zh back as well? Scriptions (talk) 01:01, 19 October 2017 (UTC)
Done. -- WOSlinker (talk) 08:12, 19 October 2017 (UTC)

zh-yue

zh-yue does not seem to be working any more, in e.g. [係] Error: {{Lang}}: unrecognized language tag: zh-yue (help) , copied from Written Cantonese.--JohnBlackburnewordsdeeds 13:37, 19 November 2017 (UTC)

Isn't the language code supposed to be simply yue rather than zh-yue? – Uanfala 13:42, 19 November 2017 (UTC)
From the registry 'yue' is the subtag for "Yue Chinese" or "Cantonese", the tag for "Chinese" is 'zh'.--JohnBlackburnewordsdeeds 13:54, 19 November 2017 (UTC)
The subtag registry identifies yue as a language code. See Yue language where yue is identified as the ISO 639-3 language code.
{{lang|yue|係}}
It may once have been true that the correct code was zh-yue (language subtag with an extlang subtag). According to the subtag registry, that is no longer true and the preferred subtag is yue. {{lang}} and Module:lang do not currently (may never) support extlang subtags.
Trappist the monk (talk) 14:13, 19 November 2017 (UTC)
Thanks, I see that it’s deprecated now. I can’t remember ever using it, but recalled it from a previous browse of the registry and so thought it OK. I guess with the template previously passing through anything, but now actually checking what’s passed to it, there will be quite a few like this to fix.--JohnBlackburnewordsdeeds 16:12, 19 November 2017 (UTC)
I've been keeping an eye on Category:Lang and lang-xx template errors looking for indications that the new module is doing the wrong things. I haven't seen any zh-yue errors. But, I have seen plenty of zh-han and zh-t errors. grc-gre is fairly common as is jp.
Trappist the monk (talk) 16:35, 19 November 2017 (UTC)
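For gnomes cleaning up old tags, the zh-yue case above can be sketched as a rewrite from "language-extlang" to the preferred bare code. This Python sketch is an illustration only: the table is a one-entry hypothetical excerpt (yue with prefix zh, as described in this thread), the function name `preferred_tag` is an assumption, and Module:lang itself does no such rewriting.

```python
# Hypothetical excerpt: extlang subtag -> the prefix it was registered with.
EXTLANG_PREFIX = {"yue": "zh"}


def preferred_tag(tag: str) -> str:
    """Rewrite 'prefix-extlang' to the preferred bare code; leave other tags alone."""
    parts = tag.lower().split("-")
    if len(parts) >= 2 and EXTLANG_PREFIX.get(parts[1]) == parts[0]:
        return "-".join(parts[1:])        # drop the prefix, keep any later subtags
    return tag


print(preferred_tag("zh-yue"))
```

Tags whose second part is not a known extlang, such as the grc-gre and zh-han errors mentioned above, are returned unchanged and would still need a human decision.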

Error

{{langnf}} doesn't work in Israel infobox. I can't figure out how to fix it. --Triggerhippie4 (talk) 04:32, 19 November 2017 (UTC)

@Triggerhippie4: The first parameter was empty, it needs to be a language code; specifically, the code for the language that the third parameter is written in. This edit should fix it. --Redrose64 🌹 (talk) 10:25, 19 November 2017 (UTC)

(edit conflict)

{{langnf}} is calling {{lang}} without providing a valid ISO 639 language code. {{lang}} requires (has always required) a language code so that it knows how to correctly supply html markup for the text. In this template, the first positional parameter, {{{1}}}, the language code, is empty:
{{langnf||Hebrew|"The Hope"}}
Many might leap to the conclusion then that they should add the language code that matches the language name. They would be wrong. In this case, the correct code is en because "The Hope" is English.
It appears that the documentation for {{langnf}} is inadequate. I can also imagine, though have not given it sufficient thought to recommend, that in {{langnf}} this line:
}} for {{Lang|{{{1|}}}|{{{3}}}|rtl={{{rtl|}}}}}<noinclude>
might be changed to:
}} for {{Lang|{{{1|en}}}|{{{3}}}|rtl={{{rtl|}}}}}<noinclude>
if {{{3}}} is usually an English translation. If {{{3}}} is always an English translation then there is no need for {{lang}} in {{langnf}}
Perhaps Editor Hyacinth, the original author of both {{langnf}} and its documentation, can be persuaded to revisit that template.
Trappist the monk (talk) 10:31, 19 November 2017 (UTC)
It seems that it used to work in the past without a language code in the first parameter, though. If you go to Template:Language with name/for today, you can see (or at least I can see) that the examples in the documentation do not produce errors. If you null-edit the template, they will produce errors. I have not yet looked at the old {{lang}} code to puzzle out this apparent effect. – Jonesey95 (talk) 23:51, 19 November 2017 (UTC)
I put the old lang template code in Template:lang/sandbox2 and used that template in the sandbox for {{langnf}}. It appears that leaving the language blank did not produce an error in the past:
{{lang/sandbox2||Foo}} → {{lang/sandbox2||Foo}}
{{Language with name/for/sandbox||2=German|3=[[Thuringia]]}} → Error: {{language with name/for}}: missing language tag or language name (help)
{{Language with name/for/sandbox|en|2=German|3=[[Thuringia]]}} → German (English for 'Thuringia')
{{Language with name/for||2=German|3=[[Thuringia]]}} → Error: {{language with name/for}}: missing language tag or language name (help)
{{Language with name/for|en|2=German|3=[[Thuringia]]}} → German (English for 'Thuringia')
It appears to me that even though the language name in {{lang}} was listed as a Required parameter, there may not have been code that enforced that requirement. Still researching. – Jonesey95 (talk) 00:06, 20 November 2017 (UTC)

(edit conflict)

seems that it used to work. The output of this particular example when rendered by the previous version of {{lang}} looks like this:
[[Hebrew language|Hebrew]] for <span lang="" >"The Hope"</span>
The purpose of {{lang}} is to indicate that the text belongs to a language. If the language is going to be English there isn't much sense in calling {{lang}}, no need to wrap the text in <span>...</span> tags. Because {{lang}} expects to have a language code so that it can do its job, the module whines and complains when that important piece is missing. {{lang}} cannot know that the template that's calling it doesn't really need its services. So it complains.
Trappist the monk (talk) 00:24, 20 November 2017 (UTC)
I've tweaked {{langnf}} so that it only calls {{lang}} when a language code is provided in {{{1}}}.
Trappist the monk (talk) 11:28, 20 November 2017 (UTC)
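The tweak described above, wrapping the text only when a language code is supplied, can be sketched in Python as follows. This is not the template's wikitext; the function name `langnf` simply echoes the template name, and the output format is a simplified, hypothetical stand-in for what the template renders.

```python
from html import escape


def langnf(code: str, language_name: str, text: str) -> str:
    """Render 'Name for text', adding a lang span only when a code is given."""
    safe_text = escape(text, quote=False)
    if code:
        # A code was supplied: declare the language of the text.
        body = f'<span lang="{escape(code, quote=True)}">{safe_text}</span>'
    else:
        # No code: emit the text bare instead of a span with lang="".
        body = safe_text
    return f"{language_name} for {body}"


print(langnf("", "Hebrew", '"The Hope"'))
print(langnf("en", "Hebrew", '"The Hope"'))
```

The empty-code branch avoids both the old lang="" output and the new error message, at the cost of leaving the text unmarked.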

Odd error at Dalian Mosque

The following works on its own مسجد داليان but is generating an error when used in a {{Chinese}} template at Dalian Mosque.--JohnBlackburnewordsdeeds 00:40, 20 November 2017 (UTC)

Garbage in, garbage out.
{{Chinese}} is a redirect to {{Infobox Chinese}}. That template calls {{Infobox Chinese/Blank}} with the values provided to the infobox template:
|lang2=Arabic
|lang2_content={{lang|ar|مسجد داليان}}<br/>(''Masjid Dālyān'')</span>
{{Infobox Chinese/Blank}} calls {{lang}} with the content of |lang2= and |lang2_content=:
{{lang|Arabic|{{lang|ar|مسجد داليان}}<br/>(''Masjid Dālyān'')}}
The 'inside' {{lang}} produces this:
<span title="Arabic-language text"><span lang="ar" dir="rtl">مسجد داليان</span></span>
which is partially correct – partially because Arabic script is right-to-left and should be marked as such but the infobox has no support for that.
So now, the outside {{lang}} looks like this:
{{lang|Arabic|<span lang="ar">مسجد داليان</span><br/>(''Masjid Dālyān'')}}
But, that won't work because the value assigned to {{{1}}} is not an ISO 639 language code. Module:lang rejects 'Arabic' because it expects a code, not a language name so instead of rendering bogus html it emits an error message.
The quick fix? There are at least two and probably more.
  1. |lang2=ar
  2. |lang2_content=مسجد داليان<br/>(''Masjid Dālyān'') – were it me, I would remove the <br /> and the transliteration because that is left-to-right and Latn script.
For the time being, because there are a lot of {{lang-??}} templates that call {{lang}} and a lot of them impose italics on the 'text', the italic detection and associated error messages are disabled; once they are re-enabled, the error message will be back.
A long-term fix to properly support the transliteration of the Arabic is needed and will require modifications to {{Infobox Chinese}} and {{Infobox Chinese/Blank}}.
Trappist the monk (talk) 01:23, 20 November 2017 (UTC)
{{Infobox Chinese/Blank}}. Ugh. I keep intending to have a go at rewriting Infobox Chinese to use Lua, but every time I’ve looked at it I’ve been put off by things like that. It’s not even clear that belongs in the template, which seems to have grown to do too much over the years. People don’t notice or object as most fields in the template default to hidden, but if they are hidden so no-one sees them they probably aren’t all needed.--JohnBlackburnewordsdeeds 13:30, 20 November 2017 (UTC)

Broken Doxology

The code grc-gre is not a valid IETF language tag (see my comments at "Module:Language/data/wp languages" ?) so Module:lang emits an error message because it cannot make sense of the 'code'. There is a related template, {{lang-grc-gre}}, which has documentation that, to this reader, is far from clarifying. That template does not emit an error because it drops the -gre thing and calls {{lang}} with only the IANA/ISO 639-3 language code.

It does not appear that grc-gre is a Linguist List code so I would guess that someone here at en.wiki created it.

Trappist the monk (talk) 14:17, 20 November 2017 (UTC)

It might make sense to treat any foo-bar as just foo any time the foo-bar combo doesn't resolve. I'm skeptical we can prevent people adding -bar instances that don't resolve to something in our table, since they're introduced (albeit slowly) all the time, e.g. in linguistics papers. PS: grc-gre was previously discussed in an older thread, above: #zh-yue.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  22:27, 20 November 2017 (UTC)
Clearly we differ on what constitutes a 'discussion'. At #zh-yue, I merely mentioned grc-gre as a common cause of error messages displayed by Module:lang.
Trappist the monk (talk) 11:00, 21 November 2017 (UTC)
Sure; I was just making sure the threads were connected, in case the earlier post mattered in the context.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  18:11, 21 November 2017 (UTC)
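The suggestion above, treating any foo-bar as just foo when the combination does not resolve, can be sketched like this in Python. It illustrates the proposal only and is not what Module:lang does; the table is a hypothetical one-entry excerpt and the function name `resolve` is an assumption.

```python
# Hypothetical excerpt of a code -> name table.
KNOWN = {"grc": "Ancient Greek"}


def resolve(tag: str):
    """Return (code, name) for the tag, falling back to its first subtag; None if unknown."""
    tag = tag.lower()
    if tag in KNOWN:
        return tag, KNOWN[tag]
    base = tag.split("-")[0]
    if base in KNOWN:
        return base, KNOWN[base]          # 'foo-bar' did not resolve, but 'foo' does
    return None


print(resolve("grc-gre"))
```

The trade-off is that a silent fallback hides typos that the current error message would surface, so a tracking category would probably still be wanted.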

"Module:Language/data/wp languages" ?

The documentation says that Module:Language/data/wp languages supports non-standard language codes (e.g. Linguist List codes), but I added one, and it did not help. Should the documentation be modified to remove that link? Thanks. – Jonesey95 (talk) 13:14, 20 November 2017 (UTC)

Related: Is there a recommended way to fix templates in Category:Lang-x templates with other than ISO 639? – Jonesey95 (talk) 13:17, 20 November 2017 (UTC)
For templates that truly don't use IETF tags, I think there is nothing to 'fix'.
There is one, {{lang-ca-valencia}} that has prompted me to tweak Module:lang/sandbox so that when the IETF language tag includes a variant, the module fetches the language name from the variants table:
{{#invoke:lang|lang_xx_italic|code=ca-valencia|text=Lucrezia Borgia|italic=no}} → Valencian: Lucrezia Borgia
[[Valencian language|Valencian]]: <span lang="ca-valencia" style="font-style: normal;">Lucrezia Borgia</span>
{{#invoke:lang/sandbox|lang_xx_italic|code=ca-valencia|text=Lucrezia Borgia|italic=no}} → Valencian: Lucrezia Borgia
[[Valencian language|Valencian]]: <span lang="ca-valencia" style="font-style: normal;">Lucrezia Borgia</span>
Alas, that is the only one of the templates listed at Category:Lang-x templates with other than ISO 639 with a proper variant subtag.
Trappist the monk (talk) 16:08, 20 November 2017 (UTC)
I'm having second thoughts about this sandbox tweak. Consider:
{{#invoke:lang/sandbox|lang_xx_italic|code=pt-ao1990|text=some pt text}} → Portuguese: some pt text
[[Portuguese language|Portuguese]]: <i lang="pt-ao1990">some pt text</i>
Not what we really want. Perhaps an alternate language parameter that when concatenated with ' language' can be used to replace the default language name (from the data tables) so that we link to the variant language name article.
Trappist the monk (talk) 17:03, 20 November 2017 (UTC)
From my previous work with the ISO 639 name templates, my experience is that every language and dialect has an article or redirect at "XXXX language". The templates and categories depended on that construction. I messed with hundreds of those templates, and I do not recall encountering any missing articles or redirects. See, for example, Middle Scots language, which would be the destination for "sco-smi" below if we could put in some sort of override.
I'll let you continue to think about it. You always come up with something that works. – Jonesey95 (talk) 17:41, 20 November 2017 (UTC)

I think you meant Module:lang/data and in particular the override table.

You added sco-smi which looks like a valid IETF language tag but is not. Were it valid, smi would be listed as an extlang in the IANA language-subtag-registry file. At present, there are no plans to support extlangs because there are preferred language codes for all of the existing extlangs.

Because Module:lang expects a valid IETF language tag, it emits an error when it disassembles sco-smi into its separate parts, the code sco and this other thing smi which doesn't match the required patterns for script (4 letters), region (2 letters or 3 digits), or variant subtags (4 digits or 5–8 alphanumeric characters).

It may be that we will want to create a table that specifically holds Linguist List codes so that we can handle them. The question that I have about any of these codes that are not in language-subtag-registry file is: What to put in the lang="" attribute of the enclosing <span>...</span>? Browsers and screen readers probably don't know about (aren't required to know about) 'private' language codes that aren't in the registry.

Trappist the monk (talk) 13:56, 20 November 2017 (UTC)
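
For readers following along, the subtag patterns described above (script: 4 letters; region: 2 letters or 3 digits; variant: 4 digits or 5–8 alphanumeric characters) can be sketched roughly as follows. This is a Python illustration only, not the Lua in Module:lang; the name classify_subtag is hypothetical, and the patterns are taken from the description in this thread rather than from the module source.

```python
import re

# Patterns as described in the discussion above (assumed, not copied from Module:lang).
SCRIPT = re.compile(r"[A-Za-z]{4}")              # 4 letters, e.g. Latn
REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")     # 2 letters or 3 digits
VARIANT = re.compile(r"[0-9]{4}|[A-Za-z0-9]{5,8}")  # 4 digits or 5-8 alphanumerics


def classify_subtag(subtag):
    """Return 'script', 'region', 'variant', or None if nothing matches."""
    if SCRIPT.fullmatch(subtag):
        return "script"
    if REGION.fullmatch(subtag):
        return "region"
    if VARIANT.fullmatch(subtag):
        return "variant"
    return None  # e.g. 'smi' in sco-smi: three letters fit none of the patterns
```

Under these patterns, valencia and ao1990 classify as variants, while smi matches nothing, which is consistent with the error reported for sco-smi.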

Sorry, I should have linked to the documentation in question. I meant Module:Lang/doc, which refers to files that apparently do not work. Should those references be removed in order to avoid confusion? – Jonesey95 (talk) 16:38, 20 November 2017 (UTC)
What do you mean by files that apparently do not work?
The only place where code 1ca is defined for Module:lang is in Module:lang/data in the override table. That code works as it should (there is no extlang tacked onto it):
{{#invoke:lang|lang_xx_inherit|code=1ca|text=كَیکاوس|rtl=yes|italic=no}} → [كَیکاوس] Error: {{Lang-xx}}: unrecognized language tag: 1ca (help)
Adding sco-smi to the override table doesn't work because the module has extracted the language code (sco) from it and cannot find that in the override table and there is, at present, no mechanism to make the module search for the (invalid) extlang either alone (smi) or in combination with the language code (sco-smi).
Trappist the monk (talk) 17:03, 20 November 2017 (UTC)
Re "files that do not work": Module:Language/data/wp languages is linked from Module:Lang/doc. It doesn't do anything, right? – Jonesey95 (talk) 03:07, 21 November 2017 (UTC)
Actually, it does. Module:Language/name/data creates a single table from the /wp languages, /ISO 639-3, and /iana languages modules. The first module read is /wp languages. First language 'code' into the composite table wins so when code exists in all three of the data modules, only the code and data in /wp languages is used. For example, code gem is present in both /wp languages and in /iana languages. Module:Language/name/data reads /wp languages first so its value for code gem (Proto-Germanic) is the value used by Module:lang; not the 'official' Germanic languages:
{{#invoke:lang|lang_xx_italic|code=gem|text=Example text}} → Germanic languages: Example text
The 'codes' that do not work in /wp languages are the hyphenated codes.
Trappist the monk (talk) 03:49, 21 November 2017 (UTC)
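
The "first code into the composite table wins" behaviour described above might be sketched like this. It is a Python illustration, not the Lua in Module:Language/name/data; merge_first_wins is a hypothetical name, and the two small tables hold only the gem entries quoted in this thread.

```python
def merge_first_wins(*tables):
    """Combine tables in the order given; a code already present is never overwritten."""
    merged = {}
    for table in tables:
        for code, name in table.items():
            merged.setdefault(code, name)  # keeps the first value seen for each code
    return merged


# Only the entries mentioned in the discussion; the real data modules are far larger.
wp_languages = {"gem": "Proto-Germanic"}
iana_languages = {"gem": "Germanic languages"}

# /wp languages is read first, so its name for gem is the one that survives.
composite = merge_first_wins(wp_languages, iana_languages)
```

Reversing the argument order would make the IANA name win instead, which is the whole effect of the read order.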

I am writing this mostly as a note-to-self before I jump on the plane to Elsewhere.

It occurs to me that we can make use of the IETF language tag's support for private-use subtags. So, subtags that we have invented, like code grc-gre, might be renamed grc-x-gre. When the module sees the -x-subtag, it knows that subtag is non-standard and will look for that code in a special wp_private_subtags table for the language name. Someone else apparently had a similar idea because be-x-old exists in Module:Language/data/wp languages.

Another thing we might do, if and when we add support for label control (see #Wish list for future enhancement), is to overload that parameter so that |label=none hides all labels (language name, transliteration and translation static-text) and |label=name displays a name different from the name usually associated with the code. I'm not sure if there are any real benefits to this particular idea.

Trappist the monk (talk) 11:27, 21 November 2017 (UTC)
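
A rough sketch of the private-use idea floated above, assuming the module would split the tag at -x- and look the whole tag up in a separate table. This is Python for illustration only; wp_private_subtags is the table name suggested in the note, while split_private_use, lookup_private_name, and the placeholder name string are hypothetical.

```python
# Hypothetical table contents; the display name is a placeholder, not real data.
WP_PRIVATE_SUBTAGS = {"grc-x-gre": "<name for grc-x-gre>"}


def split_private_use(tag):
    """Split a tag at '-x-'; return (base, private) or (tag, None) if absent."""
    lowered = tag.lower()  # assumes table keys are lowercase
    base, sep, private = lowered.partition("-x-")
    if not sep:
        return lowered, None
    return base, private


def lookup_private_name(tag, private_table):
    """Return the name for a private-use tag, or None so the caller can fall back."""
    base, private = split_private_use(tag)
    if private is None:
        return None
    return private_table.get(f"{base}-x-{private}")
```

A tag without -x- returns None here, so the ordinary language tables would still be consulted first in that case.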

Possibly the usefulness for the |label= idea: the label provided by the current {{lang-de-CH}} is 'Swiss German' but the language name retrieved from the module's language tables is 'German' so in that template we might write: {{#invoke:lang|lang_xx_italic|code=de|label=Swiss German}}.
Trappist the monk (talk) 11:50, 21 November 2017 (UTC)
This would also be useful for suppressing the appearance of the same word in successive instances of the template (Foo, Bar, and Baz Quuxian, instead of Foo Quuxian, Bar Quuxian, and Baz Quuxian). Also useful for cases where one ethnic or national group uses one name for the language and a neighboring one does different.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  18:15, 21 November 2017 (UTC)

Category:Articles containing Pushto-language text

Please be advised, Category:Articles containing Pushto-language text has been nominated to be merged into Category:Articles containing Pashto-language text. Cheers, -- Black Falcon (talk) 19:25, 23 November 2017 (UTC)

Kikuyu language category?

I think something changed with one of the templates but I can't figure out what, specifically. But now using the template Template:Lang-ki in an article makes the Category:Articles containing Gikuyu-language text appear at the bottom of the page. This is unlike other "hidden" categories like Category:Articles containing French-language text. Hopefully this is the right place to let people know. :) Thanks! Umimmak (talk) 00:52, 25 November 2017 (UTC)

It looks like this is also happening with {{lang-ps}} and Category:Articles containing Pushto-language text and Category:Articles containing Pashto-language text (the former is used by the template after the change to use the module). – Jonesey95 (talk) 04:13, 25 November 2017 (UTC)
Whether a category is hidden or not has nothing to do with the way that a category is added to an article; it's entirely down to the way the category page itself is set up (see WP:HIDDENCAT). Category:Articles containing French-language text has the template {{Category articles containing non-English-language text|French|fr|fre|fra}} which contains code to set __HIDDENCAT__, whereas Category:Articles containing Gikuyu-language text does not have a similar template. I note that the latter is set up as a redirected category (that's the {{Category redirect|Category:Articles containing Kikuyu-language text}} on that page), so the module needs configuring to use Category:Articles containing Kikuyu-language text instead of Category:Articles containing Gikuyu-language text. --Redrose64 🌹 (talk) 09:38, 25 November 2017 (UTC)
Well in the meantime I made the redirect category hidden so it won't show up on articles. Umimmak (talk) 10:21, 25 November 2017 (UTC)
The primary name for the ISO codes ki and kik seems to be Gikuyu and that's why articles got categorised into Category:Articles containing Gikuyu-language text. I've added entries for these two codes in the "override" table in Module:Lang/data; this should make the templates use the preferred name now. – Uanfala 11:00, 25 November 2017 (UTC)
Thanks; there were only three pages tagged as having {K/G}ikuyu text so I purged them all and that should be fixed. I'm not sure what the right way to fix the P{u/a}shto categories Jonesey95 mentions -- should the __HIDDENCAT__ be placed on them in the meantime I guess? Or the actual Template:Category articles containing non-English-language text? Umimmak (talk) 11:38, 25 November 2017 (UTC)
Apparently, the Pashto categories have been nominated for merging by SMcCandlish, there's some discussion at Wikipedia:Categories for discussion/Speedy. I'm not sure what the due process is, but after it is over I guess the same remedy will work as with Kikuyu. – Uanfala 12:29, 25 November 2017 (UTC)
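
To make the Kikuyu fix above concrete, here is a minimal sketch of how an override entry changes the category that gets emitted. It is Python rather than the module's Lua, the function names are hypothetical, and the tables contain only the ki names mentioned in this thread.

```python
def language_name(code, override, main):
    """Prefer the override table; fall back to the main data table."""
    code = code.lower()
    if code in override:
        return override[code]
    return main.get(code)


def category_for(name):
    """Build the tracking category title from a language name."""
    return f"Category:Articles containing {name}-language text"


main_table = {"ki": "Gikuyu"}      # primary name, per the discussion
override_table = {"ki": "Kikuyu"}  # entry added to the override table

before = category_for(language_name("ki", {}, main_table))
after = category_for(language_name("ki", override_table, main_table))
```

Here before names the Gikuyu category (the redirected one) and after names the Kikuyu category. Pages still need a purge or null edit before the category membership catches up.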

Italicization

Good day! Why do we use italics in this template, but not in others? Take a look here. However, when I tried using the template here, it didn't italicize: Uzbek: Oʻzbek gimnaziyasi/Ўзбек гимназияси; Russian: Узбекская гимназия; Kyrgyz: Өзбек гимназиясы. But the uz template does italicize its content in articles. Can you change it so that it doesn't? Shouldn't these templates be uniform? Nataev talk 15:25, 28 November 2017 (UTC)

I changed the template a couple of hours ago. If you click Edit on an article and then save it without making any changes (this is called a "null edit"), the italics should go away. – Jonesey95 (talk) 15:51, 28 November 2017 (UTC)
There is more to it than that. The example holds text in two scripts when it should not (Uzbek can be / has been written in three):
{{lang-uz|Oʻzbek gimnaziyasi/Ўзбек гимназияси}}
The first part of that text (left of the solidus) uses Latin script, the second part uses Cyrillic script. The Latin should be italicized but the Cyrillic should not be. {{Lang}} and the {{lang-??}} templates do not support more than one script simultaneously (there is an expectation that in future, multiple scripts will be supported; see #Wish list for future enhancement). The third Uzbek script is Arab which, unlike Latn and Cyrl, is written right-to-left so requires special handling.
We are in a transition so there is a mix of old and new. For the time being, I would write the example:
{{lang-uz|Ўзбек гимназияси}} / {{lang|uz-Latn|Oʻzbek gimnaziyasi}} → Uzbek: Ўзбек гимназияси / Oʻzbek gimnaziyasi
Order reversed because of Editor Jonesey95's template edit. I would note here that that edit doesn't really let lang module manage italics because {{lang-uz}}'s call of {{lang}} (from inside {{language with name}}) doesn't give the module any direction on how italics are to be managed.
Trappist the monk (talk) 16:32, 28 November 2017 (UTC)
OK, I see. The fact that Uzbek is currently written using two (actually, three) scripts is a huge pain in the neck. Nataev talk 08:27, 29 November 2017 (UTC)

Simpler subtag parsing

I created a simpler subtag parsing function in Module:Lang/sandbox. It does not require enumerating every combination of language, script, region, and variant code. It doesn't work perfectly yet. See Module talk:Lang/codes/testcases for the result. (I didn't want to add the testcases to Module:Lang/testcases because that page is kind of long already.) More examples would be appreciated. — Eru·tuon 02:03, 30 November 2017 (UTC)

This is something that I had intended to do but was leaving to later. I have replaced the non-working variant validation/consolidation code at the bottom of get_ietf_parts() with the working code from the live module. This new snippet also includes more helpful error messaging. Apparently there is something wrong with how Lang/sandbox or Lang/codes/testcases evaluates/renders actual column results. In the three failures, where is the fourth table element?
{{lang-??}} templates using the module are allowed to use |script=, |region=, |variant= to supplement the IETF language tag provided by the template (overriding is not currently permitted but is contemplated). The values supplied by these parameters are validated in get_ietf_parts(), which is why each of script, region, and variant is individually made lowercase rather than all at once as you have done in /sandbox.
Trappist the monk (talk) 12:16, 30 November 2017 (UTC)
I don't know what's happening to make the tables have only three elements, but I'll look into it.
I don't understand how what you are saying relates to letter case. Could you clarify? At which point is letter case significant in the function get_ietf_parts? I do notice now that language code is never lowercased, so perhaps letter case for it should be preserved, and an error be triggered for something like {{lang|GrC|...}}. — Eru·tuon 20:36, 30 November 2017 (UTC)
Actually, code is lowercased in Module:Lang before it is validated, so {{lang|GrC|...}} would not return an error. — Eru·tuon 20:40, 30 November 2017 (UTC)
In Module:Lang, get_ietf_parts() is called at line 665:
get_ietf_parts (args.code, args.script, args.region, args.variant);
In that call, args.script, args.region, and args.variant come from the template parameters |script=, |region=, and |variant= respectively and can be any case (IETF language tags have no standardized case; there is a 'common' way of writing the various subtags that, by some sort of convention, uses particular case – we mimic that in format_ietf_tag())
In get_ietf_parts() (Module:Lang) the parsing (that part you are rewriting) is case insensitive. Once parsed, we look to see if any of the template parameters (|script=, |region=, and |variant=) is set. If any of these is set, and there is no matching subtag in source, then we assign the template parameter's value to the appropriate local variable (lines: 194, 209, and 224). Then we validate. Before we can do that, we down-case the content of the variable in question because the data tables are all indexed with lowercase keys (because of __preprocess() in Module:Language/name/data).
In Module:Lang/sandbox you down-case only the value in source. That works fine for {{lang}} which doesn't support the subtag parameters but won't work for the {{lang-??}} which do/will support them.
Trappist the monk (talk) 21:18, 30 November 2017 (UTC)
Okay, I see. The supplied subtags need to be lowercased. — Eru·tuon 21:54, 30 November 2017 (UTC)
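
The order of operations described above (parse the tag case-insensitively, fill in any subtag supplied by |script=, |region=, or |variant= that the tag lacks, then down-case each value before validating against lowercase-keyed tables) could be sketched as follows. This is Python for illustration, not the module's get_ietf_parts(); the VALID sets are tiny stand-ins for the data tables, and silently ignoring a parameter that conflicts with the tag is an assumption of this sketch.

```python
import re

# Stand-in validation data with lowercase keys, as the real tables are said to have.
VALID = {"script": {"latn", "cyrl"}, "region": {"uz"}, "variant": set()}


def parse_tag(source):
    """Case-insensitive split of an IETF-style tag into its parts."""
    parts = source.split("-")
    found = {"code": parts[0], "script": None, "region": None, "variant": None}
    for sub in parts[1:]:
        if re.fullmatch(r"[A-Za-z]{4}", sub):
            found["script"] = sub
        elif re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", sub):
            found["region"] = sub
        elif re.fullmatch(r"[0-9]{4}|[A-Za-z0-9]{5,8}", sub):
            found["variant"] = sub
        else:
            raise ValueError(f"unrecognized subtag: {sub}")
    return found


def get_parts(source, script=None, region=None, variant=None, valid=VALID):
    found = parse_tag(source)
    supplied = {"script": script, "region": region, "variant": variant}
    for key, value in supplied.items():
        # A template parameter only supplements; it never replaces a subtag in the tag.
        if found[key] is None and value:
            found[key] = value
    found["code"] = found["code"].lower()
    for key in ("script", "region", "variant"):
        if found[key] is not None:
            # Down-case here, after merging, so supplied values are covered too.
            found[key] = found[key].lower()
            if found[key] not in valid[key]:
                raise ValueError(f"invalid {key}: {found[key]}")
    return found
```

Down-casing only the source string, as the sandbox did at the time, would leave a supplied value such as LATN in mixed case and it would then fail the lookup.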

Category:Articles containing Semitic languages-language text

It appears that Template:Transl/Template:Lang is the common denominator between the articles in Category:Articles containing Semitic languages-language text rather than Category:Articles containing other Semitic-language text. In Aleph "sem" is only a variable in the Template:Transl but in Adnan, "sem-Latn" is called in Template:Lang. Hyacinth (talk) 22:12, 1 December 2017 (UTC) Hyacinth (talk) 22:19, 1 December 2017 (UTC)

Template:ISO 639 name sem may contain a problem. Hyacinth (talk) 22:34, 1 December 2017 (UTC)

The code "sem" should link to Semitic language. I don't know why the word "other" was in the template, but it was breaking things. The ISO 639 templates are a mess. – Jonesey95 (talk) 23:19, 1 December 2017 (UTC)
I don't think my changes have fixed anything, though. I used to understand how these templates worked, but with the module-ization still in progress, I'm a bit at sea. – Jonesey95 (talk) 23:26, 1 December 2017 (UTC)

(edit conflict)

sem is an ISO 639-2 collective. See sem @ sil.org and their definition of Collections of languages – which definition, to me anyway, is rather obtuse. The correct name for sem is 'Semitic languages' and this is the name that {{lang}} was getting from the IANA data table and the name that Module:Lang used when creating the category link for that code. For the time being, I have created an entry in the override table in Module:Lang/data. This will work until someone creates a {{lang-sem}} template by which time we may have figured out how to handle collectives. I'll add notes and TODOs in the appropriate places.
Trappist the monk (talk) 23:48, 1 December 2017 (UTC)
Not sure that this is the venue to discuss the ISO 639 name templates. {{Lang}} and the {{lang-??}} templates are abandoning all of the ISO 639 name templates along with the templates that might have called them.
Trappist the monk (talk) 23:48, 1 December 2017 (UTC)
One issue is that some of these "collections" are language families with a proto-language (for instance, the Semitic languages with Proto-Semitic), and in that case the code for the language family is sometimes used for the proto-language. For example, in {{proto}}, sem is used as the code for Proto-Semitic. Wiktionary distinguishes the proto-language by appending -pro (sem → sem-pro). This is because language and codes must be distinct, as they are both used in etymology templates and there are distinct categories pertaining to each (for example, "Terms derived from Semitic languages" and "Terms derived from Proto-Semitic"). But I don't know if on Wikipedia this polysemy (a code being used for both language family and proto-language) will cause similar problems or what solution would be appropriate. — Eru·tuon 00:35, 2 December 2017 (UTC)
This makes my brain hurt. I'm not sure that we care what is done at {{proto}}. There, the code just feeds a {{#switch:}} that chooses a wikilinked article name to precede the 'text'. The template takes no care to properly identify the language in metadata as {{lang}} does.
IANA doesn't apparently recognize proto (that word isn't in the registry). The only 'fit' would be as a variant (5-character length) but because proto isn't registered with IANA, shouldn't the proper form be a private use subtag: sem-x-proto? We should not / must not redefine tags that are already defined by international standards organizations.
In Module:Language/data/wp languages there are three 'proto' language codes defined:
cel – IANA name: Celtic languages; WP name: Proto-Celtic, a redirect to Proto-Celtic language (ISO 639-2 collective)
gem – IANA name: Germanic languages; WP name Proto-Germanic, a redirect to Proto-Germanic language (ISO 639-2 collective)
pgl – IANA name: Primitive Irish; WP name: Proto-Irish, a redirect to Primitive Irish (ISO 639-3 individual)
Of those, pgl should probably be deleted from the WP languages table because it is an ISO 639-3 individual language so we should be displaying 'Primitive Irish' with {{lang-pgl}}. The other two inappropriately redefine the international standards organizations' code/name assignments so if we are to keep them as 'Proto-something' then we should create correct private use subtags cel-x-proto and gem-x-proto.
Trappist the monk (talk) 01:52, 2 December 2017 (UTC)
I have deleted pgl from Module:Language/data/wp languages.
Trappist the monk (talk) 12:10, 2 December 2017 (UTC)
Oops, I wrote -proto but meant -pro. Corrected.
Just to clarify, I'm not proposing that Wikipedia use the same convention (-pro) as Wiktionary. Wikipedia wants to follow external standards, while Wiktionary is perfectly comfortable with creating its own idiosyncratic hyphen-containing language codes that have nothing to do with IETF subtags. So Wikipedia and Wiktionary are incompatible here. Your idea of a private use subtag sounds more consistent with Wikipedia's preferences. — Eru·tuon 06:05, 2 December 2017 (UTC)

I believe that "other" is a reference to transliterated Semitic. Hyacinth (talk) 01:40, 2 December 2017 (UTC) As is "sem-Latn". Hyacinth (talk) 01:41, 2 December 2017 (UTC) See: Template:Category articles containing non-English-language text. Hyacinth (talk) 01:44, 2 December 2017 (UTC)

Category:Articles containing Eskimo-Aleut languages-language text is on String figure. Hyacinth (talk) 03:22, 5 December 2017 (UTC) Template:Lang-esx. Hyacinth (talk) 03:26, 5 December 2017 (UTC)

{{lang-esx|put two things together}} is a misuse of the template: 'put two things together' is English. Don't do that.
How the templates deal with language collections is a known unresolved issue. The fix that worked for sem won't work here because {{lang-esx}} exists.
Trappist the monk (talk) 04:10, 5 December 2017 (UTC)

Italic related, don't know how to fix

Hi. Can you have a look at Poor Dionis? Two blocks of text, both of which have italics for just one sentence, have vanished, and, looking over the potential fixes, I could find nothing to address the specific problem. Dahn (talk) 12:17, 9 December 2017 (UTC)

I think I've fixed it now (converting on the way from {{lang-xx}} to simply {{lang}} as I don't think it's necessary to add language labels, but feel free to reinstate them using template-external text). Now, the problem (visible in this old revision) was that lang-xx templates were used for an entire paragraph of text, and this text contained within it italicised phrases. The templates assume that the markup is meant for the whole text and see it as an error, but that's a legitimate use. Is there any way to fix that? – Uanfala (talk) 12:44, 9 December 2017 (UTC)
Thank you, that's a very good solution. As for the rest: I'm sure the problem can easily pop up in other templates where the same was used, so maybe it's a good idea to add that to the list of potential script errors? Dahn (talk) 12:52, 9 December 2017 (UTC)

Question re: bolding and ' ' marks

G'day, I use template:lang-sh-Latn a fair bit, and I've noticed it is now rendering with the sh word like this 'bold'. See the Background section of Yugoslav coup d'état for examples. Is there a recent change that has caused this? It doesn't comply with MOS:BOLD etc. Thanks, Peacemaker67 (click to talk to me) 08:25, 27 November 2017 (UTC)

Yes. We're in a transition period where things don't always work as expected. The problem with {{lang-hbs-Latn}} is that its language code, hbs-Latn, includes a script subtag, Latn, that specifies italics and, internally, the template also includes italic markup: ''{{{1}}}'' which together created: ''''{{{1}}}''''. I've converted {{lang-hbs-Latn}} to use the module:
{{lang-hbs-Latn|banovine}} → Serbo-Croatian Latin: banovine
Trappist the monk (talk) 11:01, 27 November 2017 (UTC)
Thanks. Peacemaker67 (click to talk to me) 01:19, 28 November 2017 (UTC)
@Trappist the monk: Here's a weirdness, with {{lang|arc-Latn}} and presumably some others:
  1. On this page, this italicizes, but in mainspace the link markup is broken and it's non-italic: [[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]] → hozwārishn
  2. On this page, this italicizes, but in mainspace the link markup is broken and it's italic: ''[[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]]'' → hozwārishn
  3. On this page, this italicizes, and it renders (italic) in mainspace: {{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}} → hozwārishn
  4. On this page, this does not italicize, and it renders (non-italic) in mainspace: ''{{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}}'' → hozwārishn
  5. On this page, this italicizes, and it renders (italic) in mainspace: {{lang|arc-Latn|[[Frahang-i Pahlavig|''hozwārishn'']]}} → [[[Frahang-i Pahlavig|hozwārishn]]] Error: {{Lang}}: text has italic markup (help)
  6. On this page, this does bold and 'single quotes' (non-italic), and it renders (the same way) in mainspace: {{lang|arc-Latn|''[[Frahang-i Pahlavig|hozwārishn]]''}} → [hozwārishn] Error: {{Lang}}: text has italic markup (help)
Another glitch (not namespace dependent):
  • Italics: {{lang|ar-Latn|shamia}} → shamia
  • Bold and 'single': {{lang|ar-Latn|''shamia''}} → [shamia] Error: {{Lang}}: text has italic markup (help)
  • Non-italic: ''{{lang|ar-Latn|shamia}}'' → shamia
None of the {{lang|foo-Latn}} instances should auto-italicize, since {{lang|es}}, etc., do not; only the {{lang-foo}} templates emit italics around Latin script by default.

I think the latter problem is the same as the one reported by Peacemaker67 above, but the namespace problem may be something new. PS: I assume the "bold and 'single'" problem is fixed in the Lua by doing italics directly instead of by wiki ''...'' markup.
 — SMcCandlish ¢ >ʌⱷ҅ʌ<  05:29, 28 November 2017 (UTC)

I took a look at the source code. ''[[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]]'' results in ''[[Frahang-i Pahlavig|<span lang="arc-Latn">''hozwārishn''</span>[[Category:Articles containing Aramaic-language text]]]]'' in mainspace. Outside of mainspace, the category isn't included, so the link works, and oddly the italics don't cancel out. (Maybe because they have no displayed text between them.) Three ideas: a |nocat= parameter to remove the category, or a |link= parameter to provide the article that the text should link to, or require that the link be placed inside {{lang}}: ''{{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}}''. — Eru·tuon 08:19, 28 November 2017 (UTC)
I have taken the liberty of changing the list markup above from unordered to ordered.
Items 1 & 2 render as they do for the reason given by Editor Erutuon: a category link inside a wikilink does not work.
Item 3 renders in italic font because the language code specifies a Latin script
Item 4 renders in upright font because the Latn script italics are reversed by the external wiki markup
Item 5 renders in italic font because, I suspect, while there is something between the italic markup in the wikilink and the italic markup provided by the template, that 'something' is not displayable text so the markup is not displayed
Item 6 renders as it does because the template applies italic markup (from the Latn script) to displayable text already wrapped in italic markup
From the very beginning of this experiment, Module:Lang has supported |italic= so you can write:
{{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]|italic=no}} → hozwārishn
|italic= always overrides any automatic italic setting.
{{lang}} only auto italicizes when the IETF language tag tells it to italicize with a Latn (case insensitive) subtag.
Those {{lang-??}} templates that have been converted to use Module:Lang, emit error messages when the 'text' to be rendered includes italic wiki markup (presumably to override the wiki markup included in the wikitext templates). That same error message code is available to {{lang}} but is disabled for now because all of the unconverted {{lang-??}} templates call {{lang}} and many of them have hard-coded italic wiki markup which would cause an error message for each of the rendered {{lang-??}} that has hard-coded italic wiki markup.
Trappist the monk (talk) 11:43, 28 November 2017 (UTC)
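
The italic rules summarized above (|italic= always wins; otherwise {{lang}} italicizes only when the tag carries a Latn subtag, case-insensitively; italic wiki markup in the text is an error for converted templates) might be sketched as below. This is Python for illustration, not the module's Lua. The function names are hypothetical, treating |italic=yes as forcing italics is an assumption, and the markup check is only an approximation that looks for exactly two apostrophes.

```python
import re

# Exactly two apostrophes, not part of a longer run such as ''' (bold).
ITALIC_MARKUP = re.compile(r"(?<!')''(?!')")


def should_italicize(tag, italic=None):
    """Decide whether text tagged with `tag` gets italic markup."""
    if italic is not None:
        value = italic.strip().lower()
        if value == "no":
            return False  # |italic=no overrides the automatic setting
        if value == "yes":
            return True   # assumed counterpart of |italic=no
    # Automatic setting: italic only when a Latn script subtag is present.
    return "latn" in tag.lower().split("-")[1:]


def has_italic_markup(text):
    """Approximate check for italic wiki markup inside the supplied text."""
    return ITALIC_MARKUP.search(text) is not None
```

With these rules, arc-Latn italicizes by default, es does not, and ''shamia'' is flagged as containing italic markup, matching the examples listed earlier in this thread.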

Expected behavior of {{lang}} versus {{lang-xx}} is that the former will always produce non-italic output; it's too often used for proper names which are not italicized in most contexts. Having it switch to producing italicized when particular language codes are used will confuse and result in wrong output; pretty much no one will remember that one particular kind of case is going to produce different output. We do expect {{lang-xx}} to italicize unless it's non-Latin material, so this auto-handling of -Latn would make sense.

Another thing: the prescribed method of representing italics within material already italicized is to turn the italics off for nested-italics. The obvious way to do this is ''Blah Blah ''Foo'' Yadda Yadda''; MoS probably actually advises this somewhere. But that's not going to work with the Lua version if it spits an error instead. What to do about this?

On the other matter: The purpose of these templates is proper markup and formatting of text. If insertion of an optional and possibly redundant category by some of them is causing these central purposes of the markup to fail, then there needs to be a parameter for adding a category not for removing one - it should not be done by default. Who actually uses "Category:Articles containing Foo-language text" and for what? That's a maintenance/tracking category type, not a category for readers. It could be made invisible or even (old-school style) be moved to the article's talk page. In the end it would make more sense for a bot to detect when a particular {{lang|foo}} etc. has been used in article, using a list of templates and parameters that do this sort of stuff, then add the categories. We don't really need the templates to do it at all.
 — SMcCandlish ¢ >ʌⱷ҅ʌ<  00:43, 30 November 2017 (UTC)

{{lang}}, in most contexts, will not produce italic rendering. Only in the special case, where the editor has taken the trouble to explicitly specify a Latin script by including the Latn IETF subtag, will the module apply italic markup.
Italics markup is being discussed at #html italic markup vs wiki italic markup.
Yes, the purpose of these templates is proper markup and formatting of text, which statement I would clarify: text supplied to the template; the template cannot know what exists outside the opening {{ and closing }}. The insertion of categories does break the wikilinks for your examples 1 & 2 above. This breakage was also true for the old, non-module version (I recreated Test page with your examples, selected this version of {{lang}}, clicked edit, typed Test page into the Preview page with this template text box, and clicked the adjacent Show preview button). From this experiment, I believe that the Module version of {{lang}} breakage of examples 1 & 2 is not new and existed in the old template. I do not know why the templates add hidden categories, nor do I know if no one uses them or if a lot of editors use them. Clearly, someone thought it important. If you wish to do away with these categories, this is probably not the correct venue.
Trappist the monk (talk) 14:24, 30 November 2017 (UTC)
Oh, I don't care if the categories exist, they just shouldn't be added by these templates if that breaks the main purpose of the templates [sometimes]; a bot can do the categorizing instead.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  07:56, 10 December 2017 (UTC)

Formatting of first line of multiline text

I ran across this problem at Jacques Dutronc#Discography. When there are multiple lines of foreign language text, the wiki syntax of the first line is not properly displayed.

The template seems to have been used without change on the above page for many years, so I assume that something's changed with the template, rather than the article having been wrong for all that time.

Here's a simple example, where the first bullet-point is shown as a standard asterisk, not as a list item:

  • een
  • twee
  • drie

Is there a problem with the template, or is this now the expected behaviour? (I'm not sure it's really being used properly on that page anyway, but maybe that's a different matter.) --David Edgar (talk) 00:28, 22 November 2017 (UTC)

This is an issue with all templates. The fix is to do this {{lang|nl|<nowiki />. However, the template shouldn't be used this way, but should be used for the exact content in the other language: *{{lang|nl|een}}, etc.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  00:49, 22 November 2017 (UTC)
I use {{Lang|xx|...}} often for multi-line text, and this report gave me a fright. Turns out, the behaviour as described seems to be triggered by * (and similar), and surprisingly goes away when <poem>...</poem> is used:


*een
*twee
*drie

although that creates undesirable white space. I agree that for list items it's best to use the template for each item. -- Michael Bednarek (talk) 05:42, 22 November 2017 (UTC)
The problem is twofold: one part is that the MediaWiki parser treats the "*" character as invoking a list only if it occurs in certain defined positions, such as the start of a line; the other part is that the module underlying this template strips leading (and trailing) whitespace, which includes newlines. So you need an initial newline, but also need to prevent that from being stripped as whitespace - all that you need is to hide that newline using a character which doesn't test as whitespace, so won't be stripped off; yet one which won't be visible when rendered by the browser. The entity for a space is ideal:

*een *twee *drie

You can use either &#32; or &#x20; they behave exactly the same. --Redrose64 🌹 (talk) 08:06, 22 November 2017 (UTC)
Just use the <nowiki /> as the first content in the parameter. This prevents any spacing problems that result from other tricks. Been doing it this way for over a decade, and you'll find it documented at block templates, e.g. Template:Block_indent/doc#Technical issues with block templates.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  20:13, 22 November 2017 (UTC)
PS: That documentation snippet actually lives in a page of its own and can be transcluded as needed with {{Block bug documentation}}.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  20:15, 22 November 2017 (UTC)
It might be possible to add a hack to fix this in Lua. Say, if the text starts with an asterisk and contains a newline (or newline plus asterisk), assume it's a bulleted list and add a newline at the beginning. That might result in unintuitive behavior in certain cases, though. — Eru·tuon 22:02, 27 November 2017 (UTC)
Doesn't have anything to do with * in particular, though. This affects all wikimarkup the effect of which is triggered only by newline followed directly by a special character (#, ;, :, probably others), or by newline then HTML comment then special character.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  00:15, 28 November 2017 (UTC)
That's right. The logic could easily be modified to accommodate #, ;, : along with *. (I think that's all of the list-ish special characters.) It would also be possible, but more costly, to accommodate HTML comments after the newline. I'll test the idea in Module:Lang/sandbox. — Eru·tuon 01:03, 28 November 2017 (UTC)
Another case is {|, I just remembered.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  07:57, 10 December 2017 (UTC)
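The heuristic floated in this thread could look roughly like the sketch below. It is written in Python for illustration, not in the Lua of Module:Lang/sandbox, and the function name `restore_leading_newline` is hypothetical. It covers only the markers named above (`*`, `#`, `;`, `:` and `{|`) and does not attempt the costlier HTML-comment case.

```python
import re

# Markup that only takes effect at the start of a line:
# the list characters and the table opener mentioned in the thread.
LINE_START_MARKUP = re.compile(r"(\*|#|;|:|\{\|)")

def restore_leading_newline(text: str) -> str:
    """Prepend a newline when already-trimmed text starts with line-start
    markup and spans more than one line; otherwise return it unchanged."""
    if "\n" in text and LINE_START_MARKUP.match(text):
        return "\n" + text
    return text
```

Under these assumptions `restore_leading_newline("*een\n*twee")` gains a leading newline, while single-line text such as `"*een"` is left alone. That single-line exemption is where the unintuitive behaviour mentioned above could arise, since a lone leading asterisk may or may not have been meant as a list item.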
As an FYI about this thread, there is a secondary issue with the above invocation of this template: <span><ul><li></li><li></li></ul></span> is bad HTML and will not render to HTML correctly on MediaWiki at some point in the future, defeating the purpose of the template (it adds items in the misnested HTML5 or the general misnesting lint errors). Since this template presently generates a span, "detecting the issue and fixing it for these limited cases" doesn't fix a secondary issue of the invocation above (and additionally is inconsistent with every other template that does something like the proposed change). I would suggest that Template:Lang should be able to output a <div></div> rather than a span (while the majority of cases are inline, I've seen quite a few where a block lang template would be helpful--poetry is one). --Izno (talk) 20:58, 28 November 2017 (UTC)
If a block lang version of this template (or some sort of switch that flips this template from default inline to block) is required, then you should add that to the feature request list. Now, while basic functionality is still at issue, new features should not be getting in the way.
Trappist the monk (talk) 23:20, 28 November 2017 (UTC)

Thousands of articles generating visible errors due to italic markup; solutions?

After fixing a couple of dozen at random in Category:Lang and lang-xx template errors, broadly speaking I see four cases:

  1. {{lang-xx|''All the text inside is italicised''}}
    • Can be replaced with {{lang-xx|italic=yes|All the text inside is italicised}}
    • Or sometimes {{lang-xx|italic=no|All the text inside is unitalicised}} when "xx" is written in Latin script in the first place
  2. {{lang-xx|Тэкст виФ транскрипцион ''Text with transcription''}}
    • Can be replaced with {{lang-xx|Тэкст виФ транскрипцион}} {{lang|xx-Latn|italic=yes|Text with transcription}}
    • Could also use "translit" parameter, though that introduces extra WP:LEADCLUTTER which could be undesirable in some cases
  3. {{lang-xx|Name1 ''or'' Name2 ''or'' Name3}}, where the "italics" markup on "or" is actually intended to de-italicise
    • Could be replaced with {{lang-xx|Name1}} or {{lang|xx|italic=yes|Name2}} or {{lang|xx|italic=yes|Name3}}
    • But I suspect bot regex replacement wouldn't be safe; there are probably similar-looking cases where something else is intended
  4. Other stuff which will require manual intervention

How shall we go about clearing this backlog? Should I go to Wikipedia:Bot requests, or is someone handling this already? (I think at least the first two cases can be handled by bots, allowing human effort to be focused on more difficult cases). Cheers, 59.149.124.29 (talk) 05:12, 11 December 2017 (UTC)

Thanks for fixing what you've fixed. Unfortunately it isn't quite as you describe.
  1. {{lang-??}} templates do not all default to italic rendering so the italic markup might have been used to negate the default italic or been used to force italics.
  2. it is not always clear whether the 'text with transcription' you refer to is a transliteration or a restatement of the text written in the language's 'other' (often Latin) script (Serbian uses both Cyrillic and Latin, for example; there are quite a few others). When the italicized text is a transliteration, and the static text provided by the {{lang-??}} template is not desired, perhaps a better choice is to use the more semantically correct {{transl}} template.
  3. yeah, I think this is the correct solution assuming that the script used for the non-English text is supposed to be italicized. The rather larger issue with your example is that the original template mixes the English 'or' with the non-English text which is counter to the underlying html markup that identifies all text in {{{1}}} as the non-English language.
I suspect then that fixes, rather than being general, are perhaps easier if done on a per-template basis. Recently I fixed several hundred instances of {{lang-so}} which by default rendered text in an upright font even though Somali is written primarily using a Latin script (this is a case where the template should have been fixed long ago, rather than editors 'fixing' each instance to italicize). I fixed the template to italicize and then used a simple search and replace regex in AWB:
find: (\{\{\s*lang\-so\s*\|)''([^'\|\}]+)''(\s*\}\})
replace: $1$2$3
I suspect that something similar may work for a lot of other templates. Of course, hundreds of editors each fixing the templates on pages that they care about could go a long way to clearing the error category. I know, that's being overly optimistic.
Trappist the monk (talk) 11:55, 11 December 2017 (UTC)
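For anyone wanting to try the pattern outside AWB, here is a rough Python equivalent of the find/replace pair above. The function name `strip_redundant_italics` is made up for this sketch, and Python writes the backreferences as `\1\2\3` where AWB uses `$1$2$3`. It assumes, as in the {{lang-so}} case, that the template itself has already been changed to italicize by default.

```python
import re

# Same pattern as the AWB "find" expression: a {{lang-so|...}} call whose
# whole first parameter is wrapped in a single pair of '' italic markers.
REDUNDANT_ITALICS = re.compile(r"(\{\{\s*lang-so\s*\|)''([^'|}]+)''(\s*\}\})")

def strip_redundant_italics(wikitext: str) -> str:
    """Drop the '' markers, keeping the template call and its text."""
    return REDUNDANT_ITALICS.sub(r"\1\2\3", wikitext)
```

Because the character class excludes `'`, `|` and `}`, the pattern leaves alone calls with further parameters and calls with mixed markup such as `Name1 ''or'' Name2`, so those still need manual review.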

pop directional format

A recent change to Module:Lang/sandbox changed this:

table.insert (span, '&lrm;');

to this:

table.insert (span, '&#x202C;');

&#x202C; is Unicode character U+202C Pop Directional Format.

Editor Great Brightstar: please explain why you think that &#x202C; is a better choice than the existing &lrm;.

Trappist the monk (talk) 11:21, 20 December 2017 (UTC)

I just left it as an experimental feature for testing purposes so that anyone could try it. Since you have asked me to explain, I decided to make a test case for this.

With PDF:

David; Hebrew: דָּוִד‬, Modern David

With LRM:

David; Hebrew: דָּוִד‎, Modern David

However, both the PDF and LRM characters performed the same (tested on both Chrome and Firefox), so I reverted my change. --Great Brightstar (talk) 02:59, 21 December 2017 (UTC)

OK, I saw this is already reverted. --Great Brightstar (talk) 03:07, 21 December 2017 (UTC)
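To see what the two entities discussed in this thread actually are, they can be looked up in the Unicode character database. The sketch below uses only Python's standard library; the helper name `describe` is hypothetical and nothing here is taken from Module:Lang.

```python
import html
import unicodedata

def describe(entity: str) -> tuple[str, str, str]:
    """Decode an HTML entity and return (code point, Unicode name, bidi class)."""
    char = html.unescape(entity)
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.bidirectional(char),
    )

lrm = describe("&lrm;")     # the character the module inserts
pdf = describe("&#x202C;")  # the character tried in the sandbox
```

The bidi classes differ: `&lrm;` is a strong left-to-right character (class `L`), whereas U+202C (class `PDF`) only closes an embedding or override opened earlier by a character such as LRE, RLE, LRO or RLO.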

A bug with the new non-italics entry point?

I'm just adding language tagging to the lead of Grand Embassy of Peter the Great using {{lang-ru}} and noticed that the Russian is being italicised unless I add |italic=no. Template:Lang-ru does correctly call Trappist the monk's new lang_xx_normal, so I'm not sure what's going on:

{{lang-ru|Великое посольство|translit=Velikoye posolstvo}}
Russian: Великое посольство, romanized: Velikoye posolstvo
{{lang-ru|Великое посольство|translit=Velikoye posolstvo|italic=no}}
Russian: Великое посольство, romanized: Velikoye posolstvo

OwenBlacker (talk) 01:09, 24 December 2017 (UTC)

fixed. Thanks for spotting that. I only left out the most important bits.
Trappist the monk (talk) 01:34, 24 December 2017 (UTC)


I found a similar error while viewing the page for Vínarterta:

{{lang|is|striped lady cake}}
striped lady cake

Lua error in Module:Lang at line 367: invalid value (boolean) at index 2 in table for 'concat'.

This error only appears for users not logged in. REH11 (talk) 02:45, 24 December 2017 (UTC)

For a while that error may have been there, for all users, not just logged out users – that particular kind of error does not discriminate between logged in and logged out. But {{lang|is|striped lady cake}} and the other, {{lang-is|Vienna cake}}, are improper uses of the templates. Both 'striped lady cake' and 'Vienna cake' are clearly English, not Icelandic, so should not be marked up as such.
Trappist the monk (talk) 03:44, 24 December 2017 (UTC)
I saw this issue at Torah just now, but a purge fixed it. Emir of Wikipedia (talk) 23:17, 24 December 2017 (UTC)