Template talk:Lang/Archive 9

This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Making Module:Lang accessible from other modules

Is there a way to make Module:Lang accessible from other modules? Since this module is used on over 800k pages, I'd like not to mess around with it and instead ask those more knowledgeable with this module. My specific need is the end-result of "<Language name>: <text>" with an option to disable italics and linking of language. Thank you. Gonnym (talk) 22:05, 6 January 2019 (UTC)

You are wanting to call one of the functions that renders a {{lang-??}} template? Two ways I can think of that give the same results as {{lang-de|German text|italic=no|link=no}}:

frame:callParserFunction ({name = '#invoke:Lang',
	args =
		{
		'lang_xx_italic',
		code = 'de',
		text = 'German text',
		italic = 'no',
		link = 'no',
		}
	});

or

frame:expandTemplate ({title='Lang-de', args = {'German text', italic = 'no', link = 'no'}})

As always, better examples of what you want will yield better answers.

—Trappist the monk (talk) 22:35, 6 January 2019 (UTC)

Is a frame object necessary? I don't really see the point of module->template->module when the modules can talk with each other directly. Looking over the code, it doesn't seem as if the frame is actually used after retrieving the arguments (if that is indeed the case, then just another access point that follows the style of function p.main(frame) return p._main(args) should work here). My example would be the episode titles in List of Pokémon: Indigo League episodes#Episodes. Currently they get passed as normal text and I can hard-code these Kanji and Romaji instances in the module, but that won't help me when another language asks for the same support, so instead I'd like to pass the title with the XX language code and pass it to this module. --Gonnym (talk) 23:05, 6 January 2019 (UTC)

You did say "Since this module is used on over 800k pages, I'd like not to mess around with it..." so I gave you options that don't touch Module:lang. You are right, Module:lang doesn't use any of the frame functions. For work I'm doing for {{Infobox Chinese}}, I have modified Module:lang/sandbox so that I could call the is_ietf_tag() and name_from_tag() functions from another module. I intend to do that with lang() but for the nonce it's easier to just expand {{lang}} and worry about the fine points later.

I'm confused. When I look at List of Pokémon: Indigo League episodes#Episodes I see {{Japanese episode list/sublist}} which gets a bunch of parameters including: |KanjiTitle= and |RomajiTitle=. Why is it that in {{Japanese episode list/sublist}} you don't, in the call to {{Episode list/sublist}}, do this:

| AltTitle{{#if:{{{RomajiTitle|}}}||NULL}} = {{transl|ja|{{{RomajiTitle|}}}}}
| RAltTitle{{#if:{{{KanjiTitle|}}}||NULL}} = &nbsp;({{lang|ja|{{{KanjiTitle|}}}}})

Nor do I see where the language name version (à la {{lang-ja}}) would be used.

—Trappist the monk (talk) 00:21, 7 January 2019 (UTC)

Sorry for not being clear. I was working on a different version as the TfD result was a merge and not a wrapper. The function that handles the title text is "createTitleText()". It currently has hardcoded the KanjiTitle and RomajiTitle, but I'd like to change that to "NativeTitle" and "TranslatedTitle" (and then probably a NativeTitleLng or something) to be more general. --Gonnym (talk) 07:23, 7 January 2019 (UTC)

There are three titles, right? The native-language title (needs {{lang}} for proper rendering), the transliterated or romanized title (needs {{transl}}), and the translated or local-wiki-language title (markup according to the local wiki's rules for titles). Both {{lang}} and {{transl}} require a language code so the parameter that provides that should explicitly use the term 'code'.

It seems to me that |KanjiTitle=, as a parameter name, is poorly chosen and your module should not prefix the Japanese language title with 'Kanji:'. Kanji is not a language but a form of writing. At random I picked a Japanese title from your Pokémon example page and decoded some of the characters to their Unicode values. The first one I picked was Hiragana (according to Unicode). Better to use the native language name as the title prefix?

Until Module:Lang can be adjusted to support access from another module, these should work:

titleString = titleString .. 'Romanization: "' .. frame:expandTemplate ({title='Transl', args = {'ja', args.RomajiTitle}}) .. '"'

titleString = titleString .. " (" .. frame:expandTemplate ({title='Lang-ja', args = {args.KanjiTitle, italic = 'no', link = 'no'}}) .. ")"

—Trappist the monk (talk) 12:40, 7 January 2019 (UTC)

Thanks Trappist. The number of titles was coded a bit awkwardly. There were 4 title parameters - |Title=, |RTitle=, |AltTitle= and |RAltTitle= - Title is in English and |AltTitle= can be either an alternative English title or a native language title, while the |RTitle= and |RAltTitle= can be either a reference or another alt title without formatting. {{Japanese episode list}} made the adjustment so "Kanji" will be placed in |RAltTitle= and Romanization in the |AltTitle=. This is basically how I inherited the code. I agree that using Kanji was not a good choice, which is why I'm looking at how to generalize it more. From your comments, I think that the changes will be from |KanjiTitle= to |NativeTitle= and |NativeTitleLangCode= and instead of |RomajiTitle= it would be |TranslitTitle= and |TranslitTitleLangCode=. Any comments on this? Also, could you ping me whenever the module will be made to support direct access? --Gonnym (talk) 12:58, 7 January 2019 (UTC)

Isn't a single language code parameter sufficient? If the native language title is rendered by {{lang-??}} using the appropriate native language code, then the transliteration of that native language title rendered by {{transl}} must use that same native language code. It makes no sense to write {{lang-ja|Japanese title}} and {{transl|de|transliterated Japanese title}}.

—Trappist the monk (talk) 13:46, 7 January 2019 (UTC)
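The single-code point can be illustrated with a small Python sketch that builds the wikitext for both templates from one language code. The function name `title_wikitext` and the output layout are assumptions for illustration only; the sketch just concatenates strings and does not render anything.

```python
def title_wikitext(code, native_title, translit_title):
    """Build wikitext for a transliterated title followed by the native title.

    The same language code is passed to both {{transl}} and {{lang}},
    so a second code parameter is not needed.
    """
    translit = "{{transl|%s|%s}}" % (code, translit_title)
    native = "{{lang|%s|%s}}" % (code, native_title)
    return "%s (%s)" % (translit, native)
```

Because the code is a single argument, a mismatched pair such as {{lang-ja}} with {{transl|de|...}} cannot be produced by this helper.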

Oh, I didn't know they both used the same language code. That's even better. Also, are all transliterations called "Romanization"? Making sure I don't have an incorrect label. --Gonnym (talk) 14:00, 7 January 2019 (UTC)

Romanization generally applies to transliterations from the native script to the Latin script. This is, by far, the most common use at en.wiki and perhaps at other Latin-script wikis. I do not know what happens at wikis written with other scripts (Cyrillic, Greek, Devanagari, ...). This suggests that 'Transcription: ' is probably the better prefix for transliterated titles.

—Trappist the monk (talk) 14:22, 7 January 2019 (UTC)

Done in the sandbox and appears to work. I haven't got a current need for the {{lang-??}} so those functions aren't tested as well.

local lang_mod = require ("Module:Lang/sandbox");
titleString = titleString .. " (" .. lang_mod._lang_xx_inherit ({code = args.NativeTitleLangCode, args.NativeTitle, link = 'no'}) .. ")"
-- or for the italic rendering:
titleString = titleString .. " (" .. lang_mod._lang_xx_italic ({code = args.NativeTitleLangCode, args.NativeTitle, link = 'no'}) .. ")"

Try these. Report problems here. I'll move these changes into the live module after we have some experience with the sandbox.

—Trappist the monk (talk) 11:24, 8 January 2019 (UTC)

Seems that both are working well.

titleString = titleString .. 'Transcription: "' .. langModule._transl({args.NativeTitleLangCode, args.TranslitTitle, italic = 'no'})  .. '"'

is how I accessed transl. I tried using "code=" here also but that didn't work. --Gonnym (talk) 16:05, 8 January 2019 (UTC)

Is there an ETA on when this will move to live version? Does it need more testing? Anything specific I can do to help? --Gonnym (talk) 10:21, 16 January 2019 (UTC)

done

—Trappist the monk (talk) 12:16, 16 January 2019 (UTC)

Use for Fraternities and sororities.

Should {{lang|grc|ΦΒΚ}} or {{lang|el|ΦΒΚ}} (or other greek letter combinations representing fraternities and sororities) be used in those articles? I thought that I should ask here for any ideas before bringing it up on WP:FRAT. Right now it is inconsistent with a few fraternities using it in the infobox like Phi Beta Delta. Naraht (talk) 22:47, 5 February 2019 (UTC)

And I guess that lang|grc should be used since these would be references to ancient Greek rather than current (which would use el)

I think the Greek letters in the names of fraternities are as "Greek" as are the Greek letters used in math expressions. – Uanfala (talk) 23:07, 5 February 2019 (UTC)

How about neither, since they are just letters from the Greek alphabet that are not Greek words. When written out as words using Latin characters, as in your Phi Beta Kappa example, the {{lang}} template is not required because these words are loanwords in English (I can find them in my 1987 Webster's Ninth New Collegiate Dictionary) so, as words, they do not require special treatment. If you write them out as words in Greek then use {{lang}} with the ISO 639 code appropriate to the Greek spelling.

—Trappist the monk (talk) 23:11, 5 February 2019 (UTC)

Hebrew language characters

Articles with Hebrew language characters cause English text to read backwards in Opera 12. How do I get it right, as in modern Opera? Thanks. --Smarkflea (talk) 18:47, 24 February 2019 (UTC)

Without an example of where you are seeing this it is hard to offer much useful help. Article where you are seeing this problem?

—Trappist the monk (talk) 18:50, 24 February 2019 (UTC)

Shemini Atzeret is one. --Smarkflea (talk) 18:53, 24 February 2019 (UTC)

There is one instance of {{lang}} in that article:

When Shemini Atzeret falls on the Shabbat, the Scroll of [[Ecclesiastes]], or Kohelet ({{lang|he|קהלת}}, otherwise read in Ashkenazi synagogues on the [[Shabbat]] of Sukkot), is read on that day outside the Land of Israel.

When Shemini Atzeret falls on the Shabbat, the Scroll of Ecclesiastes, or Kohelet (קהלת, otherwise read in Ashkenazi synagogues on the Shabbat of Sukkot), is read on that day outside the Land of Israel.

Is this the place you are having trouble? Looks correct to me (win7 chrome).

—Trappist the monk (talk) 19:32, 24 February 2019 (UTC)

That example also looks correct in Firefox 65.0 on Ubuntu. Certes (talk) 19:37, 24 February 2019 (UTC)

That paragraph that begins 'Psalm 27' is perfectly fine in old Opera; only the lead paragraph is backwards (plus what's in (Official name) in the infobox). Smarkflea (talk) 20:03, 24 February 2019 (UTC)

The Hebrew text in the lede and in the infobox is handled by {{Hebrew}}, which does not use {{lang}}:

'''Shemini Atzeret''' (<big>{{Script/Hebrew|שְׁמִינִי עֲצֶרֶת}}</big> – "Eighth [day of] Assembly"; [[Sefardic]]/Israeli pron. ''shemini atzèret''; [[Ashkenazic]] pron. ''shmini-atsères'')...

Shemini Atzeret (שְׁמִינִי עֲצֶרֶת‎ – "Eighth [day of] Assembly"; Sefardic/Israeli pron. shemini atzèret; Ashkenazic pron. shmini-atsères)...

Problems with {{Hebrew}} should be addressed at that template's talk page. There may be no fix possible from Wikipedia's end of things because Opera 12 is rather out of date ...

—Trappist the monk (talk) 20:24, 24 February 2019 (UTC)

OK, thanks...Smarkflea (talk) 20:41, 24 February 2019 (UTC)

The problem has been resolved by removing a bidirectional control character (U+202C POP DIRECTIONAL FORMATTING) from {{script/Hebrew}}. — Eru·tuon 03:15, 26 February 2019 (UTC)
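A stray formatting character like the one named in this comment is invisible in most editors, so it helps to search for it by code point. The following Python sketch locates and strips the explicit bidirectional formatting characters U+202A through U+202E. The helper names are hypothetical, and this is not a description of how {{script/Hebrew}} itself was edited.

```python
# Explicit bidirectional formatting characters (embeddings, overrides, pop).
BIDI_FORMATTING = {
    "\u202a": "LEFT-TO-RIGHT EMBEDDING",
    "\u202b": "RIGHT-TO-LEFT EMBEDDING",
    "\u202c": "POP DIRECTIONAL FORMATTING",
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}


def find_bidi_formatting(text):
    """Return (index, character name) for each formatting character found."""
    return [(i, BIDI_FORMATTING[ch]) for i, ch in enumerate(text) if ch in BIDI_FORMATTING]


def strip_bidi_formatting(text):
    """Return the text with the explicit formatting characters removed."""
    return "".join(ch for ch in text if ch not in BIDI_FORMATTING)
```

Running `find_bidi_formatting` over template source before editing shows exactly where such a character sits, which is safer than removing characters blindly.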

Ligurian dab

Please can someone make the necessary changes so that {{#invoke:Lang|name_from_tag|lij|Some text|link=yes}} (Ligurian) links to Ligurian (Romance language) rather than the disambiguation page Ligurian language? Test case: Camogli (top of infobox). Thanks, Certes (talk) 13:15, 9 December 2018 (UTC)

Special cases are a plague on our house. Perhaps the better solution is to handle the Ligurian language articles in a different, more standard way. MOS:DAB at §Disambiguation pages with only two entries would suggest that a dab page for those two articles is unnecessary or inappropriate. Surely one of them is primary (I would guess that to be the Romance language article – more than 500 article-space incoming links v. less than 100 for ancient) so shouldn't that article take over the 'Ligurian language' name and then let the existing hatnotes do the DAB linking between the two?

—Trappist the monk (talk) 13:55, 9 December 2018 (UTC)

Ah, I see the problem. There's an assumption near the end of Module:Lang's function name_from_tag:

return make_wikilink (language_name .. ' language', language_name);

I was assuming that there would be a lookup table from code to article title, where we could add someTable["lij"] = "Ligurian (Romance language)";. We may be able to force a primary topic in this case and I've pinged the disambiguation people to see how well that solution would work in general. Alternatively, we could add a table of exceptions in the module (if you don't mind handling edit requests), or just link via the existing redirects such as ISO 639:lij where they exist (keeping the "Foo language" syntax as a fallback). Certes (talk) 14:35, 9 December 2018 (UTC)
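The exceptions-table idea in this comment amounts to a lookup with a fallback. Here is a minimal Python sketch of that logic; the names `ARTICLE_NAME_OVERRIDES`, `language_article` and this simplified `make_wikilink` are assumptions for illustration and are not taken from Module:Lang.

```python
# Hypothetical exceptions table: language code -> article title.
ARTICLE_NAME_OVERRIDES = {
    "lij": "Ligurian (Romance language)",
}


def language_article(code, language_name):
    """Use the override when one exists, else fall back to '<name> language'."""
    return ARTICLE_NAME_OVERRIDES.get(code, language_name + " language")


def make_wikilink(target, label):
    """Simplified piped wikilink; the real module function may behave differently."""
    return "[[%s|%s]]" % (target, label)
```

Codes without an entry keep the existing "Foo language" behaviour, so the table only has to list the special cases.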

Template:lang-lij is currently responsible for 207 bad links to DAB pages, which is over 9% of the total such bad links in English Wikipedia. This problem needs to be addressed. As I understand it, ISO 639 code lij only applies to Ligurian (Romance language).

I am extremely reluctant to define a WP:PTOPIC in even the clearest of cases. That is a guaranteed way of slowly but surely accumulating bad links which are unlikely to be found and fixed and which will degrade the encyclopaedia. I keep an occasional eye on one PTOPIC where the primary meaning is overwhelming (something like 100:1 or better), but which nevertheless collects bad links. Page views aren't everything, but they do measure what readers are looking for. The Romance language gets 83.5% of the views and the ancient language 16.5%, a ratio of 5:1. I do not consider that that ratio passes the "much more likely than any other single topic" test required by PTOPIC. Narky Blert (talk) 10:07, 15 December 2018 (UTC)

@Trappist the monk: Please consider implementing the change discussed at Wikipedia talk:Disambiguation pages with links#Ligurian language, which will mend the 207 faulty wikilinks and provide a framework for fixing similar cases easily in future. For comparison, our next most mislinked dab page has five incoming links. Certes (talk) 12:21, 15 December 2018 (UTC)

done.

—Trappist the monk (talk) 14:02, 15 December 2018 (UTC)

@Trappist the monk: Thank you! After null-editing we're now down to about a dozen bad links. These are from redirects such as Munegu via {{R from alternative language}}. That template invokes Module:Lang's name_from_tag and appends " language" to make an article title. Could we simplify it to invoke name_from_tag with link=yes, rather than constructing the link manually? I've mocked this up in {{R from alternative language/sandbox}}; the effect can be seen in {{R from alternative language/testcases}}. Certes (talk) 01:42, 17 December 2018 (UTC)

Umm, you don't need my permission. When I made the change to {{R from alternative language}}, I was interested in making only minimally necessary changes to switch away from the plethora of ISO 639 name templates to Module:lang and its smaller suite of data modules. The change you propose seems correct for lij and should be correct for the non-special cases codes as well. If it's not, I'm sure that someone will complain. Until then, I guess I would say that you should proceed.

—Trappist the monk (talk) 02:08, 17 December 2018 (UTC)

Thanks for the confirmation. I've put in an edit request. Certes (talk) 12:15, 17 December 2018 (UTC)

Mono languages

I've run into a similar issue where mnr links to Mono, but should refer to Mono language (California). I'd like to add four special cases to Module:Lang/data, but since I'm not familiar with the workings of the module, I would like to check with other editors. My proposed change would be to add the following to the local article_name list:

['mnr'] = {"Mono language (California)"},
['mnh'] = {"Mono language (Congo)"},
['mru'] = {"Mono language (Cameroon)"},
['mte'] = {"Mono-Alu language"}

Any comments or suggestions before I make this edit? —hike395 (talk) 09:41, 4 March 2019 (UTC)

mnr is used in Owens Valley. I don't see any other current problems, though dozens of languages could potentially produce bad links. Certes (talk) 11:54, 4 March 2019 (UTC)

done.

—Trappist the monk (talk) 12:08, 4 March 2019 (UTC)

Thanks, Certes and Trappist! —hike395 (talk) 02:18, 5 March 2019 (UTC)

Template:Nihongo

Just found {{Nihongo}} which isn't using this module. Not sure if what it does is a duplicate of some lang-supported template, so notifying any watcher here that might know more. --Gonnym (talk) 18:31, 9 January 2019 (UTC)

Nihongo has hung out there because it has additional options that may/may not be useful to the general Lang module (detailed handling of the multiple scripts specific to Japanese). It's probable that it could be implemented by Module:Lang plus some calling extension module (e.g. Module:Nihongo). --Izno (talk) 20:21, 9 January 2019 (UTC)

possible dumb question - is there a way to include {{lang}} in something that is using {{nihongo}} or one of its variants? (May not be possible/desirable.) Wikipedia:Typo Team/moss doesn't flag articles with the {{lang}} template for spellcheck but is flagging LGBT rights in Japan. I'd like to update the article so it's not considered an error but not really sure how. Or is this maybe something that should be added to Moss tool so these articles aren't mistakenly flagged? Cinnamingirl (talk) 17:34, 7 April 2019 (UTC)

The only dumb question is the one not asked. But, that question does need to make sense to the reader. This reader doesn't understand your question. I don't know anything about moss, but I do know that {{lang}} has nothing to do with spelling. Whatever editors provide, {{lang}} will accept (same, in this case, with {{lang-ja}}). I guess that {{lang}} is a way to keep moss from emitting an error for non-English words but that is a use that the {{lang}} and {{transl}} templates play no active part in.

LGBT rights in Japan is listed here Wikipedia:Typo Team/moss/L#LG-LW 2 where it says that nenja is wrapped in {{nihongo}}, and sure enough, it is:

{{nihongo|''nenja''|念者||"lover" or "admirer"}} → nenja (念者, "lover" or "admirer")

If the moss tool doesn't accept {{nihongo}} but does accept {{lang}} and {{transl}} then perhaps {{nihongo}} can be replaced:

{{transl|ja|nenja}} ({{lang|ja|念者}}, "lover" or "admirer")

nenja (念者, "lover" or "admirer") – replacement

nenja (念者, "lover" or "admirer") – {{nihongo}} original

No doubt, someone at moss thought about this and took the decision to disregard {{nihongo}}. I don't know why that decision was taken so perhaps you should ask at moss. If you get an answer, post a link to the discussion here.

—Trappist the monk (talk) 18:11, 7 April 2019 (UTC)

Looking at the documentation for {{nihongo}}, it would appear that the template is misused in LGBT rights in Japan. The documentation shows this:

{{Nihongo|English|kanji/kana|rōmaji|extra|extra2}}

Filling in the blanks:

{{Nihongo|"lover" or "admirer"|念者|nenja}} → "lover" or "admirer" (念者, nenja)

There is {{nihongo3}}:

{{Nihongo3|"English"|Kanji|Rōmaji|extra|extra2}}

which appears to match the misused {{nihongo}}:

{{Nihongo3|"lover" or "admirer"|念者|nenja}}

nenja (念者, "lover" or "admirer") – {{nihongo3}}

nenja (念者, "lover" or "admirer") – {{nihongo}} original

nenja (念者, "lover" or "admirer") – {{transl}} / {{lang}} replacement

Under the bonnet, while all of these appear to be the same, clearly they are different:

<span title="Hepburn transliteration"><i lang="ja-Latn">nenja</i></span><span style="font-weight: normal"> (<span title="Japanese-language text"><span lang="ja">念者</span></span>, "lover" or "admirer")</span>

– {{nihongo3}}

''nenja''<span style="font-weight: normal"> (<span title="Japanese-language text"><span lang="ja">念者</span></span>, "lover" or "admirer")</span>

– {{nihongo}} original

<span title="Japanese-language romanization"><i lang="ja-Latn">nenja</i></span> (<span title="Japanese-language text"><span lang="ja">念者</span></span>, "lover" or "admirer")

– {{transl}} / {{lang}} replacement

In {{nihongo}} and {{nihongo3}}, the transliterated text is treated as if it were English (no lang attribute); that should probably be fixed in those templates.

For completeness, {{nihongo2}} is a wrapper for {{lang|ja|...}} where ... is kanji/kana so not quite appropriate for this discussion.

—Trappist the monk (talk) 18:55, 7 April 2019 (UTC)
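The markup differences quoted in this comment can be checked mechanically by listing the `lang` attributes in each rendering. The sketch below uses Python's standard `html.parser`; the two strings are copied from the markup quoted above, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect the value of every lang attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "lang":
                self.langs.append(value)


def lang_attributes(html):
    """Return the lang attribute values found in an HTML fragment."""
    parser = LangCollector()
    parser.feed(html)
    parser.close()
    return parser.langs


# The {{nihongo}} original: the romanized word is plain wikitext italics.
NIHONGO_ORIGINAL = """''nenja''<span style="font-weight: normal"> (<span title="Japanese-language text"><span lang="ja">念者</span></span>, "lover" or "admirer")</span>"""

# The {{transl}} / {{lang}} replacement: the romanized word carries lang="ja-Latn".
REPLACEMENT = """<span title="Japanese-language romanization"><i lang="ja-Latn">nenja</i></span> (<span title="Japanese-language text"><span lang="ja">念者</span></span>, "lover" or "admirer")"""
```

Calling `lang_attributes` on each string shows that only the replacement marks the romanized word with a language tag, while the original marks only the Japanese-script text.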

Expand documentation

Would someone please list the usage of this template in its documentation? I know it must take at least 2 parameters, [1] for the language code, [2] for the text in that language. This information is missing from the documentation. If there are any optional parameters, these should all be listed. If there are no optional parameters, this should be made explicit. Danielklein (talk) 01:06, 15 April 2019 (UTC)

There is a ton of documentation at Template:Lang. Are you talking about a different template? – Jonesey95 (talk) 04:25, 15 April 2019 (UTC)

“Translit.” should be Transcribed

The alleged Arabic “transliterations” are actually almost always transcriptions. How can I change the template so it says “transcribed” rather than “transliterated”?

Kilo77 (talk) 01:16, 26 April 2019 (UTC)

Please provide the definitions for these two words that you think mean something that my brief perusal of dictionaries indicates otherwise. --Izno (talk) 13:27, 26 April 2019 (UTC)

Also, please provide examples that you believe show that the template is doing the wrong thing.

—Trappist the monk (talk) 13:41, 26 April 2019 (UTC)

Izno, the lede of our article Transliteration should provide enough of a general background, and as for the case of Arabic, Romanization of Arabic (and particularly the section Romanization of Arabic#Transliteration vs. transcription) spell out why the romanisations commonly used are not transliterations. Trappist, an example would be {{lang-ar|القرآن|al-Qurʾān}} which outputs Arabic: القرآن, romanized: al-Qurʾān – the problem here is the use of the label "translit". – Uanfala (talk) 13:54, 26 April 2019 (UTC)

Here is the first paragraph from Transliteration:

Transliteration is a type of conversion of a text from one script to another that involves swapping letters (thus trans- + liter-) in predictable ways (such as α → a, д → d, χ → ch, ն → n or æ → ae).

and the first paragraph from Transcription (linguistics):

Transcription in the linguistic sense is the systematic representation of language in written form. The source can either be utterances (speech or sign language) or preexisting text in another writing system.

This non-linguist has trouble distinguishing the one from the other and how whatever differences exist between the two apply to {{lang}} and related templates. Is this 'problem' unique to the Arabic languages or does it apply across the board to all languages that might be romanized? Why, after all of these years, is this 'problem' just now coming into the light?

—Trappist the monk (talk) 14:31, 26 April 2019 (UTC)

In Semitic languages such as Arabic and Hebrew, the difference between transliteration and transcription is crucial. For example, the Hebrew word מלך (meaning “king”) is transliterated as MLK but transcribed as MÉLEKH. The transliteration only represents the spelling, i.e. the letters (no vowels, no stress). The transcription, on the other hand, represents the pronunciation. What I noticed here is that in most cases whereas the template says “translit.”, what is provided is in fact the transcription. Kilo77 (talk) 14:58, 26 April 2019 (UTC)
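The letter-for-letter nature of transliteration in this example can be shown with a tiny mapping. This Python sketch covers only the three letters of the word discussed here; the table and function name are illustrative assumptions, not a complete or standard scheme.

```python
# Partial, illustrative mapping: only the letters of the example word.
HEBREW_TO_LATIN = {
    "\u05de": "M",  # mem
    "\u05dc": "L",  # lamed
    "\u05da": "K",  # final kaf
}


def transliterate(word):
    """Swap each letter for its Latin counterpart; nothing else is added."""
    return "".join(HEBREW_TO_LATIN[ch] for ch in word)


# The example word, written as code points in logical (reading) order.
MELEKH = "\u05de\u05dc\u05da"
```

The mapping can only return what the spelling contains. The vowels and stress in the transcription MÉLEKH come from knowledge of the pronunciation, so no per-letter table of this kind can produce them.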

Ok, thanks for that, it's helpful ... though, isn't pronunciation the domain of IPA (which these templates don't support)? And what about answers to my other questions?

Recalling that these templates are used by experts and those who are not, how should Module:lang and the templates be fixed to accommodate generic romanization, transliteration, and transcription? Some sort of very clear and concise documentation that explains these differences for the non-expert editor population is an absolute must.

—Trappist the monk (talk) 15:16, 26 April 2019 (UTC)

Transcription is a broad concept that covers uses (like IPA) that we aren't interested in here. If we focus on the contexts where the lang templates are used, we're looking at romanisation: giving some sort of Latin-script representation of a string in another, non-Latin script. There are two ways this can be done. One is transliteration: using a straightforward mapping of the foreign-script characters into Latin characters. This is what's normally done when the source script is an alphabet (like Cyrillic) or an abugida (like the writing systems of South and Southeast Asia). For other writing systems, this won't produce helpful results. Transliterating from an abjad (like Hebrew or Arabic) will not render the short vowels (as explained above), while for logographic writing systems (like Chinese) there's usually no meaningful way to map the logograms into Latin characters. What can be done in both those cases, is to incorporate information about how the source string is pronounced. For Arabic that involves supplying the short vowels, and for Chinese it involves ignoring the logograms altogether and simply representing some standardised form of the pronunciation (that's what pinyin does for example). This strategy of using extra information that's not present in the source text is termed "transcription". It's a form of romanisation: its aim is to represent the written form and so it's different, for example, from IPA transcription, which is normally used to represent more or less faithfully the spoken form. Foreign-script terms can be given both with a romanisation and with a pronunciation, and these will almost always look quite different (see for example the lede of Siwa Oasis).
Using "transliteration" to refer to transcription is inaccurate. How can this be avoided? One way is to make the template recognise the source writing system and choose one of the two labels ("transliteration" or "transcription") accordingly. Another solution is to simply avoid having to make the distinction: using a more general term like "romanisation" will do the trick. Even better, a label can be omitted altogether: readers don't really need to be told that, do they? And most instances of romanisation I've seen don't use this template parameter anyway, and that's probably a major reason why this issue hasn't been brought up, at least in recent times. – Uanfala (talk) 18:48, 26 April 2019 (UTC)
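The first option in this comment (pick the label from the source writing system) can be sketched as a two-step lookup. In the Python below, the keys are ISO 15924 script codes; the table is deliberately tiny, and the names and the fallback choice are assumptions for illustration rather than anything the template does.

```python
# Illustrative subset: ISO 15924 script code -> kind of writing system.
SCRIPT_TYPE = {
    "Cyrl": "alphabet",
    "Deva": "abugida",
    "Hebr": "abjad",
    "Arab": "abjad",
    "Hani": "logographic",
}


def romanization_label(script):
    """Choose a label from the kind of source script.

    Alphabets and abugidas map letter for letter, so 'transliteration' fits.
    Abjads and logographic scripts need pronunciation data, so 'transcription'
    fits. Unknown scripts get the general term.
    """
    kind = SCRIPT_TYPE.get(script)
    if kind in ("alphabet", "abugida"):
        return "transliteration"
    if kind in ("abjad", "logographic"):
        return "transcription"
    return "romanisation"
```

The fallback branch is the second option from the same comment: when the distinction cannot be made reliably, the general term avoids making a wrong claim.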

Since early in this discussion I have been thinking that the simple solution is, as you suggest, to generalize on romanization because the links that Module:lang creates are to 'Romanization of <language name>' (when an article of that name exists).

Because the {{lang-??}} templates support |translit= we should deprecate that parameter and replace it with |roman= or some-such. Further, instead of going to the trouble of having the module figure out if the romanization is transliteration or transcription, we can offer an |rtype= parameter that takes as values the keywords xlit and xscript. For the general case where |roman= (or {{{3}}}) has a value and |rtype= is empty or omitted, the template renders |roman=<romanized text> as:

[[Romanization of <language>|Romanized]]: <romanized text>

when |rtype=xlit then

[[Romanization of <language>|Transliterated]]: <romanized text>

when |rtype=xscript then

[[Romanization of <language>|Transcribed]]: <romanized text>

The same code also handles tool tips for {{transl}}. For that template, I suspect that the correct thing to do is to use the 'romanization' term in the tool tip unless {{{2}}} holds a transliteration standard in which case the tool tip will use the 'transliteration' term.

—Trappist the monk (talk) 13:27, 27 April 2019 (UTC)
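The proposed |rtype= behaviour reduces to a keyword lookup with a default. This Python sketch mirrors the three renderings listed in this comment; the function and table names are hypothetical, and treating an unrecognized keyword the same as an omitted one is an assumption of the sketch, not something stated in the proposal.

```python
# Keywords from the proposal mapped to the link label they select.
RTYPE_LABELS = {
    "xlit": "Transliterated",
    "xscript": "Transcribed",
}


def render_romanization(language, text, rtype=None):
    """Return the wikitext for the romanized portion of a {{lang-??}} rendering."""
    # Empty, omitted, or (by assumption) unrecognized rtype falls back to 'Romanized'.
    label = RTYPE_LABELS.get(rtype, "Romanized")
    return "[[Romanization of %s|%s]]: %s" % (language, label, text)
```

The link target stays "Romanization of <language>" in every case, so only the visible label changes with the keyword.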

Thank you, Trappist the monk and Uanfala (talk). You are right. I should perhaps add that TRANSCRIPTION in the case of Arabic, Hebrew and also Standard Chinese is actually a good term here, because it is distinct from PHONETIC TRANSCRIPTION (where IPA allows both broad (phonemic) and narrow (phonetic) “phonetic transcriptions”). So from my own perspective, as a student of SEMITIC linguistics, replacing TRANSLIT. by TRANSCRIBED (rather than ROMANIZED) would be good enough. Regardless of transcribed or romanized: the current template should be corrected. Thank you, again. Kilo77 (talk) 13:44, 27 April 2019 (UTC)

I have tweaked Module:lang/sandbox (visual and tool-tip rendering), cf:

{{lang-ar|القرآن|al-Qurʾān}} → Arabic: القرآن, romanized: al-Qurʾān
{{lang-ar/sandbox|القرآن|al-Qurʾān}} → Arabic: القرآن, romanized: al-Qurʾān

these same with |translit-std=ISO:

{{lang-ar|القرآن|al-Qurʾān|translit-std=ISO}} → Arabic: القرآن, romanized: al-Qurʾān
{{lang-ar/sandbox|القرآن|al-Qurʾān|translit-std=ISO}} → Arabic: القرآن, romanized: al-Qurʾān

and how the change applies to {{transl}}:

{{transl|ar|al-Qurʾān}} → al-Qurʾān
{{transl/sandbox|ar|al-Qurʾān}} → al-Qurʾān

these same when a transliteration standard is supplied:

{{transl|ar|ISO|al-Qurʾān}} → al-Qurʾān
{{transl/sandbox|ar|ISO|al-Qurʾān}} → al-Qurʾān

—Trappist the monk (talk) 13:33, 29 April 2019 (UTC)

Thank you, Trappist the monk. I cannot see any actual change within the articles. For instance, under Quran, it wrongly says "Arabic: القرآن‎, translit. al-Qurʾān". It should be transcribed (or at least "romanized") rather than "transliterated". Are we waiting for approval? Likewise: what about Hebrew and Standard Chinese? Thank you. Kilo77 (talk) 03:54, 2 May 2019 (UTC)

I made the tweak in Module:Lang/sandbox so you won't see the changes in articles that use live templates. I was waiting for you to comment because yours is the original complaint. Can I take it from your (or at least "romanized") rather than "transliterated" statement that, at least for Arabic, the sandbox tweak is acceptable? What do you mean: what about Hebrew and Standard Chinese? Are you suggesting that this 'romanized' solution will not work for those languages? If so, why? And also if so, what must be done for Hebrew and Standard Chinese? And, further, does whatever-must-be-done-for-Standard-Chinese also apply to the other 'flavors' of Chinese?

—Trappist the monk (talk) 12:07, 2 May 2019 (UTC)

As far as I can see, the changes to the sandbox will solve the problems with Hebrew, Chinese and the rest. – Uanfala (talk) 16:51, 2 May 2019 (UTC)

Thank you, both. Kilo77 (talk) 20:38, 2 May 2019 (UTC)

Trappist the monk: Do you need anything else? When will the changes happen? Thank you. Kilo77 (talk) 12:25, 7 May 2019 (UTC)

done; I was distracted by events elsewhere.

—Trappist the monk (talk) 12:50, 7 May 2019 (UTC)

Thank you. That’s a big improvement for the encyclopedia. Kilo77 (talk) 13:31, 7 May 2019 (UTC)

Use in grammar tables?

What is the recommended practice for grammar tables, such as those found on Slovene numerals? These should not be italicised, but having to put italic=no on all of the forms is cumbersome. Is there no shortcut to this? Rua (mew) 08:54, 19 May 2019 (UTC)

"These should not be italicised" Why not? In that article, it seems the more onerous task is applying {{lang}} to the various non-English words. Adding |italic=no (assuming that that is appropriate) is a simple regex search and replace:

search: (\{\{lang\|sl\|[^\}]+)

replace: $1|italic=no

—Trappist the monk (talk) 09:11, 19 May 2019 (UTC)
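
The same search and replace can be run outside the wiki editor. The following is a minimal Python sketch, assuming the article wikitext has been copied into a string. The helper name add_italic_no is made up for illustration, and the pattern is the one given above with $1 written as \1 for Python. It does not cope with nested templates, and it will add a second |italic=no to calls that already have one.

```python
import re

# Pattern from the comment above: capture "{{lang|sl|" plus everything up to
# the first closing brace, so that |italic=no can be appended to the capture.
LANG_SL = re.compile(r"(\{\{lang\|sl\|[^\}]+)")


def add_italic_no(wikitext: str) -> str:
    """Append |italic=no to every {{lang|sl|...}} call (hypothetical helper)."""
    return LANG_SL.sub(r"\1|italic=no", wikitext)


if __name__ == "__main__":
    print(add_italic_no("{{lang|sl|ena}}, {{lang|sl|dva}}"))
```

Checking the diff before saving is still advisable, since the pattern stops at the first closing brace it meets.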

(edit conflict) Why should they not be italicized? MOS:FOREIGNITALIC suggests doing so, and MOS:LANG says that the lang template is the best way to make foreign-language text accessible in various ways. As for a way to make it easy to add the lang template, I often copy the whole article into text editing software and then use copy-paste or find-replace to make quick, consistent work of applying or fixing templates. – Jonesey95 (talk) 09:18, 19 May 2019 (UTC)

I don't think there is any reason to italicise the terms in grammar tables. It is standard practice in the English Wiktionary to not italicise in grammar tables, but to only italicise mentions of words in running text. The MOS says nothing about tables like this, which are a special case and I consider it an error of omission. Rua (mew) 10:16, 19 May 2019 (UTC)

Mind you, this logic applies to more than grammatical tables: I think the general purpose of italics is to highlight phrases that are not in English. When all you're working with is a table of non-English terms, it might be reasonable not to place them all in italics. (I don't know that I agree with this logic, but it's there.) --Izno (talk) 13:24, 19 May 2019 (UTC)

The CMOS (17th ed, 2017, section 11.3): An entire sentence or a passage of two or more sentences in another language is usually set in roman. Sure, that's for passages, not tables, but it's easy to see why the same logic should apply. – Uanfala (talk) 14:17, 19 May 2019 (UTC)

Then should this rationale also apply to tables of newspapers, tables of ships? Editor Rua's rationale, if I understand it, is that non-italics in tables is standard practice in the English Wiktionary. But, that does not mean that the same does or should apply here at the English Wikipedia. This template talk page is not the place to make such a determination. That discussion correctly belongs at WT:MOSTEXT.

—Trappist the monk (talk) 14:30, 19 May 2019 (UTC)

I think it's standard practice on Wikipedia as well. A look at various grammar articles may be helpful. Rua (mew) 14:31, 19 May 2019 (UTC)

Adding that there doesn't seem to be anything in our manual of style mandating the use of italics in this context. And given that MOS:FOREIGNITALIC talks of using italics for phrases in other languages and for isolated foreign words, it would seem that larger chunks of foreign text (as in grammar tables) should be assumed exempt. – Uanfala (talk) 14:35, 19 May 2019 (UTC)

Another point. Adding italic=no might be cumbersome, but so is adding {{lang}} to begin with. If there's markup to apply to every single cell in a table, shouldn't there be a way to do that once for the whole thing? I'm tempted here to bypass the template altogether, and do what it does directly in the table: that's add title="Slovene language text" lang="sl" just once, at the top of the table. This sets the title and lang in the <table> element, so presumably this should apply to the whole thing. I know next to nothing about html, so I don't know if this isn't a bad idea. – Uanfala (talk) 14:56, 19 May 2019 (UTC)

If you were to do that, the 'markup' will also apply to the English-language column and row headers (float your mouse over any of the headers):

English	English	English
English row header	Slovene text	Slovene text

this 'works' (adding title="Slovene language text" lang="sl") to each cell (becomes <td title="Slovene language text" lang="sl">...</td>) but that is just as cumbersome:

English	English	English
English row header	Slovene text	Slovene text

If there is enough call for such tables, perhaps a better solution is some sort of table-row-template that accepts an ISO 639 code, and n 'data' (cell values); special needs like spanning multiple columns and / or rows could be tricky.

Alternately, write a {{lang}} wrapper template that turns-off italic so that all you have to provide is language code and text.

—Trappist the monk (talk) 15:34, 19 May 2019 (UTC)
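
To make the table-row idea concrete, here is a rough Python sketch of what such a helper might emit. Everything in it is an assumption for illustration: the function name lang_row, its parameters, and the choice to put the attributes on each data cell using ordinary wikitext table syntax. It is not an existing template or module, and it ignores row and column spans as well as cell text that contains pipes.

```python
def lang_row(code: str, language: str, header: str, cells: list[str]) -> str:
    """Build one wikitext table row: an English row header, then tagged cells.

    Hypothetical helper; 'code' is the language code and 'language' the
    English name used in the title attribute.
    """
    attrs = f'lang="{code}" title="{language} language text"'
    lines = ["|-", f"! {header}"]
    # Only the data cells get lang/title; the header stays untagged (English).
    lines += [f"| {attrs} | {cell}" for cell in cells]
    return "\n".join(lines)


if __name__ == "__main__":
    print(lang_row("sl", "Slovene", "one", ["ena", "eden"]))
```

The point of the sketch is only that the language code is written once per row while the English header cell is left alone.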

(edit conflict) Adding lang="something non-English" to the whole table is tempting, but I have never found a table in which that would be appropriate because there are always headers or a caption in English, and those shouldn't be tagged as non-English. (Similarly, class="IPA" can't be added to an entire table of phonemes because the English headers and caption aren't IPA.) One tactic for simplifying the wikitext is to use a Lua-based template that receives the list of cells in a specific format and generates a table with the correct language tagging, as in Appendix:English doublets and Appendix:Romance doublets on Wiktionary, which use Module:doublet table. That requires sticking to a particular format in the table cells. — Eru·tuon 15:36, 19 May 2019 (UTC)

A construct such as

<body lang="en"><table lang="es"><tr lang="en"><th>English thing<tr><td>Spanish thing</table></body>

is valid. --Izno (talk) 16:06, 19 May 2019 (UTC)

(edit conflict) A quick and dirty solution could be to enclose the English headers in {{lang|en}}, this is probably a bad idea? Other than that, I think I quite like the tables in these wiktionary appendices: they've got the added benefit of greatly simplifying the table syntax – they do away with all the horrible pipes-and-hyphens and double pipes and exclamation marks and leave a table that should be understandable to anyone who's ever seen a template. – Uanfala (talk) 16:09, 19 May 2019 (UTC)

Please restore links to language names

Colonies Chris, it's bold, revert, discuss, not bold, revert, revert, discuss. Please read that page and undo your edits, then discuss them here.

The links to language names should stay. This is not an article, it is a template documentation page, where duplicate links are often helpful for readers who are focused on one particular section. MOS:DUPLINK and MOS:OVERLINK apply to articles; this is not an article.

Also, replacing redirects with canonical names is not beneficial, as stated very clearly in WP:NOTBROKEN. Your previous claims about reducing server load were always questionable, and they simply do not apply here, since this documentation page is transcluded in only one page. Do not do a blanket replacement of redirects with their targets. – Jonesey95 (talk) 04:17, 23 May 2019 (UTC)

This is not about duplicating links in different sections; it's about not linking them at all. As I said, if you think any specific links are useful, let's have a discussion about them, don't just blanket revert all my changes. And I will point out that none of the reasons given at WP:NOTBROKEN for not bypassing a redirect apply in these cases. If you think differently, please be specific about which ones you think apply here. Colonies Chris (talk) 08:33, 23 May 2019 (UTC)

Please revert your changes while we discuss, per BRD. Thanks. – Jonesey95 (talk) 08:38, 23 May 2019 (UTC)

Some of the changes in this edit were good ~~(like the correct hyphenation in the ISO 639-1 etc. redirects),~~ some are presumably neutral (unlinking language names), others are unhelpful. {{Cl}} should not be replaced with {{Category link}} within running text as it impacts on the readability of the code and departs from the standard shortcut used in these contexts. Changing a link from List of ISO 639 codes to Lists of ISO 639 codes is also not a good idea: it bakes into the text the current setup of the articles, and it's conceivable that at some point in the future these two might lead to different places, so it's best to continue linking via the redirect that's conceptually most appropriate in the context. – Uanfala (talk) 10:20, 23 May 2019 (UTC)

A non-breaking hyphen (U+2011) may (in isolation) be visually indistinguishable from an ordinary hyphen. When placed together, they can be distinguished; cf. ‑-. It is possible that replacement is a good thing. But, if the original editor's intent was to prevent a line-break between 639- and the part number, then perhaps the correct fix would be to replace the literal U+2011 character with the character reference &#8209;:

[[ISO 639‑1]] → ISO 639‑1

But, gasp, that gets to the target page via redirect. If the goal is to avoid redirects at all cost then this:

[[ISO 639-1|ISO 639‑1]] → ISO 639‑1

or this:

[[ISO 639-1|ISO {{nowrap|639-1}}]] → ISO 639-1

—Trappist the monk (talk) 13:37, 23 May 2019 (UTC)

One could equally well argue that spelling out the full name of a template makes reading easier, as it says what it does, rather than being an opaque abbreviation.
If the original editor was intending to avoid a possible line break, they were using a very unclear and unhelpful method. If it's considered important, then using {{nowrap}} makes clear the intention. But I don't think avoiding a possible line break here is worth that sort of trouble.
Since several lists are available, why hide that from the reader? Colonies Chris (talk) 16:24, 23 May 2019 (UTC)

Colonies Chris, you have been asked by several people in several places to stop replacing template redirects in /docs and other locations, including having your AWB access revoked because of it. If you keep it up I suspect the next place you'll end up is ANI getting sanctioned. Just leave well enough alone! Primefac (talk) 20:31, 23 May 2019 (UTC)

I've fought enough battles with bullies in this place that I won't just slink quietly away. If replacing template redirects is so evil, you might consider this change and this one, both made by admins at my request. Perhaps you should take the admins to ANI first? Colonies Chris (talk) 21:14, 23 May 2019 (UTC)

Never said it was evil; I was saying that when multiple people have asked you to do something, you stop doing it and at the very least discuss the issue. I should hope requests for dialogue don't count as bullying. Primefac (talk) 22:07, 23 May 2019 (UTC)

Colonies Chris, please listen to the helpful advice that editors have been giving you over the past few months. I offered you an olive branch above (twice) by suggesting that you read and follow WP:BRD by reverting your own edit. You chose not to do so. The current state of the page is consistent with the "Discuss" phase of that BRD process; the original wikitext has been restored pending the outcome of this discussion. Please do not restore any of your changes until that consensus has been reached. – Jonesey95 (talk) 21:07, 23 May 2019 (UTC)

What to do about languages without an ISO 639-3 code?

I was working on Chaná language and I wanted to wrap the Chaná name for Blas Wilfredo Omar Jaime (which is Agó Acoé Inó, - "dog without owner") in a lang tag. But there is no ISO 639-3 code for this language. I wonder if we could use Glottolog codes (glottocodes) in such situations? It might be better than nothing. babbage (talk) 12:56, 10 June 2019 (UTC)

There is and there isn't: mis. The language codes understood by {{lang}} are used in the underlying html markup so must be known to and understood by browsers, screen readers, and the like. There is a list of codes that these tools are expected to know about. See the IANA language subtag registry.

Using Glottolog or Linguist List codes in {{lang}} would not be understood by browsers and other html readers. I suspect that when there is no language code for a particular string of text, it is better to use mis than to have the browser assume that the text is English:

{{lang|mis|Agó Acoé Inó}} → Agó Acoé Inó

—Trappist the monk (talk) 13:15, 10 June 2019 (UTC)

Might be better for easier future possible expansion and to allow current searching to work, to support a parameter for this language that will behave behind the scenes as mis. --Gonnym (talk) 13:59, 10 June 2019 (UTC)

What? What searching? Wikipedia should not be in the business of making up language codes nor should wikipedia attempt to predict what IANA or the ISO 639 custodians will do with regard to as-yet-uncoded languages. We can, if there is sufficient support, create private IETF tags so we might create a tag mis-x-chana which would work much the same as this:

{{lang|yuf-x-yav|Wi:kaʼi:la}} → Wi:kaʼi:la

yuf → Havasupai–Hualapai

—Trappist the monk (talk) 14:16, 10 June 2019 (UTC)

Please re-read what I wrote, no need to get all worked up. Using {{lang|fr|test}} in an article, adds the article to a hidden category Category:Articles containing French-language text. It is very valid for someone wanting to search for Category:Articles containing Chaná-language text. As I said above, the backend code will still be "mis" and work like it does now, but the custom parameter will allow the article to be categorized. --Gonnym (talk) 14:22, 10 June 2019 (UTC)

You didn't write anything about categories; searching could be anything from a ctrl-F-search to any of the various cirrus-searches to google-searches. The possible solution that I offered would do categorization according to the language name just as my yuf-x-yav example does:

Wi:kaʼi:la[[Category:Articles containing Yavapai-language text]]

—Trappist the monk (talk) 14:37, 10 June 2019 (UTC)

Part italics within foreign text

At Un_bel_dì_vedremo#Lyrics the Italian text is not italicised, using italics=no. Fine, except part of the text, in the original, is italicised. If I mark it as such (with double apostrophes), it produces a red error message, even though logically there is no error. Thus:

[Un bel dì, vedremo
levarsi un fil di fumo sull'estremo
... chiamerà:
« Piccina mogliettina,
olezzo di verbena »
i nomi che mi dava al suo venire.
(a Suzuki)
Tutto questo avverrà,
te lo prometto.
Tienti la tua paura – io con sicura
fede lo aspetto.] Error: {{Lang}}: text has italic markup (help)

I have just spotted that 'em' is being used to italicise stage directions ("a Suzuki"), which I suppose could be used as a kludge. But this is not emphasis, it's italicised text. Is there a way to do it properly? Imaginatorium (talk) 14:18, 14 June 2019 (UTC)

|italic=unset doesn't do what you want?

—Trappist the monk (talk) 14:25, 14 June 2019 (UTC)

Thanks for the lightning response. Yes, I made the mistake of chasing the error message instead of just finding the template instructions. Imaginatorium (talk) 15:11, 14 June 2019 (UTC)

But despite thinking about it quite a bit more, I am unable to see any purpose in the current behaviour of the "italic=no" option. Have I understood correctly: "no" means "positively switch off italics", so in an already-italic context they go off, whereas "unset" means simply "do not apply italics", thus remaining the same as the existing context. These are different, but whereas the "unset" option simply respects any explicit italic setting, the "no" option objects to it. How does that help? Why can't both respect explicit italic settings? Imaginatorium (talk) 18:05, 15 June 2019 (UTC)

|unset= arose from this discussion.

—Trappist the monk (talk) 13:27, 17 June 2019 (UTC)

Marra language

Adding the lang template in the Marra language page, using "mec" as the language code, automatically created the category Category:Articles containing Mara-language text. The name of the language in the automatically generated category name is spelled incorrectly (only one r, not two). Any suggestions on how to fix this? I don't know anything about templates, so I can't fix it myself. I didn't notice it was spelled incorrectly until after I'd created the category by adding the Category articles containing non-English-language text template. DferDaisy (talk) 15:50, 6 July 2019 (UTC)

ISO 639-3 used the 'Mara' spelling until their 2019-01-25 data set. IANA caught up at their 2019-04-30 data set. I have updated both to 2019-04-08 and 2019-04-30 respectively. So now, {{lang}} renders Category:Articles containing Marra-language text instead of Category:Articles containing Mara-language text.

—Trappist the monk (talk) 18:34, 6 July 2019 (UTC)

Lang-bs-Cyrl

Please enable code Cyrl if its not for Bosnian as it has two official scripts. You can find new official app for phone, it's free of charge (link of Pravopis page, for Android and iOS), where its stated that Bosnian has two scripts Latin and Cyrillic. I made template Template:Lang-bs-Cyrl per Template:Lang-sr-Latn equivalent. --Obsuser (talk) 11:55, 24 July 2019 (UTC)

I don't understand what you mean by enable code Cyrl if its not for Bosnian. If you write this:

{{lang|bs-Cyrl|Cyrillic text placeholder}}

then {{lang}} returns this:

Cyrillic text placeholder → Cyrillic text placeholder

which is correct, right?

In your {{lang-bs-Cyrl}} you have:

{{#invoke:lang|lang_xx_italic

which returns this:

{{lang-bs-Cyrl|Ćirilica}} → Serbo-Croatian Cyrillic: Ćirilica → [[Serbian Cyrillic alphabet|Serbo-Croatian Cyrillic]]: Ćirilica

and that is incorrect, right? (should not be italicized)

The {{#invoke}} should be:

{{#invoke:lang|lang_xx_inherit

—Trappist the monk (talk) 12:15, 24 July 2019 (UTC)

get_ietf_parts is providing a wrong response for sr-Latn

I just created the redirect at Немања (to Nemanja) and noticed that {{R from alternative language}} isn't recognising sr-Latn as the |from= parameter. Looking in that redirect template, it invokes is_ietf_tag with each IETF-tag parameter, which returns get_ietf_parts (code) and true.

Module:Language/data/wp_languages has ["sr-Cyrl"] = {"Serbian Cyrillic"}, which is overridden in Module:Lang/data by ["sr-cyrl"] = {"Serbian"}, but I can't see anything that explicitly handles sr-latn, so I would have expected that get_ietf_parts would run through successfully and return code = "sr", script = "latn", region = nil, variant = nil, private = nil, nil, but I'm evidently misreading the Lua (which I'm not very familiar with) because the output is an as yet [[undetermined language]] rather than [[Serbian language|Serbian]], which is what name_from_tag should have provided.

I'm not sure why things aren't Just Work™ing, but does Module:Lang/data perhaps need a line for ["sr-latn"] = {"Serbian"}? I can't see why, but that's the only thing I can think of. Could someone please take a look and see what it is that I'm missing? Thanks! — OwenBlacker (he/him; Talk; please {{ping}} me in replies)

{{R from alternative language}} was broken at this edit. Fixed, I think.

Is {{R from alternative language}} really the correct template? Shouldn't it be {{R from alternative script}}? The language didn't change.

—Trappist the monk (talk) 19:31, 3 August 2019 (UTC)
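
For readers unfamiliar with the tag structure being discussed, the following Python sketch splits an IETF language tag into the parts named above (code, script, region, variant, private). It is an illustration only, not the Lua in Module:Lang: the function name split_ietf_tag and the regular expression are assumptions, and the pattern covers only a simplified subset of the tag grammar (one optional subtag of each kind, in order).

```python
import re
from typing import Optional

# Simplified pattern: language, then optional script, region, variant and
# a single private-use subtag introduced by "-x-". Matching is done on the
# lowercased tag, so "sr-Latn" and "sr-latn" are treated alike.
TAG = re.compile(
    r"^(?P<code>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|\d{3}))?"
    r"(?:-(?P<variant>[a-z\d]{5,8}|\d[a-z\d]{3}))?"
    r"(?:-x-(?P<private>[a-z\d]{1,8}))?$"
)


def split_ietf_tag(tag: str) -> Optional[dict]:
    """Return the tag's parts as a dict, or None if the tag does not match."""
    match = TAG.match(tag.lower())
    return match.groupdict() if match else None


if __name__ == "__main__":
    for tag in ("sr-Latn", "bs-Cyrl", "yuf-x-yav"):
        print(tag, split_ietf_tag(tag))
```

Parts that are absent come back as None, which mirrors the nil values in the expectation quoted above.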

Wikidata language item mappings

At some point, it would be nice if we could process Wikidata statements (with monolingual text properties, qualifiers and/or reference sources) that refer to Wikidata items that represent a language. Perhaps we can have a mapping added that maps WD language items to IETF language tag (P305) (see d:Special:WhatLinksHere/Property:P305) or Wikimedia language code (P424) (see d:Special:WhatLinksHere/Property:P424). Then language source designations (like those provided by Template:In lang (Q66459516)) could be added to articles from Wikidata statements that had monolingual text properties, language qualifiers or language reference sources (e.g., language of work or name (P407) or other properties like d:Special:WhatLinksHere/Q18616084) on them, etc.

As an example, it would be nice to map the following:

German (Q188) -> "de" -> "(in German)"
Japanese (Q5287) -> "ja" -> "(in Japanese)"

We could chase the properties down but it would mean digging through more WD entities not linked to the current article/page (which are considered expensive operations). These items should remain relatively static so caching this mapping is useful.

Thank you, 50.53.21.2 (talk) 19:34, 15 August 2019 (UTC)
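
The mapping requested above amounts to a small static lookup table. A minimal Python sketch, assuming only the two items given as examples; the names WD_LANGUAGES and in_lang are invented for illustration, and nothing here queries Wikidata or reflects how any existing template works:

```python
from typing import Optional

# Static table: Wikidata item -> (language code, English language name).
# Only the two examples from the request are included.
WD_LANGUAGES = {
    "Q188": ("de", "German"),
    "Q5287": ("ja", "Japanese"),
}


def in_lang(qid: str) -> Optional[str]:
    """Return the "(in <Language>)" annotation for a known item, else None."""
    entry = WD_LANGUAGES.get(qid)
    if entry is None:
        return None
    _code, name = entry
    return f"(in {name})"


if __name__ == "__main__":
    print(in_lang("Q188"))
```

Because the table is fixed, a lookup never needs to load another Wikidata entity, which is the expense the request is trying to avoid.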

I'm at a loss to understand how using wikidata would make using {{lang}} or the {{lang-??}} templates easier for editors to use. Sure, we might write:

{{lang|{{#property:P218|from=Q188}}|Guten Morgen}} → Guten Morgen

but why?

—Trappist the monk (talk) 22:08, 15 August 2019 (UTC)

I was thinking more along the lines of Template:Official website (Q5614958) using an official website (P856) statement with a language of work or name (P407) qualifier type of thing. I am not sure where you got ideas about {{lang}} or {{lang-langcode}}. I believe this is the talk page for Module:Lang and friends. 50.53.21.2 (talk) 20:22, 17 August 2019 (UTC)

Module talk:Lang redirects here because the primary purpose of Module:Lang is to support {{lang}} and most of the {{lang-??}} templates. That it supports other templates is a bonus. I cannot read your mind (I did not inherit that gene). I am unable to extract from what you have written a clear understanding of what you think that Module:Lang can do for you. Pretend that I'm a simpleton and draw me a picture from wikisource to final rendering.

—Trappist the monk (talk) 21:49, 17 August 2019 (UTC)

Istro-Romanian

When you use this template with the code "ruo", the automatically generated name is "Istro Romanian" instead of "Istro-Romanian", as it should be. Anyone know how to change this? Super Ψ Dro 19:42, 23 August 2019 (UTC)

Use template in names of works?

I admit, after reading the MOS and a quick search of the archive here, I'm still uncertain about whether to use this template everywhere I see a foreign word, whether wikilinked or not. The examples, of spoken text, are pretty clear use cases. And I believe institutions and proper names are excluded. But what about: Émile Zola wrote criticism in the short-lived journal L'Événement before publishing Thérèse Raquin. Should both titles be wrapped in {{lang|fr...}}? I guess the "font" and "screen reader" rationales would say yes.

It can get awful ~~tedious~~punctilious when it comes to long bibliographies (Zola again). But I guess that is part of the charm of this template. David Brooks (talk) 14:15, 15 October 2019 (UTC)

Templates have charm? Who'd 'a' thunk?

I don't see why you shouldn't wrap non-English titles in {{lang}}. What you should not do is wrap non-English titles in {{lang}} when those titles are parameter values in cs1|2 citation templates ({{cite book}}, {{cite magazine}}, etc.) – spoils the citation's metadata.

—Trappist the monk (talk) 14:56, 15 October 2019 (UTC)

Terms in multiple languages

Can this template be modified so that one can write a term that is in multiple languages without choosing a specific language over the others? An example of what I'm referring to here is the word smuga (both in Old Norse and Modern Swedish) at Special:PermanentLink/925030207#History. Geolodus (talk) 14:19, 7 November 2019 (UTC)

No. I have never seen anything anywhere that suggests that the lang html attribute can take multiple values. The primary purpose of this template is to provide proper html markup for non-English text in an English-language article page. Browsers and screen readers need this information to correctly render or speak the non-English text. If you were to give the template more than one language code, browsers and screen readers would not know which of the multiple languages really applies to the non-English text.

—Trappist the monk (talk) 15:22, 7 November 2019 (UTC)

What should be done with the above-mentioned word then? And I am well aware of the purpose of this template; if I wasn't, I wouldn't even be here. Geolodus (talk) 17:06, 7 November 2019 (UTC)

Since the word appears only once in that article, it can be marked only once with only one language code. Perhaps there is a way to rewrite that section to use the term twice and so mark one instance of the term with non and the other instance with sv using two {{lang}} templates? One template cannot serve two languages.

—Trappist the monk (talk) 18:04, 7 November 2019 (UTC)

Potential future problem with automatic addition of asterisks to Proto-Norse

Currently {{lang}} adds an asterisk at the beginning of words in languages that begin with "Proto-", on the assumption that this means that they are reconstructed, as has been discussed previously. Now Proto-Norse is oddly named. It's actually attested. Currently Wiktionary has 38 attested words and 3 reconstructed words. So if words were tagged as Proto-Norse in {{lang}}, an asterisk would be added to attested words. However, this problem hasn't come up yet because there's no ISO 639 code for Proto-Norse and no Wikipedia-specific code in Module:Lang/data. In the article on Elfdalian, some Proto-Norse words are tagged as Old Norse instead.

At least it seems like a good rule of thumb that reconstructed languages begin with "Proto-", based on Wiktionary's language data: all but one of the completely-reconstructed languages on Wiktionary begin with "Proto-" (Illyrian), and Proto-Norse seems to be the only genuine non-completely-reconstructed language whose name begins with "Proto-". (Proto-Brythonic and Tatic look like mistakes.) But some not-completely-reconstructed languages have Reconstruction entries, though; "Latin", for instance, is the header under which Wiktionary puts unattested Vulgar Latin or Proto-Romance words (list of reconstructed "Latin" words).

Like Uanfala in the previous discussion, I doubt the wisdom of automatically adding the asterisk. In some probably rare cases it's not supposed to be added, and it has to be manually added in some other cases, because some languages without the "Proto-" prefix have reconstructed words. However, the behavior will serve well in many cases and changing the behavior would involve a lot of cleanup work. — Eru·tuon 03:43, 8 December 2019 (UTC)

I suppose that we might invent a parameter:

|proto=yes – adds a proto-prefix to text for languages with names that do not begin with 'Proto-': {{lang|xil|<Illyrian text>|proto=yes}} → *<Illyrian text>

|proto=no – inhibits automatic proto-prefix for text from languages with names that begin with 'Proto-': {{lang|cel-x-proto|<Proto-Celtic text>|proto=no}} → <Proto-Celtic text>

|proto= – (empty or missing) ignored; no default value

|proto=<anything else> – emits an error message

If we do this, I suspect that the cleanup work will be minimal.

—Trappist the monk (talk) 15:02, 8 December 2019 (UTC)

If we already know that the Illyrian is proto, then can't we add a "unique_languages" array to the code so it can check if it's on the list and automatically add |proto=yes without editors needing to handle this? --Gonnym (talk) 15:12, 8 December 2019 (UTC)

I absolutely hate special-case code which is what we would have for Illyrian. Were there more than just that lonely one, or a strong probability for more, then sure, we should do as you suggest. Regardless, there still appears to be a need for |proto=no so until there is a sufficiency of proto-language names that do not begin with 'Proto-', |proto=yes and |proto=no are adequate to the task without any special-case code, are they not?

I have hacked the sandbox to create |proto= in {{lang}}. If we decide to keep it, then I'll add the same for the {{lang-??}} templates:

{{lang/sandbox |es|es text}} → es text – nothing added; nothing taken away
{{lang/sandbox |xil|xil text|proto=yes}} → *xil text – name does not begin with 'Proto-' so add splat prefix
{{lang/sandbox |cel-x-proto|cel text|proto=no}} → cel text – remove automatic splat prefix
{{lang/sandbox |cel-x-proto|cel text|proto=bob}} → [cel text] Error: {{Lang}}: invalid |proto=: (help) – error, poor 'bob' is not a valid parameter value
{{lang/sandbox |cel-x-proto|*****multi-splat cel text|proto=no}} → multi-splat cel text – remove all splats prefixing text; do not add
{{lang/sandbox |cel-x-proto|*****multi-splat cel text|proto=yes}} → *multi-splat cel text – remove all but one prefixing splats in text
{{lang/sandbox |cel-x-proto|*****multi-splat cel text|proto=}} → *multi-splat cel text – remove all but one prefixing splats in text
{{lang/sandbox |xil |*****multi-splat xil text}} → *multi-splat xil text – name does not begin with 'Proto-'; not instructed to add so delete all but one prefixing splats in text
{{lang/sandbox |xil |*****multi-splat xil text|proto=yes}} → *multi-splat xil text – name does not begin with 'Proto-'; instructed to add; delete all but one prefixing splats in text
{{lang/sandbox |xil |*****multi-splat xil text|proto=no}} → multi-splat xil text – name does not begin with 'Proto-'; instructed to inhibit; deletes all prefixing splats in text

I changed the code to strip multiple splats back to one or none as appropriate; the multi-splat test works when there is only one splat in text. I added this because of a typo in my own testing that put two splats in text and rendered both of them.

—Trappist the monk (talk) 18:46, 8 December 2019 (UTC)
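
The behaviour shown in the sandbox examples above can be summarised in a few lines. This Python sketch is a reading of those examples, not the sandbox Lua: the function name apply_proto and its signature are assumptions, and the error is raised as an exception rather than rendered as a message.

```python
from typing import Optional


def apply_proto(text: str, language_name: str, proto: Optional[str] = None) -> str:
    """Return text with zero or one leading splat, following the examples above."""
    if proto not in (None, "", "yes", "no"):
        raise ValueError(f"invalid |proto=: {proto}")

    stripped = text.lstrip("*")      # drop every prefixing splat
    had_splat = stripped != text     # the editor typed at least one splat

    if proto == "no":
        return stripped              # inhibit: never any splat
    if proto == "yes" or language_name.startswith("Proto-") or had_splat:
        return "*" + stripped        # exactly one splat
    return stripped                  # nothing added; nothing taken away


if __name__ == "__main__":
    print(apply_proto("*****multi-splat cel text", "Proto-Celtic", "yes"))
```

Note that, as in the xil example without |proto=, splats typed by the editor are collapsed to one rather than removed.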

I think |proto=yes and |proto=no are enough to solve the problem that I raised. Thanks! (Though from the editor's perspective, adding or removing an asterisk is easier because it requires fewer characters. But again that would require someone to take on a cleanup job.) As for the multi-splat shortening, I'm not sure whether it's a good idea or not. I've occasionally seen a double-asterisk prefix that meant "ungrammatical" (see Asterisk § Ungrammaticality and Asterisk § Ambiguity), but haven't used it myself and am not sure whether it would be put inside {{lang}}. — Eru·tuon 19:40, 10 December 2019 (UTC)

Added for the {{lang-??}} templates; there being no {{lang-xil}}, I'm using {{lang-es}} in these examples:

{{lang-es/sandbox|es text}} → Spanish: es text – nothing added; nothing taken away
{{lang-es/sandbox|es text|proto=yes}} → Spanish: *es text – not a proto language; force with |proto=yes
{{lang-cel-x-proto/sandbox|cel text|proto=no}} → Proto-Celtic: cel text – remove automatic splat prefix
{{lang-cel-x-proto/sandbox|cel text|proto=bob}} → [cel text] Error: {{Lang-xx}}: invalid |proto=: (help) – error, poor 'bob' is not a valid parameter value
{{lang-cel-x-proto/sandbox|*****multi-splat cel text|proto=no}} → Proto-Celtic: multi-splat cel text – remove all splats prefixing text; do not add
{{lang-cel-x-proto/sandbox|*****multi-splat cel text|proto=yes}} → Proto-Celtic: *multi-splat cel text – remove all but one prefixing splats in text
{{lang-cel-x-proto/sandbox|*****multi-splat cel text|proto=}} → Proto-Celtic: *multi-splat cel text – remove all but one prefixing splats in text

—Trappist the monk (talk) 00:11, 11 December 2019 (UTC)

I like this proposal, but suspect that |reconstructed=yes and |reconstructed=no would be less confusing as attested (non-proto) languages with unattested terms exist. wikt:Reconstruction:Middle English/halibutt is an example of this. Glades12 (talk) 12:04, 16 December 2019 (UTC)

Warning message appears when a 3-letter code is used in a section heading.

This version of an article, (which I have just edited in response to a question at WP:Help desk#contents box on Malagasy Hippopotamus article different then section name) contained a problem where a warning message from this template appeared in the entry in the contents list. The template call had the ISO639-1 3-letter code for Malagasy, mlg (contrary to the recommendation for the template). The template code evidently corrected this, but left the warning message "code: mlg promoted to code: mg" in the result in a way that it didn't appear in the Heading itself, but did appear in the Contents entry. --ColinFine (talk) 22:23, 19 January 2020 (UTC)

ISO 639-1 does not have three-character language codes. Those are reserved to ISO 639-2 and -3.

The message is there for a reason. The message is telling editors that they have used a language code that should not be used because browsers are not expected to understand the three-character equivalent of the two-character language code. Mayhaps that they do, but the standard list of language codes, region, script, and variant tags is the language-subtag-registry file. That file does not list mlg or any of the other ISO 639-2 / -3 codes that have ISO 639-1 equivalents. The template cannot know if the editor intended to name some other language but instead committed a typo, perhaps nlg or lgm, both legitimate codes that aren't promoted to ISO 639-1:

{{lang|nlg|nlg}} → nlg

{{lang|lgm|lgm}} → lgm

However, even though this warning message exists, it is not considered critical so doesn't warrant a red error message. If you wish to see these warnings, see Category:Lang and lang-xx code promoted to ISO 639-1 to un-hide them. Using the css described there, I see the warning message in the header and in eight or nine other instances of the template.
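
The promotion step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the module's Lua; the name promote and the one-entry table are assumptions made for the example, and only the mlg to mg pair comes from this discussion.

```python
# Illustrative subset only: three-character code -> ISO 639-1 equivalent.
PROMOTIONS = {"mlg": "mg"}


def promote(code):
    """Hypothetical helper: return (code_to_use, warning_or_None)."""
    code = code.lower()
    short = PROMOTIONS.get(code)
    if short is None:
        # e.g. nlg or lgm: legitimate codes with no ISO 639-1 equivalent
        return code, None
    return short, f"code: {code} promoted to code: {short}"
```

A code such as nlg passes through unchanged with no warning, which is why the template cannot tell a typo for another valid code from an intended mlg.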

In the html of the 'broken' page, I find this in the TOC markup:

<li class="toclevel-1 tocsection-7"><a href="#Oral_legends_and_the_kilopilopitsofycode:_mlg_promoted_to_code:_mg"><span class="tocnumber">4</span> <span class="toctext">Oral legends and the <i>kilopilopitsofy</i><span>code: mlg promoted to code: mg </span></span></a></li>

and this in the header:

<h2><span class="mw-headline" id="Oral_legends_and_the_kilopilopitsofycode:_mlg_promoted_to_code:_mg">Oral legends and the <i lang="mg" title="Malagasy language text">kilopilopitsofy</i><span class="lang-comment" style="font-style:normal; display:none; color:#33aa33; margin-left:0.3em">code: mlg promoted to code: mg </span></span></h2>

The warning message is clearly present in both, although MediaWiki has stripped the attributes from the ... tag that wraps the warning message.

In the fixed version, this in the TOC:

<li class="toclevel-1 tocsection-7"><a href="#Oral_legends_and_the_kilopilopitsofy"><span class="tocnumber">4</span> <span class="toctext">Oral legends and the <i>kilopilopitsofy</i></span></a></li>

and the header:

<h2><span class="mw-headline" id="Oral_legends_and_the_kilopilopitsofy">Oral legends and the <i lang="mg" title="Malagasy language text">kilopilopitsofy</i></span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Malagasy_hippopotamus&amp;action=edit&amp;section=7" title="Edit section: Oral legends and the kilopilopitsofy">edit</a><span class="mw-editsection-bracket">]</span></span></h2>

For me, the warning message appears in both the TOC and the heading in the broken version. In the 'fixed' version, there is no html for the warning message in the TOC or the header (because it's fixed there shouldn't be a message) so no warning message in either place.

Are you still seeing this 'error'? After a purge of your cache?

—Trappist the monk (talk) 23:38, 19 January 2020 (UTC)

OK, Trappist the monk: the message should be there. But for me (and evidently for the IPv6 user who reported it) it appears in the contents list but not in the header. This seems a very bad idea. I can't see what relevance my cache has: the first time I ever went to that article, I saw the problem; but now that I've fixed the language code, it's gone away.

As for the IANA file you pointed at: I was looking at that earlier. It's nonsense. The only en- subtags in it are en-CA, en-GB-oed, en-GB-oxendict, en-boont, and en-scouse. Whatever it is supposed to be, it isn't. --ColinFine (talk) 00:30, 20 January 2020 (UTC)

Editors will do as editors will do. There is nothing that I can do to prevent editors from placing templates in headings. Nothing. When they do that, there is nothing that I can do to prevent MediaWiki from stripping the style attribute from the ... tag. Nothing. There is nothing that I can do to prevent editors from ignoring the template documentation; there are, however, better writers than I who can certainly improve upon the documentation that I have hacked; if you are one of those who can, please do.

Not nonsense. You just have to understand how to read it. You found:

en-CA – in the registry, en-CA is the only allowed prefix for the variant newfound so, when the whole tag is assembled: en-CA-newfound → Newfoundland English
en-GB-oed which is the old but still supported version of the preferred en-GB-oxendict (though en-oxendict is allowed because the variant oxendict requires only en as a prefix) → English, Oxford English Dictionary spelling
en-boont an old, now redundant tag because there is a defined variant tag boont that works only with the prefix en → Boontling (Jargon embedded in American English)
en-scouse an old, now redundant tag because there is a defined variant tag scouse that works only with the prefix en → Scouse (English Liverpudlian dialect known as 'Scouse')

The registry file does not list all possible combinations of language code, region code, script code, variant code because such a list would be much larger than the already large list (nearly 8000 language codes alone). So, yeah, it's big, it may be initially confusing, but it is certainly not nonsense. And, perhaps I get to escape the firing squad here, I am not the editor who included mention of the registry file in the template documentation which happened with this edit.
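
To illustrate why the registry can stay compact, here is a rough Python sketch of checking a variant against its required prefix instead of enumerating every combination. The table holds only the four variants mentioned above; the function name variant_allowed is hypothetical and the check is a simplification of the real rules.

```python
# Prefix requirements for the four variants discussed above (not the full registry).
VARIANT_PREFIXES = {
    "newfound": ["en-CA"],
    "oxendict": ["en"],
    "boont": ["en"],
    "scouse": ["en"],
}


def variant_allowed(tag):
    """Hypothetical check: is the last subtag a known variant with a matching prefix?"""
    prefix, _, variant = tag.rpartition("-")
    allowed = VARIANT_PREFIXES.get(variant.lower())
    if allowed is None:
        return False  # not one of the variants in this small table
    prefix = prefix.lower()
    # the part before the variant must equal the required prefix or extend it
    return any(
        prefix == p.lower() or prefix.startswith(p.lower() + "-")
        for p in allowed
    )
```

With this table, en-CA-newfound and en-GB-oxendict pass, while en-newfound fails because newfound requires the en-CA prefix.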

—Trappist the monk (talk) 01:33, 20 January 2020 (UTC)

(edit conflict) @Trappist the monk: The error message in the header is displaying for you because you've got .lang-comment {display: inline !important;} in your common.css. For most people, it won't display because of the inline CSS display: none;. — Eru·tuon 00:31, 20 January 2020 (UTC)

Yep, I know that, that is why I wrote: If you wish to see these warnings, see Category:Lang and lang-xx code promoted to ISO 639-1 to un-hide them. (and, yeah, I can see that text)

—Trappist the monk (talk) 01:33, 20 January 2020 (UTC)

Error suppression

Is there any desire to allow for error-suppression? I'm just adding this template to Template:Infobox musical composition and I can foresee there being existing template calls with invalid language codes. Making it possible to choose for the error message to display only to logged-in users or on edit-preview could be useful, maybe? — OwenBlacker (he/him; Talk; please {{ping}} me in replies) 23:16, 28 February 2020 (UTC)

I generally oppose error suppression; if there is an error message then something is wrong and that something should be fixed not hidden away.

There was some discussion that died aborning that I was imagining could be standardized across all of the infoboxen that support |native_name_lang=.

—Trappist the monk (talk) 23:34, 28 February 2020 (UTC)

Languages without codes

Sometimes the languages used in Wikipedia articles don't have official language codes. For example, Talang Tuo inscription quotes Old Malay, which doesn't have its own code. Should we use the code "und" for these languages, or is there some better way to handle these situations? -- Beland (talk) 22:09, 4 November 2019 (UTC)

Perhaps mis, Uncoded languages, is a better choice than und, Undetermined, because the language isn't 'undetermined', it just isn't coded. If there is sufficient need (I don't know what those criteria might be) we can invent a private IETF language tag for use within {{lang}} and the other various templates that use Module:Lang; perhaps ms-x-old ...

—Trappist the monk (talk) 23:02, 4 November 2019 (UTC)

Excellent; thanks! -- Beland (talk) 17:19, 14 November 2019 (UTC)

I agree with using mis. Glades12 (talk) 09:37, 29 February 2020 (UTC)

Question regarding ISO templates

Does the template/module use the templates listed at Category:ISO 639 name from code templates as the category header says? --Gonnym (talk) 13:44, 1 March 2020 (UTC)

In general, no. There are {{lang-??}} templates that do not use Module:lang which may (or may not) use ISO 639 templates.

—Trappist the monk (talk) 13:53, 1 March 2020 (UTC)

Tuscan

@Trappist the monk: The code ita-tus for Tuscan language is not recognized. Throws an error. This appears to be the only code we have available for pre-modern Italian of the general regional variety that developed into Modern Italian, so it would be good for this one to be supported (though it is a Language List code, not an ISO one). It's not correct to code this material as either modern Italian or as Vulgar Latin, and we're left without any other code in that range to use, unless I'm missing something. (That said, depending on the exact material in question, some other language code for an Italic language or dialect will be more appropriate in some cases; Tuscan isn't a catch all for everything between VL and MI, by any means). So, this has about the same kind of use case as codes for Middle English and Middle French, just narrowed a bit due to the fragmentation of Italian languages in the era. — SMcCandlish ☏ ¢ 😼 14:03, 21 April 2020 (UTC)

PS: Having lat-vul for Vulgar Latin would also be helpful, because after about the beginning of the 8th c., it would be incorrect to code this as la. I think use of the Language List codes would be much more sensible than instituting our own novel codes, as suggested in a thread up at the top of this page. (Or another way of putting it: WP's novel codes should be identical to Language List, Glottolog, Linguasphere, and other semi-standardized codes, or not be created, since they would be meaningless to anyone but our own editors). — SMcCandlish ☏ ¢ 😼 14:07, 21 April 2020 (UTC)

I understand your suggestion that we use semi-standardized codes. But, the tags ita-tus and lat-vul are not valid as IETF language tags because tus and vul are not recognized as valid extended language (extlang) subtags. All currently registered extlangs listed in the IANA language subtag registry have primary (ISO 639) language codes so extlang use is not supported by Module:Lang. Even if Module:lang did support extlangs, ita-tus and lat-vul would still be invalid because tus and vul are not valid extlang tags.

So, the only way forward is private-use subtags which is a standardized IETF format and is supported by Module:lang. We might create something like: it-x-tuscan and la-x-vulgar for our own use here. That gives us valid html and links to the appropriate language article from the rendered non-English text.

—Trappist the monk (talk) 14:43, 21 April 2020 (UTC)
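
As a rough illustration of the private-use form suggested above, the following Python sketch recognises tags shaped like it-x-tuscan. It is a simplification under stated assumptions: it accepts only a two- or three-letter language subtag followed by x and one or more subtags of one to eight alphanumeric characters, ignores script, region, and variant subtags, and is not how Module:Lang itself parses tags. The name split_private_use is hypothetical.

```python
import re

# language subtag, then "x", then one or more 1-8 character alphanumeric subtags
PRIVATE_USE_TAG = re.compile(
    r"^([a-z]{2,3})-x((?:-[a-z0-9]{1,8})+)$", re.IGNORECASE
)


def split_private_use(tag):
    """Hypothetical helper: return (language, private_part) or None."""
    m = PRIVATE_USE_TAG.match(tag)
    if m is None:
        return None  # not in the simple language-x-private form
    # group 2 starts with the hyphen that follows "x"; drop it
    return m.group(1).lower(), m.group(2)[1:].lower()
```

Here it-x-tuscan and la-x-vulgar both split cleanly, while ita-tus returns None simply because it is not in private-use form.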

Nihongo Foot

In Nihongo Foot, can someone replace the commas before |extra= and |extra2= with semicolons? Esszet (talk) 15:44, 13 May 2020 (UTC)

Can you demonstrate that somewhere there is a consensus to make that change? The comma before |extra= has been there since the first version of the template:

{{
  #if: {{{extra|{{{4|}}} }}} | {{
    #if: {{
      #if: {{{3|}}}| {{{1|}}}
    }}{{{2|}}} |,
  }} {{{extra|{{{4}}} }}}
}}

And why is the discussion here instead of at Template talk:Nihongo foot?

—Trappist the monk (talk) 16:09, 13 May 2020 (UTC)

I went to Module:Lang/utilities and from there to its talk page. Either way, it's just formatting: you use semicolons because the notes are separate ideas. I see the basic {{Nihongo}} is also used for people's names (Satoru Iwata, Shinzo Abe, etc.), but, by the same token, for basically all other languages, semicolons are used (see Xi Jinping, Moon Jae-in, Vladimir Putin, Muhammad bin Nayef, etc.). Even in English, you have things like "maiden name; date of birth (and death, if applicable)"; see Michelle Obama and Jacqueline Kennedy Onassis. Esszet (talk) 17:29, 15 May 2020 (UTC)

I think that you are answering a question that I did not ask. Perhaps I did not ask it correctly. History:

{{nihongo}} used to call {{nihongo/rom}} which was created with the comma 9 October 2005 (since deleted); first version not calling {{nihongo/rom}} created 21 March 2006 with the comma

{{nihongo3}} created with the comma 7 February 2008

{{nihongo foot}} created with the comma 31 May 2010

So, for 10–15 years, the {{nihongo}} templates have used a comma; if nothing else, there appears to be a long-term consensus that supports the use of the comma. Perhaps that consensus exists because no one but you has noticed that the template uses a comma; I don't know. This is why I asked you to show me where editors using these templates have established a new consensus that replaces the comma with the semicolon.

I do not see any such discussion at Template talk:Nihongo, Template talk:Nihongo3, Template talk:Nihongo foot, or at WT:JAPAN. Really the place to be having this discussion is at one of those places.

—Trappist the monk (talk) 19:34, 15 May 2020 (UTC)

The reason it's there is probably that someone made a simple mistake and it was just never fixed. In the case of names, the consensus is semicolons, it's supported by the MOS. It's the same thing with notes; it's a separate idea. That's really all there is to it. Esszet (talk) 01:38, 16 May 2020 (UTC)

Maybe the comma is a simple mistake, I don't know. Maybe the comma is intentional, I don't know. Maybe the community never noticed the comma, I don't know. Maybe the community did notice and is comfortable with the comma, I don't know. What I do know is that for 10–15 years these templates have, without objection, rendered with a comma; that is the consensus for these templates. Establish a new consensus to use a semicolon and I am delighted to make the necessary change to the templates.

—Trappist the monk (talk) 13:22, 16 May 2020 (UTC)

Nope, sorry, the MOS overrides that. Esszet (talk) 21:00, 16 May 2020 (UTC)

Edit request on 26 May 2020 T 11:44 (UTC-8)

This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.

I'm really bad with navigating through .lua code, but can someone change the single quotation marks around text in the |lit= parameter to double quotation marks per MOS:QUOTATIONMARKS? —Tenryuu 🐲 ( 💬 • 📝 ) 18:46, 26 May 2020 (UTC)

I presume that you mean the quotes around 'In the West Nothing New' in this rendering:

{{lang-de|Im Westen nichts Neues|lit=In the West Nothing New}} → German: Im Westen nichts Neues, lit. 'In the West Nothing New'

Which part of MOS:QUOTATIONMARKS do you think applies here?

—Trappist the monk (talk) 19:20, 26 May 2020 (UTC)

Yes, that's what I meant. More specifically, MOS:DOUBLE (or MOS:SINGLE), where

Most quotations take double quotation marks (Bob said: "Jim ate the apple."). Exceptions:
Plant cultivars take single quotation marks (Malus domestica 'Golden Delicious'; see Wikipedia:Naming conventions (flora)).

Simple glosses that translate or define unfamiliar terms usually take single quotes (Cossack comes from Turkic qazaq 'freebooter').

—Tenryuu 🐲 ( 💬 • 📝 ) 19:27, 26 May 2020 (UTC)

Doesn't the 'In the West Nothing New' qualify as a simple gloss so should take single quotes per the second bullet point?

—Trappist the monk (talk) 19:43, 26 May 2020 (UTC)

Fair point. I'm used to seeing double quotes for that, but if that's the convention we're following, I have no further objections. —Tenryuu 🐲 ( 💬 • 📝 ) 19:51, 26 May 2020 (UTC)

Meitei script

This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.

The Meitei script doesn't display properly with the {{lang|mni|ꯏꯟꯗꯤꯌꯥ}} template (Meitei: ꯏꯟꯗꯤꯌꯥ). The script is available in Microsoft's Nirmala UI font family and in the font "Noto Sans Meetei Mayek". I did a workaround by adding style="font-family:'Noto Sans Meetei Mayek','Nirmala UI Semilight','Nirmala UI','Noto Sans';" (Meitei: ꯏꯟꯗꯤꯌꯥ), but can someone please update the template? Irtapil (talk) 09:26, 4 June 2020 (UTC)

Not done. It is not clear to me how this request is to be fulfilled for both {{lang}} and {{lang-mni}}. In the meantime, perhaps a better workaround is:

[[Meitei language|Meitei]]: <span lang="mni" style="font-family:'Noto Sans Meetei Mayek','Nirmala UI Semilight','Nirmala UI','Noto Sans';">{{lang|mni|ꯏꯟꯗꯤꯌꯥ}}</span>

Meitei: ꯏꯟꯗꯤꯌꯥ

This workaround renders the language label using the page's default font.

—Trappist the monk (talk) 11:59, 4 June 2020 (UTC)