Talk:Lyndon word

Latest comment: 6 years ago by InternetArchiveBot in topic External links modified

New article created edit

This was requested! however all it really entails is a definition...perhaps a redirect? Wikipedia appears to be lacking in general on the broader subject however. Robbjedi 04:45, 18 October 2005 (UTC)Reply

Necklace edit

I am not merging this into necklace (combinatorics), as Lyndon word is a standalone concept, deserving its own article. Hope nobody minds. Kompas 09:49, 3 August 2006 (UTC)Reply

"A theorem of Radford states...." edit

needs a reference 173.206.238.69 (talk) 00:26, 21 May 2012 (UTC)Reply

Done. -- Darij (talk) 09:36, 24 December 2015 (UTC)Reply

Why periodic strings are omitted? edit

What's wrong to include the strings "00", "11" etc. in the sequence? --RokerHRO (talk) 15:17, 7 July 2014 (UTC)Reply

They are equal to their rotations, and therefore are not the unique smallest string among their rotations. Also because then the standard factorization wouldn't work. —David Eppstein (talk) 16:02, 7 July 2014 (UTC)Reply

Algorithm dubious? edit

The algorithm given for computing the standard factorisation looks to be quadratic rather than linear as stated. If the input is 'bbbb...bbba' (n 'b's, 1 'a') then the algorithm takes O(n) time to find the first Lyndon word in the factorisation is 'b', and then it proceeds to factorise 'bbbb...bbba' (with n-1 'b's).

Also, I think the algorithm's finishing condition is faulty, unless it assumes the string ends with a terminating character that's smaller than every other character. As written, 'aa' will factorise as ['aa'] rather than ['a', 'a']. 125.30.89.127 (talk) 08:59, 3 November 2014 (UTC)Reply

Plus one to faulty finishing condition - from what I gathered one should iterate until the string is empty, and treat m reaching the end of the string as also going to step 3.3.

Also (but that could just be me) I could only understand the algorithm as described after I understood the algorithm already, it isn't clear that it starts with the assumption that a single letter initial Lyndon word is the best it can do and tries proving (3.3) or disproving said assumption by looking at further letters, each time learning a new length assumption for the initial Lyndon word when the previous assumption is proven false (3.2), where equality contributes "nothing".

I also agree with the quadratic runtime assessment, which is most likely due to case 3.3 discarding all accumulated knowledge, though I wouldn't yet know how to do better. 193.227.108.10 (talk) 13:11, 12 November 2014 (UTC)Reply

Maybe something closer to pseudocode would be clearer and less bug-prone than this textual description? I have an implementation in http://www.ics.uci.edu/~eppstein/PADS/Lyndon.py (see in particular the ChenFoxLyndonBreakpoints function) that is probably too specific to Python to use directly but might make an appropriate starting point. —David Eppstein (talk) 16:10, 12 November 2014 (UTC)Reply
The implementation looks fine (for obvious reasons quite similar to what I came up with, see below, it returns the words directly instead of breakpoints), though I believe it deserves some (textual) explanation as to why one is doing that: In essence it starts by assuming a single symbol Lyndon word is the longest possible in the beginning and tries to prove or disprove that by looking further, and each time it disproves the hypothesis it did in fact find a longer Lyndon word, which becomes the new assumption. If I've some more time I'll try to make this a tad more readable...
Making the runtime linear instead of quadratic for a full factorization then results from additional observations in the case the hypothesis is proven (namely that there may be repetitions of what we just reported), this might make it easier to grasp, also a more space-efficient version (avoiding appending the string to itself, and reporting the minimum rotation needed to make it minimal) for finding the minimum cyclic permutation does become apparent then.
 def lyndon_factorize(S):
     start, left, right = 0, 0, 1
     while start < len(S):
         if right == len(S) or S[left] > S[right]:
             length = right - left
             while start + length <= right:
                 yield S[start:start+length]
                 start += length
             left, right = start, start + 1
         elif S[left] == S[right]:
             left, right = left + 1, right + 1
         else:
             left, right = start, right + 1

193.227.108.10 (talk) 13:15, 13 November 2014 (UTC)Reply

The numbers of binary Lyndon words of each length edit

(For starters, English is not my primary language..) My first understanding of this was:

Every binary Lyndon word of each length has its own number.

Probably just my fault, but wouldnt the word amounts be a tiny bit more exact? Although it doesnt sound that good. Or could someone think of another way to say it?

Anyway, if anyone else had trouble understanding, the sequence 1, 2, 1, ... means:

  • 1 word of length 0 - ε
  • 2 words of length 1 - 0,1
  • 1 word of length 2 - 01
  • ... — Preceding unsigned comment added by 147.32.31.193 (talk) 18:18, 5 November 2014 (UTC)Reply

Actually, "amounts" would be a word you'd use for a continuous quantity (a real number like an amount of water) while number makes more sense for discrete integer values (a number of glasses of water). —David Eppstein (talk) 18:32, 5 November 2014 (UTC)Reply

Nonemptiness edit

I have tweaked the definition to regard the empty word as non-Lyndon. There are multiple reasons for this:

  • Several references used on this page (e.g., Lothaire, Kufleitner, Brlek-Lachaud-Provencal-Reutenauer) define a Lyndon word to be nonempty. Some others (e.g., Berstel-Pocchiola, Ruskey and Melancon) make claims incompatible with the empty word being Lyndon (e.g., Melancon about the basis of the free Lie algebra, and Berstel-Pocchiola when they list all Lyndon words of length up to 5).
  • The claim that "every string may be formed in a unique way by concatenating a sequence of Lyndon words, in such a way that the words in the sequence are nonincreasing lexicographically" would be false if the empty word were Lyndon, since the possibility of adding or removing the empty word at the end of the sequence would break the uniqueness.
  • Duval's algorithm does not work when started with the empty word.
  • In Radford's theorem, the empty word is out of place too. -- Darij (talk) 09:36, 24 December 2015 (UTC)Reply

External links modified edit

Hello fellow Wikipedians,

I have just modified one external link on Lyndon word. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 18:36, 9 January 2018 (UTC)Reply