Talk:Scapegoat tree
This article is rated C-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||||||||||||
|
tree
editThe claim at the head of this article that "Each node in a scapegoat tree keeps track of the size and height of the subtree rooted at that node, as well as whether or not that node has been deleted," is simply wrong. As the first sentence of the second paragraph of the Abstract in the original Galperin and Rivest paper states: "Scapegoat trees, unlike most balanced-tree schemes, do not require keeping extra data (e.g. "colors" or "weights") in the tree nodes."
Agreed, I noticed that to. So I decided to be bold and lose my first article rewrite to scapegoat trees =) feedback appreciated... and any help to bring it up to par appreciated. Themania 18:59, 16 February 2007 (UTC)
The description of the deletion operation assumes an alpha of .5. Really, the rule should be written "NodeCount <= MaxNodeCount * alpa". Cebarber (talk) 17:03, 16 August 2009 (UTC)
balancing
edit"Once the scapegoat is found, a standard binary search tree rebalance operation is performed." Either the BST article should be edited to explain what the 'standard' operation is, or the article should expound more. It's ambiguous because it's unclear if you'd want to perform only 1 rotation (as in trees that stay strictly balanced) or multiple rotations because the tree only stays 'loosely' balanced. Actually, then red-black trees are compared, and it seems to say that red-black trees use rotations *as opposed* to whatever scapegoat trees do. Looking at the original paper it doesn't seem to explain either. —Preceding unsigned comment added by 192.160.124.68 (talk) 16:23, 12 May 2010 (UTC)
Hey, looks like someone updated it. It's a little better now but still doesn't actually explain how the rebalance operation works. It explains how you pick which node to rebalance, but not what you actually do to it. "A standard rebalancing operation is performed." Looking at the paper again it looks like he basically suggests rebuilding the whole subtree (iterate it in order, copying into an array, then divide and conquer the array to make a new tree). But the paper goes into more detail about an in-place (no extra memory use) way of doing it that seems specific to goat trees; the other tree type articles usually have pseudo code for all the steps and/or an explanation for the operations that are specific to the particular tree type in question, so I don't think that'd be out of place here. —Preceding unsigned comment added by 192.160.124.68 (talk) 14:59, 13 May 2010 (UTC)
bb-tree
editHi, it looks that idea of scapegoat tree is based on "BST of bounded balance", they even cite J.Nievergelt in their work. BB-trees are very easy to implement, and have many usefull properties, especially for functional languages. Is there any article on this in wikipedia? Nore info: http://groups.csail.mit.edu/mac/users/adams/BB/index.html , Article on polish wiki: http://pl.wikipedia.org/wiki/Drzewo_o_ograniczonym_zrównoważeniu --149.156.82.207 (talk) 19:42, 2 August 2010 (UTC)
Yes, it seems to have been linked to Weight-balanced tree LudwikSzymonJaniuk (talk) 09:12, 5 April 2019 (UTC)
Etymology
editWhy are they called scapegoat trees?? That's the first question that came to mind when I saw the link, but this article doesn't tell me. Someone who knows should add that fact somewhere in the intro, please. Nate Wessel (talk) 02:06, 6 March 2015 (UTC)
I added a section on the etymology, one of the sources describes it. LudwikSzymonJaniuk (talk) 09:38, 5 April 2019 (UTC)
amortized(?) worst case
editAs with Splay tree the infobox shows "amortized O(log n)" for worst case insertion or deletion. What I don't understand at all in this context is the term "amortized". Isn't "worst case" specific enough? Does "amortized O(log n)" for worst case in effect mean: possibly worse than O(log n), but on a certain average O(log n)?
If so then "amortized O(log n)" for average insertion or deletion would say what is meant. –Nomen4Omen (talk) 08:31, 4 October 2021 (UTC)
To your Edit summary as of 23:06 12 December 2021
edit@David Eppstein: Dear Sir,
you say «Makes no sense to omit the "amortized" and say that worst-case insert-time is logarithmic without qualification».
Of course, it makes sense!
Worst case analysis is well defined and existed long (≤ 1900) before the invention of amortized analysis (≈ 1970). Worst case analysis is simply the "analysis of the worst case". So, of course, worst-case insert-time can be logarithmic, can be linear, can be bounded (=constant) or whatsoever. Worst-case is sufficient as a qualification and does not need an additional one. (And because of that, we as WP-editors have to assume that the WP-reader understands worst-case in the simple and old sense − and we should not add some attribute like amortized.) In Galperin&Rivest I read on page 165 right half top:
- "RB trees implement the basic dictionary oparations with a worst-case cost of O(log n)"
without any attribute amortized! And in the middle of the right half of this page I read
- "... algorithm for maintaining weight-balanced trees with amortized cost O(log n)"
without any attribute worst-case! As I tried to point out in our many earlier dialogs, there is no «amortized worst-case» (nor an «amortized average»). As Rebecca Fiebrink points out:
- „Amortized analysis is a useful tool that complements other techniques such as worst-case and average-case analysis.“
She says: there is a trichotomy
- average vs. amortized vs. worst-case
analysis. As you can read in the article, amortized analysis forms the worstcase-of-the-sums, frequently also called upperbound of the sums, of a sequence of operations – and not the sum of a sequence of worstcase operations. (Btw., the techniques "accounting" and "potential" method obtain the same results in this respect.) In that sense your "Amortized time bounds are worst-case, not average-case" is wrong:
- Amortized costs are upper bounds of sums, sums which afterwards are averaged.
(Indeed, all 3 terms "amortized", "upper bounds" and "averaged" appear in this 1 sentence, but it is important to observe the roles the terms take therein. It is really not very frequent, even in the field of mathematics, that one has to read a text in such a precise manner.)
In Galperin&Rivest you also find: "scapegoat trees ... do have logarithmic worst-case cost of a SEARCH" – without any attribute amortized. Or:
"However, splay trees do not guarantee a logarithmic worst-case bound on the cost of a SEARCH."
- Also here, there is no attribute amortized, it means the simple worstcase.
- So already this conflicts with the adjective amortized in your entry «amortized O(log n)» in the Infobox of article Splay tree.
- Moreover, "does not guarantee logarithmic" means ≠O(log n)
(– which conflicts heavily with the same entry).
The sources you gave me in another dialog do not catch the point of our discussion:
- Lecture notes from U. Washington: "While amortized analysis is about averages, we are averaging cost-per-operation on worst-case input. Contrast: Average-case analysis is about averages across possible inputs." I can agree, but it is rather sloppy a formulation: "worst-case input" should be replaced by "sequences whose sums are worst-case". Again this does not support your use of the terms «amortized worst-case» as opposed to «amortized average» (– as you do in some Infoboxes).
- UCSD lecture notes: uses exact phrase "worst-case amortized time". Hein??
- Cornell lecture notes: "Amortized analysis is a worst-case analysis of a a sequence of operations". Again rather sloppy: A little bit better would be: "Amortized analysis is analysis of worst-case sequences of an operation".
–Nomen4Omen (talk) 16:28, 13 December 2021 (UTC)
- As I said elsewhere the next time you revert an edit on amortized analysis based on your woefully incomplete understanding, which you continue to show off to the world in your TLDR remarks here, I am taking you to ANI for WP:CHEESE-like behavior. —David Eppstein (talk) 17:14, 13 December 2021 (UTC)