Template:Did you know nominations/Quotient filter

The following discussion is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.

The result was: promoted by PFHLai (talk) 22:47, 29 August 2012 (UTC)

Quotient filter

Created by RMcPhillip (talk). Self nom at 20:05, 23 July 2012 (UTC)

  • Comment I support this nomination generally. I can't review/promote it but if there is a problem you can let me know and I'll try and fix it. Protonk (talk) 20:57, 23 July 2012 (UTC)
  • One question: is quotient supposed to be capitalized or not? I only see it capitalized in this hook and the first sentence of the article (except where it begins a sentence). Chris857 (talk) 21:53, 23 July 2012 (UTC)
  • I don't think so. I made the change. Protonk (talk) 23:38, 23 July 2012 (UTC)
  • May I ask what checks were involved in the "seems grand" approval? There are significant chunks of the Bender et al. paper reproduced verbatim—many over two dozen words long—and that's just one source. I appreciate that much original work has been done, including the various diagrams, and I imagine it's quite hard to redo a proof without some close paraphrasing, but direct quotes must be shown, and Wikipedia's fair use guidelines (see the text section on that page; direct link is WP:NFCCEG) followed. BlueMoonset (talk) 15:31, 27 July 2012 (UTC)
  • Argh, I didn't make a check on that front. Sorry; my head is still stuck in the "old DYK" mentality (does the source meet the hook? Is it blatantly incorrect? Yes and no respectively, pass). Ironholds (talk) 15:41, 27 July 2012 (UTC)
  • Thank you for your comments BlueMoonset. I stand fairly admonished. I have corrected the article, adding quotes where I should have put them originally. Some of the quotations may still seem overly long ... I will also be removing the banner warning. Please re-add it if the content is still improper. RMcPhillip (talk) 14:09, 28 July 2012 (UTC)
  • I have restored the banner warning; there are still significant issues with unspecified Bender quotations. Examples include:
  • the fourth and fifth sentences in the very first paragraph, from "not in the set" through "space consumption."
  • the first two sentences of the Algorithm Description section:
  • 1. Even though footnotes are to other sources, the bolded text is identical to Bender: The quotient filter is a compact hash table similar to that described by Cleary and which employs quotienting, a technique suggested by Knuth.
  • 2. "is partitioned into the q most significant bits (the quotient) and the r least significant bits (the remainder)" is identical
  • Cost/Performance section, Cluster length subsection, sentence two: the relevant section should reflect the full extent of what is being quoted, rather than show only part; an ellipsis should be used: "the hash function h generates uniformly distributed independent outputs, then ... with high probability, a quotient filter with m slots has all runs of length O(log m); most runs have length O(1)." This leaves the Cost/Performance section almost entirely quoted; if it can't be summarized, perhaps it should be omitted.
  • Application section, first paragraph:
  • The second half of the second sentence, "split database tables into smaller sub-tables", is identical to the end of the third paragraph in the paper's Introduction section.
  • The third sentence through to the end of the paragraph is almost identical to the fourth paragraph in the paper's intro, second sentence to the end.
  • Even if all the quotations are specified, will the article qualify under fair use guidelines? I'm increasingly dubious that it will. There may not be other ways to phrase a few of these examples, but a rough estimate puts the number of quoted words at well over 400 from a paper containing around 3300 in an article of maybe 2200 total. What I'm afraid of is that fair use requires significantly lower proportions: WP:NFCCEG states, "Extensive quotation of copyrighted text is prohibited." BlueMoonset (talk) 20:41, 31 July 2012 (UTC)
Hey all, BlueMoonset has asked me to step into the reviewer role here. This is much better than it was. There's still a bit of unquoted closeness - for example, look at the "Cluster length" paragraph - but much less over-quoting. Nikkimaria (talk) 16:02, 8 August 2012 (UTC)
Thank you for your help Nikkimaria. I will fix that closeness. Is there some tool available to detect copyright problems? I looked at the Coren bot results but did not see this article. If such problems remain, I'd like to clean them all up at once. (Moral: Do things right the first time.) RMcPhillip (talk) 16:34, 8 August 2012 (UTC)
I'm sorry I wasn't able to get back to you sooner. There's information about the tool that helps detect replicated text at WP:Duplication detector, and a link to the tool itself at the bottom of the page. BlueMoonset (talk) 04:14, 17 August 2012 (UTC)
Thanks for the link to the tool. I do plan to update the article in the very near future. RMcPhillip (talk) 20:54, 23 August 2012 (UTC)
I have removed all but the most coincidental instances of replicated text, and I took the liberty of removing the banner warning. RMcPhillip (talk) 11:09, 28 August 2012 (UTC)
Thanks for all your work on this; I think it's finally good enough to go on. Nikkimaria (talk) 04:46, 29 August 2012 (UTC)
And thanks for your patience with a newbie. RMcPhillip (talk) 10:18, 29 August 2012 (UTC)