Wikipedia:Featured article candidates/Cross-site leaks/archive1

The following is an archived discussion of a featured article nomination. Please do not modify it. Subsequent comments should be made on the article's talk page or in Wikipedia talk:Featured article candidates. No further edits should be made to this page.

The article was archived by Gog the Mild via FACBot (talk) 26 March 2024 [1].

Cross-site leaks

Nominator(s): Sohom (talk) 00:24, 15 February 2024 (UTC)[reply]

Say you clicked on that sketchy link that you shouldn't have clicked on: what's the worst that could happen? This article seeks to answer that exact question by providing a technical introduction to an age-old attack that has recently drawn some interest in the academic web security community.

A product of 4 months of almost-continual effort, this article has received an extensive GA review from RoySmith and has subsequently been peer reviewed by TechnoSquirrel69. This is my first time nominating an article for the featured star, and I would love to hear any feedback or comments that y'all might have -- Sohom (talk) 00:24, 15 February 2024 (UTC)[reply]

JimKillock: Support

Extended content

I think it is great to be making articles like this of a good standard. I am sure it is well researched and accurate given the review you have done. However, technical matters like this are very hard to make accessible to an average reader, and I have to say, I really struggle reading this, although I consider that I have a basic lay knowledge of how some of these things might fit together. That said, it also seems a particularly challenging topic to convey in simple terms.

The introductory (lead) section is what really matters here. If this can explain the basic concept well enough, then the other sections may be comprehensible. You might want to see if you can try explaining it in reply here, in an oversimplified manner, to see if that gives a guide to the edits needed to make this sufficiently readable by a general reader. Hope that helps. Jim Killock (talk) 20:00, 17 February 2024 (UTC)[reply]

@JimKillock Thank you for taking the time to review the article :) As you mentioned, the topic is pretty technical which limits how simple some parts of the article can be. I've done some reworking of the prose of the lede at User:Sohom_Datta/csl, let me know if this makes it any better (I can try and simplify it further if you want). Sohom (talk) 03:05, 20 February 2024 (UTC)[reply]

@Sohom Datta Better, especially the first paragraph, but further simplification is needed. Your lede audience is someone who knows nothing about the topic, for instance perhaps your grandparents, or aunts and uncles. (Apologies if they are in fact all software engineers!) (See WP:TECHNICAL and WP:EXPLAINLEAD which seem to guide towards "do simple first, then complicated after". These also give links to text simplicity checker tools, which rate the new text as within the 20% most complicated on Wikipedia, or as "PhD" level texts.) For instance:

To perform a cross-site leak attack, the attacker must find at least one URL in a victim website that provides at least two different responses based on the website's previous interactions with a user and identify at least one way in which they can distinguish between the two responses.

This is a lot to take in, if you are new to it. There are around five or six different ideas to comprehend within it. Perhaps it would be better to step through the process, with one concept per sentence?

An alternative way of presenting the information might reduce the technical explanation in the lead to a very basic statement, and give a noddy-level step through for people like me as the first section. (Post script: this is suggested in the MOS links above.)Jim Killock (talk) 08:08, 20 February 2024 (UTC)[reply]

@JimKillock Is the latest version of User:Sohom_Datta/csl better? I've broken down the second paragraph, added an inline example and broken a few more of the longer sentences up. :) Sohom (talk) 06:48, 21 February 2024 (UTC)[reply]

@Sohom Datta Definitely getting there. I'll ask questions until I can readily understand it perhaps.

What do you mean by "response" in this sentence? For instance, on a search page, an attacker might find one response when a search yields results and another when there are no results. (A lay reader would assume the "response" was the search result.)
Then in the following sentence, what causes an information leakage issue or "side channel"?
Why does information leakage aid an attack, or how does it?

Jim Killock (talk) 07:34, 21 February 2024 (UTC)[reply]

@JimKillock Sure, no problems, feel free to ask any questions :)

- In this case "response" should mean the HTTP responses sent to the browser; however, mentioning that requires that we go off on a tangent explaining "HTTP". Since the "search result" abstraction is not incorrect, I'm not too keen on doing anything here.

But a "search result" is naturally going to give a different "response" if it is for a different search term, which is what you imply here. However the http response seems to be different for another reason? This is why it seems confusing. --Jim Killock (talk) 21:47, 23 February 2024 (UTC)[reply]

The HTTP responses are different because the search results are different. Imagine an attacker is trying to attack your Gmail account using this technique. The attacker would analyze Gmail and find that the website has a search endpoint. They would then pick two queries (say "dog" and "ggdkjsvkjfdsgfdjkgjfdsdj"). They know that "ggdkjsvkjfdsgfdjkgjfdsdj" will always return an empty response. Given this, the attacker will then identify an information leak by which they can differentiate an empty response from a non-empty response. Once they do that, they are able to make the two queries, and if both responses are empty, they know that you don't own dogs; otherwise, they now know that you own/talk about dogs. Hope that cleared up the confusion. Sohom (talk) 18:14, 24 February 2024 (UTC)[reply]

- This is more difficult, and a good question. There is no particular cause for these information leakages, they are inherent to the way the web works and are a result of design choices that were made during the early years. This is touched upon in some detail in the "Background" and "Mechanism" section of the article.

IIRC, they are things like, HTTP responses and JS etc. allow the website to ask (or is told by default?) things like what browser version, plug-ins, language, fonts, etc. are being used. Are there other relevant examples? An example or two would help. --Jim Killock (talk) 21:47, 23 February 2024 (UTC)[reply]

You are confusing Browser fingerprinting with Cross-site leaks. The most common example of an information leakage issue in a cross-site leak context would be the leakage of cache information across websites (a different website caches images and then you check if a specific image is cached). Another example I like to give is that of a website timing how long another website (say Wikipedia) takes to return a login page. If it takes particularly long, you can infer that the login page redirects somewhere else, and so make the inference that the user is logged into Wikipedia. Sohom (talk) 20:13, 24 February 2024 (UTC)[reply]

- Again, this is another really good question. I've struggled to explain this in a succinct way in the lede, but these information leakages are necessary since they bypass the Same-origin policy (there is some discussion about this in the "Background" section). Under normal circumstances, browsers will block attempts by websites to query information about other websites, rendering this attack ineffective.

Let me know if the changes work :) Sohom (talk) 15:37, 23 February 2024 (UTC)[reply]
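
For readers following the thread, the two-query Gmail example and the timing side channel described above can be put together in a rough sketch. This is only an illustration of the inference step, not code from the article or its sources: the function names, the 20 ms margin, and the simulated timings are all assumptions, and `time_query` is a hypothetical stand-in for whatever leak the attacker has found.

```python
from statistics import median
from typing import Callable, List, Sequence, Set

# Control query from the discussion above: assumed never to match any email.
CONTROL_QUERY = "ggdkjsvkjfdsgfdjkgjfdsdj"


def looks_empty(probe_ms: Sequence[float], control_ms: Sequence[float],
                margin_ms: float = 20.0) -> bool:
    """Guess whether the probe response was empty by comparing timings.

    Assumption for illustration: a page with results takes noticeably
    longer to load than the known-empty control page. Medians are used
    to damp the effect of noisy samples.
    """
    return median(probe_ms) <= median(control_ms) + margin_ms


def mailbox_mentions(term: str,
                     time_query: Callable[[str], Sequence[float]]) -> bool:
    """Infer whether a search for `term` returned any results.

    `time_query` is hypothetical: it returns repeated timing samples
    (in milliseconds) for a search, without exposing the response itself.
    """
    control = time_query(CONTROL_QUERY)
    probe = time_query(term)
    return not looks_empty(probe, control)


def make_simulated_timer(mail_terms: Set[str]) -> Callable[[str], List[float]]:
    """Build a fake side channel, used only to exercise the logic above."""
    def time_query(term: str) -> List[float]:
        base = [101.0, 99.0, 103.0, 100.0, 98.0]
        extra = 60.0 if term in mail_terms else 0.0  # results cost extra time
        return [t + extra for t in base]
    return time_query


if __name__ == "__main__":
    timer = make_simulated_timer({"dog"})
    print(mailbox_mentions("dog", timer))  # True
    print(mailbox_mentions("cat", timer))  # False
```

The point of the sketch is that the attacker never reads either response: the decision rests entirely on comparing the probe against a query whose outcome is already known.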

Improvements for sure. Perhaps it would be helpful to draft a simple explanation as not being part of the lede as a first step, so we don't conflate the parameters of a lede with the need of the average reader to be given a simple overview? And then it can aid the lede draft as you would have the basic level guide ready? --Jim Killock (talk) 21:47, 23 February 2024 (UTC)[reply]

@JimKillock Sure, I've answered your questions above, let me know if you need further clarification. Sohom (talk) 16:24, 25 February 2024 (UTC)[reply]

I will re-read your background section and see if I can understand better. Jim Killock (talk) 22:43, 25 February 2024 (UTC)[reply]

The current draft is better, but it is still hard to follow; your explanations here are very helpful, but I think it will take a long time to answer my questions in this way to the point I understand it sufficiently to be able to suggest edits or improvements to the text.

So I'm doing a bit of background reading elsewhere to see if I can get an overview into my mind.

Another step you could take is to read WP:TECHNICAL and WP:EXPLAINLEAD and produce a simple explanation based on those for the "general reader" audience, without regard to where it goes for now. Jim Killock (talk) 08:53, 26 February 2024 (UTC)[reply]

@JimKillock Let me know if the rough summary at User:Sohom Datta/csl's 2nd paragraph is any better? Sohom (talk) 19:27, 2 March 2024 (UTC)[reply]

Yes this is much better. The example is very well explained and I would hope accessible to a much broader range of readers.

There is something missing after this. You say: To execute the attack, the attacker tricks the victim into visiting a website they control, where the exploit is delivered. This often involves phishing or luring the victim to click on a seemingly harmless link. It is hard to relate this back to the prior explanation: the attacker has learnt about different responses, and you have explained how to observe these, but not why these differences matter to an attacker, nor why phishing is needed to deliver the attack. Jim Killock (talk) 11:36, 4 March 2024 (UTC)[reply]

@JimKillock I've cleaned up User:Sohom Datta/csl a bit more. I've removed the reference to phishing since it seems out of place and provided more context on how the attack can be used to find more information about the user. Sohom (talk) 23:53, 8 March 2024 (UTC)[reply]

@Sohom Datta Closer still :) But I am not sure you can cut all the information about the way the attack is delivered. You had: "To execute the attack, the attacker tricks the victim into visiting a website they control, where the exploit is delivered. This often involves phishing or luring the victim to click on a seemingly harmless link." This didn't seem complete, but the user needs to understand a little of the mechanism used.

Therefore on a read through of the current text, these two questions came to me:

A line is needed to explain the method by which the attacker gets to watch gmail queries in the victim's browser: they could try to find a search URL which returns a HTTP response based on how many search results are found for a specific search term in a user's emails. [SOMETHING NEEDED HERE: about what the attacker has done to be able to cause the searches to be made from the browser, if that is possible, that is more than "information leakage issues"] The attacker can then use information leakage issues
At the end, perhaps give an example of the kind of revealing search the attacker might want to find out about, and what the value of that might be. eg, is this about prurient interest, emails from banks? Improving phishing attacks as the attacker knows who you bank with?

Jim Killock (talk) 07:49, 9 March 2024 (UTC)[reply]

@JimKillock Let me know if the latest version is better :) Sohom (talk) 01:44, 10 March 2024 (UTC)[reply]

@Sohom Very, very nearly there. Thank you very much for bearing with me on this.

The order of things needs to be simplified a little (sorry as I have probably caused this problem). In particular, you open up the idea of using Gmail time responses in para 2 sentence 2, but don't explain why this might work until para 3 sentence 3. These ideas belong together for the reader, even though they don't in terms of the sequence of the attack. Otherwise they won't understand para 2 and will stop reading.
After "had any emails containing the query string" I think a really basic example would be satisfying for the reader who has made the effort to read this, and is unsure if they have understood. Something like "The query could be for 'Yourbank PLC', 'Divorce Solicitors', 'husband having affair', for example." Please use better examples especially if these can help with understanding concepts further in the article. :)

Jim Killock (talk) 07:00, 10 March 2024 (UTC)[reply]

I wasn't able to reorganize the attack in a manner where those two sentences were close to each other. I've instead moved the paragraph break to a place where it is obvious that both paragraphs need to be read to understand the attack. (I think?)

Similarly, I'm a bit uncomfortable with making up queries that are not supported by any paper. I've instead opted to add a footnote clarifying what types of information are queried and cited a source for that. Sohom (talk) 16:57, 10 March 2024 (UTC)[reply]

@JimKillock Let me know if the changes work. (They are in the lede of Cross-site leaks) Sohom (talk) 16:58, 10 March 2024 (UTC)[reply]

@Sohom I think it's fine for now. Someone else needs to look and review at this point - I am now too familiar with it to be of much help - and you are probably bored of hearing from me! A fresh reader can assess whether it hits the mark, I think. But for me this is sufficient to make a recommendation to support on the basis that it is now much clearer to a first time reader. Jim Killock (talk) 18:07, 10 March 2024 (UTC)[reply]

Supporting on the basis of an edited simplified introduction, but I recommend that other non-technical editors have a read and make their own assessment and give advice. This is a difficult topic to ensure there is a basic, top-level version accessible enough for WP's general audience. --Jim Killock (talk) 18:11, 10 March 2024 (UTC)[reply]

Coordinator note

This has been open for more than three weeks and has yet to pick up a support. Unless it attracts considerable movement towards a consensus to promote over the next three or four days I am afraid that it is liable to be archived. Gog the Mild (talk) 21:11, 8 March 2024 (UTC)[reply]

@Gog the Mild Any suggestions on where I might be able to attract more reviewers? This article is more on the technical side, and a lack of regular reviewers would be expected, since the subject matter (CS/Privacy) is pretty different from most regular FACs (Not that that's a bad thing) :) Sohom (talk) 23:46, 8 March 2024 (UTC)[reply]

@Sohom Datta: I've listed the article at Wikipedia:WikiCup/Reviews needed. Like I mentioned to you earlier today, I'll see if I can come around for another review myself this weekend. Maybe you could also try reaching out to some of the devs on the Wikimedia Discord? —TechnoSquirrel69 (sigh) 01:47, 9 March 2024 (UTC)[reply]

I've notified the Computer Science and semi-active Computer Security Wikiprojects (I should have done this earlier). I don't think the devs on Wikimedia Discord are my best bet, but I'll reach out and see what I can do :) Sohom (talk) 23:45, 9 March 2024 (UTC)[reply]

My boilerplate advice is

Reviewers are more happy to review articles from people whose name they see on other reviews (although I should say there is definitely no quid pro quo system on FAC). Reviewers are a scarce resource at FAC, unfortunately, and the more you put into the process, the more you are likely to get out. Personally, when browsing the list for an article to review, I am more likely to select one by an editor whom I recognise as a frequent reviewer. Critically reviewing other people's work may also have a beneficial impact on your own writing and your understanding of the FAC process.

Sometimes placing a polite neutrally phrased request on the talk pages of a few of the more frequent reviewers helps. Or on the talk pages of relevant Wikiprojects. Or of editors you know are interested in the topic of the nomination. Or who have contributed at PR, or assessed at GAN, or edited the article. Sometimes one struggles to get reviews because potential reviewers have read the article and decided that it requires too much work to get up to FA standard. I am not saying this is the case here - I have not read the article - just noting a frequent issue.

Gog the Mild (talk) 20:06, 11 March 2024 (UTC)[reply]

Comments from TechnoSquirrel69

A little late, but saving a spot here. I'm not able to get to the review this weekend, but I'll try to get back here with some comments as soon as time permits. —TechnoSquirrel69 (sigh) 21:05, 10 March 2024 (UTC)[reply]

@Sohom Datta: It seems that time did not permit, unfortunately. However, I'm officially back; have a review! Citation numbers from this revision.

Media review

File:XS-Leaks Attack Steps.svg is freely licensed and tagged accordingly.
File:Histogram of cross-site leaks cache timing attack example.png is freely licensed and tagged accordingly.
Code blocks included in the article are taken from Van Goethem et al., which is freely licensed. The text is appropriately attributed.
Media review passed.

Other comments

So I think the entirety of footnote 4 is unnecessary. The attribution for the code is already given in the references, and if it contains an error of some sort (which I'm not pretending to understand, just assuming), then it should be silently corrected. This kind of clarification is useful for editors, but I'd expect it more in an HTML comment than in the actual article.
There are a bunch of citations where you've duplicated part of the URL in the |website= parameter. It's much more useful and consistent to identify the name of the website or entity and link to the article about it if possible. For example, in citation 2: |website=developer.mozilla.org → |publisher=[[MDN Web Docs]]. Also, use italics only when referring to the name of a publication (The Daily Swig) and not just for every website (Medium), similarly to how the name would be treated in running prose.

Done, lmk if I missed any

In citation 15: lose the underscore.

Done

In citation 28: Add the author, remove "Cybersecurity news and views".

Done

There are a few other citations also missing authors.

Should be fixed, lmk if I missed any

I'm doing a more detailed review of the prose, which I'll add here once I'm done. Let me know if you have any questions in the meantime! —TechnoSquirrel69 (sigh) 21:59, 21 March 2024 (UTC)[reply]

@Sohom Datta: Alright, we talked in much more detail earlier, but I'll just summarize my feedback here for the record.

A reader needs additional context for the subject matter, which takes the form of explaining how the system is supposed to work before getting into how to exploit it.
Don't dumb down the content, try to abstract it so readers can understand the general concepts without needing specific knowledge of all the moving parts. When you do have to get into technical territory, make sure to contextualize it in plainer terms.
The diagram in § Background is confusing and borderline illegible. The article is probably not well served by it, so consider options to replace it or remove it altogether.

—TechnoSquirrel69 (sigh) 04:04, 22 March 2024 (UTC)[reply]

Comments from Mike Christie

I'm not a security expert but I do have some technology background so I'll see if I can provide some useful comments.

The first thing that strikes me is that the lead is too long for the size of the article -- it's almost 25% of the length of the body. I would cut at least a quarter of it; it only needs to summarize and point to the content in the body.
Most of the bulk of the lede comes from a detailed, simplified example of how an attack is performed (which was something JimKillock wanted). Would it make sense to merge that with the Mechanism section?
Per your comment below, I think moving some of the detail to the "Mechanism" section would work well. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]

let me know if the current version is any better. Sohom (talk) 00:32, 25 March 2024 (UTC)[reply]

You link to information leakage twice in the lead.
Fixed
Still there -- you link from "leak information" in the first paragraph and "information leakage" in the third; I was going to remove the second link but realized you might prefer to keep that one. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
Looking at File:XS-Leaks Attack Steps.svg, I suggest adding a little more to the caption explaining the sequence. Perhaps add "Here the attacker can deduce that the victim is logged in to the vulnerable site".
I've expanded some of the captions.
That does help. How about using the "upright" param to increase the size of the image a bit? It's not readable without clicking through on most screen sizes. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
Just making sure I understand the mechanics: client-side Javascript sends a request to victim.leak, which replies; the body of that http response is hidden from attack.leak by the browser, but the http header of the response is returned to the Javascript's execution context, which means it can be forwarded to attack.leak. Is that correct?
Mostly, the HTTP header is not returned/read by the attacker, but some of the effects of the Content-Disposition header can be observed by attack.leak. (https://xsleaks.dev/docs/attacks/navigations/#download-trigger gives a nice overview of the attack).
I can see the types of attack are so numerous that it's not feasible for you to list every single one. However, the detection of downloads seems like it is a clear enough example you might consider adding a mention of it to the "Other techniques" paragraph. And is what you say about the Content-Disposition header a generally true statement for most of the attacks? If so it seems like that's a technical detail that ought to be mentioned. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]

I've elected to remove references to the download attack in the new diagram per the new feedback.

Have there been any known instances of this attack in the wild?
None that have been documented in RS.
It's hard to source a negative but I think we should say this if we can source it. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]

I've not been able to find any sources that prove the negative, the closest RS comes to describing in-the-wild instances are Terjanq's and Luan Herrara's attack. :( Sohom (talk) 00:32, 25 March 2024 (UTC)[reply]

In the last paragraph of the lead, "traditionally", "modern" and "more recently" imply a time frame; can we put dates on these? "Until the 2010s" and "since about 2020", or whatever the sources would support. Otherwise the language is going to date relatively quickly.
"Cross-site leaks allow attackers to break this cross-origin barrier, which is inherent in web app contexts": The previous sentence described the barrier as preventing arbitrary execution, so I think "break" is too strong here -- really it's a read-only breach. How about "Cross-site leaks allow attackers to obtain information despite [or through] this cross-origin barrier". And what does "which is inherent in web app contexts" add that hasn't been said in the previous sentences?
Removed the last part, and reworded the rest
"To perform a cross-site leak, the attacker must identify and include at least one state-dependent URL in the victim app." This makes it sound as if the attacker is including something in victim.leak; what I think you mean is "To perform a cross-site leak, the attacker must identify at least one state-dependent URL in the victim app for use in the attack app".
Done
~~"To demonstrate ... is taken": suggest "The following example of ... demonstrates a common scenario of ..." -- I think the "is taken" wording sounds a bit strained.~~
Done
Just out of curiosity, and to see if I understand the mechanism properly, if the attacker used an icon loaded from their own network, wouldn't that give them more specific information than timing a CDN icon return to see if it was cached?
Yes, that would definitely give the attacker more information (and make the attack easier), but in this case, the assumption we are making is that the attacker cannot tamper with the content of the victim website, just make requests to it
Right -- I'd misunderstood the mechanism. Rereading I don't think more is needed; I just misread it. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
~~Suggest expanding "iff" and unlinking it; no need to abbreviate to that level.~~
Done
"but used an amplification technique in which the input was crafted to extensively grow the size of the responses, leading to a proportional growth in the time taken to generate the responses, thus increasing the attack's accuracy". What would we lose if this was shortened to "but used a technique in which the input was crafted to grow the size of the responses, leading to a proportional growth in the time taken to generate the responses, thus increasing the attack's accuracy"?
Done
"Since 2020, there has been some interest among the academic security community to standardize these attacks." Suggest "Since 2020, there has been some interest among the academic security community in standardizing the classification of these attacks".
Done
~~You might consider changing to {{Use Oxford spelling}} instead of {{Use British English}}, since you're using "-ize" endings.~~
Done
~~"... for which there is no established, uniform classification. These attacks are typically categorized by ...": seems contradictory.~~
I guess I want to emphasize "established" and "uniform" in the previous sentence.
OK. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
"As of 2021, researchers have identified over 38 leak techniques that target components of the browser, and new techniques are discovered due to ongoing changes in web platform APIs": I'm not sure what the second half of this is saying. Does it refer to discoveries that post-date the 2021 list of 38 techniques? Or is it a general statement about how new techniques can appear?
It's a general statement on how new techniques appear.
Could we make this "As of 2021, researchers have identified over 38 leak techniques that target components of the browser. New techniques are typically [or often] discovered due to ongoing changes in web platform APIs"? Assuming the source supports this? Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
~~"timing attacks could infer cross-origin execution times across embedded contexts": what does "across embedded contexts" mean?~~
"Embedded contexts" would be mostly iframes (and other more obscure framing techniques)
OK -- I think that's fine as is; I'm not a web developer but I think anyone familiar with the field would have no trouble with this. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
~~"showed the Performance API could leak": "Performance API" needs a link or a footnoted explanation; I assume it's one of Chrome's APIs but that should be clearer.~~
Added
"In contrast, if the handler onerror is triggered with a specific error event, the attacker can use that information to distinguish between HTTP content types, status codes and media-type errors": again just checking my understanding -- wouldn't this information already be available in the http status code?
Yes it would, but the browser would not allow cross-origin pages to access the http status codes
OK -- this is the same question as above about the Content-Disposition header; I hadn't understood exactly what information is allowed to be seen by the browser, and was assuming some aspects of the status were directly visible. I think in the mechanism section some statement that incorporates what you've told me in answer to these two questions would be helpful. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
If the sources give enough information, what could the "global limits" reveal? And is this section different from the last sentence of "Timing attacks" which talks about a pool party attack? ~~(And is it "pool party" or "pool-party"? You have both.)~~
This section is not different from the last sentence; these attacks have been categorized by Knittel as both timing and as part of the new "global limits" type. The paper discussing pool-party attacks uses the "pool-party" convention, I'll stick with that.
I think it would be helpful if the reader knew in the global limits section that the previously mentioned pool-party attack was an example of this type of attack. Perhaps in the timing attacks section add something like "this is an example of a global limits attack"? Or the reverse: in the later section mention the earlier timing attack as an example. Mike Christie (talk - contribs - library) 11:02, 12 March 2024 (UTC)[reply]
~~"an attacker could leak whether or not a Cross-Origin-Opener-Policy header was set": can we say what this would reveal to the attacker?~~
So, the presence or absence of a header doesn't reveal much on its own. However, it's a mechanism to tell two responses apart. Ditto for the one above.
~~Suggest linking "stateless" to stateless protocol.~~
Done
~~"By disallowing the embedding of the website in untrusted contexts, the malicious app can no longer ...": needs rephrasing; as written this says it's the malicious app that is doing the disallowing.~~
Rephrased
Am I right in thinking that the Fetch metadata headers do nothing by themselves, but require the targeted app to take action depending on their content? So they enable a defence but are not in themselves a defence?
Yep, they enable a defence but they are not defences in themselves (it allows for disallowing specific "risky" requests)

-- Mike Christie (talk - contribs - library) 13:16, 11 March 2024 (UTC)[reply]
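
The last exchange above (Fetch metadata headers enable a defence but are not one by themselves) can be illustrated with a minimal server-side sketch. This is one possible policy under stated assumptions, not the article's or any library's implementation: `allow_request` is a hypothetical name, and `headers` is assumed to be a mapping of lower-cased header names to values. The header names `Sec-Fetch-Site`, `Sec-Fetch-Mode` and `Sec-Fetch-Dest` and the values checked are the ones browsers send.

```python
from typing import Mapping


def allow_request(method: str, headers: Mapping[str, str]) -> bool:
    """Return True if the request should be served, False if rejected.

    The headers only describe where a request came from; it is this
    application-level check that turns them into a defence.
    """
    site = headers.get("sec-fetch-site")

    # Browsers that do not send Fetch Metadata are let through,
    # so the policy does not break older clients.
    if not site:
        return True

    # Requests from the app itself, or initiated directly by the user
    # (typed URL, bookmark), are not cross-site probes.
    if site in ("same-origin", "same-site", "none"):
        return True

    # Cross-site requests: allow only a plain top-level navigation,
    # such as following a link. Embedding (iframe, object, embed) and
    # subresource loads (img, script, fetch) are rejected.
    if (method == "GET"
            and headers.get("sec-fetch-mode") == "navigate"
            and headers.get("sec-fetch-dest") == "document"):
        return True

    return False
```

Restricting cross-site navigations to `document` destinations is a deliberately strict choice made for this sketch; an application that must be embeddable by trusted partners would need a looser rule.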

@Mike Christie Thank you so much for the review. I've implemented most of the feedback and left a few inline explanations.

I'm a bit confused regarding the lede (a lot of the bulk comes from implementing User:JimKillock's (courtesy ping) suggestions regarding simplified overview of the topic for general readers). I wonder if moving some of the example related content into the "mechanism" section would be a good idea :) Sohom (talk) 04:55, 12 March 2024 (UTC)[reply]

Hi both, take a read of

WP:TECHNICAL: It is especially important to make the lead section understandable using plain language, and it is often helpful to begin with more common and accessible subtopics, then proceed to those requiring advanced knowledge or addressing niche specialties.
WP:EXPLAINLEAD: For these reasons, the lead should provide an understandable overview of the article. While the lead is intended to mention all key aspects of the topic in some way, accessibility can be improved by only summarizing the topic in the lead and placing the technical details in the body of the article. … In general, the lead should not assume that the reader is well acquainted with the subject of the article. Terminology in the lead section should be understandable on sight to general readers whenever this can be done in a way that still adequately summarizes the article, and should not depend on a link to another article.
WP:ONEDOWN: A general technique for increasing accessibility is to consider the typical level where the topic is studied (for example, secondary, undergraduate, or postgraduate). … The lead section should be particularly understandable, but the advice to write one level down can be applied to the entire article, increasing the overall accessibility. Writing one level down also supports our goal to provide a tertiary source on the topic, which readers can use before they begin to read other sources about it. For example, a long-winded mathematical proof of some result is unlikely to be read by either a general reader or an expert, but a short summary of the proof and its most important points may convey a sense to a general reader without reducing the usefulness to an expert reader.

I think a simple lead and then layering the basic description afterwards would fit the above from the WP:MOS, but you would need to take care that the lead itself remains comprehensible to an "average non-technical reader". Jim Killock (talk) 07:48, 12 March 2024 (UTC)[reply]

Arbitrary break: Sohom

@JimKillock, Mike Christie, and TechnoSquirrel69: (also @Joereddington: who left some comments at WikiProject Computer Security :) I've rewritten the lede and the background. I've elected to remove the detailed description of the attack from the lede (the example and description have been moved to the mechanism section) and instead provide a brief overview of the salient aspects of the attack. The background section has been expanded to provide some context on why an attacker might want to perform the attack and to explain the impact of the same-origin policy in a better way (it also goes into detail about the ideal way everything should work). The confusing drawing in the background+mechanism section has been replaced with a much better and simplified diagram that does not include references to the download identification attack. (after a lot of feedback from TechnoSquirrel69) Sohom (talk) 00:32, 25 March 2024 (UTC)[reply]

Thanks - I think the shortened lead approach can work well here, and entirely agree with it being moved; however, the current lead contains a lot of unexplained concepts which are of course broken down in the background you wrote. If this wasn't Wikipedia, I would add a sentence to guide the uninitiated to hold on (e.g., "a simple explanation of the process is provided below"). Also if this wasn't Wikipedia I would suggest removing more of the potentially confusing and not fully expanded concepts in order to ensure the reader doesn't feel they've lost the thread and stopped.

Given all that may break the rules, I would aim for a very simple overview up front along the lines of: In a cross-site attack, the user is duped into visiting a malicious website that asks the user's browser to get information from another web service, like a search engine, without the user knowing about it. Because the other web service was "asked" by the user's web browser, it complies with the request. The malicious website can then learn something about the user's relationship with the web service, through things like the length of time it takes for a request to come back, or the amount of information the web service gives to the user. While the malicious website cannot read the information from the web service directly, as it is collected by the user in their web browser, the malicious website can make accurate inferences that reveal specific facts about the user.

You could even incorporate here or perhaps in the background section: For example, the attacker could ask your browser to search a web-based email service. The attacker would then pick two queries (say "dog" and "ggdkjsvkjfdsgfdjkgjfdsdj"). They know that "ggdkjsvkjfdsgfdjkgjfdsdj" will always return an empty response. Given this, the attacker will then observe the difference when your browser gets an empty response versus a non-empty response. Once they do that, they are able to make the two queries, and if both responses are empty, they know that you don't own dogs, else they now know that you own or talk about dogs. Jim Killock (talk) 04:58, 25 March 2024 (UTC)[reply]

I've tried to simplify the language of the first portion of the lede and incorporate some of your suggestions. Let me know if it works now. Regarding the rest, I have reservations about including the exact example (which I had outlined previously). However, I've incorporated a part of the text in the Mechanism section. Sohom (talk) 16:44, 25 March 2024 (UTC)[reply]

I understand the reluctance, and I've no wish to keep pushing my own view here. But I would ask you if you can honestly say that per WP:EXPLAINLEAD and WP:TECHNICAL that the "lead section [is] understandable using plain language", consistently does "not assume that the reader is well acquainted with the subject" and that "Terminology in the lead section [is] understandable on sight to general readers".

The recommended tool hemingwayapp is giving the lead a rating of "Grade 14 Poor. Aim for 9." and says "11 of 28 sentences are very hard to read".

Personally I think it is possible to make the introductory remarks simpler, which was my aim in writing a few lines to show how it could be approached. And when asked casually, you have yourself given me very good and impressively clear simple explanations. Jim Killock (talk) 20:59, 25 March 2024 (UTC)[reply]

Coordinator note 2

This has been open for nearly six weeks and has attracted a lot of comments but only declarations of support. It currently feels more like a PR than a FAC. There still seems a way to go to achieve any consensus to promote so I am going to put it to bed now and ask that further work take place away from FAC with discussion on the article talk page, or possibly PR. I anticipate seeing it back here soon, although the usual two-week wait applies. You can of course again ping the reviewers to comment at the next FAC. Cheers. Gog the Mild (talk) 12:33, 26 March 2024 (UTC)[reply]

Closing note: This candidate has been archived, but there may be a delay in bot processing of the close. Please see WP:FAC/ar, and leave the {{featured article candidates}} template in place on the talk page until the bot goes through. Gog the Mild (talk) 12:33, 26 March 2024 (UTC)[reply]

The above discussion is preserved as an archive. Please do not modify it. No further edits should be made to this page.