Talk:Illegal opcode

Latest comment: 2 years ago by Mathnerd314159 in topic Split illegal and undefined

Talk

Illegal opcode should not be treated as the same thing as an Undocumented opcode. They are very different. Michael 19:22, 18 October 2007 (UTC)

What exactly is out of date here?

I don't understand why there is an out-of-date warning. What exactly is supposed to be out of date about the given information? The subject is mainly historical - so there isn't going to be much *new* information that needs to be added. -- 89.182.203.44 (talk) 11:37, 19 January 2016 (UTC)

Needs links to other articles

This article needs links to Pentium F00F bug along with Pentium FDIV bug.

Also, more examples are needed (such as code and documentation).

FockeWulf FW 190 (talk) 16:33, 11 March 2016 (UTC)

BIOS operation?

BIOS operation links to this article, but that term is not mentioned in the article text. :-( --RokerHRO (talk) 21:05, 10 June 2017 (UTC)

It is, "bop". --Matthiaspaul (talk) 14:09, 1 January 2022 (UTC)

Split illegal and undefined

I propose splitting the article into Illegal opcode and Unintended opcode, or moving the article and then usurping the original title. The lead in the Unintended opcode page should drop the reference to invalid opcodes. Both articles should include {{distinguish}} templates.

I'm undecided whether the new Invalid opcode page should discuss privileged instructions used in a context where they are invalid, e.g., an IBM System/360 Load PSW instruction issued in problem state. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:40, 29 December 2021 (UTC)

Hi Chatul, I'm not sure this would be beneficial to readers in the article's current state. Splitting might be an option once the article has grown much longer and the two sub-topics have clearly emerged, but right now it seems that people do not distinguish the terms much.
Either way, if you can add to the topic (or potential sub-topics), by all means please do, possibly by adding a sub-section.
--Matthiaspaul (talk) 14:17, 1 January 2022 (UTC)
The two really are very different. Some optional features have very well documented opcodes that are illegal when the feature is not installed. Some opcodes are illegal on one model but very well documented on a newer model. What the article describes is mostly opcodes that are legal but unintended. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 02:58, 2 January 2022 (UTC)
The current state of the article confuses readers. The first paragraph lumps together
  • instructions that aren't documented and that cause an "illegal instruction" trap;
  • instructions that aren't documented and that don't cause an "illegal instruction" trap, and may even do something unintended but useful.
Most of the article appears to discuss the latter.
(In addition, the second paragraph of the "Overview" section is rubbish. UUOs were not undocumented; they were instructions specifically intended to cause traps. This dates all the way back to the PDP-6; page 21 of the PDP-6 Handbook speaks of the instructions with the topmost 3 bits of opcode being zeroes as "programmed operators"; page 2-64 of volume 1 of the PDP-10 Reference Manual uses the terms "unimplemented operations" and "unimplemented user operations", as well as the abbreviation "UUO", and notes that half of those operations trap to monitor mode and the other half trap into user-mode code, so it explicitly lists instructions that are intended as system call instructions. I've removed that paragraph.) Guy Harris (talk) 05:45, 3 January 2022 (UTC)
I think the article lumps them all together because, in the real world, many people use at least some of these terms interchangeably without paying much attention to the difference, so that, if we split the contents into two articles, people might land in the wrong one. Therefore, unless we find proper, commonly accepted definitions of the terms, I think we should redirect all such related terms, in use now and in the past, to this article and discuss all aspects of them in here, in order to help readers see what they have in common but also where they differ. I think it is possible to discuss them systematically in a single article so that we don't have to split them into many. If enough content accumulates on a specific sub-topic to warrant a separate article, this can always be done at a later stage.
Regarding the removal of the UUO stuff, I think this should be re-added (possibly reworded to be factually correct) because, documented or not, it is at least a related topic. I would actually love to see a comprehensive list of such traps being used in the various systems (while the mechanism by which the CPU reacts is documented by processor manufacturers, the opcodes actually used for such purposes by operating system vendors are rarely documented, but are nevertheless interesting to know).
BTW, I found the term "unintended instruction" (but not "unintended opcode") also used in conjunction with gadgets, which brings in yet another aspect.
--Matthiaspaul (talk) 14:00, 3 January 2022 (UTC)
If readers are confused, we (TINW) should not add to the confusion. Surely {{distinguish}} is a better way to deal with the confusion.
I'm not sure of the best way to handle UUO. My first take is that, e.g., INT, MME, SVC, TRAP and UUO are in a separate category and that the redirect is inappropriate.
With regard to gadgets, that does not relate to the use of either invalid or unintended opcodes, but rather to unintended uses of valid code sequences. That might be a candidate for see also. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:58, 3 January 2022 (UTC)

Yes, UUOs are in the same category as SVC and INT and MME and EMT and IOT and BPT and all other explicitly-defined-to-trap instructions; the only way in which UUOs differ is that some of them, at least on the PDP-10, are explicitly defined to trap to user-mode code. They're not opcodes not mentioned in the architectural specification.

Including them in this article, however, would turn this article into even more of a mishmash of different topics. Explicitly-defined-to-trap instructions, as a general concept, belong in their own article.

The lede in this article speaks of instructions "that [are] not mentioned in any official documentation released by the CPU's designer or manufacturer, which nevertheless [have] an effect". That does not include SVC/INT/MME/EMT/IOT/BPT/UUOs/etc., as those are mentioned in the instruction set documentation.

That also doesn't say what "an effect" is. It could be "takes bits 2 and 3 of the AC, shifts bits 0 and 1 right by 2 bits, and puts bits 2 and 3 at the top of the AC" (big-endian bit numbering here), or it could be "gets an illegal instruction trap".

Those are not just "common on older CPUs designed during the 1970s, such as the MOS Technology 6502, Intel 8086, and the Zilog Z80"; they're present on any CPU where not all possible opcode values have operations associated with them, including, for example, System/3x0. It's just that, on some processors, they get illegal instruction traps rather than doing whatever the circuitry of the CPU happens, by accident, to cause them to do.

The latter behavior is what, as per the citation for the term, "unintended opcodes" refers to. As I read the 6502 programmer's manual, it has no illegal instruction interrupt (although it does have an explicitly-defined-to-interrupt instruction, BRK). As such, all "illegal opcodes" on the 6502 are "unintended opcodes".

So far, two behaviors for opcodes not mentioned in the manual/ISA spec are described - "do whatever the circuitry happens to make it do, even though the implementers didn't intend anything" and "illegal instruction trap".

Neither of those covers, for example, LOADALL; that's "do whatever the implementers intended, even though they didn't document it". I don't know whether anybody's ever described those as "unintended opcodes", and if they ever did, I'd respond that they're not unintended, they're just undocumented. Both "do whatever the circuitry happens to make it do, even though the implementers didn't intend anything" and "do whatever the implementers intended, even though they didn't document it" could reasonably be called "undocumented instructions", so that phrase doesn't necessarily refer only to the latter, although one could argue that "undocumented" might imply an explicit decision not to document it.

In any case, I consider "do something weird and not explicitly intended", "do an illegal instruction trap", and "do something the implementers intended but didn't document" as somewhat separate topics. "Do an illegal instruction trap" is the least exotic; the only way I can see that's interesting beyond "your program gets terminated" is "if, in the future, that opcode gets used for a new, documented operation, you could have the illegal instruction trap handler simulate it, so you can run software for newer processors, albeit slowly, on older processors". (Using them for OS or user-mode traps is risky, as a future processor might implement them. That's why DEC explicitly setting aside 000 through 077 as UUOs matters - that's a promise never to use those opcodes for hardware-implemented instructions.)

So does this call for one article, two articles, or three articles? Guy Harris (talk) 20:01, 3 January 2022 (UTC)
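The trap-and-emulate idea mentioned above (an illegal instruction trap handler simulating an opcode that only newer processors implement) can be sketched roughly as follows. This is only an illustration on a made-up toy machine: the opcode numbers, the `OldCPU` class, and the `MUL` instruction are all hypothetical and do not correspond to any real architecture.

```python
class IllegalInstruction(Exception):
    """Raised by the toy CPU when it meets an opcode it does not implement."""

    def __init__(self, opcode, operand):
        super().__init__(f"illegal opcode {opcode:#04x}")
        self.opcode = opcode
        self.operand = operand


class OldCPU:
    """Hypothetical older model: implements only LOAD (0x01) and ADD (0x02)."""

    def __init__(self):
        self.acc = 0

    def execute(self, opcode, operand):
        if opcode == 0x01:      # LOAD immediate into the accumulator
            self.acc = operand
        elif opcode == 0x02:    # ADD immediate to the accumulator
            self.acc += operand
        else:                   # every other opcode takes the trap
            raise IllegalInstruction(opcode, operand)


def emulate_mul(cpu, operand):
    """Software stand-in for MUL (0x03), assumed to exist only on a newer model."""
    cpu.acc *= operand


# Opcodes the trap handler knows how to simulate (hypothetical).
TRAP_HANDLERS = {0x03: emulate_mul}


def run(cpu, program):
    """Run (opcode, operand) pairs, emulating trapped opcodes where possible."""
    for opcode, operand in program:
        try:
            cpu.execute(opcode, operand)
        except IllegalInstruction as trap:
            handler = TRAP_HANDLERS.get(trap.opcode)
            if handler is None:
                raise           # no emulation available: the program is terminated
            handler(cpu, trap.operand)
    return cpu.acc
```

For example, `run(OldCPU(), [(0x01, 6), (0x03, 7), (0x02, 1)])` executes the newer `MUL` through the handler, slowly but correctly, while an opcode with no handler still terminates the program.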

I'm aware of unintended behaviors that the vendor had to maintain compatibility with in subsequent designs due to widespread exploitation, e.g., doing a logical or of index registers on the IBM 7090 when an instruction had multiple bits on in the tag field.
IBM has on multiple occasions implemented proprietary instructions, where the documentation was for internal use only. IBM has also defined instructions, e.g., DIAG, that they explicitly documented as being model dependent.
Possibly split some of it into:
  • Invalid opcode
  • Optional opcode
  • trap opcode
  • Undocumented opcode
  • Unintended opcode
I'm aware of software on the IBM System/360 that intercepts program interrupts and simulates System/370 instructions, and I'd be astonished if there weren't similar software for other architectures. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:05, 4 January 2022 (UTC)
So those would be:
  • opcode that's not defined by a manual and causes whatever the system uses as an "invalid opcode" trap (or perhaps is treated as a NOP);
  • opcode that is defined by a manual, but isn't required to be implemented by all CPUs and that causes a trap if not implemented;
  • opcode that is defined by a manual to cause a trap (i.e., the intended behavior is a trap);
  • opcode that is not defined by a public manual but that performs an intended operation;
  • opcode that is not defined by a manual but that performs an unintended (but possibly useful) operation?
I think trap-and-emulate software might have existed for the Motorola 68000 series for floating-point instructions, and wouldn't be surprised if it existed for the PDP-10. Guy Harris (talk) 17:56, 4 January 2022 (UTC)
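As a rough way to check that the five cases listed above do not overlap, they can be written as a small classifier. This is only a sketch: the attribute names (`documented`, `traps`, `optional`, `intended`) are assumptions made for illustration, and the "treated as a NOP" variant of the invalid case is not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    INVALID = "invalid opcode"
    OPTIONAL = "optional opcode"
    TRAP = "trap opcode"
    UNDOCUMENTED = "undocumented opcode"
    UNINTENDED = "unintended opcode"


@dataclass(frozen=True)
class Opcode:
    documented: bool         # defined by a public manual
    traps: bool = False      # executing it always raises a trap
    optional: bool = False   # the manual allows CPUs to omit it
    intended: bool = True    # behavior was designed rather than accidental


def classify(op: Opcode) -> Optional[Kind]:
    """Map an opcode's properties onto the five categories from the list."""
    if not op.documented:
        if op.traps:
            return Kind.INVALID
        return Kind.UNDOCUMENTED if op.intended else Kind.UNINTENDED
    if op.optional:
        return Kind.OPTIONAL
    if op.traps:
        return Kind.TRAP
    return None  # an ordinary documented instruction, outside the list
```

Under these assumptions, something like LOADALL would be `Opcode(documented=False)` and classify as undocumented, while a 6502 opcode that does whatever the circuitry happens to do would be `Opcode(documented=False, intended=False)` and classify as unintended.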
Don't forget
  • opcode that is defined by a public manual to conditionally cause a trap, e.g., Load and Trap (LAT, LGAT) on z/Architecture;
  • opcode that is defined by a public manual to be model dependent. The details might be published in, e.g., CE manuals, functional characteristics manuals, or might not be publicly published.
I'm not sure which categories deserve separate articles, and which should be grouped. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:19, 4 January 2022 (UTC)
"opcode that is defined by a public manual to conditionally cause a trap" Does that include, for example, integer divide instructions that trap on a zero divisor, or arithmetic instructions that trap on overflow, etc.?
"opcode that is defined by a public manual to be model dependent" That's similar to "opcode that is not defined by a public manual but that performs an intended operation", but it's not hidden the way instructions such as LOADALL are, so that probably deserves a separate category. Guy Harris (talk) 19:56, 4 January 2022 (UTC)
An interrupt due to a zero divisor is a case of an instruction faulting due to invalid data. An interrupt due to a zero operand to a Load and Trap[1] instruction is a defined behavior for valid data.
So the SPARC TADDccTV and TSUBccTV instructions, which trap if the low-order 2 bits (tag bits) of either operand are non-zero,[2] would also be "opcode[s] that [are] defined by a public manual to conditionally cause a trap" (they also trap on an integer overflow, presumably to keep the tag bits from getting overwritten)? Guy Harris (talk) 22:43, 4 January 2022 (UTC)
Yes, TADDccTV and TSUBccTV are opcodes that are defined by a public manual to conditionally cause a trap. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:47, 11 January 2022 (UTC)
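To make the "conditionally cause a trap" behavior concrete, here is a minimal sketch of the two instructions discussed, following only the wording quoted in the references (tag_overflow when bit 1 or bit 0 of either operand is nonzero or the addition overflows; Load and Trap raising an exception when the loaded value is all zeros). The 32-bit width, the function names, and the use of Python exceptions to stand in for traps are assumptions for illustration, not a model of real hardware.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


class TagOverflow(Exception):
    """Stands in for the SPARC tag_overflow trap."""


class CompareAndTrap(Exception):
    """Stands in for the compare-and-trap-instruction data exception."""


def taddcctv(rs1, rs2):
    """Tagged add on 32-bit values, trapping per the quoted SPARC V8 wording."""
    rs1 &= MASK32
    rs2 &= MASK32
    total = (rs1 + rs2) & MASK32
    # Bit 1 or bit 0 of either operand is nonzero.
    tag_set = ((rs1 | rs2) & 0b11) != 0
    # Both operands have the same sign and the sign of the sum is different.
    same_sign = ((rs1 ^ rs2) & SIGN32) == 0
    overflow = same_sign and ((rs1 ^ total) & SIGN32) != 0
    if tag_set or overflow:
        raise TagOverflow()  # destination and condition codes stay unchanged
    return total


def load_and_trap(second_operand):
    """Load and Trap: the value is loaded, and an all-zero value then traps."""
    first_operand = second_operand
    if first_operand == 0:
        raise CompareAndTrap()
    return first_operand
```

In both cases the trap is part of the defined behavior for valid data, which is the distinction drawn above from a divide instruction faulting on a zero divisor.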

References

  1. ^ z/Architecture Principles of Operation (Thirteenth ed.). IBM. September 2019. SA22-7832-12. The second operand is placed unchanged at the first operand location. If all zeros are placed at the first operand location, a compare-and-trap-instruction data exception is recognized.
  2. ^ The SPARC Architecture Manual: Version 8. pp. 109, 111. A tag_overflow occurs if bit 1 or bit 0 of either operand is nonzero, or if the addition generates an arithmetic overflow (both operands have the same sign and the sign of the sum is different). If a TADDccTV causes a tag_overflow, a tag_overflow trap is generated and r[rd] and the condition codes remain unchanged.

OK, let's decide what to do.

The introduction says:

An illegal opcode, also called an illegal operation code, unintended opcode or undocumented instruction, is an instruction to a CPU that is not mentioned in any official documentation released by the CPU's designer or manufacturer, which nevertheless has an effect.

and the short description is "Undocumented CPU instruction that has an effect".

Given that, I have a strong suspicion that no valid argument can be constructed that any documented instructions - i.e., "instructions mentioned in any official documentation released by the CPU's designer or manufacturer, which nevertheless has an effect" - should be discussed here.

I also consider "documentation" to include:

  • documentation that just says "performs processor-dependent functions" (e.g., the S/3x0 DIAGNOSE instruction);
  • documentation that says "this causes a trap" (e.g., system call instructions, user-mode UUOs, instructions in a range of opcodes that are flagged as "reserved, causes an illegal instruction trap", and even instructions not explicitly documented if the documentation says "everything we don't mention here will cause a trap").

The article itself seems not to discuss documented instructions.

So I vote for:

  • renaming the article to "undocumented instruction";
  • not mentioning "illegal opcode" and "illegal operation code" as synonyms, because those terms are used for other purposes and might usually be used to refer to opcodes that cause illegal instruction traps;
  • fixing all links to "illegal opcode" that refer to undocumented instructions to use the new name;

which leaves room for making "illegal opcode" a page that mentions both illegal instruction trapping (including what the trap handlers might do, including "emulate an instruction on processors that don't implement it") and illegal instructions that accidentally perform possibly-useful functions; the latter could mention that and have "undocumented instruction" as the main page.

Barring any technically-correct objections, I will make those changes at some point in the next couple of weeks. Guy Harris (talk) 21:55, 20 April 2022 (UTC)

I agree that undocumented instruction is a good term, and this should be the article title. But I think it can have a broader definition that encompasses the other terms, something like "Undocumented instructions are instructions whose behavior is not fully documented by the manufacturer." So the article would have sections like the following:
  • Illegal instruction: An instruction which is not mentioned by the manufacturer but simply raises an illegal instruction exception upon execution. In theory "Illegal instruction" could be its own article but I don't think there's enough content - it's just a few sentences explaining the behavior across various processors when unrecognized instructions are encountered. So this section would be the target of "illegal instruction" and "illegal operation code" redirects.
  • Poorly documented instructions: Instructions where the documentation is vague, such as the DIAGNOSE mentioned. I didn't find any reliably-sourced material for this section in a quick search and it can probably be omitted.
  • Design flaws: Instructions that crash the processor, summary-style section of the Halt and Catch Fire (computing) article
  • Hidden instructions: Instructions that are not documented but have a useful effect - this would be where what's in the article now would end up.
    • 6502: Subsection, target of "illegal opcode" and "unintended instruction" redirects as those terms seem to be used most commonly with the 6502.
    • x86 - the sandsifter stuff
    • other processors as appropriate

--Mathnerd314159 (talk) 17:17, 23 April 2022 (UTC)

"Ub2" listed at Redirects for discussion

  An editor has identified a potential problem with the redirect Ub2 and has thus listed it for discussion. This discussion will occur at Wikipedia:Redirects for discussion/Log/2022 April 20#Ub2 until a consensus is reached, and readers of this page are welcome to contribute to the discussion. signed, Rosguill talk 20:49, 20 April 2022 (UTC)