The Machine (computer architecture)

The Machine is the name of an experimental computer made by Hewlett Packard Enterprise. It was created as part of a research project to develop a new type of computer architecture for servers. The design focused on a “memory centric computing” architecture, where NVRAM replaced traditional DRAM and disks in the memory hierarchy. The NVRAM was byte addressable and could be accessed from any CPU via a photonic interconnect.[1][2] The aim of the project was to build and evaluate this new design.

Hardware overview

The Machine was a computer cluster with many individual nodes connected over a memory fabric. The fabric interconnect used VCSEL-based silicon photonics with a custom chip called the X1.[3] Access to memory was non-uniform and could involve multiple hops. The Machine was envisioned as a rack-scale computer, initially with 80 processors and 320 TB of fabric attached memory, with the potential to scale to more enclosures and up to 32 ZB.[4][5] The fabric attached memory was not cache coherent, and software had to be aware of this property.[4] Since traditional locks need cache coherency, hardware was added to the bridges to perform atomic operations at that level.[4] Each node also had a limited amount of local, private, cache-coherent memory (256 GB).[6][4] Storage and compute on each node had completely separate power domains.[4]

 
A logical diagram showing a single node in the Machine. Dozens of nodes are connected together using the backplane. The initial prototype contained DRAM, with the eventual goal of being replaced with more NVRAM.

The whole fabric attached memory of The Machine was too large to be mapped into a processor's virtual address space (which was 48 bits wide[4]), so a way was needed to map windows of the fabric attached memory into processor memory. Communication between each node SoC and the memory pool therefore went through an FPGA-based “Z-bridge” component that managed the memory mapping of the local SoC to the fabric attached memory.[4] The Z-bridge dealt with two different kinds of addresses: 53-bit logical Z addresses and 75-bit Z addresses, which can address 8 PB and 32 ZB respectively.[4] Each Z-bridge also contained a firewall to enforce access control.[7] The interconnect protocol was developed in-house and known as Next Generation Memory Interconnect (NGMI).[4] This protocol evolved into the open Gen-Z standard.[8][9] The Z-bridge connected to the SoC using PCIe, avoiding major software changes.[9]
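The address widths above can be checked with a little arithmetic. The following is a minimal sketch, not part of any HPE software: it assumes the quoted "TB", "PB" and "ZB" figures use binary prefixes (1 TiB = 2**40 bytes), and all names in it are illustrative.

```python
# Illustrative arithmetic only; binary prefixes are an assumption.
TIB = 2**40  # bytes in one tebibyte
PIB = 2**50  # bytes in one pebibyte
ZIB = 2**70  # bytes in one zebibyte


def addressable_bytes(bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 2**bits


virtual_space = addressable_bytes(48)    # processor virtual address space
logical_z_space = addressable_bytes(53)  # logical Z addresses
z_space = addressable_bytes(75)          # Z addresses

print(virtual_space // TIB, "TiB")    # 256 TiB
print(logical_z_space // PIB, "PiB")  # 8 PiB
print(z_space // ZIB, "ZiB")          # 32 ZiB

# The planned 320 TB pool does not fit in a 48-bit address space,
# which is why only windows of it can be mapped at any one time.
planned_pool = 320 * TIB
print(planned_pool > virtual_space)   # True
```

Under these assumptions, a 48-bit virtual address space covers 256 TiB, less than the planned 320 TB pool, which is consistent with the need for window mapping through the Z-bridge.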

A half-rack prototype of The Machine was unveiled at HPE Discover in London in 2016.[10] Each node contained ARMv8-A based Broadcom/Cavium ThunderX2 SoCs.[11][12][3] In total there were 40 32-core SoCs.[13] Due to the unavailability of adequate memristor-based NVRAM or phase-change memory, the prototype used 160 TB of battery-backed DRAM.[14][12][15] Despite this setback, software architect Keith Packard said this "can be used to prove the other parts of the design before switching".[4] According to The Register, HPE's partnership with SK Hynix to develop memristor-based NVRAM ran into funding and directional problems, and HPE was instead working with SanDisk on Resistive RAM (ReRAM) for The Machine.[16] According to The Next Platform, HPE considered switching to Intel Optane DIMMs "when production quantities of are available on the market".[9]

The Next Platform estimated the rack prototype to consume 24 kW to 36 kW of power.[9]

Software overview

Two major software projects were created for The Machine.[17] The first was an experimental version of Linux called Linux++,[18] which carried the enhancements needed to configure the hardware and work with traditional programming models.[19] These included bridge configuration, access control, and mapping using the DAX subsystem. In parallel, a new operating system (OS) called Carbon[20][21] was announced that would be designed from first principles to take full advantage of an NVRAM-based computer.[22][23][24]

Primary workloads for The Machine included in-memory databases, Hadoop-style software, and real-time big data analytics.[25][26] HPE claimed that a memory-driven computing design like The Machine could "improve speeds by up to 8000x compared to conventional systems".[27]

In the prototype system, the fabric attached memory was organised by a "top of rack" management server component called The Librarian.[4][28] The Librarian divided the memory into "shelves" of 8 GB "books", and hardware protections could be configured on book boundaries.[4] A finer-grained 64 KB "booklet" was also supported.[4]
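The book and booklet granularity can be illustrated by splitting a byte offset into its parts. This is a minimal sketch, not The Librarian's actual code: it assumes a shelf is a contiguous run of books, reads 8 GB and 64 KB as binary sizes, and uses the hypothetical helper name `locate`.

```python
# Hypothetical illustration; sizes are read as binary units (assumption).
BOOK_SIZE = 8 * 2**30       # one book: 8 GiB
BOOKLET_SIZE = 64 * 2**10   # one booklet: 64 KiB
BOOKLETS_PER_BOOK = BOOK_SIZE // BOOKLET_SIZE


def locate(offset: int) -> tuple[int, int, int]:
    """Split a byte offset within a shelf into (book, booklet, byte).

    Assumes the shelf is laid out as consecutive books, each made of
    consecutive booklets.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    book, within_book = divmod(offset, BOOK_SIZE)
    booklet, byte = divmod(within_book, BOOKLET_SIZE)
    return book, booklet, byte


# An offset 5 bytes into the fourth booklet of the second book.
example = BOOK_SIZE + 3 * BOOKLET_SIZE + 5
print(BOOKLETS_PER_BOOK)  # 131072
print(locate(example))    # (1, 3, 5)
```

With these sizes each book holds 131,072 booklets, so a protection set on a book boundary covers far more memory than one set at booklet granularity.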

The mapping of memory is handled by the OS, while the access controls for the memory are configured by the management infrastructure of The Machine system as a whole.[4] Software needs to be aware that reads from fabric attached memory can fail with synchronous errors, whilst writes can fail with asynchronous errors. On the Linux system, a memory error is reported to the process using the SIGBUS signal.[4]

Programming model and data structure changes were also explored, including changes to thread libraries and heap data structures to make them resilient to non-volatile memory failure modes.[29][30][31][32][33]

History

A few years after HP’s rediscovery of the memristor,[34] the newly appointed CTO of HP, Martin Fink, created an HP Labs project to build a computer system based on memristors to tackle the slowing of Moore's law. He announced the project at HP’s Discover event in the summer of 2014.[35] Some of the ideas of The Machine also came from Dragonhawk system designs.[4][36] Three-quarters of HP Labs’ 200 staff were focused on the hardware and software of The Machine.[22]

Speaking to Bloomberg, HP said it would commercialize The Machine within a few years, “or fall on its face trying.”[35]

Kirk Bresniker served as Chief Architect, and Keith Packard was hired to work on the Linux enhancements.[37][7] Bdale Garbee was hired to manage open source development.[38]

In 2015, Hewlett-Packard split into two companies, HP Inc and Hewlett Packard Enterprise (HPE), with The Machine project assigned to the latter.[39]

In late 2016, Martin Fink retired as HPE CTO.[40] Fink's retirement announcement also said that Hewlett Packard Labs staff would be moved into the Enterprise product group to "align our R&D work on The Machine with the business".[41][42]

By early 2017, Hewlett Packard Labs was presenting a slide saying that the project's aim was “to demonstrate progress, not develop products” and that they would “collaborate to deliver differentiating Machine value into existing architectures as well as disruptive architectures”.[43] BleepingComputer said: "In other words, The Machine is no longer a product in its own right. Instead it will provide technologies that will be used in other HPE products going forward." HPE restructured its pure R&D organization and placed it in the products group.[44] Yahoo! Finance reported that The Machine prototype "remains years away from being commercially available".[45]

In 2018, HPE stated that the project had reached the stage where it needed commercial applications from customers in the next step of its evolution.[46]

References

  1. ^ Morgan, Timothy Prickett (2016-01-04). "Drilling Down Into The Machine From HPE". The Next Platform. Retrieved 2023-01-04.
  2. ^ Keeton, Kimberly (2015-06-16). "The Machine". Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers. ROSS '15. New York, NY, USA: Association for Computing Machinery. p. 1. doi:10.1145/2768405.2768406. ISBN 978-1-4503-3606-2. S2CID 7768740.
  3. ^ a b Morgan, Timothy Prickett (2017-06-15). "The Memory Scalability At The Heart Of The Machine". The Next Platform. Retrieved 2023-01-04.
  4. ^ a b c d e f g h i j k l m n o p "A look at The Machine [LWN.net]". lwn.net. Retrieved 2023-01-04.
  5. ^ "Can HPE's "The Machine" Deliver? - IEEE Spectrum". spectrum.ieee.org. Retrieved 2023-01-04.
  6. ^ Mellor, Chris. "RIP HPE's The Machine product, 2014-2016: We hardly knew ye". www.theregister.com. Retrieved 2023-01-04.
  7. ^ a b "the machine architecture". keithp.com. Retrieved 2023-01-04.
  8. ^ "Gen-Z Looks to Ignite IT innovation With Open, High-Performance Interconnect Technology | HPE". 2022-01-31. Archived from the original on 2022-01-31. Retrieved 2023-01-04.
  9. ^ a b c d Teich, Paul (2017-01-09). "HPE Powers Up The Machine Architecture". The Next Platform. Retrieved 2023-01-04.
  10. ^ Clark, Don (28 November 2016). "HP Enterprise Unveils Prototype of Next-Generation Computer 'The Machine'". Wall Street Journal. Retrieved 2023-01-04.
  11. ^ Trader, Tiffany (2018-06-18). "Sandia to Take Delivery of World's Largest Arm System". HPCwire. Retrieved 2023-01-04.
  12. ^ a b "HPE Rolls Out The Machine Prototype, Its Version of the Future of Computing". Data Center Knowledge | News and analysis for the data center industry. 2017-05-16. Retrieved 2023-01-04.
  13. ^ "HPE shows off The Machine prototype without memistors". www.reseller.co.nz. Retrieved 2023-01-04.
  14. ^ Coughlin, Tom. "HPE 's The Machine, Secure Computing And Intelligent Edges". Forbes. Retrieved 2023-01-04.
  15. ^ "HPE's 'The Machine' computer prototype has 160TB of memory". BetaNews. 2017-05-18. Retrieved 2023-01-04.
  16. ^ Mellor, Chris. "RIP HPE's The Machine product, 2014-2016: We hardly knew ye". www.theregister.com. Retrieved 2023-01-04.
  17. ^ "HP's The Machine Open Source OS: Truly Revolutionary – Channel Futures". 2022-01-21. Archived from the original on 2022-01-21. Retrieved 2023-01-04.
  18. ^ "HP reveals more details about The Machine: Linux++ OS coming 2015, prototype in 2016 | ExtremeTech". www.extremetech.com. Retrieved 2023-01-04.
  19. ^ FabricAttachedMemory/linux-l4fame, Fabric-Attached Memory, 2017-11-16, retrieved 2023-01-04
  20. ^ Pirzada, Usman (2014-12-21). "The Machine with Open Source Carbon OS is the Next Big Thing - if HP can deliver". Wccftech. Retrieved 2023-01-11.
  21. ^ Morgan, Timothy Prickett (2016-02-01). "Operating Systems, Virtualization, And The Machine". The Next Platform. Retrieved 2023-01-10.
  22. ^ a b Roszczyk, William (2014-12-09). "HP to launch "revolutionary" computer and OS". The Recycler. Retrieved 2023-01-04.
  23. ^ Niccolai, James (2014-06-12). "Dell executive says HP's new Machine architecture is 'laughable'". Network World. Retrieved 2023-01-04.
  24. ^ "Rack Scalable OS for The Machine and the Case for Capabilities" (PDF).
  25. ^ Mellor, Chris. "RIP HPE's The Machine product, 2014-2016: We hardly knew ye". www.theregister.com. Retrieved 2023-01-04.
  26. ^ "Billion node graph inference: iterative processing on The Machine" (PDF). 2017-05-08. Archived from the original (PDF) on 2017-05-08. Retrieved 2023-01-04.
  27. ^ Donnell, Peter (5 December 2016). "HP 'The Machine' Supercomputer Is 8000x Faster Than a PC". Eteknix.
  28. ^ The Librarian File System (LFS) Suite, Fabric-Attached Memory, 2022-03-13, retrieved 2023-01-10
  29. ^ Hsu, Terry Ching-Hsiang; Brügner, Helge; Roy, Indrajit; Keeton, Kimberly; Eugster, Patrick (2017-04-23). "NVthreads". Proceedings of the Twelfth European Conference on Computer Systems. EuroSys '17. New York, NY, USA: Association for Computing Machinery. pp. 468–482. doi:10.1145/3064176.3064204. ISBN 978-1-4503-4938-3.
  30. ^ "Memory-Driven Computing | USENIX". www.usenix.org. Retrieved 2023-01-04.
  31. ^ Chakrabarti, Dhruva R.; Boehm, Hans-J.; Bhandari, Kumud (2014-10-15). "Atlas: leveraging locks for non-volatile memory consistency". ACM SIGPLAN Notices. 49 (10): 433–452. doi:10.1145/2714064.2660224. ISSN 0362-1340. S2CID 234775584.
  32. ^ Atlas: Programming for Persistent Memory, Hewlett Packard Enterprise, 2022-08-01, retrieved 2023-01-04
  33. ^ Morgan, Timothy Prickett (2016-02-08). "Non Volatile Heaps And Object Stores In The Machine". The Next Platform. Retrieved 2023-01-04.
  34. ^ Strukov, Dmitri B.; Snider, Gregory S.; Stewart, Duncan R.; Williams, R. Stanley (2008-05-01). "The missing memristor found". Nature. 453 (7191): 80–83. Bibcode:2008Natur.453...80S. doi:10.1038/nature06932. ISSN 0028-0836. PMID 18451858. S2CID 4367148.
  35. ^ a b "With 'The Machine,' HP May Have Invented a New Kind of Computer". Bloomberg.com. 2014-06-11. Retrieved 2023-01-04.
  36. ^ Morgan, Timothy Prickett (2017-11-07). "HPE's Superdome Gets An SGI NUMAlink Makeover". The Next Platform. Retrieved 2023-01-04.
  37. ^ "Big Data: A Monster Machine for Solving Monster-sized Data Problems | Formtek Blog". Retrieved 2023-01-04.
  38. ^ Bhartiya, Swapnil (2016-06-08). "Linux Leader Bdale Garbee Touts Potential of HPE's Newest Open Source Project". Linux.com. Retrieved 2023-01-04.
  39. ^ "Two HPs, One Dream". Bloomberg.com. 2015-04-09. Retrieved 2023-01-04.
  40. ^ "HP Enterprise CTO Martin Fink stepping down". ZDNET. Retrieved 2023-01-04.
  41. ^ Mellor, Chris. "Cutting Hewlett-Packard Labs down to size". www.theregister.com. Retrieved 2023-01-04.
  42. ^ "HPE Moves The Machine into Enterprise Group | TOP500". www.top500.org. Retrieved 2023-01-11.
  43. ^ "What happened to the HP machine? | TechTarget". MicroscopeUK. Retrieved 2023-01-04.
  44. ^ says, Calvin Zito (2016-06-28). "HPE Labs goes all in for The Machine – with John Obeto". VulcanCast. Retrieved 2023-01-04.
  45. ^ "Hewlett Packard Enterprise reveals powerful computer prototype". uk.finance.yahoo.com. 16 May 2017. Retrieved 2023-01-04.
  46. ^ Burt, Jeffrey (2018-06-21). "HPE Boots Up Sandbox Of The Machine For Early Users". The Next Platform. Retrieved 2023-01-04.