TOSS (operating system)

The Tri-Lab Operating System Stack (TOSS) is a Linux distribution based on Red Hat Enterprise Linux (RHEL) that was created to provide a software stack[1] for high-performance computing (HPC) clusters[2] at laboratories within the National Nuclear Security Administration (NNSA).[3] The operating system allows multiple smaller systems to emulate an HPC platform.[1]

Tri-Lab Operating System Stack
OS family: Unix-like
Working state: Current
Package manager: RPM Package Manager[1]

The "Tri-Lab" in the name refers to the three major NNSA laboratories: Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories.[4]

The OS is used on NNSA computers, including the El Capitan supercomputer,[5] and on systems using the ARM architecture, including those built on the ThunderX2 system on a chip (SoC).[6] Beyond the NNSA,[2] most of the systems in NASA's High-End Computing Capability Project, part of the NASA Advanced Supercomputing Division, were migrated to TOSS in March 2022.[7]

Many of the software packages included in TOSS come from the RHEL repository. Additional packages are built as RPM packages using Fedora's Koji build system.[1] The system also uses the Slurm and Flux software for job scheduling and resource management.[1]

References

  1. de Supinski, Bronis R. (August 29, 2019). The LLNL Near and Long Term Vision for Large-Scale Systems (PDF) (Report). Retrieved August 28, 2022.
  2. "TOSS: Speeding Up Commodity Cluster Computing". Lawrence Livermore National Laboratory. Retrieved August 28, 2022.
  3. León, Edgar A.; D'Hooge, Trent; Hanford, Nathan; Karlin, Ian; Pankajakshan, Ramesh; Foraker, Jim; Chambreau, Chris; Leininger, Matthew L. (November 2020). TOSS-2020: a commodity software stack for HPC. SC '20: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Atlanta, Georgia. pp. 1–15. ISBN 978-1-7281-9998-6. OCLC 1223541587 – via IEEE.
  4. Morgan, Timothy Prickett (November 26, 2018). "One Linux stack to rule HPC and AI". NextPlatform.com. Retrieved August 28, 2022.
  5. Степин, Алексей (June 23, 2022). "2-Эфлопс cуперкомпьютер El Capitan получит новейшие APU AMD MI300" [2-Eflops El Capitan supercomputer will receive the latest AMD MI300 APUs]. ServerNews.ru (in Russian). Retrieved August 29, 2022. В El Capitan лаборатория перейдет от использования проприетарного системного и управляющего ПО к собственному стеку NNSA Tri-Lab Operating System Stack (TOSS). [At El Capitan, the laboratory will move from using proprietary system and management software to its own NNSA Tri-Lab Operating System Stack (TOSS).]
  6. Feldman, Michael (June 18, 2018). "Sandia to Install First Petascale Supercomputer Powered by ARM Processors". Top500. Retrieved August 29, 2022.
  7. "Migration to TOSS Operating System". NASA. July 20, 2022. Retrieved August 28, 2022.