Ashish Vaswani (born 1986) is a computer scientist working in deep learning,[1] known for his contributions to artificial intelligence (AI) and natural language processing (NLP). He is a co-author of the seminal paper "Attention Is All You Need",[2] which introduced the Transformer, an architecture built on a self-attention mechanism that has since become foundational to many state-of-the-art NLP models. The Transformer architecture is at the core of the language models that power applications such as ChatGPT.[3][4][5] He co-founded Adept AI Labs[6][7] and was formerly a staff research scientist at Google Brain.[8][9]

Ashish Vaswani
Born 1986
Alma mater
Known for Transformer (deep learning architecture)
Scientific career
Fields
Institutions
Thesis Smaller, Faster, and Accurate Models for Statistical Machine Translation (2014)
Doctoral advisor
  • David Chiang
  • Liang Huang

Career

Vaswani completed his engineering degree in computer science at BIT Mesra in 2002. In 2004, he moved to the US to pursue graduate studies at the University of Southern California,[10] where he earned his PhD under the supervision of Prof. David Chiang.[11] He later worked as a researcher at Google,[12] where he was part of the Google Brain team. He co-founded Adept AI Labs but has since left the company.[13][14]

Notable works

Vaswani's most notable work is the paper "Attention Is All You Need", published in 2017.[15] The paper introduced the Transformer model, which dispenses with recurrence in sequence-to-sequence tasks and relies entirely on self-attention. The model underpins several subsequent state-of-the-art NLP models, including BERT,[16] GPT-2, and GPT-3.

References

  1. ^ "Ashish Vaswani". scholar.google.com. Retrieved 2023-07-11.
  2. ^ Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (2017). "Attention is All you Need" (PDF). Advances in Neural Information Processing Systems. 30. Curran Associates, Inc.
  3. ^ "Inside the brain of ChatGPT". stackbuilders.com. Retrieved 2023-07-12.
  4. ^ "Understanding ChatGPT as explained by ChatGPT". Advancing Analytics. 2023-01-18. Retrieved 2023-07-12.
  5. ^ Seetharaman, Deepa; Jin, Berber (2023-05-08). "ChatGPT Fever Has Investors Pouring Billions Into AI Startups, No Business Plan Required". Wall Street Journal. ISSN 0099-9660. Retrieved 2023-07-12.
  6. ^ "Introducing Adept".
  7. ^ "Top ex-Google AI researchers raise $8 million in funding from Thrive Capital". The Economic Times. May 4, 2023.
  8. ^ Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; Polosukhin, Illia (May 21, 2017). "Attention is All You Need". arXiv:1706.03762 [cs.CL].
  9. ^ Shead, Sam (2022-06-10). "A.I. gurus are leaving Big Tech to work on buzzy new start-ups". CNBC. Retrieved 2023-07-12.
  10. ^ Team, OfficeChai (February 4, 2023). "The Indian Researchers Whose Work Led To The Creation Of ChatGPT". OfficeChai.
  11. ^ "Ashish Vaswani's webpage at ISI". www.isi.edu.
  12. ^ "Transformer: A Novel Neural Network Architecture for Language Understanding". ai.googleblog.com. August 31, 2017.
  13. ^ Rajesh, Ananya Mariam; Hu, Krystal (March 16, 2023). "AI startup Adept raises $350 mln in fresh funding". Reuters – via www.reuters.com.
  14. ^ Tong, Anna; Hu, Krystal (2023-05-04). "Top ex-Google AI researchers raise funding from Thrive Capital". Reuters. Retrieved 2023-07-11.
  15. ^ "USC Alumni Paved Path for ChatGPT". USC Viterbi | School of Engineering.
  16. ^ Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (May 24, 2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". arXiv:1810.04805 [cs.CL].