Artificial intelligence researcher, co-author of "Attention Is All You Need"
Ashish Vaswani is a computer scientist working in deep learning,[1] known for his contributions to artificial intelligence (AI) and natural language processing (NLP). He is one of the co-authors of the seminal paper "Attention Is All You Need",[2] which introduced the Transformer model, an architecture built on a self-attention mechanism that has since become foundational to many state-of-the-art models in NLP. The Transformer architecture is the core of the language models that power applications such as ChatGPT.[3][4][5] He was a co-founder of Adept AI Labs[6][7] and is a former staff research scientist at Google Brain.[8][9]
Career
Vaswani completed his engineering degree in computer science at BIT Mesra in 2002. In 2004, he moved to the United States to pursue graduate studies at the University of Southern California,[10] where he completed his PhD.[11] He worked as a researcher at Google,[12] where he was part of the Google Brain team. He co-founded Adept AI Labs but later left the company.[13][14]
Notable works
Vaswani's most notable work is the paper "Attention Is All You Need", published in 2017.[15] The paper introduced the Transformer model, which dispenses with recurrence in sequence-to-sequence tasks and relies entirely on self-attention mechanisms. The model has been instrumental in the development of several subsequent state-of-the-art NLP models, including BERT,[16] GPT-2, and GPT-3.
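As a rough illustration of what a self-attention mechanism computes, the following minimal sketch implements single-head scaled dot-product attention in plain Python. It makes a simplifying assumption that is not how the paper's model is built: queries, keys, and values are the raw input vectors, with no learned projections, masking, or multiple heads. The function name `self_attention` and the toy inputs are hypothetical.

```python
import math


def self_attention(x):
    """Single-head scaled dot-product self-attention over a list of vectors.

    Simplifying assumption: queries, keys, and values are all the raw
    inputs; there are no learned projections, masking, or multiple heads.
    """
    d = len(x[0])  # dimensionality of each token vector
    output = []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x
        ]
        # Numerically stable softmax turns scores into weights that sum to 1.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output vector is a weighted average of all value vectors.
        output.append(
            [sum(w * v[i] for w, v in zip(weights, x)) for i in range(d)]
        )
    return output


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy 2-d token vectors
attended = self_attention(tokens)
```

Each output position is computed from all input positions in a single step, so no recurrent state has to be carried along the sequence.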
References
1. "Ashish Vaswani". scholar.google.com. Retrieved 2023-07-11.
2. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (2017). "Attention is All you Need" (PDF). Advances in Neural Information Processing Systems. 30. Curran Associates, Inc.
3. "Inside the brain of ChatGPT". stackbuilders.com. Retrieved 2023-07-12.
4. "Understanding ChatGPT as explained by ChatGPT". Advancing Analytics. 2023-01-18. Retrieved 2023-07-12.
5. Seetharaman, Deepa; Jin, Berber (2023-05-08). "ChatGPT Fever Has Investors Pouring Billions Into AI Startups, No Business Plan Required". Wall Street Journal. ISSN 0099-9660. Retrieved 2023-07-12.
6. "Introducing Adept".
7. "Top ex-Google AI researchers raise $8 million in funding from Thrive Capital". The Economic Times. 2023-05-04.
8. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (2017-05-21). "Attention is All You Need". arXiv:1706.03762 [cs.CL].
9. Shead, Sam (2022-06-10). "A.I. gurus are leaving Big Tech to work on buzzy new start-ups". CNBC. Retrieved 2023-07-12.
10. Team, OfficeChai (2023-02-04). "The Indian Researchers Whose Work Led To The Creation Of ChatGPT". OfficeChai.
11. "Ashish Vaswani's webpage at ISI". www.isi.edu.
12. "Transformer: A Novel Neural Network Architecture for Language Understanding". ai.googleblog.com. 2017-08-31.
13. Rajesh, Ananya Mariam; Hu, Krystal (2023-03-16). "AI startup Adept raises $350 mln in fresh funding". Reuters.
14. Tong, Anna; Hu, Krystal (2023-05-04). "Top ex-Google AI researchers raise funding from Thrive Capital". Reuters. Retrieved 2023-07-11.
15. "USC Alumni Paved Path for ChatGPT". USC Viterbi | School of Engineering.
16. Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2019-05-24). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". arXiv:1810.04805 [cs.CL].