The IISA Webinar Committee presents a webinar on 'Key Ingredients of the Transformer Revolution'
11/21/2024 | 10:00 - 11:00 AM (EST) / 8:30 - 9:30 PM (IST)
Registration (free but required): https://weillcornell.zoom.us/webinar/register/WN_rodc9ctvRwCmQNDkId9png
Abstract: The rise of the transformer architecture, first proposed in 2017 in a paper aptly titled "Attention Is All You Need", has been spectacular. It has revolutionized statistical natural language processing and captured the public imagination by powering large language models (LLMs). LLMs are the result of several intellectual threads coming together at the right time, assisted by the availability of massive amounts of data and computational resources. The first idea, which goes all the way back to Shannon, is modeling language as a stochastic process. The second is the idea of training word embeddings that capture some aspect of word meaning. This is related to the distributional hypothesis in linguistics, championed by linguists such as Zellig Harris. The third is the idea of using neural networks to build language models. A seminal contribution here came from Yoshua Bengio and collaborators. The fourth is the use of a specialized architecture, namely the transformer, based on what are called self-attention layers. This webinar will discuss these key ingredients, along with speculations on how computational, statistical, and mathematical theory can help us better understand transformers and LLMs.
Bio: Ambuj Tewari, PhD, is a Professor and Director of the Master's Program in Data Science at the Department of Statistics and Department of EECS at the University of Michigan. Dr. Tewari's primary research area is machine learning. His work spans the mathematical foundations of machine learning algorithms as well as discovering new application areas for machine learning. Recently, his research has focused on topics such as statistical learning theory, online learning, bandit problems, reinforcement learning, high-dimensional statistics, and large-scale learning optimization. He is also applying machine learning to problems in behavioral sciences, computational chemistry, computational biology, learning sciences, and complex networks.
A flyer is attached to this post and is also available for download at this link: https://www.dropbox.com/scl/fi/jgeoxyamcrf385vt53aw7/IISA-Flyer-Nov-2024.pdf?rlkey=vc3s6b33mu2cjcdulme4q8h8m&dl=0
------------------------------
Himel Mallick, PhD, FASA
Principal Investigator (Tenure-track Faculty)
Cornell University
New York, New York 10065
------------------------------