Deep Papers
Author: Arize AI
Deep Papers is a podcast series featuring deep dives on today's most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.
Language: en-us
Genres: Mathematics, Science, Technology
Training Large Language Models to Reason in Continuous Latent Space
Tuesday, 14 January, 2025
LLMs have typically been restricted to reasoning in "language space," where chain-of-thought (CoT) is used to solve complex reasoning problems. But a new paper argues that language space may not always be the best medium for reasoning. In this paper read, we cover an exciting new technique from a team at Meta called Chain of Continuous Thought, also known as "Coconut." The paper, "Training Large Language Models to Reason in a Continuous Latent Space," explores the potential of allowing LLMs to reason in an unrestricted latent space instead of being constrained by natural language tokens.
Read a full breakdown of Coconut on our blog.
Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
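To make the core idea concrete, here is a minimal, illustrative PyTorch sketch of reasoning in continuous latent space: instead of decoding a token at each step and embedding it, the model's last hidden state is fed back directly as the next input embedding. The TinyDecoder model, its dimensions, and the number of latent steps are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
# A minimal sketch of the "continuous thought" idea, assuming a generic
# decoder whose forward pass maps input embeddings to hidden states.
# Model size, vocab, and the number of latent steps are hypothetical.
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Stand-in for an LLM decoder: embeds tokens, runs causal self-attention,
    and projects hidden states back to vocabulary logits."""
    def __init__(self, vocab_size=100, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward_embeds(self, input_embeds):
        # Causal mask so each position only attends to earlier positions.
        seq_len = input_embeds.size(1)
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
        return self.backbone(input_embeds, mask=mask)

model = TinyDecoder()
prompt_ids = torch.randint(0, 100, (1, 8))   # toy prompt
embeds = model.embed(prompt_ids)             # (1, 8, d_model)

# Standard CoT would decode a token here and embed it for the next step.
# The continuous-thought approach instead appends the last hidden state
# as the next input embedding, keeping reasoning in latent space.
num_latent_steps = 4
for _ in range(num_latent_steps):
    hidden = model.forward_embeds(embeds)
    continuous_thought = hidden[:, -1:, :]   # last hidden state
    embeds = torch.cat([embeds, continuous_thought], dim=1)

# After the latent "thoughts," switch back to language mode for the answer.
logits = model.lm_head(model.forward_embeds(embeds)[:, -1, :])
answer_token = logits.argmax(dim=-1)
```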