Llemma is Here, An Open Language Model For Mathematics

Researchers from EleutherAI have introduced Llemma, an open language model designed for mathematics, along with the Proof-Pile-2 dataset. The project, built by continuing the pretraining of Code Llama on mathematical text, has drawn significant attention in the academic and research community.

Check out the GitHub repository here.

Llemma is available in 7-billion and 34-billion parameter versions. According to the authors, it outperforms all other open base models on mathematical problem solving and, at similar model scale, also exceeds Google’s closed Minerva model. Notably, the 34-billion-parameter Llemma approaches the performance of the 62-billion-parameter Minerva with roughly half the parameters.

We release Llemma: open LMs for math trained on up to 200B tokens of mathematical text.

The performance of Llemma 34B approaches Google’s Minerva 62B despite having half the parameters.

Models/data/code: https://t.co/zFvKHrK7t3
Paper: https://t.co/gGgyFQX8sA

— Zhangir Azerbayev (@zhangir_azerbay) October 17, 2023

Llemma not only parallels Minerva, a closed model that Google Research built specifically for mathematics, but also exceeds Minerva’s problem-solving performance on an equi-parameter basis, that is, when models of similar size are compared. Llemma’s capabilities also extend to a broader range of tasks, including tool use and formal mathematics, which further distinguishes it among mathematical language models.

Zhangir Azerbayev, the lead author of the paper, explains that work on Llemma began with assembling a large dataset of mathematical text: the ArXiv subset of RedPajama, the recent OpenWebMath dataset, and AlgebraicStack, a new code dataset tailored specifically to mathematics. Together these sources provide 55 billion unique tokens of training data.

Llemma’s models were initialized with Code Llama weights and then trained on 256 A100 GPUs on Stability AI’s Ezra cluster. The 7-billion-parameter model was trained for 200 billion tokens, using 23,000 A100-hours, while the 34-billion-parameter model was trained for 50 billion tokens, using 47,000 A100-hours.
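To put the training figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes all 256 GPUs ran concurrently for the whole run, which the article does not state, so the wall-clock numbers are only indicative. The `summarize` helper and the `runs` table are illustrative names, not part of the Llemma codebase.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
GPUS = 256  # assumption: all 256 A100s were used concurrently for each run

runs = {
    "Llemma 7B": {"tokens_b": 200, "a100_hours": 23_000},
    "Llemma 34B": {"tokens_b": 50, "a100_hours": 47_000},
}

def summarize(run, gpus=GPUS):
    """Return (wall-clock hours, A100-hours per billion tokens) for one run."""
    wall_clock_hours = run["a100_hours"] / gpus
    hours_per_billion_tokens = run["a100_hours"] / run["tokens_b"]
    return wall_clock_hours, hours_per_billion_tokens

for name, run in runs.items():
    wall, per_b = summarize(run)
    print(f"{name}: ~{wall:.0f} h wall-clock, {per_b:.0f} A100-hours per billion tokens")
```

Under that assumption, the 34B model costs about eight times more compute per token than the 7B model, which is consistent with it being trained on a quarter as many tokens.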

Beyond its strong chain-of-thought performance relative to Minerva at equal parameter counts, Llemma also benefits from majority voting: sampling several solutions to a problem and keeping the most common final answer gives a further boost.
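Majority voting can be illustrated with a minimal sketch. The function below is a generic illustration, not the authors' evaluation code: it assumes the final answer has already been extracted from each sampled solution as a string, and the `samples` list is made-up data.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent final answer among sampled solutions.

    Blank or missing answers are ignored; ties go to the answer seen first.
    """
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    return Counter(cleaned).most_common(1)[0][0]

# Hypothetical final answers extracted from five sampled chain-of-thought solutions.
samples = ["42", "41", "42", " 42 ", "17"]
print(majority_vote(samples))
```

The idea is that independent samples tend to agree on the correct answer more often than on any single wrong one, so the vote filters out occasional reasoning slips.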

The collaborative effort of institutions such as Princeton University, EleutherAI, University of Toronto, Vector Institute, University of Cambridge, Carnegie Mellon University, and University of Washington has culminated in the creation of Llemma.

The post Llemma is Here, An Open Language Model For Mathematics appeared first on Analytics India Magazine.
