Why Upstage Builds Small Language Models


LAS VEGAS — Upstage is a South Korean enterprise AI company that builds small language models (SLMs) to help companies solve document processing problems. It started out using optical character recognition (OCR) to scan documents for large corporations in South Korea.

When ChatGPT emerged, customers started asking Upstage about large language models (LLMs). Upstage had provided 95% accuracy using its OCR capability, but customers wanted 100% accuracy. So the Upstage team began looking at models that could meet that higher accuracy requirement. LLMs serve a general purpose, but smaller models were better suited to the narrow focus that document processing requires.

SLMs get little attention, but their uses include providing company-specific or even country-specific language models.

“Customers wanted a language model that was fit for their own use,” said Lucy Park, co-founder and chief product officer, in an interview at AWS re:Invent. “So that’s one of the reasons we started out to build small language models. And so here we are working on document processing engines and large language models.”

Model Merging to Create SLMs

Upstage, an AWS Generative AI Accelerator participant, builds on open source models that can run on a single GPU. Its flagship model, Solar, compares with other small models that also run on a single GPU, including Llama 3.1 8B, Mistral Small Instruct 2409, and LG AI Research’s EXAONE 3.0 7.8B Instruct.

Park said Upstage merges two copies of a small LLM into a larger LLM. For instance, it would turn a 7 billion parameter model into a 10 billion parameter model. “If we have a 14 billion model, we explode that into a 22 billion model,” she said. “So that’s what we have been doing recently.”
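As a rough illustration of how two copies of one model can become a deeper model, the sketch below stacks two copies of a layer list and drops an overlapping slice from each. The article does not describe Upstage's actual procedure, so the function name, the 8-layer base and the overlap of 2 are all hypothetical assumptions chosen for illustration.

```python
from typing import List


def stack_two_copies(layers: List[str], overlap: int) -> List[str]:
    """Build a deeper layer stack from two copies of the same base stack.

    The first copy drops its last `overlap` layers, the second copy drops
    its first `overlap` layers, and the two remainders are concatenated.
    """
    if not 0 <= overlap < len(layers):
        raise ValueError("overlap must be in [0, number of layers)")
    first_copy = layers[:len(layers) - overlap]   # keep the early layers
    second_copy = layers[overlap:]                # keep the late layers
    return first_copy + second_copy


# Hypothetical 8-layer base model; real models have many more layers.
base = [f"block_{i}" for i in range(8)]
deeper = stack_two_copies(base, overlap=2)
print(len(base), "->", len(deeper))
```

The result has more layers than the base but fewer than a full doubling, which is consistent with a 7 billion parameter model growing to roughly 10 billion rather than 14 billion. A stack assembled this way would presumably need further training before it is useful.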

Model merging, a technique for combining LLMs, has gained acceptance in the AI community. Implementations include practices such as weight averaging, a method that merges the parameters of multiple separate models with different capabilities. Model merging allows data scientists “to build a universal model without needing access to the original training data or expensive computation,” according to a paper published in August by researchers from Nanyang Technological University, Northeastern University and Sun Yat-sen University.
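Weight averaging, in its simplest form, can be sketched in a few lines. The example below is a minimal illustration, not the method used by Upstage or by the cited paper: it assumes the models share the same parameter names and shapes, represents each parameter as a flat list of numbers, and uses hypothetical names (`average_weights`, `model_a`, `model_b`).

```python
from typing import Dict, List, Optional

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def average_weights(models: List[Weights],
                    coeffs: Optional[List[float]] = None) -> Weights:
    """Merge models with matching parameters by a (weighted) mean."""
    if not models:
        raise ValueError("need at least one model")
    if coeffs is None:
        coeffs = [1.0] * len(models)          # plain average by default
    if len(coeffs) != len(models):
        raise ValueError("one coefficient per model is required")
    total = sum(coeffs)
    merged: Weights = {}
    for name, first in models[0].items():
        for m in models[1:]:
            if name not in m or len(m[name]) != len(first):
                raise ValueError(f"parameter {name!r} does not match")
        # Element-wise weighted mean across all models.
        merged[name] = [
            sum(c * m[name][i] for c, m in zip(coeffs, models)) / total
            for i in range(len(first))
        ]
    return merged


# Two hypothetical fine-tuned variants of the same base architecture.
model_a = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
model_b = {"layer0.weight": [3.0, 4.0], "layer0.bias": [2.0]}
merged = average_weights([model_a, model_b])
print(merged)
```

No training data is touched at any point, which is the property the quoted paper highlights. Passing `coeffs` lets one model count more heavily than another.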

Park said Upstage has found increases in its benchmarks using a combined model approach. According to the Upstage site, Solar Pro is a small language model that shows a 64% improvement in Eastern Asia language mastery compared to Solar Pro preview.

The improvements in SLMs for languages reflect their growing popularity. SLMs train on smaller data sets, making them flexible for domain-centered approaches like Upstage’s.

Park said large language models focus on general intelligence. Small language models, by contrast, provide a narrower focus.

For example, Upstage built a specific model for the Thai language. For Thai, it is comparable to GPT-4, the OpenAI model.

SLMs also cost a lot less to develop. Hypothetically, Park said, imagine an SLM that costs $10 to build. An LLM that is 10 times bigger may cost $100.

Customers will pursue three options to deploy the models, she said. If they are deploying on-premises models, they can use the Upstage console, which provides APIs through the AWS Marketplace. For example, the Solar Pro model is now available on the Amazon Bedrock Marketplace.

