10th Indian Delegation to Dubai, Gitex & Expand North Star – World’s Largest Startup Investor Connect
Tech

Apple AI researchers boast useful on-device model that ‘substantially outperforms’ GPT-4


Siri has recently been attempting to describe images received in Messages when using CarPlay or the announce notifications feature. In typical Siri fashion, the feature is inconsistent and with mixed results.

Nevertheless, Apple forges ahead with the promise of AI. In a newly published research paper, Apple’s AI gurus describe a system in which Siri can do much more than try to recognize what’s in an image. The best part? It thinks one of its models for doing this benchmarks better than ChatGPT 4.0.

In the paper (ReALM: Reference Resolution As Language Modeling), Apple describes something that could give a large language model-enhanced voice assistant a usefulness boost. ReALM takes into account both what’s on your screen and what tasks are active. Here’s a snippet from the paper that describes the job:

1. On-screen Entities: These are entities that are currently displayed on a user’s screen

2. Conversational Entities: These are entities relevant to the conversation. These entities might come from a previous turn for the user (for example, when the user says “Call Mom”, the contact for Mom would be the relevant entity in question), or from the virtual assistant (for example, when the agent provides a user a list of places or alarms to choose from).

3. Background Entities: These are relevant entities that come from background processes that might not necessarily be a direct part of what the user sees on their screen or their interaction with the virtual agent; for example, an alarm that starts ringing or music that is playing in the background.

If it works well, that sounds like a recipe for a smarter and more useful Siri. Apple also sounds confident in its ability to complete such a task with impressive speed. Benchmarking is compared against OpenAI’s ChatGPT 3.5 and ChatGPT 4.0:

As another baseline, we run the GPT-3.5 (Brown et al., 2020; Ouyang et al., 2022) and GPT-4 (Achiam et al., 2023) variants of ChatGPT, as available on January 24, 2024, with in-context learning. As in our setup, we aim to get both variants to predict a list of entities from a set that is available. In the case of GPT-3.5, which only accepts text, our input consists of the prompt alone; however, in the case of GPT-4, which also has the ability to contextualize on images, we provide the system with a screenshot for the task of on-screen reference resolution, which we find helps substantially improve performance.

So how does Apple’s model do?

We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it.

Substantially outperforming it, you say? The paper concludes in part as follows:

We show that ReaLM outperforms previous ap- proaches, and performs roughly as well as the state- of-the-art LLM today, GPT-4, despite consisting of far fewer parameters, even for onscreen references despite being purely in the textual domain. It also outperforms GPT-4 for domain-specific user utterances, thus making ReaLM an ideal choice for a practical reference resolution system that can exist on-device without compromising on performance.

On-device without compromising on performance seems key for Apple. The next few years of platform development should be interesting, hopefully, starting with iOS 18 and WWDC 2024 on June 10.

FTC: We use income earning auto affiliate links. More.



Source link

by Siliconluxembourg

Would-be entrepreneurs have an extra helping hand from Luxembourg’s Chamber of Commerce, which has published a new practical guide. ‘Developing your business: actions to take and mistakes to avoid’, was written to respond to  the needs and answer the common questions of entrepreneurs.  “Testimonials, practical tools, expert insights and presentations from key players in our ecosystem have been brought together to create a comprehensive toolkit that you can consult at any stage of your journey,” the introduction… Source link

by WIRED

B&H Photo is one of our favorite places to shop for camera gear. If you’re ever in New York, head to the store to check out the giant overhead conveyor belt system that brings your purchase from the upper floors to the registers downstairs (yes, seriously, here’s a video). Fortunately B&H Photo’s website is here for the rest of us with some good deals on photo gear we love. Save on the Latest Gear at B&H Photo B&H Photo has plenty of great deals, including Nikon’s brand-new Z6III full-frame… Source link

by Gizmodo

Long before Edgar Wright’s The Running Man hits theaters this week, the director of Shaun of the Dead and Hot Fuzz had been thinking about making it. He read the original 1982 novel by Stephen King (under his pseudonym Richard Bachman) as a boy and excitedly went to theaters in 1987 to see the film version, starring Arnold Schwarzenegger. Wright enjoyed the adaptation but was a little let down by just how different it was from the novel. Years later, after he’d become a successful… Source link