10th Indian Delegation to Dubai, Gitex & Expand North Star – World’s Largest Startup Investor Connect
Tech

Stable Diffusion 3 Is Here, AI Image Generation That Beats MidJourney, Dall-E 3 and Google ImageFX


Stability AI has just announced the early preview of its next-generation image tool, Stable Diffusion 3 (SD3), calling it its “most capable text-to-image model” to date. The announcement is a solid follow-up to the company’s release of Stable Diffusion XL (SDXL) last year, which quickly established itself as the most advanced open-source image generator.

The headline improvements delivered with SD3 are better text generation, strong prompt adherence, and resistance to prompt leaking—the latter strengths ensuring that generated images match what was requested. Stability AI has also highlighted SD3 support of multimodal input—promising to demonstrate it via a future technical report.

The AI community has responded with enthusiasm to the SD3 news.

“This AI image generator is the best we’ve ever seen in terms of prompt understanding and text generation,” said MattVidPro, a prominent AI-focused YouTuber. “It is leaps above the rest, and it’s truly mind blowing.”

Similarly, machine learning engineer Ralph Brooks said the model’s text generation capabilities were “amazing.”

Side-by-side showdown

Although Stable Diffusion 3 is only available to select partners right now, Stability AI and AI enthusiasts are sharing comparisons between its output and the result of similar prompts from SDXL, MidJourney, and Dall-E 3. By all appearances, SD3 outperforms its competitors in overall quality, and Decrypt ran some of its own tests to verify this. The results speak for themselves:

SD3 vs MidJourney

Prompt: “Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says ‘Stable Diffusion 3’ made out of colorful energy.”

Stable Diffusion 3 (left) vs MidJourney (right) using the same prompt. Image: Decrypt

In our first comparison, SD3 followed the prompt very closely. MidJourney failed at the prompt generation, didn’t generate a mountain, and the wizard was not casting a cosmic spell.

SD3 vs ImageFX

Prompt: “Photo of a 90’s desktop computer on a work desk. On the computer screen it says ‘welcome.’ On the wall in the background we see beautiful graffiti with the text ‘SD3’ very large on the wall.”

Stable Diffusion 3 (left) vs ImageFX (right) using the same prompt. Image: Decrypt

In our second comparison, SD3 followed the prompt with remarkable adherence, whereas Google’s top AI image generator, ImageFX, suffered from prompt leaking, generating the text SD3 on the computer screen and not in the background, failing to heed the graffiti style request, and failing to depict the word “welcome.”

The aesthetics generated by SD3 are also more like a photograph and less of an obvious “photorealistic” render. Note the effects surrounding the pencil holder and other items, which seem to be blending with the background.

SD3 vs SDXL

Prompt: “Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.”

Stable Diffusion 3 (left) vs Dall-e 3 (right) using the same prompt. Image: Decrypt

In our third comparison, both Stable Diffusion 3 and Stable Diffusion XL captured the essence of the prompt, but SDXL failed at generating the text, suffered from prompt leaking (generating two cloths, one of which morphed into something else), and the embroidered baby tiger was generated better by SD3.

SD3 vs Dall-e 3

Prompt: “A painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words ‘stable diffusion.’”

Stable Diffusion 3 (left) vs SDXL (right) using the same prompt. Image: Decrypt

Stable Diffusion 3 generated what was requested in the prompt, whereas Dall-e 3 failed to generate text, created a 3D render instead of a painting, and generated a galaxy background just because it was prompted to generate an astronaut.

Beneath the hood

In theory, Stable Diffusion 3 should have enough computing power to back up its claims of power and prowess.

“(SD3) uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements,“ Emad Mostaque, CEO of Stability AI, said on Twitter. Sora is the top-of-the-line text to video generator announced by OpenAI a few days ago. Flow Matching, meanwhile, is an AI technique for generative modeling based on faster and more stable training and inference than alternative methods, like generative adversarial networks (GANs).

Stability AI claims these improvements boost the model’s scalability and ability to accept multimodal inputs, and also pave the way for its application in video, 3D, and more. Mostaque tweeted that his vision for SD3 includes a comprehensive ecosystem of tools designed to leverage the latest hardware advancements while at the same time remain accessible and adaptable across various creative domains.

A week prior to the SD3 announcement, Stability AI released Stable Cascade. Unlike its predecessors, Stable Cascade is based on the Würstchen architecture, known for its modularity and record compression achievements. Despite hosting more parameters than Stable Diffusion XL, Stable Cascade boasts faster inference times and superior prompt alignment, showcasing the innovative strides Stability AI continues to make in AI development.

Although Stable Diffusion 3 is not yet publicly available, Stability AI emphasized it would be free, open source, and available to all under a non-commercial license. However, enthusiasts can apply for preview access as part of Stability AI’s membership program.

Edited by Ryan Ozawa.

Stay on top of crypto news, get daily updates in your inbox.





Source link

by Siliconluxembourg

Would-be entrepreneurs have an extra helping hand from Luxembourg’s Chamber of Commerce, which has published a new practical guide. ‘Developing your business: actions to take and mistakes to avoid’, was written to respond to  the needs and answer the common questions of entrepreneurs.  “Testimonials, practical tools, expert insights and presentations from key players in our ecosystem have been brought together to create a comprehensive toolkit that you can consult at any stage of your journey,” the introduction… Source link

by WIRED

B&H Photo is one of our favorite places to shop for camera gear. If you’re ever in New York, head to the store to check out the giant overhead conveyor belt system that brings your purchase from the upper floors to the registers downstairs (yes, seriously, here’s a video). Fortunately B&H Photo’s website is here for the rest of us with some good deals on photo gear we love. Save on the Latest Gear at B&H Photo B&H Photo has plenty of great deals, including Nikon’s brand-new Z6III full-frame… Source link

by Gizmodo

Long before Edgar Wright’s The Running Man hits theaters this week, the director of Shaun of the Dead and Hot Fuzz had been thinking about making it. He read the original 1982 novel by Stephen King (under his pseudonym Richard Bachman) as a boy and excitedly went to theaters in 1987 to see the film version, starring Arnold Schwarzenegger. Wright enjoyed the adaptation but was a little let down by just how different it was from the novel. Years later, after he’d become a successful… Source link