Apple cuts its AI rendering times in half with new Stable Diffusion fix

[Image: Two examples of Stable Diffusion-generated artwork provided by Apple. Credit: Apple]

On Wednesday, Apple released optimizations that enable the Stable Diffusion AI image generator to run on Apple Silicon using Core ML, Apple’s proprietary machine learning model framework. The optimizations will allow app developers to use Apple Neural Engine hardware to run Stable Diffusion roughly twice as fast as previous Mac-based methods.

Stable Diffusion (SD), launched in August, is an open-source AI image synthesis model that uses text input to generate novel images. For example, if you type “astronaut on a kite” in SD, it usually creates an image of exactly that.
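To make that prompt-to-picture step concrete, here is a minimal sketch of text-to-image generation using the open source Hugging Face diffusers library on a Mac's GPU through PyTorch's "mps" backend. This is the conventional PyTorch route, not Apple's new Core ML path, and the model ID and settings are illustrative assumptions rather than anything Apple specifies.

```python
# Minimal text-to-image sketch with the Hugging Face diffusers library.
# This is the conventional PyTorch route on a Mac ("mps" backend), not
# Apple's new Core ML optimizations; model ID and settings are examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")  # Apple GPU via Metal; use "cuda" on an Nvidia card

# A 512x512 image in 50 denoising steps, matching the benchmarks discussed below.
image = pipe(
    "astronaut on a kite",
    num_inference_steps=50,
    height=512,
    width=512,
).images[0]
image.save("astronaut_on_a_kite.png")
```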

By releasing the new SD optimizations, which are available as conversion scripts on GitHub, Apple aims to unlock the full potential of image synthesis on its devices, as noted in the announcement on its machine learning research site: “With the growing number of Stable Diffusion applications, it’s important to ensure developers can leverage this technology effectively to build apps that creatives can use anywhere.”

Apple also cites privacy and the avoidance of cloud computing costs as benefits of running an AI image generation model locally on a Mac or other Apple device.

“End-user privacy is protected because any data that the user provides as input to the model remains on the user’s device,” Apple says. “Second, after the initial download, users don’t need an internet connection to use the model. Finally, local deployment of this model allows developers to reduce or eliminate their server-related costs.”

Currently, Stable Diffusion generates images fastest on high-end Nvidia GPUs when running locally on a Windows or Linux PC. For example, generating a 512×512 image in 50 steps on an RTX 3060 takes about 8.7 seconds on our machine.

By comparison, the conventional way of running Stable Diffusion on an Apple Silicon Mac is far slower: in our tests, an M1 Mac Mini took about 69.8 seconds to produce a 512×512 image in 50 steps using Diffusion Bee.

According to Apple’s benchmarks on GitHub, the new Core ML SD optimizations can produce a 512×512, 50-step image on an M1 chip in 35 seconds. An M2 does the job in 23 seconds, and Apple’s most powerful Apple Silicon chip, the M1 Ultra, can achieve the same result in just nine seconds. That is a dramatic improvement, roughly halving generation time in the case of the M1.

Apple’s GitHub release is a Python package that converts Stable Diffusion models from PyTorch to Core ML and includes a Swift package for model deployment. The optimizations work for Stable Diffusion 1.4, 1.5, and the newly released 2.0.
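For a sense of what such a conversion involves, here is a heavily simplified sketch of the general PyTorch-to-Core ML workflow using Apple's coremltools library. It traces a tiny stand-in module rather than Stable Diffusion's actual UNet, text encoder, and VAE components, and it is not Apple's script, which handles those pieces (and their inputs and shapes) for you.

```python
# Simplified illustration of a PyTorch -> Core ML conversion with coremltools.
# Apple's package performs this kind of conversion for Stable Diffusion's real
# components; the tiny model below is just a stand-in for illustration.
import torch
import coremltools as ct

class TinyStandIn(torch.nn.Module):
    """Placeholder network standing in for a diffusion model component."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(4, 4, kernel_size=3, padding=1)

    def forward(self, latents):
        return self.conv(latents)

model = TinyStandIn().eval()

# Core ML conversion starts from a traced (or scripted) TorchScript graph.
example_latents = torch.randn(1, 4, 64, 64)  # SD's 512x512 images use 4x64x64 latents
traced = torch.jit.trace(model, example_latents)

# Convert to an ML Program (.mlpackage), the format Core ML apps load, and
# allow it to be scheduled on the CPU, GPU, or Neural Engine.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="latents", shape=example_latents.shape)],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.ALL,
)
mlmodel.save("TinyStandIn.mlpackage")
```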

For now, the experience of setting up Stable Diffusion with Core ML locally on a Mac is aimed at developers and requires some basic command-line knowledge, but Hugging Face has published an in-depth guide to using Apple's Core ML optimizations for those who want to experiment.
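As a small taste of that kind of experimentation, the sketch below loads a converted .mlpackage with coremltools and prints the inputs and outputs it expects. The file name is a hypothetical placeholder for whatever the conversion step produced; the full end-to-end pipeline (tokenization, the denoising loop, and image decoding) is what Apple's Python and Swift packages wrap for you.

```python
# Inspect a converted Core ML model produced by a conversion step like the one above.
# "Stable_Diffusion_unet.mlpackage" is a hypothetical placeholder path.
import coremltools as ct

mlmodel = ct.models.MLModel("Stable_Diffusion_unet.mlpackage")
spec = mlmodel.get_spec()

print("Inputs the model expects:")
for feature in spec.description.input:
    print(" -", feature.name)

print("Outputs it produces:")
for feature in spec.description.output:
    print(" -", feature.name)
```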

For those less tech-savvy, the previously mentioned Diffusion Bee app makes it easy to run Stable Diffusion on Apple Silicon, though it doesn’t yet integrate Apple’s new optimizations. You can also run Stable Diffusion on an iPhone or iPad using the Draw Things app.
