NPU Model Conversion

This section covers converting your own Stable Diffusion checkpoints into NPU-compatible assets that Local Dream can load on supported Snapdragon devices.

When You Need This

✅ You want to run a custom SD1.5 or SDXL checkpoint on the NPU path.
❌ You want to run a custom SD1.5 checkpoint on the CPU/GPU path. The app supports this directly, with no host-side conversion required.

Available Workflows

Workflow	Status	Guide
SD1.5 → NPU	Stable	SD1.5 Conversion Guide
SDXL → NPU	Experimental	SDXL Conversion Guide

What to Expect

Conversion is host-side, not on-device. You will need a Linux or WSL machine.
The pipeline produces W8A16-quantized QNN binaries packaged into a zip that the app imports.
For SD1.5 you build one zip per chip tier (_min / _8gen1 / _8gen2). For SDXL there is only one chip tier (_8gen3).
A single SD1.5 conversion run takes several hours of CPU time. SDXL takes substantially longer.
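
To make the per-tier packaging concrete, the sketch below lists the zip files each workflow would be expected to yield. The tier suffixes are taken from this page. The function name `expected_zips`, the `model_name` parameter, and the `<model_name><suffix>.zip` naming pattern are assumptions made for illustration and may not match the names the conversion scripts actually emit.

```python
# Chip-tier suffixes as listed on this page.
CHIP_TIERS = {
    "sd15": ["_min", "_8gen1", "_8gen2"],
    "sdxl": ["_8gen3"],
}


def expected_zips(workflow: str, model_name: str) -> list[str]:
    """Return one hypothetical zip name per chip tier for the given workflow."""
    try:
        tiers = CHIP_TIERS[workflow]
    except KeyError:
        raise ValueError(f"unknown workflow: {workflow!r}") from None
    # Assumed naming pattern: <model_name><tier suffix>.zip
    return [f"{model_name}{suffix}.zip" for suffix in tiers]
```

Under these assumptions, `expected_zips("sd15", "mymodel")` yields three names, one per SD1.5 chip tier, while the SDXL workflow yields a single `_8gen3` name.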

Why two QNN SDK versions?

The conversion scripts pin QNN SDK 2.28, but the Android app itself ships with QNN SDK 2.39 as its runtime. This is intentional: 2.28 is the version known to produce correct quantized binaries for the conversion pipelines in this guide, while the runtime stays current. You do not need 2.39 to convert models, and you should not mix versions inside a single conversion run.
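
A guard along the following lines could catch accidental version mixing before a long run starts. It is only a sketch: how the version strings are collected (for example from the SDK install path or a tool's version output) is not specified on this page, so the function takes them as plain strings, and the names `major_minor` and `assert_single_pinned_version` are hypothetical.

```python
# The conversion pipeline on this page pins QNN SDK 2.28.
PINNED_QNN = (2, 28)


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a dotted version string such as '2.28.0'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def assert_single_pinned_version(versions: list[str]) -> None:
    """Raise if any collected version is not the pinned major.minor."""
    found = {major_minor(v) for v in versions}
    if found != {PINNED_QNN}:
        raise RuntimeError(
            f"expected only QNN {PINNED_QNN[0]}.{PINNED_QNN[1]}.x, "
            f"found: {sorted(found)}"
        )
```

Calling `assert_single_pinned_version(["2.28.0", "2.39.0"])` raises, because a 2.39 component has leaked into what should be a 2.28-only run.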

Hardware Requirements

Workflow	RAM + swap	Disk	GPU
SD1.5 @ 512×512	~20 GB	~30 GB	optional
SD1.5 @ higher resolutions	64 GB+	60 GB+	optional
SDXL @ 1024×1024	64 GB+	60 GB+	optional

A CUDA-enabled GPU is optional: it only speeds up the data preparation phase. The actual quantization runs on the CPU.
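
Before committing to a multi-hour run, it can help to compare the host against the table above. The following is a minimal preflight sketch for Linux or WSL, assuming `/proc/meminfo` is available. The workflow keys, the function names, and the choice to treat the table's GB figures as GiB are illustrative assumptions, and the thresholds are the approximate values from the table rather than hard limits enforced by the conversion scripts.

```python
import shutil

GIB = 1024 ** 3

# Approximate minimums from the table above: (RAM + swap, free disk).
# "64 GB+" and "60 GB+" are treated as lower bounds.
REQUIREMENTS_GB = {
    "sd15_512": (20, 30),
    "sd15_hires": (64, 60),
    "sdxl_1024": (64, 60),
}


def parse_meminfo(text: str) -> float:
    """Return MemTotal + SwapTotal in GiB from /proc/meminfo-style text."""
    totals = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("MemTotal", "SwapTotal"):
            totals[key] = int(rest.split()[0])  # values are reported in kB
    total_kb = totals.get("MemTotal", 0) + totals.get("SwapTotal", 0)
    return total_kb * 1024 / GIB


def check_host(workflow: str, work_dir: str = ".") -> list[str]:
    """Return a list of human-readable shortfalls; empty means the host looks OK."""
    need_mem, need_disk = REQUIREMENTS_GB[workflow]
    with open("/proc/meminfo") as f:
        have_mem = parse_meminfo(f.read())
    have_disk = shutil.disk_usage(work_dir).free / GIB
    problems = []
    if have_mem < need_mem:
        problems.append(f"RAM + swap: {have_mem:.1f} GiB < {need_mem} GB")
    if have_disk < need_disk:
        problems.append(f"free disk: {have_disk:.1f} GiB < {need_disk} GB")
    return problems
```

For example, `check_host("sdxl_1024", "/path/to/workdir")` returns an empty list when both thresholds are met, and otherwise one message per shortfall. Adding swap is one way to close a RAM gap, since the table counts RAM and swap together.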

Skip the Conversion?

If you just want a model that works without the conversion overhead, check the pre-converted community collections first. Many popular SD1.5 and SDXL checkpoints are already available there.
