AI Image Diffusion Simulator: Interactive Visualization of Denoising Diffusion Probabilistic Models with Procedural Canvas Generation

Romi Nur Ismanto
Jakarta AI Research Lab, Jakarta, Indonesia
rominur@gmail.com
March 2026

Abstract

We present AI Image Diffusion Simulator, a browser-based interactive educational tool that visualizes the core mechanics of Denoising Diffusion Probabilistic Models (DDPMs). The application enables users to select from 10 procedurally generated animal illustrations and observe the reverse diffusion process—transforming Gaussian noise into coherent images through iterative denoising steps. The simulator exposes key diffusion model hyperparameters including the number of denoising steps, Classifier-Free Guidance (CFG) scale, and manual noise level, allowing users to develop intuition for how these parameters influence generation quality. The system implements procedural image generation using the HTML5 Canvas API, where each animal is rendered algorithmically with custom color palettes and geometric primitives. A timeline carousel captures snapshots at each denoising iteration, providing a visual record of the generation trajectory. The entire application is built as a single HTML file with zero external dependencies using vanilla JavaScript, CSS3, and Canvas API, deployed on Vercel with edge CDN distribution. We describe the diffusion simulation algorithm, procedural rendering pipeline, state machine architecture, and interactive parameter controls.

Keywords: diffusion models, DDPM, denoising, image generation, noise schedule, Classifier-Free Guidance, CFG, procedural generation, Canvas API, interactive visualization, educational tool, zero dependencies

1. Introduction

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as the dominant paradigm for generative image synthesis, powering systems such as DALL-E 2, Stable Diffusion, Imagen, and Midjourney. Despite their widespread impact, the core mechanism—iteratively removing noise from a random sample to reveal a coherent image—remains conceptually opaque to many practitioners, students, and enthusiasts. The mathematical formulation involving variational inference, Markov chains, and noise schedules creates a significant barrier to intuitive understanding.

Existing educational resources for diffusion models predominantly rely on static diagrams, mathematical notation, and code notebooks that require GPU infrastructure. Interactive visualizations that allow users to manipulate diffusion parameters and observe their effects in real-time are notably absent from the educational landscape.

AI Image Diffusion Simulator addresses this gap by providing a browser-based, zero-dependency interactive tool that simulates the reverse diffusion process. Users select a target image (from 10 procedurally generated animal illustrations), configure generation parameters, and observe the step-by-step transformation from noise to image. The key contributions of this work are: (1) an interactive, browser-based simulation of the reverse diffusion process that requires no GPU infrastructure; (2) user-configurable controls for denoising step count, CFG scale, and manual noise level; (3) a fully procedural Canvas rendering pipeline that eliminates external image assets; and (4) a timeline carousel that records the generation trajectory at every denoising step.

2. Related Work

2.1 Denoising Diffusion Probabilistic Models

Ho et al. (2020) formalized DDPMs as a class of latent variable models that learn to reverse a fixed Markov chain that gradually adds Gaussian noise to data. The forward process is defined as:

q(x_t | x_{t−1}) = N(x_t; √(1 − β_t) x_{t−1}, β_t I)

where β_t is the noise schedule controlling the amount of noise added at each step. The reverse process learns to denoise:

p_θ(x_{t−1} | x_t) = N(x_{t−1}; μ_θ(x_t, t), Σ_θ(x_t, t))

Our simulator approximates this reverse process visually by interpolating between a noise image and the target image according to a cosine noise schedule, providing an intuitive analog to the mathematical denoising operation.

2.2 Classifier-Free Guidance

Ho and Salimans (2022) introduced Classifier-Free Guidance (CFG), a technique that interpolates between conditional and unconditional model predictions to control the fidelity-diversity trade-off:

ε_guided = ε_uncond + s · (ε_cond − ε_uncond)

where s is the guidance scale. Higher values produce images that more closely match the conditioning signal (prompt) at the cost of diversity. Our simulator exposes this parameter as a slider (range 1–20), allowing users to observe how CFG scale influences the sharpness and fidelity of the generated output.
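The guidance combination above can be sketched as a pure, element-wise function; a minimal illustration with array-valued noise predictions (names are illustrative, not from any particular library):

```javascript
// Classifier-Free Guidance: move the unconditional prediction toward
// (or past) the conditional one by guidance scale s, element-wise.
function cfgCombine(epsUncond, epsCond, s) {
  return epsUncond.map((eu, i) => eu + s * (epsCond[i] - eu));
}
```

At s = 1 this returns the conditional prediction unchanged; s > 1 extrapolates past it, which is what produces the sharper, more prompt-faithful outputs described above.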

2.3 Educational Visualizations for Deep Learning

Interactive visualizations have proven effective for teaching deep learning concepts. TensorFlow Playground (Smilkov et al., 2017) demonstrates neural network training on 2D datasets. CNN Explainer (Wang et al., 2020) visualizes convolutional neural network operations. However, no comparable interactive tool exists specifically for diffusion models. Our work fills this gap by targeting the diffusion generation process with an accessible, browser-based simulator.

3. System Architecture

The application is architected as an event-driven state machine contained within a single HTML file. Four major subsystems interact through a shared state object and DOM event listeners: the Procedural Renderer, the Diffusion Engine, the Timeline Manager, and the UI Controller.

3.1 Application Flow

Animal Selection → Procedural Canvas Rendering → Noise Generation (Gaussian) → Reverse Diffusion (Iterative Denoising) → Timeline Snapshot Capture → Progress Visualization → Generated Image Display

3.2 State Machine

The application manages state transitions through a finite state machine with the following states:

Table 1: Application state machine
State Description Transitions
IDLE Waiting for animal selection → SELECTED (on card click)
SELECTED Animal chosen, ready to generate → GENERATING (on start), → IDLE (on deselect)
GENERATING Reverse diffusion in progress → COMPLETED (on finish), → SELECTED (on stop)
COMPLETED Generation finished, displaying result → GENERATING (on regenerate), → SELECTED (on new animal)
FORWARD Forward diffusion (adding noise) → COMPLETED (on finish), → SELECTED (on stop)
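The transitions in Table 1 can be encoded as a lookup table keyed by state and event. A minimal sketch follows; the event names are illustrative, not necessarily the project's actual identifiers:

```javascript
// Allowed transitions keyed by state, then by UI event (event names illustrative).
const TRANSITIONS = {
  IDLE:       { select: 'SELECTED' },
  SELECTED:   { start: 'GENERATING', deselect: 'IDLE' },
  GENERATING: { finish: 'COMPLETED', stop: 'SELECTED' },
  COMPLETED:  { regenerate: 'GENERATING', newAnimal: 'SELECTED' },
  FORWARD:    { finish: 'COMPLETED', stop: 'SELECTED' },
};

// Return the next state, or the current state if the event is not allowed.
function transition(state, event) {
  return (TRANSITIONS[state] && TRANSITIONS[state][event]) || state;
}
```

Encoding the state machine as data rather than nested conditionals keeps illegal transitions (e.g. regenerating while IDLE) impossible by construction.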

3.3 Procedural Animal Renderer

Each of the 10 animal illustrations is rendered procedurally using the Canvas 2D API. Rather than loading pre-made image assets, the system draws each animal from geometric primitives (circles, ellipses, arcs, bezier curves) with custom color palettes. This approach eliminates external image dependencies and allows the application to remain fully self-contained.

Table 2: Procedural animal library
Animal Primary Color Key Features Geometric Primitives
Cat Orange/Amber Pointed ears, whiskers, tail Ellipses, bezier curves, lines
Dog Brown/Tan Floppy ears, tongue, collar Arcs, ellipses, rectangles
Bunny White/Pink Long ears, round tail, whiskers Elongated ellipses, circles
Panda Black/White Eye patches, round body Concentric circles, ellipses
Fox Orange/White Pointed ears, bushy tail Triangles, bezier curves
Penguin Black/White Belly patch, beak, flippers Ellipses, arcs, triangles
Hamster Beige/Brown Cheek pouches, small ears Overlapping circles
Owl Brown/Gold Large eyes, beak, feather tufts Concentric circles, triangles
Koala Gray/White Round ears, large nose Circles, ellipses, arcs
Duck Yellow/Orange Flat beak, wing, tail feathers Ellipses, bezier curves

Each animal rendering function accepts a Canvas 2D context and dimensions, then draws the complete illustration using approximately 20–40 drawing calls. Background colors and accent gradients are derived from each animal's custom palette, creating visually cohesive cards.
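As an illustration of this approach, the following sketch draws a heavily simplified cat from ellipses, triangles, and strokes on a Canvas 2D context. The shapes, colors, and function name are illustrative, not the project's actual code:

```javascript
// Illustrative procedural renderer: a minimal "cat" built from
// Canvas 2D primitives (body, head, two ears, three whiskers).
function drawCat(ctx, w, h) {
  ctx.fillStyle = '#f59e0b'; // amber palette, per Table 2
  ctx.beginPath();
  ctx.ellipse(w / 2, h * 0.65, w * 0.28, h * 0.22, 0, 0, 2 * Math.PI); // body
  ctx.fill();
  ctx.beginPath();
  ctx.ellipse(w / 2, h * 0.35, w * 0.18, h * 0.16, 0, 0, 2 * Math.PI); // head
  ctx.fill();
  for (const dir of [-1, 1]) { // pointed ears as triangles
    ctx.beginPath();
    ctx.moveTo(w / 2 + dir * w * 0.08, h * 0.24);
    ctx.lineTo(w / 2 + dir * w * 0.16, h * 0.10);
    ctx.lineTo(w / 2 + dir * w * 0.16, h * 0.26);
    ctx.closePath();
    ctx.fill();
  }
  ctx.strokeStyle = '#fff7ed'; // whiskers as simple strokes
  for (const dy of [-4, 0, 4]) {
    ctx.beginPath();
    ctx.moveTo(w / 2, h * 0.38 + dy);
    ctx.lineTo(w / 2 + w * 0.2, h * 0.36 + dy);
    ctx.stroke();
  }
}
```

A real rendering function in this style stays within the 20–40 drawing calls noted above while remaining resolution-independent, since every coordinate is expressed as a fraction of the canvas dimensions.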

4. Diffusion Simulation Algorithm

4.1 Noise Generation

The noise image is generated by filling each pixel of the canvas with random RGB values sampled from a uniform distribution. While true DDPMs use Gaussian noise, uniform random pixels are visually comparable for educational purposes and avoid the need for a Gaussian sampling implementation.

for (let i = 0; i < imageData.data.length; i += 4) {
    imageData.data[i]     = Math.random() * 255;  // R
    imageData.data[i + 1] = Math.random() * 255;  // G
    imageData.data[i + 2] = Math.random() * 255;  // B
    imageData.data[i + 3] = 255;                   // A
}
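Should true Gaussian pixel noise be desired, the Box–Muller transform provides it without any library support. A hedged sketch (function names are illustrative):

```javascript
// Sample a standard normal variate via the Box-Muller transform.
function gaussian() {
  let u = 0;
  while (u === 0) u = Math.random(); // avoid log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Fill an RGBA pixel buffer with Gaussian noise centred on mid-grey.
function gaussianNoise(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    for (let c = 0; c < 3; c++) {
      const value = 128 + 64 * gaussian();             // mean 128, sigma 64
      pixels[i + c] = Math.min(255, Math.max(0, Math.round(value)));
    }
    pixels[i + 3] = 255; // opaque alpha
  }
}
```

The clamp to [0, 255] truncates the Gaussian tails, which is visually indistinguishable at this scale.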

4.2 Reverse Diffusion (Denoising)

The reverse diffusion process is simulated by progressively blending the noise image with the target animal image over T steps. At each step t, the displayed image is a weighted combination:

x_display(t) = (1 − α_t) · x_noise + α_t · x_target

where α_t follows a cosine schedule from 0 to 1 over T steps. The cosine schedule provides smoother visual transitions than a linear schedule, as it spends more time in the perceptually important middle range where image structure emerges from noise.
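A minimal sketch of the cosine schedule and the per-pixel blend, consistent with the weighted combination above (function names are illustrative):

```javascript
// Cosine blending schedule: alpha rises from 0 to 1 over T steps,
// changing more slowly near the endpoints than a linear ramp.
function cosineAlpha(t, T) {
  return 0.5 * (1 - Math.cos(Math.PI * (t / T)));
}

// Per-pixel blend of noise and target according to alpha (RGBA buffers).
function blendPixels(noise, target, alpha, out) {
  for (let i = 0; i < out.length; i++) {
    out[i] = (1 - alpha) * noise[i] + alpha * target[i];
  }
}
```

At each animation frame the blended buffer is written back to the canvas via putImageData, so the visible image interpolates smoothly from pure noise (α = 0) to the target (α = 1).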

4.3 CFG Scale Simulation

The Classifier-Free Guidance scale modifies the denoising behavior by adjusting how aggressively the noise is removed at each step. In the simulator, higher CFG values increase the contrast between the emerging image and remaining noise, producing sharper but potentially over-saturated results at extreme values (mimicking the real-world behavior of high CFG in Stable Diffusion):

α_effective(t) = clamp(α_t · (1 + log(cfg_scale / 7.5)), 0, 1)

This formulation ensures that at the default CFG of 7.5, the schedule is unmodified. Lower values produce softer, more diverse outputs, while higher values produce sharper, more deterministic results—faithfully representing the behavior of CFG in production diffusion systems.
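One way to realize this behavior in code, assuming a multiplier of exactly one at the default scale of 7.5 (the function name is illustrative):

```javascript
// CFG-adjusted blending weight: unity multiplier at the default scale
// of 7.5, softer below it, sharper above it, clamped to [0, 1].
function effectiveAlpha(alpha, cfgScale) {
  const adjusted = alpha * (1 + Math.log(cfgScale / 7.5));
  return Math.min(1, Math.max(0, adjusted));
}
```

Because the multiplier is logarithmic in the CFG scale, doubling the scale has a fixed additive effect on the blending weight, which keeps the slider's response perceptually even across its 1–20 range.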

4.4 Forward Diffusion

The forward diffusion mode reverses the process, progressively adding noise to the generated image. This demonstrates the noise addition process that training data undergoes during DDPM training. The blending weight transitions from α = 1 (clean image) to α = 0 (pure noise), visually showing how image structure is gradually destroyed.

4.5 Manual Noise Control

A manual noise slider allows users to freely explore any intermediate state along the diffusion trajectory without running the animation. This is particularly valuable for understanding the critical transition points where recognizable structure emerges from noise—typically occurring between 30–60% of the denoising process.

5. User Interface Design

5.1 Visual Design

The interface uses a dark theme with animated gradient background orbs, creating an immersive aesthetic inspired by AI art generation tools. The design system employs CSS custom properties for consistent theming:

Table 3: Visual design parameters
Element Implementation Purpose
Background Dark gradient (#0f0c29 → #302b63 → #24243e) Immersive dark environment for image focus
Animated Orbs CSS keyframe animation on pseudo-elements Ambient depth and visual interest
Animal Cards CSS Grid with hover transforms and glow effects Interactive selection interface
Progress Bar CSS width transition with gradient fill Real-time generation progress feedback
Toast Notifications CSS animation (slide-in, fade-out) Non-intrusive user feedback
Controls Panel Glassmorphism with backdrop-filter blur Parameter adjustment interface

5.2 Interactive Controls

Table 4: User-configurable parameters
Parameter Control Type Range Default Effect
Diffusion Steps Range slider 5 – 50 20 Number of denoising iterations
CFG Scale Range slider 1 – 20 7.5 Prompt adherence / sharpness
Manual Noise Range slider 0% – 100% 0% Explore intermediate diffusion states
Speed Selector buttons Slow / Normal / Fast Normal Animation frame delay

5.3 Timeline Carousel

During the diffusion process, the system captures canvas snapshots at each denoising step and displays them as thumbnail images in a horizontally scrollable carousel below the main canvas. This timeline provides a persistent visual record of the generation trajectory, allowing users to revisit and compare intermediate states after the animation completes.

5.4 Responsive Layout

Table 5: Responsive breakpoints
Viewport Width Layout Adaptation
Desktop ≥ 1024px Full grid layout, large canvas, side-by-side controls
Tablet 768px – 1023px Reduced grid columns, stacked controls
Mobile < 768px Single column, compact cards, touch-optimized sliders

6. Implementation Details

6.1 Technology Stack

Table 6: Core technology stack
Layer Technology Purpose
Markup HTML5 Document structure, semantic elements
Styling CSS3 Animations, gradients, grid layout, glassmorphism, responsive design
Logic Vanilla JavaScript (ES6+) State machine, event handling, animation loop
Rendering Canvas 2D API Procedural animal illustration, noise generation, image blending
Image Processing ImageData API Pixel-level noise/image manipulation
Hosting Vercel Static deployment with edge CDN and CI/CD
Source Control GitHub Version control with automatic Vercel deployment

6.2 Zero-Dependency Architecture

The entire application—HTML structure, CSS styling, procedural rendering functions, diffusion simulation logic, and UI management—resides in a single index.html file with zero external dependencies. No npm packages, no build step, no bundler, no framework. This architecture provides instant deployability, complete transparency (the whole implementation can be read in one file), long-term maintainability, and freedom from the supply chain risks associated with package-based projects.

6.3 Animation Loop

The diffusion animation is driven by setTimeout-based frame scheduling rather than requestAnimationFrame. This design choice allows precise control over animation speed through the speed selector (slow: 500ms, normal: 200ms, fast: 50ms per frame) while maintaining consistent behavior across different display refresh rates. Each frame performs:

  1. Calculate the current blending weight αt based on step count and cosine schedule.
  2. Apply CFG scale modification to the blending weight.
  3. Blend noise and target image pixel data on the canvas.
  4. Capture a snapshot thumbnail for the timeline carousel.
  5. Update the progress bar and step counter display.
  6. Schedule the next frame or transition to COMPLETED state.
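The loop above can be sketched with an injectable scheduler (defaulting to setTimeout), which keeps the timing logic testable; all names are illustrative:

```javascript
const SPEEDS = { slow: 500, normal: 200, fast: 50 }; // ms per frame

// Drive the denoising animation: onFrame(step, alpha) performs the blend,
// snapshot, and progress update; onDone fires the COMPLETED transition.
function runDiffusion(totalSteps, speed, onFrame, onDone, schedule = setTimeout) {
  let step = 0;
  function frame() {
    step += 1;
    const t = step / totalSteps;
    const alpha = 0.5 * (1 - Math.cos(Math.PI * t)); // cosine blending weight
    onFrame(step, alpha);
    if (step < totalSteps) {
      schedule(frame, SPEEDS[speed]); // fixed delay, independent of refresh rate
    } else {
      onDone();
    }
  }
  schedule(frame, SPEEDS[speed]);
}
```

Using a fixed setTimeout delay rather than requestAnimationFrame trades perfect vsync alignment for speeds that are identical on 60 Hz and 144 Hz displays, which matches the design goal stated above.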

6.4 Canvas Rendering Pipeline

The procedural rendering pipeline for each animal follows a layered approach:

Clear Canvas → Draw Background Gradient → Draw Body Shape → Draw Limbs/Appendages → Draw Head → Draw Facial Features (Eyes, Nose, Mouth) → Draw Accessories (Whiskers, Ears, Tail) → Apply Highlights/Shadows

All drawing operations use the Canvas 2D context methods (arc, ellipse, bezierCurveTo, quadraticCurveTo, fill, stroke) without any external graphics libraries.

7. Deployment

7.1 Vercel Deployment

The project is deployed on Vercel with automatic deployment triggered by GitHub pushes. The zero-dependency architecture means no build step is required—Vercel serves the index.html file directly through its edge CDN, ensuring low-latency delivery globally.

7.2 Local Development

git clone https://github.com/romizone/diffusion-simulator.git
cd diffusion-simulator
npx serve .
# Or: python3 -m http.server 3000
# Or: open index.html (direct file access)

Due to the zero-dependency architecture, the application can be opened directly as a local file without a web server; the Canvas API is fully functional under the file:// protocol.

8. Educational Value

The simulator serves as an educational bridge between the mathematical formulation of diffusion models and intuitive visual understanding. Key learning outcomes include:

Table 7: Educational objectives and simulator features
Learning Objective Simulator Feature Insight Gained
Understand reverse diffusion Denoising animation Images emerge gradually from noise through iterative refinement
Understand forward diffusion Forward mode Noise progressively destroys image structure
Role of step count Steps slider More steps = smoother transitions, diminishing returns beyond ~30
Role of CFG scale CFG slider Higher CFG = sharper but potentially over-saturated output
Noise-to-image transition Manual noise slider Structure emerges at ~30–60% denoising, not linearly
Generation trajectory Timeline carousel The path from noise to image is not uniform across steps

9. Browser Compatibility

The application relies on Canvas 2D API and modern CSS features (CSS Grid, custom properties, backdrop-filter). Compatibility across major browsers:

Table 8: Browser compatibility
Browser Canvas 2D CSS Grid Backdrop Filter Overall
Chrome 90+ Full Full Full Full support
Edge 90+ Full Full Full Full support
Safari 15+ Full Full Full Full support
Firefox 103+ Full Full Full Full support
Chrome (Android) Full Full Full Full support
Safari (iOS 15+) Full Full Full Full support

10. Project Structure

diffusion-simulator/
├── index.html              # Complete application (single-file, all-in-one)
│   ├── <style>             # CSS: dark theme, animations, responsive grid
│   ├── <body>              # HTML: animal cards, canvas, controls, timeline
│   └── <script>            # JS: state machine, diffusion engine, procedural renderer
├── vercel.json             # Vercel deployment configuration
├── LICENSE                 # MIT License
└── README.md               # Documentation and usage guide

11. Future Work

The current release establishes a foundation on which more advanced diffusion model education can be built.

12. Conclusion

AI Image Diffusion Simulator demonstrates that the core concepts of denoising diffusion probabilistic models can be made accessible through interactive browser-based visualization. By simulating the reverse diffusion process with configurable parameters—step count, CFG scale, and manual noise level—users develop intuitive understanding of how diffusion models transform noise into coherent images.

The procedural Canvas API rendering eliminates external image dependencies while providing visually engaging target content. The timeline carousel creates a persistent record of the generation trajectory, reinforcing the iterative nature of the diffusion process. The forward diffusion mode completes the conceptual picture by demonstrating the noise addition process.

Built as a single HTML file with zero external dependencies, the application is instantly deployable, fully transparent, and suitable for educational contexts ranging from introductory AI courses to self-directed learning. The zero-dependency architecture ensures long-term maintainability and eliminates the supply chain risks associated with npm-based projects.

The complete source code is available at https://github.com/romizone/diffusion-simulator and a live demo is accessible at https://diffussion.vercel.app/.

References

  1. Ho, J., Jain, A., and Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems (NeurIPS), 33, 6840–6851.
  2. Ho, J. and Salimans, T. (2022). Classifier-Free Diffusion Guidance. arXiv preprint arXiv:2207.12598.
  3. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  4. Song, J., Meng, C., and Ermon, S. (2021). Denoising Diffusion Implicit Models. International Conference on Learning Representations (ICLR).
  5. Dhariwal, P. and Nichol, A. (2021). Diffusion Models Beat GANs on Image Synthesis. Advances in Neural Information Processing Systems (NeurIPS), 34.
  6. Nichol, A. and Dhariwal, P. (2021). Improved Denoising Diffusion Probabilistic Models. International Conference on Machine Learning (ICML).
  7. Smilkov, D., Carter, S., Sculley, D., Viégas, F.B., and Wattenberg, M. (2017). Direct-Manipulation Visualization of Deep Networks. ICML Visualization for Deep Learning Workshop.
  8. Wang, Z.J., Turko, R., Shaikh, O., Park, H., Das, N., Hohman, F., Kahng, M., and Chau, D.H.P. (2020). CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization. IEEE Transactions on Visualization and Computer Graphics.
  9. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S. (2015). Deep Unsupervised Learning using Nonequilibrium Thermodynamics. International Conference on Machine Learning (ICML).
  10. Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., and Chen, M. (2022). Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv preprint arXiv:2204.06125.