AI Image Diffusion Simulator: Interactive Visualization of Denoising Diffusion Probabilistic Models with Procedural Canvas Generation

Abstract

We present AI Image Diffusion Simulator, a browser-based interactive educational tool that visualizes the core mechanics of Denoising Diffusion Probabilistic Models (DDPMs). The application enables users to select from 10 procedurally generated animal illustrations and observe the reverse diffusion process—transforming Gaussian noise into coherent images through iterative denoising steps. The simulator exposes key diffusion model hyperparameters including the number of denoising steps, Classifier-Free Guidance (CFG) scale, and manual noise level, allowing users to develop intuition for how these parameters influence generation quality. The system implements procedural image generation using the HTML5 Canvas API, where each animal is rendered algorithmically with custom color palettes and geometric primitives. A timeline carousel captures snapshots at each denoising iteration, providing a visual record of the generation trajectory. The entire application is built as a single HTML file with zero external dependencies using vanilla JavaScript, CSS3, and Canvas API, deployed on Vercel with edge CDN distribution. We describe the diffusion simulation algorithm, procedural rendering pipeline, state machine architecture, and interactive parameter controls.

1. Introduction

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as the dominant paradigm for generative image synthesis, powering systems such as DALL-E 2, Stable Diffusion, Imagen, and Midjourney. Despite their widespread impact, the core mechanism—iteratively removing noise from a random sample to reveal a coherent image—remains conceptually opaque to many practitioners, students, and enthusiasts. The mathematical formulation involving variational inference, Markov chains, and noise schedules creates a significant barrier to intuitive understanding.

Existing educational resources for diffusion models predominantly rely on static diagrams, mathematical notation, and code notebooks that require GPU infrastructure. Interactive visualizations that allow users to manipulate diffusion parameters and observe their effects in real-time are notably absent from the educational landscape.

AI Image Diffusion Simulator addresses this gap by providing a browser-based, zero-dependency interactive tool that simulates the reverse diffusion process. Users select a target image (from 10 procedurally generated animal illustrations), configure generation parameters, and observe the step-by-step transformation from noise to image. The key contributions of this work are:

2. Related Work

2.1 Denoising Diffusion Probabilistic Models

Ho et al. (2020) formalized DDPMs as a class of latent variable models that learn to reverse a fixed Markov chain that gradually adds Gaussian noise to data. The forward process is defined as:

where β_t is the noise schedule controlling the amount of noise added at each step. The reverse process learns to denoise:

Our simulator approximates this reverse process visually by interpolating between a noise image and the target image according to a cosine noise schedule, providing an intuitive analog to the mathematical denoising operation.

2.2 Classifier-Free Guidance

Ho and Salimans (2022) introduced Classifier-Free Guidance (CFG), a technique that interpolates between conditional and unconditional model predictions to control the fidelity-diversity trade-off:

where s is the guidance scale. Higher values produce images that more closely match the conditioning signal (prompt) at the cost of diversity. Our simulator exposes this parameter as a slider (range 1–20), allowing users to observe how CFG scale influences the sharpness and fidelity of the generated output.

2.3 Educational Visualizations for Deep Learning

Interactive visualizations have proven effective for teaching deep learning concepts. TensorFlow Playground (Smilkov et al., 2017) demonstrates neural network training on 2D datasets. CNN Explainer (Wang et al., 2020) visualizes convolutional neural network operations. However, no comparable interactive tool exists specifically for diffusion models. Our work fills this gap by targeting the diffusion generation process with an accessible, browser-based simulator.

3. System Architecture

The application is architected as an event-driven state machine contained within a single HTML file. Four major subsystems interact through a shared state object and DOM event listeners: the Procedural Renderer, the Diffusion Engine, the Timeline Manager, and the UI Controller.

3.1 Application Flow

3.2 State Machine

The application manages state transitions through a finite state machine with the following states:

3.3 Procedural Animal Renderer

Each of the 10 animal illustrations is rendered procedurally using the Canvas 2D API. Rather than loading pre-made image assets, the system draws each animal from geometric primitives (circles, ellipses, arcs, bezier curves) with custom color palettes. This approach eliminates external image dependencies and allows the application to remain fully self-contained.

Each animal rendering function accepts a Canvas 2D context and dimensions, then draws the complete illustration using approximately 20–40 drawing calls. Background colors and accent gradients are derived from each animal's custom palette, creating visually cohesive cards.

4. Diffusion Simulation Algorithm

4.1 Noise Generation

The noise image is generated by filling each pixel of the canvas with random RGB values sampled from a uniform distribution. While true DDPMs use Gaussian noise, the visual effect of uniform random pixels is perceptually equivalent for educational purposes and avoids the need for a Gaussian sampling implementation.

4.2 Reverse Diffusion (Denoising)

Table 1: Application state machine
State	Description	Transitions
IDLE	Waiting for animal selection	→ SELECTED (on card click)
SELECTED	Animal chosen, ready to generate	→ GENERATING (on start), → IDLE (on deselect)
GENERATING	Reverse diffusion in progress	→ COMPLETED (on finish), → SELECTED (on stop)
COMPLETED	Generation finished, displaying result	→ GENERATING (on regenerate), → SELECTED (on new animal)
FORWARD	Forward diffusion (adding noise)	→ COMPLETED (on finish), → SELECTED (on stop)

Table 2: Procedural animal library
Animal	Primary Color	Key Features	Geometric Primitives
Cat	Orange/Amber	Pointed ears, whiskers, tail	Ellipses, bezier curves, lines
Dog	Brown/Tan	Floppy ears, tongue, collar	Arcs, ellipses, rectangles
Bunny	White/Pink	Long ears, round tail, whiskers	Elongated ellipses, circles
Panda	Black/White	Eye patches, round body	Concentric circles, ellipses
Fox	Orange/White	Pointed ears, bushy tail	Triangles, bezier curves
Penguin	Black/White	Belly patch, beak, flippers	Ellipses, arcs, triangles
Hamster	Beige/Brown	Cheek pouches, small ears	Overlapping circles
Owl	Brown/Gold	Large eyes, beak, feather tufts	Concentric circles, triangles
Koala	Gray/White	Round ears, large nose	Circles, ellipses, arcs
Duck	Yellow/Orange	Flat beak, wing, tail feathers	Ellipses, bezier curves

The reverse diffusion process is simulated by progressively blending the noise image with the target animal image over T steps. At each step t, the displayed image is a weighted combination:

where α_t follows a cosine schedule from 0 to 1 over T steps. The cosine schedule provides smoother visual transitions compared to a linear schedule, as it spends more time in the perceptually important middle range where the image structure emerges from noise.

4.3 CFG Scale Simulation

The Classifier-Free Guidance scale modifies the denoising behavior by adjusting how aggressively the noise is removed at each step. In the simulator, higher CFG values increase the contrast between the emerging image and remaining noise, producing sharper but potentially over-saturated results at extreme values (mimicking the real-world behavior of high CFG in Stable Diffusion):

This formulation ensures that at the default CFG of 7.5, the schedule is unmodified. Lower values produce softer, more diverse outputs, while higher values produce sharper, more deterministic results—faithfully representing the behavior of CFG in production diffusion systems.

4.4 Forward Diffusion

The forward diffusion mode reverses the process, progressively adding noise to the generated image. This demonstrates the noise addition process that training data undergoes during DDPM training. The blending weight transitions from α = 1 (clean image) to α = 0 (pure noise), visually showing how image structure is gradually destroyed.

4.5 Manual Noise Control

A manual noise slider allows users to freely explore any intermediate state along the diffusion trajectory without running the animation. This is particularly valuable for understanding the critical transition points where recognizable structure emerges from noise—typically occurring between 30–60% of the denoising process.

5. User Interface Design

5.1 Visual Design

The interface uses a dark theme with animated gradient background orbs, creating an immersive aesthetic inspired by AI art generation tools. The design system employs CSS custom properties for consistent theming:

5.2 Interactive Controls

5.3 Timeline Carousel

During the diffusion process, the system captures canvas snapshots at each denoising step and displays them as thumbnail images in a horizontally scrollable carousel below the main canvas. This timeline provides:

5.4 Responsive Layout

6. Implementation Details

6.1 Technology Stack

6.2 Zero-Dependency Architecture

Table 3: Visual design parameters
Element	Implementation	Purpose
Background	Dark gradient (#0f0c29 → #302b63 → #24243e)	Immersive dark environment for image focus
Animated Orbs	CSS keyframe animation on pseudo-elements	Ambient depth and visual interest
Animal Cards	CSS Grid with hover transforms and glow effects	Interactive selection interface
Progress Bar	CSS width transition with gradient fill	Real-time generation progress feedback
Toast Notifications	CSS animation (slide-in, fade-out)	Non-intrusive user feedback
Controls Panel	Glassmorphism with backdrop-filter blur	Parameter adjustment interface

Table 4: User-configurable parameters
Parameter	Control Type	Range	Default	Effect
Diffusion Steps	Range slider	5 – 50	20	Number of denoising iterations
CFG Scale	Range slider	1 – 20	7.5	Prompt adherence / sharpness
Manual Noise	Range slider	0% – 100%	0%	Explore intermediate diffusion states
Speed	Selector buttons	Slow / Normal / Fast	Normal	Animation frame delay

Table 5: Responsive breakpoints
Viewport	Width	Layout Adaptation
Desktop	≥ 1024px	Full grid layout, large canvas, side-by-side controls
Tablet	768px – 1023px	Reduced grid columns, stacked controls
Mobile	< 768px	Single column, compact cards, touch-optimized sliders

Table 6: Core technology stack
Layer	Technology	Purpose
Markup	HTML5	Document structure, semantic elements
Styling	CSS3	Animations, gradients, grid layout, glassmorphism, responsive design
Logic	Vanilla JavaScript (ES6+)	State machine, event handling, animation loop
Rendering	Canvas 2D API	Procedural animal illustration, noise generation, image blending
Image Processing	ImageData API	Pixel-level noise/image manipulation
Hosting	Vercel	Static deployment with edge CDN and CI/CD
Source Control	GitHub	Version control with automatic Vercel deployment

The entire application—HTML structure, CSS styling, procedural rendering functions, diffusion simulation logic, and UI management—resides in a single index.html file with zero external dependencies. No npm packages, no build step, no bundler, no framework. This architecture provides:

6.3 Animation Loop

The diffusion animation is driven by setTimeout-based frame scheduling rather than requestAnimationFrame. This design choice allows precise control over animation speed through the speed selector (slow: 500ms, normal: 200ms, fast: 50ms per frame) while maintaining consistent behavior across different display refresh rates. Each frame performs:

6.4 Canvas Rendering Pipeline

All drawing operations use the Canvas 2D context methods (arc, ellipse, bezierCurveTo, quadraticCurveTo, fill, stroke) without any external graphics libraries.

7. Deployment

7.1 Vercel Deployment

The project is deployed on Vercel with automatic deployment triggered by GitHub pushes. The zero-dependency architecture means no build step is required—Vercel serves the index.html file directly through its edge CDN, ensuring low-latency delivery globally.

7.2 Local Development

Due to the zero-dependency architecture, the application can be opened directly as a local file without a web server. All rendering uses the Canvas API which is fully supported in the file:// protocol.

8. Educational Value

The simulator serves as an educational bridge between the mathematical formulation of diffusion models and intuitive visual understanding. Key learning outcomes include:

9. Browser Compatibility

The application relies on Canvas 2D API and modern CSS features (CSS Grid, custom properties, backdrop-filter). Compatibility across major browsers:

10. Project Structure

11. Future Work

The current release establishes a foundation for more advanced diffusion model education. Planned enhancements include:

12. Conclusion

AI Image Diffusion Simulator demonstrates that the core concepts of denoising diffusion probabilistic models can be made accessible through interactive browser-based visualization. By simulating the reverse diffusion process with configurable parameters—step count, CFG scale, and manual noise level—users develop intuitive understanding of how diffusion models transform noise into coherent images.

The procedural Canvas API rendering eliminates external image dependencies while providing visually engaging target content. The timeline carousel creates a persistent record of the generation trajectory, reinforcing the iterative nature of the diffusion process. The forward diffusion mode completes the conceptual picture by demonstrating the noise addition process.

Built as a single HTML file with zero external dependencies, the application is instantly deployable, fully transparent, and suitable for educational contexts ranging from introductory AI courses to self-directed learning. The zero-dependency architecture ensures long-term maintainability and eliminates the supply chain risks associated with npm-based projects.

References

Table 7: Educational objectives and simulator features
Learning Objective	Simulator Feature	Insight Gained
Understand reverse diffusion	Denoising animation	Images emerge gradually from noise through iterative refinement
Understand forward diffusion	Forward mode	Noise progressively destroys image structure
Role of step count	Steps slider	More steps = smoother transitions, diminishing returns beyond ~30
Role of CFG scale	CFG slider	Higher CFG = sharper but potentially over-saturated output
Noise-to-image transition	Manual noise slider	Structure emerges at ~30–60% denoising, not linearly
Generation trajectory	Timeline carousel	The path from noise to image is not uniform across steps

Table 8: Browser compatibility
Browser	Canvas 2D	CSS Grid	Backdrop Filter	Overall
Chrome 90+	Full	Full	Full	Full support
Edge 90+	Full	Full	Full	Full support
Safari 15+	Full	Full	Full	Full support
Firefox 103+	Full	Full	Full	Full support
Chrome (Android)	Full	Full	Full	Full support
Safari (iOS 15+)	Full	Full	Full	Full support

Ho, J., Jain, A., and Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems (NeurIPS), 33, 6840–6851.
Ho, J. and Salimans, T. (2022). Classifier-Free Diffusion Guidance. arXiv preprint arXiv:2207.12598.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Song, J., Meng, C., and Ermon, S. (2021). Denoising Diffusion Implicit Models. International Conference on Learning Representations (ICLR).
Dhariwal, P. and Nichol, A. (2021). Diffusion Models Beat GANs on Image Synthesis. Advances in Neural Information Processing Systems (NeurIPS), 34.
Nichol, A. and Dhariwal, P. (2021). Improved Denoising Diffusion Probabilistic Models. International Conference on Machine Learning (ICML).
Smilkov, D., Carter, S., Sculley, D., Viégas, F.B., and Wattenberg, M. (2017). Direct-Manipulation Visualization of Deep Networks. ICML Visualization for Deep Learning Workshop.
Wang, Z.J., Turko, R., Shaikh, O., Park, H., Das, N., Hohman, F., Kahng, M., and Chau, D.H.P. (2020). CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization. IEEE Transactions on Visualization and Computer Graphics.
Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S. (2015). Deep Unsupervised Learning using Nonequilibrium Thermodynamics. International Conference on Machine Learning (ICML).
Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., and Chen, M. (2022). Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv preprint arXiv:2204.06125.