AcroRomi: A Zero-Dependency Native PDF Editor for macOS Built with Swift and SwiftUI

Romi Nur Ismanto
Jakarta AI Research Lab, Jakarta, Indonesia
rominur@gmail.com
March 2026

Abstract

We present AcroRomi, a comprehensive native PDF editor for macOS built entirely with Swift 5.9+ and SwiftUI, relying exclusively on Apple’s native frameworks with zero external dependencies. The application provides eleven core modules—PDF Viewer, Annotations, Text Editor, Converter, Page Organizer, Form Filler, E-Signatures, Security, Redaction, Document Comparison, and OCR—through a modular architecture organized into Core, UI, Features, and Models layers. AcroRomi leverages PDFKit for document rendering and annotation, the Vision framework with VNRecognizeTextRequest for on-device optical character recognition supporting 18+ languages, Core Graphics for rendering and image conversion, and WebKit’s WKWebView for HTML-to-PDF conversion. The text editor implements a custom EditablePDFView subclass enabling click-to-edit functionality with 14 font families, while redaction uses an irreversible bitmap flattening approach for permanent sensitive data removal. Document comparison provides both LCS-based text diff and pixel-level visual comparison modes. The application is built with Swift Package Manager (no .xcodeproj required) and distributed as a native .dmg for macOS 14+ Sonoma. The complete source code is available under the MIT license.

Keywords: PDF editor, macOS, Swift, SwiftUI, PDFKit, Vision OCR, Core Graphics, annotations, redaction, document comparison, zero dependencies, native application

1. Introduction

PDF editing on macOS has long been split between two extremes. Apple’s built-in Preview provides basic annotation and page management but lacks advanced features such as text editing, form filling, OCR, and redaction, while professional tools like Adobe Acrobat Pro bring substantial cost, a large install footprint, and dependence on proprietary frameworks. Third-party alternatives often rely on cross-platform toolkits that sacrifice macOS-native feel, or they introduce external library dependencies that complicate distribution and maintenance.

AcroRomi addresses this gap by providing a feature-complete PDF editor built entirely on Apple’s native framework stack. The zero-dependency philosophy means the application relies exclusively on PDFKit, Vision, Core Graphics, WebKit, AppKit, and SwiftUI—all shipped with macOS—eliminating version conflicts, supply chain risks, and the performance overhead of bridging to non-native libraries.

The key contributions of this work are:

2. Related Work

The macOS PDF editing landscape spans several categories. Apple Preview, bundled with every release of Mac OS X and macOS, provides PDF viewing, basic annotation (highlight, underline, shapes, signatures), and page reordering. However, it lacks text editing capabilities, form filling, OCR, redaction, and document comparison, all of which are essential for professional document workflows.

Adobe Acrobat Pro DC remains the industry standard for PDF editing, offering comprehensive features including text editing, form creation, built-in OCR, redaction, and digital signatures. However, its subscription model ($22.99/month), large installation size (~1.2 GB), and reliance on Adobe’s proprietary rendering engine put it out of reach for many users and add significant resource overhead.

Open-source alternatives such as LibreOffice Draw can edit PDFs but treat them as vector drawings rather than structured documents, losing text fidelity. PDF Expert by Readdle provides a polished macOS experience but uses a proprietary rendering engine rather than PDFKit. Skim focuses on academic annotation workflows but lacks text editing and form capabilities.

AcroRomi occupies a unique position: it provides Acrobat-class functionality (text editing, OCR, redaction, form filling, document comparison) while maintaining a zero-dependency, native macOS architecture built entirely on SwiftUI and Apple frameworks. This combination of comprehensive features with architectural purity is, to our knowledge, unprecedented in the open-source macOS PDF editor space.

3. System Architecture

AcroRomi follows a modular architecture organized into four layers: Core (document management and state), UI (views and layout), Features (module-specific functionality), and Models (data types and enumerations). This separation ensures that each of the eleven modules can be developed, tested, and maintained independently.

3.1 Project Structure

Sources/Acroromi/
├── Core/
│   ├── PDFViewWrapper.swift      # NSViewRepresentable bridge to PDFKit
│   ├── EditablePDFView.swift     # Custom PDFView subclass for text editing
│   ├── DocumentState.swift       # @Observable document state management
│   ├── PDFDocumentManager.swift  # Document lifecycle operations
│   └── Constants.swift           # App-wide constants
├── UI/
│   ├── SidebarView.swift         # Navigation sidebar with module list
│   ├── ThumbnailView.swift       # Page thumbnail strip
│   ├── ToolbarView.swift         # Context-sensitive toolbar
│   ├── StatusBarView.swift       # Document info and zoom control
│   ├── SearchBarView.swift       # Text search with navigation
│   ├── WelcomeView.swift         # Empty-state landing screen
│   └── Styles.swift              # Shared style definitions
├── Features/
│   ├── Viewer/                   # PDF viewing and navigation
│   ├── Annotations/              # Highlight, underline, notes, shapes
│   ├── Editor/                   # Click-to-edit text editing
│   ├── Converter/                # PDF ↔ Image, HTML → PDF, PDF → Text
│   ├── Organizer/                # Merge, split, rotate, reorder, extract
│   ├── FormSign/                 # AcroForm detection and filling
│   ├── ESignature/               # Electronic signature placement
│   ├── Security/                 # Password protection and permissions
│   ├── Redaction/                # Permanent content removal
│   ├── Compare/                  # Text diff and visual comparison
│   └── OCR/                      # Vision-based text recognition
└── Models/
    └── ToolMode.swift            # Tool mode enumerations

3.2 State Management

The application uses Swift’s @Observable macro (introduced in iOS 17/macOS 14) through the AppState class, which serves as the root state container. DocumentState manages per-document properties including the current PDFDocument reference, page index, zoom level, active tool mode, and unsaved changes flag. The PDFViewWrapper bridges SwiftUI’s declarative layer to PDFKit’s imperative NSView-based rendering through NSViewRepresentable, with a coordinator pattern handling delegate callbacks and user interaction events.
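As an illustration of this state layout, the following is a minimal sketch of an @Observable document state nested in a root container. The property names, the ToolMode cases, and the default values are assumptions made for the example; the actual AppState and DocumentState declarations are not reproduced in this paper.

```swift
import Observation
import PDFKit

// Hypothetical tool modes; the real ToolMode enumeration may differ.
enum ToolMode {
    case view, annotate, edit, redact
}

// Per-document state. @Observable (macOS 14+) lets SwiftUI views re-render
// when any property they have read changes.
@Observable
final class DocumentState {
    var document: PDFDocument? = nil
    var pageIndex: Int = 0
    var zoom: Double = 1.0
    var toolMode: ToolMode = .view
    var hasUnsavedChanges: Bool = false
}

// Root container holding the state of the currently open document.
@Observable
final class AppState {
    var current: DocumentState = DocumentState()
}
```

A SwiftUI view that reads `appState.current.pageIndex` in its body would be invalidated only when that property changes, which is the main practical difference from the older ObservableObject pattern.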

3.3 Entry Point and Layout

AcroromiApp, decorated with the @main attribute, initializes the SwiftUI application lifecycle. The primary ContentView uses a NavigationSplitView with a sidebar listing the eleven feature modules and a detail area hosting the active module’s view alongside the PDF rendering surface. The toolbar, status bar, and thumbnail strip are layered around the central PDFViewWrapper.

4. Core Modules

4.1 PDF Viewer

The viewer module wraps PDFKit’s PDFView through PDFViewWrapper, providing zoom control (fit-to-page, fit-to-width, manual percentage), multi-page layout modes (single page, continuous, two-up), text search with result navigation, printing via the macOS print dialog, and keyboard shortcuts for navigation. In v1.0, search results were computed but then discarded rather than navigated to; v1.1.0 fixes this by scrolling to and highlighting the matched text selections.
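The search-and-navigate behaviour can be sketched with standard PDFKit calls. The function name, the wrap-around indexing, and the yellow highlight colour below are assumptions for illustration, not the AcroRomi implementation.

```swift
import AppKit
import PDFKit

/// Finds `query` in the view's document, highlights every match, and scrolls
/// to the match at `index` (wrapping around). Returns the number of matches.
@MainActor
@discardableResult
func showSearchResult(for query: String, at index: Int = 0, in pdfView: PDFView) -> Int {
    guard let document = pdfView.document, !query.isEmpty else { return 0 }

    let matches = document.findString(query, withOptions: [.caseInsensitive])
    guard !matches.isEmpty else {
        pdfView.highlightedSelections = nil
        return 0
    }

    // Wrap the requested index so "next" and "previous" can cycle.
    let wrapped = ((index % matches.count) + matches.count) % matches.count
    let current = matches[wrapped]

    matches.forEach { $0.color = .systemYellow }
    pdfView.highlightedSelections = matches
    pdfView.setCurrentSelection(current, animate: true)
    pdfView.go(to: current) // scroll the match into view instead of discarding it
    return matches.count
}
```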

4.2 Annotations

The annotation module supports six annotation types: highlight (with configurable color and opacity), underline, strikethrough, sticky notes (with text content), freehand drawing (with adjustable stroke width and color), and geometric shapes (rectangles, circles, lines, arrows). Annotations are stored as standard PDF annotation objects within the document, ensuring compatibility with other PDF readers. Color customization uses a native macOS color picker with preset palettes for common annotation colors.

4.3 Text Editor

Text editing is implemented through the custom EditablePDFView subclass, which intercepts mouse click events on text regions to present an editable overlay. The v1.1.0 release fixes a critical double coordinate conversion bug in click detection that caused the editor to activate at incorrect positions. The editor supports 14 font families: Helvetica, Times New Roman, Courier, Arial, Georgia, Verdana, Futura, Avenir, Menlo, Palatino, American Typewriter, Optima, Baskerville, and Didot. Font size, color, and weight controls are provided through a floating toolbar. A guard mechanism using an isCommitting flag prevents double-commit crashes during text save operations. Editor views are properly removed from the superview after fade-out animation to prevent memory leaks.
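The isCommitting guard can be illustrated in isolation. This is a minimal sketch of a re-entrancy guard on a single thread; the TextEditSession type and its onCommit hook are hypothetical, and the sketch does not attempt to be thread-safe.

```swift
// Hypothetical editing session; only the guard pattern is the point here.
final class TextEditSession {
    private(set) var isCommitting = false
    private(set) var committedTexts: [String] = []

    /// Side effect that may call back into commit(_:), for example an
    /// end-of-editing notification fired while the text is being saved.
    var onCommit: ((String) -> Void)?

    func commit(_ text: String) {
        // A second commit that arrives while one is in flight is ignored.
        guard !isCommitting else { return }
        isCommitting = true
        defer { isCommitting = false }

        committedTexts.append(text)
        onCommit?(text)
    }
}
```

With the guard in place, a callback that triggers a second commit during the first one leaves exactly one committed entry.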

4.4 Converter

The converter module supports four transformation pipelines:

PDF → Images (PNG/JPEG per page)  |  Images → PDF  |  HTML → PDF (via WKWebView)  |  PDF → Text

PDF-to-image conversion uses Core Graphics to render each page at a configurable DPI (72–600) into a CGImage, then encodes the result as PNG or JPEG. Image-to-PDF creates a new PDFDocument in which each image becomes one full page. HTML-to-PDF uses WebKit’s WKWebView to load and render the HTML content, then captures the rendered output as PDF pages. Text extraction walks the document’s pages and reads each page’s text through PDFKit’s string property. The v1.1.0 release adds range validation to prevent crashes when converter operations encounter pages with unexpected content boundaries.
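Two small calculations sit behind this pipeline: converting a page size in points to pixels at a given DPI (PDF user space is 72 points per inch), and clamping a requested page range to the pages that exist. The sketch below shows both; the function names are illustrative, and clamping the DPI to 72–600 mirrors the range stated above rather than any confirmed implementation detail.

```swift
import Foundation

let supportedDPI: ClosedRange<Double> = 72...600

/// Pixel dimensions for a page measured in points, rendered at `dpi`.
func pixelSize(widthPoints: Double, heightPoints: Double, dpi: Double) -> (width: Int, height: Int) {
    let clampedDPI = min(max(dpi, supportedDPI.lowerBound), supportedDPI.upperBound)
    let scale = clampedDPI / 72.0 // 72 points per inch
    return (Int((widthPoints * scale).rounded()), Int((heightPoints * scale).rounded()))
}

/// Clamps a requested zero-based page range to 0..<pageCount.
/// Returns an empty range when nothing valid remains.
func clampedPageRange(_ requested: ClosedRange<Int>, pageCount: Int) -> Range<Int> {
    guard pageCount > 0 else { return 0..<0 }
    let lower = min(max(requested.lowerBound, 0), pageCount)
    let end = requested.upperBound == Int.max ? Int.max : requested.upperBound + 1
    let upper = min(max(end, lower), pageCount)
    return lower..<upper
}
```

For example, a US Letter page (612 × 792 points) at 300 DPI comes out at 2550 × 3300 pixels.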

4.5 Page Organizer

The page organizer provides six operations: merge (combine multiple PDF files into one), split (extract page ranges into separate files), rotate (90°/180°/270° with normalization to valid states), reorder (drag-and-drop page rearrangement), delete (remove selected pages), and extract (save selected pages as a new document). The v1.1.0 release converts the totalPages property from a stored value to a computed property, ensuring automatic updates after page insertion, deletion, or merge operations. Index validation guards were added to prevent out-of-bounds crashes during page operations.
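The three safeguards described here (rotation normalization, index validation, and a computed page count) can be sketched without PDFKit. The PageList type below is a hypothetical stand-in for the organizer's page collection, and rounding arbitrary angles to the nearest quarter turn is an assumption of this example.

```swift
/// Maps any accumulated rotation to one of 0, 90, 180, 270.
func normalizedRotation(_ degrees: Int) -> Int {
    let quarterTurns = Int((Double(degrees) / 90.0).rounded())
    return ((quarterTurns % 4) + 4) % 4 * 90
}

/// Hypothetical page collection with validated operations.
struct PageList<Page> {
    private(set) var pages: [Page]

    /// Computed, so it can never go stale after insert, delete, or merge.
    var totalPages: Int { pages.count }

    init(_ pages: [Page]) { self.pages = pages }

    /// Removes the page at `index`, or returns nil if the index is invalid.
    @discardableResult
    mutating func delete(at index: Int) -> Page? {
        guard pages.indices.contains(index) else { return nil }
        return pages.remove(at: index)
    }

    /// Moves a page; returns false instead of crashing on a bad index.
    @discardableResult
    mutating func move(from source: Int, to destination: Int) -> Bool {
        guard pages.indices.contains(source),
              pages.indices.contains(destination) else { return false }
        let page = pages.remove(at: source)
        pages.insert(page, at: destination)
        return true
    }

    mutating func append(contentsOf other: [Page]) {
        pages.append(contentsOf: other) // merge
    }
}
```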

4.6 Form Filler and Signatures

The form module auto-detects AcroForm fields (text fields, checkboxes, radio buttons, dropdown menus) within PDF documents and presents them in an editable interface. Users can fill form fields interactively with values persisted to the document. The signature sub-module supports two creation methods: drawn signatures (via a canvas-based drawing surface) and typed signatures (text rendered in a handwriting-style font). The v1.1.0 release fixes a rendering bug where drawn signatures appeared vertically flipped due to a coordinate system mismatch between the drawing canvas (origin top-left) and PDF coordinate space (origin bottom-left).
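The coordinate mismatch behind the flipped-signature bug comes down to one subtraction per point. The sketch below assumes the canvas and the target PDF rectangle have the same size; the function names are illustrative.

```swift
import Foundation

/// Converts a point from a top-left-origin canvas to bottom-left-origin
/// PDF space for a canvas of the given height.
func canvasToPDF(_ point: CGPoint, canvasHeight: CGFloat) -> CGPoint {
    CGPoint(x: point.x, y: canvasHeight - point.y)
}

/// Converts a whole drawn stroke.
func canvasStrokeToPDF(_ stroke: [CGPoint], canvasHeight: CGFloat) -> [CGPoint] {
    stroke.map { canvasToPDF($0, canvasHeight: canvasHeight) }
}
```

A point drawn at the top edge of a 100-point-high canvas (y = 0) therefore lands at y = 100 in PDF space; skipping this step renders the stroke upside down.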

4.7 Electronic Signatures

The e-signature module extends basic signatures with placement tracking, allowing users to position signature fields at specific locations within documents. This module supports signature request workflows where document owners can designate signature locations for multiple signers.

4.8 Security

The security module implements PDF encryption with two password levels: user password (required to open the document) and owner password (required to modify permissions). Permission controls include print, copy, modify, annotation, form fill, and accessibility extraction flags. Encryption uses the PDF specification’s standard security handler as implemented by PDFKit.
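PDFKit exposes this through write options on PDFDocument. The following is a minimal sketch of writing a password-protected copy; the function name and the particular permission set chosen (printing and accessibility allowed, everything else withheld) are assumptions for the example.

```swift
import Foundation
import PDFKit

/// Writes an encrypted copy of `document` to `url`.
/// Returns false if PDFKit could not write the file.
func writeEncrypted(_ document: PDFDocument,
                    to url: URL,
                    userPassword: String,
                    ownerPassword: String) -> Bool {
    // Example policy: allow printing and accessibility extraction only.
    let permissions: PDFAccessPermissions = [
        .allowsLowQualityPrinting,
        .allowsHighQualityPrinting,
        .allowsContentAccessibility
    ]

    let options: [PDFDocumentWriteOption: Any] = [
        .userPasswordOption: userPassword,   // needed to open the file
        .ownerPasswordOption: ownerPassword, // needed to change permissions
        .accessPermissionsOption: NSNumber(value: permissions.rawValue)
    ]
    return document.write(to: url, withOptions: options)
}
```

Permission flags are enforced by conforming readers rather than by the encryption itself, so they should be treated as policy hints and not as a hard security boundary.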

4.9 Redaction

Redaction in AcroRomi uses an irreversible bitmap flattening approach. When a region is marked for redaction, the system renders the page to a bitmap, fills the redaction area with a solid color (black by default), and replaces the original page content with the flattened bitmap. This ensures that the redacted content cannot be recovered by removing annotation layers or inspecting the PDF structure—the original vector text and graphics are permanently destroyed. The ImageStampAnnotation decoder was hardened in v1.1.0 to return nil instead of causing fatal errors when encountering malformed annotation data.
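The flattening step can be sketched as follows: render the page into a bitmap context, paint the redaction rectangles, and build a replacement page from the resulting image. This is an illustration under stated assumptions, not the AcroRomi code: the function names are hypothetical, the page is assumed to be unrotated with a media box whose origin is (0, 0), and the regions are given in page coordinates.

```swift
import AppKit
import PDFKit

/// Returns an image-only copy of `page` with `regions` painted black.
func flattenedPage(from page: PDFPage, redacting regions: [CGRect], scale: CGFloat = 2.0) -> PDFPage? {
    let bounds = page.bounds(for: .mediaBox)
    let width = Int((bounds.width * scale).rounded())
    let height = Int((bounds.height * scale).rounded())
    guard width > 0, height > 0,
          let context = CGContext(data: nil, width: width, height: height,
                                  bitsPerComponent: 8, bytesPerRow: 0,
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    else { return nil }

    // White background, then the page content scaled to the bitmap.
    context.setFillColor(CGColor(red: 1, green: 1, blue: 1, alpha: 1))
    context.fill(CGRect(x: 0, y: 0, width: width, height: height))
    context.scaleBy(x: scale, y: scale)
    page.draw(with: .mediaBox, to: context)

    // Paint over the regions; the pixels underneath are overwritten.
    context.setFillColor(CGColor(red: 0, green: 0, blue: 0, alpha: 1))
    for region in regions { context.fill(region) }

    guard let bitmap = context.makeImage() else { return nil }
    return PDFPage(image: NSImage(cgImage: bitmap, size: bounds.size))
}

/// Swaps the page at `index` for its flattened, redacted version.
func redact(_ regions: [CGRect], onPageAt index: Int, in document: PDFDocument) -> Bool {
    guard index >= 0, index < document.pageCount,
          let original = document.page(at: index),
          let flattened = flattenedPage(from: original, redacting: regions)
    else { return false }
    document.removePage(at: index)
    document.insert(flattened, at: index)
    return true
}
```

The trade-off of this approach is that the whole page loses its selectable text and vector sharpness, not only the redacted area; a higher scale factor reduces the visual loss at the cost of file size.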

4.10 Document Comparison

The comparison module provides two analysis modes: an LCS-based text diff and a pixel-level visual comparison.
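For the text mode, the abstract names a diff based on the longest common subsequence (LCS). A minimal line-level sketch of that technique is shown below; the DiffOp type and the choice of lines as the comparison unit are assumptions, since the paper does not state the granularity AcroRomi uses.

```swift
enum DiffOp: Equatable {
    case same(String)
    case removed(String)
    case added(String)
}

/// Classic dynamic-programming LCS diff over two sequences of lines.
func lcsDiff(_ old: [String], _ new: [String]) -> [DiffOp] {
    let n = old.count, m = new.count

    // table[i][j] = length of the LCS of old[i...] and new[j...]
    var table = Array(repeating: Array(repeating: 0, count: m + 1), count: n + 1)
    for i in stride(from: n - 1, through: 0, by: -1) {
        for j in stride(from: m - 1, through: 0, by: -1) {
            table[i][j] = old[i] == new[j]
                ? table[i + 1][j + 1] + 1
                : max(table[i + 1][j], table[i][j + 1])
        }
    }

    // Walk the table from the start, emitting one operation per step.
    var ops: [DiffOp] = []
    var i = 0, j = 0
    while i < n && j < m {
        if old[i] == new[j] {
            ops.append(.same(old[i])); i += 1; j += 1
        } else if table[i + 1][j] >= table[i][j + 1] {
            ops.append(.removed(old[i])); i += 1
        } else {
            ops.append(.added(new[j])); j += 1
        }
    }
    while i < n { ops.append(.removed(old[i])); i += 1 }
    while j < m { ops.append(.added(new[j])); j += 1 }
    return ops
}
```

The table costs O(n·m) time and memory, which is acceptable for page-sized inputs but would need a more economical algorithm for very long documents.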

4.11 OCR (Optical Character Recognition)

The OCR module uses Apple’s Vision framework with VNRecognizeTextRequest for on-device text recognition. The system supports 18+ languages including English, Spanish, French, German, Italian, Portuguese, Chinese (Simplified and Traditional), Japanese, Korean, Arabic, Hindi, Thai, Vietnamese, Indonesian, Turkish, Polish, Dutch, and Swedish. The set of languages actually available depends on the macOS version and recognition level, and can be queried at run time through the request’s supportedRecognitionLanguages() method. A skipping mechanism detects pages that already contain selectable text and bypasses OCR for those pages, which reduces processing time for partially digitized documents. The v1.1.0 release moves heavy Vision processing to background threads to prevent UI blocking during multi-page OCR operations.
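The recognition flow can be sketched with the public Vision API. The function names, the "en-US" default, and the rule that a page is skipped when it has any non-whitespace text are assumptions for illustration; AcroRomi's actual skip heuristic is not specified here.

```swift
import Foundation
import Vision

/// Skip rule (assumed): run OCR only when the page has no selectable text.
func shouldRunOCR(existingText: String?) -> Bool {
    (existingText ?? "").trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
}

/// Synchronously recognizes text lines in a rendered page image.
func recognizeText(in image: CGImage, languages: [String] = ["en-US"]) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.recognitionLanguages = languages
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Keep the best candidate string for each detected text region.
    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}

/// Runs recognition off the main thread and reports back on the main queue.
func recognizeTextInBackground(in image: CGImage,
                               completion: @escaping (Result<[String], Error>) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        let result = Result(catching: { try recognizeText(in: image) })
        DispatchQueue.main.async { completion(result) }
    }
}
```

In a PDFKit setting, `existingText` would typically come from a page's `string` property, and the CGImage from rendering that page to a bitmap.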

5. Stability Engineering (v1.1.0)

The v1.1.0 release focused extensively on stability, addressing 13 issues across Core, UI, and Feature modules. These fixes fall into four categories:

5.1 Memory Management

Editor views are now properly removed from the superview after fade-out animation completes. In v1.0, animated editor overlays remained in the view hierarchy after dismissal, accumulating memory allocations proportional to the number of edit sessions within a single document session.

5.2 Crash Prevention

Table 1: Crash fixes in v1.1.0
Issue | Root Cause | Fix
Text editor crash | Force-unwrap on textContainer | Optional binding with guard
Double-commit crash | Concurrent save operations | isCommitting flag guard
Page operation crash | Out-of-bounds page index | Index validation before access
Converter crash | Invalid range boundaries | Range clamping to valid bounds
Annotation decode crash | Malformed ImageStampAnnotation | Failable initializer returning nil

5.3 Coordinate System Corrections

Two coordinate-related bugs were resolved: a double coordinate conversion in click detection (where screen coordinates were transformed twice through the PDFView’s coordinate space, causing the text editor to activate at incorrect positions) and rotation normalization (ensuring values are clamped to valid states of 0°, 90°, 180°, and 270° regardless of accumulated rotation operations).
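The click-detection fix amounts to converting each coordinate exactly once per hop: window to view, then view to page. The sketch below shows that chain in a PDFView subclass; the class name and the onPageClick callback are hypothetical and stand in for whatever EditablePDFView does with the result.

```swift
import AppKit
import PDFKit

final class ClickLocatingPDFView: PDFView {
    /// Called with the clicked page and the click location in page space.
    var onPageClick: ((PDFPage, CGPoint) -> Void)?

    override func mouseDown(with event: NSEvent) {
        // Window coordinates -> this view's coordinates (one conversion).
        let viewPoint = convert(event.locationInWindow, from: nil)

        if let page = page(for: viewPoint, nearest: false) {
            // View coordinates -> page coordinates (one conversion).
            // Converting viewPoint through the view a second time before
            // this step is the kind of error that shifts the hit location.
            let pagePoint = convert(viewPoint, to: page)
            onPageClick?(page, pagePoint)
        }
        super.mouseDown(with: event)
    }
}
```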

5.4 UI State Synchronization

The PDFViewWrapper coordinator was refactored to eliminate stale struct references that caused the SwiftUI layer to read outdated state. Sidebar icons now properly toggle between active and inactive states. The search functionality navigates to results rather than silently discarding them. The totalPages property was converted to a computed property to automatically reflect page count changes after organizer operations.

6. Framework Integration

AcroRomi’s zero-dependency approach relies on deep integration with six Apple frameworks:

Table 2: Apple framework utilization
Framework | Usage | Key APIs
PDFKit | PDF rendering, annotation, form handling, page management | PDFView, PDFDocument, PDFAnnotation, PDFPage
Vision | On-device OCR with ML-based text recognition | VNRecognizeTextRequest, VNImageRequestHandler
Core Graphics | Page rendering, image conversion, bitmap operations | CGContext, CGImage, CGPDFDocument
WebKit | HTML-to-PDF conversion | WKWebView, createPDF(configuration:)
AppKit | Native macOS UI components, print dialog, color picker | NSView, NSPrintOperation, NSColorPanel
SwiftUI | Declarative UI, state management, layout | NavigationSplitView, @Observable, NSViewRepresentable

The NSViewRepresentable protocol serves as the critical bridge between SwiftUI’s declarative paradigm and PDFKit’s imperative NSView hierarchy. The PDFViewWrapper implements both makeNSView and updateNSView lifecycle methods, with a coordinator class handling PDFViewDelegate callbacks, gesture recognition, and user interaction events. This bridging pattern allows the application to leverage SwiftUI’s modern layout system while retaining full access to PDFKit’s mature rendering capabilities.
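A minimal sketch of this bridging pattern follows. It is not the AcroRomi PDFViewWrapper: the type name, the single pageIndex binding, and the use of the PDFViewPageChanged notification are assumptions chosen to keep the example short. The line that refreshes the coordinator's parent in updateNSView illustrates one common way to avoid a coordinator holding a stale copy of the wrapper struct.

```swift
import SwiftUI
import PDFKit

struct PDFViewWrapperSketch: NSViewRepresentable {
    let document: PDFDocument?
    @Binding var pageIndex: Int

    func makeCoordinator() -> Coordinator { Coordinator(parent: self) }

    func makeNSView(context: Context) -> PDFView {
        let view = PDFView()
        view.autoScales = true
        view.document = document
        NotificationCenter.default.addObserver(
            context.coordinator,
            selector: #selector(Coordinator.pageChanged(_:)),
            name: .PDFViewPageChanged,
            object: view)
        return view
    }

    func updateNSView(_ view: PDFView, context: Context) {
        // The wrapper is a value type: hand the coordinator the fresh copy.
        context.coordinator.parent = self

        if view.document !== document { view.document = document }

        // Validate the index before asking PDFKit for the page.
        if let doc = view.document, pageIndex >= 0, pageIndex < doc.pageCount,
           let target = doc.page(at: pageIndex), view.currentPage !== target {
            view.go(to: target)
        }
    }

    final class Coordinator: NSObject {
        var parent: PDFViewWrapperSketch
        init(parent: PDFViewWrapperSketch) { self.parent = parent }

        /// Pushes page changes made inside PDFKit back into SwiftUI state.
        @objc func pageChanged(_ note: Notification) {
            guard let view = note.object as? PDFView,
                  let doc = view.document,
                  let page = view.currentPage else { return }
            let index = doc.index(for: page)
            if parent.pageIndex != index { parent.pageIndex = index }
        }
    }
}
```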

7. Build and Distribution

7.1 Build System

AcroRomi uses Swift Package Manager (SPM) as its build system, eliminating the need for Xcode project files (.xcodeproj). The Package.swift manifest defines the executable target with source files organized under Sources/Acroromi/. Development builds are produced with swift build, while release distribution uses a custom build-app.sh script.
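A manifest for this kind of single-executable, dependency-free package could look like the sketch below. It is an assumption based on the directory layout and requirements stated in this paper, not a copy of the project's Package.swift.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "Acroromi",
    platforms: [
        .macOS(.v14) // @Observable and the stated minimum OS require macOS 14
    ],
    dependencies: [], // zero external dependencies
    targets: [
        .executableTarget(
            name: "Acroromi",
            path: "Sources/Acroromi"
        )
    ]
)
```

Because system frameworks such as PDFKit and Vision are linked automatically when imported, none of them needs to appear in the manifest.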

7.2 Distribution Pipeline

swift build -c release → .app bundle assembly → Info.plist injection → Icon generation → DMG creation (hdiutil)

The application icon is generated programmatically using Core Graphics, rendering a red penguin mascot (the AcroRomi logo) directly in Swift code rather than relying on external asset files. File associations for .pdf documents are defined in the generated Info.plist.

Table 3: Distribution artifacts
Format | File | Installation Method
.app | .build/Acroromi.app | Direct execution
.dmg | Acroromi.dmg | Drag to Applications

8. System Requirements

Table 4: Runtime and development requirements
Requirement | Specification
Operating System | macOS 14.0 Sonoma or later
Architecture | Apple Silicon (M1/M2/M3/M4) and Intel x86_64
Swift (development) | 5.9+
Xcode (development) | 15+ Command Line Tools
External Dependencies | None (zero-dependency architecture)
Build System | Swift Package Manager

9. Feature Comparison

Table 5: Feature comparison with macOS PDF tools
Feature Preview Skim Acrobat Pro AcroRomi
PDF Viewing
Annotations Basic
Text Editing
OCR ✓ (18+ lang)
Form Filling
Redaction
Doc Compare
Signatures Basic
Format Conversion Limited
Page Organization Basic
Zero Dependencies
Open Source
Cost Free Free $22.99/mo Free

10. Conclusion and Future Work

AcroRomi demonstrates that Apple’s native framework stack—PDFKit, Vision, Core Graphics, WebKit, and SwiftUI—is sufficient to build a feature-complete PDF editor competitive with commercial offerings, without requiring any external dependencies. The modular architecture with eleven distinct feature modules organized across Core, UI, Features, and Models layers provides clear separation of concerns while the @Observable state management pattern ensures reactive UI updates across the application.

The v1.1.0 stability release, addressing 13 issues across memory management, crash prevention, coordinate correction, and UI synchronization, demonstrates the engineering rigor required for production-grade document editing software where data integrity and application reliability are paramount.

Future directions include:

The complete source code is available at https://github.com/romizone/acroromi under the MIT license.
