MiraGe: Editable 2.5D Image Representations with Flat-Controlled 3D Gaussians

Abstract

Implicit Neural Representations (INRs) encode images as continuous functions that map pixel coordinates to RGB values, achieving compact storage and high visual fidelity. Recent work such as GaussianImage replaces the neural MLP with a collection of 2D Gaussian primitives, reaching similar reconstruction quality and compression but offering limited editability. In practice, creators often need to adjust content: move objects, bend a photo, cast new shadows, or create parallax. All of these operations are awkward within purely 2D or purely additive Gaussian schemes.
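To make the idea of a Gaussian image representation concrete, the following is a minimal NumPy sketch of rendering an image as an additive sum of colored 2D Gaussians. It is an illustration under stated assumptions, not GaussianImage's actual implementation: the function name `render_gaussians`, its parameters, and the unnormalized additive blending are all choices made for this example.

```python
import numpy as np

def render_gaussians(means, covs, colors, height, width):
    """Render an image as an additive sum of colored 2D Gaussians.

    Hypothetical helper for illustration only.
    means:  (N, 2) Gaussian centers in pixel coordinates (x, y)
    covs:   (N, 2, 2) symmetric positive-definite covariance matrices
    colors: (N, 3) RGB weight carried by each Gaussian
    """
    # Build the (x, y) coordinate of every pixel, flattened to shape (P, 2).
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float64)

    image = np.zeros((pixels.shape[0], 3))
    for mu, cov, rgb in zip(means, covs, colors):
        d = pixels - mu                       # offset of each pixel from the center
        inv_cov = np.linalg.inv(cov)
        # Squared Mahalanobis distance d^T Sigma^{-1} d for every pixel.
        m = np.einsum("pi,ij,pj->p", d, inv_cov, d)
        weight = np.exp(-0.5 * m)             # unnormalized Gaussian falloff
        image += weight[:, None] * rgb        # additive accumulation, no sorting
    return image.reshape(height, width, 3)

# A single isotropic Gaussian centered at pixel (x=2, y=2) on a 5x5 canvas.
means = np.array([[2.0, 2.0]])
covs = np.array([np.eye(2)])
colors = np.array([[1.0, 0.5, 0.0]])
img = render_gaussians(means, covs, colors, height=5, width=5)
```

In a fitting setting, the means, covariances, and colors would be optimized so that the rendered image matches a target. Editing such a representation means moving or recoloring individual Gaussians, which gives no notion of depth or viewpoint by itself.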

We introduce MiraGe, a method that treats a flat image as an object perceived in 3D and represents it with flat-controlled 3D Gaussian primitives. By positioning the image on a virtual plane and using mirror reflections to reason about its appearance under viewpoint changes, MiraGe produces realistic 2.5D effects: slight camera motions yield parallax, tilts create perspective-consistent warps, and edits can respect spatial context. This reframing enables precise, local 2D edits (insertions, deletions, color and texture changes) while preserving global coherence. Moreover, because the representation lives in 3D, MiraGe can be coupled with a physics engine to simulate physically plausible manipulations (e.g., bending a print, dropping cut-out elements, or letting parts collide) and then render the resulting configuration back into a single edited image.
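The two geometric ingredients mentioned above, a flat Gaussian lying on a plane and a mirror reflection across that plane, can be sketched in a few lines of NumPy. This is a rough illustration and not the authors' parameterization: the helper names `lift_to_flat_3d` and `reflect_across_plane`, the choice of the plane z = 0, and the `thickness` value are all assumptions made for this example.

```python
import numpy as np

def lift_to_flat_3d(mean_2d, cov_2d, thickness=1e-6):
    """Embed a 2D Gaussian in the plane z = 0 as a nearly flat 3D Gaussian.

    Hypothetical helper: the in-plane covariance is kept as-is, and a tiny
    variance (thickness**2) is assigned along the plane normal (the z axis).
    """
    mean_3d = np.array([mean_2d[0], mean_2d[1], 0.0])
    cov_3d = np.zeros((3, 3))
    cov_3d[:2, :2] = cov_2d
    cov_3d[2, 2] = thickness ** 2
    return mean_3d, cov_3d

def reflect_across_plane(point, plane_point, plane_normal):
    """Mirror a 3D point across the plane through plane_point with the given normal."""
    n = np.asarray(plane_normal, dtype=np.float64)
    n = n / np.linalg.norm(n)
    signed_dist = np.dot(np.asarray(point, dtype=np.float64) - plane_point, n)
    return point - 2.0 * signed_dist * n

# Lift one 2D Gaussian onto the image plane.
mean_3d, cov_3d = lift_to_flat_3d(np.array([1.0, 2.0]),
                                  np.array([[2.0, 0.5], [0.5, 1.0]]))

# Place a camera in front of the plane and compute its mirrored counterpart.
camera = np.array([0.0, 0.0, 5.0])
origin = np.zeros(3)
normal = np.array([0.0, 0.0, 1.0])
mirror_camera = reflect_across_plane(camera, origin, normal)
```

Once every primitive has a 3D mean and covariance, moving a primitive off the plane or moving the camera changes the rendered image in a depth-dependent way, which is the kind of 2.5D behavior the abstract describes.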

Across qualitative studies, MiraGe delivers:

  • Higher editing fidelity with clean boundaries and consistent textures,
  • Realistic parallax and shading cues from camera motion and mirror reasoning,
  • Natural, physics-aware modifications that standard 2D Gaussian or INR approaches struggle to reproduce.

In summary, MiraGe bridges the gap between compact Gaussian encodings and practical image editing by embedding 2D content in a controllable 3D Gaussian space. This unlocks intuitive, physically grounded edits while retaining the efficiency and quality associated with Gaussian-based image representations.