Defurnishing with X-Ray Vision:
Joint Removal of Furniture from Panoramas and Mesh

Alan Dolhasz*, Chen Ma*, Dave Gausebeck*,
Kevin Chen, Gregor Miller, Lucas Hayne, Gunnar Hovden, Azwad Sabik, Olaf Brandt, Mira Slavcheva
* denotes equal contribution
Matterport

Abstract

We present a pipeline for generating defurnished replicas of indoor spaces represented as textured meshes and corresponding multi-view panoramic images. To achieve this, we first segment and remove furniture from the mesh representation, extend planes, and fill holes, obtaining a simplified defurnished mesh (SDM). This SDM acts as an ``X-ray'' of the scene's underlying structure, guiding the defurnishing process. We extract Canny edges from depth and normal images rendered from the SDM. We then use these as a guide to remove the furniture from panorama images via ControlNet inpainting. This control signal ensures the availability of global geometric information that may be hidden from a particular panoramic view by the furniture being removed. The inpainted panoramas are used to texture the mesh. We show that our approach produces higher-quality assets than methods that rely on neural radiance fields, which tend to produce blurry, low-resolution images, or RGB-D inpainting, which is highly susceptible to hallucinations.

Overview

Our pipeline consists of the following components:

  • Furniture segmentation: We use our fine-tuned semantic segmentation model to estimate the furniture mask from the panorama images.
  • Mesh simplification and defurnishing: We simplify the mesh by removing small components, then remove furniture and extend planes to fill holes, obtaining a simplified defurnished mesh (SDM).
  • Control signal extraction: We extract Canny edges from depth and normal images rendered from the SDM, which serve as a control signal for the inpainting.
  • Inpainting: We use our fine-tuned ControlNet inpainting to remove furniture from panorama images, guided by the extracted edges.
  • Super-resolution: We apply our custom super-resolution model to enhance the resolution of the inpainted panoramas.
  • Texturing: The inpainted panoramas are used to texture the defurnished mesh.

Results

This section shows results from our full defurnishing pipeline.

2D Comparisons

In this section we compare results of our fine-tuned ControlNet inpainting, controlled by Canny edges extracted from depth and normal images, with the following control methods:

3D Comparisons

In this section we compare results of our full inpainting pipeline to radiance field methods. For fairness, we run radiance field methods on perspective images, and we run our method on panoramas and then project to perspective images. We compare to:
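The panorama-to-perspective projection used for this comparison is standard: cast a ray through each perspective pixel, rotate it by the view direction, convert it to longitude/latitude, and sample the equirectangular image. A minimal nearest-neighbor sketch follows; the camera parameters and sampling scheme are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def pano_to_perspective(pano, fov_deg=90.0, yaw_deg=0.0, pitch_deg=0.0,
                        out=(256, 256)):
    """Sample a perspective view from an equirectangular panorama
    using nearest-neighbor lookup."""
    H, W = pano.shape[:2]
    h, w = out
    f = 0.5 * w / np.tan(np.radians(fov_deg) / 2)  # focal length in pixels
    # Pixel grid -> camera-space ray directions (z forward).
    xs = np.arange(w) - (w - 1) / 2
    ys = np.arange(h) - (h - 1) / 2
    x, y = np.meshgrid(xs, ys)
    dirs = np.stack([x, y, np.full_like(x, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    # Rotate rays by yaw (about y) then pitch (about x).
    cy, sy = np.cos(np.radians(yaw_deg)), np.sin(np.radians(yaw_deg))
    cp, sp = np.cos(np.radians(pitch_deg)), np.sin(np.radians(pitch_deg))
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    dirs = dirs @ (Ry @ Rx).T
    # Direction -> longitude/latitude -> panorama pixel coordinates.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])   # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))  # [-pi/2, pi/2]
    u = ((lon / (2 * np.pi) + 0.5) * (W - 1)).round().astype(int)
    v = ((lat / np.pi + 0.5) * (H - 1)).round().astype(int)
    return pano[v % H, u % W]

# Toy panorama: left hemisphere dark, right hemisphere bright.
pano = np.zeros((128, 256), np.uint8)
pano[:, 128:] = 255
view = pano_to_perspective(pano, fov_deg=90, yaw_deg=90)
```

Turning toward longitude +90 degrees (yaw_deg=90) centers the view on the bright hemisphere; a real evaluation would use bilinear sampling and the panorama poses from the capture.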

Acknowledgements

We are grateful to Dorra Larnaout, Ky Waegel, Mykhaylo Kurinnyy, Neil Jassal, Will Yu, Senthil Palanisamy, Zack Baker and David Buchhofer for their contributions to this work.

BibTeX

The paper has been accepted to the Workshop on AI for Creative Visual Content Generation, Editing and Understanding at CVPR 2025.

@inproceedings{matterport2025defurnishing,
    author    = {Dolhasz, Alan and Ma, Chen and Gausebeck, Dave and Chen, Kevin and Miller, Gregor and Hayne, Lucas and Hovden, Gunnar and Sabik, Azwad and Brandt, Olaf and Slavcheva, Mira},
    title     = {Defurnishing with X-Ray Vision: Joint Removal of Furniture from Panoramas and Mesh},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
    year      = {2025},
}