EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Gu, Qiao; Lv, Zhaoyang; Frost, Duncan; Green, Simon; Straub, Julian; Sweeney, Chris

arXiv:2403.18118 (cs)

[Submitted on 26 Mar 2024 (v1), last revised 22 Jul 2024 (this version, v2)]

Abstract: In this paper, we present EgoLifter, a novel system that automatically segments scenes captured from egocentric sensors into a complete decomposition of individual 3D objects. The system is specifically designed for egocentric data, where scenes contain hundreds of objects captured from natural (non-scanning) motion. EgoLifter adopts 3D Gaussians as the underlying representation of 3D scenes and objects, and uses segmentation masks from the Segment Anything Model (SAM) as weak supervision to learn flexible and promptable definitions of object instances, free of any specific object taxonomy. To handle the challenge of dynamic objects in egocentric videos, we design a transient prediction module that learns to filter out dynamic objects in the 3D reconstruction. The result is a fully automatic pipeline that reconstructs 3D object instances as collections of 3D Gaussians that collectively compose the entire scene. We created a new benchmark on the Aria Digital Twin dataset that quantitatively demonstrates EgoLifter's state-of-the-art performance in open-world 3D segmentation from natural egocentric input. We also run EgoLifter on various egocentric activity datasets, which shows the promise of the method for 3D egocentric perception at scale.
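
The abstract says that SAM masks act as weak supervision for learning object instances, but it does not state the training objective. As an illustration only, the sketch below shows one common way such supervision can be expressed: a supervised contrastive loss that pulls together the features of pixels falling inside the same 2D mask and pushes apart those from different masks. The function name `mask_contrastive_loss`, the `temperature` parameter, and the use of NumPy arrays in place of rendered feature maps are all assumptions made for this example, not details taken from the listing.

```python
import numpy as np


def mask_contrastive_loss(features, mask_ids, temperature=0.1):
    """Supervised contrastive loss over pixel features grouped by 2D mask id.

    features: (N, D) array, one feature vector per sampled pixel (hypothetical
              stand-in for features rendered from the 3D representation).
    mask_ids: (N,) integer array, id of the 2D mask each pixel belongs to.
    Assumes non-zero feature vectors and at least one pair of pixels that
    share a mask id.
    """
    # Normalize rows so that dot products are cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = feats @ feats.T / temperature

    # Exclude each pixel's similarity with itself from the softmax.
    n = len(mask_ids)
    not_self = ~np.eye(n, dtype=bool)
    logits = np.where(not_self, logits, -np.inf)

    # Numerically stable log-softmax over all other pixels.
    row_max = logits.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True))
    log_prob = logits - log_denom

    # Positives are other pixels that carry the same mask id.
    positives = (mask_ids[:, None] == mask_ids[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0

    # Average log-probability of the positives for each anchor pixel.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / pos_count[has_pos]).mean())
```

Under these assumptions, features that already cluster by mask id yield a lower loss than features that mix across masks, which is the signal a feature field would be trained on.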

Comments:	ECCV 2024 camera ready version. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.18118 [cs.CV]
	(or arXiv:2403.18118v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.18118

Submission history

From: Qiao Gu
[v1] Tue, 26 Mar 2024 21:48:27 UTC (11,611 KB)
[v2] Mon, 22 Jul 2024 20:27:01 UTC (27,933 KB)
