Multi-View Pedestrian Detection and Tracking in 3D

GeoWerkstatt Project of the Month, June 2026

Project: Multi-view 3D pedestrian detection and tracking with integration of viewing direction information

Researchers: Rasho Ali, Max Mehltretter, Christian Heipke

Project idea: Integration of additional camera information into a learning-based 3D representation to enhance multi-view 3D pedestrian detection and tracking.

Multi-view 3D tracking: how it has been done so far

Multi-view 3D pedestrian detection and tracking is a key task for applications such as autonomous driving, surveillance, and public safety, where robust performance under occlusions and changing camera setups is essential. While recent methods project image features into a shared 3D or bird’s-eye-view representation, camera parameters are typically used only for geometric projection. The viewing direction from which image features are captured is not explicitly modeled, causing the learned representations to be tied to specific camera configurations and limiting generalization to new setups.
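To illustrate the purely geometric use of camera parameters described above, the following minimal sketch projects 3D voxel centres into one camera image and picks up the image feature at each projected position. It assumes a pinhole camera with the convention x_cam = R · X_world + t and uses nearest-neighbour sampling for brevity; the function names `project_points` and `lift_features` are hypothetical and not taken from any specific method.

```python
import numpy as np

def project_points(K, R, t, points_world):
    """Project (N, 3) world points into the image of a pinhole camera.

    Assumed convention: x_cam = R @ X_world + t.
    Returns (N, 2) pixel coordinates and an (N,) mask of points in front
    of the camera.
    """
    cam = points_world @ R.T + t
    in_front = cam[:, 2] > 0
    hom = cam @ K.T
    # Avoid division by zero for points that are masked out anyway.
    depth = np.where(in_front, hom[:, 2], 1.0)
    pixels = hom[:, :2] / depth[:, None]
    return pixels, in_front

def lift_features(features, K, R, t, voxel_centers):
    """Copy image features of shape (C, H, W) to N voxel centres.

    Voxels that project outside the image or lie behind the camera
    receive a zero feature vector.
    """
    channels, height, width = features.shape
    pixels, in_front = project_points(K, R, t, voxel_centers)
    cols = np.round(pixels[:, 0]).astype(int)
    rows = np.round(pixels[:, 1]).astype(int)
    valid = in_front & (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    lifted = np.zeros((voxel_centers.shape[0], channels), dtype=features.dtype)
    lifted[valid] = features[:, rows[valid], cols[valid]].T
    return lifted
```

In this sketch the camera parameters only decide where a feature lands in the voxel grid. The feature itself carries no information about the direction from which it was observed, which is the gap the paragraph above points out.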

Our new approach: integration of viewing direction information

In this work, we propose a convolutional neural network (CNN)-based approach that explicitly integrates viewing direction information into the image feature representation. For each camera, directional information derived from intrinsic and extrinsic parameters is combined with the extracted image features before projection into a common 3D voxel grid. This enables the model to distinguish features captured from similar or opposing viewpoints and to fuse multi-view information in a more structured and meaningful way.
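One possible way to derive such directional information is sketched below: a unit viewing direction in world coordinates is computed for every feature-map pixel and appended to the image features as three extra channels. This is an illustrative assumption, not the published implementation. It assumes a pinhole model with x_cam = R · X_world + t, an intrinsic matrix `K` already scaled to the feature-map resolution, and simple channel concatenation as the way of combining both inputs; the function names are hypothetical.

```python
import numpy as np

def pixel_view_directions(K, R, height, width):
    """Unit viewing direction in world coordinates for every pixel.

    With x_cam = R @ X_world + t, the ray through pixel (u, v) has the
    world-frame direction R.T @ inv(K) @ [u, v, 1]. Integer pixel
    coordinates are used here for simplicity.
    """
    u, v = np.meshgrid(np.arange(width, dtype=float),
                       np.arange(height, dtype=float))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)   # (H, W, 3)
    rays_cam = pixels @ np.linalg.inv(K).T                # inv(K) @ p per pixel
    rays_world = rays_cam @ R                             # R.T @ r per pixel
    norms = np.linalg.norm(rays_world, axis=-1, keepdims=True)
    return rays_world / norms

def append_view_directions(features, K, R):
    """Concatenate a 3-channel direction map to a (C, H, W) feature map."""
    _, height, width = features.shape
    directions = pixel_view_directions(K, R, height, width)   # (H, W, 3)
    return np.concatenate([features, directions.transpose(2, 0, 1)], axis=0)

# Two cameras with identical intrinsics that look in opposite directions
# (second camera rotated by 180 degrees about the vertical axis).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
R_front = np.eye(3)
R_back = np.diag([-1.0, 1.0, -1.0])
features = np.zeros((8, 48, 64))

front = append_view_directions(features, K, R_front)
back = append_view_directions(features, K, R_back)
```

Even though both cameras deliver the same image features in this toy example, the appended channels differ, so later stages can tell opposing viewpoints apart after the features are projected into the shared voxel grid.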

So, did it work? 

The proposed method is evaluated on the WildTrack dataset and demonstrates improved detection accuracy and tracking stability. Experimental results show that incorporating viewing direction information enhances generalization across different camera configurations, particularly when training and testing setups differ. Overall, the results indicate that propagating camera viewpoint information beyond the geometric projection step leads to more robust and better-generalizing multi-view 3D pedestrian tracking systems.

Image © IPI
Figure: overview of the inputs and outputs of the proposed method (MV-MOT means Multi-View Multi Object Tracking; BEV means Bird’s Eye View).

Publications

Ali, R., Mehltretter, M., Heipke, C. (2025): Integrating viewing direction and image features for robust multi-view multi-object 3D pedestrian tracking. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences X-G-2025, pp. 47–55.
DOI: 10.5194/isprs-annals-X-G-2025-47-2025

Would you like to learn more and work on these topics professionally? Lay the foundations with a degree programme!
Subscribe to the newsletter to receive the latest issue of GeoWerkstatt by e-mail every month.