APGCC: Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance.

1 National Taiwan University, 2 University of California, Merced, 3 Google Research
(ECCV 2024)
[Figure panels: Overestimation, Proposal Selection, Performance, Instability Rate]

APGCC uses auxiliary point guidance to overcome the training instability of point-based crowd counting, achieving accurate counts and precise localization. Green and red arrows mark the selected proposals and their offsets. The Instability Rate (IR) measures how inconsistently point proposals are selected from epoch to epoch; this inconsistency limits performance.
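
To make the Instability Rate idea concrete, here is a minimal sketch under one plausible reading: the fraction of target points whose matched proposal changes between two consecutive epochs. The function name `instability_rate` and the dictionary layout (target index mapped to proposal index) are assumptions made for illustration, and the exact definition used in the paper may differ.

```python
from typing import Dict


def instability_rate(prev_match: Dict[int, int], curr_match: Dict[int, int]) -> float:
    """Fraction of targets whose matched proposal changed between two epochs.

    Each dict maps a target index to the index of the proposal matched to it.
    This is an illustrative reading of IR, not the paper's exact formula.
    """
    shared_targets = prev_match.keys() & curr_match.keys()
    if not shared_targets:
        return 0.0
    changed = sum(1 for t in shared_targets if prev_match[t] != curr_match[t])
    return changed / len(shared_targets)


# Hypothetical matchings for four targets in two consecutive epochs.
epoch_1 = {0: 12, 1: 40, 2: 7, 3: 33}
epoch_2 = {0: 12, 1: 41, 2: 7, 3: 30}

rate = instability_rate(epoch_1, epoch_2)  # targets 1 and 3 switched proposals
print(rate)
```

A rate near zero means each target keeps the same proposal across epochs, so the network receives a consistent training signal.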


Abstract

Crowd counting and localization have become increasingly important in computer vision due to their wide-ranging applications. While point-based strategies have been widely used in crowd counting methods, they face a significant challenge, i.e., the lack of an effective learning strategy to guide the matching process. This deficiency leads to instability in matching point proposals to target points, adversely affecting overall performance. To address this issue, we introduce an effective approach to stabilize the proposal-target matching in point-based methods.

We propose Auxiliary Point Guidance (APG) to provide clear and effective guidance for proposal selection and optimization, addressing the core issue of matching uncertainty. Additionally, we develop Implicit Feature Interpolation (IFI) to enable adaptive feature extraction in diverse crowd scenarios, further enhancing the model's robustness and accuracy. Extensive experiments demonstrate the effectiveness of our approach, showing significant improvements in crowd counting and localization performance, particularly under challenging conditions.


Architecture of APGCC

The main contributions of APGCC are the Auxiliary Point Guidance (APG) and Implicit Feature Interpolation (IFI) modules. Together they introduce precise auxiliary point proposals that stabilize optimization during the network's matching phase.

Architecture of APGCC.

APGCC stabilizes and enhances point-based methods by incorporating two essential modules. During training, APG instructs the network on how to select and optimize point proposals for matching with target points, so that proposal selection is guided rather than left to chance. APG requires feature extraction at arbitrary positions, so we introduce IFI, which can access features from diverse locations within the network. By making the matching process more robust, our approach improves the precision and reliability of crowd analysis models.
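
The proposal-target matching that APG is meant to stabilize can be sketched as a one-to-one assignment problem. The example below is only an illustration: the cost (a weighted distance minus the proposal's confidence), the weight `dist_weight`, and the brute-force search over permutations (a stand-in for the Hungarian algorithm, feasible only for tiny inputs) are all assumptions, not details taken from the paper.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def match_cost(proposal: Point, confidence: float, target: Point,
               dist_weight: float = 0.05) -> float:
    """Assumed cost: closer and more confident proposals are cheaper."""
    return dist_weight * math.dist(proposal, target) - confidence


def match_proposals(proposals: Sequence[Point], confidences: Sequence[float],
                    targets: Sequence[Point], dist_weight: float = 0.05) -> List[int]:
    """Return, for each target, the index of its one-to-one matched proposal."""
    if len(targets) > len(proposals):
        raise ValueError("need at least as many proposals as targets")
    best_assignment: Tuple[int, ...] = ()
    best_total = math.inf
    # Brute force over all one-to-one assignments (illustration only).
    for assignment in itertools.permutations(range(len(proposals)), len(targets)):
        total = sum(
            match_cost(proposals[p], confidences[p], targets[t], dist_weight)
            for t, p in enumerate(assignment)
        )
        if total < best_total:
            best_assignment, best_total = assignment, total
    return list(best_assignment)


proposals = [(10.0, 10.0), (12.0, 10.0), (50.0, 50.0)]
confidences = [0.2, 0.9, 0.5]
targets = [(10.0, 10.0), (50.0, 50.0)]

matched = match_proposals(proposals, confidences, targets)
print(matched)
```

In this toy case the first target is matched to proposal 1 rather than proposal 0, even though proposal 0 sits exactly on the target, because proposal 1 currently has higher confidence. As confidences shift during training, such a choice can flip between neighbouring proposals, which is the kind of matching instability the text describes.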

Overview of the proposed Auxiliary Point Guidance framework


The Auxiliary Point Guidance (APG) framework designates auxiliary positive (A_pos) and auxiliary negative (A_neg) points based on the ground truth coordinates (x, y). For auxiliary positive points, the confidence should be as close to one as possible and the predicted offsets should closely match the corresponding ground truth points. For auxiliary negative points, both the confidence and the offset should be as close to zero as possible, which prevents negative points from using offsets to move their proposal coordinates toward the ground truth. This clear separation between positive and negative matches directs the optimization process.
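
The training targets described above can be written out as a small sketch. The helper names `auxiliary_targets` and `auxiliary_loss`, the example coordinates, and the squared-error loss are assumptions chosen for illustration; the paper's actual sampling of auxiliary points and its loss functions are not specified here.

```python
from typing import Tuple

Point = Tuple[float, float]


def auxiliary_targets(aux_point: Point, gt_point: Point,
                      is_positive: bool) -> Tuple[float, Point]:
    """Return (target confidence, target offset) for one auxiliary point.

    Positive: confidence 1, offset that moves the point onto the ground truth.
    Negative: confidence 0, offset 0, so it cannot drift toward the ground truth.
    """
    if is_positive:
        offset = (gt_point[0] - aux_point[0], gt_point[1] - aux_point[1])
        return 1.0, offset
    return 0.0, (0.0, 0.0)


def auxiliary_loss(pred_conf: float, pred_offset: Point,
                   target_conf: float, target_offset: Point) -> float:
    """Squared-error penalty on confidence and offset (an assumed loss)."""
    conf_term = (pred_conf - target_conf) ** 2
    offset_term = sum((p - t) ** 2 for p, t in zip(pred_offset, target_offset))
    return conf_term + offset_term


gt = (100.0, 60.0)       # hypothetical ground-truth head position
a_pos = (101.0, 59.0)    # hypothetical auxiliary positive point near gt
a_neg = (106.0, 66.0)    # hypothetical auxiliary negative point farther away

pos_target = auxiliary_targets(a_pos, gt, is_positive=True)
neg_target = auxiliary_targets(a_neg, gt, is_positive=False)
print(pos_target, neg_target)
```

A prediction that exactly meets its target gives zero loss, while a negative point that predicts non-zero confidence or a non-zero offset is penalized.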

Overview of Implicit Feature Interpolation


Implicit Feature Interpolation is a technique that enhances the precision of point-based crowd counting and localization. This method leverages the underlying structure of the feature space to interpolate features smoothly and continuously, without explicitly defining interpolation points. By doing this, it allows for more accurate alignment of predicted points with ground truth coordinates, improving the model's ability to handle varying crowd densities and complex spatial relationships. This approach results in finer granularity and better spatial dependencies in feature representations, leading to improved accuracy and robustness in crowd counting and localization tasks.
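
For intuition about reading features at arbitrary, non-integer positions, the sketch below uses plain bilinear interpolation on a single-channel feature map. This is an explicit, hand-written stand-in and not IFI itself, which is described as interpolating implicitly; the function name `sample_bilinear` and the toy feature map are assumptions.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # feature_map[row][col], single channel


def sample_bilinear(feature_map: FeatureMap, x: float, y: float) -> float:
    """Read a feature value at a continuous (x, y) position."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp the query so it stays inside the map.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    # Blend along x on the two neighbouring rows, then blend along y.
    top = (1 - wx) * feature_map[y0][x0] + wx * feature_map[y0][x1]
    bottom = (1 - wx) * feature_map[y1][x0] + wx * feature_map[y1][x1]
    return (1 - wy) * top + wy * bottom


toy_map = [[0.0, 1.0],
           [2.0, 3.0]]

center = sample_bilinear(toy_map, 0.5, 0.5)
print(center)
```

Fixed bilinear weights like these depend only on the query position. A learned, implicit interpolation can instead adapt to the features around the query point, which is the flexibility the paragraph above attributes to IFI.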


Evaluation on Crowd Counting

Our approach outperforms the state of the art in several scenarios. In each table, the best results are in bold and the second-best results are underlined. These findings affirm the effectiveness and adaptability of our approach across crowd counting scenarios.

Evaluation of crowd counting on SHHA, SHHB, UCF-QNRF and JHU-Crowd datasets.
Evaluation of crowd counting on UCF_CC_50 dataset.
Evaluation of crowd counting on NWPU dataset.

Evaluation on Crowd Localization

We benchmark our approach against a diverse array of methods. APGCC uses IFI to obtain precise features and predicts proposals that lie closer to the targets, which yields the best localization precision.

Evaluation of crowd localization on NWPU dataset.
Evaluation of crowd localization on SHHA dataset.

Visualization

By offering more precise feature representation across various scales and locations, APGCC ensures more balanced and enhanced performance for different head sizes.


BibTeX

@article{chen2024improving,
  title={Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance},
  author={Chen, I and Chen, Wei-Ting and Liu, Yu-Wei and Yang, Ming-Hsuan and Kuo, Sy-Yen and others},
  journal={arXiv preprint arXiv:2405.10589},
  year={2024}
}