PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding

Siyuan Liu* Chaoqun Zheng* Xin Zhou Tianrui Feng Dingkang Liang Xiang Bai

Huazhong University of Science and Technology, Wuhan, China

CVPR 2026

*Equal Contribution    Corresponding Author

Paper Code
PointTPA teaser visualization

Abstract

Intro qualitative overview
Time comparison

Scene-level point cloud understanding remains challenging due to diverse geometries, imbalanced category distributions, and highly varied spatial layouts. Existing methods improve object-level performance but rely on static network parameters during inference, limiting their adaptability to dynamic scene data. We propose PointTPA, a Test-time Parameter Adaptation framework that generates input-aware network parameters for scene-level point clouds. PointTPA adopts a Serialization-based Neighborhood Grouping (SNG) to form locally coherent patches and a Dynamic Parameter Projector (DPP) to produce patch-wise adaptive weights, enabling the backbone to adjust its behavior according to scene-specific variations while maintaining a low parameter overhead. Integrated into the PTv3 architecture, PointTPA demonstrates strong parameter efficiency, introducing two lightweight modules that amount to less than 2% of the backbone's parameters. Despite this minimal overhead, PointTPA achieves 78.4% mIoU on the ScanNet validation set, surpassing existing parameter-efficient fine-tuning (PEFT) methods across multiple benchmarks and highlighting the efficacy of our test-time dynamic network parameter adaptation mechanism for 3D scene understanding.

Method

PointTPA pipeline

PointTPA consists of a Serialization-based Neighborhood Grouping (SNG) and a Dynamic Parameter Projector (DPP). SNG groups tokens at spatially close positions into locally coherent patches, while DPP produces input-aware dynamic weights for each patch. During training, we freeze the backbone parameters and fine-tune only the DPP and the static Adapter modules; at inference time, DPP generates dynamic weights conditioned on the input.
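The grouping step can be sketched in a few lines. The following is a minimal, hedged illustration of serialization-based neighborhood grouping, assuming a Morton (Z-order) space-filling curve for serialization (PTv3-style backbones use such curves; the actual curve, grid size, and patch size in PointTPA may differ, and the names below are illustrative only):

```python
# Illustrative sketch of Serialization-based Neighborhood Grouping (SNG).
# Assumption: serialization follows a Morton (Z-order) curve; the grid
# resolution and patch size are placeholder values, not the paper's settings.

def morton_key(x, y, z, bits=10):
    """Interleave the bits of quantized integer coordinates into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key

def serialize_and_group(points, patch_size=4, grid=0.05):
    """Sort point indices along the space-filling curve, then chunk
    consecutive tokens into locally coherent patches."""
    quantized = [tuple(round(c / grid) for c in p) for p in points]
    order = sorted(range(len(points)), key=lambda i: morton_key(*quantized[i]))
    return [order[i:i + patch_size] for i in range(0, len(order), patch_size)]

if __name__ == "__main__":
    # Two spatial clusters: points 0,2,4,6 near the origin, 1,3,5,7 near (1,1,1).
    pts = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.05, 0.0, 0.0), (1.0, 1.05, 1.0),
           (0.0, 0.05, 0.0), (1.05, 1.0, 1.0), (0.05, 0.05, 0.0), (1.0, 1.0, 1.05)]
    patches = serialize_and_group(pts, patch_size=4)
    print(patches)  # each patch contains spatially neighboring points
```

Because consecutive positions on the curve are spatially close, chunking the serialized order yields patches whose members are neighbors, which is what lets DPP assign one set of dynamic weights per patch.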

Experiments

Our model demonstrates strong parameter efficiency and robust performance across multiple benchmarks. On the ScanNet validation set, PointTPA achieves 78.4% mIoU while introducing less than 2% of the backbone's parameters. These results highlight the advantage of our test-time dynamic network parameter adaptation mechanism over existing parameter-efficient fine-tuning (PEFT) methods.

Experimental results

Visualization

Comparison visualization
Dynamic visualization (dyn)

The first image shows additional visualization results on ScanNet, ScanNet200, ScanNet++, and S3DIS from different viewpoints, where PointTPA produces superior segmentation results. The second image visualizes the similarity of the dynamically generated projection weights across groups and stages, with similar colors indicating higher similarity. PointTPA exhibits distinct color patterns across point groups, indicating significant variation in the dynamic weights.

Citation

If you find our work helpful, please consider citing:

@inproceedings{liu2026pointtpa,
  title={PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding},
  author={Siyuan Liu and Chaoqun Zheng and Xin Zhou and Tianrui Feng and Dingkang Liang and Xiang Bai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}