Abstract
Classification-based tracking strategies often struggle more with intra-class discrimination than with inter-class separability. Even for deep convolutional neural networks, which have proven effective across a wide range of vision tasks, intra-class discriminative capability remains limited by the weakness of the softmax loss, especially for targets unseen in the training dataset. In this paper, by taking intrinsic attributes of the training samples into account, we propose a position-sensitive loss, coupled with the softmax loss, to achieve intra-class compactness and inter-class explicitness. In particular, two additive margins are introduced to encode the position attribute and maximize the decision boundary; the proposed loss is also used to supervise the fine-tuned features on the pre-trained model. Together with a nearest-neighbor ranking measurement in the feature embedding domain, the whole scheme reaches an optimized balance between feature-level inter-class semantic separability and instance-level intra-class relative distance ranking. We evaluate the proposed work on popular benchmarks, and the experimental results demonstrate that our tracking strategy performs favorably against most state-of-the-art trackers in terms of accuracy and robustness.
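To make the margin idea concrete, below is a minimal, hypothetical sketch of a softmax loss with two additive margins, in the general spirit of the loss described in the abstract. The class name `TwoMarginSoftmax`, the cosine-similarity formulation, and the `scale`, `m_pos`, and `m_neg` values are all assumptions for illustration; the sketch does not model the position attribute itself, and the paper's exact position-sensitive formulation may differ.

```python
# Hypothetical sketch: softmax loss with two additive margins.
# All names and hyperparameter values are illustrative assumptions,
# not the paper's exact position-sensitive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoMarginSoftmax(nn.Module):
    def __init__(self, feat_dim, num_classes, scale=30.0, m_pos=0.35, m_neg=0.10):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale, self.m_pos, self.m_neg = scale, m_pos, m_neg

    def forward(self, features, labels):
        # Cosine similarity between L2-normalized features and class weights.
        logits = F.linear(F.normalize(features), F.normalize(self.weight))
        one_hot = F.one_hot(labels, logits.size(1)).float()
        # Penalize the target logit and inflate non-target logits, so the
        # network must learn a wider margin on both sides of the boundary.
        logits = logits - one_hot * self.m_pos + (1.0 - one_hot) * self.m_neg
        return F.cross_entropy(self.scale * logits, labels)

# Usage example with random data:
# loss_fn = TwoMarginSoftmax(feat_dim=128, num_classes=10)
# loss = loss_fn(torch.randn(4, 128), torch.tensor([0, 1, 2, 3]))
```

Applying both margins only during training tightens each class cluster in the normalized embedding space, which is compatible with the nearest-neighbor ranking measurement the abstract describes for inference.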
| Original language | English |
| --- | --- |
| Article number | 8734874 |
| Pages (from-to) | 96-107 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Multimedia |
| Volume | 22 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 2020 |
Keywords
- position-sensitive loss
- ranking
- softmax loss
- Visual tracking
- Training
- Additives
- Semantics
- Feature extraction
- Object tracking
- Task analysis