Multi-object tracking (MOT) is a task in which object instances must be detected and associated across frames to form trajectories. The accuracy of MOT models depends heavily on the motion model they employ. A recent paper introduces a novel MOT network named SiamMOT.
It combines a region-based detection network with a Siamese-based motion model. The latter uses a pair of frames to track the target object from the first frame within a search region in the second frame. SiamMOT uses region-based features and develops explicit template matching to estimate instance motion. It is more robust to challenging tracking scenarios than existing models.
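The template-matching idea described above can be illustrated with a minimal sketch. It is not the SiamMOT implementation: raw pixel intensities stand in for the learned region features, plain cross-correlation stands in for the network's matching function, and the names `cross_correlate`, `estimate_motion`, `make_frame` and the `margin` parameter are hypothetical, chosen only for this example.

```python
def crop(frame, x, y, w, h):
    """Return the w-by-h sub-grid of `frame` whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def cross_correlate(search, template):
    """Slide `template` over `search`; return the score at every valid offset."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    scores = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            s = 0.0
            for dy in range(th):
                for dx in range(tw):
                    s += search[y + dy][x + dx] * template[dy][dx]
            row.append(s)
        scores.append(row)
    return scores


def estimate_motion(frame_a, frame_b, box, margin=2):
    """Estimate the (dx, dy) shift of `box` = (x, y, w, h) from frame_a to frame_b.

    The template is the box content in frame_a; the search region is the same
    box enlarged by `margin` pixels on each side (clipped to the frame) in frame_b.
    """
    x, y, w, h = box
    template = crop(frame_a, x, y, w, h)
    x0, y0 = max(0, x - margin), max(0, y - margin)
    x1 = min(len(frame_b[0]), x + w + margin)
    y1 = min(len(frame_b), y + h + margin)
    search = crop(frame_b, x0, y0, x1 - x0, y1 - y0)
    scores = cross_correlate(search, template)
    # Pick the offset with the highest correlation score.
    _, bx, by = max(
        (s, bx, by) for by, row in enumerate(scores) for bx, s in enumerate(row)
    )
    return (x0 + bx - x, y0 + by - y)


def make_frame(size, bx, by):
    """Toy frame: zeros everywhere except a 2x2 patch at (bx, by)."""
    frame = [[0.0] * size for _ in range(size)]
    patch = [[1.0, 2.0], [3.0, 4.0]]
    for dy in range(2):
        for dx in range(2):
            frame[by + dy][bx + dx] = patch[dy][dx]
    return frame


frame_a = make_frame(8, 2, 2)   # target at (2, 2)
frame_b = make_frame(8, 4, 3)   # target moved to (4, 3)
print(estimate_motion(frame_a, frame_b, (2, 2, 2, 2)))  # (2, 1)
```

Unnormalized correlation on raw pixels is biased toward bright regions, which is one reason a real tracker would match learned features instead; the toy frames here are constructed so that the peak is unambiguous.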
The experiments show that the proposed model improves tracking performance compared with state-of-the-art models, particularly when cameras move quickly and when people's poses deform significantly.
In this paper, we focus on improving online multi-object tracking (MOT). In particular, we introduce a region-based Siamese Multi-Object Tracking network, which we name SiamMOT. SiamMOT includes a motion model that estimates the instance's movement between two frames such that detected instances are associated. To explore how the motion modelling affects its tracking capability, we present two variants of Siamese tracker, one that implicitly models motion and one that models it explicitly. We carry out extensive quantitative experiments on three different MOT datasets: MOT17, TAO-person and Caltech Roadside Pedestrians, showing the importance of motion modelling for MOT and the ability of SiamMOT to substantially outperform the state-of-the-art. Finally, SiamMOT also outperforms the winners of ACM MM'20 HiEve Grand Challenge on HiEve dataset. Moreover, SiamMOT is efficient, and it runs at 17 FPS for 720P videos on a single modern GPU.
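To illustrate why an estimate of each instance's movement helps associate detections, here is a minimal sketch under stated assumptions. It is not the association procedure from the paper: boxes are `(x1, y1, x2, y2)` tuples, matching is a simple greedy pass over intersection-over-union (IoU) scores, and the names `associate` and `min_iou` are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, motions, detections, min_iou=0.5):
    """Greedily match tracks to detections after applying estimated motion.

    tracks:     {track_id: box in the previous frame}
    motions:    {track_id: (dx, dy)} estimated shift; missing ids are treated as static
    detections: list of boxes in the current frame
    Returns {track_id: index into detections}.
    """
    candidates = []
    for tid, (x1, y1, x2, y2) in tracks.items():
        dx, dy = motions.get(tid, (0.0, 0.0))
        predicted = (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
        for j, det in enumerate(detections):
            score = iou(predicted, det)
            if score >= min_iou:
                candidates.append((score, tid, j))
    candidates.sort(reverse=True)  # best-overlapping pairs first
    matches, used = {}, set()
    for _, tid, j in candidates:
        if tid not in matches and j not in used:
            matches[tid] = j
            used.add(j)
    return matches


# Two instances swap places between frames.
tracks = {1: (0, 0, 10, 10), 2: (20, 0, 30, 10)}
detections = [(0, 0, 10, 10), (20, 0, 30, 10)]

print(associate(tracks, {}, detections))                          # {2: 1, 1: 0}
print(associate(tracks, {1: (20, 0), 2: (-20, 0)}, detections))   # {2: 0, 1: 1}
```

In this toy case the identities are swapped when motion is ignored and preserved when the estimated shifts are applied, which is the kind of fast-motion scenario where the abstract reports the motion model matters most.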