Automated crowd behavior analysis is an important task for intelligent transportation systems, enabling effective flow control and dynamic route planning for different road participants. Crowd counting is one of the keys to automated crowd behavior analysis. In recent years, crowd counting using deep convolutional neural networks (CNNs) has made encouraging progress. Researchers have devoted considerable effort to designing variant CNN architectures, most of which are based on the pre-trained VGG16 model. Because of its insufficient expressive capacity, the VGG16 backbone is usually followed by another cumbersome network designed specifically for good counting performance. Although VGG models already outperform Inception models in image classification tasks, existing crowd counting networks built with U-Net still have only a small number of layers with basic types of U-Net modules. To fill this gap, in this paper we first benchmark a baseline U-Net model on commonly used crowd counting datasets and achieve surprisingly good performance, comparable to or better than most existing crowd counting models. We then push the limits of crowd counting further by proposing a segmentation-guided attention network with U-Net as the backbone, together with a novel curriculum loss for crowd counting.