IT-BiCorpus-EN

ID	原文	译文
40576	目标检测任务对于检测任务精度和实时性都有很高要求，YOLOv3-tiny网络在这两点有很好的表现.	Object detection tasks place high demands on both accuracy and real-time performance, and the YOLOv3-tiny network performs well on both counts.
40577	但是其复杂的网络结构，使得实际应用需要从软件和硬件方面都进行针对性的优化.	However, its complex network structure means that practical applications require targeted optimization in both software and hardware.
40578	为了达到实时要求，综合使用三种优化技术:	To meet the real-time requirement, three optimization techniques are combined:
40579	在软件层面，通过融合批归一层降低计算量，低位宽增大资源利用率;	At the software level, batch normalization layers are fused to reduce the amount of computation, and low bit widths are used to increase resource utilization.
40580	设计多维度并行FPGA计算核心匹配多个卷积层，提高整体吞吐率;	Multi-dimensional parallel FPGA compute cores are designed to match multiple convolutional layers, improving overall throughput.
40581	细粒度层间流水和pingpong缓存设计，降低数据传输时间.	Fine-grained inter-layer pipelining and a ping-pong buffer design reduce data transfer time.
40582	在ZCU104型号的FPGA上，实现了418ⅹ418图片的21ms检测延时，超过同类加速器设计，并在DSP效率上有2.86倍或者8.81倍的提升.	On a ZCU104 FPGA, the design achieves a detection latency of 21 ms for 418 × 418 images, outperforming comparable accelerator designs, and improves DSP efficiency by a factor of 2.86 or 8.81.
40583	当前基于忆阻器的神经网络加速器存在的资源需求高、系统功耗大等问题，	Current ReRAM-based neural network accelerators suffer from problems such as high hardware resource demand and high system power consumption.
40584	提出了一种包含剪枝及量化算法在内的神经网络模型压缩框架.	A neural network model compression framework comprising pruning and quantization algorithms is proposed.
40585	根据忆阻器阵列紧密耦合的特点，设计了一种忆阻器阵列感知的规则化增量剪枝算法，在保证模型准确度的条件下实现了硬件资源的节省;	Based on the tightly coupled nature of ReRAM crossbar arrays, a crossbar-aware structured incremental pruning algorithm is designed, which saves hardware resources while preserving model accuracy.