ID 原文 译文
943 为对 CUDA 并行程序内核性能进行分析和预测,从而指导并行程序设计及性能优化,提出一种性能预测框架。 In order to analyze and predict the performance of CUDA program kernel and guide parallel program design and performance optimization, a performance prediction framework is proposed.
944 1)从 GPU 编程模型和设备架构细节入手,以线程束为研究单位, This paper starts with the GPU programming model and hardware architecture details, with warp as the research unit.
945 通过整合与 GPU 程序用时密切相关的软硬件基本特征,定义了并行空间闲置度、流处理器线程束负载、并行效应因子等高层次性能相关特征。 By integrating hardware and software factors closely related to GPU program time, high-level performance-related features such as device parallel space idle degree (DPSID) and number of streaming multiprocessor warp (NSMW) are defined.
946 2)基于上述特征,框架针对线程负载均衡型 GPU 程序,评估内核函数在不同问题规模以及执行配置下的执行时间。 Based on the above features, a framework for evaluating the execution time of kernel functions under different problem sizes and execution configurations is built for thread load balancing GPU programs.
947 3)依据性能评估原理提出了内核函数执行配置参数的优化策略。 The principle of optimizing configuration parameters of kernel function execution is put forward to guide optimizing program performance.
948 验证实验结果表明,该框架在两种典型情境下对现有程序性能的平均预测准确率分别达到 89% 和 94% ,客观归纳了高层次特征与程序性能间的相关关系,且能定性分析并行算法性能水平。 The experimental results show that the average prediction accuracy of the framework is 89% and 94% in the two scenarios, respectively.
949 近似线性相位是许多滤波系统希望具有的重要特性,全通滤波器则是实现这个特性的重要技术手段。 Linear phases are often a requisite in many signal filtering applications. All-pass digital filters are the main device to realize the linear phases.
950 提出全通数字滤波器的一种迭代重加权 minimax 设计方法,最小化最大加权线性相位偏差,并利用群延迟偏差信息对权函数进行迭代更新,使群延迟偏差函数近似等纹波。 An iterative reweighted minimax method is proposed for the design of all-pass digital filters. The method minimizes the maximum weighted phase deviation from some linear phase, and reduces the maximum group-delay deviation from some constant group delay through updating the phase-error weight function iteratively by virtue of the group-delay deviation.
951 为展现该方法的优越性,将该方法应用于线性相位希尔伯特变换器及相位均衡器的设计。 To demonstrate its advantages, the proposed method is applied in the designs of phase equalizers and linear-phase Hilbert transformers.
952 设计例子展示了算法良好的收敛性,以及在实现滤波系统近似线性相位和近似常数群延迟方面的优良性能。 Simulation results show that the proposed algorithm has good convergence properties, and is very effective in realizing the nearly linear phase and nearly constant group delay of the filtering system.