ID | 原文 | 译文 |
14975 | 为解决这个问题,提出并实践了一种基于矩阵嵌套思想的负载平衡优化方案检索策略帮助完成进程布局优化过程,并介入基于模式并行要求的筛查保证检索结果具有可行性,最终通过实验证明通过这种检索策略搜索获得的最优布局与默认布局相比平均计算性能提升达到47.3%,并在5个节点上实现了1.419的加速比。 | To solve this problem, this paper proposes and implements a search strategy for load-balancing optimization schemes, based on the idea of matrix nesting, to support process layout optimization, and adds a screening step based on the model's parallelism requirements to ensure that the search results are feasible. Experiments show that the optimal layout found by this search strategy improves average computing performance by 47.3% over the default layout and achieves a speedup of 1.419 on 5 nodes. |
14976 | 区域气候模式CWRF(Climate-Weather Research and Forecasting model)是国家气候中心区域气候预测系统的重要组成部分,也是系统最耗时的程序。 | The regional climate model CWRF (Climate-Weather Research and Forecasting model) is an important component of the regional climate prediction system of the National Climate Center, and is also the most time-consuming program in that system. |
14977 | 高性能计算是提高CWRF数值预报计算性能的关键技术,开展CWRF模式在国产神威众核架构上的移植和优化,提高模式的模拟效率,对模式的扩展、开发能力和可持续发展具有重要意义。 | High-performance computing is a key technology for improving the computational performance of CWRF numerical prediction. Porting the CWRF model to the domestic Sunway many-core architecture and optimizing it there to improve simulation efficiency is of great significance for the model's extension, development capability, and sustainable development. |
14978 | 基于国产众核SW26010处理器,完成了CWRF区域气候模式的移植、性能分析和深入性能优化,采用访存优化、Cache命中率优化及众核加速优化等方法,对CWRF模式动力过程、物理过程和I/O过程计算代码进行重构及众核加速。 | Based on the domestic SW26010 many-core processor, this paper completes the porting, performance analysis, and in-depth performance optimization of the CWRF regional climate model. Memory access optimization, cache hit rate optimization, and many-core acceleration are used to refactor and accelerate the computational code of the dynamical, physical, and I/O processes of CWRF. |
14979 | 结果表明:优化技术可使CWRF动力过程平均加速2倍,最高加速6.4倍,物理过程平均加速1.7倍,最高加速5.4倍,I/O过程加速1.2倍,程序整体最高加速1.4倍,计算误差在合理范围内。 | The results show that the optimizations speed up the CWRF dynamical process by 2 times on average and by up to 6.4 times, the physical process by 1.7 times on average and by up to 5.4 times, and the I/O process by 1.2 times; the overall program is sped up by up to 1.4 times, and the computational error stays within a reasonable range. |
14980 | 新一代全球/区域多尺度统一的同化与数值预报系统Global/Regional Assimilation and PrEdiction System(GRAPES)是中国气象局(China Meteorological Administration, CMA)自主研发的数值天气预报软件。 | The Global/Regional Assimilation and PrEdiction System (GRAPES), a new-generation unified global/regional multi-scale assimilation and numerical prediction system, is numerical weather prediction software developed independently by the China Meteorological Administration (CMA). |
14981 | 随着对模式分辨率和预测时效性要求的提高,GRAPES的输入输出(I/O)性能成为了一个重要的瓶颈。分析了GRAPES区域模式的I/O行为,提出并设计实现了一个高性能I/O框架。 | As requirements for model resolution and forecast timeliness increase, the input/output (I/O) performance of GRAPES has become a major bottleneck. This paper analyzes the I/O behavior of the GRAPES regional model, and proposes, designs, and implements a high-performance I/O framework. |
14982 | 该框架采用二进制编码以及多I/O通道技术实现了灵活可配置的输出方式。 | The framework uses binary encoding and multiple I/O channels to provide a flexible, configurable output method. |
14983 | 同时,通过非堵塞通信的方式实现了异步I/O,隐藏了I/O与通信的开销。 | In addition, asynchronous I/O is implemented through non-blocking communication, which hides the overhead of I/O and communication. |
14984 | 工作在曙光“派”超级计算机上进行了测试,结果显示该框架不仅可以提高I/O性能达到10倍以上,也可以减少性能抖动带来的性能不确定性问题。 | The framework was tested on the Sugon Pai supercomputer, and the results show that it not only improves I/O performance by more than 10 times but also reduces the performance uncertainty caused by performance jitter. |