IT-BiCorpus-EN

ID	原文	译文
56888	实验表明:（1）在Apache Calcite上,与一系列剪枝的启发式算法相比, RLO搜索计划的效率为它们的10~56倍,并且生成的计划能更快地执行（80%的加速）;（2）与原生的Postgres相比, RLO搜索计划的效率是其14倍,并且在端到端的执行中达到12.	Extensive experiments demonstrate that: (1) on Apache Calcite, RLO is 10×–56× faster in finding the execution plan and 80% faster in executing the plan than the state-of-the-art heuristics.
56889	9%的加速.	(2) Compared with the native Postgres implementation, RLO can be 14× faster in finding the execution plan and 12.9% faster in an end-to-end comparison.
56890	针对"信息孤岛"中的关系数据融合问题,本文提出并实现了多源关系数据融合的基本框架(multi-source relational data fusion, MSF).	Focusing on the problem of relational data fusion in environments with “isolated information islands”, this paper presents a multi-source relational data fusion (MSF) framework.
56891	框架包含3个主要模块:模式匹配、实体对齐、实体融合.	The framework consists of three components: schema matching, entity alignment, and entity fusion.
56892	模式匹配面向多源关系数据的属性对齐问题,结合属性值的多维特征,提出基于匈牙利(Hungarian)算法的属性间对齐发现机制,实现了多源关系数据的快速模式匹配.实体对齐连接多源关系中的元组对,通过引入多样性取样策略和实体特征抽取方法,提升了实体对齐的效果.	Based on the Hungarian algorithm, we propose an alignment discovery mechanism for attribute alignment among multi-source relational data. By extracting the multi-dimensional features of attribute values, we efficiently realize schema matching of multi-source relational data.
56893	最后将对齐实体进行融合,为数据分析提供统一的数据视图.	To link the tuple pairs from multi-source data, we introduced the diversity sampling strategy and the entity feature extraction approach.
56894	为了验证MSF的效果和效率,实现了数据融合系统DataPuzzle,并在该系统上,结合真实公开的多领域数据,对提出的方法进行了验证.	These can effectively improve the performance of entity alignment. Finally, linked entities are fused to provide a unified data view for data analysis. To verify the effectiveness and efficiency of the proposed methods, we implemented a fusion system called DataPuzzle and validated the methods on it with real public multi-domain data.
56895	结果表明,所提出的方法可以高效地实现数据融合,具有较高的查全率、查准率.	Experimental results demonstrate that the proposed methods can fuse multi-source relational data efficiently with high recall and precision.
56896	机器学习依赖大量样本的统计信息进行模型的训练,从而能对未知样本进行精准的预测.	Although they achieve inspiring performance in many real-world applications, machine learning methods require a huge number of training examples to obtain an effective model.
56897	搜集样本及标记需要耗费大量的资源,因而如何基于少量样本(few-shot learning)进行模型的训练至关重要.	Considering the effort of collecting labeled training data, few-shot learning, i.e., learning with a budgeted training set, is necessary and useful.