ID 原文 译文
2523 面对多个属性约简,人们往往难以区别,缺乏有效的手段选取最优或较优的属性约简。 When faced with multiple attribute reducts, people often find it hard to distinguish them and lack effective means of selecting the best or a better one.
2524 使用多种概念漂移的度量指标和信息损失的度量方法比较了同一个知识系统中不同 Pawlak 约简之间的区别与联系。 Several measures of concept drift and of information loss are used to compare the differences and connections between different Pawlak reducts in the same knowledge system.
2525 提出了属性约简重心的概念,并研究其性质。 The concept of the focus of attribute reducts is proposed, and its properties are investigated.
2526 实验结果显示,在众多的属性约简中,离重心最近的属性约简在分类准确率方面具有较大的优势。 Experimental results show that, among many attribute reducts, the one closest to the focus of attribute reducts has a considerable advantage in classification accuracy.
2527 概念漂移的度量指标和信息损失的度量方法有助于区分不同的属性约简,属性约简的重心有助于在众多的属性约简中选择最优或较优的一个。 The measures of concept drift and of information loss help distinguish different attribute reducts, and the focus of attribute reducts helps select the best or a better one among many attribute reducts.
2528 本文研究敏感属性与部分准标识符属性存在相关时,如何有效减小重构攻击导致的隐私泄漏风险。 This paper investigates how to effectively reduce the risk of privacy leakage caused by reconstruction attacks when the sensitive attributes are correlated with some of the quasi-identifier attributes.
2529 首先,用互信息理论寻找原始数据集中对敏感属性具有强依赖关系的准标识符属性,为精确扰动数据属性提供理论依据; First, mutual information theory is used to find the quasi-identifier attributes in the original dataset that are strongly dependent on the sensitive attributes, which provides a theoretical basis for perturbing the data attributes precisely.
2530 其次,针对关联属性和非关联属性,应用不变后随机响应方法分别对某个数据属性或者属性之间的组合进行扰动,使之满足局部 ε-差分隐私要求,并理论分析后数据扰动对隐私泄露概率和数据效用的影响; Second, for the correlated and non-correlated attributes, the invariant post-randomized response method is applied to perturb a single data attribute or a combination of attributes, respectively, so that the local ε-differential privacy requirement is satisfied. The impact of the data perturbation on privacy leakage probability and data utility is also analyzed theoretically.
2531 最后,实验验证所提算法的有效性和处理增量数据的能力,理论分析了数据结果。 Finally, experiments verify the effectiveness of the proposed algorithm and its ability to process incremental data, and the results are analyzed theoretically.
2532 由实验结果可知,算法可以更好地达到数据效用和隐私保护的平衡。 The experimental results demonstrate that the algorithm can achieve a better balance between data utility and privacy protection.