ID |
原文 |
译文 |
56878 |
分布式数据管理系统不仅需要解决数据分区与多副本带来的节点间元数据同步问题,还要支持高效查询请求处理. |
A distributed data management system not only needs to solve the problem of metadata synchronization between nodes caused by data partitioning and multiple replicas, but also needs to support efficient query request processing. |
56879 |
本文针对节点间元数据同步问题提出了双层粒度元数据管理策略,在此基础上基于一致性哈希分区方法和Raft协议设计了同时支持强一致性查询和最终一致性查询的分布式框架. |
To solve the problem of metadata synchronization among nodes, we propose a dual-layer granularity metadata management strategy. On this basis, using the consistent hashing partitioning method and the Raft protocol, we design a distributed framework that supports both strongly consistent and eventually consistent queries. |
56880 |
基于单机版Apache IoTDB进行了系统实现与实验测试,测试结果表明:双层粒度元数据管理策略与单层粒度管理策略相比,其元数据内存资源占用更少且写入性能提升5%~10%,并且分布式Apache IoTDB的读写性能随着集群规模的扩大而线性增长. |
We implemented and tested the system on top of the single-machine version of Apache IoTDB. The test results show that, compared with the single-layer granularity management strategy, the dual-layer granularity metadata management strategy uses less memory for metadata and improves write performance by 5%~10%. The results also show that the read and write performance of the distributed Apache IoTDB increases linearly as the cluster size grows. |
56881 |
连接优化是数据库领域最重要的研究问题之一. |
Join optimization is one of the most important research problems in database systems. |
56882 |
传统的连接优化方法一般应用基础启发式规则,他们通常搜索代价很高,并且很难发现最优的执行计划. |
Traditional join optimizers generally rely on basic heuristic rules; their search is usually expensive, and they often fail to find the optimal execution plan. |
56883 |
主要原因有两个:(1)这些基于规则的优化方法只能探索解空间的一个子集,(2)他们没有利用历史信息,不能够很好地衡量执行计划的代价,经常重复选择相同的糟糕计划. |
There are two main reasons for this. (1) These rule-based optimizers explore only a subset of the search space. (2) They do not use historical information, cannot estimate the cost of their execution plans well, and often choose the same poor plan repeatedly. |
56884 |
为了解决以上两个问题,我们提出RLO (reinforcement learning optimization),一个基于强化学习的连接优化方法. |
To tackle these challenges, we propose RLO, a reinforcement learning-based optimizer for join optimization. |
56885 |
我们将连接优化问题建模成马尔可夫(Markov)决策过程,并且使用深度Q-学习来估计每一种可能的执行计划的执行代价. |
We model the join optimization problem as a Markov decision process and use deep Q-learning to estimate the possible reward of a possible operation. |
56886 |
为了进一步增强RLO的有效性,我们提出了基于树形结构的嵌入方法和集束搜索策略来尽量避免错过最好的执行计划. |
To boost the effectiveness of RLO, we further propose a tree-based embedding method to represent the “state” and use beam search to avoid missing the optimal plan. |
56887 |
我们在Apache Calcite和Postgres上实现了RLO. |
We implement RLO in Apache Calcite and Postgres. |