YARM: Efficient and Scalable RDFS Semantic Reasoning Engine Based on MapReduce.

YARM
Primary goals:
  • Optimize the distributed parallel reasoning algorithm on MapReduce.
  • Overcome the limited scalability of existing semantic reasoning engines.
  • Improve the efficiency of reasoning.
YARM includes four major optimizations:
  • It adopts a well-designed data partitioning scheme and a corresponding reasoning algorithm to minimize the amount of data transferred among computing nodes.
  • It optimizes the execution order of the reasoning rules to improve the computing speed.
  • It removes the duplicates produced during reasoning efficiently, which avoids the extra MapReduce jobs otherwise needed for deduplication.
  • Building on the optimizations above, it implements a new parallel reasoning algorithm on the Hadoop MapReduce framework.
Experimental results on both real-world and synthetic datasets show that YARM is about 10 times faster than the latest reasoning engine and also achieves better scalability.
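
To make the optimizations listed above more concrete, here is a minimal single-process sketch in Python. It is not YARM's actual algorithm, which this summary does not spell out; it only illustrates the general ideas under stated assumptions. The assumptions are: the small schema is closed under rdfs:subClassOf first (RDFS rule rdfs11) so that the type-inheritance rule (rdfs9) needs only one pass; instance triples are partitioned by subject so that every derivation of the same conclusion lands in the same reduce group; and each group emits a set, so duplicates are dropped locally without a separate job. All function names (subclass_closure, map_phase, shuffle, reduce_phase, infer_types) are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def subclass_closure(schema):
    """Transitive closure of rdfs:subClassOf (rule rdfs11).

    Assumption: the schema is small enough to hold in memory on every node,
    so it is closed once up front instead of being joined in a MapReduce job.
    """
    parents = defaultdict(set)
    for s, p, o in schema:
        if p == SUBCLASS_OF:
            parents[s].add(o)
    closure = {}
    for cls in list(parents):
        seen, stack = set(), list(parents[cls])
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.add(c)
                stack.extend(parents.get(c, ()))
        closure[cls] = seen
    return closure


def map_phase(instances):
    """Key each type assertion by its subject (the partitioning key)."""
    for s, p, o in instances:
        if p == RDF_TYPE:
            yield s, o


def shuffle(pairs):
    """Stand-in for the MapReduce shuffle: group values by key."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return groups


def reduce_phase(subject, classes, closure):
    """Apply rdfs9 for one subject and deduplicate inside the group."""
    derived = set()
    for c in classes:
        derived |= closure.get(c, set())
    # Emit only new triples; the set guarantees each appears once.
    return {(subject, RDF_TYPE, c) for c in derived - classes}


def infer_types(schema, instances):
    """Run the closure, then one map/shuffle/reduce pass over the instances."""
    closure = subclass_closure(schema)
    inferred = set()
    for subject, classes in shuffle(map_phase(instances)).items():
        inferred |= reduce_phase(subject, classes, closure)
    return inferred
```

For example, given a schema where Student and Employee are both subclasses of Person, and a subject typed as both Student and Employee, the conclusion that the subject is a Person can be derived twice. Because both derivations fall into the same subject group, the sketch emits it only once.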