Marlin
Background:

Matrix computation is the core of many massive data-intensive scientific applications such as large-scale numerical analysis, data mining, and computational physics. In the Big Data era, as the scale of the matrix grows, traditional single-node matrix computation can hardly cope with such large data and computation...More About Marlin→

Dolphin
Background:

Deep learning is a unsupervised feature extraction algorithm. The training process of it usually contains large number matrix operations. It can be quite time-consuming as data size increases. Intel Xeon Phi is a many-core platform introduced by Intel which is suitable for vector operations. With the help Intel MKL math lib, The platform can deal with the training process easily...More About Dolphin→

Cichild
Introduction:

Cichlid is a distributed RDFS & OWL reasoning system based on Spark. Cichlid achieves higher efficiency and scalability than the existing large scale reasoning systems based the MapReduce framework or the P2P self-organizing networks. The major contributions and novelties of Cichlid are summarized as follows...More About Cichlid→

Perf
Background:

Distributed file systems (DFS) store large-scale data and play an import role in various Big Data applications. DFS form the cornerstones of the upper distributed computing frameworks, which makes them become widely-used and diversified...More About DFS-Perf→

YARM
Primary goals:

Optimize the distributed parallel reasoning algorithm on MapReduce;Overcome the lack of scalability of the existing semantic reasoning engines;To improve the efficiency of reasoning...More About YARM→

HIBase
Introduction:

HiBase is a hierarchical indexing mechanism and a prototype distributed data-storage system, which has hierarchical indexes for non-primary keys in tables and makes data querying more efficient. HiBase uses HBase as the lower data storage and creates a memory cache with Hot Score Algorithm for more efficient data transmission...More About HIBase→

YAFIM
Background:

The frequent itemset mining (FIM) is one of the most important techniques to extract knowledge from data in many real-world...More About YAFIM→