Supporting Efficient Family Joins for Big Data Tables via Multiple Freedom Family Index
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10855900/ |
Summary: | The Hadoop/MapReduce framework has been widely utilized for processing big data. To overcome the limitations of existing work and meet the growing requirements of querying big data, this paper introduces novel join operations, called family joins, for HBase tables using their column families as join keys. Family joins possess the closure property that is demanded by many big data applications. This work explores four types of family joins according to different types of freedom in prefix matching for join comparisons. Two approaches to processing such family joins are discussed. The first is the direct method, which is inspired by the straightforward nested-loop strategy. The second is an index-based method, which utilizes a special index for HBase tables. Detailed definitions, practical applications, and processing strategies and algorithms for family joins are provided. Experimental results demonstrate that the index-based join method is quite promising in efficiently processing family joins. |
---|---|
ISSN: | 2169-3536 |