Integration and fusion techniques in Big Data

Université Amar Telidji - Laghouat - Département d'informatique
Big data is a term for datasets so large and complex that traditional processing approaches cannot handle them efficiently. Big data usually comes from the Internet, enterprise systems, the Internet of Things, and other information systems. Data integration has become an active research area because information sources keep multiplying and users and applications need to integrate and fuse data from these different sources. Addressing the big data integration challenge is critical to realizing the promise of big data. Big data integration differs from traditional data integration along several dimensions: volume, velocity, variety, and veracity. In this document we define big data and present some of the technologies and tools developed to handle it. We then explore the new solutions developed by the data integration community for schema mapping, record linkage, and data fusion in response to these novel challenges. Finally, we implement a MapReduce-based linkage method and present the tools and algorithms used, together with the results this method produces across different computational nodes.
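To make the idea of MapReduce-based record linkage concrete, the following is a minimal single-process sketch in Python. It is not the implementation described in this document: the blocking key (the first three characters of the normalized name), the similarity measure (`difflib.SequenceMatcher`), the 0.85 threshold, and all function names are assumptions chosen for illustration. The map step assigns each record to a block, the shuffle step groups records by block, and the reduce step compares only the records that share a block.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def map_record(record):
    """Map step: emit (blocking_key, record).

    The blocking key is an assumption: the first three characters
    of the normalized name.
    """
    rec_id, name = record
    clean = normalize(name)
    yield clean[:3], (rec_id, clean)


def shuffle(pairs):
    """Shuffle step: group mapped values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_block(records, threshold=0.85):
    """Reduce step: compare every pair inside one block.

    The threshold value is hypothetical and would need tuning on real data.
    """
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        score = SequenceMatcher(None, name_a, name_b).ratio()
        if score >= threshold:
            yield id_a, id_b, round(score, 2)


def link_records(records, threshold=0.85):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = (pair for rec in records for pair in map_record(rec))
    blocks = shuffle(mapped)
    matches = []
    for block in blocks.values():
        matches.extend(reduce_block(block, threshold))
    return matches


if __name__ == "__main__":
    sample = [(1, "John Smith"), (2, "john  smith"),
              (3, "Jon Smith"), (4, "Mary Jones")]
    print(link_records(sample))
```

In this sketch, records 1 and 2 fall into the same block and are linked, while record 3 ("Jon Smith") lands in a different block and is never compared with them. This shows the usual trade-off of blocking: fewer comparisons per node at the cost of possibly missed matches when the key is too strict.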