INTRODUCTION TO HADOOP & SPARK
INTRODUCTION

Hadoop and Apache Spark are both big-data frameworks, but a direct comparison is difficult: they do many of the same things, yet in some areas they do not overlap at all.

Hadoop is essentially a distributed data infrastructure. It distributes massive data collections across multiple nodes within a cluster of commodity servers, which means you don't need to buy and maintain expensive custom hardware. It also indexes and keeps track of that data, enabling big-data processing and analytics far more effectively than was possible previously.

Spark, on the other hand, is a data-processing tool that operates on those distributed data collections; it doesn't do distributed storage.

Hadoop has many components, or modules, that work together to create the Hadoop framework. The primary Hadoop framework modules are:

· Hadoop Common
· ...