CHALLENGES IN USING HADOOP VS SPARK
• Hadoop and Spark are both Apache projects: open-source, free software products. Both are designed to run on commodity hardware (white-box server systems), so in general their costs are low and comparable.
• They are highly compatible with each other. Through JDBC and ODBC, Spark shares all of MapReduce’s data sources and file formats.
• Spark is reported to be up to 10 times faster than MapReduce in batch processing and up to 100 times faster in in-memory analytics. The reason is that MapReduce operates in steps: read data from the cluster, perform an operation, write results to the cluster, read updated data from the cluster, perform the next operation, write the next results to the cluster, and so on. Spark, by contrast, performs all data analytics operations in memory and in near real time: read data from the cluster, perform all of the requisite analytic operations, write results to the clust...
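The difference between the two execution patterns can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not Hadoop or Spark code: `ToyCluster`, `run_stepwise`, `run_in_memory`, and `OPERATIONS` are hypothetical names introduced here, and the model counts cluster reads and writes rather than measuring real speed.

```python
from typing import Callable, List

Records = List[int]
Operation = Callable[[Records], Records]


class ToyCluster:
    """Hypothetical stand-in for cluster storage that counts I/O operations."""

    def __init__(self, data: Records) -> None:
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def read(self) -> Records:
        self.reads += 1
        return list(self.data)

    def write(self, records: Records) -> None:
        self.writes += 1
        self.data = list(records)


def run_stepwise(cluster: ToyCluster, operations: List[Operation]) -> Records:
    """MapReduce-like pattern: read, apply one operation, write, repeat."""
    for op in operations:
        records = cluster.read()
        cluster.write(op(records))
    return list(cluster.data)


def run_in_memory(cluster: ToyCluster, operations: List[Operation]) -> Records:
    """Spark-like pattern: read once, apply every operation in memory, write once."""
    records = cluster.read()
    for op in operations:
        records = op(records)
    cluster.write(records)
    return records


# Three illustrative operations (assumed for this sketch): double, filter, sort.
OPERATIONS: List[Operation] = [
    lambda rs: [r * 2 for r in rs],
    lambda rs: [r for r in rs if r > 4],
    lambda rs: sorted(rs, reverse=True),
]

if __name__ == "__main__":
    stepwise = ToyCluster([1, 2, 3, 4, 5])
    in_memory = ToyCluster([1, 2, 3, 4, 5])
    print(run_stepwise(stepwise, OPERATIONS), stepwise.reads, stepwise.writes)
    print(run_in_memory(in_memory, OPERATIONS), in_memory.reads, in_memory.writes)
```

Both functions produce the same result, but in this model the step-wise version touches the cluster once per operation in each direction, while the in-memory version reads once and writes once regardless of how many operations are chained.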