Hadoop has three major components: 1. HDFS, a highly reliable, high-throughput distributed file system; 2. MapReduce, a distributed offline parallel computing framework; and 3. YARN, a distributed resource management framework.
The three major components of Hadoop:
1. HDFS
A highly reliable, high-throughput distributed file system used to store massive amounts of data.
Distributed: data is spread across the machines of the cluster.
Secure: each block is replicated (three copies by default), so the loss of a single node does not lose data.
Block storage: files are split into fixed-size blocks, 128 MB by default in Hadoop 2.x and later.
For example, a 200 MB file is stored as one 128 MB block plus one 72 MB block (a small arithmetic sketch follows this list).
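To make the block arithmetic concrete, here is a minimal, self-contained Java sketch. The class name and the hard-coded sizes are illustrative only; on a real cluster the block size comes from the dfs.blocksize setting rather than a constant.

```java
// Minimal sketch: how HDFS splits a file into fixed-size blocks.
// The 128 MB value mirrors the dfs.blocksize default; the 200 MB
// file size and the class name are hypothetical.
public class BlockSplit {
    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // 128 MB, the HDFS default
        long fileSize = 200L * 1024 * 1024;  // a hypothetical 200 MB file

        long fullBlocks = fileSize / blockSize; // -> 1 full block
        long lastBlock = fileSize % blockSize;  // -> 72 MB remainder

        System.out.println("Full 128 MB blocks: " + fullBlocks);
        if (lastBlock > 0) {
            System.out.println("Final block: " + (lastBlock / (1024 * 1024)) + " MB");
        }
        // Note: HDFS does not pad blocks; the final block
        // occupies only its actual 72 MB on disk.
    }
}
```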
2. MapReduce
A distributed offline parallel computing framework for processing massive amounts of data.
Distributed: the computation runs in parallel across the cluster.
Core idea: divide and conquer.
A large data set is divided into many small data sets.
Each small data set is processed with the business logic in parallel (map).
The partial results are then merged and aggregated (reduce).
The classic word-count sketch after this list shows both phases.
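As an illustration of divide and conquer, here is the classic word-count sketch using the Hadoop MapReduce Java API. The class names are illustrative, and the job-driver boilerplate that submits the job is omitted.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map phase: each input split is tokenized independently,
// emitting (word, 1) pairs -- the "divide" step.
public class WordCountMapper
        extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
        }
    }
}

// Reduce phase: all counts for the same word are merged -- the "conquer" step.
class WordCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```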
3. YARN
A distributed resource management framework.
It manages the resources of the entire cluster (memory, CPU cores).
It allocates and schedules those resources among the applications running on the cluster (a small query sketch follows).
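A minimal sketch of querying cluster resources through the YarnClient Java API. It assumes yarn-site.xml is on the classpath so the client can reach the ResourceManager; getMemorySize() requires Hadoop 2.8+, while older versions use getMemory().

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;

// Queries YARN's ResourceManager for the memory and vCores
// that each running node contributes to the cluster.
public class ClusterResources {
    public static void main(String[] args) throws Exception {
        // Assumes yarn-site.xml is on the classpath with the RM address.
        Configuration conf = new Configuration();
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();
        try {
            for (NodeReport node : yarnClient.getNodeReports(NodeState.RUNNING)) {
                System.out.println(node.getNodeId()
                        + " memory(MB)=" + node.getCapability().getMemorySize()
                        + " vCores=" + node.getCapability().getVirtualCores());
            }
        } finally {
            yarnClient.stop();
        }
    }
}
```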