How to build a Hadoop development environment on Debian

This guide details how to build a Hadoop development environment on a Debian system.
1. Install Java Development Kit (JDK)
First, install OpenJDK:
sudo apt update
sudo apt install openjdk-11-jdk -y
Configure the JAVA_HOME environment variable:
sudo nano /etc/environment
Add at the end of the file (adjust the path according to the actual JDK version):
JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
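Save and exit. /etc/environment is read at login rather than run as a shell script, so log out and back in for the change to take effect. To set the variable in the current shell right away, export it by hand:
export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"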
Verify installation:
java -version
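Since the JAVA_HOME path depends on the installed JDK version, it can be derived from the java executable instead of typed by hand. The following is a minimal sketch, assuming the usual <JDK home>/bin/java layout; the function name java_home_for is hypothetical and not part of any JDK tooling.

```shell
# Print the JDK home directory for a given java executable.
# Assumes the usual layout <JDK home>/bin/java once symlinks are resolved.
java_home_for() {
    java_bin=$1
    [ -x "$java_bin" ] || return 1
    # Follow the /usr/bin/java -> /etc/alternatives -> JDK symlink chain.
    real_bin=$(readlink -f "$java_bin") || return 1
    # Strip the trailing /bin/java to get the JDK home.
    dirname "$(dirname "$real_bin")"
}
```

Calling java_home_for "$(command -v java)" prints the directory to use as JAVA_HOME.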
2. Install Hadoop
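Download the Hadoop 3.3.6 binary release (or another version). The steps below need the binary tarball, which ships the bin, sbin and etc/hadoop directories, rather than the -src source archive:
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
If a release is no longer available on downloads.apache.org, older versions are kept on archive.apache.org.
Verify download integrity (the published checksum is SHA-512, so use sha512sum):
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz.sha512
sha512sum -c hadoop-3.3.6.tar.gz.sha512
Create a directory and extract the archive, which unpacks to /opt/hadoops/hadoop-3.3.6:
sudo mkdir -p /opt/hadoops
sudo tar -xzvf hadoop-3.3.6.tar.gz -C /opt/hadoops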
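The integrity check can also be scripted so that extraction only happens when the digest matches. This is a minimal sketch, assuming the checksum file carries the 128-digit hex digest on a single line; the function name verify_sha512 is hypothetical.

```shell
# Succeed only if the SHA-512 digest of FILE matches the digest in SUMFILE.
# Assumes SUMFILE contains the 128-hex-digit digest on one line, in any
# surrounding format (for example "SHA512 (name) = <digest>").
verify_sha512() {
    file=$1
    sumfile=$2
    # Take the first 128-hex-digit token and normalise it to lower case.
    expected=$(grep -Eio '[0-9a-f]{128}' "$sumfile" | head -n 1 | tr 'A-F' 'a-f')
    # sha512sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

For example, verify_sha512 archive.tar.gz archive.tar.gz.sha512 && echo "checksum OK" prints the message only when the archive is intact.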
3. Configure Hadoop environment variables
Edit the /etc/profile file and add:
export HADOOP_HOME="/opt/hadoops/hadoop-3.3.6"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Refresh environment variables:
source /etc/profile
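A quick way to catch a wrong HADOOP_HOME is to check that the directory contains the files the later steps rely on. The sketch below assumes the layout of the binary distribution (bin/hdfs, sbin/start-dfs.sh, etc/hadoop/core-site.xml); the function name check_hadoop_home is hypothetical.

```shell
# Report which expected files are missing under a Hadoop install directory.
# Returns 0 when all are present, 1 otherwise.
check_hadoop_home() {
    home=$1
    missing=0
    for path in bin/hdfs sbin/start-dfs.sh etc/hadoop/core-site.xml; do
        if [ ! -e "$home/$path" ]; then
            echo "missing: $home/$path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_hadoop_home "$HADOOP_HOME" prints nothing and succeeds when the variable points at an unpacked binary release.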
4. Configure Hadoop core configuration files
Edit core-site.xml:
sudo nano $HADOOP_HOME/etc/hadoop/core-site.xml
Set the <configuration> element to:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit hdfs-site.xml:
sudo nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
Set the <configuration> element to:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoops/hdfs/namenode</value>
</property>
</configuration>
Edit mapred-site.xml:
sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
Set the <configuration> element to:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit yarn-site.xml:
sudo nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
Set the <configuration> element to:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
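</configuration>
5. Set up passwordless SSH login
This guide runs Hadoop as a dedicated user named hadoop. If that user does not exist yet, create it and give it ownership of the install directory:
sudo adduser hadoop
sudo chown -R hadoop:hadoop /opt/hadoops
Switch to that user and generate an SSH key:
sudo su - hadoop
ssh-keygen -t rsa -P ""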
Copy the public key:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test connection:
ssh localhost
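Appending with >> adds a duplicate line every time it is repeated, and sshd may refuse a key when ~/.ssh or authorized_keys is writable by other users. The sketch below makes the append idempotent and tightens the permissions; the function name authorize_key and its two parameters are hypothetical.

```shell
# Add a public key to SSH_DIR/authorized_keys once, then restrict permissions.
authorize_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # Append only if this exact key line is not already present.
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    # Keep the directory and file private to the owning user.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

For the key generated above, the call would be authorize_key ~/.ssh/id_rsa.pub ~/.ssh.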
6. Format NameNode
hdfs namenode -format
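Formatting initializes the directory named by dfs.namenode.name.dir, so the user running the command needs write access to it. To confirm which path a site file actually configures, the value can be read back with a small helper. This is only a sketch, assuming the one-tag-per-line layout used in this guide; it is not a general XML parser, and the function name hadoop_property is hypothetical.

```shell
# Print the <value> that follows <name>NAME</name> in a Hadoop site file.
# Assumes <name> and <value> each sit on their own line, as in this guide.
hadoop_property() {
    file=$1
    name=$2
    awk -v n="$name" '
        # Remember that the requested property name was just seen.
        index($0, "<name>" n "</name>") { want = 1; next }
        # The next <value> line belongs to that property.
        want && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}
```

For example, hadoop_property "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" dfs.namenode.name.dir prints the NameNode storage path.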
7. Start Hadoop service
start-dfs.sh
start-yarn.sh
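Once the start scripts return, the JDK's jps tool lists the running Java processes, which gives a quick check that the HDFS and YARN daemons came up. The sketch below takes jps output as text and prints the names of any expected daemons that are absent; the function name missing_daemons is hypothetical.

```shell
# Print the name of each expected Hadoop daemon that is absent from jps output.
missing_daemons() {
    jps_output=$1
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <main class>"; compare the class name exactly so
        # that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Running missing_daemons "$(jps)" prints nothing when all five daemons are up.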
8. Verify Hadoop installation
Check cluster status:
hdfs dfsadmin -report
Visit NameNode web interface: http://localhost:9870
After completing the above steps, the Hadoop development environment on your Debian system is ready. Adjust the paths and the Hadoop version to match your setup.


