Hadoop Pseudo-Distributed Installation and Setup
Setting up the Hadoop environment
======================================
I. Preparation
1. Install Linux and the JDK, disable the firewall, and configure the hostname
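Configuring the hostname usually also means mapping it to the machine's address so that bigdata111 resolves locally. The following is a minimal sketch, assuming the address 192.168.153.111 that this document uses for the web UIs. HOSTS_FILE is a hypothetical scratch file standing in for /etc/hosts, so the snippet can be tried without root.

```shell
# Sketch: add "IP hostname" to a hosts file only if the hostname is not mapped yet.
# HOSTS_FILE defaults to a scratch file (an assumption, for safe experimentation);
# run with HOSTS_FILE=/etc/hosts as root to apply it for real.
HOSTS_FILE="${HOSTS_FILE:-./hosts.draft}"
NODE_IP=192.168.153.111
NODE_NAME=bigdata111

add_host_entry() {
  # $1 = hosts file, $2 = IP address, $3 = hostname
  touch "$1"
  if awk -v h="$3" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 } END { exit !found }' "$1"; then
    return 0          # hostname already present, nothing to do
  fi
  printf '%s %s\n' "$2" "$3" >> "$1"
}

add_host_entry "$HOSTS_FILE" "$NODE_IP" "$NODE_NAME"
add_host_entry "$HOSTS_FILE" "$NODE_IP" "$NODE_NAME"   # repeat call changes nothing
cat "$HOSTS_FILE"
```

Because the function checks before appending, it can be rerun safely while iterating on the setup.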
Extract: tar -zxvf hadoop-2.7.3.tar.gz -C ~/training/
Set the Hadoop environment variables: vi ~/.bash_profile
HADOOP_HOME=/root/training/hadoop-2.7.3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
Apply the environment variables:
source ~/.bash_profile
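As an alternative to editing the profile by hand, the same four lines can be appended by a script. This is a sketch only: PROFILE and add_hadoop_env are names made up for illustration, and PROFILE defaults to a scratch file rather than ~/.bash_profile.

```shell
# Sketch: append the Hadoop variables to a profile file only if they are not there yet.
# PROFILE defaults to a scratch file (an assumption, for safe experimentation);
# set PROFILE="$HOME/.bash_profile" to apply it for real.
PROFILE="${PROFILE:-./bash_profile.draft}"
HADOOP_DIR=/root/training/hadoop-2.7.3

add_hadoop_env() {
  # $1 = profile file. Skip if HADOOP_HOME is already defined in it.
  if grep -q '^HADOOP_HOME=' "$1" 2>/dev/null; then
    return 0
  fi
  # \$ keeps the variable references literal in the written file.
  cat >> "$1" <<EOF
HADOOP_HOME=$HADOOP_DIR
export HADOOP_HOME
PATH=\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:\$PATH
export PATH
EOF
}

add_hadoop_env "$PROFILE"
add_hadoop_env "$PROFILE"   # second call is a no-op
cat "$PROFILE"
```

After applying it to the real profile, `source ~/.bash_profile` is still needed, and `hadoop version` is a quick way to confirm that the PATH change took effect.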
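=============== Pseudo-distributed mode: one machine (bigdata111)
Characteristics: simulates a distributed environment on a single machine
Provides all the main features of Hadoop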
HDFS: namenode+datanode+secondarynamenode
Yarn: resourcemanager + nodemanager
hdfs-site.xml
Rule of thumb: set the block replication factor equal to the number of DataNodes, but no higher than 3
Leave unset for now
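If the replication factor is set explicitly, the rule above gives 1 for a single-DataNode setup. The sketch below writes such a file into a draft directory; CONF_DRAFT is a hypothetical name, and the result would still need to be copied into $HADOOP_HOME/etc/hadoop/ after review.

```shell
# Sketch: draft hdfs-site.xml for one DataNode (CONF_DRAFT is an assumed scratch directory).
CONF_DRAFT="${CONF_DRAFT:-./hadoop-conf-draft}"
mkdir -p "$CONF_DRAFT"

cat > "$CONF_DRAFT/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- One DataNode in pseudo-distributed mode, so one replica per block. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

cat "$CONF_DRAFT/hdfs-site.xml"
```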
core-site.xml
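core-site.xml typically names the NameNode address and the base directory for Hadoop's data. A possible version is sketched below. The hostname bigdata111 and the tmp directory are taken from this document (the format log refers to /root/training/hadoop-2.7.3/tmp/dfs/name); port 9000 is an assumption, as is the CONF_DRAFT scratch directory.

```shell
# Sketch: draft core-site.xml (port 9000 and CONF_DRAFT are assumptions).
CONF_DRAFT="${CONF_DRAFT:-./hadoop-conf-draft}"
mkdir -p "$CONF_DRAFT"

cat > "$CONF_DRAFT/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Address of the NameNode that clients and DataNodes connect to. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdata111:9000</value>
  </property>
  <!-- Base directory for HDFS metadata and blocks; the default lives under /tmp. -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/training/hadoop-2.7.3/tmp</value>
  </property>
</configuration>
EOF

cat "$CONF_DRAFT/core-site.xml"
```

Moving hadoop.tmp.dir away from the default matters because the default location under /tmp may be cleared by the operating system, which would lose the HDFS data.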
mapred-site.xml  does not exist by default; create it with: cp mapred-site.xml.template mapred-site.xml
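For MapReduce jobs to run on Yarn rather than in local mode, mapred-site.xml normally sets the framework name. A minimal sketch, again written to an assumed scratch directory (CONF_DRAFT) instead of the live configuration:

```shell
# Sketch: draft mapred-site.xml that selects Yarn as the MapReduce runtime.
CONF_DRAFT="${CONF_DRAFT:-./hadoop-conf-draft}"
mkdir -p "$CONF_DRAFT"

cat > "$CONF_DRAFT/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on Yarn (the default value is "local"). -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

cat "$CONF_DRAFT/mapred-site.xml"
```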
yarn-site.xml
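yarn-site.xml usually tells the NodeManager where the ResourceManager runs and enables the shuffle service that MapReduce needs. The sketch below assumes the ResourceManager is on bigdata111, the single host in this setup; CONF_DRAFT is a hypothetical scratch directory.

```shell
# Sketch: draft yarn-site.xml for a single-host setup.
CONF_DRAFT="${CONF_DRAFT:-./hadoop-conf-draft}"
mkdir -p "$CONF_DRAFT"

cat > "$CONF_DRAFT/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Host that runs the ResourceManager. -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>bigdata111</value>
  </property>
  <!-- Auxiliary service the NodeManager runs so reducers can fetch map output. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

cat "$CONF_DRAFT/yarn-site.xml"
```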
Format HDFS (NameNode):
hdfs namenode -format
Log output:
Storage directory /root/training/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.
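The format command prints a lot of output, so it can help to check for the success message mechanically. The helper below is a sketch; format_succeeded is a name made up here, and the demonstration feeds it the log line quoted above instead of running the real command.

```shell
# Sketch: decide whether a NameNode format run succeeded by looking for the success message.
format_succeeded() {
  # Reads the command's log output on stdin; exit status 0 means the message was found.
  grep -q 'has been successfully formatted'
}

# Illustration using the log line from this document.
sample_log='Storage directory /root/training/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.'
if printf '%s\n' "$sample_log" | format_succeeded; then
  format_status=ok
else
  format_status=failed
fi
echo "format: $format_status"

# Real use (not run here); 2>&1 is included in case the log goes to stderr:
#   hdfs namenode -format 2>&1 | format_succeeded && echo "NameNode formatted"
```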
Start and stop the Hadoop environment:
start-all.sh
stop-all.sh
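After start-all.sh, the five daemons listed earlier (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) should appear in the output of the JDK's jps tool. The sketch below reports which of them are missing. The function name and the sample output, including its process IDs, are made up for illustration.

```shell
# Sketch: report which expected Hadoop daemons are absent from `jps` output.
missing_daemons() {
  # Reads `jps` output ("<pid> <name>" per line) on stdin.
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the whole second field: "NameNode" is a substring of "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" |
         awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}

# Illustrative input with made-up process IDs; NodeManager is deliberately left out.
sample='2001 NameNode
2002 DataNode
2003 SecondaryNameNode
2004 ResourceManager
2005 Jps'
missing="$(printf '%s\n' "$sample" | missing_daemons)"
echo "missing: $missing"

# Real use (not run here):  jps | missing_daemons
```

An empty result means all five daemons are running; a daemon that is reported missing usually has an explanation in its log file under $HADOOP_HOME/logs.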
Access through the web UI:
HDFS: http://192.168.153.111:50070
Yarn: http://192.168.153.111:8088
Run an example job
Example jar (run the command below from its directory): /root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar
hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /input/data.txt /output/0407
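The wordcount command expects /input/data.txt to exist in HDFS already and /output/0407 not to exist yet. The sketch below builds a tiny input file, computes the expected counts locally for comparison, and only touches the cluster when RUN_ON_CLUSTER=1 is set. WORK, RUN_ON_CLUSTER and the sample text are assumptions made for this illustration.

```shell
# Sketch: prepare input, compute reference counts locally, optionally run the real job.
WORK="${WORK:-./wordcount-demo}"
mkdir -p "$WORK"
printf 'hello hadoop\nhello hdfs\n' > "$WORK/data.txt"

# Local reference: split on whitespace, count each word, sort by word.
awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
     END { for (w in count) print w "\t" count[w] }' "$WORK/data.txt" \
  | LC_ALL=C sort > "$WORK/local-counts.txt"
cat "$WORK/local-counts.txt"

# Cluster run, only when explicitly requested.
EXAMPLES_JAR=/root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar
if [ "${RUN_ON_CLUSTER:-0}" = "1" ]; then
  hdfs dfs -mkdir -p /input
  hdfs dfs -put -f "$WORK/data.txt" /input/data.txt
  # The job fails if /output/0407 already exists.
  hadoop jar "$EXAMPLES_JAR" wordcount /input/data.txt /output/0407
  # With a single reducer the result is written to part-r-00000.
  hdfs dfs -cat /output/0407/part-r-00000
fi
```

The job's output uses the same word, tab, count layout, so the two results can be compared directly.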
(*) Passwordless SSH login must be configured: how it works and how to set it up
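In outline, the start and stop scripts use SSH to launch each daemon, even on the local machine, so without key-based login they prompt for a password repeatedly. Key-based login works by keeping a private key on the client and listing the matching public key in ~/.ssh/authorized_keys on the target host. At login the client proves it holds the private key, and no password is needed. The sketch below walks through the same steps in a scratch directory. KEY_DIR is a hypothetical name, and the root user in the closing comments is an assumption based on the /root paths in this document.

```shell
# Sketch: generate an RSA key pair and authorize it, using a scratch directory.
# KEY_DIR is an assumption for safe experimentation; the real location is ~/.ssh.
KEY_DIR="${KEY_DIR:-./ssh-demo}"
mkdir -p "$KEY_DIR"
chmod 700 "$KEY_DIR"

# Create the key pair once: id_rsa (private) and id_rsa.pub (public), empty passphrase.
if [ ! -f "$KEY_DIR/id_rsa" ]; then
  ssh-keygen -t rsa -N '' -q -f "$KEY_DIR/id_rsa"
fi

# Authorizing a key means appending the public key to authorized_keys on the target host.
# This is the local equivalent of what ssh-copy-id does on the remote side.
touch "$KEY_DIR/authorized_keys"
if ! grep -qxF "$(cat "$KEY_DIR/id_rsa.pub")" "$KEY_DIR/authorized_keys"; then
  cat "$KEY_DIR/id_rsa.pub" >> "$KEY_DIR/authorized_keys"
fi
chmod 600 "$KEY_DIR/authorized_keys"

# Real use on the node (not run here):
#   ssh-keygen -t rsa
#   ssh-copy-id -i ~/.ssh/id_rsa.pub root@bigdata111
#   ssh bigdata111    # should log in without a password prompt
```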