
Installing and Configuring Hadoop 0.20.205.0


Environment:

10.0.30.235 NameNode

10.0.30.236 SecondaryNameNode

10.0.30.237 DataNode

10.0.30.238 DataNode

 

Configure the hostnames

 

Add the following entries to /etc/hosts on every node:

 

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.0.30.235     nn0001  nn0001
10.0.30.236     snn0001 snn0001
10.0.30.237     dn0001  dn0001
10.0.30.238     dn0002  dn0002
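
To confirm that name resolution works, it does not hurt to ping each host by name from the NameNode (a quick sanity check I am adding, not a step from the original write-up):

ping -c 1 snn0001
ping -c 1 dn0001
ping -c 1 dn0002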

 

Change the HOSTNAME value in the network file on every node (the NameNode is shown below; use the corresponding hostname on each of the other nodes):

/etc/sysconfig/network

HOSTNAME=nn0001
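
On a RHEL/CentOS system of this vintage the new name takes effect at the next boot; to apply it immediately as well, you can set it at runtime (an extra step, not in the original article):

hostname nn0001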

 

 

Install jdk-6u26-linux-x64-rpm.bin

Configure the environment variables

vim /etc/profile

JAVA_HOME=/usr/java/jdk1.6.0_26
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$CATALINA_HOME/common/lib
export JAVA_HOME
export PATH
export CLASSPATH
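
Note that $HADOOP_HOME and $CATALINA_HOME are referenced here but defined nowhere in this snippet, so they must be set elsewhere for those PATH and CLASSPATH entries to expand. After editing, reload the profile and verify that the JDK is picked up:

source /etc/profile
java -version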

 

Extract hadoop-0.20.205.0.tar.gz
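
For example (the target directory is my assumption; the article does not say where Hadoop was unpacked):

tar -xzf hadoop-0.20.205.0.tar.gz -C /usr/local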

 

Configure hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_26

 

Configure hdfs-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.http.address</name>
                <value>nn0001:50070</value>
        </property>
        <property>
                <name>dfs.name.dir</name>
                <value>/hadoop/dfs/namenode</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/hadoop/dfs/datanode</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.datanode.max.xcievers</name>
                <value>4096</value>
        </property>
</configuration>
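
The dfs.name.dir and dfs.data.dir paths must be writable by the user running Hadoop, so it is worth creating them up front (a precaution I am adding, not an original step):

mkdir -p /hadoop/dfs/namenode    # on nn0001
mkdir -p /hadoop/dfs/datanode    # on dn0001 and dn0002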

 

Note: A Hadoop HDFS datanode has an upper bound on the number of files it will serve at any one time. The parameter that controls this bound is called xcievers (yes, it is misspelled). Before doing any heavy loading, make sure conf/hdfs-site.xml sets the xcievers value to at least 4096, as in the configuration above.

 

Not having this configuration in place makes for strange-looking failures. Eventually the datanode logs complain that the xcievers limit has been exceeded, but in the run-up to that, one manifestation is complaints about missing blocks. For example:

10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...

 

Configure core-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://nn0001:9000</value>
        </property>
        <property>
                <name>httpserver.enable</name>
                <value>true</value>
        </property>
        <property>
                <name>fs.checkpoint.dir</name>
                <value>/hadoop/dfs/namenodesecondary</value>
        </property>
</configuration>
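
fs.checkpoint.dir is where the SecondaryNameNode writes its checkpoints, so the matching directory belongs on snn0001 (again, creating it up front is my addition):

mkdir -p /hadoop/dfs/namenodesecondary    # on snn0001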

 

Configure mapred-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>nn0001:9001</value>
        </property>
</configuration>
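
One detail the article leaves implicit: the 0.20 start-up scripts read conf/masters (hosts that run the SecondaryNameNode) and conf/slaves (hosts that run DataNodes), and the same conf directory should be copied to every node. For this environment the two files would contain:

# conf/masters
snn0001

# conf/slaves
dn0001
dn0002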

 

Install and configure SSH

 

On the NameNode:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

 

On the SecondaryNameNode:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

 

On the DataNodes:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

 

Copy the authorized_keys file from the NameNode to the other nodes:

scp /root/.ssh/authorized_keys 10.0.30.236:/root/.ssh/
scp /root/.ssh/authorized_keys 10.0.30.237:/root/.ssh/
scp /root/.ssh/authorized_keys 10.0.30.238:/root/.ssh/
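
Before going further, it is worth confirming from the NameNode that each node is reachable without a password prompt:

ssh snn0001 hostname
ssh dn0001 hostname
ssh dn0002 hostname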

 

Format a new distributed filesystem on the NameNode:

./hadoop namenode -format
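
If the format succeeds, the name directory configured above is populated; a quick check (based on the 0.20 on-disk layout) is:

ls /hadoop/dfs/namenode/current    # should list fsimage, edits, fstime, VERSION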

 

Start HDFS:

./start-dfs.sh

 

The following warnings appeared:

dn0001: Warning: $HADOOP_HOME is deprecated.
dn0001:
dn0001: Unrecognized option: -jvm
dn0001: Could not create the Java virtual machine.

 

This prevented both DataNodes from starting.

 

This happens because I ran Hadoop directly as root; bin/hadoop contains the following shell snippet:

 

if [[ $EUID -eq 0 ]]; then
    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi

 

Running echo $EUID as root prints 0, so the first branch is taken and the plain java launcher is handed the -jvm flag, which it does not recognize (in 0.20.205 that flag appears to be intended for the jsvc-based secure datanode).

 

There are two ways to work around it:

1. Run Hadoop as a non-root user (see the sketch after the snippet below).

2. Edit the script to comment out the branch logic so that only the -server line remains:

#if [[ $EUID -eq 0 ]]; then
#    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
#else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
#fi
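
For the first fix, a minimal sketch (the hadoop user name and the /usr/local install path are my assumptions, and the passwordless SSH setup above would have to be repeated for this user):

useradd hadoop
chown -R hadoop:hadoop /hadoop /usr/local/hadoop-0.20.205.0
su - hadoop -c '/usr/local/hadoop-0.20.205.0/bin/start-dfs.sh'

Whichever fix you choose, re-run start-dfs.sh and confirm with jps that the expected daemon is up on each node: NameNode on nn0001, SecondaryNameNode on snn0001, and DataNode on dn0001 and dn0002.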

 

 

 

 

 
