HDFS Command-Line Operations


Once it is up and running, Hadoop can be used from the command line.
(1) All commands (add $HADOOP_HOME/bin to the $PATH variable in .bashrc)
[hadoop@node14 hadoop-0.21.0]$ ll $HADOOP_HOME/bin
total 88
-rwxr-xr-x 1 hadoop hadoop 4131 Aug 17  2010 hadoop
-rwxr-xr-x 1 hadoop hadoop 8658 Aug 17  2010 hadoop-config.sh
-rwxr-xr-x 1 hadoop hadoop 3841 Aug 17  2010 hadoop-daemon.sh
-rwxr-xr-x 1 hadoop hadoop 1242 Aug 17  2010 hadoop-daemons.sh
-rwxr-xr-x 1 hadoop hadoop 4130 Aug 17  2010 hdfs
-rwxr-xr-x 1 hadoop hadoop 1201 Aug 17  2010 hdfs-config.sh
-rwxr-xr-x 1 hadoop hadoop 3387 Aug 17  2010 mapred
-rwxr-xr-x 1 hadoop hadoop 1207 Aug 17  2010 mapred-config.sh
-rwxr-xr-x 1 hadoop hadoop 2720 Aug 17  2010 rcc
-rwxr-xr-x 1 hadoop hadoop 2058 Aug 17  2010 slaves.sh
-rwxr-xr-x 1 hadoop hadoop 1367 Aug 17  2010 start-all.sh
-rwxr-xr-x 1 hadoop hadoop 1018 Aug 17  2010 start-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 1778 Aug 17  2010 start-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1255 Aug 17  2010 start-mapred.sh
-rwxr-xr-x 1 hadoop hadoop 1359 Aug 17  2010 stop-all.sh
-rwxr-xr-x 1 hadoop hadoop 1069 Aug 17  2010 stop-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 1277 Aug 17  2010 stop-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1163 Aug 17  2010 stop-mapred.sh
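The PATH setup mentioned in (1) can be sketched as the following .bashrc addition; the install path `/home/hadoop/hadoop-0.21.0` is an assumption, so adjust it to your own layout:

```shell
# Sketch of the .bashrc lines; the install path below is an assumed example
export HADOOP_HOME=/home/hadoop/hadoop-0.21.0
export PATH="$PATH:$HADOOP_HOME/bin"
echo "$PATH"
```

After sourcing .bashrc (or opening a new shell), `hadoop`, `hdfs`, and the start/stop scripts listed above resolve without a full path.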

(2) The hadoop command
[hadoop@node14 hadoop-0.21.0]$ hadoop
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.

(3) hadoop fs 
[hadoop@node14 hadoop-0.21.0]$ hadoop fs
Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-df [<path>]]
           [-du [-s] [-h] <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm [-skipTrash] <path>]
           [-rmr [-skipTrash] <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
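Putting the syntax above together, generic options go before the command-specific ones. The lines below are illustrative only and need a running cluster; the hostname `node14` and port 9000 are assumptions taken from this setup:

```shell
# Point fs commands at a specific namenode with the -fs generic option
hadoop fs -fs hdfs://node14:9000 -ls /
# Override a single configuration property for one invocation with -D
hadoop fs -D fs.trash.interval=60 -rm first.txt
```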

(4) HDFS operations
hadoop fs
hadoop fs -ls
hadoop fs -mkdir firstdir                        // create a directory on HDFS
hadoop fs -rmr firstdir                          // remove a directory from HDFS
hadoop fs -put test.txt first.txt                // upload a local file to HDFS as first.txt
hadoop fs -cat first.txt
hadoop fs -df
hadoop fs -get first.txt FirstTXTfromHDFS.txt    // download a file from HDFS to the local filesystem

If writing a file fails with an exception:
(0) Check that the machine name resolves correctly. node14 has both an external IP and an internal IP; add both IP-to-hostname mappings to /etc/hosts. When the external IP was listed before the internal one, `netstat -npl` showed ports 9000 and 9001 bound to the external IP, so the internal IP should come before the external IP in /etc/hosts. Alternatively, use IPs instead of machine names everywhere in the conf files.
(1) Stop the firewall: sudo /etc/init.d/iptables stop
(2) Check that disk space is normal: df -hl
(3) Check that the working directories are healthy: hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}. Delete the files under tmp, rerun hadoop namenode -format, and restart all processes.
(4) Start each process individually, on the namenode and datanode hosts respectively:
$ hadoop-daemon.sh start namenode
$ hadoop-daemon.sh start datanode
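A few of these checks can be run in one quick pass on the affected node; this is a sketch, and the 9000/9001 ports are the ones used in this particular setup:

```shell
# Quick diagnostics for the failure modes above
cat /etc/hosts        # the internal-IP line for this host should come before the external one
df -hl                # confirm the local disks are not full
# Which address the namenode ports are bound to (netstat may be absent on minimal systems)
netstat -npl 2>/dev/null | grep -E ':(9000|9001) ' || true
```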