chmod takes three octal digits, one each for Owner, Group, and Others (e.g. chmod 777).
Each digit is the sum of the permissions it grants:
4 - read
2 - write
1 - execute
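The digits above simply add up: 7 = 4 + 2 + 1 = read + write + execute. A minimal Python sketch of how one octal mode maps to the familiar rwx string (the helper names are illustrative, not part of any HDFS API):

```python
def digit_to_rwx(digit):
    """Convert one octal permission digit (0-7) to an rwx string."""
    return ("r" if digit & 4 else "-") + \
           ("w" if digit & 2 else "-") + \
           ("x" if digit & 1 else "-")

def mode_to_string(mode):
    """Convert a three-digit octal mode like '777' to 'rwxrwxrwx'."""
    return "".join(digit_to_rwx(int(d)) for d in mode)

print(mode_to_string("777"))  # rwxrwxrwx
print(mode_to_string("755"))  # rwxr-xr-x
```

This is the same notation the -ls output uses: a directory listed as drwxr-xr-x has mode 755.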
1) Version Check
To check the version of Hadoop.
ubuntu@ubuntu-VirtualBox:~$ hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /home/ubuntu/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar
2) list Command
Lists all files and directories at the given HDFS destination path.
ubuntu@ubuntu-VirtualBox:~ $ hdfs dfs -ls /
Found 3 items
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:11 /test
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /tmp
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /usr
3) df Command
Displays the free space at the given HDFS destination.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -df hdfs:/
Filesystem Size Used Available Use%
hdfs://master:9000 6206062592 32768 316289024 0%
4) count Command
Counts the number of directories, files, and bytes under the paths that match the specified file pattern. The output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -count hdfs:/
4 0 0 hdfs:///
5) fsck Command
HDFS Command to check the health of the Hadoop file system.
ubuntu@ubuntu-VirtualBox:~$ hdfs fsck /
Connecting to namenode via http://master:50070/fsck?ugi=ubuntu&path=%2F
FSCK started by ubuntu (auth:SIMPLE) from /192.168.1.36 for path / at Mon Nov 07 01:23:54 GMT+05:30 2016
Status: HEALTHY
Total size: 0 B
Total dirs: 4
Total files: 0
Total symlinks: 0
Total blocks (validated): 0
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 2
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Mon Nov 07 01:23:54 GMT+05:30 2016 in 33 milliseconds
The filesystem under path '/' is HEALTHY
6) balancer Command
Run a cluster balancing utility.
ubuntu@ubuntu-VirtualBox:~$ hdfs balancer
16/11/07 01:26:29 INFO balancer.Balancer: namenodes = [hdfs://master:9000]
16/11/07 01:26:29 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
16/11/07 01:26:38 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.36:50010
16/11/07 01:26:38 INFO balancer.Balancer: 0 over-utilized: []
16/11/07 01:26:38 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
7 Nov, 2016 1:26:38 AM 0 0 B 0 B -1 B
7 Nov, 2016 1:26:39 AM Balancing took 13.153 seconds
7) mkdir Command
HDFS Command to create the directory in HDFS.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -mkdir /hadoop
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /
Found 5 items
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:29 /hadoop
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:26 /system
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:11 /test
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /tmp
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /usr
8) put Command
File
Copies a file from a single source, or multiple sources, from the local file system to the destination file system.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -put test /hadoop
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /hadoop
Found 1 items
-rw-r--r-- 2 ubuntu supergroup 16 2016-11-07 01:35 /hadoop/test
Directory
HDFS Command to copy directory from single source, or multiple sources from local file system to the destination file system.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -put hello /hadoop/
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /hadoop
Found 2 items
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:43 /hadoop/hello
-rw-r--r-- 2 ubuntu supergroup 16 2016-11-07 01:35 /hadoop/test
9) du Command
Displays the size of the files and directories contained in the given directory, or the size of a file if it is just a file.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -du /
59 /hadoop
0 /system
0 /test
0 /tmp
0 /usr
10) rm Command
HDFS Command to remove the file from HDFS.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -rm /hadoop/test
16/11/07 01:53:29 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /hadoop/test
11) expunge Command
HDFS Command that empties the trash.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -expunge
16/11/07 01:55:54 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
12) rm -r Command
HDFS Command to remove the entire directory and all of its content from HDFS.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -rm -r /hadoop/hello
16/11/07 01:58:52 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /hadoop/hello
13) chmod Command
Change the permissions of files.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -chmod 777 /hadoop
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /
Found 5 items
drwxrwxrwx - ubuntu supergroup 0 2016-11-07 01:58 /hadoop
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:26 /system
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:11 /test
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /tmp
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:09 /usr
14) get Command
HDFS Command to copy files from HDFS to the local file system.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -get /hadoop/test /home/ubuntu/Desktop/
ubuntu@ubuntu-VirtualBox:~$ ls -l /home/ubuntu/Desktop/
total 4
-rw-r--r-- 1 ubuntu ubuntu 16 Nov 8 00:47 test
15) cat Command
HDFS Command that copies source paths to stdout.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -cat /hadoop/test
This is a test.
16) touchz Command
HDFS Command to create a file in HDFS with file size 0 bytes.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -touchz /hadoop/sample
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /hadoop
Found 2 items
-rw-r--r-- 2 ubuntu supergroup 0 2016-11-08 00:57 /hadoop/sample
-rw-r--r-- 2 ubuntu supergroup 16 2016-11-08 00:45 /hadoop/test
17) text Command
HDFS Command that takes a source file and outputs the file in text format.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -text /hadoop/test
This is a test.
18) copyFromLocal Command
HDFS Command to copy a file from the local file system to HDFS.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -copyFromLocal /home/ubuntu/new /hadoop
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /hadoop
Found 3 items
-rw-r--r-- 2 ubuntu supergroup 43 2016-11-08 01:08 /hadoop/new
-rw-r--r-- 2 ubuntu supergroup 0 2016-11-08 00:57 /hadoop/sample
-rw-r--r-- 2 ubuntu supergroup 16 2016-11-08 00:45 /hadoop/test
19) copyToLocal Command
Similar to the get command, except that the destination is restricted to a local file reference.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -copyToLocal /hadoop/sample /home/ubuntu/
ubuntu@ubuntu-VirtualBox:~$ ls -l s*
-rw-r--r-- 1 ubuntu ubuntu 0 Nov 8 01:12 sample
-rw-rw-r-- 1 ubuntu ubuntu 102436055 Jul 20 04:47 sqoop-1.99.7-bin-hadoop200.tar.gz
20) mv Command
HDFS Command to move files from source to destination. This command allows multiple sources as well, in which case the destination needs to be a directory.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -mv /hadoop/sample /tmp
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /tmp
Found 1 items
-rw-r--r-- 2 ubuntu supergroup 0 2016-11-08 00:57 /tmp/sample
21) cp Command
HDFS Command to copy files from source to destination. This command allows multiple sources as well, in which case the destination must be a directory.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -cp /tmp/sample /usr
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /usr
Found 1 items
-rw-r--r-- 2 ubuntu supergroup 0 2016-11-08 01:22 /usr/sample
22) tail Command
Displays the last kilobyte of the file to stdout.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -tail /hadoop/new
This is a new file.
Running HDFS commands.
23) chown Command
HDFS Command to change the owner (and group) of files and directories.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -chown root:root /tmp
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -ls /
Found 5 items
drwxrwxrwx - ubuntu supergroup 0 2016-11-08 01:17 /hadoop
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:26 /system
drwxr-xr-x - ubuntu supergroup 0 2016-11-07 01:11 /test
drwxr-xr-x - root root 0 2016-11-08 01:17 /tmp
drwxr-xr-x - ubuntu supergroup 0 2016-11-08 01:22 /usr
24) setrep Command
The default replication factor for a file is 3. The HDFS command below changes the replication factor of a file.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -setrep -w 2 /usr/sample
Replication 2 set: /usr/sample
Waiting for /usr/sample ... done
25) distcp Command
Copies a directory from one cluster to another, using MapReduce to perform the copy in parallel. Note that distcp is a subcommand of hadoop, not an option of hdfs dfs.
ubuntu@ubuntu-VirtualBox:~$ hadoop distcp hdfs://namenodeA/apache_hadoop hdfs://namenodeB/hadoop
26) stat Command
Prints statistics about the file/directory at <path> in the specified format. The format accepts file size in bytes (%b), type (%F), group name of owner (%g), name (%n), block size (%o), replication (%r), user name of owner (%u), and modification date (%y, %Y). %y shows the UTC date as "yyyy-MM-dd HH:mm:ss" and %Y shows milliseconds since January 1, 1970 UTC. If the format is not specified, %y is used by default.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -stat "%F %u:%g %b %y %n" /hadoop/test
regular file ubuntu:supergroup 16 2016-11-07 19:15:22 test
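The %y and %Y specifiers describe the same instant in two forms. A quick Python sketch converting a %Y-style millisecond value into the %y format; the millisecond value below was computed to match the %y timestamp in the output above:

```python
from datetime import datetime, timezone

# A %Y value is milliseconds since the Unix epoch, UTC.
millis = 1478546122000
dt = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
print(dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2016-11-07 19:15:22
```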
27) getfacl Command
Displays the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, then getfacl also displays the default ACL.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -getfacl /hadoop
# file: /hadoop
# owner: ubuntu
# group: supergroup
28) du -s Command
Displays a summary of file lengths.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -du -s /hadoop
59 /hadoop
29) checksum Command
Returns the checksum information of a file.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -checksum /hadoop/new
/hadoop/new MD5-of-0MD5-of-512CRC32C 000002000000000000000000639a5d8ac275be8d0c2b055d75208265
30) getmerge Command
Takes a source directory and a destination file as input and concatenates files in src into the destination local file.
ubuntu@ubuntu-VirtualBox:~$ cat test
This is a test.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -cat /hadoop/new
This is a new file.
Running HDFS commands.
ubuntu@ubuntu-VirtualBox:~$ hdfs dfs -getmerge /hadoop/new test
ubuntu@ubuntu-VirtualBox:~$ cat test
This is a new file.
Running HDFS commands.
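Conceptually, getmerge just concatenates every file under the source directory into a single local destination file. A minimal local sketch of that behavior, using temporary files created purely for illustration (the part-file names mimic typical MapReduce output):

```python
import os
import tempfile

def getmerge_local(src_dir, dst_path):
    """Concatenate every file under src_dir into one local file."""
    with open(dst_path, "wb") as dst:
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            if os.path.isfile(path):
                with open(path, "rb") as part:
                    dst.write(part.read())

# Illustrative source files (contents mirror the transcript above).
src = tempfile.mkdtemp()
with open(os.path.join(src, "part-00000"), "w") as f:
    f.write("This is a new file.\n")
with open(os.path.join(src, "part-00001"), "w") as f:
    f.write("Running HDFS commands.\n")

dst = os.path.join(tempfile.mkdtemp(), "merged")
getmerge_local(src, dst)
with open(dst) as f:
    print(f.read())
```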