Most Commonly Used Hadoop Commands With Examples
In this tutorial, we will study the most commonly used Hadoop commands. These commands perform common HDFS file operations such as copying a file, moving a file, showing the contents of a file, and creating directories. Let us begin with a short introduction and then go through the commands with examples.
Hadoop can store petabytes of data using HDFS. HDFS is a distributed file system that stores structured as well as unstructured data and provides redundant storage for very large files. There are several commands for performing file operations on it. Let us take a look at some of the important ones.
List of Hadoop Commands
Command Name: version
Command Usage: version
- hadoop version
Description: This command shows the version of Hadoop that is installed.
Command Name: mkdir
Command Usage: mkdir <path>
- hdfs dfs -mkdir /ram/mrv1
Description: This command takes <path> as an argument and creates the directory. The parent directory must already exist unless you pass the -p option.
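If a parent directory such as /ram does not exist yet, a plain -mkdir fails. The following is a minimal sketch, assuming you want missing parents created as needed with the -p option. The function name make_hdfs_dirs and the HDFS variable (a hook for swapping in another client command) are illustrative names, not part of Hadoop.

```shell
# HDFS is an illustrative hook; by default it runs the real client.
HDFS=${HDFS:-hdfs dfs}

# Create one or more HDFS directories, including missing parents (-p).
make_hdfs_dirs() {
  $HDFS -mkdir -p "$@"
}
```

For example, make_hdfs_dirs /ram/mrv1 would create both /ram and /ram/mrv1 if neither exists.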
Command Name: ls
Command Usage: ls <path>
- hdfs dfs -ls /user
Description: This command displays the contents of the directory specified by <path>. For each entry it shows the permissions, replication factor, owner, group, size, modification date and time, and name.
- hdfs dfs -ls -R /user
Description: With the -R option, ls also lists the entries in all subdirectories recursively.
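A recursive listing can get long, so it is often handy to summarise it. The sketch below assumes the usual column layout of the listing (permissions, replication, owner, group, size, date, time, path) and counts files, directories, and total bytes. The function name summarize_listing is hypothetical.

```shell
# Summarise `hdfs dfs -ls -R` output read from stdin.
# Assumed columns: permissions, replication, owner, group, size, date, time, path.
# Lines whose permission string starts with "-" are files, "d" are directories.
summarize_listing() {
  awk '$1 ~ /^-/ { files++; bytes += $5 }
       $1 ~ /^d/ { dirs++ }
       # %.0f avoids integer overflow for large totals in some awk versions
       END { printf "%d files, %d dirs, %.0f bytes\n", files, dirs, bytes }'
}
```

It could be used as hdfs dfs -ls -R /user | summarize_listing.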
Command Name: put
Command Usage: put <localsrc> <dest>
- hdfs dfs -put /home/sample.txt /user/dir1
Description: This command copies a file from the local file system to the destination in HDFS.
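By default, put refuses to overwrite a destination file that already exists. Here is a small sketch, assuming you do want to replace it, that checks the local file first and then uploads with the -f option. The names upload_overwrite and HDFS are illustrative and not part of Hadoop.

```shell
# HDFS is an illustrative hook; by default it runs the real client.
HDFS=${HDFS:-hdfs dfs}

# Upload a local file, replacing the destination if it already exists (-f).
upload_overwrite() {
  local_src=$1
  hdfs_dest=$2
  # Fail early with a clear message if the local file is missing.
  if [ ! -e "$local_src" ]; then
    echo "no such local file: $local_src" >&2
    return 1
  fi
  $HDFS -put -f "$local_src" "$hdfs_dest"
}
```

For example: upload_overwrite /home/sample.txt /user/dir1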
Command Name: copyFromLocal
Command Usage: copyFromLocal <localsrc> <dest>
- hdfs dfs -copyFromLocal /home/sample /user/dir1
Description: This command works like put, except that the source must be a file on the local file system.
Command Name: get
Command Usage: get <src> <localdest>
- hdfs dfs -get /user/dir1 /home
Description: This Hadoop shell command copies the file in HDFS identified by <src> to the local file system path identified by <localdest>.
Command Name: getmerge
Command Usage: getmerge <src> <localdest>
- hdfs dfs -getmerge /user/dir1/sample.txt /user/dataflair/dir2/sample2.txt /home/sample1.txt
Description: This command retrieves the files under the given HDFS source path(s) and merges them into a single file on the local file system, identified by the local destination.
Command Name: getfacl
Command Usage: getfacl [-R] <path>
- hadoop fs -getfacl /user/dir1
- hadoop fs -getfacl -R /user/dir1
Description: This Hadoop command shows the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, the command displays that as well.
Options: -R: Recursively lists the ACLs of all files and directories.
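With -R the output can cover many paths, and the default ACL entries are easy to miss. The sketch below assumes the usual getfacl output layout, where each path starts with a "# file: " header line and default entries are prefixed with "default:". It prints only the default entries, each next to its path. The function name list_default_acls is hypothetical.

```shell
# Print "path<TAB>entry" for each default ACL entry in getfacl output (stdin).
# Assumes "# file: <path>" header lines and "default:"-prefixed entries.
list_default_acls() {
  awk '/^# file: / { path = substr($0, 9) }   # text after "# file: "
       /^default:/ { print path "\t" $0 }'
}
```

It could be used as hadoop fs -getfacl -R /user/dir1 | list_default_acls.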
Command Name: getfattr
Command Usage: getfattr [-R] {-n name | -d} [-e encoding] <path>
- hadoop fs -getfattr -d /user/dir1
Description: This HDFS command displays the extended attribute names and values, if any, of the specified file or directory.
Options:
-R: Shows the attributes of all files and directories recursively.
-n name: Displays the value of the named extended attribute.
-d: Displays all extended attribute values associated with the pathname.
-e encoding: Encodes values after retrieving them. Valid encodings are "text", "hex", and "base64". Values encoded as text strings are enclosed in double quotes ("), values encoded as hexadecimal are prefixed with 0x, and values encoded as base64 are prefixed with 0s.
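To make the three encodings concrete, here is a minimal sketch that turns a displayed value back into its raw text based on its prefix. The helper name decode_xattr_value is hypothetical and not part of Hadoop, and the base64 branch assumes a base64 tool that accepts -d is installed.

```shell
# Decode one attribute value as printed by getfattr.
# decode_xattr_value is a hypothetical helper, not a Hadoop command.
decode_xattr_value() {
  value=$1
  case $value in
    0x*|0X*)  # hexadecimal: turn each pair of hex digits into one byte
      hex=${value#0[xX]}
      while [ -n "$hex" ]; do
        rest=${hex#??}
        [ "$rest" = "$hex" ] && break   # odd trailing digit: stop
        byte=${hex%"$rest"}
        printf '%b' "\\0$(printf '%03o' "0x$byte")"
        hex=$rest
      done
      ;;
    0s*|0S*)  # base64: assumes a base64 tool that supports -d
      printf '%s' "${value#0[sS]}" | base64 -d
      ;;
    \"*\")    # text: strip the surrounding double quotes
      value=${value#\"}
      printf '%s' "${value%\"}"
      ;;
    *)        # unknown form: pass through unchanged
      printf '%s' "$value"
      ;;
  esac
}
```

For example, decode_xattr_value 0x48656c6c6f and decode_xattr_value 0sSGVsbG8= both print the same five-character string.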
Command Name: copyToLocal
Command Usage: copyToLocal <src> <localdest>
- hdfs dfs -copyToLocal /user/dir1 /home
Description: This command works like get, except that the destination must be a path on the local file system.
Command Name: cat
Command Usage: cat <file-name>
- hdfs dfs -cat /user/cloudera/dezyre1/sample1.txt
Description: This HDFS command displays the contents of the file on the console or stdout.
Command Name: mv
Command Usage: mv <src> <dest>
- hdfs dfs -mv /user/dir1/sample.txt /user/dir2
Description: This shell command moves the file from the specified source to the destination within HDFS.
Command Name: cp
Command Usage: cp <src> <dest>
- hdfs dfs -cp /user/dir2/sample1.txt /user/dir1
Description: This shell command copies the file or directory from the given source to the destination within HDFS.
There are several commands in "$HADOOP_HOME/bin/hadoop fs" other than those covered in this tutorial. What we have covered are the most commonly used basic commands to get you started. If you get stuck, type the following:
- $HADOOP_HOME/bin/hadoop fs -help command-name
This will show a short usage summary of the command specified.