Simple Steps to Execute Hadoop copyFromLocal Command
HDFS shell commands are structured much like Unix commands, so people who work with the Unix shell find it easy to adjust to the Hadoop shell. These commands work with HDFS and with the other file systems Hadoop supports, such as the local file system and the S3 file system. Today, we will study the Hadoop copyFromLocal command and its use.
We use this command to copy a file from the local file system to the Hadoop Distributed File System (HDFS). It has one constraint: the source file must be located in the local file system.
copyFromLocal has an optional parameter, -f, which replaces files that already exist at the destination. This is useful when we have to copy the same file again or update it. By default, the command reports an error if we try to copy a file into a directory where a file of the same name already exists. One way to update a file is to delete it and copy it again; another is to use -f.
We can invoke the Hadoop file system shell with the following command:
- hadoop fs <args>
When the command gets executed the output is sent to stdout and errors to stderr. In most cases, both are the console screen.
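Because output and errors travel on separate streams, we can capture them in separate files instead of letting both land on the console. The following is a minimal sketch; `run_logged` and the file names are hypothetical names chosen for illustration, not part of Hadoop.

```shell
# Hypothetical helper: runs any command and saves its stdout and stderr
# to two separate files instead of printing both to the console.
run_logged() {
  out_file=$1
  err_file=$2
  shift 2
  # ">" redirects stdout, "2>" redirects stderr.
  "$@" >"$out_file" 2>"$err_file"
}

# Example (needs a Hadoop client on the PATH):
#   run_logged listing.txt errors.txt hadoop fs -ls /user
```

After such a run, the directory listing would be in listing.txt and any error messages in errors.txt.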
We can also invoke the same fs commands in the following form, which is equivalent to hadoop fs when the file system in use is HDFS:
- hdfs dfs -<command> <args>
The statement below shows the usage of the copyFromLocal command:
- hdfs dfs -copyFromLocal <local-source> URI
We can add the -f option to overwrite the file if it already exists:
- hdfs dfs -copyFromLocal -f <local-source> URI
Steps to Execute copyFromLocal Command
We follow these steps to run the copyFromLocal command:
1. Make directory
- hdfs dfs -mkdir /user/copy_from_local_example
The above command creates a directory in HDFS.
- hdfs dfs -ls /user
The above command checks that the directory was created in HDFS.
2. Copying the local file into the directory in HDFS
- hdfs dfs -copyFromLocal testfile.txt /user/copy_from_local_example
The above command copies the file testfile.txt from the local file system to the HDFS directory.
- hdfs dfs -ls /user/copy_from_local_example
The above command checks that testfile.txt was created in the HDFS directory /user/copy_from_local_example.
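The first two steps can be combined into one small script. This is only a sketch under a few assumptions: `upload_to_hdfs` and the `HDFS_BIN` variable are hypothetical names introduced here, and the script uses the `-p` option of `-mkdir` so that an already existing directory is not treated as an error.

```shell
# HDFS_BIN is an illustrative override for the client binary; it defaults to hdfs.
HDFS_BIN=${HDFS_BIN:-hdfs}

# Hypothetical helper: creates the HDFS directory, copies one local file
# into it, and lists the directory. Stops at the first step that fails.
upload_to_hdfs() {
  local_file=$1
  hdfs_dir=$2
  "$HDFS_BIN" dfs -mkdir -p "$hdfs_dir" &&
  "$HDFS_BIN" dfs -copyFromLocal "$local_file" "$hdfs_dir" &&
  "$HDFS_BIN" dfs -ls "$hdfs_dir"
}

# Example (needs a Hadoop client on the PATH):
#   upload_to_hdfs testfile.txt /user/copy_from_local_example
```

Chaining the commands with && means the copy is attempted only if the directory step succeeded, and the listing runs only after a successful copy.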
3. Overwriting the existing file in HDFS
By default, this command does not overwrite existing files. If we try to copy a file into a directory that already holds a file of the same name, we get an error, as the screenshot below shows.
To overwrite the file, we use the -f option of copyFromLocal.
- hdfs dfs -copyFromLocal -f testfile.txt /user/copy_from_local_example
The above command replaces the existing file. To check that the file was copied successfully, we use the ls command. In the picture below, the file's timestamp is 14:53, compared with 14:51 when the file was first created, which confirms that it was replaced.
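If we want a script to decide for itself whether -f is needed, one possible approach is to check for the target first with the `-test -e` option of the FS shell. The sketch below assumes the destination is a directory; `put_or_overwrite` and `HDFS_BIN` are hypothetical names used only for this example.

```shell
# HDFS_BIN is an illustrative override for the client binary; it defaults to hdfs.
HDFS_BIN=${HDFS_BIN:-hdfs}

# Hypothetical helper: copies a local file into an HDFS directory and
# adds -f only when a file of the same name is already there.
put_or_overwrite() {
  local_file=$1
  hdfs_dir=$2
  target="$hdfs_dir/$(basename "$local_file")"
  # "-test -e" exits with status 0 when the path exists.
  if "$HDFS_BIN" dfs -test -e "$target"; then
    echo "overwriting $target"
    "$HDFS_BIN" dfs -copyFromLocal -f "$local_file" "$hdfs_dir"
  else
    echo "creating $target"
    "$HDFS_BIN" dfs -copyFromLocal "$local_file" "$hdfs_dir"
  fi
}

# Example (needs a Hadoop client on the PATH):
#   put_or_overwrite testfile.txt /user/copy_from_local_example
```

Always passing -f would be shorter, but the explicit check makes it visible in the output when an existing file is about to be replaced.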
copyFromLocal is therefore one of the significant commands of the Hadoop FS shell. We can use it to load the input file of a MapReduce job from the local file system into HDFS. In this article, we worked through an example to understand the copyFromLocal command and saw how to overwrite a file that already exists.