I have an application that dumps a lot of files to a directory. I want to copy these files to a Hadoop cluster using the hadoop command. I use the following code to run the command:
System.Diagnostics.ProcessStartInfo export = new System.Diagnostics.ProcessStartInfo();
export.RedirectStandardOutput = false;
export.RedirectStandardError = false;
export.UseShellExecute = false;
export.WorkingDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
export.FileName = "hadoop";
export.Arguments = "fs -copyFromLocal " + Path.Combine(dumpDirectory, "*.txt") + " " + hadoopPath;
Console.WriteLine("Copying data: hadoop " + export.Arguments);
System.Diagnostics.Process proc = System.Diagnostics.Process.Start(export);
proc.WaitForExit();
if (proc.ExitCode == 0)
{
    IEnumerable<string> files = Directory.EnumerateFiles(dumpDirectory);
    foreach (string file in files)
        File.Delete(file);
}
else
    Console.WriteLine("Error copying to Hadoop: " + proc.ExitCode);
The program writes the following message:
Copying data: hadoop fs -copyFromLocal local/directory/*.txt /user/remote/directory/
copyFromLocal: `local/directory/*.txt': No such file or directory
Error copying to Hadoop: 1
Interestingly, when I run the command manually, the files copy without error.
Also, if the program drops the *.txt wildcard and instead invokes the command once per file, every invocation succeeds.
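For reference, here is a minimal sketch of that per-file variant. The names HadoopCopy, BuildArguments and CopyEach are illustrative only and not taken from the actual program; dumpDirectory and hadoopPath are assumed to hold the same values as in the code above.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class HadoopCopy
{
    // Hypothetical helper: builds the argument string for a single local file.
    public static string BuildArguments(string localFile, string hadoopPath)
    {
        return "fs -copyFromLocal " + localFile + " " + hadoopPath;
    }

    // Runs "hadoop fs -copyFromLocal" once per *.txt file in dumpDirectory.
    // Returns 0 if every copy succeeded, otherwise the first non-zero exit code.
    public static int CopyEach(string dumpDirectory, string hadoopPath)
    {
        // The wildcard is expanded here by .NET, so hadoop only ever sees real file names.
        foreach (string file in Directory.EnumerateFiles(dumpDirectory, "*.txt"))
        {
            ProcessStartInfo export = new ProcessStartInfo();
            export.UseShellExecute = false;
            export.FileName = "hadoop";
            export.Arguments = BuildArguments(file, hadoopPath);

            using (Process proc = Process.Start(export))
            {
                proc.WaitForExit();
                if (proc.ExitCode != 0)
                    return proc.ExitCode;
            }
        }
        return 0;
    }
}
```

Calling HadoopCopy.CopyEach(dumpDirectory, hadoopPath) in place of the single wildcard call is the version that succeeds, although it starts one hadoop process per file.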
Can anyone shed some light on this?