We have a small GPDB cluster. We get an error when trying to read an external table that uses the 'gphdfs' protocol from the GPDB master.
Environment:
Product: Pivotal Greenplum (GPDB) 4.3.8.2
OS: CentOS 6.5
We are getting this error:
prod=# select * from ext_table; ERROR: external table gphdfs protocol command ended with error. 16/10/05 14:42:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (seg0 slice1 host.domain.com:40000 pid=25491)
DETAIL:
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://path/to/hdfs
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at com.
Command: 'gphdfs://path/to/hdfs'
External table tableame, file gphdfs://path/to/hdfs
We tried: following the steps from this article on the Greenplum master machine: https://discuss.pivotal.io/hc/en-us/articles/219403388-How-to-eliminate-error-message-WARN-util-NativeCodeLoader-Unable-to-load-native-hadoop-library-for-your-platform-with-gphdfs
Result:
It did not work after changing the contents of hadoop-env.sh as suggested in the link; it still throws the same error. Do I need to restart GPDB for the hadoop-env.sh changes to take effect?
Or is there an alternate way to handle the gphdfs protocol error? Any help would be much appreciated.
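For the restart question above, a minimal sketch of propagating the hadoop-env.sh change, assuming a stock Hadoop layout (the HADOOP_CONF and HOSTFILE paths below are assumptions; adjust them to your environment). Since the error is raised by a segment (seg0 in the message above), the edited file likely needs to reach the segment hosts, not just the master:

```shell
# Assumed locations; adjust to your Hadoop install and GPDB hostfile.
HADOOP_CONF=/usr/lib/hadoop/etc/hadoop/hadoop-env.sh   # assumed hadoop-env.sh path
HOSTFILE=/home/gpadmin/hostfile                        # assumed list of segment hosts

# 1. Push the edited hadoop-env.sh from the master to every segment host:
#      gpscp -f "$HOSTFILE" "$HADOOP_CONF" =:"$HADOOP_CONF"
# 2. Restart the cluster so each segment's gphdfs JVM starts with the new
#    environment:
#      gpstop -ar
```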
Attached is the DDL for the failing external table:
create external table schemaname.exttablename (
    "ID" INTEGER,
    time timestamp without time zone,
    "SalesOrder" char(6),
    "NextDetailLine" decimal(6),
    "OrderStatus" char(1)
)
location ('gphdfs://hadoopmster.com:8020/devgpdb/filename.txt')
FORMAT 'text';
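Since the stack trace complains that the input path does not exist, it may help to confirm which part of the LOCATION URL names the NameNode and which names the HDFS path, then check that path directly. A small sketch (the URL is taken from the DDL above; the commented hdfs command assumes Hadoop clients are installed on the host):

```shell
# Split the gphdfs LOCATION URL into NameNode host:port and HDFS path.
url='gphdfs://hadoopmster.com:8020/devgpdb/filename.txt'
hostport=${url#gphdfs://}    # strip the protocol prefix
hostport=${hostport%%/*}     # keep everything before the first slash
hdfspath=/${url#gphdfs://*/} # everything after host:port is the HDFS path
echo "$hostport"   # hadoopmster.com:8020
echo "$hdfspath"   # /devgpdb/filename.txt

# With Hadoop clients installed, verify the file actually exists in HDFS
# before querying the external table:
#   hdfs dfs -ls hdfs://hadoopmster.com:8020/devgpdb/filename.txt
```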