I can answer your specific questions, but must warn you that I haven't got pig working yet on the cygwin UNIX emulator on my PC. I'll tell you what I know.
The message "Cannot locate pig.jar. do 'ant jar' and try again." comes from a block of code near the end of the pig shell script. You are using pig-0.10.0. I tried to get pig-0.11.1 working but received the same error messages as you. The script sets HADOOP_BIN with HADOOP_BIN=`which hadoop`, so if Hadoop is not installed there is no directory for that variable to point to. Near the end of the script, with HADOOP_BIN unset, the code branches to look in $PIG_HOME for either pig.jar or a file matching pig-?.!(*withouthadoop).jar, and puts the result into the variable PIG_JAR. Your shell script finds neither of these, so PIG_JAR is empty, hence the error message.
if [ -n "$PIG_JAR" ]; then
CLASSPATH="${CLASSPATH}:$PIG_JAR"
else
echo "Cannot locate pig.jar. do 'ant jar', and try again"
exit 1
fi
The Java archive pig.jar doesn't exist in your directory because pig hasn't been built using ant. But in fact the script should still find a match for pig-?.!(*withouthadoop).jar. You will have pig-0.10.0.jar in your directory, and the pattern means: pig-, followed by a single character, followed by ., followed by anything at all except something ending in 'withouthadoop', followed by .jar. The 'withouthadoop' means that the jar doesn't contain an embedded hadoop, so hadoop must already be installed. If hadoop isn't installed, pig-0.10.0.jar, it seems, should be fine.
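To check the pattern reading above without touching a real pig directory, here is a minimal bash sketch. The helper name matches_pig_jar and the sample file names are my own illustration (only pig-0.10.0.jar comes from the discussion above), and it assumes bash with the extglob option available.

```shell
#!/usr/bin/env bash
# Sketch: test file names against the extended glob quoted from the pig script.
shopt -s extglob   # bash needs this for extended patterns such as !(...)

# Pattern as quoted in the text; kept in a variable so it is expanded at run time.
pig_jar_pattern='pig-?.!(*withouthadoop).jar'

# Hypothetical helper: prints "match" or "no match" for a single file name.
matches_pig_jar() {
    case "$1" in
        $pig_jar_pattern) echo "match" ;;
        *)                echo "no match" ;;
    esac
}

matches_pig_jar "pig-0.10.0.jar"                 # prints: match
matches_pig_jar "pig-0.10.0-withouthadoop.jar"   # prints: no match
matches_pig_jar "pig.jar"                        # prints: no match
```

So the pattern itself accepts pig-0.10.0.jar; if the script still reports nothing found, the directory part of the path is the more likely culprit.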
So why isn't it finding it? In the shell script is a little branch of code for folks running the script in cygwin UNIX:
if $cygwin; then
CLASSPATH=`cygpath -w "$CLASSPATH"`
PIG_HOME=`cygpath -d "$PIG_HOME"`
PIG_LOG_DIR=`cygpath -d "$PIG_LOG_DIR"`
fi
This converts the paths passed to java.exe into a form that java.exe will understand, since it is a Windows executable. I've found that using -m rather than -w or -d in these expressions works: -m makes cygpath emit a Windows path with forward slashes, converting e.g. /cygdrive/c/Program Files/Java .. to c:/Program Files/Java ..
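To make the -m conversion concrete, here is a pure-bash stand-in that runs even where cygpath is not installed. It is not cygpath itself: to_mixed_path is a hypothetical helper that only handles the /cygdrive/<letter>/ prefix, and the real tool may differ in details such as drive-letter case.

```shell
#!/usr/bin/env bash
# Sketch: imitate what `cygpath -m` does for paths under /cygdrive.
# On Cygwin you would simply run:  cygpath -m "/cygdrive/c/Program Files/Java"

# Hypothetical helper, an assumption for illustration only.
to_mixed_path() {
    local path=$1
    local re='^/cygdrive/([A-Za-z])(/.*)?$'
    if [[ $path =~ $re ]]; then
        # Drive letter, a colon, then the rest with forward slashes kept.
        printf '%s:%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]:-/}"
    else
        # Anything else is passed through unchanged in this sketch.
        printf '%s\n' "$path"
    fi
}

to_mixed_path "/cygdrive/c/Program Files/Java"   # prints: c:/Program Files/Java
```

The point is that the result has no backslashes, so it is acceptable to java.exe without needing the escaping that a -w style path would.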
After a lot more pain with 'cannot find org.apache.pig.Main ' in the pig.jar (yes, I 'anted' it before figuring out the above) I've finally got a 'grunt>' prompt. The alterations I have made to the pig shell script in order to achieve this are:
- Remove the entire if $cygwin; ... fi block described above. I assume that converting $PIG_HOME to Windows file path format is what causes the code block: if [ -f $PIG_HOME/pig.jar ]; then PIG_JAR=$PIG_HOME/pig.jar; else PIG_JAR=`echo $PIG_HOME/pig-?.!(*withouthadoop).jar`; fi to throw the errors you see: cygwin warning, MS-DOS style path detected: c:\pig\pig-01~1/pig.jar, etc.
- Following the place where you have deleted the cygwin path translation block, rewrite the PIG_OPTS variable settings as:
PIG_OPTS="$PIG_OPTS -Dpig.log.dir=`cygpath -m $PIG_LOG_DIR`"
PIG_OPTS="$PIG_OPTS -Dpig.log.file=pig.log"
PIG_OPTS="$PIG_OPTS -Dpig.home.dir=`cygpath -m $PIG_HOME`"
- Rewrite the line at the end of the shell script that invokes java.exe (exec "$JAVA" ..) as:
exec "$JAVA" $JAVA_HEAP_MAX $PIG_OPTS -classpath "`cygpath -p -m $CLASSPATH`" $CLASS "${remaining[@]}"
- Set the following environment variables in your shell before running pig:
export PATH="$PATH:/cygdrive/c/Program Files/Java/jdk-your_version/bin:/cygdrive/..your-pig-home/bin"
export JAVA_HOME="/cygdrive/c/Program Files/Java/jdk-your_version"
export CLASSPATH=""
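The -p flag in the exec line matters because CLASSPATH is a colon-separated list, while java.exe expects a semicolon-separated one. A rough pure-bash imitation of cygpath -p -m is sketched below; to_mixed_path_list is a hypothetical helper, the sample paths are made up, and only /cygdrive/<letter>/ entries are handled.

```shell
#!/usr/bin/env bash
# Sketch: imitate `cygpath -p -m` on a POSIX-style, colon-separated path list.
# On Cygwin the real call is:  cygpath -p -m "$CLASSPATH"

# Hypothetical helper, an assumption for illustration only.
to_mixed_path_list() {
    local IFS=':' entry out=''
    local -a entries
    read -r -a entries <<< "$1"            # split the list on ':'
    for entry in "${entries[@]}"; do
        case "$entry" in
            /cygdrive/[A-Za-z]/*)
                entry="${entry#/cygdrive/}"          # e.g. c/pig/pig.jar
                entry="${entry%%/*}:/${entry#*/}"    # e.g. c:/pig/pig.jar
                ;;
        esac
        out="${out:+$out;}$entry"          # join with ';' for Windows
    done
    printf '%s\n' "$out"
}

to_mixed_path_list "/cygdrive/c/pig/pig.jar:/cygdrive/c/pig/lib"
# prints: c:/pig/pig.jar;c:/pig/lib
```

Without -p, the whole list would be treated as a single path, and the colons between entries would not be turned into semicolons.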
All this lets me type 'pig -x local' and get a 'grunt>' prompt. Interestingly, if you download pig-0.7.0, unpack the pig-0.7.0.tar.gz file and run pig -x local, it works out of the box, straight away, with the same 'grunt>' prompt.
But, unfortunately, it's a sham. In both cases. A false grunt - a ventriloquist's grunt. The arrow keys move the cursor all over the prompt (in fact anywhere you like on the screen), the return key enters nothing, whatever you may have typed in, and only control+backslash works, returning you to the dollar prompt. If you get to this point and understand what's happening, please let me know.