8
votes

I have installed a Spark/Hadoop environment on my 64-bit Red Hat machine, and I also want to read and write code in the Spark source project in IntelliJ IDEA. I have downloaded the Spark source code and made everything ready, but I get some errors when compiling the Spark project in IntelliJ IDEA. Here are the errors:

/home/xuch/IdeaProjects/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala

Error:(809, 34) not found: value SparkSqlParser case ast if ast.tokenType == SparkSqlParser.TinyintLiteral =>

Error:(812, 34) not found: value SparkSqlParser case ast if ast.tokenType == SparkSqlParser.SmallintLiteral =>

... ...

But I could not find a file named SparkSqlParser.scala anywhere in the project, nor a Scala class named SparkSqlParser.

However, I searched the web for files named SparkSqlParser.scala, but they don't have attributes like "TinyintLiteral", "SmallintLiteral", etc. Here are the links to those files:

Hello @halfer, I saw you voted this question down. I wonder if you know how to solve this problem. I'd really appreciate any solutions. – Lyroe Chan
No Lyroe, you didn't see me vote for this question. I didn't vote for this question one way or the other. I do often downvote if I see urgent begging in questions - please note no questions are urgent when presented to volunteers - but for some reason I did not do so here (that may be the reason for the -2, but I'd just be speculating). – halfer
Sadly I cannot assist on this topic; I am not familiar with it. – halfer
Refer to this description in the Spark documentation page (image attached). – zzr1000

5 Answers

15
votes

I met the same problem. Here is my solution:

  1. Download the ANTLR v4 plugin for IntelliJ IDEA. Then the file "spark-2.0.1\sql\catalyst\src\main\antlr4\org\apache\spark\sql\catalyst\parser\SqlBase.g4" can be recognized by IntelliJ IDEA.
  2. Navigate to the View -> Tool Windows -> Maven Projects tab, select the project "Spark Project Catalyst", right-click on it, then select "Generate Sources and Update Folders".
  3. After that you can see some files added under "spark-2.0.1\sql\catalyst\target\generated-sources\antlr4".
  4. Then the project should build successfully.

Hope it can help you.
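
If you prefer the command line over the Maven Projects tab, the equivalent of step 2 is roughly the following (a sketch, assuming a Spark 2.x checkout where the grammar is SqlBase.g4; -pl/-am and the generate-sources phase are standard Maven, but the exact invocation may differ for your version):

    # from the root of the Spark source checkout:
    # run source generation for sql/catalyst (and the modules it needs),
    # which triggers the ANTLR 4 plugin bound to the generate-sources phase
    ./build/mvn -pl sql/catalyst -am generate-sources

    # the generated parser sources should then show up here
    ls sql/catalyst/target/generated-sources/antlr4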

5
votes

None of the advice here worked for me. I noticed, however, that the generated code depended on ANTLR 3.x while ANTLR 4.x is what is in the dependencies (mvn dependency:tree). I don't know why this was the case; maybe because I had earlier built it from the command line(?).
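
To see which ANTLR versions actually end up on the classpath, something like this should work (a sketch; dependency:tree is the standard maven-dependency-plugin goal, run here from the Spark source root):

    # print the dependency tree for the whole build and
    # filter for ANTLR artifacts
    ./build/mvn dependency:tree | grep -i antlr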

Anyway, try cleaning your Catalyst sub-project and then rebuilding the auto-generated sources. To do this in IntelliJ, go to View -> Tool Windows -> Maven Projects.

Then navigate to "Spark Project Catalyst" in the "Maven Projects" tab.

Navigate to clean -> clean:clean and double click it. Navigate to Plugins -> antlr4 -> antlr4:antlr4 and double click it.

Now you'll see that the auto-generated sources of the ANTLR classes are different, and they should compile. YMMV.

1
votes

1) First build Spark from the command line using the build instructions given at http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn

2) Then check the $SPARK_HOME/sql/catalyst/target/generated-sources/antlr3/org/apache/spark/sql/catalyst/parser folder.

Generated classes like SparkSqlLexer.java should be there.

The list of classes it generates is:

    SparkSqlLexer.java
    SparkSqlParser.java
    SparkSqlParser_ExpressionParser.java
    SparkSqlParser_FromClauseParser.java
    SparkSqlParser_IdentifiersParser.java
    SparkSqlParser_KeywordParser.java
    SparkSqlParser_SelectClauseParser.java

3) Open Module Settings, click on the spark-catalyst module, and go to the Sources tab on the right. Mark target/generated-sources as a source folder. Attaching a picture to give an idea.
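
For step 1, a minimal build along these lines should be enough to produce those classes (the profiles you need depend on your environment, so treat this only as a sketch):

    # from the Spark source root: build everything, skipping tests
    ./build/mvn -DskipTests clean package

    # then verify the generated parser classes from step 2 exist
    ls sql/catalyst/target/generated-sources/antlr3/org/apache/spark/sql/catalyst/parser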

0
votes

I also faced a similar problem when I updated my fork to the latest master. Unfortunately, I could not find a way to make it work from IDEA. What I did was compile the project from the command line; it generated the required ANTLR classes. I then added the generated-sources directory target/generated-source/antlr as a source directory. After that I could run tests from IDEA. Ideally IDEA's "Generate Sources" should have generated the code; I need to check more why it did not. Maybe it is because I have Maven 3.3.3 configured.
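
On the Maven version question: build/mvn can download and use its own Maven rather than the one on your PATH, so the two may differ. A quick way to compare them (a sketch):

    # Maven used by Spark's wrapper script vs. your local installation
    ./build/mvn --version
    mvn --version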

0
votes

I did as instructed by Rishitesh Mishra and got stuck at the first step. I always get errors when executing "build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package".
I have tried both the source code from https://spark.apache.org and a fork on GitHub.
I have attached the log screenshot in a new answer below.
error log image