0 votes

I have Eclipse Kepler with the Maven and Scala plugins installed. I create a new Maven project and, as per the current doc at http://spark.apache.org/downloads.html, add the dependency

groupId: org.apache.spark, artifactId: spark-core_2.10, version: 1.1.0
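In pom.xml form that is (same coordinates):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.1.0</version>
</dependency>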

All is fine, and the jars for Scala 2.10 are added to the project. I then add the "Scala Nature" to the project; this pulls in Scala 2.11 and I end up with the following error:

More than one scala library found in the build path (C:/Eclipse/eclipse-jee-kepler-SR2-win32-x86_64/plugins/org.scala-lang.scala-library_2.11.2.v20140721-095018-73fb460c1c.jar, C:/Users/fff/.m2/repository/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar). At least one has an incompatible version. Please update the project build path so it contains only compatible scala libraries.

Is it possible to use Spark (from Maven) and the Scala IDE plugin together? Any ideas on how to fix this problem?

Thanks for your help. Regards


5 Answers

2 votes

In short, yes, it is possible.

Spark is currently using Scala 2.10, and the latest Scala IDE is cross-published for 2.10 and 2.11. You need to choose the 2.10-based version, which is 3.0.3.

However, the next major version, 4.0, which is in release candidate mode, has multi-version support. You can create a Scala project and select the Scala version you'd like to use (2.10 or 2.11). You could give that a try if you feel like it.

1 vote

If someone stumbles here while searching for the same thing:

I recently created a Maven archetype for bootstrapping a new Spark 1.3.0 / Scala 2.10.4 project. Follow the instructions here: https://github.com/spark-in-action/scala-archetype-sparkinaction

For IntelliJ IDEA, first generate the project from the command line and then import it into the IDE.

0 votes

You have installed the Scala IDE plugin, but the Scala nature of a project is only useful if you include Scala classes in your project. Spark and Scala are, however, made to work together; just make sure you use compatible versions. You can install Scala on your computer and then use the matching Spark Maven dependency.
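For example, here is a rough sketch of keeping the two in sync in the pom (the property names are just illustrative, and the versions should be whatever your cluster uses):

<properties>
    <!-- illustrative property names; pick the Scala version of your cluster -->
    <scala.binary.version>2.10</scala.binary.version>
    <scala.version>2.10.4</scala.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <!-- the _2.10 suffix must match the Scala library on the build path -->
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>1.1.0</version>
    </dependency>
</dependencies>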

0 votes

Yes, you can. Use the pom I am providing below:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.spark-scala</groupId>
    <artifactId>spark-scala</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>${project.artifactId}</name>
    <description>Spark in Scala</description>
    <inceptionYear>2010</inceptionYear>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <scala.tools.version>2.10</scala.tools.version>
        <!-- Put the Scala version of the cluster -->
        <scala.version>2.10.4</scala.version>
    </properties>

    <!-- repository to add org.apache.spark -->
    <repositories>
        <repository>
            <id>cloudera-repo-releases</id>
            <url>https://repository.cloudera.com/artifactory/repo/</url>
        </repository>
    </repositories>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <!-- compile Scala sources as part of the normal build -->
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.13</version>
                <configuration>
                    <useFile>false</useFile>
                    <disableXmlReport>true</disableXmlReport>
                    <includes>
                        <include>**/*Test.*</include>
                        <include>**/*Suite.*</include>
                    </includes>
                </configuration>
            </plugin>

            <!-- "package" command plugin -->
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <!-- keep the artifact suffix in line with scala.tools.version above -->
            <artifactId>spark-core_${scala.tools.version}</artifactId>
            <version>1.2.1</version>
        </dependency>
    </dependencies>
</project>
0 votes

There are two types of Spark JAR files (you can tell just by looking at the name):

  • Name includes the word "assembly" and not "core" (has Scala inside)

  • Name includes the word "core" and not "assembly" (no Scala inside).

You should include the "core" type in your build path via "Add External JARs" (the version you need), since the Scala IDE already provides a Scala library for you.

Alternatively, you can just take advantage of SBT and add the following dependency (again, pay attention to the versions you need):

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"

Then you should NOT forcefully include any Spark JAR in the build path.

Happy sparking,

Zar
