Thread: NullPointerException in Spring Data Hadoop with CDH4

  1. #1

    Hi,
    I am trying to use Spring Data Hadoop with CDH4 to write a MapReduce job.

    On startup, I get the following exception:

    Exception in thread "SimpleAsyncTaskExecutor-1" java.lang.ExceptionInInitializerError
    at org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:183)
    at java.lang.Thread.run(Thread.java:722)
    Caused by: java.lang.NullPointerException
    at org.springframework.util.ReflectionUtils.makeAccessible(ReflectionUtils.java:405)
    at org.springframework.data.hadoop.mapreduce.JobUtils.<clinit>(JobUtils.java:123)
    ... 2 more
    I suspect there is a problem with my Hadoop-related dependencies. I couldn't find any reference
    showing how to configure Spring Data Hadoop together with CDH4, but Costin has shown that it can
    be done: https://build.springsource.org/brows...DOOP-CDH4-JOB1


    Maven Setup
    This is the complete pom file you need to reproduce the problem.

    Code:
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    	<modelVersion>4.0.0</modelVersion>
    
    	<groupId>com.example</groupId>
    	<artifactId>com.example.main</artifactId>
    	<version>0.0.1-SNAPSHOT</version>
    	<packaging>jar</packaging>
    
    	<properties>
    		<java-version>1.7</java-version>
    		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    		<spring.version>3.2.0.RELEASE</spring.version>
    		<spring.hadoop.version>1.0.0.BUILD-SNAPSHOT</spring.hadoop.version>
    		<hadoop.version>2.0.0-cdh4.1.3</hadoop.version>
    		<log4j.version>1.2.17</log4j.version>
    	</properties>
    
    	<dependencies>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-core</artifactId>
    			<version>${spring.version}</version>
    			<exclusions>
    				<exclusion>
    					<groupId>commons-logging</groupId>
    					<artifactId>commons-logging</artifactId>
    				</exclusion>
    			</exclusions>
    		</dependency>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-context</artifactId>
    			<version>${spring.version}</version>
    		</dependency>
    
    
    		<dependency>
    			<groupId>org.springframework.data</groupId>
    			<artifactId>spring-data-hadoop</artifactId>
    			<version>${spring.hadoop.version}</version>
    
    			<exclusions>
    				<exclusion>
    					<groupId>org.slf4j</groupId>
    					<artifactId>slf4j-log4j12</artifactId>
    				</exclusion>
    			</exclusions>
    
    		</dependency>
    
    		<!-- Hadoop Stuff -->
    
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-client</artifactId>
    			<version>${hadoop.version}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-tools</artifactId>
    			<version>2.0.0-mr1-cdh4.1.3</version>
    		</dependency>
    
    	</dependencies>
    
    	<build>
    		<plugins>
    
    			<plugin>
    				<groupId>org.apache.maven.plugins</groupId>
    				<artifactId>maven-compiler-plugin</artifactId>
    				<configuration>
    					<source>${java-version}</source>
    					<target>${java-version}</target>
    				</configuration>
    			</plugin>
    
    		</plugins>
    	</build>
    
    	<repositories>
    		<repository>
    			<id>spring-milestones</id>
    			<url>http://repo.springsource.org/libs-milestone</url>
    			<snapshots>
    				<enabled>false</enabled>
    			</snapshots>
    		</repository>
    
    		<repository>
    			<id>cloudera</id>
    			<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    			<snapshots>
    				<enabled>false</enabled>
    			</snapshots>
    		</repository>
    
    		<repository>
    			<id>spring-snapshot</id>
    			<name>Spring Maven SNAPSHOT Repository</name>
    			<url>http://repo.springframework.org/snapshot</url>
    		</repository>
    	</repositories>
    </project>


    Application Context

    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    	xmlns:hdp="http://www.springframework.org/schema/hadoop"
    	xmlns:context="http://www.springframework.org/schema/context"
    	xsi:schemaLocation="
                        http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
                        http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd
                        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.1.xsd">
    
    	<context:property-placeholder location="classpath:hadoop.properties" />
    
    	<hdp:configuration id="hadoopConfiguration">
    		fs.default.name=hdfs://namenode.example.com:8020
    	</hdp:configuration>
    
    	<hdp:job id="wordCountJob" 
    		mapper="com.example.WordMapper"
    		reducer="com.example.WordReducer" 
    		input-path="/user/christian/input/test"
    		output-path="/user/christian/output" />
    
    	<hdp:job-runner job-ref="wordCountJob" run-at-startup="true"
    		wait-for-completion="true" />
    
    </beans>

    Cluster version

    Hadoop 2.0.0-cdh4.1.3


    Note:

    This small unit test runs fine with the current configuration:

    Code:
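    import java.util.Collection;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.junit.Assert;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.hadoop.fs.FsShell;
    import org.springframework.test.context.ContextConfiguration;
    import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

    @RunWith(SpringJUnit4ClassRunner.class)
    @ContextConfiguration(locations = { "classpath:/applicationContext.xml" })
    public class Starter {

    	// Hadoop configuration declared as hdp:configuration in the application context
    	@Autowired
    	private Configuration configuration;

    	// Lists /user on HDFS; this only exercises the file system, not job submission
    	@Test
    	public void shellOps() {
    		Assert.assertNotNull(this.configuration);
    		FsShell fsShell = new FsShell(this.configuration);
    		final Collection<FileStatus> coll = fsShell.ls("/user");
    		System.out.println(coll);
    	}
    }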

    It would be nice if someone could give me an example configuration.

    Best Regards,
    Christian.

  2. #2

    This is my dependency tree:

    Code:
    [INFO] [dependency:tree {execution: default-cli}]
    [INFO] com.example:com.example.main:jar:0.0.1-SNAPSHOT
    [INFO] +- org.springframework:spring-core:jar:3.2.0.RELEASE:compile
    [INFO] +- org.springframework:spring-context:jar:3.2.0.RELEASE:compile
    [INFO] |  +- org.springframework:spring-aop:jar:3.2.0.RELEASE:compile
    [INFO] |  |  \- aopalliance:aopalliance:jar:1.0:compile
    [INFO] |  +- org.springframework:spring-expression:jar:3.2.0.RELEASE:compile
    [INFO] |  \- org.springframework:spring-beans:jar:3.2.0.RELEASE:compile
    [INFO] +- org.springframework.data:spring-data-hadoop:jar:1.0.0.BUILD-SNAPSHOT:compile
    [INFO] |  +- org.apache.hadoop:hadoop-streaming:jar:1.0.4:compile
    [INFO] |  \- org.springframework:spring-context-support:jar:3.0.7.RELEASE:compile
    [INFO] +- org.apache.hadoop:hadoop-client:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  +- org.apache.hadoop:hadoop-common:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- org.apache.hadoop:hadoop-annotations:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- com.google.guava:guava:jar:11.0.2:compile
    [INFO] |  |  |  \- com.google.code.findbugs:jsr305:jar:1.3.9:compile
    [INFO] |  |  +- commons-cli:commons-cli:jar:1.2:compile
    [INFO] |  |  +- org.apache.commons:commons-math:jar:2.1:compile
    [INFO] |  |  +- xmlenc:xmlenc:jar:0.52:compile
    [INFO] |  |  +- commons-codec:commons-codec:jar:1.4:compile
    [INFO] |  |  +- commons-io:commons-io:jar:2.1:compile
    [INFO] |  |  +- commons-net:commons-net:jar:3.1:compile
    [INFO] |  |  +- commons-logging:commons-logging:jar:1.1.1:compile
    [INFO] |  |  +- log4j:log4j:jar:1.2.17:compile
    [INFO] |  |  +- junit:junit:jar:4.8.2:compile
    [INFO] |  |  +- commons-lang:commons-lang:jar:2.5:compile
    [INFO] |  |  +- commons-configuration:commons-configuration:jar:1.6:compile
    [INFO] |  |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
    [INFO] |  |  |  +- commons-digester:commons-digester:jar:1.8:compile
    [INFO] |  |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:compile
    [INFO] |  |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
    [INFO] |  |  +- org.slf4j:slf4j-api:jar:1.6.1:compile
    [INFO] |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.8.8:compile
    [INFO] |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:compile
    [INFO] |  |  +- org.mockito:mockito-all:jar:1.8.5:compile
    [INFO] |  |  +- org.apache.avro:avro:jar:1.7.1.cloudera.2:compile
    [INFO] |  |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
    [INFO] |  |  |  \- org.xerial.snappy:snappy-java:jar:1.0.4.1:compile
    [INFO] |  |  +- com.google.protobuf:protobuf-java:jar:2.4.0a:compile
    [INFO] |  |  +- org.apache.hadoop:hadoop-auth:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- com.jcraft:jsch:jar:0.1.42:compile
    [INFO] |  |  \- org.apache.zookeeper:zookeeper:jar:3.4.3-cdh4.1.3:compile
    [INFO] |  |     \- jline:jline:jar:0.9.94:compile
    [INFO] |  +- org.apache.hadoop:hadoop-hdfs:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- org.mortbay.jetty:jetty:jar:6.1.26.cloudera.2:compile
    [INFO] |  |  +- org.mortbay.jetty:jetty-util:jar:6.1.26.cloudera.2:compile
    [INFO] |  |  +- com.sun.jersey:jersey-core:jar:1.8:compile
    [INFO] |  |  +- com.sun.jersey:jersey-server:jar:1.8:compile
    [INFO] |  |  |  \- asm:asm:jar:3.1:compile
    [INFO] |  |  +- javax.servlet.jsp:jsp-api:jar:2.1:compile
    [INFO] |  |  +- javax.servlet:servlet-api:jar:2.5:compile
    [INFO] |  |  \- tomcat:jasper-runtime:jar:5.5.23:compile
    [INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- org.apache.hadoop:hadoop-mapreduce-client-common:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  |  \- org.apache.hadoop:hadoop-yarn-server-common:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  +- org.slf4j:slf4j-log4j12:jar:1.6.1:compile
    [INFO] |  |  \- org.jboss.netty:netty:jar:3.2.4.Final:compile
    [INFO] |  +- org.apache.hadoop:hadoop-yarn-api:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  \- com.sun.jersey:jersey-json:jar:1.8:compile
    [INFO] |  |     +- org.codehaus.jettison:jettison:jar:1.1:compile
    [INFO] |  |     |  \- stax:stax-api:jar:1.0.1:compile
    [INFO] |  |     +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile
    [INFO] |  |     |  \- javax.xml.bind:jaxb-api:jar:2.2.2:compile
    [INFO] |  |     |     \- javax.activation:activation:jar:1.1:compile
    [INFO] |  |     +- org.codehaus.jackson:jackson-jaxrs:jar:1.7.1:compile
    [INFO] |  |     \- org.codehaus.jackson:jackson-xc:jar:1.7.1:compile
    [INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  |  \- org.apache.hadoop:hadoop-yarn-common:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.0.0-cdh4.1.3:compile
    [INFO] |  \- jdk.tools:jdk.tools:jar:1.6:system
    [INFO] \- org.apache.hadoop:hadoop-tools:jar:2.0.0-mr1-cdh4.1.3:compile
    [INFO]    +- com.cloudera.cdh:hadoop-ant:pom:2.0.0-mr1-cdh4.1.3:compile
    [INFO]    \- org.apache.hadoop:hadoop-core:jar:2.0.0-mr1-cdh4.1.3:compile
    [INFO]       +- commons-httpclient:commons-httpclient:jar:3.1:compile
    [INFO]       +- tomcat:jasper-compiler:jar:5.5.23:compile
    [INFO]       +- commons-el:commons-el:jar:1.0:compile
    [INFO]       +- net.java.dev.jets3t:jets3t:jar:0.6.1:compile
    [INFO]       +- hsqldb:hsqldb:jar:1.8.0.10:compile
    [INFO]       +- oro:oro:jar:2.0.8:compile
    [INFO]       \- org.eclipse.jdt:core:jar:3.1.1:compile

  3. #3

    You seem to be using a mixture of CDH 4.1 MRv1 and MRv2 libraries. Note that Spring for Apache Hadoop supports only MRv1, not MRv2. See the CDH Maven repository page [1] for the artifacts you need; in our build system we had to cherry-pick the MRv1 versions by hand to be sure.

    Make sure that both hadoop-tools and hadoop-streaming are the MRv1 version as otherwise MRv2 will be picked up (which is incompatible with SHDP).
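
    One possible way to guard against MRv2 artifacts slipping back in is to let the build fail when they appear. The fragment below is only a sketch, not something the SHDP build is known to use: it assumes the maven-enforcer-plugin with its bannedDependencies rule, the plugin version is an assumption, and the banned artifact names are the MRv2/YARN ones that hadoop-client pulls in according to the dependency tree posted above.

    ```xml
    <!-- Sketch only: place inside <build><plugins>. The plugin version is an assumption. -->
    <plugin>
    	<groupId>org.apache.maven.plugins</groupId>
    	<artifactId>maven-enforcer-plugin</artifactId>
    	<version>1.4.1</version>
    	<executions>
    		<execution>
    			<!-- The execution id is an arbitrary, illustrative name -->
    			<id>ban-mrv2-artifacts</id>
    			<goals>
    				<goal>enforce</goal>
    			</goals>
    			<configuration>
    				<rules>
    					<!-- Transitive dependencies are checked as well by default -->
    					<bannedDependencies>
    						<excludes>
    							<exclude>org.apache.hadoop:hadoop-mapreduce-client-app</exclude>
    							<exclude>org.apache.hadoop:hadoop-mapreduce-client-common</exclude>
    							<exclude>org.apache.hadoop:hadoop-mapreduce-client-core</exclude>
    							<exclude>org.apache.hadoop:hadoop-mapreduce-client-jobclient</exclude>
    							<exclude>org.apache.hadoop:hadoop-mapreduce-client-shuffle</exclude>
    							<exclude>org.apache.hadoop:hadoop-yarn-api</exclude>
    							<exclude>org.apache.hadoop:hadoop-yarn-common</exclude>
    							<exclude>org.apache.hadoop:hadoop-yarn-server-common</exclude>
    						</excludes>
    					</bannedDependencies>
    				</rules>
    				<fail>true</fail>
    			</configuration>
    		</execution>
    	</executions>
    </plugin>
    ```

    With a rule like this, a pom that still depends on hadoop-client should fail early in the build instead of failing at job startup.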

    Hope this helps,

    P.S. Since RC2 there is no longer any need to specify the slf4j-log4j12 exclusion for spring-data-hadoop.

    [1] https://ccp.cloudera.com/display/CDH...ven+Repository
    Costin Leau
    SpringSource - http://www.SpringSource.com- Spring Training, Consulting, and Support - "From the Source"
    http://twitter.com/costinl
    Please use [ c o d e ] [ / c o d e ] tags

  4. #4

    As I mentioned, it looks like some YARN libraries are being pulled in.
    Specify 2.0.0-mr1-cdhXXX for hadoop-tools and hadoop-streaming, plus the generic 2.0.0-cdhXXX for hadoop-hdfs and hadoop-common.
    These four dependencies should be enough for CDH4 (that's what we use in the build).

    P.S. Also, since you would otherwise have two Hadoop distributions on the classpath (Apache Hadoop and CDH), you might want to exclude the Hadoop dependency that Spring Data Hadoop brings in.
    Costin Leau

  5. #5

    Thanks for looking up the needed dependencies. It's running without any problems now.

    This is the complete pom.xml that solves my problem.

    Code:
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    	<modelVersion>4.0.0</modelVersion>
    
    	<groupId>com.example</groupId>
    	<artifactId>com.example.main</artifactId>
    	<version>0.0.1-SNAPSHOT</version>
    	<packaging>jar</packaging>
    
    	<properties>
    		<java-version>1.7</java-version>
    		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    		<spring.version>3.2.0.RELEASE</spring.version>
    		<spring.hadoop.version>1.0.0.BUILD-SNAPSHOT</spring.hadoop.version>
    		<hadoop.version.generic>2.0.0-cdh4.1.3</hadoop.version.generic>
    		<hadoop.version.mr1>2.0.0-mr1-cdh4.1.3</hadoop.version.mr1>
    	</properties>
    
    	<dependencies>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-core</artifactId>
    			<version>${spring.version}</version>
    			<exclusions>
    				<exclusion>
    					<groupId>commons-logging</groupId>
    					<artifactId>commons-logging</artifactId>
    				</exclusion>
    			</exclusions>
    		</dependency>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-context</artifactId>
    			<version>${spring.version}</version>
    		</dependency>
    
    
    		<dependency>
    			<groupId>org.springframework.data</groupId>
    			<artifactId>spring-data-hadoop</artifactId>
    			<version>${spring.hadoop.version}</version>
    
    			<exclusions>
    				<!-- Exclude the Hadoop dependencies so that they are not mixed with the ones provided by Cloudera. -->
    				<exclusion>
    					<artifactId>hadoop-streaming</artifactId>
    					<groupId>org.apache.hadoop</groupId>
    				</exclusion>
    				<exclusion>
    					<artifactId>hadoop-tools</artifactId>
    					<groupId>org.apache.hadoop</groupId>
    				</exclusion>
    			</exclusions>
    
    		</dependency>
    
    		<!-- Hadoop Cloudera Dependencies -->
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-common</artifactId>
    			<version>${hadoop.version.generic}</version>
    		</dependency>
    		
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-hdfs</artifactId>
    			<version>${hadoop.version.generic}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-tools</artifactId>
    			<version>${hadoop.version.mr1}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.apache.hadoop</groupId>
    			<artifactId>hadoop-streaming</artifactId>
    			<version>${hadoop.version.mr1}</version>
    		</dependency>
    
    	</dependencies>
    
    	<build>
    		<plugins>
    
    			<plugin>
    				<groupId>org.apache.maven.plugins</groupId>
    				<artifactId>maven-compiler-plugin</artifactId>
    				<configuration>
    					<source>${java-version}</source>
    					<target>${java-version}</target>
    				</configuration>
    			</plugin>
    
    		</plugins>
    	</build>
    
    	<repositories>
    		<repository>
    			<id>spring-milestones</id>
    			<url>http://repo.springsource.org/libs-milestone</url>
    			<snapshots>
    				<enabled>false</enabled>
    			</snapshots>
    		</repository>
    
    		<repository>
    			<id>cloudera</id>
    			<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    			<snapshots>
    				<enabled>false</enabled>
    			</snapshots>
    		</repository>
    
    		<repository>
    			<id>spring-snapshot</id>
    			<name>Spring Maven SNAPSHOT Repository</name>
    			<url>http://repo.springframework.org/snapshot</url>
    		</repository>
    	</repositories>
    </project>

    Looking forward to seeing the final release of Spring Data Hadoop.

  6. #6

    Glad to help.
    FWIW, the GA release is almost there; there is still some work being done on the Hadoop samples.
    Costin Leau
