Adding JAR from HDFS to classpath in driver
hsn
Jul 4th, 2012, 01:46 PM
Is there a simple way to add a JAR from HDFS to the driver (Tool) classpath? I need to load a Spring XML configuration from it.
In the mapper/reducer, the distributed cache does the trick.
Costin Leau
Jul 5th, 2012, 02:13 PM
Could you explain in detail what you are trying to do? Some code snippets would help as well.
My understanding is that you have a Tool (called the driver) that you are trying to use, and it needs to access a Spring configuration stored in HDFS. If so, you could try the libs attribute: pass the path to your HDFS resource (as an absolute URL) to libs and see what happens.
hsn
Jul 7th, 2012, 02:03 AM
I want to have a generic Hadoop task starter. It will load a Spring context from a JAR on HDFS and start the bean with the supplied name. The tool will take 2 mandatory arguments on the command line: <JAR on HDFS with configs> <bean name>
The libjars option is based on URLClassLoader. Do you know the format of an HDFS URL?
if (line.hasOption("libjars")) {
    conf.set("tmpjars",
             validateFiles(line.getOptionValue("libjars"), conf));
    // setting libjars in client classpath
    URL[] libjars = getLibJars(conf);
    if (libjars != null && libjars.length > 0) {
        conf.setClassLoader(new URLClassLoader(libjars, conf.getClassLoader()));
        Thread.currentThread().setContextClassLoader(
            new URLClassLoader(libjars,
                               Thread.currentThread().getContextClassLoader()));
    }
}
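To illustrate what the libjars handling above relies on, here is a minimal JDK-only sketch of reading a resource (for example a Spring XML file) out of a JAR through a URLClassLoader. It uses a local temporary JAR and a file: URL rather than HDFS; the class name LibJarsDemo, the helper names, and the entry path are made up for the example and are not part of Hadoop or Spring.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; names are illustrative only.
class LibJarsDemo {

    /** Writes a one-entry JAR to a temporary file and returns its path. */
    static Path writeJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("libjars-demo", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(content.getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    /** Reads a resource through a URLClassLoader layered over the given JAR. */
    static String readFromJar(Path jar, String entryName) throws IOException {
        URL[] libjars = { jar.toUri().toURL() };
        // Same idea as the snippet above: a child loader over the extra JARs,
        // delegating to the existing loader as its parent.
        try (URLClassLoader loader =
                 new URLClassLoader(libjars, LibJarsDemo.class.getClassLoader());
             InputStream in = loader.getResourceAsStream(entryName)) {
            if (in == null) {
                throw new IOException("resource not found: " + entryName);
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeJar("META-INF/spring/context.xml", "<beans/>");
        System.out.println(readFromJar(jar, "META-INF/spring/context.xml"));
        Files.deleteIfExists(jar);
    }
}
```

With a file: URL this works out of the box. With an hdfs: URL the same URLClassLoader construction depends on the JVM knowing how to handle that protocol.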
hsn
Jul 7th, 2012, 02:52 AM
Calling getURL on the Hadoop Resource raises a MalformedURLException:
Exception in thread "main" java.net.MalformedURLException: unknown protocol: hdfs
    at java.net.URL.<init>(URL.java:590)
    at java.net.URL.<init>(URL.java:480)
    at java.net.URL.<init>(URL.java:429)
    at java.net.URI.toURL(URI.java:1098)
    at org.springframework.data.hadoop.fs.HdfsResource.getURL(HdfsResource.java:131)
Costin Leau
Jul 9th, 2012, 07:38 AM
Make sure you register the URL handlers for hdfs (see the file system namespace handler). In the next release we might simplify this by using the hdfs resource handler instead of the JVM URL system (which has some downsides).
As for your tool, see the example in M2 with Spring Batch. It is not Hadoop specific, but it takes care of bootstrapping a Spring container and executing a task (without the need of a container). The only thing that needs clarifying is handling the HDFS URL.
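As a rough illustration of why registering a URL handler matters, the JDK-only sketch below reproduces the "unknown protocol: hdfs" failure and shows that java.net.URL accepts the scheme once a URLStreamHandlerFactory is installed. The factory here is a stub written for this example (it cannot read from HDFS); in a real setup Hadoop's org.apache.hadoop.fs.FsUrlStreamHandlerFactory would be the one to register. The class name HdfsUrlDemo, the helper names, and the namenode address are assumptions.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical demo class; names are illustrative only.
class HdfsUrlDemo {

    private static boolean registered = false;

    /** Returns true if the JVM can currently parse the given URL string. */
    static boolean canParse(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    /**
     * Installs a stub factory for the "hdfs" scheme. The JVM allows
     * URL.setURLStreamHandlerFactory to be called only once, so this
     * method guards against repeated registration.
     */
    static synchronized void registerOnce() {
        if (registered) {
            return;
        }
        URL.setURLStreamHandlerFactory(new URLStreamHandlerFactory() {
            @Override
            public URLStreamHandler createURLStreamHandler(String protocol) {
                if (!"hdfs".equals(protocol)) {
                    return null; // fall back to the JDK's built-in handlers
                }
                return new URLStreamHandler() {
                    @Override
                    protected URLConnection openConnection(URL u) throws IOException {
                        // Stub only: a real handler would open the HDFS file.
                        throw new IOException("stub handler: no HDFS access");
                    }
                };
            }
        });
        registered = true;
    }

    public static void main(String[] args) {
        String spec = "hdfs://namenode:8020/libs/config.jar"; // assumed address
        System.out.println("before registration: " + canParse(spec));
        registerOnce();
        System.out.println("after registration:  " + canParse(spec));
    }
}
```

Because the factory can be set only once per JVM, it is a global decision that can conflict with other libraries wanting their own factory, which is presumably one of the downsides of the JVM URL system mentioned above.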
hsn
Jul 10th, 2012, 02:19 AM
I see the ConfigurationFactoryBean.setRegisterUrlHandler method.