Is there some simple way to add a JAR from HDFS to the driver (Tool) classpath? I need to load a Spring XML configuration from it.
In the mapper/reducer, the distributed cache does the trick.
Could you explain in detail what you are trying to do? Some code snippets would help as well.
My understanding is that you have a Tool (the driver) that you are trying to run, and it needs to access a Spring configuration stored in HDFS. If so, you could try the libs argument: pass the path to your HDFS resource (as an absolute URL) to the libs attribute and see what happens.
Costin Leau
SpringSource - http://www.SpringSource.com- Spring Training, Consulting, and Support - "From the Source"
http://twitter.com/costinl
Please use [ c o d e ] [ / c o d e ] tags
I want to have a generic Hadoop task starter. It will load a Spring context from a JAR on HDFS and start the bean with the supplied name. The tool will take 2 mandatory arguments on the command line: <JAR on HDFS with configs> <bean name>
The libjars option is based on URLClassLoader. Do you know the format of an HDFS URL?
Code:
if (line.hasOption("libjars")) {
  conf.set("tmpjars", validateFiles(line.getOptionValue("libjars"), conf));
  // setting libjars in client classpath
  URL[] libjars = getLibJars(conf);
  if (libjars != null && libjars.length > 0) {
    conf.setClassLoader(new URLClassLoader(libjars, conf.getClassLoader()));
    Thread.currentThread().setContextClassLoader(
        new URLClassLoader(libjars, Thread.currentThread().getContextClassLoader()));
  }
}
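To illustrate what a URLClassLoader does once it is given a usable URL, here is a minimal sketch using only the JDK. It builds a temporary JAR on the local file system (a stand-in for the config JAR, not something fetched from HDFS), points a URLClassLoader at it, and reads one resource back. The class name JarResourceDemo and the entry name demo-context.xml are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class JarResourceDemo {

    // Writes a temporary JAR with a single text entry (hypothetical stand-in for the config JAR).
    static Path writeJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("config", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(content.getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    // Reads a resource through a URLClassLoader built over the JAR's file: URL.
    static String readFromJar(Path jar, String entryName) throws IOException {
        URL[] urls = { jar.toUri().toURL() };
        try (URLClassLoader loader =
                 new URLClassLoader(urls, JarResourceDemo.class.getClassLoader());
             InputStream in = loader.getResourceAsStream(entryName)) {
            if (in == null) {
                throw new IOException("resource not found: " + entryName);
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeJar("demo-context.xml", "<beans/>");
        System.out.println(readFromJar(jar, "demo-context.xml"));
        Files.deleteIfExists(jar);
    }
}
```

The same lookup works for any URL scheme the JVM can resolve, which is why the hdfs scheme is the sticking point in the rest of this thread.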
Calling getURL on a Hadoop Resource raises MalformedURLException:
Code:
Exception in thread "main" java.net.MalformedURLException: unknown protocol: hdfs
    at java.net.URL.<init>(URL.java:590)
    at java.net.URL.<init>(URL.java:480)
    at java.net.URL.<init>(URL.java:429)
    at java.net.URI.toURL(URI.java:1098)
    at org.springframework.data.hadoop.fs.HdfsResource.getURL(HdfsResource.java:131)
Make sure you register the URL handlers for hdfs (see the file system namespace handler). In the next release we might simplify this by using the hdfs resource handler instead of the JVM URL system (which has some downsides).
As for your tool, see the example in M2 with Spring Batch - it's not Hadoop specific but it takes care of bootstrapping a Spring container and executing a task (without the need of a container). The only thing that needs clarifying is handling the HDFS URL.
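For reference, the JVM mechanism behind "unknown protocol: hdfs" can be sketched with the JDK alone. The example below registers a stub URLStreamHandlerFactory that recognizes the hdfs scheme, so that constructing the URL no longer throws. The stub handler, the class name and the host namenode:8020 are assumptions made for illustration; with Hadoop on the classpath, the factory one would normally pass is org.apache.hadoop.fs.FsUrlStreamHandlerFactory, which can actually open the stream.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

public class HdfsUrlHandlerDemo {

    private static boolean registered = false;

    // Hypothetical stand-in for a real hdfs handler: it lets the URL parse but cannot open it.
    static class StubHdfsHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("stub handler cannot open " + u);
        }
    }

    // Returns true if the JVM knows a handler for the URL's scheme.
    static boolean canParse(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // The JVM accepts a stream handler factory only once, so guard against a second call.
    static synchronized void registerStubFactory() {
        if (registered) {
            return;
        }
        URL.setURLStreamHandlerFactory(
            protocol -> "hdfs".equals(protocol) ? new StubHdfsHandler() : null);
        registered = true;
    }

    public static void main(String[] args) {
        String spec = "hdfs://namenode:8020/libs/config.jar"; // hypothetical location
        System.out.println("before: " + canParse(spec));
        registerStubFactory();
        System.out.println("after: " + canParse(spec));
    }
}
```

Because the factory can be set only once per JVM, any other library that also tries to install one will conflict with it, which is presumably one of the downsides of the JVM URL system mentioned above.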
Costin Leau
I see the setRegisterUrlHandler method on ConfigurationFactoryBean.