Hi Costin,
Thanks for the update.
Will try and let you know.
Hi Costin,
I did some basic testing and everything works correctly. I haven't yet tested the case where a property file is specified; I will work on that as well and give you an update.
Thanks for looking into this.
P.S.: It seems that I somehow downloaded the wrong snapshot version, one from February, which is why I had issues before.
Sincerely,
David
That's great! Let me know how it goes and of course, if you have any suggestions - bring them on.
Cheers,
Costin Leau
SpringSource - http://www.SpringSource.com - Spring Training, Consulting, and Support - "From the Source"
http://twitter.com/costinl
Please use [ c o d e ] [ / c o d e ] tags
Hi Costin,
Is this change going to be available in the Milestone release?
Sincerely,
David
Of course. This functionality (which has now been extended to hdp:job as well - meaning one can configure a Hadoop job (with all its dependencies) from an external jar, not on the classpath) will be available in the next release along with the HBase extensions and potentially some security improvements just to name a few.
The ETA is probably second half of May but don't quote me on that - keeping an eye on JIRA should help.
Hth,
Costin Leau
Nice to hear that
Btw, I was adding more jobs and I came across the following issue:
I have a my_job.jar, which has the following classes:

Code:
    package test.inner.mypackage;

    import test.inner.MultipleOutputNamingDecider;

    public class MyJob extends Configured implements Tool {

        public final JobConf createJobConf(String[] args) {
            final JobConf conf = new JobConf(getConf(), MyJob.class);
            conf.setJobName("My Job Name");
            ...
            conf.setOutputFormat(MultipleOutputNamingDecider.class);
        }

        public static void main(String[] args) {
            new MyJob().configuredBy(args).run();
            System.exit(0);
        }
    }

My tasklet is in the following form:

Code:
    <hdp:tool-tasklet id="MyJob_hadoopTasklet" scope="step"
                      configuration-ref="hadoop-configuration"
                      tool-class="test.inner.mypackage.MyJob"
                      jar="my_job.jar">
        ...
    </hdp:tool-tasklet>

When I run the tasklet, I get the following class not found exception:

Code:
    java.lang.RuntimeException: java.lang.ClassNotFoundException: test.inner.MultipleOutputNamingDecider
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1028)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:619)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:874)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
        at test.inner.mypackage.MyJob.run(MyJob.java:57)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.springframework.data.hadoop.mapreduce.ToolExecutor.runTool(ToolExecutor.java:47)
        at org.springframework.data.hadoop.mapreduce.ToolTasklet.execute(ToolTasklet.java:33)

I thought you had covered these cases, or am I mistaken?
P.S.: I am using version spring-data-hadoop-1.0.0.BUILD-20120423.231511-73.
Sincerely,
David
I'll take a look - I've probably missed a 'configuration' spot.
Costin Leau
Hi Costin,
Did you have a chance to look into this?
Sincerely,
David
Hi,
Sorry for the delay - I was on the road through the EU for the SpringOne / CloudFoundry tour.
I managed to replicate your problem and applied a fix. It is available in master and I forced a nightly build, so please go ahead and try out the latest snapshot.
The issue was that, during tool execution (and unfortunately throughout its usage), the Hadoop configuration neither preserves nor copies the classloader that was set on it, and it also relies on the thread context classloader (which is a fragile mechanism at best). This is now handled by the tool support - let me know whether the latest update works for you.
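To illustrate the thread context classloader part of the explanation above, here is a minimal sketch in plain Java (no Hadoop dependency). It is not the actual fix applied in Spring for Apache Hadoop; the class name ContextClassLoaderSwap and the method callWith are hypothetical names made up for this example. It only shows the general pattern of installing a jar-specific classloader on the current thread while some code runs and restoring the previous one afterwards. With Hadoop on the classpath, one would presumably also pass the same loader to Configuration.setClassLoader(...), but that step is left out here.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only.
public class ContextClassLoaderSwap {

    /**
     * Runs the action with the given loader installed as the thread context
     * classloader and always restores the previous loader afterwards.
     */
    public static <T> T callWith(ClassLoader loader, Callable<T> action) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.call();
        } finally {
            // Restore even if the action throws, so later code is unaffected.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        // Assumption: in a real setup the URL array would point at the job jar
        // (for example my_job.jar); it is left empty so the sketch runs anywhere.
        try (URLClassLoader jarLoader = new URLClassLoader(new URL[0], parent)) {
            ClassLoader seen = callWith(jarLoader,
                    () -> Thread.currentThread().getContextClassLoader());
            System.out.println("custom loader visible inside action: " + (seen == jarLoader));
            System.out.println("original loader restored afterwards: "
                    + (Thread.currentThread().getContextClassLoader() == parent));
        }
    }
}
```

The try/finally is the important bit: code that resolves classes through the context classloader sees the jar's classes only for the duration of the call, and a failure inside the action cannot leave the wrong loader on the thread.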
Cheers!
Costin Leau
Hi Costin,
Thanks for the update. I downloaded the latest snapshot and it worked for me.
We still have one type of job that I haven't tested, namely the one where I need to provide a property file on the fly.
I will test that on Monday and let you know if there are any issues.
Sincerely,
David