Can you post your config? Thanks.
Costin Leau
SpringSource - http://www.SpringSource.com- Spring Training, Consulting, and Support - "From the Source"
http://twitter.com/costinl
Please use [ c o d e ] [ / c o d e ] tags
Hmm - it seems there is no Hadoop configuration defined (a null one is being passed through). Again, can you post your config? Thanks.
Costin Leau
Costin - I noticed that the hbasetemplate isn't in the M1 release. If there is a fix for the scriptlet problem, I would like to move to the snapshot. Do you have any insight into why it is erroring out?
I can only guess without looking at a sample configuration file. As I mentioned before, posting your config with the scriptlet definition and the relevant dependencies would help a lot.
Costin Leau
It looks like your script was declared in a context without any Hadoop configuration. That missing configuration was then wired in to create other Hadoop components, resulting in exceptions. I've pushed a fix which logs warnings and does not bind the variables in this scenario.
Will be available in the next nightly build.
Costin Leau
Sorry - the config:
<configuration id="hadoop-configuration">
    fs.default.name=${hdfs.namenode:hdfs://localhost:9000}
</configuration>

<hbase-configuration id="hbase-configuration"
        configuration-ref="hadoop-configuration">
    hbase.zookeeper.quorum=server1.bericotechnologies.com
</hbase-configuration>

<batch:job id="job1">
    <batch:step id="import" next="adventureworks">
        <batch:tasklet ref="script-tasklet"/>
    </batch:step>
    <batch:step id="adventureworks">
        <batch:tasklet ref="adventureworks-tasklet" />
    </batch:step>
</batch:job>

<script-tasklet id="script-tasklet">
    <script location="cp-data.js">
        <property name="inputPath" value="cube-csv-inputs/adworks.csv" />
        <property name="outputPath" value="cubes/adwork" />
        <property name="localResource" value="cube-csv-inputs/adworks.csv" />
    </script>
</script-tasklet>

<tasklet id="adventureworks-tasklet" job-ref="adventureworks-job"/>

<job id="adventureworks-job" properties-location="classpath:conf/adventureworks.properties"
        configuration-ref="hbase-configuration"
        input-path="${input}"
        output-path="${output}"
        mapper="haruspex.etl.csv.CsvToCubeMapper"
        reducer="haruspex.etl.csv.CsvToCubeReducer"
        validate-paths="false"
/>
the script:
// delete job paths
if (fsh.test(inputPath)) { fsh.rmr(inputPath) }
if (fsh.test(outputPath)) { fsh.rmr(outputPath) }
// copy local resource using the streams directly (to be portable across envs)
inStream = cl.getResourceAsStream(localResource)
org.apache.hadoop.io.IOUtils.copyBytes(inStream, fs.create(inputPath), cfg)
We have a similar test suite in our test battery so it seems to be a wiring problem. I can tell you we dropped the hyphenated names in favor of CamelCase style (to allow auto-wiring to occur using by-name semantics). So try renaming hadoop-configuration to hadoopConfiguration.
You should have received a log warning (on debug level, but I've increased its priority) that no configuration was wired because more than one was found: the HBase one and the normal Hadoop config. In fact, if you remove the hbase part (just as an exercise), things should work.
Anyway, the name change should be enough (by the way, you don't have to specify an id for the configurations as they get wired automatically) and the upcoming nightly build should improve the messages and the behavior (no more exceptions even when no configuration is given).
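For reference, a minimal sketch of that rename applied to the config posted above. It assumes only the ids and the matching configuration-ref values change; the hbaseConfiguration id is picked by analogy with hadoopConfiguration, and the batch and tasklet definitions are left out because they stay as they were:

```xml
<!-- Sketch only: ids renamed from hyphenated to camelCase, everything else as posted -->
<configuration id="hadoopConfiguration">
    fs.default.name=${hdfs.namenode:hdfs://localhost:9000}
</configuration>

<!-- hbaseConfiguration is an assumed id, chosen to match the camelCase convention -->
<hbase-configuration id="hbaseConfiguration"
        configuration-ref="hadoopConfiguration">
    hbase.zookeeper.quorum=server1.bericotechnologies.com
</hbase-configuration>

<!-- The job keeps pointing at the HBase-aware configuration, under its new name -->
<job id="adventureworks-job" properties-location="classpath:conf/adventureworks.properties"
        configuration-ref="hbaseConfiguration"
        input-path="${input}"
        output-path="${output}"
        mapper="haruspex.etl.csv.CsvToCubeMapper"
        reducer="haruspex.etl.csv.CsvToCubeReducer"
        validate-paths="false"
/>
```

With the plain Hadoop configuration exposed as hadoopConfiguration, the script-tasklet should be able to pick it up by name without any extra attribute.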
Costin Leau
So the name change fixed the problem with the script-tasklet, although I noticed that there isn't a way to set configuration-ref on that element, which will be a problem if you ever wish to use another custom config. I am still getting the same problem with context.getConfiguration() in the reducer: the properties I put in hbaseConfiguration are there, but the list of files from context.getConfiguration().toString() still doesn't include hbase-site.xml or hbase-default.xml.
Raised https://jira.springsource.org/browse/SHDP-77 to allow a script to have its configuration passed in, not just auto-detected. As for the HBase problem, I'm not sure what the cause is and I can't reproduce it. Not sure why the configuration gets lost between the mapper and the reducer...
Raised an issue for that as well - hopefully I can track it down and solve it before M2 gets released.
Costin Leau
Forgot to add the HBase configuration issue: https://jira.springsource.org/browse/SHDP-78
Costin Leau