-
SHDP-77 is fixed, but I can't reproduce SHDP-78. I've added a test with a mapper and a reducer that are passed the hbaseConfiguration, and the proper object is passed through. I've added the toString() output just in case:
Code:
12:33:49,387 INFO Thread-7 mapred.MapTask - record buffer = 262144/327680
Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-costin/mapred/local/localRunner/job_local_0001.xml
...
12:33:52,311 INFO Thread-7 mapred.Merger - Down to the last merge-pass, with 0 segments left of total size: 0 bytes
12:33:52,311 INFO Thread-7 mapred.LocalJobRunner -
Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-costin/mapred/local/localRunner/job_local_0001.xml
12:33:52,338 INFO Thread-7 mapred.Task - Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
Test: http://j.mp/LPrXj4
Config: http://j.mp/LPrXzz
-
Exactly! The toString() should have hbase-default.xml, hbase-site.xml but it doesn't!
-
I see what you mean; however, the correct configuration is properly passed to the mapper/reducer.
Basically, a Hadoop config is a properties object. Some of those properties can be loaded from files through the addResource() method, and the names of those files are what toString() displays.
This is done properly on the client side. However, when the job information is serialized and sent to the mapper/reducer, only the properties (which are what actually matter) are sent; the rest of the information, such as the names of the resources used, is discarded.
On each node the Configuration object is created using the standard new Configuration/JobConf; this is standard Hadoop. So there are no addResource() calls on the mapper/reducer side, which instantiates the Configuration from the properties sent.
This means that, while toString() will differ, the Configuration content (meaning its properties) will be correct. Use that for comparison instead of toString(), which is not a proper indicator of the content sent (the mapper/reducer do not use the names of the resources added to a Config, only their content, which was already loaded on the client side and sent to the cluster).
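To illustrate the point, here is a rough sketch in plain Java. SketchConfig is a hypothetical stand-in I made up for this post, not the real Hadoop Configuration class, and the property name and value are only examples. It models a config that remembers resource names for toString() but only ships its properties, so the two sides print differently while holding the same content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a Hadoop-style configuration (NOT the real class).
class SketchConfig {
    // Names of the resources added on this side; only used by toString().
    final List<String> resources = new ArrayList<>();
    // The actual content: key/value properties.
    final Map<String, String> props = new TreeMap<>();

    // Models addResource(): remembers the name and loads the content.
    void addResource(String name, Map<String, String> content) {
        resources.add(name);
        props.putAll(content);
    }

    // Models the task side: rebuilt from the properties only, no resource names.
    static SketchConfig fromProperties(Map<String, String> sent) {
        SketchConfig c = new SketchConfig();
        c.props.putAll(sent);
        return c;
    }

    @Override
    public String toString() {
        return "Configuration: " + String.join(", ", resources);
    }

    // Compare by content (the properties), not by toString().
    static boolean sameContent(SketchConfig a, SketchConfig b) {
        return a.props.equals(b.props);
    }

    public static void main(String[] args) {
        SketchConfig client = new SketchConfig();
        client.addResource("hbase-site.xml",
                Map.of("hbase.zookeeper.quorum", "localhost"));
        SketchConfig task = SketchConfig.fromProperties(client.props);

        System.out.println(client);                     // lists hbase-site.xml
        System.out.println(task);                       // lists no resources
        System.out.println(sameContent(client, task));  // content comparison
    }
}
```

With the real classes, the equivalent check would be to iterate over the properties of both configurations and compare the entries rather than the printed resource list.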
-
My BAD
Costin,
I am so sorry for wasting your time!! I really appreciate you working with me.
The problem was an extra whitespace character in the hbaseConfiguration! When the properties are loaded into the config, they aren't trimmed.
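For anyone else who hits this, here is a small sketch of the kind of thing that bit me. It uses plain java.util.Properties as a stand-in (an assumption on my part; it is not the actual hbaseConfiguration loading code), and the key and host name are made up. Trailing whitespace in a value survives load(), so the value has to be trimmed explicitly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: shows untrimmed property values and one way to trim them.
class TrimSketch {
    // Loads properties from text; trailing whitespace in values is preserved.
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Returns a copy with surrounding whitespace removed from keys and values.
    static Properties trimmed(Properties in) {
        Properties out = new Properties();
        for (String key : in.stringPropertyNames()) {
            out.setProperty(key.trim(), in.getProperty(key).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Note the space after "localhost" (hypothetical example value).
        Properties raw = load("hbase.zookeeper.quorum=localhost \n");
        Properties clean = trimmed(raw);

        // Brackets make the stray space visible when printed.
        System.out.println("[" + raw.getProperty("hbase.zookeeper.quorum") + "]");
        System.out.println("[" + clean.getProperty("hbase.zookeeper.quorum") + "]");
    }
}
```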
-
Glad it is all sorted out. By the way, I've added a section on the DAO support for HBase in the docs (in case you're interested) [1].
[1] http://static.springsource.org/sprin...tml/hbase.html