What is the correct way to distribute third-party jar files to all nodes running a Map-Reduce job? Should I use the 'libs' attribute in the job configuration, or the distributed cache? Note that more than one jar needs to be distributed (xalan, serializer, etc.).
Also, is there a way to verify that either option has worked correctly, for example by checking a log or the job file? I have tried both and have not been able to run the job successfully, so I am not sure whether I am setting them up correctly.
I configured the distributed cache as below; these jars are available at the indicated HDFS paths.
<hdp:classpath value="/svjain/lib/xalan-2.7.1.jar" />
<hdp:classpath value="/svjain/lib/serializer-2.7.1.jar" />
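For context, here is a minimal sketch of how these two entries would typically sit inside the `<hdp:cache>` element of the Spring for Apache Hadoop namespace. The enclosing element and the `create-symlink` attribute are assumptions and are not part of the configuration quoted above; only the two `hdp:classpath` lines come from it.

```xml
<!-- Sketch only: the wrapping hdp:cache element and create-symlink="true" are assumed. -->
<hdp:cache create-symlink="true">
    <!-- Each entry is an HDFS path to a jar that should be added to the task classpath. -->
    <hdp:classpath value="/svjain/lib/xalan-2.7.1.jar" />
    <hdp:classpath value="/svjain/lib/serializer-2.7.1.jar" />
</hdp:cache>
```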
When trying the job 'libs' option, I used the configuration below, where install.dir.win and library.path refer to non-HDFS paths (library.path=lib/*.jar):