Standalone code with URLConnection:
Code:
BufferedReader reader = null;
try {
    URLConnection connection = new URL(url).openConnection();
    connection.setRequestProperty("Accept-Charset", "iso-8859-1");
    InputStream responseBody = connection.getInputStream();
    try {
        // No charset is passed here, so InputStreamReader uses the platform default.
        reader = new BufferedReader(new InputStreamReader(responseBody));
        for (String line; (line = reader.readLine()) != null;) {
            LOGGER.debug(line);
        }
    } finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException logOrIgnore) {}
        }
    }
} catch (MalformedURLException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
}
I cannot post the external URL, but the following is the string as logged by Spring Integration:
Code:
<p>OAKLAND, Calif. â The champagne was returned to storage. Reams of plastic taped onto the lockers as protection were rolled up and stuffed into the equipment manager's office.</p>
Notice the garbled character. If the same string is fetched using the standalone Java class, it renders as:
Code:
OAKLAND, Calif. – The champagne was returned to storage. Reams of plastic taped onto the lockers as protection were rolled up and stuffed into the equipment manager's office
In the standalone code, it works fine even if I remove the charset line. The charset was set based on the encoding we get in the response headers: "Content-Type: text/html; charset=iso-8859-1"
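To illustrate the symptom, here is a minimal sketch (not taken from the application) that decodes the same bytes two ways. It assumes the response body is actually UTF-8 encoded even though the header declares iso-8859-1, which I have not confirmed against the server. The class name CharsetDemo and its two helper methods are made up for the example.
Code:

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Decode the raw bytes as ISO-8859-1, i.e. what the Content-Type header declares.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode the raw bytes as UTF-8 (assumption: this is the real encoding of the body).
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The en dash (U+2013) is three bytes in UTF-8: E2 80 93.
        byte[] raw = "Calif. \u2013 The".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decodeUtf8(raw);
        String asLatin1 = decodeLatin1(raw);

        // UTF-8 decoding restores the en dash.
        System.out.println("UTF-8 round trip intact: " + asUtf8.equals("Calif. \u2013 The"));

        // ISO-8859-1 decoding maps each byte to one char: 0xE2 becomes 'â',
        // followed by two invisible control characters (0x80 and 0x93).
        System.out.println("Latin-1 char at dash position is a-circumflex: "
                + (asLatin1.charAt(7) == '\u00E2'));
        System.out.println("Extra chars after Latin-1 decoding: "
                + (asLatin1.length() - asUtf8.length()));
    }
}
```

If that assumption holds, the single "â" in the Spring Integration log would be the first of the three bytes of the en dash, with the other two decoded to non-printing characters.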