
I am having issues with character encoding when viewing dynamic content from a broker database.

I have a scriptlet that calls the broker DB to generate an XML string, which is then processed by an XSL stylesheet.

I stripped my code back while debugging this issue, and the scriptlet now looks like this:

.....

strOutput = "<xml>";
ComponentPresentationFactory cpf = new ComponentPresentationFactory(PublicationID);

for (int i = 0; i < itemURIs.length; i++) {
    ComponentPresentation cp = cpf.getComponentPresentation(itemURIs[i], strComponentTemplateURI);
    if (cp != null) {
        // Append the component presentation content as returned from the broker
        String content = cp.getContent();
        strOutput += content;
    }
}
strOutput += "</xml>";

......

When I manually override this code and hard-code the XML string content, the data is displayed correctly on screen, i.e.:

.....

strOutput = "<xml>";
ComponentPresentationFactory cpf = new ComponentPresentationFactory(PublicationID);

for (int i = 0; i < itemURIs.length; i++) {
    ComponentPresentation cp = cpf.getComponentPresentation(itemURIs[i], strComponentTemplateURI);
    if (cp != null) {
        String content = "<xml><dynamicContent><subtitle><![CDATA[Außenbeleuchtung]]></subtitle></dynamicContent></xml>";
        strOutput += content;
    }
}
strOutput += "</xml>";

......

The component is published to the content broker DB using a CT with its output format set to "XML Format".

The publication target is set up with Target Language: JSP and Default Code Page: Unicode UTF-8.

When I preview the content using this CT, the data is displayed correctly:

<dynamicContent>
    <tcm_id>tcm:345-23288</tcm_id>
    <title><![CDATA[LED Road R250 - Maximum LED performance for street and highway illumination]]></title>
    <subtitle><![CDATA[Außenbeleuchtung ]]></subtitle>
</dynamicContent>

This is also the case when previewing through template builder.

The broker DB is an Oracle DB (Oracle Database 11g Enterprise Edition Release 11.2.0.2.0), and I have checked the character set:

SQL> select * from v$nls_parameters where parameter like '%CHARACTERSET%'; 

PARAMETER                VALUE
NLS_CHARACTERSET         UTF8
NLS_NCHAR_CHARACTERSET   UTF8

Has anyone else come across anything like this before? It looks like there is an issue with either the DB storage, the connection to the DB, or the cp.getContent() method.

Any help would be greatly appreciated. If you have any further questions, please let me know.

Regards, Chris

What does the content look like when retrieved from SQLPlus? Have you tried setting your log level to debug and checking for any encoding issues on the getContent call? Have you tried using ComponentPresentationAssembler instead of ComponentPresentationFactory? - Nuno Linhares
Hi Nuno, I have viewed the content through the Oracle DB manager and there are no encoding issues there, but this could be because the viewer is handling the encoding. I have just implemented my code using the ComponentPresentationAssembler instead of the ComponentPresentationFactory and the end result was exactly the same. - Chris Eccles
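
One way to take any viewer out of the picture is to log the Unicode code points of the string that cp.getContent() returns. The following is only a minimal sketch: CodePointDump and its methods are hypothetical helpers, not part of the Tridion API, and misdecode merely simulates one common fault (UTF-8 bytes decoded as ISO-8859-1) for comparison. If the "ß" in the retrieved content shows up as U+00DF, the string is intact in memory and the problem lies later, on output. If it shows up as U+00C3 U+009F, it was damaged before the JSP wrote it.

```java
import java.nio.charset.StandardCharsets;

public class CodePointDump {

    // Hypothetical helper: renders each code point of s as "U+XXXX", space separated
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Simulates one common fault for comparison: UTF-8 bytes decoded as ISO-8859-1
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String intact = "Au\u00DFen"; // "Außen" written with an escape so the source file encoding does not matter
        System.out.println(codePoints(intact));
        System.out.println(codePoints(misdecode(intact)));
    }
}
```

In the scriptlet, the equivalent check would be to log codePoints(content) right after the cp.getContent() call and compare it with the same dump of the hard-coded string.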

2 Answers


Character encoding issues can be quite complex. Since you have already done a fair amount of investigation, I would start by checking that the JSP file has the correct page encoding set:

<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>
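
As a rough illustration of why the declared charset matters: the charset named in contentType determines how the response text is turned into bytes, so the same Java string ends up as different bytes on the wire. The sketch below uses a hypothetical helper, encodedHex, with plain String.getBytes standing in for the response writer. It is not how the JSP engine is implemented, only a way to see the difference.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ResponseCharsetDemo {

    // Hypothetical helper: encodes text with the given charset and returns the bytes as upper-case hex
    static String encodedHex(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String eszett = "\u00DF"; // the "ß" from "Außenbeleuchtung"
        // Two bytes when the response is encoded as UTF-8
        System.out.println(encodedHex(eszett, StandardCharsets.UTF_8));
        // A single byte when the response is encoded as ISO-8859-1
        System.out.println(encodedHex(eszett, StandardCharsets.ISO_8859_1));
    }
}
```

If the bytes are written in one encoding while the browser is told, or assumes, the other, the character is displayed garbled even though the string itself is correct.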