I am using some basic styles in CKEditor (bold, italic, etc.) to allow my users to style their text for report writing.

When this string is passed to iTextSharp, I strip the HTML; otherwise the tags are printed literally in the PDF. I strip it with:

Regex.Replace(item.DevelopmentPractice.ToString(), @"<[^>]*>|&nbsp;", String.Empty)

Is there a way to format the text in the PDF so that the bold is preserved but the tags themselves are not displayed, e.g.

<strong></strong>

UPDATE

I have provided full code below as requested.

public FileStreamResult pdf(int id)
{

    // Set up the document and the Memory Stream to write it to and create the PDF writer instance
    MemoryStream workStream = new MemoryStream();
    Document document = new Document(PageSize.A4, 30, 30, 30, 30);
    PdfWriter.GetInstance(document, workStream).CloseStream = false;

    // Open the pdf Document
    document.Open();

    // Set up fonts used in the document
    Font font_body = FontFactory.GetFont(FontFactory.HELVETICA, 10);
    Font font_body_bold = FontFactory.GetFont(FontFactory.HELVETICA, 10, Font.BOLD);

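    // 'item' is the record for this report; how it is loaded from 'id' is not shown in the post.
    Chunk cAreasDevelopmentHeading = new Chunk("Areas identified for development of practice", font_body_bold);

    // Strip all tags and &nbsp; entities so raw HTML is not printed (this also discards bold/italic).
    string developmentText = item.DevelopmentPractice != null
        ? Regex.Replace(item.DevelopmentPractice.ToString(), @"<[^>]*>|&nbsp;", String.Empty)
        : "";
    Chunk cAreasDevelopmentComment = new Chunk(developmentText, font_body);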

    Paragraph paraAreasDevelopmentHeading = new Paragraph();
    paraAreasDevelopmentHeading.SpacingBefore = 5f;
    paraAreasDevelopmentHeading.SpacingAfter = 5f;
    paraAreasDevelopmentHeading.Add(cAreasDevelopmentHeading);
    document.Add(paraAreasDevelopmentHeading);

    Paragraph paraAreasDevelopmentComment = new Paragraph();
    paraAreasDevelopmentComment.SpacingBefore = 5f;
    paraAreasDevelopmentComment.SpacingAfter = 15f;
    paraAreasDevelopmentComment.Add(cAreasDevelopmentComment);
    document.Add(paraAreasDevelopmentComment);

    document.Close();

    // document.Close() has already flushed the PDF into workStream (CloseStream = false keeps
    // it open), so only rewind it; writing ToArray() back would append a second copy.
    workStream.Position = 0;

    // Setup to Download
    HttpContext.Response.AddHeader("content-disposition", "attachment; filename=supportform.pdf");
    return File(workStream, "application/pdf");
}

Please show the code that you are using to turn HTML into PDF. - Chris Haas

1 Answer

This really is not the best way to do HTML to PDF, with or without iText. You are not actually converting HTML to PDF; you are stripping the markup and inserting the leftover text into the PDF using Chunks. I would look for a different method.

The most common way to do HTML to PDF with iText seems to be HTMLWorker (I think it is XMLWorker in newer versions), but people complain about that too; see this. It looks like you are building the PDF from native iText elements rather than from converted HTML, and you want to use HTML inside those elements. My guess is that this will be very, very hard.

In the linked HTMLWorker example, have a look at the structure of the program. They attempt an HTML-to-PDF conversion, and if that fails, they build the PDF with the other iText classes, such as Paragraph and Chunk. There they also set some styling on the Chunk.
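
For comparison, here is a minimal sketch of the converter route, assuming iTextSharp 5.x with the separate XMLWorker package referenced. The class and method names (HtmlToPdfSketch, Render) are made up for illustration, and the HTML coming out of CKEditor is assumed to be well-formed XHTML, which XMLWorker expects. I have not checked how it copes with everything CKEditor can emit.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdfSketch
{
    // Hypothetical helper: renders an XHTML fragment into a PDF and returns the bytes.
    public static byte[] Render(string html)
    {
        using (var ms = new MemoryStream())
        {
            var document = new Document(PageSize.A4, 30, 30, 30, 30);
            PdfWriter writer = PdfWriter.GetInstance(document, ms);
            document.Open();

            // XMLWorker maps tags such as <strong> and <em> to styled iText elements.
            using (var reader = new StringReader(html))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
            }

            // Assumption: the HTML produced some content; an empty document fails on Close().
            document.Close();

            // ToArray() still works after the writer has closed the MemoryStream.
            return ms.ToArray();
        }
    }
}
```

This replaces the whole document-building code rather than slotting into an existing Paragraph, which is why it may not fit the layout in the question.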

I guess that you would have to parse the incoming HTML, split it into pieces yourself, convert the <strong> elements to Chunks with styling, and only then vomit them onto the PDF. Now imagine doing that with a data source like CKEditor: even with a very strict ACF it would be a nightmare. If anyone knows of any other way than this, I want to know too (I basically do CKEditor to PDF for a living)!
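
To make the do-it-yourself approach concrete, here is a rough sketch of the splitting step only, assuming the editor is restricted to flat bold and italic markup. StyledRun and SimpleHtmlRuns are hypothetical names, nothing here is part of iTextSharp, and a regex is not a real HTML parser, so anything beyond very simple input will break it.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical container for one piece of text and the styles active on it.
public sealed class StyledRun
{
    public string Text { get; private set; }
    public bool Bold { get; private set; }
    public bool Italic { get; private set; }

    public StyledRun(string text, bool bold, bool italic)
    {
        Text = text;
        Bold = bold;
        Italic = italic;
    }
}

public static class SimpleHtmlRuns
{
    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    private static readonly Regex Tag =
        new Regex(@"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>", RegexOptions.Compiled);

    public static List<StyledRun> Parse(string html)
    {
        var runs = new List<StyledRun>();
        if (string.IsNullOrEmpty(html)) return runs;

        int bold = 0, italic = 0, pos = 0;
        foreach (Match m in Tag.Matches(html))
        {
            // Emit the text before this tag with the styles currently open.
            AddRun(runs, html.Substring(pos, m.Index - pos), bold > 0, italic > 0);
            pos = m.Index + m.Length;

            int delta = m.Groups[1].Value == "/" ? -1 : 1;
            string name = m.Groups[2].Value.ToLowerInvariant();
            if (name == "strong" || name == "b") bold = Math.Max(0, bold + delta);
            else if (name == "em" || name == "i") italic = Math.Max(0, italic + delta);
            // Every other tag is dropped, as the Regex.Replace in the question does.
        }
        AddRun(runs, html.Substring(pos), bold > 0, italic > 0);
        return runs;
    }

    private static void AddRun(List<StyledRun> runs, string raw, bool bold, bool italic)
    {
        if (raw.Length == 0) return;
        // Decode entities such as &amp;; note that &nbsp; becomes a non-breaking space.
        runs.Add(new StyledRun(WebUtility.HtmlDecode(raw), bold, italic));
    }
}
```

Each run could then become a Chunk in the question's code, for example new Chunk(run.Text, run.Bold ? font_body_bold : font_body), added to paraAreasDevelopmentComment in order. Italic and bold-italic would need their own Font instances.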

Do you have any other options, such as creating your own editor or using some other PDF technique? I use wkhtmltopdf, but my situation is very different. I would use PrinceXML, but it is too expensive.