3
votes

I searched a lot but couldn't get the answer.

I want to retain the formatting of text copied from a PDF into a WYSIWYG editor (CKEditor). Styles are retained when I copy from Word files, but not when I copy from a PDF.

The original PDF looks like this (I can't post images because my reputation is < 10, so please refer to the links):

[Image: PDF text]

After copy and paste, the editor shows the following output:

[Image: After copy paste in WYSIWYG editor]

Please suggest a plugin or code snippet for PDF to RTF conversion.

Thanks


3 Answers

6
votes

CKEditor can only paste the data it gets from the browser. If the browser provides nothing more than plain text, there is nothing CKEditor can do.

Since version 4.5, CKEditor provides a facade over the Clipboard API that exposes all pasted data directly in the paste event. Every browser provides different data, and you can easily check what is available:

editor.on( 'paste', function( evt ) {
  var types = evt.data.dataTransfer.$.types;

  console.log( types );

  for ( var i = 0; i < types.length; i++ ) {
    console.log( evt.data.dataTransfer.getData( types[ i ] ) );
  }

  // Additionally you can get information about pasted files.
  console.log( evt.data.dataTransfer.getFilesCount() );
} );

Note that Internet Explorer does not provide the types array and supports only the Text and URL types.

To learn more, see the Clipboard Integration guide, especially the chapter "Handling Various Data Types with Clipboard API", which describes how to integrate a data converter with the paste event. If any browser makes the PDF data available, you can use it during pasting.
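As a rough illustration of where such a converter could plug in, here is a minimal sketch. The helpers plainTextToHtml, htmlForPaste and attachPasteConverter are hypothetical names, not part of CKEditor. The sketch only assumes the paste event's evt.data.dataTransfer.getData() and evt.data.dataValue. CKEditor already converts plain text on its own, so this only shows the shape of a custom converter, not a way to recover formatting the browser never provided.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical converter: blank lines separate paragraphs,
// single line breaks inside a paragraph become <br>.
function plainTextToHtml(text) {
  return text
    .split(/\r?\n\s*\r?\n/)
    .map(function (block) { return block.trim(); })
    .filter(function (block) { return block.length > 0; })
    .map(function (block) {
      return '<p>' + escapeHtml(block).replace(/\r?\n/g, '<br>') + '</p>';
    })
    .join('');
}

// Returns converted HTML, or null when the browser already supplied
// text/html (in that case the editor's own handling is left alone).
function htmlForPaste(dataTransfer) {
  if (dataTransfer.getData('text/html')) {
    return null;
  }
  var text = dataTransfer.getData('text/plain') || dataTransfer.getData('Text');
  return text ? plainTextToHtml(text) : null;
}

// Hypothetical wiring for a CKEditor 4.5+ instance.
function attachPasteConverter(editor) {
  editor.on('paste', function (evt) {
    var html = htmlForPaste(evt.data.dataTransfer);
    if (html !== null) {
      evt.data.dataValue = html;
    }
  });
}
```

Calling attachPasteConverter( editor ) once after the editor is created would be enough. A real PDF-specific converter would replace plainTextToHtml with whatever parsing fits the data a given browser actually exposes.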

3
votes

If this is a common case in your system, then imho the best thing you can do is let users upload the PDF file, run server-side software to transform the PDF into HTML, and then automatically insert the result into CKEditor.

I have no recommendations though on which application to use.
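The client side of that flow could look roughly like the sketch below. The /convert-pdf endpoint, the pdf form field name, and the uploadPdf and insertConvertedPdf helpers are all assumptions made for illustration. The only CKEditor call used is editor.insertHtml(). The server-side converter itself is out of scope here.

```javascript
// Hypothetical uploader: POSTs the file to an assumed /convert-pdf endpoint
// that responds with the converted HTML as text.
function uploadPdf(file) {
  var body = new FormData();
  body.append('pdf', file);
  return fetch('/convert-pdf', { method: 'POST', body: body })
    .then(function (response) {
      if (!response.ok) {
        throw new Error('Conversion failed: ' + response.status);
      }
      return response.text();
    });
}

// Converts the file (with uploadPdf unless another converter is passed in)
// and inserts the resulting HTML at the current selection.
function insertConvertedPdf(editor, file, convert) {
  return (convert || uploadPdf)(file).then(function (html) {
    editor.insertHtml(html);
    return html;
  });
}
```

The convert parameter is optional and exists mainly so the conversion step can be swapped or stubbed without touching the editor code.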

1
votes

The problem is that PDF files work differently from other text documents, so even if you paste their contents into a native word processor you won't get the same formatting.

This varies by PDF reader, but typically images aren't pasted, tables are flattened into plain-text lines, and so on.

If that happens in a native program with full access to the clipboard, you can't expect anything better from a JavaScript application that depends on whatever data the browser provides. Even then, be careful with CKEditor: by default it includes filters that remove any formatting it doesn't recognize, so even more information can be lost at this last step.
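If the filtering turns out to be what strips the styles, a configuration along these lines may help. This is a sketch only: the editor1 element ID and the exact list of styles are assumptions, and it cannot bring back formatting that the browser never put on the clipboard.

```javascript
// Assumed configuration: allow a few inline styles in addition to whatever
// the enabled plugins already permit.
var editorConfig = {
  extraAllowedContent:
    'span{color,background-color,font-size,font-family}; p{text-align}'
};

// 'editor1' is a hypothetical textarea ID; the guard keeps this snippet
// harmless when CKEditor is not loaded.
if (typeof CKEDITOR !== 'undefined') {
  CKEDITOR.replace('editor1', editorConfig);
}
```

Setting allowedContent to true disables the content filter entirely, which is simpler but lets any pasted markup through. The separate pasteFilter setting (available since 4.5, as far as I know) may also affect pasted content in some browsers, so it is worth checking if styles still disappear.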