0
votes

I need to read a file in sharepoint document library from node js and store it in a variable. I can read .txt files using sharepoint rest api. sharepoint site/_api/web/ GetFileByServerRelativeUrl('/Shared%20Documents/s_C_locations.txt')/$value.But when I try to read .docx file it returns something as below:

PK!����]�[Content_Types].xml ��(����N�0E�H�C�-j\@B5e�c  H���LZ�_�L��=�@��"Ҳ���s��鋳�$4�b?����B܍.{�"CR�T6x(�P�ww�y�8�c!�D�DJ�Sp
���J�Sįi"�ҏj��?�:xO=�5�pp��Y�.^��I�";{�X{B�h�VĤ�ɗ_\z�9G6{pj"�1��K"L�8W#6�cj��P�^�\�dJ
���z)��_���� Xml/item2.xmlPK-!�H���`customXml/itemProps1.xmlPK-!\�'"�(�acustomXml/_rels/item2.xml.relsPK-!���.j�ccustomXml/item3.xmlPK-!�^����<ecustomXml/itemProps3.xmlPK-!�U�j�OfdocProps/custom.xmlPK-!{���(�hcustomXml/_rels/item3.xml.relsPK��

Please help me to resolve it

1
this is text representation of a binary file (.zip), everything is fine. What you need is proper viewer, check this answer: stackoverflow.com/a/27958186/1498401 - tinamou
Hi @tinamou,I don't need to render it in a browser.I want to read a file content in node js and store it in a variable. - Muthu Prasanth
how you understand file content? Word documents (.docx) have fonts, paragraphs, tables, objects, images etc. not only text like .txt files - tinamou
Yes .docx file contain all those you mentioned. Is there any way to read those files,because in my project I need to read the content and do analysis on that.And another one is if the .docx present in the local(inside my project folder) I can read it using the textract module in node js. But the use case is to read from sharepoint document library - Muthu Prasanth
there are textract methods to extract from buffer or url - tinamou

1 Answers

1
votes

Using encoding: null is a key here.

Here is an example how to download a file from SharePoint correctly with Node.js and REST API.

You can simplify a task and grab ready to use libraries like sppull or sp-download for example.

Cheers, Andrew