0
votes

I created a script which working on data from SQL server and this data has a names of several items in Arabic & English language and i get Unknown Characters and (???) in Data-frame result As in enter image description here And i tried to cjhange the encoding to "UTF-8" by using DBMSencoding=utf-8 but i get the same result. Any one can help to solve this problem? this's my code:

`

dbConnection <- "Driver={SQL Server};Server=.;Database=****;Uid=******;Pwd=****;DBMSencoding=utf-8"
con <- odbcDriverConnect(connection = dbConnection)
Orders <- sqlQuery(con, "SELECT 
                   DOC.ID AS [Transaction]
                   ,CASE
                   WHEN DOCD.ServiceItemID IS NOT NULL 
                   THEN dbo.RemoveNumericCharacters(CONVERT(NVARCHAR (MAX),SRI.Description))
                   WHEN DOCD.StockItemID IS NOT NULL 
                   THEN dbo.RemoveNumericCharacters(CONVERT(NVARCHAR (MAX),STI.Description))  
                   END AS Item
                   ,CASE
                   WHEN  DOCD.ServiceItemID IS NOT NULL
                   THEN 'ServiceItem'
                   WHEN DOCD.StockItemID IS NOT NULL
                   THEN 'StockItem'
                   END AS [ItemType]
                   FROM Sales.Sls_Documents DOC
                   INNER JOIN Sales.Sls_DocumentDetails DOCD
                   ON DOCD.DocumentID = DOC.ID
                   LEFT  JOIN Sales.Sls_ServiceItems SRI
                   ON SRI.ID = DOCD.ServiceItemID
                   LEFT JOIN Warehouse.StockItems STI
                   ON STI.Id = DOCD.StockItemID
                   WHERE CASE
                   WHEN DOCD.ServiceItemID IS NOT NULL 
                   THEN SRI.Description
                   WHEN DOCD.StockItemID IS NOT NULL
                   THEN STI.Description
                   END IS NOT NULL")
Orders$Item <- as.character(Orders$Item)

`

sessionInfo(): R version 3.4.3 (2017-11-30) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale: 1 LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252

2
Please type sessionInfo() and paste the part about "locale" into your question.G5W

2 Answers

0
votes

you are using r-studio to display the data.frame? If so i made the experience, that sometimes special characters are not correctly displayed in the r-studio viewer but are correctly saved in the data.frame itself.

To check if your DF saves the values correctly - could you call your data-frame "DF" via the console in R-Studio, so it is printed out in the console itself - i found out that, in my case at least, there the coding of the character vars was displayed correctly.

So in conclusion this might just be a viewing "artefact" of r-studio.

If not, maybe a recoding of values after the import from sql in R is needed - but here i have not enough experience. Maybe this may help: Link to base-R converter

Best regards

Sebastian

0
votes

I Solved This Problem With to steps:

  1. I Used the odbc Package and assigned the encoding argument of the dbGetQuery Fun to "Windows-1256"
  2. Set the LC_ALL Of Sys.setlocale to 'Arabic' And after doing this tow steps i Solved this problem enter image description here