I have two issues related to Swedish characters. I am fetching data directly from an MS SQL database.

1. Could anyone give me a hint on how to change escape codes such as <f6> back to Swedish characters in R?

I used write.csv to write the data out to a CSV file, then copied and pasted the strings here to build the df as follows:

library(tidyverse)
library(ggplot2)
library(scales)

c <- c("c","u","m","j","c","u","m","j","c","u","m","j")
city <- c("G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping")
priority <- c(1,1,1,1,0,0,0,0,2,3,3,2)
n_cust <- sample(50:1000, 12, replace=T)
df <- data.frame(c,city,priority,n_cust)

<f6> should be ö and <e5> should be å.

2. Interestingly enough, if I use the following code:
dpri %>% group_by(kommun, artikel_prioritet) %>% 
  summarise(n_cust=n_distinct(kund_id),
            sum_sales=sum(p_sum_adj_sale),
            avg_margin=mean(pp_avg_margin),
            avg_pec_sales=mean(p_pec_sales)) %>% 
  arrange(desc(sum_sales)) %>% 
  head(20)%>% 
  ggplot(aes(x=reorder(kommun, sum_sales), y=sum_sales, 
  fill=factor(artikel_prioritet))) +
  geom_bar(stat='identity')+
  coord_flip()+
  scale_y_continuous(labels = comma)+
  facet_grid(.~ factor(artikel_prioritet), scales = "free")+
  theme(legend.position="none")

I got this error: Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : invalid input 'Göteborg' in 'utf8towcs'

If I first store the head(20) result in a variable ci and then plot ci with ggplot:

ggplot(ci,aes(x=reorder(kommun, sum_sales), y=sum_sales, fill=factor(artikel_prioritet))) + geom_bar(stat='identity')+
coord_flip()+ scale_y_continuous(labels = comma)+ facet_grid(.~ factor(artikel_prioritet), scales = "free")+
theme(legend.position="none")

I get a bar chart without any city labels. When I then print ci, I get the following: [image: printed output of ci]

Then I wrote the head(20) result to a CSV file, 'cityname.csv', read it back into R with read.csv, and used the same code to draw the bar chart:

ci <- read.csv("cityname.csv")

ggplot(ci,aes(x=reorder(kommun, sum_sales), y=sum_sales, fill=factor(artikel_prioritet))) + geom_bar(stat='identity')+
coord_flip()+ scale_y_continuous(labels = comma)+ facet_grid(.~ factor(artikel_prioritet), scales = "free")+
theme(legend.position="none")

I got the following: [image: bar chart]

We can see the city labels this time, but they show escape codes such as <f6> instead of the Swedish characters. I would appreciate suggestions on how to fix the Swedish strings. Is there also a way to get the bar chart right without writing to CSV and reading the file back in?

Thank you!

1 Answer

I believe your issue is that R doesn't know how to interpret your character encoding. Try \u notation instead of <>. A \uXXXX escape names a Unicode code point, and R marks strings written this way as UTF-8:

> city <- c("G\u00f6teborg", "Ume\u00e5", "Malm\u00f6", "J\u00f6nk\u00f6ping","G\u00f6teborg", "Ume\u00e5", "Malm\u00f6", "J\u00f6nk\u00f6ping","G\u00f6teborg", "Ume\u00e5", "Malm\u00f6", "J\u00f6nk\u00f6ping")
> Encoding(city)
 [1] "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8"
> head(city)
[1] "Göteborg"  "Umeå"      "Malmö"     "Jönköping" "Göteborg"  "Umeå" 
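Since the data comes from a database rather than from typed literals, it may be easier to re-encode the strings than to retype them. The following is a minimal sketch, not a confirmed diagnosis: it assumes the bytes arriving from MS SQL are Latin-1 (or Windows-1252) and are not marked as such in R, which is one common reason for a string to print as G<f6>teborg in a UTF-8 session. The variable names raw_city and city_utf8 are made up for illustration.

```r
# Simulate a Latin-1 string arriving with no encoding mark.
# ASSUMPTION: the real data is Latin-1 encoded; 0xf6 is "ö" in Latin-1.
raw_city <- rawToChar(as.raw(c(0x47, 0xf6, 0x74, 0x65, 0x62, 0x6f, 0x72, 0x67)))
Encoding(raw_city)   # no encoding is declared for these bytes

# Convert the bytes from Latin-1 to UTF-8 so that printing and plotting
# code can interpret them.
city_utf8 <- iconv(raw_city, from = "latin1", to = "UTF-8")
Encoding(city_utf8)
print(city_utf8)
```

If that assumption holds for your data, the same iconv() call could be applied to the kommun column (or to its levels, if it is a factor) before plotting, which would avoid the round trip through write.csv and read.csv.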

EDIT: You asked a good follow-up question about how to make this replacement programmatically. I have provided a solution for that below, using the tidyverse packages dplyr and stringr:

> city <- c("G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping")
> city_df <- as.data.frame(city)

> special_character_replacements <- c("<f6>" = "\\u00f6", "<e5>" = "\\u00e5")
> city_df %>% 
    dplyr::mutate(city_fixed = 
        stringr::str_replace_all(city, special_character_replacements))

              city city_fixed
1      G<f6>teborg   Göteborg
2          Ume<e5>       Umeå
3         Malm<f6>      Malmö
4  J<f6>nk<f6>ping  Jönköping
5      G<f6>teborg   Göteborg
6          Ume<e5>       Umeå
7         Malm<f6>      Malmö
8  J<f6>nk<f6>ping  Jönköping
9      G<f6>teborg   Göteborg
10         Ume<e5>       Umeå
11        Malm<f6>      Malmö
12 J<f6>nk<f6>ping  Jönköping
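
The lookup table above has to list every code by hand. If other codes can occur (for example <e4> for ä), a generic decoder might be more convenient. The sketch below uses base R only; decode_hex_escapes is a hypothetical helper name, and it assumes every <xx> code is a single Latin-1 byte, in which case the byte value equals the Unicode code point.

```r
# Hypothetical helper: turn every literal "<xx>" hex code into its character.
# ASSUMPTION: each code is one Latin-1 byte, so the hex value is also the
# Unicode code point of the character.
decode_hex_escapes <- function(x) {
  # Locate all two-digit hex codes wrapped in angle brackets
  m <- gregexpr("<[0-9a-fA-F]{2}>", x)
  # Replace each matched code with the character at that code point
  regmatches(x, m) <- lapply(regmatches(x, m), function(codes) {
    intToUtf8(strtoi(substr(codes, 2, 3), base = 16L), multiple = TRUE)
  })
  x
}

city <- c("G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping")
city_fixed <- decode_hex_escapes(city)
print(city_fixed)
```

Strings without any code pass through unchanged, so the function should be safe to apply to a whole column.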