1
votes

It seems rCharts doesn't work well with Chinese

Background:

I have a CSV file containing Chinese characters encoded in GB2312 (the system default). Here is a sample of the CSV:

date,title,name,id,message

"2014-10-07 8:42:37","元老",879231132,879231132,"加 "

"2014-10-07 8:43:50","元老",879231132,879231132,"这么多空格,不加引号。怎么行。 "

"2014-10-07 8:45:10","新人",451635342,451635342,"想问一下,如果有一些专业词汇不懂 找谁帮忙呀? "

"2014-10-07 8:45:30","大神",532594859,532594859,"发出来,一起研究 "

Problem:

I read the file with read.csv, and the values print correctly in the R console. But when I use them as labels in an hPlot chart, they show up as gibberish (meaningless characters). I've tried Encoding(title) <- "UTF-8" and enc2utf8(), but neither works. How can I fix this? Any ideas would be a great help.

Other Info:

R version 3.1.1 (2014-07-10)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] RJSONIO_1.3-0 httr_0.5 rCharts_0.4.5

loaded via a namespace (and not attached):
[1] grid_3.1.1 lattice_0.20-29 plyr_1.8.1 Rcpp_0.11.3 stringr_0.6.2 tools_3.1.1
[7] whisker_0.3-2 yaml_2.1.13

Here is my code:

library(rCharts)
library(httr)
library(RJSONIO)
library(data.table)
parsed_data <- read.csv("gb2312.csv", header = TRUE, sep = ",", quote = "\"")
get_top_n_speakers <- function(n = 50){
  data <- subset(parsed_data, select = c(id, name, title))
  freq_data <- data.frame(table(data$id))
  colnames(freq_data) <- c("id","msg_cnt")

  desc_data <- data[!duplicated(data$id),]

  df <- merge(desc_data,freq_data,by="id")
  set.seed(666)
  random <- runif(nrow(desc_data))
  df <- cbind(df,random)
  df <- df[order(df$msg_cnt,decreasing = TRUE,na.last = TRUE),]

  df <- head(x = df,n = n)

  h2 <- hPlot(
    x = "random",
    y = "msg_cnt",
    data = df,
    type = "scatter",
    title = paste("前",n,"个成员",sep=" "),
    group ="title",
    radius = 5

  )

  h2$xAxis(title = NULL, labels = list(format = " "))
  h2$tooltip(useHTML = TRUE, formatter = "#! function() {
        return 'Msg count: <b>' + this.y + '</b><br> Title:<b> '+ this.series.name+'</b><br>name:<b>'+this.name+'</b>';
    } !#")

  h2

}

2 Answers

0
votes

I found that iconv helps with some characters, but others still come out as gibberish:

iconv(x, from = "gb2312", to = "utf-8")
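A possible reason the Encoding(title) <- "UTF-8" attempt from the question had no effect: that assignment only changes the encoding R declares for a string and leaves the bytes untouched, whereas iconv rewrites the bytes. The sketch below illustrates the difference on the sample title "元老". The variable names are made up for illustration, and the GBK bytes are produced with iconv itself so the script does not depend on the encoding of the source file.

```r
# "元老" written with Unicode escapes, so this script stays ASCII-only.
utf8 <- "\u5143\u8001"

# Simulate what read.csv hands back from a GB2312/GBK file: the same text,
# but stored as GBK bytes (2 bytes per character).
gbk <- iconv(utf8, from = "UTF-8", to = "GBK")

# Relabelling: only the declared encoding changes, the bytes stay GBK.
relabelled <- gbk
Encoding(relabelled) <- "UTF-8"
identical(charToRaw(relabelled), charToRaw(gbk))   # TRUE: nothing was converted

# Converting: iconv rewrites the bytes as real UTF-8.
converted <- iconv(gbk, from = "GBK", to = "UTF-8")
identical(charToRaw(converted), charToRaw(utf8))   # TRUE: bytes now match UTF-8
```

So a relabelled string would still carry GBK bytes, just declared as UTF-8, while the iconv result is real UTF-8.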

-1
votes

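How about using 'GBK' instead of 'gb2312'? GBK is a superset of GB2312, so it also covers characters that GB2312 lacks, and code page 936 in your locale is Microsoft's GBK implementation. It worked well for me: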

iconv(x, from = "GBK", to = "UTF-8")
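To apply that conversion to the whole data frame before it reaches hPlot, something like the following might work. It is only a sketch: to_utf8_columns is a hypothetical helper (not part of rCharts or base R), and it assumes the gibberish comes from GBK bytes being sent to the browser where UTF-8 is expected.

```r
# Hypothetical helper: re-encode every text column of a data frame to UTF-8.
# `from` is the encoding the bytes are currently in (assumed GBK here).
to_utf8_columns <- function(df, from = "GBK") {
  for (col in names(df)) {
    if (is.factor(df[[col]])) {
      # read.csv in R 3.x returns factors by default: convert the levels.
      levels(df[[col]]) <- iconv(levels(df[[col]]), from = from, to = "UTF-8")
    } else if (is.character(df[[col]])) {
      df[[col]] <- iconv(df[[col]], from = from, to = "UTF-8")
    }
    # Numeric and other column types are left untouched.
  }
  df
}

# Small demonstration with one GBK-encoded title ("元老" via Unicode escapes).
gbk_title <- iconv("\u5143\u8001", from = "UTF-8", to = "GBK")
demo <- data.frame(id = 879231132, title = gbk_title, stringsAsFactors = FALSE)
demo_utf8 <- to_utf8_columns(demo)
identical(charToRaw(demo_utf8$title), charToRaw("\u5143\u8001"))  # TRUE
```

In the question's code this would be called once on the data frame returned by read.csv, before the subset and merge steps, so that every label passed to hPlot is already UTF-8.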