I'm having a problem with the encoding of the results when I search for tweets. Below is my code (after authentication):

load("twitteR_credentials")
registerTwitterOAuth(twitCred)

download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")


mach_tweets = searchTwitter("bradesco", n=10, lang="pt", cainfo="cacert.pem", encoding='utf-8')

mach_text = sapply(mach_tweets, function(x) x$getText())

When I print the contents of mach_text, I get:

 [1] "Sexta meu amor, eu amo você! (@ Bradesco Promotora) http://t.co/evFs3BnvbV"                                                                       
 [2] "RT @LeiSecaFortal: “@luciadeboraa: @LeiSecaFortal acidente entre tópic 06 e um palio na Av. Antônio sales em frente ao bradesco.transito le…"
 [3] "RT @LeiSecaFortal: “@luciadeboraa: @LeiSecaFortal acidente entre tópic 06 e um palio na Av. Antônio sales em frente ao bradesco.transito le…"
 [4] "RT @DanielSoaresmh: I'm at Bradesco (Cajazeiras, PB) http://t.co/Zl3pgZ01ND"                                                                       
 [5] "RT @EquipeManuGTeen: Quem já comprou seu ingresso pro show da @manugavassi em SP, dia 6/4 no Teatro Bradesco?"                                    
 [6] "I'm at Bradesco (Vitória da Conquista, BA) http://t.co/wmWPnRsY7z"                                                                                
 [7] "RT @proconspoficial: Bradesco não pode bloquear ou cancelar cartão de crédito de inadimplente com o banco http://t.co/zjf27oAKkK"               
 [8] "ALÔ EMBU BUAÇU! \nA Estrela lojas e o banco Bradesco agora uniram-se para facilitar sua vida. EXATAMENTE! Evite... http://t.co/nUvYQ3J2o3"       
 [9] "SERVIÇOS: No @CidadeJardimRN temos caixas eletrônicos do Banco do Brasil, Banco 24h, Caixa Econômica, Bradesco, Santander, ITAÚ E HSBC."       
 [10] "RT @estelanaime: @Bradesco se encontrar um leitor de código de barras." 

Does anyone know how to solve this encoding problem?

Here's some info:

 sessionInfo()

R version 3.0.2 (2013-09-25) Platform: i386-w64-mingw32/i386 (32-bit)

locale: [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages: [1] graphics grDevices utils datasets stats methods base

other attached packages: [1] seqinr_3.0-7 wordcloud_2.4 RColorBrewer_1.0-5 Rcpp_0.11.0 tm_0.5-10 twitteR_1.1.7 rjson_0.2.13
[8] ROAuth_0.9.3 digest_0.6.4 RCurl_1.95-4.1 bitops_1.0-6 sp_1.0-14 ggplot2_0.9.3.1

loaded via a namespace (and not attached): [1] colorspace_1.2-4 dichromat_2.0-0 grid_3.0.2 gtable_0.1.2 labeling_0.2 lattice_0.20-23 MASS_7.3-29 munsell_0.4.2
[9] parallel_3.0.2 plyr_1.8 proto_0.3-10 reshape2_1.2.2 scales_0.2.3 slam_0.1-31 stringr_0.6.2 tools_3.0.2

I'm using Windows 7 and RStudio version 0.97.336.

Update: on a Linux machine it works fine.

sessionInfo()

R version 3.0.2 (2013-09-25) Platform: x86_64-pc-linux-gnu (64-bit)

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] twitteR_1.1.7 rjson_0.2.13 seqinr_3.0-7 tm_0.5-10 ggplot2_0.9.3.1 ROAuth_0.9.3 digest_0.6.3
[8] RCurl_1.95-4.1 bitops_1.0-6 wordcloud_2.4 RColorBrewer_1.0-5 Rcpp_0.10.6 data.table_1.8.10 RJDBC_0.2-1
[15] rJava_0.9-4 DBI_0.2-7

loaded via a namespace (and not attached): [1] MASS_7.3-29 colorspace_1.2-4 dichromat_2.0-0 grid_3.0.2 gtable_0.1.2 labeling_0.2 munsell_0.4.2 parallel_3.0.2
[9] plyr_1.8 proto_0.3-10 reshape2_1.2.2 scales_0.2.3 slam_0.1-31 stringr_0.6.2 tools_3.0.2

1 Answer

Have you tried Sys.setlocale?
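
To make that suggestion concrete, here is a minimal sketch of how one might inspect the locale and, separately, mark text as UTF-8. It assumes the tweet text reaches R as UTF-8 bytes without an encoding mark, which would explain why the Linux session (UTF-8 locale) displays it correctly while the Windows session (code page 1252) may not. The names unmarked and fixed are made up for the illustration, and the byte string stands in for a real tweet; none of this has been tested against twitteR itself.

```r
# Show which character set the current session assumes for text.
current_ctype <- Sys.getlocale("LC_CTYPE")
print(current_ctype)

# Windows only: locale names are platform-specific, so this call is
# guarded. The name below is the one reported in the question's sessionInfo().
if (.Platform$OS.type == "windows") {
  Sys.setlocale("LC_CTYPE", "Portuguese_Brazil.1252")
}

# Stand-in for one tweet: the UTF-8 bytes of "você", with no encoding mark.
unmarked <- rawToChar(as.raw(c(0x76, 0x6f, 0x63, 0xc3, 0xaa)))
print(Encoding(unmarked))

# Declare the bytes to be UTF-8 so R interprets them correctly
# regardless of the session locale.
fixed <- unmarked
Encoding(fixed) <- "UTF-8"
print(nchar(fixed, type = "chars"))  # 4 characters, not 5 bytes
```

If the assumption holds, the analogous step for the question's vector would be Encoding(mach_text) <- "UTF-8" right after the sapply() call. Whether that or Sys.setlocale is the right fix depends on how the installed twitteR version returns the text, so treat both as things to try rather than a confirmed solution.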