Welcome to Vigges Developer Community: Open, Learning, Share

r - Sweave doesn't seem to get .Rnw file encoding right

This question arose out of the following question on tex.sx: Sweave generating invalid LaTeX. The problem seems to be that Sweave does not recognize the encoding of the file, even though the locale is set to UTF-8 and the .Rnw file is saved as UTF-8. As a result, any line of an .Rnw file that contains non-ASCII characters is written as NA in the resulting .tex file. As you can read in the comments to that question, another user cannot reproduce the problem with an apparently identical setup (R 2.13.1 on a Mac). Here's a minimal document that fails.

Update

Based on Aaron's suggestions, I've added sessionInfo() to the .Rnw file, and now the real problem reveals itself: when Sweave processes the file, the locale appears to change.

.Rnw file

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
Some non-ascii text: éüá?
<<>>=
sessionInfo()
@
\end{document}

Running this through Sweave produces the following .tex file. Sweave has replaced the line containing the non-ASCII characters with NA, and the locale reported inside the document is C rather than en_US.UTF-8:

Resultant .tex file

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{Sweave}
\begin{document}
NA
\begin{Schunk}
\begin{Sinput}
> sessionInfo()
\end{Sinput}
\begin{Soutput}
R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_2.13.1
\end{Soutput}
\end{Schunk}
\end{document}

sessionInfo() from within R.app returns:

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
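
The two sessionInfo() results differ only in the locale, which suggests that the process running Sweave does not inherit the same environment as R.app or a terminal. As a rough way to see the effect, the sketch below prints LANG as seen by a child process, first with the current environment and then with an empty one. The helper name lang_seen_by_child is made up for this illustration, and the assumption that the Sweave process starts with a stripped environment is a guess, not something confirmed here.

```shell
# Print LANG as seen by a child shell.
# Any arguments are used as a command prefix, e.g. "env -i".
lang_seen_by_child() {
  "$@" /bin/sh -c 'echo "${LANG:-unset}"'
}

# With the current environment: the result depends on your setup.
lang_seen_by_child

# With an empty environment (env -i), LANG is not defined at all,
# so programs fall back to the C locale.
lang_seen_by_child env -i     # prints "unset"
```

If the second form matches what Sweave reports (locale C), the environment is the likely culprit rather than the .Rnw file itself.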

Update (Response to Aaron)

> text <- readLines("sweave-enc-test.Rnw", warn = FALSE)
> enc <- tools:::.getVignetteEncoding(text, convert = TRUE)
> 
> text
[1] "\\documentclass{article}"     "\\usepackage[utf8]{inputenc}" "\\begin{document}"           
[4] "Some non-ascii text: éüá?"    "\\end{document}"             
> enc
[1] "UTF-8"
> iconv(text, enc, "")
[1] "\\documentclass{article}"     "\\usepackage[utf8]{inputenc}" "\\begin{document}"           
[4] "Some non-ascii text: éüá?"    "\\end{document}"      

(This is the output from within the R console in R.app.)


1 Answer

Potential fix:

Try putting

export LANG=en_US.UTF-8

in your TeXShop script.

(My original suggestion was to put this in the ~/.bashrc file, but apparently TeXShop doesn't load that file.)
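
For illustration, here is a minimal sketch of what such a script might look like with the export added. The overall layout is an assumption: it supposes the script receives the path of the .Rnw file as its first argument and that R and pdflatex are on the PATH. Your actual TeXShop script may differ, in which case only the export line needs to be added to it.

```shell
#!/bin/bash
# Hypothetical Sweave script for TeXShop; everything except the
# export line is an assumed layout, not the script from the question.

# Set a UTF-8 locale explicitly, since ~/.bashrc is not loaded here.
export LANG=en_US.UTF-8

# Assumption: the editor passes the path of the .Rnw file as "$1".
rnw="${1:-}"

if [ -n "$rnw" ] && command -v R >/dev/null 2>&1; then
  # Sweave writes the .tex file into the current working directory.
  R CMD Sweave "$rnw"
  tex="$(basename "${rnw%.*}").tex"
  pdflatex "$tex"
else
  # Nothing to process: just report the locale the script would use.
  echo "LANG=$LANG"
fi
```

After this change, the sessionInfo() output inside the generated .tex file should report en_US.UTF-8 instead of C if the missing LANG variable was the cause.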

EARLIER:

What happens when you put sessionInfo() in the Rnw file?

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
Some non-ascii text: éüá?
<<>>=
sessionInfo()
@
\end{document}