I'm using Rails 3.1.0 and Ruby 1.9.2 with PostgreSQL. I want to read data from huge files (~300 MB) and put it into the database. Here I use a transaction:
File.open("./public/data_to_parse/movies/movies.list").each do |line|
  if line.match(/\t/)
    title = line.scan(/^[^\t(]+/)[0]
    title = title.strip if title
    year = line.scan(/[^\t]+$/)[0]
    year = year.strip if year
    movie = Movie.find_or_create(title, year)
    temp.push(movie) if movie
    if temp.size == 10000
      Movie.transaction do
        temp.each { |t| t.save }
      end
      temp = []
    end
  end
end
But I want to improve performance by using a mass insert with raw SQL:
    temp.push("('#{title}', '#{year}')") if movie
    if temp.size == 10000
      sql = "INSERT INTO movies (title, year) VALUES #{temp.join(", ")}"
      Movie.connection.execute(sql)
      temp = []
    end
  end
end
But I get this error: "incompatible character encodings: ASCII-8BIT and UTF-8". When I use ActiveRecord, everything works fine. The files contain characters such as German umlauts. I tried everything from "Rails 3 - (incompatible character encodings: UTF-8 and ASCII-8BIT)", but it didn't help.
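For what it's worth, the same kind of error can be reproduced in plain Ruby without Rails. This is only a minimal sketch with made-up strings (not taken from my file), and I'm not certain it is what happens inside my import. It assumes one string is tagged as ASCII-8BIT while the other is UTF-8, and both contain non-ASCII bytes:

```ruby
# Hypothetical strings, used only to reproduce the error outside Rails.
# "M\xC3\xBCnchen" holds the UTF-8 bytes of "München", re-tagged here as binary.
binary = "M\xC3\xBCnchen".dup.force_encoding("ASCII-8BIT")
utf8   = "Z\u00FCrich" # the \u escape makes this a UTF-8 string

# Returns the error message if concatenation fails, nil otherwise.
def concat_error(a, b)
  a + ", " + b
  nil
rescue Encoding::CompatibilityError => e
  e.message
end

message = concat_error(binary, utf8)

# Re-tagging the same bytes as UTF-8 (no transcoding) lets the concatenation succeed.
fixed = binary.dup.force_encoding("UTF-8") + ", " + utf8
```

Here `message` holds an "incompatible character encodings" error, while `fixed` is built without any exception. So I suspect that one of the strings going into my SQL is tagged as ASCII-8BIT, but I can't tell which one.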
Do you have any idea where it comes from?
Thanks,