
I am using Solr with an index of about 20 million records. I use the DataImportHandler to index data into Solr. Lately a full import takes a very long time because the MySQL query has become very heavy. Please suggest some techniques to speed up the indexing process.

I was thinking of pre-processing the data in some other data store, such as Hadoop, and then indexing from there. Which data store would be a good choice for holding the pre-processed data?

I am using MySQL as the master database.

Delta updates come to around 100,000 records for the last hour.

Using Hadoop to index 20 million records might not be a good solution. But if your data keeps growing in MySQL and you are considering full imports, then it would be good to use Hadoop MapReduce for indexing. - Sravan K Reddy
Yes, the data is increasing continuously, every minute. If I can use Hadoop, please suggest some URLs for reference. - Lijo Abraham
And how can I pre-process the data and store it in Hadoop from MySQL? - Lijo Abraham
You can push data to Hadoop using the Sqoop tool, and use Pig or Java MapReduce for pre-processing. Then use the MapReduce indexer or a Pig indexer to index: cloudera.com/content/cloudera/en/documentation/cloudera-search/… - Sravan K Reddy
Do you have a single entity, or do you have child entities? A copy of the results of a full import would be useful to identify the issues... - Persimmonium

1 Answer


First, check that you have the right indexes and that your query actually uses them.
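As a rough illustration of what "checking that the query uses an index" means, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; on MySQL you would run `EXPLAIN` on your actual import query instead. The table `docs`, the column `updated_at`, and the index name are all made up for the example.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO docs (title, updated_at) VALUES (?, ?)",
    [(f"doc {i}", i) for i in range(1000)],
)

# Hypothetical delta-style query: fetch rows changed after a given timestamp.
query = "SELECT id, title FROM docs WHERE updated_at > ?"


def query_plan(connection, sql, params):
    """Return the planner's description of how it would execute the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, query, (900,))

# After adding an index on the filtered column, the planner can use it.
conn.execute("CREATE INDEX idx_docs_updated_at ON docs (updated_at)")
plan_after = query_plan(conn, query, (900,))

print("before:", plan_before)
print("after: ", plan_after)
```

If the plan for your real query still shows a full scan after you add an index, the index probably does not match the columns used in the `WHERE` or `JOIN` clauses.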

Alternatively, I would suggest partitioning your current database, i.e. using MySQL partitioning.

It would help Solr retrieve the data faster.

Partitioning may also help the other parts of your application get their data faster.

Here are the links on partitioning in MySQL:

https://dev.mysql.com/doc/refman/5.1/en/partitioning-overview.html
https://dev.mysql.com/doc/refman/5.1/en/partitioning.html

Another workaround would be to export the data in CSV format and feed that to Solr.

Check how this works for you; someone has reported that this mechanism worked well for him.
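The CSV route could look roughly like the sketch below. It assumes the rows have already been pulled out of MySQL, and that your Solr core accepts CSV on its `/update` handler with a `text/csv` content type. The host, the core name `mycore`, and the field names are placeholders, not values from the question. The request is only built here, not sent.

```python
import csv
import io
import urllib.request


def rows_to_csv(rows, fieldnames):
    """Serialize dict rows to CSV text; the header line names the Solr fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def build_csv_update_request(solr_update_url, csv_text):
    """Build (but do not send) a POST request carrying the CSV payload."""
    return urllib.request.Request(
        solr_update_url,
        data=csv_text.encode("utf-8"),
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )


# Hypothetical rows standing in for the result of the MySQL export.
rows = [
    {"id": "1", "title": "first doc"},
    {"id": "2", "title": "second, with a comma"},
]
csv_text = rows_to_csv(rows, ["id", "title"])

# Placeholder host and core name; adjust to your own installation.
request = build_csv_update_request(
    "http://localhost:8983/solr/mycore/update?commit=true", csv_text
)

# To actually send the batch to a running Solr instance:
# urllib.request.urlopen(request)
```

For 20 million rows you would write and post the CSV in batches rather than building one huge string in memory.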