0 votes

Context:

I have an AWS EC2 instance

  • 8 GB RAM
  • 8 GB of disk space

It runs Solr 5.1.0 with

  • Java heap of 2048 MB
  • -Xms2048m -Xmx2048m

Extra: (updated)

  • Logs are generated on the server
  • Imports run at 10-second intervals (always delta)
  • Importing from DB (JdbcDataSource)
  • I don't think I have any optimization strategy configured right now
  • GC profiling? I don't know.
  • How can I find out how large the fields are... and what counts as large?

Situation:

The index on Solr contains 200,000 documents and is queried no more than once per second. However, within about 10 days, memory and disk usage on the server reach 90% - 95% of the available capacity.

When I investigate the disk usage with sudo du -sh /, it only reports a total of 2.3G, nowhere near what df -k tells me (Use% -> 92%).
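One thing I considered checking (just a guess on my side, and it assumes lsof is installed on the instance) is whether the gap comes from files that were deleted but are still held open by a process:

# list open files whose directory entry has been removed; their space is only freed once the process closes them
sudo lsof +L1
# or narrow it down to the Solr process (the start.jar pattern assumes the default bin/solr startup)
sudo lsof -p $(pgrep -f start.jar) | grep -i deleted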

I can, sort of, resolve the situation by restarting the Solr service.

What am I missing? Why does Solr consume all the memory and disk space, and how can I prevent it?

Extra info for @TMBT

Sorry for the delay, but I've been monitoring the Solr production server for the last few days. You can see a roundup here: https://www.dropbox.com/s/x5diyanwszrpbav/screencapture-app-datadoghq-com-dash-162482-1468997479755.jpg?dl=0 The current state of Solr: https://www.dropbox.com/s/q16dc5t5ctl32od/Screenshot%202016-07-21%2010.29.13.png?dl=0 I restarted Solr at the beginning of the monitoring period, and now, 2 days later, I can see the free disk space going down at a rate of 1.5 GB per day. If you need more specifics, let me know.

  • There are not that many deleted docs per day; we're talking 50 - 250 per day max.
  • The current logs directory of Solr: ls -lh /var/solr/logs -> total 72M
  • There is no master-slave setup
  • The importer runs every 10 seconds, but it imports no more than 10 - 20 docs each time. The large import of 3k-4k docs happens each night, when there is not much other activity going on in Solr.
  • There are no large fields, the largest field can contain up to 255 chars.

With the monitoring in place I tested the most common queries. They do include faceting (on fields and queries), sorting, grouping, etc., but they don't really affect the various heap and GC metrics.

Comment from TMBT: Editing answers to some or all of the following questions into your original question would be helpful: are the log files generated on this server? How often do you perform full imports? Delta imports? Are you importing from a DB, files, etc.? How often do you commit documents when you import? How often are you running optimize? Have you done any GC profiling for your server? How large are your individual documents? How large are the fields? What do you mean by "queried moderately" (5 queries per second? Per minute?)?

2 Answers

2 votes

First, visit http://your.solr.instance:[port]/solr/[coreName]/admin/system and check how many resources Solr is actually using. The memory and system elements will be the most useful to you. It may be that something else on the box is the culprit for at least some of the resource usage.
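For example, something along these lines should dump that information (host, port and core name are placeholders for your own values):

curl "http://your.solr.instance:8983/solr/coreName/admin/system?wt=json"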

To me, the fact that you can "sort of" resolve the problem by restarting Solr screams "query and import obnoxiousness" for memory. As for disk space, I wouldn't be surprised if the log files are behind that. I also wonder if you're ending up with a lot of old, deleted documents from your numerous delta imports that lie around until Solr automatically cleans them up. In fact, if you go to http://your.solr.instance:[port]/solr/#/[coreName], you should be able to see how many deleted docs are in your index. If there's a very, very large number, you should schedule a time during low usage to run optimize to get rid of them.
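If you prefer the command line, the CoreAdmin STATUS action reports numDocs and maxDoc for the index (the deleted docs are the difference between the two); host, port and core name are again placeholders:

curl "http://your.solr.instance:8983/solr/admin/cores?action=STATUS&core=coreName&wt=json"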

Also be aware that Solr seems to have a tendency to fill up as much of the given heap space as it can.

Since the logs are generated on the server, check how many of them exist. Solr after 4.10 has a nasty habit of generating large numbers of log files, which can cause disk space issues, especially with how often you import. For information on how to deal with Solr's love of logging, I'm going to refer you to my self-answer at Solr 5.1: Solr is creating way too many log files. Basically, you'll want to edit the Solr startup script to disable Solr's log backups and then replace them with a solution of your own, such as the logrotate sketch below.
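As a sketch of such a "solution of your own", assuming the logs live in /var/solr/logs and that logrotate is available on the box, a rule along these lines would keep the directory bounded (the retention numbers are arbitrary examples):

# copytruncate lets the running Solr process keep writing to the same file handle without a restart
/var/solr/logs/solr.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}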

If you have a master-slave setup, check to see if the slave is backing up certain configuration files, like schema.xml or solrconfig.xml.

Depending on how many records are imported per delta, you could have commits overlapping each other, which will affect resource usage on your box. If you read anything in the logs about overlapping onDeckSearchers, this is definitely an issue for you.
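A quick way to check for that warning (assuming the default log location from your question):

grep -i "overlapping ondecksearchers" /var/solr/logs/solr.log*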

Lots of delta imports also mean lots of commits, and committing is a fairly heavy operation. You'll want to tweak solrconfig.xml to soft commit after a number of documents and hard commit after a little bit more, as in the sketch below. If you perform the commits in batches, your frequent deltas should have less of an impact.
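A sketch of what that can look like inside the <updateHandler> section of solrconfig.xml; the thresholds are placeholders you'd tune against your own import volume:

<!-- hard commit: flush index changes to disk, but don't open a new searcher each time -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- soft commit: make new documents visible to searches at a much lower cost -->
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>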

If you are joining columns for your imports, you may need to index those joined columns in your database. If your database is not on the same machine as Solr, network latency is a possible problem; it's one I've struggled with in the past. If the DB is on the same machine and those columns do need indexes, leaving them unindexed will most certainly have a negative effect on your box's resources.
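Purely as an illustration, with made-up table and column names, indexing a joined column on the database side would look something like:

-- speeds up the join the data import handler performs on each delta
CREATE INDEX idx_item_category_id ON item (category_id);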

It may be helpful to hook something like VisualVM up to Solr to view heap usage and GC. You want to make sure there isn't a rapid increase in usage, and you also want to make sure the GC isn't doing a bunch of stop-the-world collections that can cause weirdness on your box.
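If attaching VisualVM to a headless EC2 instance is awkward, jstat from the JDK gives a rough picture of the same thing (the pgrep pattern assumes Solr was started with the standard bin/solr script):

# heap occupancy per generation and GC counts/times, sampled every 5 seconds
jstat -gcutil $(pgrep -f start.jar) 5000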

Optimize is a very intensive operation that you shouldn't need to use often, if at all, after 4.10. Some people still do, though, and if you have tons of deleted documents it might be useful to you. If you ever decide to employ an optimization strategy, it should be done only during times of low usage, as optimize temporarily doubles the size of your index. Optimize merges segments and physically removes documents that your deltas have marked as deleted.
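If you do end up running it during a quiet window, it can be triggered with a plain update request (host, port and core name are placeholders):

curl "http://your.solr.instance:8983/solr/coreName/update?optimize=true"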

By "large fields", I mean fields with large amounts of data in them. You would need to look up the size limits for each field type you're using, but if you're running towards the max size for a certain field, you may want to try to find a way to reduce the size of your data. Or you can omit importing those large columns into Solr and instead retrieve the data from the columns in the source DB after getting a particular document(s) from Solr. It depends on your set up and what you need. You may or may not be able to do much about it. If you get everything else running more efficiently you should be fine.

The type of queries you run can also cause problems. Lots of sorting, faceting, etc. can be very memory-intensive. If I were you, I would hook VisualVM up to Solr so I could watch heap usage and GC, and then load test Solr using typical queries, along the lines of the sketch below.
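A very crude load-test sketch, firing a representative facet/sort query in a loop while VisualVM is attached; the query parameters and field names are placeholders, not something taken from your setup:

# 1000 sequential requests against a typical faceted, sorted query
for i in $(seq 1 1000); do
  curl -s "http://your.solr.instance:8983/solr/coreName/select?q=*:*&facet=true&facet.field=category&sort=price+asc&rows=10" > /dev/null
done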

1 vote

I finally managed to solve this problem, so I'm answering my own question.

I changed/added the following lines in the log4j.properties file, which is located in /var/solr/ (the Solr root location in my case).

# log4j.rootLogger=INFO, file, CONSOLE
# adding:
log4j.rootLogger=WARN, file, CONSOLE

Lowering the logging level.

# adding:
log4j.appender.file.Threshold=INFO

This sets the logging threshold for the file appender.
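In case it helps others: if your file appender is log4j's RollingFileAppender (as in the stock Solr log4j.properties), its growth can also be capped; the numbers below are just an example:

# keep at most 5 rolled files of 10 MB each
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=5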

You can see in the graphs below that, as of September 2nd, the disk usage is steady, as it should be. The same is true for the memory consumption on the server.

solr-graphs