On November 19, 2009, we put new hardware into production to provide full-text searching against about 4.6 million volumes. The index has since grown to about 5.3 million volumes. The average response time is about 3 seconds: 90% of queries take under 4 seconds, 9% take between 4 and 24 seconds, and 1% take longer than 24 seconds.
The chart below shows the average, median, 90th percentile, and 99th percentile response times for the 5.3 million volume index. It compares times taken from our logs of actual user queries with the response times for our test queries. The differences between the user queries and the test queries are most easily seen in the 99th percentile times.
Response times in milliseconds
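For readers who want to produce the same four summary figures from their own logs, the following is a minimal sketch in Python. It assumes the elapsed times have already been extracted into a list of milliseconds; the function name `summarize` and the sample values are illustrative only and are not taken from our logs or tooling.

```python
import statistics

def summarize(times_ms):
    """Return the average, median, 90th and 99th percentile of response times (ms)."""
    ordered = sorted(times_ms)
    # With n=100, quantiles() returns the 99 cut points P1..P99,
    # so index 89 is the 90th percentile and index 98 is the 99th.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "average": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": cuts[89],
        "p99": cuts[98],
    }

if __name__ == "__main__":
    # Made-up sample values for illustration, not our measurements.
    sample = [120, 250, 300, 450, 800, 1500, 2900, 3100, 3900, 26000]
    for name, value in summarize(sample).items():
        print(f"{name}: {value:.0f} ms")
```

A single very slow query barely moves the median but pulls the average and the 99th percentile up sharply, which is why the tail percentiles are the most useful place to compare user queries with test queries.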
Our earlier tests reported only the Solr response times rather than the time it takes our Large Scale Search application to respond to a query. The elapsed times reported above reflect the time the Large Scale Search application takes to process the user's request, send 2 queries to Solr, process the responses from Solr, and generate and send HTML. This includes any network time spent in communication between the application and Solr and between the application and our test program. This overhead adds a second or more to the raw Solr response times reported in our earlier benchmarking.
For comparison with our earlier benchmarking, the chart below shows the Solr response times (qtime) for our previous tests against a 500,000 volume index on our test hardware and the same tests run against our production system with indexes of 4.7 million and 5.2 million volumes.
Solr response time for 1,000 test queries (ms)
| Index | Average | Median | 90th percentile | 99th percentile |
| Single 500K doc index test machine | 87 | 35 | 157 | 1,200 |
| Production 4.7 million volumes | 134 | 67 | 300 | 894 |
| Production 5.2 million volumes | 207 | 86 | 444 | 1,399 |
We are working on a few things that should improve performance. 
- Tuning the network
- Load balancing and replication
- Tuning the list of common words
- Testing to determine the optimum shard size and optimum number of shards per server
- Monitoring our logs for slow queries and working to determine the bottlenecks in processing
  We are currently investigating the causes of the differences in response times between user queries and our test suite. We may need to modify our test suite to better reflect user queries. On the other hand, when we rerun slow user queries (against a newly cleared and warmed cache), we see much faster response times than those reported in the logs. We are trying to identify the factors responsible for the poorer performance showing up in the logs.
  The Large Scale Search application sends two queries to Solr for each user query it receives. The first query is to get the first page of results and the second query is to get the count of either "full view only" hits or all hits to populate the "All items" or "Only Full view" tabs. The second query can get its results from the Solr caches and so is much faster than the first.
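The two-query pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's actual code: the endpoint URL, the page size of 25, and the `rights:fullview` filter are hypothetical placeholders. Only the standard Solr parameters `q`, `start`, `rows`, and `fq` are used, and `rows=0` asks Solr for the hit count without returning any documents.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real host and path are not given in this post.
SOLR_SELECT = "http://localhost:8983/solr/select"

# Hypothetical filter that restricts results to full-view volumes.
FULL_VIEW_FILTER = "rights:fullview"

def build_queries(user_query, full_view_only=False, rows=25):
    """Return (page_url, count_url) for one user query."""
    # Query 1: fetch the first page of results for the tab being viewed.
    page = {"q": user_query, "start": 0, "rows": rows}
    # Query 2: fetch only the hit count needed to label the other tab.
    count = {"q": user_query, "rows": 0}
    if full_view_only:
        page["fq"] = FULL_VIEW_FILTER    # showing full view; count all hits
    else:
        count["fq"] = FULL_VIEW_FILTER   # showing all items; count full-view hits
    return (f"{SOLR_SELECT}?{urlencode(page)}",
            f"{SOLR_SELECT}?{urlencode(count)}")

if __name__ == "__main__":
    page_url, count_url = build_queries("history of printing")
    print(page_url)
    print(count_url)
```

Because both requests share the same `q`, the second one can usually be answered from the caches that the first request has just populated.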
  Performance measures
1) The network cards on our Solr servers experience intermittent problems handling jumbo packets. We have programs in place to reset the cards when these problems occur. We are experimenting with different NICs and drivers, which we hope will eliminate the problem.
2) We currently run the Large Scale Search application both here at the University of Michigan and at Indiana. However, until our planned deployment of new hardware at Indiana, both instances of the application must query the Solr servers here at UM. Even over Internet2, the network latency between Indiana and UM is about 35 ms per packet. Solr HTTP requests and responses span multiple TCP packets, so user queries that get sent to the Indiana application have slower overall response times. We plan to install new hardware at Indiana to mirror/replicate our Solr servers here at UM, which should eliminate this latency problem and provide for failover.
3) We did some analysis of the most frequent terms in our index and will be adding terms to our list of common words for CommonGrams. A future blog post will provide details.
4) Our previous tests on index sizes and I/O requirements were performed before we implemented CommonGrams and in a significantly different hardware and storage environment. We have created new indexes with shard sizes ranging from about 500,000 documents to over 1 million documents. We plan to run a series of tests on the new hardware to determine how index size and the number of shards per server affect I/O demands and response time.
5) Our logs show that about 0.5% of user queries (about 1 out of every 200) take over 30 seconds. When we rerun these same queries, we get response times of under 10 seconds. We are working to determine the cause of these slow queries so that we can eliminate them.
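The kind of log monitoring mentioned in item 5 can be sketched as below. The log format here is an assumption made for illustration (one tab-separated line per query: timestamp, elapsed milliseconds, query string); our real logs are not necessarily laid out this way, and the function name `slow_queries` is hypothetical.

```python
def slow_queries(lines, threshold_ms=30_000):
    """Return (slow, fraction): the slow queries and their share of all queries.

    Each line is assumed to look like: "<timestamp>\t<elapsed_ms>\t<query>".
    """
    slow = []
    total = 0
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        try:
            elapsed = int(parts[1])
        except ValueError:
            continue  # skip lines with a non-numeric elapsed time
        total += 1
        if elapsed > threshold_ms:
            slow.append((elapsed, parts[2]))
    fraction = len(slow) / total if total else 0.0
    return slow, fraction

if __name__ == "__main__":
    # Made-up log lines for illustration only.
    log = [
        "2009-12-01T10:00:00\t2500\tcivil war letters",
        "2009-12-01T10:00:05\t31000\tthe history of the world",
        "2009-12-01T10:00:09\t900\tbotany",
    ]
    slow, fraction = slow_queries(log)
    print(f"{fraction:.1%} of queries exceeded the threshold")
    for elapsed, query in slow:
        print(elapsed, query)
```

The collected query strings can then be rerun against a freshly warmed cache to check whether the slowness is reproducible or depends on what else the servers were doing at the time.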