12
votes

I wrote a MapReduce job. The input is a table in HBase.

When the job runs, I get this error:

  org.apache.hadoop.hbase.client.ScannerTimeoutException: 88557ms passed since the last invocation, timeout is currently set to 60000
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1196)
    at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:133)
    at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
  Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 1502530095384129314
    at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1837)
    at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1226)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1187)
    ... 12 more

Can you help me fix it?

4

4 Answers

11
votes

A scanner timeout exception has occurred. To avoid it, increase the timeout by setting the following properties in hbase-site.xml, which you will find in the hbase/conf directory:

  <property>
    <name>hbase.client.scanner.timeout.period</name>
    <value>900000</value> <!-- 900 000, 15 minutes -->
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>900000</value> <!-- 15 minutes -->
  </property>
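If editing hbase-site.xml on every client is not practical, the same properties can also be set programmatically on the job configuration. This is only a sketch, assuming a standard HBase client classpath; the property names are the same ones shown in the XML above, and note that the region server's own lease setting must also accommodate the longer value, so a server-side change may still be required:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class TimeoutConfigExample {
    public static Configuration createConf() {
        // Same properties as in hbase-site.xml above, set on the
        // client-side job configuration instead.
        Configuration conf = HBaseConfiguration.create();
        conf.setLong("hbase.client.scanner.timeout.period", 900000L); // 15 minutes
        conf.setLong("hbase.rpc.timeout", 900000L);                   // 15 minutes
        return conf; // pass this Configuration to your Job when submitting
    }
}
```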
9
votes

As the official HBase book states:

You may need to find a sweet spot between a low number of RPCs and the memory used on the client and server. Setting the scanner caching higher will improve scanning performance most of the time, but setting it too high can have adverse effects as well: each call to next() will take longer as more data is fetched and needs to be transported to the client, and once you exceed the maximum heap the client process has available it may terminate with an OutOfMemoryException. When the time taken to transfer the rows to the client, or to process the data on the client, exceeds the configured scanner lease threshold, you will end up receiving a lease expired error, in the form of a ScannerTimeoutException being thrown.

So rather than papering over the exception with the configuration above, it is better to set the caching on your map side lower, so that your mappers can process the required load within the pre-specified time interval.
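For a table-input job like the one in the question, the caching is lowered on the Scan passed to TableMapReduceUtil. The following is a sketch only, assuming a reasonably recent Hadoop/HBase client API; the table name, mapper, and output types are made-up placeholders:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class LowCachingScanJob {

    // Hypothetical mapper; your own map logic goes here.
    static class MyTableMapper extends TableMapper<Text, IntWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context) {
            // process one row per call
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(HBaseConfiguration.create(), "low-caching-scan");
        job.setJarByClass(LowCachingScanJob.class);

        // Lower caching so each next() call returns quickly enough
        // to renew the scanner lease before it expires.
        Scan scan = new Scan();
        scan.setCaching(100);        // fewer rows per RPC; tune for your workload
        scan.setCacheBlocks(false);  // recommended for full-table MapReduce scans

        TableMapReduceUtil.initTableMapperJob(
                "my_table",          // hypothetical input table name
                scan,
                MyTableMapper.class,
                Text.class, IntWritable.class, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```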

2
votes

You can use the setCaching(int noOfRows) method of the Scan object to reduce the number of rows fetched by the scanner at once.

  Scan scan = new Scan();
  scan.setCaching(400); // here 400 is just an example value

A larger caching value can cause a ScannerTimeoutException, because your program may take more time consuming/processing the fetched rows than the timeout allows.

But a lower value can also slow down your task, since the scanner makes more fetch requests to the server, so you should fine-tune your caching and timeout values to your program's needs.

0
votes

Setting the following property in hbase-site.xml worked for me:

  <property>
       <name>hbase.client.scanner.timeout.period</name>
       <value>900000</value>
  </property>

This exception is thrown if the time between RPC calls from the client to the RegionServer exceeds the scan timeout, i.e.:

  if (RPC_call_time > Scanner_timeout) {
      throw ScannerTimeoutException;
  }
