Pitfalls of calling returnResource on a Jedis connection pool after an exception

2022-03-24 19:00:22

In production we found a worker thread that had terminated abnormally.

The logs showed a run of SocketTimeoutExceptions, followed suddenly by a ClassCastException:

redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
...
java.lang.ClassCastException: [B cannot be cast to java.lang.Long
        at redis.clients.jedis.Connection.getIntegerReply(Connection.java:208)
        at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)

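By simulating a network failure by hand on a local machine, I eventually reproduced the production exception.

Further analysis (form a hypothesis, then verify it) finally identified the cause.

See the sample code below: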

JedisPool pool = ...;
Jedis jedis = pool.getResource();
String value = jedis.get("foo");
System.out.println("Make SocketTimeoutException");
System.in.read(); // pause here while the SocketTimeoutException is provoked
try {
    value = jedis.get("foo");
    System.out.println(value);
} catch (JedisConnectionException e) {
    e.printStackTrace();
}
System.out.println("Recover from SocketTimeoutException");
System.in.read();  // pause here until the network is restored
Thread.sleep(5000); // sleep a little longer so the network has fully recovered
boolean isMember = jedis.sismember("urls", "baidu.com");

And the log output:

bar
Make SocketTimeoutException
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
Recover from SocketTimeoutException
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:210)
    at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:47)
    at redis.clients.jedis.Protocol.process(Protocol.java:131)
    at redis.clients.jedis.Protocol.read(Protocol.java:196)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:283)
    at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:202)
    at redis.clients.jedis.Connection.getBulkReply(Connection.java:191)
    at redis.clients.jedis.Jedis.get(Jedis.java:101)
    at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:23)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.net.SocketInputStream.read(SocketInputStream.java:108)
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:204)
    ... 8 more
Exception in thread "main" java.lang.ClassCastException: [B cannot be cast to java.lang.Long
    at redis.clients.jedis.Connection.getIntegerReply(Connection.java:208)
    at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)
    at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:32)

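Analysis

When the second get("foo") ran, the read timed out, so the get foo command never actually went out. By the time sismember ran, the network was back and the same jedis instance was still in use, so the earlier get foo command (still sitting in the output stream buffer) was sent along with it.

The effective execution order was: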

127.0.0.1:9379> get foo
"bar"
127.0.0.1:9379> sismember urls baidu.com
(integer) 1

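So the final sismember in the sample code receives the reply to get foo, which is a string (a byte[], shown as [B in the exception message), whereas sismember expects a Long. That mismatch is what raises the ClassCastException.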
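The mismatch can be reproduced without a Redis server. The following is a minimal sketch, not Jedis code: FakeConnection, its two queues, and fakeServerReply are hypothetical stand-ins, and the only assumption is that replies come back strictly in the order the commands were delivered.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

class StaleReplyDemo {
    /** Hypothetical stand-in for one client connection; this is not the Jedis API. */
    static class FakeConnection {
        private final Deque<String> unsent = new ArrayDeque<>();  // commands written but not yet delivered
        private final Deque<Object> replies = new ArrayDeque<>(); // unread replies, in request order

        void write(String command) {
            unsent.addLast(command);
        }

        /** Network is up: deliver everything buffered, then hand back the FIRST unread reply. */
        Object flushAndReadOne() {
            while (!unsent.isEmpty()) {
                replies.addLast(fakeServerReply(unsent.removeFirst()));
            }
            return replies.removeFirst();
        }

        private static Object fakeServerReply(String command) {
            return command.startsWith("GET")
                    ? (Object) "bar".getBytes(StandardCharsets.UTF_8) // bulk reply, a byte[]
                    : (Object) Long.valueOf(1L);                        // integer reply, a Long
        }
    }

    static String sismemberAfterTimedOutGet() {
        FakeConnection conn = new FakeConnection();
        conn.write("GET foo");                  // the read times out; the caller catches it and carries on
        conn.write("SISMEMBER urls baidu.com"); // same connection, network restored
        try {
            Long reply = (Long) conn.flushAndReadOne(); // really the byte[] reply to GET foo
            return "reply=" + reply;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(sismemberAfterTimedOutGet()); // prints "ClassCastException"
    }
}
```

The cast fails because the first unread reply belongs to the earlier command, not to the one the caller has just issued.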

The Redis access logic

Why did this happen in production? Because the code that talks to Redis looked roughly like this:

while(true){
    Jedis jedis = null;
    try {
        jedis = pool.getResource();
        //some redis operation here.
    } catch (Exception e) {
       logger.error(e);
    } finally {
        pool.returnResource(jedis);
    }
}

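Even when the network fails, pool.returnResource(jedis) still succeeds, so the instance goes back into the pool (jedis is not null at that point). Once the network recovers, because this is a multithreaded environment, some other thread later obtains the same Jedis instance from pool.getResource().

If the return type of that thread's Jedis operation differs from the return type of the first operation that failed on that instance during the outage (for example, one is get and the other is sismember), a ClassCastException is thrown.

And that is the lucky case. If both operations return the same type (for example lpop("queue_order_pay_failed") and lpop("queue_order_pay_success")), nothing fails at all and one caller silently receives data meant for the other. I dare not imagine the consequences.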
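To make the same-type case concrete, here is a small sketch under the same assumption of strictly ordered replies. The pool is just a Deque, FakeConnection is a hypothetical stand-in rather than a Jedis class, and the reply text is invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SilentMixupDemo {
    /** Hypothetical connection: every command queues one reply, and reads always take the oldest one. */
    static class FakeConnection {
        private final Deque<String> replies = new ArrayDeque<>();

        void send(String command) {
            replies.addLast("reply-to:" + command);
        }

        String read() {
            return replies.removeFirst();
        }
    }

    static String secondBorrowerSees() {
        Deque<FakeConnection> pool = new ArrayDeque<>();
        pool.push(new FakeConnection());

        FakeConnection a = pool.pop();          // worker A borrows
        a.send("LPOP queue_order_pay_failed");  // A times out before reading the reply
        pool.push(a);                           // unconditional hand-back, as in the loop above

        FakeConnection b = pool.pop();          // worker B gets the very same instance
        b.send("LPOP queue_order_pay_success");
        return b.read();                        // B reads A's reply; both are strings, so nothing throws
    }

    public static void main(String[] args) {
        System.out.println(secondBorrowerSees()); // prints "reply-to:LPOP queue_order_pay_failed"
    }
}
```

Worker B asked for an item from the success queue and was handed one from the failed queue, with no exception to signal the mix-up.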

For example, insert a get("nonexist-key") before the sismember call in the sample code above (the key does not exist in Redis, so the call should return null):

value = jedis.get("nonexist-key");
System.out.println(value);
boolean isMember = jedis.sismember("urls", "baidu.com");
System.out.println(isMember);

The actual log output is:

bar
Exception in thread "main" java.lang.NullPointerException
    at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)
    at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:37)

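Analysis:

get("nonexist-key") received the reply to the earlier get("foo"), and sismember received the reply to get("nonexist-key"). That reply is null, so this time the failure is a NullPointerException.

Solution:

Do not call returnResource unconditionally. A more robust and cleaner way to handle it looks like this: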

while(true){
    Jedis jedis = null;
    boolean broken = false;
    try {
        jedis = jedisPool.getResource();
        return jedisAction.action(jedis); // template method: the caller supplies the Redis operation
    } catch (JedisException e) {
        broken = handleJedisException(e);
        throw e;
    } finally {
        closeResource(jedis, broken);
    }
}

/**
 * Handle jedisException, write log and return whether the connection is broken.
 */
protected boolean handleJedisException(JedisException jedisException) {
    if (jedisException instanceof JedisConnectionException) {
        logger.error("Redis connection " + jedisPool.getAddress() + " lost.", jedisException);
    } else if (jedisException instanceof JedisDataException) {
        if ((jedisException.getMessage() != null) && (jedisException.getMessage().indexOf("READONLY") != -1)) {
            logger.error("Redis connection " + jedisPool.getAddress() + " is a read-only slave.", jedisException);
        } else {
            // dataException, isBroken=false
            return false;
        }
    } else {
        logger.error("Jedis exception happen.", jedisException);
    }
    return true;
}
/**
 * Return the jedis connection to the pool, calling a different return method depending on the connectionBroken status.
 */
protected void closeResource(Jedis jedis, boolean connectionBroken) {
    try {
        if (connectionBroken) {
            jedisPool.returnBrokenResource(jedis);
        } else {
            jedisPool.returnResource(jedis);
        }
    } catch (Exception e) {
        logger.error("return back jedis failed, will force close the jedis.", e);
        JedisUtils.destroyJedis(jedis);
    }
}
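The same decision (discard a broken connection, reuse a healthy one) can also live in the connection's own close(), which lets callers rely on try-with-resources. The sketch below uses only the standard library; Pool, Conn, and the simulateTimeout flag are hypothetical names invented for illustration, not Jedis classes. Later Jedis releases expose a close() method on Jedis for a similar purpose, but check the behaviour of the version you actually run.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BrokenAwarePoolDemo {
    /** Hypothetical pooled connection that remembers whether an I/O failure has tainted it. */
    static class Conn implements AutoCloseable {
        private final Pool owner;
        private boolean broken = false;

        Conn(Pool owner) {
            this.owner = owner;
        }

        String get(String key, boolean simulateTimeout) {
            if (simulateTimeout) {
                broken = true; // mark before propagating, so close() can see it
                throw new IllegalStateException("Read timed out");
            }
            return "value-of-" + key;
        }

        @Override
        public void close() {
            if (broken) {
                owner.destroy(this);  // never hand a tainted connection to another caller
            } else {
                owner.giveBack(this);
            }
        }
    }

    static class Pool {
        private final Deque<Conn> idle = new ArrayDeque<>();
        int created = 0;
        int destroyed = 0;

        Conn borrow() {
            Conn c = idle.pollFirst();
            if (c == null) {
                c = new Conn(this);
                created++;
            }
            return c;
        }

        void giveBack(Conn c) { idle.addFirst(c); }
        void destroy(Conn c) { destroyed++; }
        int idleCount() { return idle.size(); }
    }

    /** Returns {created, destroyed, idle} after one failed and one successful operation. */
    static int[] run() {
        Pool pool = new Pool();
        try (Conn c = pool.borrow()) {
            c.get("foo", true);
        } catch (IllegalStateException e) {
            // close() has already run and discarded the broken connection
        }
        try (Conn c = pool.borrow()) {
            c.get("foo", false);
        }
        return new int[] { pool.created, pool.destroyed, pool.idleCount() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " created, " + r[1] + " destroyed, " + r[2] + " idle");
    }
}
```

After the failed call the second borrower gets a fresh connection, so no stale reply can be waiting on it.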

Addendum

Simulating a Redis network timeout locally on Ubuntu:

sudo iptables -A INPUT -p tcp --dport 6379 -j DROP

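Restoring the network (note that -F flushes every rule in the filter table, not only the one added above):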

sudo iptables -F

Addendum:

Suppose the Jedis access logic looks like the following:

Jedis jedis = null;
try {
    jedis = jedisSentinelPool.getResource();
    return jedis.get(key);
}catch(JedisConnectionException e) {
    jedisSentinelPool.returnBrokenResource(jedis);
    logger.error("", e);
    throw e;
}catch (Exception e) {
    logger.error("", e);
    throw e;
}
finally {
    jedisSentinelPool.returnResource(jedis);
}

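As soon as a JedisConnectionException occurs (a network failure, for instance), returnBrokenResource runs first and the jedis instance is destroyed. Control then enters the finally block and returnResource runs on the same instance, which fails with: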

redis.clients.jedis.exceptions.JedisException: Could not return the resource to the pool
    at redis.clients.util.Pool.returnResourceObject(Pool.java:65)
    at redis.clients.jedis.JedisSentinelPool.returnResource(JedisSentinelPool.java:221)
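The double return follows directly from Java's control flow: a finally block runs even when the catch block rethrows. The sketch below records the call order with plain strings; the method name calls and the string labels are illustrative only and no Jedis code is involved.

```java
import java.util.ArrayList;
import java.util.List;

class DoubleReturnDemo {
    /** Records which pool calls would run, in order, for the try/catch/finally shape shown above. */
    static List<String> calls(boolean failWithConnectionError) {
        List<String> calls = new ArrayList<>();
        try {
            try {
                if (failWithConnectionError) {
                    throw new IllegalStateException("connection lost");
                }
                calls.add("get");
            } catch (IllegalStateException e) {
                calls.add("returnBrokenResource");
                throw e;
            } finally {
                calls.add("returnResource"); // runs even though the catch block rethrows
            }
        } catch (IllegalStateException rethrown) {
            // swallowed only so the demo can report the call order
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(calls(true));  // prints "[returnBrokenResource, returnResource]"
        System.out.println(calls(false)); // prints "[get, returnResource]"
    }
}
```

On the failure path both return methods are invoked on the same instance, which is exactly the situation the stack trace above reports.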

Temporary workaround

jedisSentinelPool.returnBrokenResource(jedis);
jedis = null; // returnResource in the finally block then skips its actual work

This is not recommended, though. For a more rigorous way to release the resource, see the approach described earlier.
