Distributed Netty Source Code Analysis: An Introduction to EventLoopGroup

2022-03-24 22:00:13

Introduction to EventLoopGroup

As the previous article mentioned, EventLoopGroup is responsible for two things: registering Channels and executing Runnable tasks.

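Function 1: registering a Channel

Registration means registering the Channel with a Selector, which then dispatches the Channel's events such as read, write, and accept.

An EventLoopGroup contains multiple EventLoops (each EventLoop usually owns one thread). Registration has to pick one of those EventLoops to do the work, so a selection strategy is needed. The strategy interface is EventExecutorChooser, and you can supply your own implementation.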
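To make the selection strategy concrete, here is a minimal round-robin sketch. It is not Netty's chooser code: the class name RoundRobinChooserSketch and its next() method are hypothetical, and the children are plain strings standing in for EventLoops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a chooser; it only illustrates round-robin selection.
final class RoundRobinChooserSketch<T> {
    private final List<T> children;
    private final AtomicInteger idx = new AtomicInteger();

    RoundRobinChooserSketch(List<T> children) {
        if (children.isEmpty()) {
            throw new IllegalArgumentException("children must not be empty");
        }
        this.children = new ArrayList<>(children);
    }

    // Returns the next child in rotation. floorMod keeps the index
    // non-negative even after the int counter overflows.
    T next() {
        return children.get(Math.floorMod(idx.getAndIncrement(), children.size()));
    }

    public static void main(String[] args) {
        RoundRobinChooserSketch<String> chooser =
                new RoundRobinChooserSketch<>(Arrays.asList("loop-0", "loop-1", "loop-2"));
        for (int i = 0; i < 4; i++) {
            System.out.println(chooser.next()); // loop-0, loop-1, loop-2, loop-0
        }
    }
}
```

An atomic counter is used so that several threads can ask for the next child at the same time without extra locking.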

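As this shows, EventLoopGroup mostly handles the group-level work, such as initializing the EventLoops and the EventExecutorChooser, while the actual Channel registration is delegated to one of its internal EventLoops.

Function 2: executing Runnable tasks

EventLoopGroup extends EventExecutorGroup. An EventExecutorGroup is likewise a collection of EventExecutors and is responsible for initializing them. To run a Runnable task, it picks one of its internal EventExecutors to do the actual execution.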

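Many tasks in Netty run asynchronously. When the current thread wants to perform an operation on an EventLoop, such as registering a Channel with it, and the current thread is not the EventLoop's own thread, the current thread only submits a registration task to the EventLoop and returns a ChannelFuture to the caller.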
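The "run directly if already on the loop thread, otherwise submit a task and return a future" pattern can be sketched with the JDK alone. This is an illustration under assumptions, not Netty code: SingleThreadLoopSketch, register, and doRegister are hypothetical names, and CompletableFuture stands in for ChannelFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical single-threaded loop; CompletableFuture plays the role of ChannelFuture.
final class SingleThreadLoopSketch {
    private volatile Thread loopThread;
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "loop-sketch");
        t.setDaemon(true);
        loopThread = t; // remember which thread belongs to this loop
        return t;
    });

    // True only when the caller is the loop's own thread.
    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    // Registers immediately on the loop thread, otherwise hands the work
    // to the loop thread and returns a future right away.
    CompletableFuture<String> register(String channelName) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (inEventLoop()) {
            doRegister(channelName, future);
        } else {
            executor.execute(() -> doRegister(channelName, future));
        }
        return future;
    }

    private void doRegister(String channelName, CompletableFuture<String> future) {
        future.complete(channelName + " registered on " + Thread.currentThread().getName());
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Because every registration ends up on the same thread, the loop's internal state needs no locking, which is the main reason for this design.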

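Summary: EventLoopGroup provides both functions, but it is mainly a collection; the actual work is done by one of its members. A member usually implements both EventLoop and EventExecutor, as NioEventLoop does.

Next, look at the overall class diagram of EventLoopGroup.

The diagram shows two branches:

1 MultithreadEventLoopGroup: encapsulates the multi-threaded initialization logic, such as the thread count. It creates the corresponding number of EventLoops and assigns each one a thread.

The newChild method in the diagram above decides the concrete type: NioEventLoopGroup creates NioEventLoop instances, and EpollEventLoopGroup creates EpollEventLoop instances.

For example, the NioEventLoopGroup implementation: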

protected EventLoop newChild(Executor executor, Object... args) throws Exception {
    return new NioEventLoop(this, executor, (SelectorProvider) args[0],
        ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2]);
}

2 The EventLoop interface extends the EventLoopGroup interface, mainly because the operations declared by EventLoopGroup are ultimately carried out by its internal EventLoops.
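The relationship between the two interfaces can be sketched as follows. All type names here (LoopGroupSketch, LoopSketch, NamedLoopSketch, SimpleGroupSketch) are hypothetical and heavily simplified; the point is only that a group delegates to next(), while a single loop is a group whose next() returns itself.

```java
import java.util.ArrayList;
import java.util.List;

// A group picks a child with next() and delegates registration to it.
interface LoopGroupSketch {
    LoopSketch next();

    default String register(String channel) {
        return next().register(channel);
    }
}

// A single loop is also a group: next() returns itself and register() does real work.
interface LoopSketch extends LoopGroupSketch {
    @Override
    default LoopSketch next() {
        return this;
    }

    @Override
    String register(String channel);
}

final class NamedLoopSketch implements LoopSketch {
    private final String name;

    NamedLoopSketch(String name) {
        this.name = name;
    }

    @Override
    public String register(String channel) {
        return channel + "@" + name;
    }
}

final class SimpleGroupSketch implements LoopGroupSketch {
    private final List<LoopSketch> children;
    private int idx;

    SimpleGroupSketch(List<LoopSketch> children) {
        this.children = new ArrayList<>(children);
    }

    @Override
    public LoopSketch next() {
        // Plain round-robin; not thread-safe, which is enough for a sketch.
        return children.get(Math.floorMod(idx++, children.size()));
    }
}
```

With this shape, code that accepts a group works unchanged when handed a single loop.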

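Introduction to EventLoop

An EventLoop's main job is to register Channels and to monitor and manage their read/write events. This involves different monitoring mechanisms; Linux offers three ways to listen for events:

select, poll, epoll

Java's Selector interface currently has the following implementations:

PollSelectorImpl: implements poll

EPollSelectorImpl: implements epoll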

Netty uses them as follows:

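NioEventLoop: uses the JDK Selector interface to detect Channel events. The concrete implementation is whatever the JDK's default SelectorProvider returns; on Linux that is normally the epoll-based EPollSelectorImpl, not PollSelectorImpl.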

EpollEventLoop: does not use the JDK's EPollSelectorImpl. Netty implements epoll itself to detect Channel events, so there is no JDK Selector inside EpollEventLoop.
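Which Selector implementation the JDK actually provides depends on the platform and JDK version, so it is worth checking on the machine in question. The small probe below uses only standard java.nio calls; the class name SelectorImplProbe is made up for this example, and the printed class names will differ between systems.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Prints the JDK's default SelectorProvider and the Selector class it creates.
final class SelectorImplProbe {
    static String providerClassName() {
        return SelectorProvider.provider().getClass().getName();
    }

    static String selectorClassName() {
        // Selector is Closeable, so try-with-resources releases it again.
        try (Selector selector = Selector.open()) {
            return selector.getClass().getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("provider: " + providerClassName());
        System.out.println("selector: " + selectorClassName());
    }
}
```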

Introduction to NioEventLoop

NioEventLoop has to provide the actual implementation of everything NioEventLoopGroup offers: both Channel registration and Runnable task execution.

Registering a Channel: NioEventLoop registers the Channel with its internal JDK Selector, which listens for the Channel's read and write events.

Executing Runnable tasks: SingleThreadEventExecutor, the superclass of NioEventLoop's superclass, implements Runnable execution. It holds a task queue and one assigned thread:

private final Queue<Runnable> taskQueue;
private volatile Thread thread;

On that thread NioEventLoop must both handle the IO events reported by the Selector and keep taking tasks from taskQueue to run these non-IO tasks. Let's look at the process in detail:

protected void run() {
    for (;;) {
        try {
            switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                case SelectStrategy.CONTINUE:
                    continue;
                case SelectStrategy.SELECT:
                    select(wakenUp.getAndSet(false));
                    if (wakenUp.get()) {
                        selector.wakeup();
                    }
                default:
                    // fallthrough
            }
            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            if (ioRatio == 100) {
                processSelectedKeys();
                runAllTasks();
            } else {
                final long ioStartTime = System.nanoTime();

                processSelectedKeys();

                final long ioTime = System.nanoTime() - ioStartTime;
                runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
            }

            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    break;
                }
            }
        } catch (Throwable t) {
            ...
        }
    }
}

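In detail:

1 Decide whether a select is needed

If there are no pending Runnable tasks, run select (described in detail later).

If there are pending Runnable tasks, the loop has to continue to the processing step anyway, so it also calls selector.selectNow() on the way: if any events happen to be ready, they are handled in the same pass at no extra cost.

2 Run IO tasks and non-IO tasks (the Runnable tasks above) according to the configured IO time ratio

If ioRatio=100, every pass handles all IO events and then runs all non-IO tasks. The default is ioRatio=50, meaning half the time goes to IO tasks and the other half to non-IO tasks. How is the time spent on non-IO tasks limited?

The elapsed time is checked against the limit once every 64 non-IO tasks (probably because individual non-IO tasks are short, so checking less often saves overhead).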
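The two steps above can be sketched in plain Java. This is a simplified illustration, not the Netty implementation: LoopPassSketch and its method names are hypothetical, the SELECT constant is just a marker chosen for the sketch, and the IO work itself is left out.

```java
import java.util.Queue;
import java.util.function.IntSupplier;

// Sketch of one loop pass: choose a select strategy, then budget the non-IO tasks.
final class LoopPassSketch {
    // Marker meaning "do a blocking select"; the value is an assumption of this sketch.
    static final int SELECT = -1;

    // With pending tasks, do a non-blocking selectNow and use its result;
    // without pending tasks, ask for a blocking select.
    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }

    // Time allowed for non-IO tasks, given the time just spent on IO.
    // Only valid for 0 < ioRatio < 100; ioRatio == 100 takes the unbudgeted branch.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    // Runs queued tasks until the queue is empty or the budget is used up.
    // The clock is read only once every 64 tasks. Returns the number of tasks run.
    static int runTasks(Queue<Runnable> tasks, long budgetNanos) {
        final long deadline = System.nanoTime() + budgetNanos;
        int ran = 0;
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
            ran++;
            if ((ran & 0x3F) == 0 && System.nanoTime() - deadline >= 0) {
                break;
            }
        }
        return ran;
    }
}
```

For example, with ioRatio=80 and 1 ms spent on IO, the budget is 1 ms * 20 / 80 = 0.25 ms. Because the clock is read only every 64 tasks, a pass can overrun its budget by up to 63 tasks, which is the trade-off for fewer System.nanoTime() calls.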

Now look at the select process in detail:

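// Body of NioEventLoop's select(boolean oldWakenUp) method; oldWakenUp below is that method's parameter.
Selector selector = this.selector;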
try {
    int selectCnt = 0;
    long currentTimeNanos = System.nanoTime();
    long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
    for (;;) {
        long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
        if (timeoutMillis <= 0) {
            if (selectCnt == 0) {
                selector.selectNow();
                selectCnt = 1;
            }
            break;
        }
        // If a task was submitted when wakenUp value was true, the task didn't get a chance to call
        // Selector#wakeup. So we need to check task queue again before executing select operation.
        // If we don't, the task might be pended until select operation was timed out.
        // It might be pended until idle timeout if IdleStateHandler existed in pipeline.
        if (hasTasks() && wakenUp.compareAndSet(false, true)) {
            selector.selectNow();
            selectCnt = 1;
            break;
        }
        int selectedKeys = selector.select(timeoutMillis);
        selectCnt ++;
        if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
            // - Selected something,
            // - waken up by user, or
            // - the task queue has a pending task.
            // - a scheduled task is ready for processing
            break;
        }
        if (Thread.interrupted()) {
            // Thread was interrupted so reset selected keys and break so we not run into a busy loop.
            // As this is most likely a bug in the handler of the user or it's client library we will
            // also log it.
            //
            // See https://github.com/netty/netty/issues/2426
            if (logger.isDebugEnabled()) {
                logger.debug("Selector.select() returned prematurely because " +
                        "Thread.currentThread().interrupt() was called. Use " +
                        "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
            }
            selectCnt = 1;
            break;
        }
        long time = System.nanoTime();
        if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
            // timeoutMillis elapsed without anything selected.
            selectCnt = 1;
        } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
            // The selector returned prematurely many times in a row.
            // Rebuild the selector to work around the problem.
            logger.warn(
                    "Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
                    selectCnt, selector);
            rebuildSelector();
            selector = this.selector;
            // Select again to populate selectedKeys.
            selector.selectNow();
            selectCnt = 1;
            break;
        }
        currentTimeNanos = time;
    }
} catch (CancelledKeyException e) {
	...
}

1 First compute the deadline for this select

    protected long delayNanos(long currentTimeNanos) {
        ScheduledFutureTask<?> scheduledTask = peekScheduledTask();
        if (scheduledTask == null) {
            return SCHEDULE_PURGE_INTERVAL;
        }
        return scheduledTask.delayNanos(currentTimeNanos);
    }

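This peeks at the scheduled-task queue. If there is a scheduled task, the result is the time remaining until its next execution; if not, a fixed 1s is used as the duration of the select.

2 Convert the time difference to ms

If the difference is less than 0.5ms, i.e. timeoutMillis<=0, and this is the first iteration, the time is considered too short to block, so selectNow is executed once.

3 If there are tasks, run selectNow immediately and break out of the for loop
4 Otherwise do an ordinary selector.select(timeoutMillis)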

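If an event arrives within that time, the loop breaks. If none arrives, the full time difference has elapsed, so on the next iteration the computed timeoutMillis is less than or equal to 0 and the loop also breaks.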
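The deadline and timeout arithmetic from steps 1 and 2 can be tried in isolation. The sketch below is a simplification under assumptions: SelectDeadlineSketch is a hypothetical class, scheduled tasks are reduced to a queue of deadline timestamps in nanoseconds, and the 1s default mirrors the fixed interval described above.

```java
import java.util.PriorityQueue;
import java.util.concurrent.TimeUnit;

// Sketch of how the select deadline and the millisecond timeout are derived.
final class SelectDeadlineSketch {
    // Fallback wait when no scheduled task exists (1s, as described in the text).
    static final long DEFAULT_WAIT_NANOS = TimeUnit.SECONDS.toNanos(1);

    // Time until the earliest scheduled deadline, never negative.
    static long delayNanos(PriorityQueue<Long> scheduledDeadlines, long nowNanos) {
        Long next = scheduledDeadlines.peek();
        if (next == null) {
            return DEFAULT_WAIT_NANOS;
        }
        return Math.max(0L, next - nowNanos);
    }

    // Adding 0.5ms before the integer division rounds to the nearest millisecond,
    // so anything below 0.5ms becomes 0 and triggers the selectNow path.
    static long timeoutMillis(long deadlineNanos, long nowNanos) {
        return (deadlineNanos - nowNanos + 500_000L) / 1_000_000L;
    }
}
```

For instance, a remaining time of 499,999 ns yields a timeout of 0 ms, while 500,000 ns yields 1 ms.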

In the logic above selectCnt is almost always 1 and does not grow large. The handling for a large selectCnt exists because of one situation with this call:

 selector.select(timeoutMillis)

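Normally a Selector returns as soon as there is an event, and otherwise waits at most timeoutMillis. However, the underlying operating-system-level implementation may have a bug in which select returns immediately even though no event occurred, without waiting for timeoutMillis.

The current workaround is to count these occurrences; once the count reaches the threshold of 512 (configurable), a new Selector is created and all existing Channels are migrated to it.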
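The counting logic behind this workaround can be modelled without a real Selector. The following is a sketch under assumptions: SpinDetectorSketch and onSelectReturned are hypothetical names, and each call represents one select that came back with no selected keys.

```java
// Sketch of the premature-return counter that decides when to rebuild the selector.
final class SpinDetectorSketch {
    private final int threshold;
    private int selectCnt;

    SpinDetectorSketch(int threshold) {
        this.threshold = threshold;
    }

    // Call after a select that returned nothing. Returns true when the selector
    // has returned early often enough in a row that it should be rebuilt.
    boolean onSelectReturned(long requestedTimeoutNanos, long elapsedNanos) {
        selectCnt++;
        if (elapsedNanos >= requestedTimeoutNanos) {
            // The full timeout really elapsed: a normal wakeup, so start counting again.
            selectCnt = 1;
            return false;
        }
        if (threshold > 0 && selectCnt >= threshold) {
            // Too many early returns in a row: signal a rebuild and reset the counter.
            selectCnt = 1;
            return true;
        }
        return false;
    }
}
```

A threshold of 0 or less disables the check, mirroring the `SELECTOR_AUTO_REBUILD_THRESHOLD > 0` guard in the excerpt above.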

That completes the main flow of NioEventLoop. The next focus is its handling of IO events, which leads into the ChannelPipeline processing flow.

Introduction to EpollEventLoop

The main flow of EpollEventLoop is largely the same as that of NioEventLoop. The difference is that EpollEventLoop replaces the JDK Selector used by NioEventLoop with Netty's own epoll implementation.

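It is not covered in detail here; a later article will describe the differences between Netty's epoll and the JDK's epoll.

What's next

The next article describes in detail how NioEventLoop handles IO events, i.e. the ChannelPipeline processing flow.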
