No context-switching overhead visible for CPU-bound tasks when using ThreadPoolExecutor

I am running a simple experiment to find the right thread pool size for a batch of CPU-bound tasks.

I know the size is supposed to equal the number of cores on the machine, but I want to verify that empirically. Here is my code:

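import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Main {

    // Future.get() can throw InterruptedException as well as ExecutionException.
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        List<Future<?>> futures = new ArrayList<>();
        // Pool size under test; changed between runs.
        ExecutorService threadPool = Executors.newFixedThreadPool(4);

        long startTime = System.currentTimeMillis();

        // Submit 100 identical CPU-bound tasks.
        for (int i = 0; i < 100; i++) {
            futures.add(threadPool.submit(new CpuBoundTask()));
        }

        // Block until every task has finished.
        for (int i = 0; i < futures.size(); i++) {
            futures.get(i).get();
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Time = " + (endTime - startTime));
        threadPool.shutdown();
    }

    static class CpuBoundTask implements Runnable {
        @Override
        public void run() {
            // Pure computation: no I/O, no locking, no sleeping.
            int a = 0;
            for (int i = 0; i < 90000000; i++) {
                a = (int) (a + Math.tan(a));
            }
        }
    }
}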

Each task takes about 700 ms to execute (long enough, I think, to be preempted by the thread scheduler at least once).

I am running it on a 2017 MacBook Pro with a 3.1 GHz Intel Core i5: 2 physical cores with hyper-threading enabled, so 4 logical CPUs.
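For anyone reproducing this, a small sketch of how the logical CPU count can be checked from inside the JVM rather than from the hardware spec sheet. The class name CpuCount and the helper logicalCpus are made up for this illustration; Runtime.availableProcessors() is the standard call, and the value it reports may be lower than the hardware count if the process is restricted (for example in a container).

```java
public class CpuCount {

    // Hypothetical helper: number of processors the JVM reports as available.
    static int logicalCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Logical CPUs visible to the JVM = " + logicalCpus());
    }
}
```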

I varied the thread pool size and ran the program several times for each size, averaging the times. Here are the results:

1 thread = 57 seconds
2 threads = 29 seconds
4 threads = 18 seconds
8 threads = 18.1 seconds
16 threads = 18.2 seconds
32 threads = 17.8 seconds
64 threads = 18.2 seconds
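The measurements above were taken by editing the pool size by hand and rerunning. A rough sketch of how the same sweep could be automated in one program follows. The class PoolSizeSweep, the helpers burn and timeRunMillis, and the choice of 3 runs per size are assumptions for illustration, not part of my original setup. The task returns its result so the loop is less likely to be optimized away, and System.nanoTime() is used because it is intended for measuring elapsed time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSizeSweep {

    // Same loop shape as CpuBoundTask; returning the value keeps the work observable.
    static int burn(int iterations) {
        int a = 0;
        for (int i = 0; i < iterations; i++) {
            a = (int) (a + Math.tan(a));
        }
        return a;
    }

    // Runs taskCount tasks on a fixed pool of poolSize threads; returns elapsed milliseconds.
    static long timeRunMillis(int poolSize, int taskCount, int iterations)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (int i = 0; i < taskCount; i++) {
                Callable<Integer> task = () -> burn(iterations);
                futures.add(pool.submit(task));
            }
            for (Future<Integer> f : futures) {
                f.get(); // wait for completion
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] poolSizes = {1, 2, 4, 8, 16, 32, 64};
        int runs = 3; // assumed number of repetitions per pool size
        for (int size : poolSizes) {
            long total = 0;
            for (int r = 0; r < runs; r++) {
                total += timeRunMillis(size, 100, 90_000_000);
            }
            System.out.println(size + " threads = " + (total / runs)
                    + " ms (mean of " + runs + " runs)");
        }
    }
}
```

With the task size from my code, a full sweep takes several minutes. The early runs also include JIT warm-up, so the first pool size measured may look slightly slower than it really is.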

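Because of context-switching overhead, I expected the execution time to increase noticeably once the pool had that many threads (more than the number of CPU cores), but that does not seem to happen.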

I monitored the program with VisualVM, and it looks like all the threads are created and are in the running state, as expected. CPU usage also stays close to 95%.

Is there something I am missing?
