Using JIT-compilers to make my Python loops slower?

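For those who haven't heard, Python loops can be slow, especially when working with large datasets. If you're trying to run calculations across millions of data points, execution time can quickly become a bottleneck. Luckily, Numba has a just-in-time (JIT) compiler that can help speed up numerical computations and loops in Python.

The other day, I found myself needing a simple exponential smoothing function in Python. The function needed to take in an array and return an array of the same length containing the smoothed values. Typically, I try to avoid loops in Python where possible (especially when dealing with Pandas DataFrames). At my current level of ability, I couldn't see how to avoid using a loop to exponentially smooth an array of values.

I am going to walk through the process of creating this exponential smoothing function and testing it with and without JIT compilation. I'll briefly touch on JIT and on how I coded the loop so that it works in nopython mode.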

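What is JIT?

JIT compilers are particularly useful for higher-level languages such as Python, JavaScript, and Java. These languages are known for their flexibility and ease of use, but they can run slower than lower-level languages such as C or C++. JIT compilation helps bridge this gap by optimizing code at runtime, making it faster without giving up the advantages of these higher-level languages.

When nopython=True mode is used with the Numba JIT compiler, the Python interpreter is bypassed entirely and Numba has to compile everything down to machine code. This removes the overhead associated with Python's dynamic typing and other interpreter-related operations, which makes execution faster still.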
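As a minimal sketch of what this looks like in practice (the function name `sum_of_squares` is hypothetical and not from this article): Numba's `njit` decorator is shorthand for `jit(nopython=True)`. Compilation is lazy, so it happens on the first call, once Numba knows the argument types. The `try`/`except` fallback is an assumption added only so the sketch still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # shorthand for jit(nopython=True)
except ImportError:
    # Fallback (assumption for portability): run as plain Python without Numba
    def njit(func):
        return func


@njit
def sum_of_squares(values):
    # A plain loop over a NumPy array, the kind of code nopython mode can compile
    total = 0.0
    for i in range(len(values)):
        total += values[i] * values[i]
    return total


data = np.arange(5, dtype=np.float64)  # [0, 1, 2, 3, 4]
result = sum_of_squares(data)          # under Numba, the first call triggers compilation
print(result)
```

If the function body used something nopython mode cannot compile, Numba would raise an error at that first call instead of quietly falling back to the interpreter.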

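Building the fast exponential smoothing function

Exponential smoothing is a technique used to smooth data by applying a weighted average over past observations. The formula for exponential smoothing is: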

$$S_t = \alpha \cdot V_t + (1 - \alpha) \cdot S_{t-1}$$

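where:

$S_t$ : the smoothed value at time $t$.
$V_t$ : the original value at time $t$, taken from the values array.
$\alpha$ : the smoothing factor, which determines the weight given to the current value $V_t$ in the smoothing process.
$S_{t-1}$ : the smoothed value at time $t-1$, i.e., the previous smoothed value.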
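To make the recurrence concrete, here is a small worked sketch in plain Python. The helper name `smooth` is hypothetical, and it assumes the first smoothed value is simply the first input value ($S_0 = V_0$), which is how the test file later in this article initializes it.

```python
def smooth(values, alpha):
    """Apply S_t = alpha * V_t + (1 - alpha) * S_{t-1}, with S_0 = V_0."""
    smoothed = [float(values[0])]
    for v in values[1:]:
        # Weighted average of the current value and the previous smoothed value
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


worked = smooth([1, 2, 3], alpha=0.5)
print(worked)  # [1.0, 1.5, 2.25]
```

With $\alpha = 0.5$: $S_1 = 0.5 \cdot 2 + 0.5 \cdot 1 = 1.5$ and $S_2 = 0.5 \cdot 3 + 0.5 \cdot 1.5 = 2.25$.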

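The formula applies exponential smoothing, where the new smoothed value $S_t$ is a weighted average of the current value $V_t$ and the previous smoothed value $S_{t-1}$.

To implement this in Python, and stick to functionality that works with nopython=True mode, we will pass in an array of data values and the alpha float. I default alpha to 0.33333333 because that fits my current use case. We will initialize an array to store the smoothed values in, loop and calculate, and return the smoothed values. The function, fast_exponential_smoothing, is included in the full test file further down.

Simple, right? Let's see if JIT is doing anything now. First, we need to create a large array of integers. Then, we call the function, time how long it takes to compute, and print the results.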

```python
# Relies on numpy (as np), time, and fast_exponential_smoothing from the full test file

# Generate a large random array of a million integers
large_array = np.random.randint(1, 100, size=1_000_000)

# Test the speed of fast_exponential_smoothing
start_time = time.time()
smoothed_result = fast_exponential_smoothing(large_array)
end_time = time.time()
print(f"Exponential Smoothing with JIT took {end_time - start_time:.6f} seconds with 1,000,000 sample array.")
```


This can be repeated and altered just a bit to test the function without the JIT decorator. Here are the results that I got:

[Figure: timing results; image not preserved]

Wait, what the f***?

I thought JIT was supposed to speed it up. It looks like the standard Python function beat the JIT version and a version that attempts to use no recursion. That's strange. I guess you can't just slap the JIT decorator on something and make it go faster? Perhaps simple array loops and NumPy operations are already pretty efficient? Perhaps I don't understand the use case for JIT as well as I should? Maybe we should try this on a more complex loop?
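One factor that may explain a result like this, offered here as an assumption rather than something the measurements above establish: Numba compiles a function the first time it is called with a given set of argument types, so a single timed call includes the compilation time. A minimal sketch that times the first and second calls separately follows. The names `smooth_loop` and `timed_call` are hypothetical, and the fallback decorator is only there so the sketch runs without Numba installed.

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): run as plain Python without Numba
    def njit(func):
        return func


@njit
def smooth_loop(values, alpha):
    # Same recurrence as in the article, writing into a float buffer
    out = np.zeros(len(values), dtype=np.float64)
    out[0] = values[0]
    for i in range(1, len(values)):
        out[i] = alpha * values[i] + (1 - alpha) * out[i - 1]
    return out


def timed_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


values = np.random.randint(1, 100, size=100_000)
first_result, first_time = timed_call(smooth_loop, values, 0.5)    # may include compilation
second_result, second_time = timed_call(smooth_loop, values, 0.5)  # reuses the compiled code
print(f"first call:  {first_time:.6f} s")
print(f"second call: {second_time:.6f} s")
```

If the second call turns out much faster than the first, the comparison is fairer with a warm-up call made before the timer starts.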

Here is the entire Python file I created for testing:

```python
import time

import numpy as np
from numba import jit


@jit(nopython=True)
def fast_exponential_smoothing(values, alpha=0.33333333):
    # Float buffer, so smoothed values are not truncated when values holds integers
    smoothed_values = np.zeros(len(values), dtype=np.float64)
    smoothed_values[0] = values[0]  # Initialize the first value
    for i in range(1, len(values)):
        smoothed_values[i] = alpha * values[i] + (1 - alpha) * smoothed_values[i - 1]
    return smoothed_values


def fast_exponential_smoothing_nojit(values, alpha=0.33333333):
    # Same loop as above, without the JIT decorator
    smoothed_values = np.zeros(len(values), dtype=np.float64)
    smoothed_values[0] = values[0]  # Initialize the first value
    for i in range(1, len(values)):
        smoothed_values[i] = alpha * values[i] + (1 - alpha) * smoothed_values[i - 1]
    return smoothed_values


def non_recursive_exponential_smoothing(values, alpha=0.33333333):
    # NOTE: this vectorized attempt does not reproduce the loop above
    # (values [1, 2, 3] with alpha=0.5 give [1, 0.75, 1.25] instead of
    # [1, 1.5, 2.25]); it is kept here for the timing comparison only.
    n = len(values)
    smoothed_values = np.zeros(n)
    # Initialize the first value
    smoothed_values[0] = values[0]
    # Calculate the rest of the smoothed values
    decay_factors = (1 - alpha) ** np.arange(1, n)
    cumulative_weights = alpha * decay_factors
    smoothed_values[1:] = np.cumsum(values[1:] * np.flip(cumulative_weights)) + (1 - alpha) ** np.arange(1, n) * values[0]
    return smoothed_values


# Generate a large random array of ten million integers
large_array = np.random.randint(1, 1000, size=10_000_000)

# Test the speed of fast_exponential_smoothing_nojit
start_time = time.time()
smoothed_result = fast_exponential_smoothing_nojit(large_array)
end_time = time.time()
print(f"Exponential Smoothing without JIT took {end_time - start_time:.6f} seconds with 10,000,000 sample array.")

# Test the speed of fast_exponential_smoothing
start_time = time.time()
smoothed_result = fast_exponential_smoothing(large_array)
end_time = time.time()
print(f"Exponential Smoothing with JIT took {end_time - start_time:.6f} seconds with 10,000,000 sample array.")

# Test the speed of non_recursive_exponential_smoothing
start_time = time.time()
smoothed_result = non_recursive_exponential_smoothing(large_array)
end_time = time.time()
print(f"Exponential Smoothing with no recursion or JIT took {end_time - start_time:.6f} seconds with 10,000,000 sample array.")
```
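A side note on the output buffer: `np.zeros_like` copies the dtype of its argument, so when the input is an integer array the buffer is an integer array too, and float results written into it lose their fractional part. A minimal sketch of the effect, with hypothetical variable names:

```python
import numpy as np

int_values = np.array([1, 2, 3])             # integer dtype
int_buffer = np.zeros_like(int_values)        # inherits the integer dtype
float_buffer = np.zeros(len(int_values), dtype=np.float64)

int_buffer[1] = 1.5    # stored as 1: the fractional part is dropped
float_buffer[1] = 1.5  # stored as 1.5

print(int_buffer.dtype, int_buffer)
print(float_buffer.dtype, float_buffer)
```

For smoothing integer data such as the random test array, a float buffer keeps the weighted averages intact.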


I attempted to create the non-recursive version to see if vectorized operations across arrays would make it go faster, but it seems to be pretty damn fast as it is. These results remained the same all the way up until I didn't have enough memory to make the array of random integers.
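A quick way to check any vectorized rewrite is to compare it against the plain loop on a small array. The sketch below does that for one closed form, obtained by unrolling the recurrence: $S_t = (1-\alpha)^t \left[ V_0 + \alpha \sum_{k=1}^{t} V_k / (1-\alpha)^k \right]$. The function names are hypothetical, and this is an illustration under stated limits rather than a replacement for the loop.

```python
import numpy as np


def smooth_loop(values, alpha):
    """Reference: the plain recurrence with S_0 = V_0."""
    out = np.zeros(len(values), dtype=np.float64)
    out[0] = values[0]
    for i in range(1, len(values)):
        out[i] = alpha * values[i] + (1 - alpha) * out[i - 1]
    return out


def smooth_closed_form(values, alpha):
    """Vectorized closed form; only safe for short arrays (see the note below)."""
    n = len(values)
    powers = (1 - alpha) ** np.arange(n)   # (1 - alpha)^t for t = 0..n-1
    terms = alpha * values / powers        # alpha * V_k / (1 - alpha)^k
    terms[0] = values[0]                   # V_0 enters with full weight
    return powers * np.cumsum(terms)


sample = np.random.randint(1, 100, size=50)
matches = np.allclose(smooth_loop(sample, 1 / 3), smooth_closed_form(sample, 1 / 3))
print(matches)
```

Because $(1-\alpha)^t$ underflows toward zero as $t$ grows, the division blows up on long arrays, so this form does not scale to millions of samples. That is one reason a recurrence like this tends to stay a loop.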

Let me know what you think about this in the comments. I am by no means a professional developer, so I am accepting all comments, criticisms, or educational opportunities.

Until next time.

Happy coding!
