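I developed tea-tasting, a Python package for the statistical analysis of A/B tests, featuring:
In this post, I explore each of these advantages of using tea-tasting in the analysis of experiments.
If you are eager to try it, check the documentation.
tea-tasting includes statistical methods and techniques that cover most of what you might need in the analysis of experiments.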
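Analyze metric averages and proportions with Student's t-test and the Z-test, or use Bootstrap to analyze any other statistic of your choice. There is also a predefined method for analyzing quantiles with Bootstrap. tea-tasting can also detect a sample ratio mismatch between the variants of an A/B test.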
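A sample ratio mismatch check can be sketched with a chi-square goodness-of-fit test. The function below is a minimal illustration, not tea-tasting's implementation; the name `srm_pvalue` and its parameters are hypothetical, and it covers only the two-variant case.

```python
import math


def srm_pvalue(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """P-value of a chi-square goodness-of-fit test for a two-variant split.

    expected_ratio is the planned share of the control variant (an assumption
    of this sketch; 0.5 means an even split).
    """
    total = n_control + n_treatment
    exp_control = total * expected_ratio
    exp_treatment = total * (1 - expected_ratio)
    chi2 = (
        (n_control - exp_control) ** 2 / exp_control
        + (n_treatment - exp_treatment) ** 2 / exp_treatment
    )
    # Survival function of the chi-square distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# A small p-value suggests the observed split deviates from the planned one.
print(srm_pvalue(5000, 5300))
```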
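tea-tasting applies the delta method to analyze ratios of averages. An example is the average number of orders per average number of sessions, assuming that a session is not the randomization unit.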
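To make the idea concrete, here is a minimal sketch of the delta-method standard error for a ratio of means computed from user-level data. It is an illustration under stated assumptions, not tea-tasting's code; the function name `ratio_of_means_se` is hypothetical.

```python
import math
from statistics import fmean


def ratio_of_means_se(numer, denom):
    """Ratio of means and its delta-method standard error.

    numer and denom hold one value per randomization unit (for example,
    orders and sessions per user).
    """
    n = len(numer)
    mean_n, mean_d = fmean(numer), fmean(denom)
    var_n = sum((v - mean_n) ** 2 for v in numer) / (n - 1)
    var_d = sum((v - mean_d) ** 2 for v in denom) / (n - 1)
    cov = sum((a - mean_n) * (b - mean_d) for a, b in zip(numer, denom)) / (n - 1)
    ratio = mean_n / mean_d
    # First-order Taylor expansion of mean_n / mean_d around the true means.
    var_ratio = (var_n - 2 * ratio * cov + ratio**2 * var_d) / (mean_d**2 * n)
    return ratio, math.sqrt(var_ratio)


orders = [0, 1, 0, 2]
sessions = [1, 3, 2, 4]
print(ratio_of_means_se(orders, sessions))
```

The covariance term is what a naive per-session analysis ignores: sessions of the same user are not independent, so the variance has to be estimated at the user level.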
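Use pre-experiment data, metric forecasts, or other covariates to reduce variance and increase the sensitivity of an experiment. This approach is also known as CUPED or CUPAC.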
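The core of the CUPED adjustment fits in a few lines. The sketch below uses synthetic data and a hypothetical helper `cuped_adjust`; in a real experiment the coefficient is usually estimated on the pooled data of all variants, which this single-sample illustration does not show.

```python
import random
from statistics import fmean, variance


def cuped_adjust(metric, covariate):
    """Return the metric adjusted by a pre-experiment covariate (CUPED)."""
    mean_x, mean_y = fmean(covariate), fmean(metric)
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric)
    ) / (len(metric) - 1)
    theta = cov_xy / variance(covariate)
    # Subtracting theta * (x - mean_x) keeps the mean unchanged and removes
    # the part of the variance explained by the covariate.
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]


# Synthetic example: the in-experiment metric is correlated with its
# pre-experiment value.
rng = random.Random(0)
pre = [rng.gauss(10, 3) for _ in range(1000)]
post = [x + rng.gauss(0, 1) for x in pre]
adjusted = cuped_adjust(post, pre)
print(variance(post), variance(adjusted))
```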
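Calculating confidence intervals for the percentage change in Student's t-test and the Z-test can be tricky. Simply taking the confidence interval for the absolute change and dividing it by the control average produces a biased result. tea-tasting applies the delta method to calculate the correct interval.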
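One common way to build such an interval is to apply the delta method to the logarithm of the ratio of means and then transform back. The sketch below follows that variant with a normal approximation; whether tea-tasting uses exactly this form is an assumption, and the function name and parameters are hypothetical. It requires positive means.

```python
import math
from statistics import NormalDist


def relative_effect_ci(mean_c, var_c, n_c, mean_t, var_t, n_t, confidence=0.95):
    """Relative effect (treatment / control - 1) with a delta-method interval."""
    log_ratio = math.log(mean_t / mean_c)
    # Delta method: Var(log(mean)) is approximately Var(mean) / mean**2.
    se_log = math.sqrt(var_t / (n_t * mean_t**2) + var_c / (n_c * mean_c**2))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = math.exp(log_ratio - z * se_log) - 1
    upper = math.exp(log_ratio + z * se_log) - 1
    return mean_t / mean_c - 1, lower, upper


print(relative_effect_ci(10.0, 25.0, 1000, 10.5, 25.0, 1000))
```

Unlike the naive approach, this interval accounts for the uncertainty of the control mean in the denominator, which is why it is not symmetric around the point estimate.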
Analyze statistical power for Student's t-test and the Z-test. There are three possible options:
Learn more in the detailed user guide.
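Power analysis ties together effect size, sample size, and statistical power: fix two of them and solve for the third. The sketch below uses the normal approximation for a two-sided test with equal variant sizes and equal variances. These simplifications and the function names are assumptions of the example, not tea-tasting's API.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(effect, std, alpha=0.05, power=0.8):
    """Observations per variant needed to detect an absolute effect."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * std / effect) ** 2)


def power_for_sample_size(effect, std, n_per_variant, alpha=0.05):
    """Approximate power, ignoring the negligible opposite rejection tail."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = std * math.sqrt(2 / n_per_variant)
    return NormalDist().cdf(abs(effect) / se - z_alpha)


n = sample_size_per_variant(effect=0.1, std=1.0)
print(n, power_for_sample_size(effect=0.1, std=1.0, n_per_variant=n))
```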
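The roadmap includes:
You can define a custom metric with a statistical test of your choice.
There are many different databases and engines for storing and processing experimental data, and in most cases it is not efficient to pull the detailed experimental data into a Python environment. Many statistical tests, such as Student's t-test or the Z-test, require only aggregated data for analysis.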
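For example, if the raw experimental data are stored in ClickHouse, it is faster and more efficient to calculate counts, averages, variances, and covariances directly in ClickHouse than to fetch granular data and perform the aggregations in a Python environment.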
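The sketch below illustrates the idea with an in-memory SQLite database standing in for the data backend: the database returns one aggregated row per variant, and a Z-test is computed from those aggregates alone. The table and column names are hypothetical, and this is not how tea-tasting itself queries a backend.

```python
import math
import random
import sqlite3
from statistics import NormalDist

# Hypothetical user-level data: (variant, orders).
rng = random.Random(42)
rows = [(i % 2, float(rng.randint(0, 3))) for i in range(1000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (variant INTEGER, orders REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)", rows)

# Only one aggregated row per variant leaves the database.
stats = conn.execute(
    """
    SELECT variant, COUNT(*), AVG(orders), AVG(orders * orders)
    FROM users
    GROUP BY variant
    ORDER BY variant
    """
).fetchall()
conn.close()


def squared_se(n, mean, mean_of_squares):
    """Squared standard error of the mean from aggregated statistics."""
    # This formula is simple but can lose precision when the mean is large
    # relative to the standard deviation.
    sample_var = (mean_of_squares - mean**2) * n / (n - 1)
    return sample_var / n


(_, n_c, mean_c, sq_c), (_, n_t, mean_t, sq_t) = stats
z = (mean_t - mean_c) / math.sqrt(
    squared_se(n_c, mean_c, sq_c) + squared_se(n_t, mean_t, sq_t)
)
pvalue = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.3f}, p-value = {pvalue:.3f}")
```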
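Querying all the required statistics manually can be a daunting and error-prone task. For example, analyzing ratio metrics and variance reduction with CUPED requires not only row counts and variances but also covariances. But don't worry: tea-tasting does all this work for you.
tea-tasting accepts data as either a Pandas DataFrame or an Ibis Table. Ibis is a Python package that serves as a DataFrame API to various data backends. It supports more than 20 backends, including BigQuery, ClickHouse, PostgreSQL/GreenPlum, Snowflake, and Spark. You can write an SQL query, wrap it as an Ibis Table, and pass it to tea-tasting.
Keep in mind that tea-tasting assumes that: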
Some statistical methods, like Bootstrap, require granular data for the analysis. In this case, tea-tasting fetches the detailed data as well.
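A percentile bootstrap shows why granular data are needed: every resample draws individual observations. The sketch below estimates a confidence interval for the difference in medians. It is a plain illustration with a hypothetical function name, not tea-tasting's Bootstrap implementation.

```python
import random
from statistics import median


def bootstrap_diff_ci(control, treatment, stat=median, n_resamples=2000,
                      confidence=0.95, seed=0):
    """Percentile bootstrap interval for stat(treatment) - stat(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each variant with replacement at its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(stat(t) - stat(c))
    diffs.sort()
    lower_idx = int((1 - confidence) / 2 * n_resamples)
    upper_idx = int((1 + confidence) / 2 * n_resamples) - 1
    return diffs[lower_idx], diffs[upper_idx]


control = list(range(100))
treatment = [x + 50 for x in range(100)]
print(bootstrap_diff_ci(control, treatment))
```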
Learn more in the guide on data backends.
You can perform all the tasks listed above using just NumPy, SciPy, and Ibis. In fact, tea-tasting uses these packages under the hood. What tea-tasting offers on top is a convenient higher-level API.
It's easier to show than to describe. Here is the basic example:
```python
import tea_tasting as tt

data = tt.make_users_data(seed=42)

experiment = tt.Experiment(
    sessions_per_user=tt.Mean("sessions"),
    orders_per_session=tt.RatioOfMeans("orders", "sessions"),
    orders_per_user=tt.Mean("orders"),
    revenue_per_user=tt.Mean("revenue"),
)

result = experiment.analyze(data)
print(result)
#>             metric control treatment rel_effect_size rel_effect_size_ci pvalue
#>  sessions_per_user    2.00      1.98          -0.66%      [-3.7%, 2.5%]  0.674
#> orders_per_session   0.266     0.289            8.8%      [-0.89%, 19%] 0.0762
#>    orders_per_user   0.530     0.573            8.0%       [-2.0%, 19%]  0.118
#>   revenue_per_user    5.24      5.73            9.3%       [-2.4%, 22%]  0.123
```
The two-stage approach, with separate parametrization and inference, is common in statistical modeling. This separation helps in making the code more modular and easier to understand.
tea-tasting performs calculations that can be tricky and error-prone:
It also provides a framework for representing experimental data to avoid errors. Grouping the data by randomization units and including all units in the dataset is important for correct analysis.
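As an illustration of that requirement, the sketch below turns session-level rows into one row per user and keeps users who had no sessions at all. The data, identifiers, and field names are made up for the example.

```python
# Hypothetical inputs: variant assignment for every randomized user, and
# session-level rows as (user_id, orders_in_session).
assignments = {"u1": 0, "u2": 1, "u3": 0, "u4": 1}
session_rows = [("u1", 1), ("u1", 0), ("u2", 2), ("u4", 0)]

# Start from the assignment list so that users without sessions are kept
# with zeros instead of silently dropping out of the averages.
per_user = {
    user: {"variant": variant, "sessions": 0, "orders": 0}
    for user, variant in assignments.items()
}
for user, orders in session_rows:
    per_user[user]["sessions"] += 1
    per_user[user]["orders"] += orders

for user, row in per_user.items():
    print(user, row)
```

Starting from the sessions table instead would omit user `u3`, which would inflate per-user averages in that user's variant.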
In addition, tea-tasting provides some convenience methods and functions, such as pretty formatting of the result and a context manager for metric parameters.
Last but not least: documentation. I believe that good documentation is crucial for tool adoption. That's why I wrote several user guides and an API reference.
I recommend starting with the example of basic usage in the user guide. Then you can explore specific topics, such as variance reduction or power analysis, in the same guide.
See the guide on data backends to learn how to use a data backend of your choice with tea-tasting.
See the guide on custom metrics if you want to perform a statistical test that is not included in tea-tasting.
Use the API reference to explore all parameters and detailed information about the functions, classes, and methods available in tea-tasting.
There are a variety of statistical methods that can be applied in the analysis of an experiment. But only a handful of them are actually used in most cases.
On the other hand, there are methods specific to the analysis of A/B tests that are not included in general-purpose statistical packages like SciPy.
tea-tasting functionality includes the most important statistical tests, as well as methods specific to the analysis of A/B tests.
tea-tasting provides a convenient API that helps reduce the time spent on analysis and minimize the probability of error.
In addition, tea-tasting optimizes computational efficiency by calculating the statistics in the data backend of your choice, where the data are stored.
With the detailed documentation, you can quickly learn how to use tea-tasting for the analysis of your experiments.
The package name "tea-tasting" is a play on words that refers to two subjects: