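As the title says: is there a faster way to do this? If I want to run multi-fold (k-fold) cross-validation, how should I split the dataset?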
Split the data evenly into 10 parts and loop 10 times, each time holding out 1 part as the test set and using the other 9 parts as the training set.
Generally speaking, people set k to 5 or 10 for cross-validation. In other words, the data is (randomly) divided into k parts, with k-1 parts used for training and 1 part for testing. That said, cross-validation means training the model k times, so it is not going to be fast.
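The split described above can be sketched in plain Python. This is only an illustration, not a library API: the function name `kfold_indices` and its `seed` parameter are made up for this example, and it assumes the samples can be addressed by integer index.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for each of the k folds.

    Hypothetical helper for illustration only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Shuffle the sample indices once so the folds are random.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Fold sizes differ by at most one when n_samples is not divisible by k.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]                # 1 part for testing
        train_idx = indices[:start] + indices[start + size:]  # k-1 parts for training
        yield train_idx, test_idx
        start += size


# Usage: 23 samples split into 5 folds.
for train_idx, test_idx in kfold_indices(23, 5):
    print(len(train_idx), len(test_idx))
```

Every sample lands in the test set exactly once, so the k test scores together cover the whole dataset.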
You can use scikit-learn for this; see section 3.1 of its user guide, "Cross-validation: evaluating estimator performance":
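>>> from sklearn import datasets, svm
>>> from sklearn.model_selection import cross_val_score
>>> iris = datasets.load_iris()  # the snippet assumes the iris dataset is already loaded
>>> clf = svm.SVC(kernel='linear', C=1)
>>> scores = cross_val_score(clf, iris.data, iris.target, cv=5)  # 5-fold cross-validation
>>> scores
array([ 0.96..., 1. ..., 0.96..., 0.96..., 1. ])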
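If you want the actual train/test splits rather than just the scores, scikit-learn's `KFold` yields the index arrays for each fold. A minimal sketch, assuming NumPy arrays; the toy `X` and `y` here are made up purely to show the shapes:

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (an assumption for illustration): 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# shuffle=True randomizes which samples go into which fold.
kf = KFold(n_splits=5, shuffle=True, random_state=0)

fold_sizes = []
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    fold_sizes.append((len(train_idx), len(test_idx)))
    print(f"fold {fold}: train={len(train_idx)} test={len(test_idx)}")
```

Inside the loop you would fit your model on `X_train, y_train` and evaluate it on `X_test, y_test`.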