python - In recommender systems and machine learning, how do you split a complete dataset into a training set and a test set?
天蓬老师 2017-04-18 09:03:54

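As the title says: is there a reasonably fast way to do this? And if I want to do k-fold cross-validation, how should I split the dataset?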


All replies (3)
黄舟

Split the data into 10 roughly equal parts and loop 10 times. In each round, use 1 part as the test set and the remaining 9 parts as the training set, so that every part serves as the test set exactly once.
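The procedure above can be sketched in plain Python. This is only a minimal illustration, assuming the samples are addressed by integer index; the helper name `k_fold_splits` and its `seed` parameter are made up for this example and do not come from any library.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so the folds are random
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training set: every index that is not in the current test fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_splits(n_samples=25, k=10)
```

Each sample index lands in the test set of exactly one fold, and in the training set of the other nine.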

洪涛

Generally, people set k to 5 or 10 for cross-validation. That is, the data is (randomly) divided into k parts, with k-1 parts used for training and 1 part for testing. That said, since you want cross-validation, it is not going to be fast: the model has to be trained k times.
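As one possible way to get such a random k-way split without writing it by hand, scikit-learn provides `KFold`. The sketch below assumes scikit-learn and NumPy are installed; the toy arrays `X` and `y` are placeholders, not data from the question.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # 10 toy samples with 2 features each
y = np.arange(10)                 # toy labels

# shuffle=True randomizes which samples go into which fold;
# random_state makes the split reproducible.
kf = KFold(n_splits=5, shuffle=True, random_state=0)

fold_sizes = []
for train_idx, test_idx in kf.split(X):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # Train and evaluate a model on this fold here.
    fold_sizes.append((len(train_idx), len(test_idx)))
```

With 10 samples and k = 5, every fold trains on 8 samples and tests on 2.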

Ty80

You can use scikit-learn for this; see section 3.1 of its user guide, "Cross-validation: evaluating estimator performance":

>>> from sklearn import datasets, svm
>>> from sklearn.model_selection import cross_val_score
>>> iris = datasets.load_iris()
>>> clf = svm.SVC(kernel='linear', C=1)
>>> scores = cross_val_score(clf, iris.data, iris.target, cv=5)
>>> scores
array([ 0.96...,  1.  ...,  0.96...,  0.96...,  1.        ])
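For the plain train/test split asked about in the title (no cross-validation), scikit-learn's `train_test_split` is probably the quickest route. This is a sketch only: the 80/20 ratio, the `random_state` value, and the use of `stratify` are choices made for the example, not something the question specifies.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 20% of the samples for testing; stratify keeps the class
# proportions the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data,
    iris.target,
    test_size=0.2,
    random_state=0,
    stratify=iris.target,
)
```

On the 150-sample iris data this leaves 120 samples for training and 30 for testing.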