This post originally appeared on the 52pojie.cn forum.
I've been learning machine learning with sklearn recently. After a quick pass through the book I wanted some data to practice on, and the only dataset close at hand turned out to be the Double Color Ball lottery. The red balls are too numerous to get a handle on, so I decided to see whether the blue ball has any pattern worth learning.
First I downloaded all the historical Double Color Ball results from the web and saved them to ssq.txt.
Then I worked through the book's supervised-learning models one by one. In the end none of them was usable. It seems random draws are not so easily guessed; getting rich still comes down to luck!
# -*- coding: utf-8 -*-
# Import the usual packages
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import numpy as np

# Make matplotlib render Chinese labels correctly
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Constants and containers
n = 16               # number of possible blue balls
blueball = []        # blue ball drawn in each period
count_blueball = []  # per-ball occurrence counts
y_total = []         # total number of draws
# The downloaded results live in ssq.txt; pull the blue ball out of each line
with open('ssq.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        line = line.strip()  # drop the trailing '\n'
        blueball += [int(line[-2:])]  # the blue ball is the last 2 characters of each line
blueball.reverse()  # flip the list into chronological order

# Count how many times each blue ball has appeared
for i in range(1, 17):
    count_blueball += [blueball.count(i)]
print(count_blueball)
y_total = np.arange(1, len(blueball) + 1)  # draw index 1..N (an array, so sklearn can index it)
[163, 146, 152, 145, 152, 152, 161, 134, 167, 151, 166, 173, 152, 159, 153, 168]
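Whether these counts are consistent with a fair draw can be checked with a quick chi-square goodness-of-fit statistic. A sketch using only numpy, with the counts copied from the output above:

```python
import numpy as np

# Occurrence counts for blue balls 1-16, copied from the print() output above
observed = np.array([163, 146, 152, 145, 152, 152, 161, 134,
                     167, 151, 166, 173, 152, 159, 153, 168])

# Under a fair draw, each of the 16 balls is equally likely
expected = observed.sum() / 16.0

# Chi-square goodness-of-fit statistic (15 degrees of freedom)
chi2 = float(((observed - expected) ** 2 / expected).sum())
print("chi2 = {:.2f}".format(chi2))  # well below the 5% critical value of ~25
```

With a statistic around 9.9 on 15 degrees of freedom, the counts are entirely consistent with a uniform draw, which matches the visual impression from the bar chart.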
# Visualize the counts: the distribution looks essentially uniform (random);
# ball 8 has appeared the fewest times so far
plt.bar(range(1, len(count_blueball) + 1), count_blueball, label='蓝球出现次数')
# Print the exact count above each bar; ha/va control horizontal/vertical alignment
for x, y in enumerate(count_blueball):
    plt.text(x + 1, y + 1, '%s' % y, ha='center', va='bottom')
# Label both axes
plt.xlabel("蓝球号码")
plt.ylabel("出现次数")
plt.show()
# Reshape to 2-D: fit() requires a 2-D feature array
blueball_np = np.array(blueball).reshape(-1, 1)
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(blueball_np, y_total, random_state=0)
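As an aside, this setup treats the ball number as the feature and the draw index as the target. A more natural (though, for a random draw, equally hopeless) framing would be to predict the next ball from the previous few draws. A hypothetical sketch, where `make_lag_features` is an illustrative helper and the short sequence stands in for the real blueball list:

```python
import numpy as np

def make_lag_features(series, k=5):
    """Build (X, y) where each row of X holds k consecutive balls
    and y is the ball drawn immediately after them."""
    series = np.asarray(series)
    X = np.array([series[i:i + k] for i in range(len(series) - k)])
    y = series[k:]
    return X, y

# Toy sequence standing in for the real blueball list
X, y = make_lag_features([3, 7, 1, 16, 9, 4, 12], k=3)
print(X.shape, y.shape)  # (4, 3) (4,)
```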
# Start with a linear model
from sklearn.linear_model import LinearRegression
lr = LinearRegression().fit(X_train, y_train)
print("lr.coef_: {}".format(lr.coef_))
print("lr.intercept_: {}".format(lr.intercept_))
lr.coef_: [0.4257403]
lr.intercept_: 1238.600326580525
# Check the scores: both are 0. Failed!!!
print("Training set score: {:.2f}".format(lr.score(X_train, y_train)))
print("Test set score: {:.2f}".format(lr.score(X_test, y_test)))
Training set score: 0.00
Test set score: -0.00
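Note that the score reported by LinearRegression is R-squared, which compares the model against always predicting the mean of y. A score of 0 therefore means "no better than a constant", not "0% correct". A small sketch of the formula; `r2_score_manual` is an illustrative helper, not sklearn API:

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot, where SS_tot is the error of the
    # constant-mean baseline; near 0 means no better than that baseline,
    # negative means worse.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]
print(r2_score_manual(y, [2.5, 2.5, 2.5, 2.5]))  # mean prediction -> 0.0
```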
# K-nearest neighbors: score 0, an unsurprising failure
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(n_neighbors=2)
clf.fit(X_train, y_train)
print("Test set predictions: {}".format(clf.predict(X_test)))
print("Test set accuracy: {:.2f}".format(clf.score(X_test, y_test)))
Test set predictions: [ 100 1873 50 1654 1446 144 262 262 1749 262 262 1446 262 1749
...
144 100 50 1446 1792 1792 50 50 879 1654 130 1749 50 100
1446 1792 50 1446 916 1749 916 130 1654 117 1792 1654 555 1245
1245 100 262 1873 1792 144 50 555 916 879 1873 555 1873 555
117 130 916 130 916 555 1873 1792]
Test set accuracy: 0.00
# Decision tree: failed!
from sklearn.tree import DecisionTreeClassifier
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)
print("Accuracy on training set: {:.3f}".format(tree.score(X_train, y_train)))
print("Accuracy on test set: {:.3f}".format(tree.score(X_test, y_test)))
Accuracy on training set: 0.009
Accuracy on test set: 0.000
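The near-zero training accuracy is no accident: the single feature takes only 16 distinct values, so any classifier can emit at most 16 distinct predictions, while every draw index is its own class. That caps training accuracy at roughly 16 / n_train. A back-of-the-envelope sketch, assuming about 2494 draws and the default 75% train split:

```python
# One feature with 16 distinct values -> at most 16 distinct predictions,
# so with every draw index a unique class, training accuracy <= 16 / n_train.
n_train = 1870  # approx. 75% of the 2494 draws (train_test_split default)
bound = 16 / n_train
print("upper bound: {:.3f}".format(bound))  # about 0.009, matching the tree's train score
```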
# Kernelized support vector machine: failed!
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler
X_train, X_test, y_train, y_test = train_test_split(blueball_np, y_total,random_state=42)
#svm = SVC(kernel='rbf', C=10, gamma=0.1).fit(X_train, y_train)
# Fit the scaler on the training data (it records each feature's min and max)
scaler = MinMaxScaler().fit(X_train)
# Scale the training data
X_train_scaled = scaler.transform(X_train)
svm = SVC()
# Train the SVM on the scaled training data
svm.fit(X_train_scaled, y_train)
# Scale the test data the same way, then score
X_test_scaled = scaler.transform(X_test)
print("Test score: {:.2f}".format(svm.score(X_test_scaled, y_test)))
Test score: 0.00
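The scale-then-fit pattern above is often wrapped in an sklearn Pipeline, which chains the scaler and the model into one estimator so the scaler is always fit on exactly the data the model trains on. A sketch on toy data standing in for blueball_np and y_total (the target here is binary, unlike the draw index, just to keep it fast):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Toy data: 100 "ball numbers" in 1..16 and a random binary target
rng = np.random.default_rng(0)
X = rng.integers(1, 17, size=(100, 1)).astype(float)
y = rng.integers(0, 2, size=100)

# Chain scaling and the SVM; calling fit() fits both steps in order
pipe = make_pipeline(MinMaxScaler(), SVC())
pipe.fit(X, y)
acc = pipe.score(X, y)
print("train accuracy: {:.2f}".format(acc))
```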
# Neural network (MLP): failed
from sklearn.neural_network import MLPClassifier
mlp = MLPClassifier(solver='lbfgs', random_state=0, hidden_layer_sizes=[10])
mlp.fit(X_train, y_train)
print("Accuracy on training set: {:.2f}".format(mlp.score(X_train, y_train)))
print("Accuracy on test set: {:.2f}".format(mlp.score(X_test, y_test)))
Accuracy on training set: 0.01
Accuracy on test set: 0.00
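All five models score essentially zero, which is exactly what a no-skill baseline scores. sklearn ships such a baseline as DummyClassifier; a sketch on toy data shaped like the experiments above (X is a ball number, every draw index is a unique class):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy stand-ins for blueball_np (X) and the draw index (y)
rng = np.random.default_rng(0)
X = rng.integers(1, 17, size=(200, 1))
y = np.arange(200)  # every draw index is its own class, as in the post

# DummyClassifier ignores X entirely; if the real models cannot beat it,
# the ball numbers carry no usable signal
dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
acc = dummy.score(X, y)
print("baseline accuracy: {:.3f}".format(acc))  # 1 correct out of 200 classes
```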
[Figure: output_2_0.png, the bar chart of blue ball counts produced above (uploaded 2019-11-25)]