Deep Learning: Building an LSTM Model to Predict Stock Prices

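Hello everyone, I'm 带我去滑雪!
This post works with a stock-price dataset in which in.csv is the training set and t.csv is the test set. Each file contains the open, high, low, close, adjusted close, and volume. Before November 2021, historical price data could be downloaded from Yahoo's US site, but that access is now blocked in China, so the data has to be obtained elsewhere. The adjusted close is the series used for prediction here.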
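Contents
1. Import the modules and datasets
2. Build the features and labels needed for training
3. Reshape the data into a (samples, time steps, features) tensor
4. Define the LSTM model
5. Predict stock prices with the trained LSTM model
6. Plot the actual prices against the predicted prices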
1. Import the modules and datasets
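import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, GRU
# Load the stock-price datasets, using the Date column as a datetime index
df_train = pd.read_csv(r'E:\工作\硕士\博客\博客37-\in.csv', index_col="Date", parse_dates=True)
print(df_train)
df_test = pd.read_csv(r'E:\工作\硕士\博客\博客37-\t.csv', index_col="Date", parse_dates=True)
print(df_test)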
Output:
                  Open        High         Low       Close   Adj Close    Volume
Date
2012-01-03  324.360352  331.916199  324.077179  330.555054  330.555054   7400800
2012-01-04  330.366272  332.959412  328.175537  331.980774  331.980774   5765200
2012-01-05  328.925659  329.839722  325.994720  327.375732  327.375732   6608400
2012-01-06  327.445282  327.867523  322.795532  322.909790  322.909790   5420700
2012-01-09  321.161163  321.409546  308.607819  309.218842  309.218842  11720900
...                ...         ...         ...         ...         ...       ...
2016-12-23  790.900024  792.739990  787.280029  789.909973  789.909973    623400
2016-12-27  790.679993  797.859985  787.656982  791.549988  791.549988    789100
2016-12-28  793.700012  794.229980  783.200012  785.049988  785.049988   1153800
2016-12-29  783.330017  785.929993  778.919983  782.789978  782.789978    742200
2016-12-30  782.750000  782.780029  770.409973  771.820007  771.820007   1770000

[1258 rows x 6 columns]

                  Open        High         Low       Close   Adj Close    Volume
Date
2017-01-03  778.809998  789.630005  775.799988  786.140015  786.140015   1657300
2017-01-04  788.359985  791.340027  783.159973  786.900024  786.900024   1073000
2017-01-05  786.080017  794.479980  785.020020  794.020020  794.020020   1335200
2017-01-06  795.260010  807.900024  792.203979  806.150024  806.150024   1640200
2017-01-09  806.400024  809.966003  802.830017  806.650024  806.650024   1272400
...                ...         ...         ...         ...         ...       ...
2017-04-24  851.200012  863.450012  849.859985  862.760010  862.760010   1372500
2017-04-25  865.000000  875.000000  862.809998  872.299988  872.299988   1672000
2017-04-26  874.229980  876.049988  867.747986  871.729980  871.729980   1237200
2017-04-27  873.599976  875.400024  870.380005  874.250000  874.250000   2026800
2017-04-28  910.659973  916.849976  905.770020  905.960022  905.960022   3219500

[81 rows x 6 columns]
2. Build the features and labels needed for training
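# Column 4 is the adjusted close; slicing 4:5 keeps it as a 2-D array
X_train_set = df_train.iloc[:, 4:5].values
# Normalize the prices to the range [0, 1]
sc = MinMaxScaler()
X_train_set = sc.fit_transform(X_train_set)
def create_dataset(ds, look_back=1):
    # Each sample is look_back consecutive prices; its label is the next price
    X_data, Y_data = [], []
    for i in range(len(ds) - look_back):
        X_data.append(ds[i:(i + look_back), 0])
        Y_data.append(ds[i + look_back, 0])
    return np.array(X_data), np.array(Y_data)
look_back = 60
print("Look-back days:", look_back)
# Split into feature data and label data
X_train, Y_train = create_dataset(X_train_set, look_back)
Output:
Look-back days: 60
Out[5]:
array([0.08291369, 0.07626093, 0.0815312 , ..., 0.94758974, 0.94336851, 0.92287887])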
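To make the windowing step easier to follow, here is a minimal sketch on a made-up six-value price series. The helper names min_max_scale and make_windows are hypothetical stand-ins, not the functions used in this post, and the scaling is written by hand with NumPy instead of calling MinMaxScaler. It shows how each sample is a run of look_back scaled prices, how the label is the price that follows, and how a scaled value maps back to a price.

```python
import numpy as np

def min_max_scale(values):
    """Scale a 1-D array to [0, 1]; also return the (min, max) used."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), (lo, hi)

def make_windows(series, look_back):
    """Hypothetical helper: each sample is `look_back` consecutive values,
    and its label is the value that immediately follows the window."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

prices = np.array([10.0, 12.0, 11.0, 14.0, 18.0, 16.0])  # made-up prices
scaled, (lo, hi) = min_max_scale(prices)
X, y = make_windows(scaled, look_back=3)

print(X.shape, y.shape)  # 6 values with look_back=3 give 3 samples
# Undo the scaling to turn the labels back into prices
restored = y * (hi - lo) + lo
print(restored)
```

With 1258 training rows and a look-back of 60, the same logic yields 1258 - 60 = 1198 samples.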
3. Reshape the data into a (samples, time steps, features) tensor
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_train.shape
Output:
(1198, 60, 1)
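The shape above can be checked with a small sketch. It assumes only the row count and look-back reported earlier (1258 and 60) and uses placeholder zeros instead of real prices; the toy array at the end is made up to show that reshaping adds a trailing feature axis without changing any values.

```python
import numpy as np

n_rows, look_back = 1258, 60
n_samples = n_rows - look_back           # one sample per possible window

X_2d = np.zeros((n_samples, look_back))  # placeholder for the windowed prices
X_3d = np.reshape(X_2d, (X_2d.shape[0], X_2d.shape[1], 1))
print(X_3d.shape)

# Toy check: the values are untouched, only an axis of length 1 is added
toy = np.arange(6).reshape(2, 3)
toy_3d = toy.reshape(2, 3, 1)
print(toy[1, 2], toy_3d[1, 2, 0])
```

The trailing 1 is the number of features per time step, which here is the single adjusted-close value.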
4. Define the LSTM model
The model is compiled with MSE as the loss function and adam as the optimizer. It is trained for 100 epochs with a batch size of 32.
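As a rough illustration of these settings, the sketch below computes MSE by hand on three made-up values and works out how many weight updates the training run performs. It assumes all 1198 samples are used for training (no validation split) and that the final, smaller batch is kept.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up scaled prices and predictions, for illustration only
y_true = [0.50, 0.60, 0.70]
y_pred = [0.40, 0.60, 0.90]
loss = mse(y_true, y_pred)
print(round(loss, 4))

n_samples, batch_size, epochs = 1198, 32, 100
steps_per_epoch = math.ceil(n_samples / batch_size)  # last batch is partial
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

Because the loss is computed on the scaled prices, its value is not in price units.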
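model = Sequential()
# The first two LSTM layers return the full sequence so the next LSTM layer can consume it
model.add(LSTM(50, return_sequences=True,
               input_shape=(X_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(50))
model.add(Dropout(0.2))
model.add(Dense(1))
model.summary()
# Compile the model
model.compile(loss="mse", optimizer="adam")
# Train the model
model.fit(X_train, Y_train, epochs=100, batch_size=32)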
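The parameter counts that a model summary reports for this kind of stack can be reproduced by hand. The sketch below assumes standard LSTM layers with bias terms (four gates, each with input weights, recurrent weights, and a bias) and the layer sizes described above; the layer labels are made up for readability, and Dropout layers are left out because they have no trainable parameters.

```python
def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """One weight per input per unit, plus one bias per unit."""
    return units * input_dim + units

layers = [
    ("lstm_1", lstm_params(50, 1)),    # input is 1 feature per time step
    ("lstm_2", lstm_params(50, 50)),   # input is the 50 outputs of lstm_1
    ("lstm_3", lstm_params(50, 50)),
    ("dense", dense_params(1, 50)),
]

total = sum(count for _, count in layers)
for name, count in layers:
    print(name, count)
print("total", total)
```

The look-back of 60 does not appear in these formulas, since an LSTM reuses the same weights at every time step.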