Pandas Basics: Data Operations

2022-12-28

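Data Operations

Contents:
1. Basic pandas data operations
2. Assignment
3. Logical operations
4. Statistical operations

1. Basic pandas data operations

(1) Reading a CSV file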

import pandas as pd

data = pd.read_csv("./stock_day/stock_day.csv")
# Drop the columns that are not needed
data = data.drop(["ma5", "ma10", "ma20", "v_ma5", "v_ma10", "v_ma20"], axis=1)

data.head()

             open   high  close    low    volume  price_change  p_change  turnover
2018-02-27  23.53  25.88  24.16  23.53  95578.03          0.63      2.68      2.39
2018-02-26  22.80  23.78  23.53  22.80  60985.11          0.69      3.02      1.53
2018-02-23  22.88  23.37  22.82  22.71  52914.01          0.54      2.42      1.32
2018-02-22  22.25  22.76  22.28  22.02  36105.01          0.36      1.64      0.90
2018-02-14  21.49  21.99  21.92  21.48  23331.04          0.44      2.05      0.58
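The loading step can be tried without the original file. The sketch below is only an illustration: the CSV text, the "date" column name and the ma5 values are placeholders invented for the example, not the contents of stock_day.csv. It shows read_csv with an index column, followed by drop to discard an unwanted column.

```python
import io

import pandas as pd

# Hypothetical stand-in for stock_day.csv: the "date" header and the
# ma5 values (0.0) are placeholders, not taken from the real file.
csv_text = """date,open,high,ma5
2018-02-27,23.53,25.88,0.0
2018-02-26,22.80,23.78,0.0
"""

# Use the date column as the row index, as in the tables shown in this post
frame = pd.read_csv(io.StringIO(csv_text), index_col="date")

# drop(columns=...) is equivalent to drop([...], axis=1)
frame = frame.drop(columns=["ma5"])
print(frame.head())
```

Passing columns= instead of axis=1 is purely a readability choice; both forms remove the same columns.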

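Indexing:

(1) data["open"]["2018-02-26"]  # with plain [] indexing, select the column first and then the row; the reverse order raises an error

(2) data.loc["2018-02-26"]["open"]  # loc indexes by label, row first and then column; equivalent to data.loc["2018-02-26", "open"]

(3) data.iloc[1, 0]  # index by integer row and column position

(4) data.ix[:4, ['open', 'close', 'high', 'low']]  # the first four rows of the four columns 'open', 'close', 'high', 'low'. Note: .ix was deprecated in pandas 0.20 and removed in pandas 1.0; on current versions use data.loc[data.index[:4], ['open', 'close', 'high', 'low']] instead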
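The indexing forms above can be compared side by side on a small frame. This is a minimal sketch, assuming a five-row sample built from the head() output shown earlier (the name sample is introduced only for the example). It also shows two ways to replace the removed .ix call.

```python
import pandas as pd

# Small sample copied from the head() output above
sample = pd.DataFrame(
    {
        "open": [23.53, 22.80, 22.88, 22.25, 21.49],
        "high": [25.88, 23.78, 23.37, 22.76, 21.99],
        "close": [24.16, 23.53, 22.82, 22.28, 21.92],
        "low": [23.53, 22.80, 22.71, 22.02, 21.48],
    },
    index=["2018-02-27", "2018-02-26", "2018-02-23", "2018-02-22", "2018-02-14"],
)

# Label-based: row label first, then column label
by_label = sample.loc["2018-02-26", "open"]

# Position-based: second row, first column
by_position = sample.iloc[1, 0]

# Replacement for .ix[:4, [...]]: slice rows by position, then pick columns by name
first_four = sample.iloc[:4][["open", "close", "high", "low"]]

# The same selection done entirely through loc, using the first four index labels
first_four_loc = sample.loc[sample.index[:4], ["open", "close", "high", "low"]]

print(by_label, by_position)
print(first_four)
```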

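2. Assignment

data.open = 100  # set every value in the existing "open" column to 100; data["open"] = 100 is equivalent
data.iloc[1, 0] = 222  # set a single cell by row and column position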

Output:

            open   high  close    low    volume  price_change  p_change  turnover
2018-02-27   100  25.88  24.16  23.53  95578.03          0.63      2.68      2.39
2018-02-26   222  23.78  23.53  22.80  60985.11          0.69      3.02      1.53
2018-02-23   100  23.37  22.82  22.71  52914.01          0.54      2.42      1.32
2018-02-22   100  22.76  22.28  22.02  36105.01          0.36      1.64      0.90
2018-02-14   100  21.99  21.92  21.48  23331.04          0.44      2.05      0.58
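The same assignments can be written with bracket and loc syntax, which also works when the column name is not a valid Python attribute name. A minimal sketch on a three-row sample taken from the table above (the name sample and the 0.0 written into "high" are chosen only for illustration):

```python
import pandas as pd

sample = pd.DataFrame(
    {"open": [23.53, 22.80, 22.88], "high": [25.88, 23.78, 23.37]},
    index=["2018-02-27", "2018-02-26", "2018-02-23"],
)

# Broadcast a scalar to the whole column
sample["open"] = 100

# Overwrite one cell by position (second row, first column)
sample.iloc[1, 0] = 222

# Overwrite one cell by label in a single loc call
sample.loc["2018-02-23", "high"] = 0.0

print(sample)
```

Doing the label-based write in one loc call, rather than chaining sample["high"]["2018-02-23"] = ..., avoids assigning to a temporary copy.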

Sorting by row index:

data.sort_index().head()

Output:

            open   high  close    low     volume  price_change  p_change  turnover
2015-03-02   100  12.67  12.52  12.20   96291.73          0.32      2.62      3.30
2015-03-03   100  13.06  12.70  12.52  139071.61          0.18      1.44      4.76
2015-03-04   100  12.92  12.90  12.61   67075.44          0.20      1.57      2.30
2015-03-05   100  13.45  13.16  12.87   93180.39          0.26      2.02      3.19
2015-03-06   100  14.48  14.28  13.13  179831.72          1.12      8.51      6.16

Sorting by the values of a column:

sr = data["price_change"]
sr.sort_values(ascending=False).head()

2015-06-09    3.03
2017-10-26    2.68
2015-05-21    2.57
2017-10-31    2.38
2017-06-22    2.36
Name: price_change, dtype: float64

sr.sort_index().head()

2015-03-02    0.32
2015-03-03    0.18
2015-03-04    0.20
2015-03-05    0.26
2015-03-06    1.12
Name: price_change, dtype: float64
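The example above sorts a single Series. To reorder whole rows of the DataFrame by one column, sort_values takes a by= argument. A minimal sketch on a small sample copied from the earlier head() output (the name sample is used only for the example):

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "price_change": [0.63, 0.69, 0.54, 0.36, 0.44],
        "p_change": [2.68, 3.02, 2.42, 1.64, 2.05],
    },
    index=["2018-02-27", "2018-02-26", "2018-02-23", "2018-02-22", "2018-02-14"],
)

# Sort all rows by one column, largest first
by_change = sample.sort_values(by="price_change", ascending=False)

# Sort all rows by the index (ascending by default)
by_date = sample.sort_index()

print(by_change)
print(by_date)
```

Both calls return a new object and leave sample unchanged.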

3. Logical operations

data[data["p_change"] > 2].head()  # e.g. select the rows where p_change > 2

            open   high  close    low    volume  price_change  p_change  turnover
2018-02-27   100  25.88  24.16  23.53  95578.03          0.63      2.68      2.39
2018-02-26   222  23.78  23.53  22.80  60985.11          0.69      3.02      1.53
2018-02-23   100  23.37  22.82  22.71  52914.01          0.54      2.42      1.32
2018-02-14   100  21.99  21.92  21.48  23331.04          0.44      2.05      0.58
2018-02-12   100  21.40  21.19  20.63  32445.39          0.82      4.03      0.81

data[(data["p_change"] > 2) & (data["low"] > 15)].head()  # combine several conditions: p_change > 2 and low > 15

            open   high  close    low    volume  price_change  p_change  turnover
2018-02-27   100  25.88  24.16  23.53  95578.03          0.63      2.68      2.39
2018-02-26   222  23.78  23.53  22.80  60985.11          0.69      3.02      1.53
2018-02-23   100  23.37  22.82  22.71  52914.01          0.54      2.42      1.32
2018-02-14   100  21.99  21.92  21.48  23331.04          0.44      2.05      0.58
2018-02-12   100  21.40  21.19  20.63  32445.39          0.82      4.03      0.81

Alternatively, use query():

data.query("p_change > 2 & low > 15").head()

            open   high  close    low    volume  price_change  p_change  turnover
2018-02-27   100  25.88  24.16  23.53  95578.03          0.63      2.68      2.39
2018-02-26   222  23.78  23.53  22.80  60985.11          0.69      3.02      1.53
2018-02-23   100  23.37  22.82  22.71  52914.01          0.54      2.42      1.32
2018-02-14   100  21.99  21.92  21.48  23331.04          0.44      2.05      0.58
2018-02-12   100  21.40  21.19  20.63  32445.39          0.82      4.03      0.81
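The boolean-mask form and the query() form select the same rows. A minimal sketch on a five-row sample copied from the first head() output (the names sample, masked, queried and either exist only in this example). In the mask form each comparison needs its own parentheses, because Python's & binds more tightly than >. Inside a query string the parentheses are optional, and "and" can be used in place of "&".

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "low": [23.53, 22.80, 22.71, 22.02, 21.48],
        "p_change": [2.68, 3.02, 2.42, 1.64, 2.05],
    },
    index=["2018-02-27", "2018-02-26", "2018-02-23", "2018-02-22", "2018-02-14"],
)

# Boolean mask: each comparison is parenthesized, combined with &
masked = sample[(sample["p_change"] > 2) & (sample["low"] > 15)]

# query(): the same condition written as a string expression
queried = sample.query("p_change > 2 and low > 15")

# | combines conditions with a logical OR
either = sample[(sample["p_change"] > 3) | (sample["low"] < 22)]

print(masked)
print(either)
```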

Membership test:

data[data["turnover"].isin([4.19, 2.39])]  # select the rows whose 'turnover' is 4.19 or 2.39

            open   high  close    low     volume  price_change  p_change  turnover
2018-02-27   100  25.88  24.16  23.53   95578.03          0.63      2.68      2.39
2017-07-25   100  24.20  23.70  22.64  167489.48          0.67      2.91      4.19
2016-09-28   100  20.98  20.86  19.71   95580.75          0.98      4.93      2.39
2015-04-07   100  17.98  17.54  16.50  122471.85          0.88      5.28      4.19
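isin() returns a boolean Series, so it can be used as a mask directly or inverted with ~ to exclude the listed values. A minimal sketch, using the turnover values from the first head() output (the names sample, matches and others are illustrative):

```python
import pandas as pd

sample = pd.DataFrame(
    {"turnover": [2.39, 1.53, 1.32, 0.90, 0.58]},
    index=["2018-02-27", "2018-02-26", "2018-02-23", "2018-02-22", "2018-02-14"],
)

# True where turnover equals one of the listed values
mask = sample["turnover"].isin([4.19, 2.39])

matches = sample[mask]   # rows whose turnover is in the list
others = sample[~mask]   # rows whose turnover is not in the list

print(matches)
print(others)
```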

4. Statistical operations

data.describe()

             open        high       close         low         volume  price_change    p_change    turnover
count  643.000000  643.000000  643.000000  643.000000     643.000000    643.000000  643.000000  643.000000
mean   100.189736   21.900513   21.336267   20.771835   99905.519114      0.018802    0.190280    2.936190
std      4.811210    4.077578    3.942806    3.791968   73879.119354      0.898476    4.079698    2.079375
min    100.000000   12.670000   12.360000   12.200000    1158.120000     -3.520000  -10.030000    0.040000
25%    100.000000   19.500000   19.045000   18.525000   48533.210000     -0.390000   -1.850000    1.360000
50%    100.000000   21.970000   21.450000   20.980000   83175.930000      0.050000    0.260000    2.500000
75%    100.000000   24.065000   23.415000   22.850000  127580.055000      0.455000    2.305000    3.915000
max    222.000000   36.350000   35.210000   34.010000  501915.410000      3.030000   10.030000   12.560000

data.max(axis=0)  # maximum of each column

Output:

open               222.00
high                36.35
close               35.21
low                 34.01
volume          501915.41
price_change         3.03
p_change            10.03
turnover            12.56
dtype: float64

data.idxmax(axis=0)  # index label at which each column reaches its maximum

Output:

open            2018-02-26
high            2015-06-10
close           2015-06-12
low             2015-06-12
volume          2017-10-26
price_change    2015-06-09
p_change        2015-08-28
turnover        2017-10-26
dtype: object
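The axis argument decides the direction of the reduction: axis=0 reduces down each column, axis=1 reduces across each row. A minimal sketch on the high and low columns from the first head() output (the variable names are chosen only for this example):

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "high": [25.88, 23.78, 23.37, 22.76, 21.99],
        "low": [23.53, 22.80, 22.71, 22.02, 21.48],
    },
    index=["2018-02-27", "2018-02-26", "2018-02-23", "2018-02-22", "2018-02-14"],
)

summary = sample.describe()          # count, mean, std, min, quartiles, max per column
col_max = sample.max(axis=0)         # one maximum per column
col_argmax = sample.idxmax(axis=0)   # index label where each column peaks
row_max = sample.max(axis=1)         # one maximum per row

print(summary)
print(col_max)
print(col_argmax)
```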