pandas 是基于 Numpy 构建的,让以 Numpy 为中心的应用变得更加简单。pandas是公认的数据处理利器,本章内容主要介绍DataFrame数据结构,在此基础上进行数据处理。除了DataFrame格式,pandas 还包括series、Panel。
格式 | 数组 | 释义 |
---|---|---|
Series | 一维数组 | 与Numpy中的一维array类似。 |
DataFrame | 二维的表格型数据结构 | 可以将DataFrame理解为Series的容器 |
Panel | 三维的数组 | 可以理解为DataFrame的容器 |
开始之前,我们首先掌握导入pandas库,方式如下:
import pandas as pd #pd是pandas的别名
注意:以下内容必须在导入pandas库之后才能运行。
1.Series
由一组数据和与之相关的索引组成。可通过传递一个list对象来创建一个Series,pandas会默认创建整型索引。
创建一个Series:
a=[1,3,5,7,6,8]
a[2]
5
s = pd.Series([1,3,5,7,6,8])
print(s)
0 1 1 3 2 5 3 7 4 6 5 8 dtype: int64
s.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb2e05c9b0>
s.plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb2bd07ef0>
s.plot.pie()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb2bd444e0>
s.hist()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb2b5e76d8>
获取 Series 的索引:
s.index.values
array([0, 1, 2, 3, 4, 5])
s=pd.Series([1,3,5,7,6,8],index=['a','b','c','d','e','f'])
print(s)
a 1 b 3 c 5 d 7 e 6 f 8 dtype: int64
s['f']
8
t=pd.Series([10,11,12],index=['c','b','a'])
print(t)
c 10 b 11 a 12 dtype: int64
[1,2,3]+[4,5,6]
[1, 2, 3, 4, 5, 6]
t+s#Series同index的元素值相加
a 13.0 b 14.0 c 15.0 d NaN e NaN f NaN dtype: float64
s=pd.Series([1,3,5,7,6,8],index=['a','b','c','d','e','f'])
print(s)
t=pd.Series([10,11,12],index=['c','b','a'])
print(t)
a 1 b 3 c 5 d 7 e 6 f 8 dtype: int64 c 10 b 11 a 12 dtype: int64
2.DataFrame
DataFrame是一个表格型的数据结构,它含有一组有序的列,每一列的数据结构都是相同的,而不同的列之间则可以是不同的数据结构。DataFrame中的每一行是一个记录,名称为Index的一个元素,而每一列则为一个字段,是这个记录的一个属性,DataFrame既有行索引也有列索引。
创建DataFrame
首先来看如何从字典创建DataFrame。
d = {'one': [1, 2, 3], 'two': [1, 2, 3]}
df = pd.DataFrame(d)
print(df)
one two 0 1 1 1 2 2 2 3 3
d = {'one': [1, 2, 3], 'two': [1, 2, 3]}
df = pd.DataFrame(d,index=['a', 'b', 'c'])
print(df)
one two a 1 1 b 2 2 c 3 3
A B C
one 2 9 4
two 7 5 3
three 6 1 8
d = {'A':[2,7,6],'B':[9,5,1],'C':[4,3,8]}
df=pd.DataFrame(d,index=['one','two','three'])
print(df)
A B C one 2 9 4 two 7 5 3 three 6 1 8
可以使用dataframe.index和dataframe.columns来查看DataFrame的行和列,dataframe.values则以数组的形式返回DataFrame的元素:
print(df.index) #查看行
print(df.columns) #查看列
print(df.values) #查看元素
Index(['one', 'two', 'three'], dtype='object') Index(['A', 'B', 'C'], dtype='object') [[2 9 4] [7 5 3] [6 1 8]]
DataFrame从值是数组的字典创建时,其各个数组的长度需要相同,加强印象,可参考以下报错的例子。
d = {'one': [1, 2], 'two': [1, 2, 3]}
df = pd.DataFrame(d,index=['a', 'b', 'c'])
print(df)
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_arrays(arrays, names, axes) 1692 try: -> 1693 blocks = form_blocks(arrays, names, axes) 1694 mgr = BlockManager(blocks, axes) /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in form_blocks(arrays, names, axes) 1762 if len(items_dict["IntBlock"]): -> 1763 int_blocks = _multi_blockify(items_dict["IntBlock"]) 1764 blocks.extend(int_blocks) /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in _multi_blockify(tuples, dtype) 1844 -> 1845 values, placement = _stack_arrays(list(tup_block), dtype) 1846 /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in _stack_arrays(tuples, dtype) 1874 for i, arr in enumerate(arrays): -> 1875 stacked[i] = _asarray_compat(arr) 1876 ValueError: could not broadcast input array from shape (3) into shape (2) During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) <ipython-input-26-6714bd1fff47> in <module> 1 d = {'one': [1, 2], 'two': [1, 2, 3]} ----> 2 df = pd.DataFrame(d,index=['a', 'b', 'c']) 3 print(df) /opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy) 408 ) 409 elif isinstance(data, dict): --> 410 mgr = init_dict(data, index, columns, dtype=dtype) 411 elif isinstance(data, ma.MaskedArray): 412 import numpy.ma.mrecords as mrecords /opt/conda/lib/python3.6/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype) 255 arr if not is_datetime64tz_dtype(arr) else arr.copy() for arr in arrays 256 ] --> 257 return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype) 258 259 /opt/conda/lib/python3.6/site-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype) 85 axes = [ensure_index(columns), index] 86 ---> 87 return create_block_manager_from_arrays(arrays, arr_names, axes) 88 89 /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_arrays(arrays, names, axes) 1696 return mgr 1697 except ValueError as e: -> 1698 construction_error(len(arrays), arrays[0].shape, axes, e) 1699 1700 /opt/conda/lib/python3.6/site-packages/pandas/core/internals/managers.py in construction_error(tot_items, block_shape, axes, e) 1716 raise ValueError("Empty data passed with indices specified.") 1717 raise ValueError( -> 1718 "Shape of passed values is {0}, indices imply {1}".format(passed, implied) 1719 ) 1720 ValueError: Shape of passed values is (2, 2), indices imply (3, 2)
如果DataFrame的值是非数组时,没有这一限制,且自动将缺失值补成NaN。如下示例
d= [{'a': 1.6, 'b': 2}, {'a': 3, 'b': 6, 'c': 9}]
df = pd.DataFrame(d)
print(df)
a b c 0 1.6 2 NaN 1 3.0 6 9.0
d= [{'a': 1.6, 'b': 2}, {'a': 3, 'b': 6, 'c': 9}]
df = pd.DataFrame(d,index=['one','two'])
print(df)
a b c one 1.6 2 NaN two 3.0 6 9.0
df['a'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb29cdd978>
df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb29c6dc18>
在实际处理数据时,有时需要创建一个空的DataFrame,可以这么做:
df = pd.DataFrame()
print(df)
Empty DataFrame Columns: [] Index: []
另一种创建DataFrame的方法十分有用,那就是使用concat函数创建DataFrame,其主要是通过两个行或列相同的DataFrame链接成一个。
a= [{'a': 1.6, 'b': 2}, {'a': 3, 'b': 6}]
df1 = pd.DataFrame(a)
b= [{'a': 4, 'b': 5}]
df2 = pd.DataFrame(b)
df = pd.concat([df1, df2], axis=0)
print(df1)
print(df2)
print(df)
a b 0 1.6 2 1 3.0 6 a b 0 4 5 a b 0 1.6 2 1 3.0 6 0 4.0 5
df = pd.concat([df1, df2], axis=1)
print(df)
a b a b 0 1.6 2 4.0 5.0 1 3.0 6 NaN NaN
注意:concat函数内有axis参数,其中的axis=1表示按列进行合并,axis=0表示按行合并
先从简单的csv导入数据到DataFrame。
# 获取csv文件中的浦发银行的开盘价、最高价、最低价、收盘价,获取格式即为DataFrame
price=pd.read_csv('stock1-105.csv',encoding='gbk',usecols=['date','open','high','low','close'])
price=price.set_index('date')
price
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.18 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.30 |
2017-01-06 | 16.30 | 16.30 | 16.13 | 16.18 |
2017-01-09 | 16.24 | 16.29 | 16.13 | 16.20 |
2017-01-10 | 16.18 | 16.24 | 16.14 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.15 | 16.16 |
2017-01-12 | 16.18 | 16.20 | 16.11 | 16.12 |
2017-01-13 | 16.10 | 16.29 | 16.10 | 16.27 |
2017-01-16 | 16.23 | 16.60 | 16.10 | 16.56 |
2017-01-17 | 16.46 | 16.54 | 16.37 | 16.40 |
2017-01-18 | 16.42 | 16.55 | 16.36 | 16.48 |
2017-01-19 | 16.43 | 16.64 | 16.43 | 16.54 |
2017-01-20 | 16.58 | 16.66 | 16.50 | 16.60 |
2017-01-23 | 16.66 | 16.69 | 16.51 | 16.57 |
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.61 | 16.74 |
2017-02-03 | 16.82 | 16.85 | 16.62 | 16.63 |
2017-02-06 | 16.75 | 16.78 | 16.66 | 16.66 |
2017-02-07 | 16.65 | 16.74 | 16.63 | 16.67 |
2017-02-08 | 16.63 | 16.69 | 16.54 | 16.67 |
2017-02-09 | 16.68 | 16.77 | 16.59 | 16.72 |
2017-02-10 | 16.76 | 16.84 | 16.70 | 16.78 |
2017-02-13 | 16.88 | 16.90 | 16.78 | 16.85 |
2017-02-14 | 16.87 | 16.88 | 16.74 | 16.75 |
2017-02-15 | 16.82 | 16.93 | 16.75 | 16.84 |
2017-02-16 | 16.88 | 16.91 | 16.76 | 16.78 |
2017-02-17 | 16.78 | 16.82 | 16.59 | 16.64 |
2017-02-20 | 16.62 | 16.98 | 16.62 | 16.91 |
... | ... | ... | ... | ... |
2017-08-07 | 12.87 | 12.95 | 12.83 | 12.93 |
2017-08-08 | 12.91 | 12.93 | 12.83 | 12.87 |
2017-08-09 | 12.84 | 12.88 | 12.75 | 12.76 |
2017-08-10 | 12.74 | 12.88 | 12.70 | 12.79 |
2017-08-11 | 12.72 | 12.73 | 12.46 | 12.68 |
2017-08-14 | 12.62 | 12.63 | 12.52 | 12.56 |
2017-08-15 | 12.58 | 12.76 | 12.56 | 12.59 |
2017-08-16 | 12.56 | 12.56 | 12.47 | 12.49 |
2017-08-17 | 12.49 | 12.55 | 12.47 | 12.53 |
2017-08-18 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-08-21 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-08-22 | 12.51 | 12.53 | 12.41 | 12.43 |
2017-08-23 | 12.44 | 12.66 | 12.41 | 12.57 |
2017-08-24 | 12.58 | 12.66 | 12.45 | 12.47 |
2017-08-25 | 12.47 | 12.80 | 12.47 | 12.78 |
2017-08-28 | 12.83 | 12.96 | 12.75 | 12.89 |
2017-08-29 | 12.85 | 12.98 | 12.77 | 12.96 |
2017-08-30 | 12.91 | 12.98 | 12.82 | 12.87 |
2017-08-31 | 12.83 | 12.84 | 12.65 | 12.71 |
2017-09-01 | 12.68 | 12.87 | 12.68 | 12.77 |
2017-09-04 | 12.78 | 12.84 | 12.65 | 12.78 |
2017-09-05 | 12.78 | 13.09 | 12.78 | 13.03 |
2017-09-06 | 12.99 | 13.05 | 12.90 | 12.96 |
2017-09-07 | 12.94 | 13.02 | 12.82 | 12.85 |
2017-09-08 | 12.85 | 13.12 | 12.83 | 13.03 |
2017-09-11 | 13.15 | 13.15 | 12.95 | 12.99 |
2017-09-12 | 13.00 | 13.04 | 12.91 | 13.04 |
2017-09-13 | 13.01 | 13.01 | 12.86 | 12.87 |
2017-09-14 | 12.88 | 12.98 | 12.81 | 12.85 |
2017-09-15 | 12.82 | 12.86 | 12.75 | 12.83 |
174 rows × 4 columns
df=pd.read_csv('/resources/data/TRD_Dalyr.csv')
df
Stkcd | Trddt | Opnprc | Hiprc | Loprc | Clsprc | Dnshrtrd | Dnvaltrd | Dsmvosd | Dsmvtll | Adjprcwd | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 600600 | 2019-01-02 | 34.91 | 35.68 | 34.12 | 34.50 | 2866826 | 9.993306e+07 | 24009019.79 | 24009019.79 | 57.993438 |
1 | 600600 | 2019-01-03 | 34.80 | 34.89 | 32.93 | 33.30 | 3137838 | 1.058104e+08 | 23173923.45 | 23173923.45 | 55.976275 |
2 | 600600 | 2019-01-04 | 32.19 | 33.93 | 32.19 | 33.40 | 2816908 | 9.389348e+07 | 23243514.81 | 23243514.81 | 56.144372 |
3 | 600600 | 2019-01-07 | 33.49 | 35.18 | 33.10 | 34.86 | 4165242 | 1.435954e+08 | 24259548.69 | 24259548.69 | 58.598587 |
4 | 600600 | 2019-01-08 | 34.72 | 34.72 | 33.20 | 33.49 | 4465646 | 1.502569e+08 | 23306147.03 | 23306147.03 | 56.295659 |
5 | 600600 | 2019-01-09 | 33.51 | 34.68 | 33.28 | 33.75 | 4591794 | 1.564405e+08 | 23487084.57 | 23487084.57 | 56.732711 |
6 | 600600 | 2019-01-10 | 33.73 | 33.73 | 32.87 | 33.18 | 4342328 | 1.436151e+08 | 23090413.81 | 23090413.81 | 55.774559 |
7 | 600600 | 2019-01-11 | 33.20 | 33.60 | 32.90 | 33.21 | 3027414 | 1.004160e+08 | 23111291.22 | 23111291.22 | 55.824988 |
8 | 600600 | 2019-01-14 | 33.26 | 33.28 | 32.65 | 33.09 | 2502931 | 8.233961e+07 | 23027781.59 | 23027781.59 | 55.623272 |
9 | 600600 | 2019-01-15 | 32.98 | 36.40 | 32.83 | 36.40 | 10810126 | 3.784860e+08 | 25331255.66 | 25331255.66 | 61.187280 |
10 | 600600 | 2019-01-16 | 36.00 | 36.87 | 35.86 | 36.10 | 6596757 | 2.388105e+08 | 25122481.57 | 25122481.57 | 60.682989 |
11 | 600600 | 2019-01-17 | 36.50 | 36.50 | 35.80 | 35.80 | 4330457 | 1.564373e+08 | 24913707.49 | 24913707.49 | 60.178698 |
12 | 600600 | 2019-01-18 | 36.00 | 36.47 | 35.76 | 36.19 | 4685634 | 1.695381e+08 | 25185113.80 | 25185113.80 | 60.834276 |
13 | 600600 | 2019-01-21 | 36.17 | 38.76 | 36.09 | 36.78 | 7439315 | 2.771725e+08 | 25595702.83 | 25595702.83 | 61.826048 |
14 | 600600 | 2019-01-22 | 36.50 | 36.74 | 35.80 | 36.17 | 2824918 | 1.021891e+08 | 25171195.53 | 25171195.53 | 60.800657 |
15 | 600600 | 2019-01-23 | 36.00 | 36.68 | 35.81 | 36.30 | 2863203 | 1.040513e+08 | 25261664.30 | 25261664.30 | 61.019183 |
16 | 600600 | 2019-01-24 | 36.22 | 36.81 | 36.04 | 36.58 | 2333696 | 8.502156e+07 | 25456520.11 | 25456520.11 | 61.489854 |
17 | 600600 | 2019-01-25 | 36.57 | 37.56 | 36.16 | 36.83 | 3176591 | 1.168643e+08 | 25630498.51 | 25630498.51 | 61.910096 |
18 | 600600 | 2019-01-28 | 37.00 | 37.94 | 36.60 | 36.96 | 4363808 | 1.625587e+08 | 25720967.28 | 25720967.28 | 62.128622 |
19 | 600600 | 2019-01-29 | 36.80 | 38.38 | 36.10 | 38.13 | 4956990 | 1.859601e+08 | 26535186.22 | 26535186.22 | 64.095356 |
20 | 600600 | 2019-01-30 | 37.54 | 38.68 | 37.16 | 37.19 | 3985083 | 1.509411e+08 | 25881027.42 | 25881027.42 | 62.515245 |
21 | 600600 | 2019-01-31 | 37.05 | 38.12 | 37.05 | 37.73 | 2556359 | 9.631578e+07 | 26256820.77 | 26256820.77 | 63.422968 |
22 | 600600 | 2019-02-01 | 37.77 | 38.20 | 37.50 | 38.10 | 3193602 | 1.211643e+08 | 26514308.81 | 26514308.81 | 64.044927 |
23 | 600600 | 2019-02-11 | 38.10 | 39.00 | 37.50 | 38.70 | 4417674 | 1.690389e+08 | 26931856.98 | 26931856.98 | 65.053509 |
24 | 600600 | 2019-02-12 | 38.67 | 38.70 | 38.10 | 38.45 | 4249343 | 1.627973e+08 | 26757878.57 | 26757878.57 | 64.633267 |
25 | 600600 | 2019-02-13 | 38.40 | 38.70 | 38.25 | 38.55 | 3804185 | 1.463872e+08 | 26827469.94 | 26827469.94 | 64.801364 |
26 | 600600 | 2019-02-14 | 38.48 | 40.24 | 38.38 | 39.46 | 4301795 | 1.699792e+08 | 27460751.33 | 27460751.33 | 66.331046 |
27 | 600600 | 2019-02-15 | 39.39 | 39.39 | 38.20 | 38.33 | 3445361 | 1.336287e+08 | 26674368.94 | 26674368.94 | 64.431551 |
28 | 600600 | 2019-02-18 | 38.38 | 39.51 | 38.35 | 39.06 | 3449336 | 1.341243e+08 | 27182385.88 | 27182385.88 | 65.658659 |
29 | 600600 | 2019-02-19 | 39.14 | 39.37 | 37.55 | 38.20 | 5970855 | 2.285474e+08 | 26583900.17 | 26583900.17 | 64.213025 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
404 | 600600 | 2020-08-31 | 95.44 | 99.50 | 87.36 | 87.36 | 16336429 | 1.485624e+09 | 60795013.58 | 61948165.58 | 149.643013 |
405 | 600600 | 2020-09-01 | 87.36 | 87.66 | 82.50 | 86.30 | 12650742 | 1.073644e+09 | 60057345.15 | 61196505.15 | 147.827290 |
406 | 600600 | 2020-09-02 | 85.28 | 86.63 | 83.75 | 84.45 | 7235267 | 6.149912e+08 | 58769904.96 | 59884644.96 | 144.658339 |
407 | 600600 | 2020-09-03 | 84.45 | 86.86 | 83.55 | 85.34 | 6967213 | 5.962448e+08 | 59389268.07 | 60515756.07 | 146.182861 |
408 | 600600 | 2020-09-04 | 83.50 | 83.56 | 79.61 | 81.50 | 9595196 | 7.745201e+08 | 56716959.79 | 57792759.79 | 139.605146 |
409 | 600600 | 2020-09-07 | 80.57 | 81.30 | 77.40 | 78.40 | 7119585 | 5.663550e+08 | 54559627.57 | 55594507.57 | 134.295012 |
410 | 600600 | 2020-09-08 | 78.88 | 79.29 | 74.55 | 77.01 | 8062892 | 6.141779e+08 | 53592307.65 | 54608839.65 | 131.914016 |
411 | 600600 | 2020-09-09 | 75.88 | 76.19 | 73.70 | 74.98 | 4712071 | 3.520896e+08 | 52179603.00 | 53169339.00 | 128.436734 |
412 | 600600 | 2020-09-10 | 76.00 | 76.99 | 74.77 | 75.24 | 5188079 | 3.926766e+08 | 52360540.54 | 53353708.54 | 128.882100 |
413 | 600600 | 2020-09-11 | 74.68 | 77.11 | 73.82 | 76.77 | 4146573 | 3.145263e+08 | 53425288.38 | 54438652.38 | 131.502908 |
414 | 600600 | 2020-09-14 | 77.50 | 80.08 | 76.88 | 77.95 | 7192602 | 5.670469e+08 | 54246466.45 | 55275406.45 | 133.524185 |
415 | 600600 | 2020-09-15 | 77.77 | 79.50 | 76.23 | 78.85 | 4127403 | 3.231186e+08 | 54872788.70 | 55913608.70 | 135.065837 |
416 | 600600 | 2020-09-16 | 78.65 | 80.17 | 77.67 | 78.65 | 3399733 | 2.676203e+08 | 54733605.98 | 55771785.98 | 134.723248 |
417 | 600600 | 2020-09-17 | 78.00 | 78.00 | 75.49 | 75.90 | 5078531 | 3.865731e+08 | 52819843.53 | 53821723.53 | 130.012645 |
418 | 600600 | 2020-09-18 | 75.77 | 76.21 | 74.26 | 76.21 | 4966692 | 3.739078e+08 | 53035576.75 | 54041548.75 | 130.543658 |
419 | 600600 | 2020-09-21 | 76.20 | 76.20 | 74.28 | 74.92 | 4789182 | 3.579874e+08 | 52137848.19 | 53126792.19 | 128.333957 |
420 | 600600 | 2020-09-22 | 74.00 | 74.99 | 73.50 | 74.30 | 3666905 | 2.729507e+08 | 51706381.74 | 52687141.74 | 127.271930 |
421 | 600600 | 2020-09-23 | 74.30 | 74.64 | 73.36 | 74.15 | 3852423 | 2.854876e+08 | 51601994.70 | 52580774.70 | 127.014988 |
422 | 600600 | 2020-09-24 | 74.15 | 74.35 | 71.40 | 72.41 | 5948893 | 4.317549e+08 | 50391105.01 | 51346917.01 | 124.034461 |
423 | 600600 | 2020-09-25 | 72.41 | 73.67 | 70.00 | 73.51 | 6215862 | 4.502747e+08 | 51156609.99 | 52126941.99 | 125.918702 |
424 | 600600 | 2020-09-28 | 73.00 | 73.50 | 71.17 | 71.49 | 5088792 | 3.657537e+08 | 49750864.48 | 50694532.48 | 122.458550 |
425 | 600600 | 2020-09-29 | 71.98 | 72.50 | 71.17 | 72.08 | 4640643 | 3.343638e+08 | 50161453.51 | 51112909.51 | 123.469188 |
426 | 600600 | 2020-09-30 | 72.50 | 76.08 | 72.21 | 75.11 | 8449161 | 6.319768e+08 | 52270071.77 | 53261523.77 | 128.659416 |
427 | 600600 | 2020-10-09 | 75.83 | 76.81 | 74.37 | 74.62 | 6983298 | 5.230091e+08 | 51929074.10 | 52914058.10 | 127.820072 |
428 | 600600 | 2020-10-12 | 74.50 | 75.19 | 73.17 | 75.10 | 8951159 | 6.663092e+08 | 52263112.64 | 53254432.64 | 128.642286 |
429 | 600600 | 2020-10-13 | 74.85 | 76.70 | 74.08 | 76.49 | 7075436 | 5.376659e+08 | 53230432.56 | 54240100.56 | 131.023282 |
430 | 600600 | 2020-10-14 | 75.39 | 76.45 | 72.96 | 74.15 | 9407050 | 6.974287e+08 | 51601994.70 | 52580774.70 | 127.014987 |
431 | 600600 | 2020-10-15 | 74.15 | 76.60 | 73.37 | 76.00 | 6741385 | 5.082867e+08 | 52889434.89 | 53892634.89 | 130.183938 |
432 | 600600 | 2020-10-16 | 75.44 | 76.88 | 75.02 | 75.40 | 4823905 | 3.662679e+08 | 52471886.72 | 53467166.72 | 129.156170 |
433 | 600600 | 2020-10-19 | 76.16 | 76.16 | 74.00 | 74.53 | 3630250 | 2.712269e+08 | 51866441.88 | 52850237.88 | 127.665907 |
434 rows × 11 columns
df=df.set_index('Trddt')
df
Stkcd | Opnprc | Hiprc | Loprc | Clsprc | Dnshrtrd | Dnvaltrd | Dsmvosd | Dsmvtll | Adjprcwd | |
---|---|---|---|---|---|---|---|---|---|---|
Trddt | ||||||||||
2019-01-02 | 600600 | 34.91 | 35.68 | 34.12 | 34.50 | 2866826 | 9.993306e+07 | 24009019.79 | 24009019.79 | 57.993438 |
2019-01-03 | 600600 | 34.80 | 34.89 | 32.93 | 33.30 | 3137838 | 1.058104e+08 | 23173923.45 | 23173923.45 | 55.976275 |
2019-01-04 | 600600 | 32.19 | 33.93 | 32.19 | 33.40 | 2816908 | 9.389348e+07 | 23243514.81 | 23243514.81 | 56.144372 |
2019-01-07 | 600600 | 33.49 | 35.18 | 33.10 | 34.86 | 4165242 | 1.435954e+08 | 24259548.69 | 24259548.69 | 58.598587 |
2019-01-08 | 600600 | 34.72 | 34.72 | 33.20 | 33.49 | 4465646 | 1.502569e+08 | 23306147.03 | 23306147.03 | 56.295659 |
2019-01-09 | 600600 | 33.51 | 34.68 | 33.28 | 33.75 | 4591794 | 1.564405e+08 | 23487084.57 | 23487084.57 | 56.732711 |
2019-01-10 | 600600 | 33.73 | 33.73 | 32.87 | 33.18 | 4342328 | 1.436151e+08 | 23090413.81 | 23090413.81 | 55.774559 |
2019-01-11 | 600600 | 33.20 | 33.60 | 32.90 | 33.21 | 3027414 | 1.004160e+08 | 23111291.22 | 23111291.22 | 55.824988 |
2019-01-14 | 600600 | 33.26 | 33.28 | 32.65 | 33.09 | 2502931 | 8.233961e+07 | 23027781.59 | 23027781.59 | 55.623272 |
2019-01-15 | 600600 | 32.98 | 36.40 | 32.83 | 36.40 | 10810126 | 3.784860e+08 | 25331255.66 | 25331255.66 | 61.187280 |
2019-01-16 | 600600 | 36.00 | 36.87 | 35.86 | 36.10 | 6596757 | 2.388105e+08 | 25122481.57 | 25122481.57 | 60.682989 |
2019-01-17 | 600600 | 36.50 | 36.50 | 35.80 | 35.80 | 4330457 | 1.564373e+08 | 24913707.49 | 24913707.49 | 60.178698 |
2019-01-18 | 600600 | 36.00 | 36.47 | 35.76 | 36.19 | 4685634 | 1.695381e+08 | 25185113.80 | 25185113.80 | 60.834276 |
2019-01-21 | 600600 | 36.17 | 38.76 | 36.09 | 36.78 | 7439315 | 2.771725e+08 | 25595702.83 | 25595702.83 | 61.826048 |
2019-01-22 | 600600 | 36.50 | 36.74 | 35.80 | 36.17 | 2824918 | 1.021891e+08 | 25171195.53 | 25171195.53 | 60.800657 |
2019-01-23 | 600600 | 36.00 | 36.68 | 35.81 | 36.30 | 2863203 | 1.040513e+08 | 25261664.30 | 25261664.30 | 61.019183 |
2019-01-24 | 600600 | 36.22 | 36.81 | 36.04 | 36.58 | 2333696 | 8.502156e+07 | 25456520.11 | 25456520.11 | 61.489854 |
2019-01-25 | 600600 | 36.57 | 37.56 | 36.16 | 36.83 | 3176591 | 1.168643e+08 | 25630498.51 | 25630498.51 | 61.910096 |
2019-01-28 | 600600 | 37.00 | 37.94 | 36.60 | 36.96 | 4363808 | 1.625587e+08 | 25720967.28 | 25720967.28 | 62.128622 |
2019-01-29 | 600600 | 36.80 | 38.38 | 36.10 | 38.13 | 4956990 | 1.859601e+08 | 26535186.22 | 26535186.22 | 64.095356 |
2019-01-30 | 600600 | 37.54 | 38.68 | 37.16 | 37.19 | 3985083 | 1.509411e+08 | 25881027.42 | 25881027.42 | 62.515245 |
2019-01-31 | 600600 | 37.05 | 38.12 | 37.05 | 37.73 | 2556359 | 9.631578e+07 | 26256820.77 | 26256820.77 | 63.422968 |
2019-02-01 | 600600 | 37.77 | 38.20 | 37.50 | 38.10 | 3193602 | 1.211643e+08 | 26514308.81 | 26514308.81 | 64.044927 |
2019-02-11 | 600600 | 38.10 | 39.00 | 37.50 | 38.70 | 4417674 | 1.690389e+08 | 26931856.98 | 26931856.98 | 65.053509 |
2019-02-12 | 600600 | 38.67 | 38.70 | 38.10 | 38.45 | 4249343 | 1.627973e+08 | 26757878.57 | 26757878.57 | 64.633267 |
2019-02-13 | 600600 | 38.40 | 38.70 | 38.25 | 38.55 | 3804185 | 1.463872e+08 | 26827469.94 | 26827469.94 | 64.801364 |
2019-02-14 | 600600 | 38.48 | 40.24 | 38.38 | 39.46 | 4301795 | 1.699792e+08 | 27460751.33 | 27460751.33 | 66.331046 |
2019-02-15 | 600600 | 39.39 | 39.39 | 38.20 | 38.33 | 3445361 | 1.336287e+08 | 26674368.94 | 26674368.94 | 64.431551 |
2019-02-18 | 600600 | 38.38 | 39.51 | 38.35 | 39.06 | 3449336 | 1.341243e+08 | 27182385.88 | 27182385.88 | 65.658659 |
2019-02-19 | 600600 | 39.14 | 39.37 | 37.55 | 38.20 | 5970855 | 2.285474e+08 | 26583900.17 | 26583900.17 | 64.213025 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2020-08-31 | 600600 | 95.44 | 99.50 | 87.36 | 87.36 | 16336429 | 1.485624e+09 | 60795013.58 | 61948165.58 | 149.643013 |
2020-09-01 | 600600 | 87.36 | 87.66 | 82.50 | 86.30 | 12650742 | 1.073644e+09 | 60057345.15 | 61196505.15 | 147.827290 |
2020-09-02 | 600600 | 85.28 | 86.63 | 83.75 | 84.45 | 7235267 | 6.149912e+08 | 58769904.96 | 59884644.96 | 144.658339 |
2020-09-03 | 600600 | 84.45 | 86.86 | 83.55 | 85.34 | 6967213 | 5.962448e+08 | 59389268.07 | 60515756.07 | 146.182861 |
2020-09-04 | 600600 | 83.50 | 83.56 | 79.61 | 81.50 | 9595196 | 7.745201e+08 | 56716959.79 | 57792759.79 | 139.605146 |
2020-09-07 | 600600 | 80.57 | 81.30 | 77.40 | 78.40 | 7119585 | 5.663550e+08 | 54559627.57 | 55594507.57 | 134.295012 |
2020-09-08 | 600600 | 78.88 | 79.29 | 74.55 | 77.01 | 8062892 | 6.141779e+08 | 53592307.65 | 54608839.65 | 131.914016 |
2020-09-09 | 600600 | 75.88 | 76.19 | 73.70 | 74.98 | 4712071 | 3.520896e+08 | 52179603.00 | 53169339.00 | 128.436734 |
2020-09-10 | 600600 | 76.00 | 76.99 | 74.77 | 75.24 | 5188079 | 3.926766e+08 | 52360540.54 | 53353708.54 | 128.882100 |
2020-09-11 | 600600 | 74.68 | 77.11 | 73.82 | 76.77 | 4146573 | 3.145263e+08 | 53425288.38 | 54438652.38 | 131.502908 |
2020-09-14 | 600600 | 77.50 | 80.08 | 76.88 | 77.95 | 7192602 | 5.670469e+08 | 54246466.45 | 55275406.45 | 133.524185 |
2020-09-15 | 600600 | 77.77 | 79.50 | 76.23 | 78.85 | 4127403 | 3.231186e+08 | 54872788.70 | 55913608.70 | 135.065837 |
2020-09-16 | 600600 | 78.65 | 80.17 | 77.67 | 78.65 | 3399733 | 2.676203e+08 | 54733605.98 | 55771785.98 | 134.723248 |
2020-09-17 | 600600 | 78.00 | 78.00 | 75.49 | 75.90 | 5078531 | 3.865731e+08 | 52819843.53 | 53821723.53 | 130.012645 |
2020-09-18 | 600600 | 75.77 | 76.21 | 74.26 | 76.21 | 4966692 | 3.739078e+08 | 53035576.75 | 54041548.75 | 130.543658 |
2020-09-21 | 600600 | 76.20 | 76.20 | 74.28 | 74.92 | 4789182 | 3.579874e+08 | 52137848.19 | 53126792.19 | 128.333957 |
2020-09-22 | 600600 | 74.00 | 74.99 | 73.50 | 74.30 | 3666905 | 2.729507e+08 | 51706381.74 | 52687141.74 | 127.271930 |
2020-09-23 | 600600 | 74.30 | 74.64 | 73.36 | 74.15 | 3852423 | 2.854876e+08 | 51601994.70 | 52580774.70 | 127.014988 |
2020-09-24 | 600600 | 74.15 | 74.35 | 71.40 | 72.41 | 5948893 | 4.317549e+08 | 50391105.01 | 51346917.01 | 124.034461 |
2020-09-25 | 600600 | 72.41 | 73.67 | 70.00 | 73.51 | 6215862 | 4.502747e+08 | 51156609.99 | 52126941.99 | 125.918702 |
2020-09-28 | 600600 | 73.00 | 73.50 | 71.17 | 71.49 | 5088792 | 3.657537e+08 | 49750864.48 | 50694532.48 | 122.458550 |
2020-09-29 | 600600 | 71.98 | 72.50 | 71.17 | 72.08 | 4640643 | 3.343638e+08 | 50161453.51 | 51112909.51 | 123.469188 |
2020-09-30 | 600600 | 72.50 | 76.08 | 72.21 | 75.11 | 8449161 | 6.319768e+08 | 52270071.77 | 53261523.77 | 128.659416 |
2020-10-09 | 600600 | 75.83 | 76.81 | 74.37 | 74.62 | 6983298 | 5.230091e+08 | 51929074.10 | 52914058.10 | 127.820072 |
2020-10-12 | 600600 | 74.50 | 75.19 | 73.17 | 75.10 | 8951159 | 6.663092e+08 | 52263112.64 | 53254432.64 | 128.642286 |
2020-10-13 | 600600 | 74.85 | 76.70 | 74.08 | 76.49 | 7075436 | 5.376659e+08 | 53230432.56 | 54240100.56 | 131.023282 |
2020-10-14 | 600600 | 75.39 | 76.45 | 72.96 | 74.15 | 9407050 | 6.974287e+08 | 51601994.70 | 52580774.70 | 127.014987 |
2020-10-15 | 600600 | 74.15 | 76.60 | 73.37 | 76.00 | 6741385 | 5.082867e+08 | 52889434.89 | 53892634.89 | 130.183938 |
2020-10-16 | 600600 | 75.44 | 76.88 | 75.02 | 75.40 | 4823905 | 3.662679e+08 | 52471886.72 | 53467166.72 | 129.156170 |
2020-10-19 | 600600 | 76.16 | 76.16 | 74.00 | 74.53 | 3630250 | 2.712269e+08 | 51866441.88 | 52850237.88 | 127.665907 |
434 rows × 10 columns
df['Clsprc'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb2a162940>
取上证指数(000001.SH)从20160101到20201019的开盘价、最高价、最低价、收盘价
画出以上收盘价折线图
求所有天的日收益率,并画出直方图
df=pd.read_csv('/resources/data/IDX_Idxtrd.csv')
df.columns=['Code','Date','Open','High','Low','Close']
#df=df.rename(columns={'Idxtrd05':'Close'})
df=df.set_index('Date')
df
Code | Open | High | Low | Close | |
---|---|---|---|---|---|
Date | |||||
2016-01-04 | 1 | 3536.59 | 3538.69 | 3295.74 | 3296.26 |
2016-01-05 | 1 | 3196.65 | 3328.14 | 3189.61 | 3287.71 |
2016-01-06 | 1 | 3291.20 | 3362.97 | 3288.93 | 3361.84 |
2016-01-07 | 1 | 3309.66 | 3309.66 | 3115.89 | 3125.00 |
2016-01-08 | 1 | 3194.63 | 3235.45 | 3056.88 | 3186.41 |
2016-01-11 | 1 | 3131.85 | 3166.22 | 3016.70 | 3016.70 |
2016-01-12 | 1 | 3026.16 | 3047.66 | 2978.46 | 3022.86 |
2016-01-13 | 1 | 3041.11 | 3059.02 | 2949.29 | 2949.60 |
2016-01-14 | 1 | 2874.05 | 3012.29 | 2867.55 | 3007.65 |
2016-01-15 | 1 | 2988.05 | 3001.71 | 2883.87 | 2900.97 |
2016-01-18 | 1 | 2847.54 | 2945.45 | 2844.70 | 2913.84 |
2016-01-19 | 1 | 2914.41 | 3012.07 | 2906.40 | 3007.74 |
2016-01-20 | 1 | 2993.01 | 3016.28 | 2951.92 | 2976.69 |
2016-01-21 | 1 | 2934.39 | 2998.79 | 2880.09 | 2880.48 |
2016-01-22 | 1 | 2911.11 | 2931.36 | 2851.73 | 2916.56 |
2016-01-25 | 1 | 2934.08 | 2955.78 | 2911.83 | 2938.52 |
2016-01-26 | 1 | 2907.72 | 2911.99 | 2743.84 | 2749.79 |
2016-01-27 | 1 | 2756.08 | 2768.77 | 2638.30 | 2735.56 |
2016-01-28 | 1 | 2711.16 | 2740.54 | 2647.49 | 2655.66 |
2016-01-29 | 1 | 2652.85 | 2755.37 | 2649.79 | 2737.60 |
2016-02-01 | 1 | 2730.98 | 2735.26 | 2655.62 | 2688.85 |
2016-02-02 | 1 | 2687.98 | 2755.16 | 2687.98 | 2749.57 |
2016-02-03 | 1 | 2719.57 | 2746.07 | 2696.88 | 2739.25 |
2016-02-04 | 1 | 2751.43 | 2793.30 | 2751.31 | 2781.02 |
2016-02-05 | 1 | 2783.08 | 2790.06 | 2762.16 | 2763.49 |
2016-02-15 | 1 | 2684.96 | 2760.36 | 2682.09 | 2746.20 |
2016-02-16 | 1 | 2758.58 | 2840.62 | 2758.58 | 2836.57 |
2016-02-17 | 1 | 2829.76 | 2868.70 | 2824.36 | 2867.34 |
2016-02-18 | 1 | 2881.78 | 2893.22 | 2857.70 | 2862.89 |
2016-02-19 | 1 | 2854.90 | 2872.72 | 2840.49 | 2860.02 |
... | ... | ... | ... | ... | ... |
2020-08-28 | 1 | 3346.29 | 3405.88 | 3339.65 | 3403.81 |
2020-08-31 | 1 | 3416.55 | 3442.74 | 3395.47 | 3395.68 |
2020-09-01 | 1 | 3389.74 | 3410.61 | 3381.71 | 3410.61 |
2020-09-02 | 1 | 3420.47 | 3421.40 | 3377.21 | 3404.80 |
2020-09-03 | 1 | 3404.03 | 3425.63 | 3374.26 | 3384.98 |
2020-09-04 | 1 | 3336.41 | 3360.11 | 3328.55 | 3355.37 |
2020-09-07 | 1 | 3349.92 | 3368.25 | 3285.63 | 3292.59 |
2020-09-08 | 1 | 3301.22 | 3324.39 | 3276.44 | 3316.42 |
2020-09-09 | 1 | 3281.00 | 3289.62 | 3238.56 | 3254.63 |
2020-09-10 | 1 | 3282.23 | 3285.47 | 3227.64 | 3234.82 |
2020-09-11 | 1 | 3225.78 | 3262.50 | 3220.54 | 3260.35 |
2020-09-14 | 1 | 3275.92 | 3283.53 | 3261.15 | 3278.81 |
2020-09-15 | 1 | 3277.13 | 3297.69 | 3263.75 | 3295.68 |
2020-09-16 | 1 | 3293.17 | 3302.46 | 3271.08 | 3283.92 |
2020-09-17 | 1 | 3277.32 | 3290.41 | 3248.50 | 3270.44 |
2020-09-18 | 1 | 3270.91 | 3338.32 | 3268.53 | 3338.09 |
2020-09-21 | 1 | 3348.90 | 3350.59 | 3313.07 | 3316.94 |
2020-09-22 | 1 | 3290.67 | 3320.23 | 3265.70 | 3274.30 |
2020-09-23 | 1 | 3278.84 | 3289.76 | 3264.89 | 3279.71 |
2020-09-24 | 1 | 3262.88 | 3265.35 | 3221.24 | 3223.18 |
2020-09-25 | 1 | 3234.37 | 3239.47 | 3208.04 | 3219.42 |
2020-09-28 | 1 | 3224.98 | 3238.18 | 3210.89 | 3217.54 |
2020-09-29 | 1 | 3231.86 | 3242.79 | 3219.79 | 3224.36 |
2020-09-30 | 1 | 3232.71 | 3244.91 | 3202.34 | 3218.05 |
2020-10-09 | 1 | 3262.61 | 3280.51 | 3260.19 | 3272.08 |
2020-10-12 | 1 | 3287.33 | 3359.15 | 3286.11 | 3358.47 |
2020-10-13 | 1 | 3353.12 | 3361.83 | 3334.50 | 3359.75 |
2020-10-14 | 1 | 3353.63 | 3353.63 | 3332.96 | 3340.78 |
2020-10-15 | 1 | 3342.92 | 3354.58 | 3330.00 | 3332.18 |
2020-10-16 | 1 | 3334.46 | 3348.95 | 3319.11 | 3336.36 |
1164 rows × 5 columns
df['Close'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb29f5eef0>
import numpy as np
df['Return']=np.nan
df
Code | Open | High | Low | Close | Return | |
---|---|---|---|---|---|---|
Date | ||||||
2016-01-04 | 1 | 3536.59 | 3538.69 | 3295.74 | 3296.26 | NaN |
2016-01-05 | 1 | 3196.65 | 3328.14 | 3189.61 | 3287.71 | NaN |
2016-01-06 | 1 | 3291.20 | 3362.97 | 3288.93 | 3361.84 | NaN |
2016-01-07 | 1 | 3309.66 | 3309.66 | 3115.89 | 3125.00 | NaN |
2016-01-08 | 1 | 3194.63 | 3235.45 | 3056.88 | 3186.41 | NaN |
2016-01-11 | 1 | 3131.85 | 3166.22 | 3016.70 | 3016.70 | NaN |
2016-01-12 | 1 | 3026.16 | 3047.66 | 2978.46 | 3022.86 | NaN |
2016-01-13 | 1 | 3041.11 | 3059.02 | 2949.29 | 2949.60 | NaN |
2016-01-14 | 1 | 2874.05 | 3012.29 | 2867.55 | 3007.65 | NaN |
2016-01-15 | 1 | 2988.05 | 3001.71 | 2883.87 | 2900.97 | NaN |
2016-01-18 | 1 | 2847.54 | 2945.45 | 2844.70 | 2913.84 | NaN |
2016-01-19 | 1 | 2914.41 | 3012.07 | 2906.40 | 3007.74 | NaN |
2016-01-20 | 1 | 2993.01 | 3016.28 | 2951.92 | 2976.69 | NaN |
2016-01-21 | 1 | 2934.39 | 2998.79 | 2880.09 | 2880.48 | NaN |
2016-01-22 | 1 | 2911.11 | 2931.36 | 2851.73 | 2916.56 | NaN |
2016-01-25 | 1 | 2934.08 | 2955.78 | 2911.83 | 2938.52 | NaN |
2016-01-26 | 1 | 2907.72 | 2911.99 | 2743.84 | 2749.79 | NaN |
2016-01-27 | 1 | 2756.08 | 2768.77 | 2638.30 | 2735.56 | NaN |
2016-01-28 | 1 | 2711.16 | 2740.54 | 2647.49 | 2655.66 | NaN |
2016-01-29 | 1 | 2652.85 | 2755.37 | 2649.79 | 2737.60 | NaN |
2016-02-01 | 1 | 2730.98 | 2735.26 | 2655.62 | 2688.85 | NaN |
2016-02-02 | 1 | 2687.98 | 2755.16 | 2687.98 | 2749.57 | NaN |
2016-02-03 | 1 | 2719.57 | 2746.07 | 2696.88 | 2739.25 | NaN |
2016-02-04 | 1 | 2751.43 | 2793.30 | 2751.31 | 2781.02 | NaN |
2016-02-05 | 1 | 2783.08 | 2790.06 | 2762.16 | 2763.49 | NaN |
2016-02-15 | 1 | 2684.96 | 2760.36 | 2682.09 | 2746.20 | NaN |
2016-02-16 | 1 | 2758.58 | 2840.62 | 2758.58 | 2836.57 | NaN |
2016-02-17 | 1 | 2829.76 | 2868.70 | 2824.36 | 2867.34 | NaN |
2016-02-18 | 1 | 2881.78 | 2893.22 | 2857.70 | 2862.89 | NaN |
2016-02-19 | 1 | 2854.90 | 2872.72 | 2840.49 | 2860.02 | NaN |
... | ... | ... | ... | ... | ... | ... |
2020-08-28 | 1 | 3346.29 | 3405.88 | 3339.65 | 3403.81 | NaN |
2020-08-31 | 1 | 3416.55 | 3442.74 | 3395.47 | 3395.68 | NaN |
2020-09-01 | 1 | 3389.74 | 3410.61 | 3381.71 | 3410.61 | NaN |
2020-09-02 | 1 | 3420.47 | 3421.40 | 3377.21 | 3404.80 | NaN |
2020-09-03 | 1 | 3404.03 | 3425.63 | 3374.26 | 3384.98 | NaN |
2020-09-04 | 1 | 3336.41 | 3360.11 | 3328.55 | 3355.37 | NaN |
2020-09-07 | 1 | 3349.92 | 3368.25 | 3285.63 | 3292.59 | NaN |
2020-09-08 | 1 | 3301.22 | 3324.39 | 3276.44 | 3316.42 | NaN |
2020-09-09 | 1 | 3281.00 | 3289.62 | 3238.56 | 3254.63 | NaN |
2020-09-10 | 1 | 3282.23 | 3285.47 | 3227.64 | 3234.82 | NaN |
2020-09-11 | 1 | 3225.78 | 3262.50 | 3220.54 | 3260.35 | NaN |
2020-09-14 | 1 | 3275.92 | 3283.53 | 3261.15 | 3278.81 | NaN |
2020-09-15 | 1 | 3277.13 | 3297.69 | 3263.75 | 3295.68 | NaN |
2020-09-16 | 1 | 3293.17 | 3302.46 | 3271.08 | 3283.92 | NaN |
2020-09-17 | 1 | 3277.32 | 3290.41 | 3248.50 | 3270.44 | NaN |
2020-09-18 | 1 | 3270.91 | 3338.32 | 3268.53 | 3338.09 | NaN |
2020-09-21 | 1 | 3348.90 | 3350.59 | 3313.07 | 3316.94 | NaN |
2020-09-22 | 1 | 3290.67 | 3320.23 | 3265.70 | 3274.30 | NaN |
2020-09-23 | 1 | 3278.84 | 3289.76 | 3264.89 | 3279.71 | NaN |
2020-09-24 | 1 | 3262.88 | 3265.35 | 3221.24 | 3223.18 | NaN |
2020-09-25 | 1 | 3234.37 | 3239.47 | 3208.04 | 3219.42 | NaN |
2020-09-28 | 1 | 3224.98 | 3238.18 | 3210.89 | 3217.54 | NaN |
2020-09-29 | 1 | 3231.86 | 3242.79 | 3219.79 | 3224.36 | NaN |
2020-09-30 | 1 | 3232.71 | 3244.91 | 3202.34 | 3218.05 | NaN |
2020-10-09 | 1 | 3262.61 | 3280.51 | 3260.19 | 3272.08 | NaN |
2020-10-12 | 1 | 3287.33 | 3359.15 | 3286.11 | 3358.47 | NaN |
2020-10-13 | 1 | 3353.12 | 3361.83 | 3334.50 | 3359.75 | NaN |
2020-10-14 | 1 | 3353.63 | 3353.63 | 3332.96 | 3340.78 | NaN |
2020-10-15 | 1 | 3342.92 | 3354.58 | 3330.00 | 3332.18 | NaN |
2020-10-16 | 1 | 3334.46 | 3348.95 | 3319.11 | 3336.36 | NaN |
1164 rows × 6 columns
for i in range(1,len(df)):
df['Return'][i]=df['Close'][i]/df['Close'][i-1]-1
df
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Code | Open | High | Low | Close | Return | |
---|---|---|---|---|---|---|
Date | ||||||
2016-01-04 | 1 | 3536.59 | 3538.69 | 3295.74 | 3296.26 | NaN |
2016-01-05 | 1 | 3196.65 | 3328.14 | 3189.61 | 3287.71 | -0.002594 |
2016-01-06 | 1 | 3291.20 | 3362.97 | 3288.93 | 3361.84 | 0.022548 |
2016-01-07 | 1 | 3309.66 | 3309.66 | 3115.89 | 3125.00 | -0.070450 |
2016-01-08 | 1 | 3194.63 | 3235.45 | 3056.88 | 3186.41 | 0.019651 |
2016-01-11 | 1 | 3131.85 | 3166.22 | 3016.70 | 3016.70 | -0.053261 |
2016-01-12 | 1 | 3026.16 | 3047.66 | 2978.46 | 3022.86 | 0.002042 |
2016-01-13 | 1 | 3041.11 | 3059.02 | 2949.29 | 2949.60 | -0.024235 |
2016-01-14 | 1 | 2874.05 | 3012.29 | 2867.55 | 3007.65 | 0.019681 |
2016-01-15 | 1 | 2988.05 | 3001.71 | 2883.87 | 2900.97 | -0.035470 |
2016-01-18 | 1 | 2847.54 | 2945.45 | 2844.70 | 2913.84 | 0.004436 |
2016-01-19 | 1 | 2914.41 | 3012.07 | 2906.40 | 3007.74 | 0.032226 |
2016-01-20 | 1 | 2993.01 | 3016.28 | 2951.92 | 2976.69 | -0.010323 |
2016-01-21 | 1 | 2934.39 | 2998.79 | 2880.09 | 2880.48 | -0.032321 |
2016-01-22 | 1 | 2911.11 | 2931.36 | 2851.73 | 2916.56 | 0.012526 |
2016-01-25 | 1 | 2934.08 | 2955.78 | 2911.83 | 2938.52 | 0.007529 |
2016-01-26 | 1 | 2907.72 | 2911.99 | 2743.84 | 2749.79 | -0.064226 |
2016-01-27 | 1 | 2756.08 | 2768.77 | 2638.30 | 2735.56 | -0.005175 |
2016-01-28 | 1 | 2711.16 | 2740.54 | 2647.49 | 2655.66 | -0.029208 |
2016-01-29 | 1 | 2652.85 | 2755.37 | 2649.79 | 2737.60 | 0.030855 |
2016-02-01 | 1 | 2730.98 | 2735.26 | 2655.62 | 2688.85 | -0.017808 |
2016-02-02 | 1 | 2687.98 | 2755.16 | 2687.98 | 2749.57 | 0.022582 |
2016-02-03 | 1 | 2719.57 | 2746.07 | 2696.88 | 2739.25 | -0.003753 |
2016-02-04 | 1 | 2751.43 | 2793.30 | 2751.31 | 2781.02 | 0.015249 |
2016-02-05 | 1 | 2783.08 | 2790.06 | 2762.16 | 2763.49 | -0.006303 |
2016-02-15 | 1 | 2684.96 | 2760.36 | 2682.09 | 2746.20 | -0.006257 |
2016-02-16 | 1 | 2758.58 | 2840.62 | 2758.58 | 2836.57 | 0.032907 |
2016-02-17 | 1 | 2829.76 | 2868.70 | 2824.36 | 2867.34 | 0.010848 |
2016-02-18 | 1 | 2881.78 | 2893.22 | 2857.70 | 2862.89 | -0.001552 |
2016-02-19 | 1 | 2854.90 | 2872.72 | 2840.49 | 2860.02 | -0.001002 |
... | ... | ... | ... | ... | ... | ... |
2020-08-28 | 1 | 3346.29 | 3405.88 | 3339.65 | 3403.81 | 0.016029 |
2020-08-31 | 1 | 3416.55 | 3442.74 | 3395.47 | 3395.68 | -0.002388 |
2020-09-01 | 1 | 3389.74 | 3410.61 | 3381.71 | 3410.61 | 0.004397 |
2020-09-02 | 1 | 3420.47 | 3421.40 | 3377.21 | 3404.80 | -0.001704 |
2020-09-03 | 1 | 3404.03 | 3425.63 | 3374.26 | 3384.98 | -0.005821 |
2020-09-04 | 1 | 3336.41 | 3360.11 | 3328.55 | 3355.37 | -0.008747 |
2020-09-07 | 1 | 3349.92 | 3368.25 | 3285.63 | 3292.59 | -0.018710 |
2020-09-08 | 1 | 3301.22 | 3324.39 | 3276.44 | 3316.42 | 0.007237 |
2020-09-09 | 1 | 3281.00 | 3289.62 | 3238.56 | 3254.63 | -0.018632 |
2020-09-10 | 1 | 3282.23 | 3285.47 | 3227.64 | 3234.82 | -0.006087 |
2020-09-11 | 1 | 3225.78 | 3262.50 | 3220.54 | 3260.35 | 0.007892 |
2020-09-14 | 1 | 3275.92 | 3283.53 | 3261.15 | 3278.81 | 0.005662 |
2020-09-15 | 1 | 3277.13 | 3297.69 | 3263.75 | 3295.68 | 0.005145 |
2020-09-16 | 1 | 3293.17 | 3302.46 | 3271.08 | 3283.92 | -0.003568 |
2020-09-17 | 1 | 3277.32 | 3290.41 | 3248.50 | 3270.44 | -0.004105 |
2020-09-18 | 1 | 3270.91 | 3338.32 | 3268.53 | 3338.09 | 0.020685 |
2020-09-21 | 1 | 3348.90 | 3350.59 | 3313.07 | 3316.94 | -0.006336 |
2020-09-22 | 1 | 3290.67 | 3320.23 | 3265.70 | 3274.30 | -0.012855 |
2020-09-23 | 1 | 3278.84 | 3289.76 | 3264.89 | 3279.71 | 0.001652 |
2020-09-24 | 1 | 3262.88 | 3265.35 | 3221.24 | 3223.18 | -0.017236 |
2020-09-25 | 1 | 3234.37 | 3239.47 | 3208.04 | 3219.42 | -0.001167 |
2020-09-28 | 1 | 3224.98 | 3238.18 | 3210.89 | 3217.54 | -0.000584 |
2020-09-29 | 1 | 3231.86 | 3242.79 | 3219.79 | 3224.36 | 0.002120 |
2020-09-30 | 1 | 3232.71 | 3244.91 | 3202.34 | 3218.05 | -0.001957 |
2020-10-09 | 1 | 3262.61 | 3280.51 | 3260.19 | 3272.08 | 0.016790 |
2020-10-12 | 1 | 3287.33 | 3359.15 | 3286.11 | 3358.47 | 0.026402 |
2020-10-13 | 1 | 3353.12 | 3361.83 | 3334.50 | 3359.75 | 0.000381 |
2020-10-14 | 1 | 3353.63 | 3353.63 | 3332.96 | 3340.78 | -0.005646 |
2020-10-15 | 1 | 3342.92 | 3354.58 | 3330.00 | 3332.18 | -0.002574 |
2020-10-16 | 1 | 3334.46 | 3348.95 | 3319.11 | 3336.36 | 0.001254 |
1164 rows × 6 columns
df['Return']=df['Close']/df['Close'].shift()-1
df['Return'].hist()
<matplotlib.axes._subplots.AxesSubplot at 0x7fcb29de95c0>
以下为数据查看常用的八项操作:
1.查看前几条数据:
price.head(10)
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.18 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.30 |
2017-01-06 | 16.30 | 16.30 | 16.13 | 16.18 |
2017-01-09 | 16.24 | 16.29 | 16.13 | 16.20 |
2017-01-10 | 16.18 | 16.24 | 16.14 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.15 | 16.16 |
2017-01-12 | 16.18 | 16.20 | 16.11 | 16.12 |
2017-01-13 | 16.10 | 16.29 | 16.10 | 16.27 |
2017-01-16 | 16.23 | 16.60 | 16.10 | 16.56 |
2.查看后几条数据:
price.tail(10)
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-09-04 | 12.78 | 12.84 | 12.65 | 12.78 |
2017-09-05 | 12.78 | 13.09 | 12.78 | 13.03 |
2017-09-06 | 12.99 | 13.05 | 12.90 | 12.96 |
2017-09-07 | 12.94 | 13.02 | 12.82 | 12.85 |
2017-09-08 | 12.85 | 13.12 | 12.83 | 13.03 |
2017-09-11 | 13.15 | 13.15 | 12.95 | 12.99 |
2017-09-12 | 13.00 | 13.04 | 12.91 | 13.04 |
2017-09-13 | 13.01 | 13.01 | 12.86 | 12.87 |
2017-09-14 | 12.88 | 12.98 | 12.81 | 12.85 |
2017-09-15 | 12.82 | 12.86 | 12.75 | 12.83 |
3.查看 DataFrame 的索引
price.index
Index(['2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06', '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12', '2017-01-13', '2017-01-16', ... '2017-09-04', '2017-09-05', '2017-09-06', '2017-09-07', '2017-09-08', '2017-09-11', '2017-09-12', '2017-09-13', '2017-09-14', '2017-09-15'], dtype='object', name='date', length=174)
4.查看 DataFrame 的列名
price.columns
Index(['open', 'high', 'low', 'close'], dtype='object')
5.查看 DataFrame 的值
price.values
array([[16.21, 16.44, 16.17, 16.3 ], [16.29, 16.35, 16.18, 16.33], [16.3 , 16.38, 16.24, 16.3 ], [16.3 , 16.3 , 16.13, 16.18], [16.24, 16.29, 16.13, 16.2 ], [16.18, 16.24, 16.14, 16.19], [16.24, 16.24, 16.15, 16.16], [16.18, 16.2 , 16.11, 16.12], [16.1 , 16.29, 16.1 , 16.27], [16.23, 16.6 , 16.1 , 16.56], [16.46, 16.54, 16.37, 16.4 ], [16.42, 16.55, 16.36, 16.48], [16.43, 16.64, 16.43, 16.54], [16.58, 16.66, 16.5 , 16.6 ], [16.66, 16.69, 16.51, 16.57], [16.58, 16.7 , 16.58, 16.69], [16.69, 16.74, 16.61, 16.69], [16.69, 16.84, 16.61, 16.74], [16.82, 16.85, 16.62, 16.63], [16.75, 16.78, 16.66, 16.66], [16.65, 16.74, 16.63, 16.67], [16.63, 16.69, 16.54, 16.67], [16.68, 16.77, 16.59, 16.72], [16.76, 16.84, 16.7 , 16.78], [16.88, 16.9 , 16.78, 16.85], [16.87, 16.88, 16.74, 16.75], [16.82, 16.93, 16.75, 16.84], [16.88, 16.91, 16.76, 16.78], [16.78, 16.82, 16.59, 16.64], [16.62, 16.98, 16.62, 16.91], [16.88, 17.08, 16.81, 16.88], [16.88, 16.9 , 16.73, 16.75], [16.78, 16.83, 16.64, 16.69], [16.67, 16.75, 16.66, 16.71], [16.69, 16.72, 16.53, 16.59], [16.58, 16.67, 16.53, 16.59], [16.58, 16.62, 16.52, 16.56], [16.62, 16.62, 16.39, 16.41], [16.42, 16.43, 16.32, 16.35], [16.37, 16.53, 16.35, 16.41], [16.38, 16.45, 16.37, 16.43], [16.4 , 16.44, 16.35, 16.39], [16.37, 16.4 , 16.22, 16.22], [16.23, 16.28, 16.17, 16.23], [16.23, 16.34, 16.16, 16.34], [16.34, 16.35, 16.24, 16.26], [16.24, 16.28, 16.17, 16.24], [16.27, 16.35, 16.25, 16.29], [16.3 , 16.32, 16.14, 16.2 ], [16.24, 16.25, 16.13, 16.14], [16.15, 16.16, 15.98, 16. ], [15.98, 15.98, 15.7 , 15.78], [15.8 , 15.92, 15.79, 15.88], [15.85, 16. , 15.83, 15.96], [15.97, 16.13, 15.9 , 16.04], [16.11, 16.13, 15.97, 16.01], [16.01, 16.11, 15.8 , 15.87], [15.88, 15.9 , 15.72, 15.81], [15.78, 16.05, 15.77, 16.01], [16.05, 16.2 , 15.89, 16.16], [16.09, 16.18, 16.03, 16.1 ], [16.06, 16.16, 16.02, 16.1 ], [16.12, 16.12, 15.94, 15.97], [15.97, 15.99, 15.81, 15.89], [15.88, 15.96, 15.75, 15.89], [15.88, 15.89, 15.73, 15.75], [15.76, 15.78, 15.42, 15.51], [15.53, 15.6 , 15.3 , 15.56], [15.5 , 15.52, 15.3 , 15.3 ], [15.25, 15.27, 15.11, 15.11], [15.12, 15.14, 14.9 , 14.91], [14.92, 15.07, 14.85, 15.05], [15.05, 15.11, 14.91, 15. ], [15.02, 15.1 , 14.99, 15.05], [15.06, 15.11, 15. , 15.05], [15.05, 15.25, 15.03, 15.21], [15.15, 15.22, 15.08, 15.21], [15.21, 15.22, 15.13, 15.16], [15.16, 15.16, 15.05, 15.08], [15.07, 15.07, 14.9 , 14.98], [14.95, 14.98, 14.52, 14.92], [14.78, 14.9 , 14.51, 14.86], [14.69, 14.84, 14.66, 14.76], [14.76, 14.85, 14.55, 14.61], [14.57, 14.79, 14.55, 14.75], [14.75, 15.23, 14.73, 15.21], [15.27, 15.43, 15.16, 15.27], [15.23, 15.29, 15.08, 15.27], [15.24, 15.26, 15.07, 15.21], [15.1 , 15.14, 14.95, 15.01], [15.04, 15.08, 14.91, 15.02], [15.02, 15.06, 14.93, 15.04], [15. , 15.57, 14.98, 15.43], [15.38, 15.52, 15.21, 15.47], [11.75, 12.93, 11.72, 12.93], [12.81, 12.91, 12.54, 12.84], [12.75, 13.05, 12.66, 12.84], [12.78, 12.97, 12.69, 12.92], [12.93, 13.01, 12.74, 12.89], [12.79, 12.88, 12.51, 12.85], [12.8 , 12.84, 12.66, 12.8 ], [12.81, 12.99, 12.78, 12.84], [12.76, 12.86, 12.67, 12.84], [12.79, 13.2 , 12.77, 12.93], [12.9 , 13.14, 12.76, 12.78], [12.78, 12.78, 12.56, 12.59], [12.6 , 12.64, 12.31, 12.46], [12.41, 12.46, 12.28, 12.38], [12.38, 12.46, 12.27, 12.35], [12.36, 12.43, 12.3 , 12.4 ], [12.41, 12.44, 12.22, 12.29], [12.4 , 12.4 , 12.26, 12.34], [12.33, 12.84, 12.29, 12.57], [12.55, 12.65, 12.36, 12.64], [12.64, 12.82, 12.55, 12.64], [12.64, 12.73, 12.53, 12.7 ], [12.66, 12.91, 12.6 , 12.79], [12.81, 12.86, 12.7 , 12.76], [12.7 , 12.73, 12.59, 12.65], [12.64, 12.65, 12.47, 12.56], [12.55, 12.58, 12.41, 12.55], [12.5 , 12.65, 12.47, 12.62], [12.62, 12.72, 12.51, 12.66], [12.62, 12.69, 12.55, 12.6 ], [12.59, 12.64, 12.52, 12.53], [12.51, 12.94, 12.51, 12.8 ], [12.8 , 13.06, 12.74, 12.86], [12.82, 13.29, 12.77, 13.27], [13.34, 13.8 , 13.34, 13.59], [13.62, 14.02, 13.28, 13.76], [13.65, 13.72, 13.4 , 13.55], [13.56, 13.78, 13.48, 13.69], [13.67, 13.7 , 13.53, 13.62], [13.61, 13.63, 13.38, 13.45], [13.45, 13.74, 13.36, 13.67], [13.62, 13.75, 13.41, 13.51], [13.45, 13.7 , 13.36, 13.38], [13.38, 13.47, 13.28, 13.38], [13.37, 13.49, 13.3 , 13.47], [13.45, 13.52, 13.31, 13.36], [13.42, 13.49, 13.32, 13.43], [13.44, 13.6 , 13.39, 13.44], [13.42, 13.42, 13.04, 13.08], [13.09, 13.1 , 12.85, 12.87], [12.87, 12.95, 12.83, 12.93], [12.91, 12.93, 12.83, 12.87], [12.84, 12.88, 12.75, 12.76], [12.74, 12.88, 12.7 , 12.79], [12.72, 12.73, 12.46, 12.68], [12.62, 12.63, 12.52, 12.56], [12.58, 12.76, 12.56, 12.59], [12.56, 12.56, 12.47, 12.49], [12.49, 12.55, 12.47, 12.53], [12.5 , 12.57, 12.46, 12.51], [12.5 , 12.57, 12.46, 12.51], [12.51, 12.53, 12.41, 12.43], [12.44, 12.66, 12.41, 12.57], [12.58, 12.66, 12.45, 12.47], [12.47, 12.8 , 12.47, 12.78], [12.83, 12.96, 12.75, 12.89], [12.85, 12.98, 12.77, 12.96], [12.91, 12.98, 12.82, 12.87], [12.83, 12.84, 12.65, 12.71], [12.68, 12.87, 12.68, 12.77], [12.78, 12.84, 12.65, 12.78], [12.78, 13.09, 12.78, 13.03], [12.99, 13.05, 12.9 , 12.96], [12.94, 13.02, 12.82, 12.85], [12.85, 13.12, 12.83, 13.03], [13.15, 13.15, 12.95, 12.99], [13. , 13.04, 12.91, 13.04], [13.01, 13.01, 12.86, 12.87], [12.88, 12.98, 12.81, 12.85], [12.82, 12.86, 12.75, 12.83]])
6.使用 describe() 函数对于数据的快速统计汇总:
price.describe()
open | high | low | close | |
---|---|---|---|---|
count | 174.000000 | 174.000000 | 174.000000 | 174.000000 |
mean | 14.545747 | 14.657241 | 14.436609 | 14.556034 |
std | 1.668836 | 1.642246 | 1.672180 | 1.648151 |
min | 11.750000 | 12.400000 | 11.720000 | 12.290000 |
25% | 12.810000 | 12.942500 | 12.740000 | 12.842500 |
50% | 14.975000 | 15.070000 | 14.875000 | 14.990000 |
75% | 16.230000 | 16.287500 | 16.137500 | 16.200000 |
max | 16.880000 | 17.080000 | 16.810000 | 16.910000 |
7.对数据的转置:
price.T
date | 2017-01-03 | 2017-01-04 | 2017-01-05 | 2017-01-06 | 2017-01-09 | 2017-01-10 | 2017-01-11 | 2017-01-12 | 2017-01-13 | 2017-01-16 | ... | 2017-09-04 | 2017-09-05 | 2017-09-06 | 2017-09-07 | 2017-09-08 | 2017-09-11 | 2017-09-12 | 2017-09-13 | 2017-09-14 | 2017-09-15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
open | 16.21 | 16.29 | 16.30 | 16.30 | 16.24 | 16.18 | 16.24 | 16.18 | 16.10 | 16.23 | ... | 12.78 | 12.78 | 12.99 | 12.94 | 12.85 | 13.15 | 13.00 | 13.01 | 12.88 | 12.82 |
high | 16.44 | 16.35 | 16.38 | 16.30 | 16.29 | 16.24 | 16.24 | 16.20 | 16.29 | 16.60 | ... | 12.84 | 13.09 | 13.05 | 13.02 | 13.12 | 13.15 | 13.04 | 13.01 | 12.98 | 12.86 |
low | 16.17 | 16.18 | 16.24 | 16.13 | 16.13 | 16.14 | 16.15 | 16.11 | 16.10 | 16.10 | ... | 12.65 | 12.78 | 12.90 | 12.82 | 12.83 | 12.95 | 12.91 | 12.86 | 12.81 | 12.75 |
close | 16.30 | 16.33 | 16.30 | 16.18 | 16.20 | 16.19 | 16.16 | 16.12 | 16.27 | 16.56 | ... | 12.78 | 13.03 | 12.96 | 12.85 | 13.03 | 12.99 | 13.04 | 12.87 | 12.85 | 12.83 |
4 rows × 174 columns
8.按列对 DataFrame 进行排序
price.sort_values(by=['open','close'] , ascending=[False,True])
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-02-22 | 16.88 | 16.90 | 16.73 | 16.75 |
2017-02-16 | 16.88 | 16.91 | 16.76 | 16.78 |
2017-02-13 | 16.88 | 16.90 | 16.78 | 16.85 |
2017-02-21 | 16.88 | 17.08 | 16.81 | 16.88 |
2017-02-14 | 16.87 | 16.88 | 16.74 | 16.75 |
2017-02-03 | 16.82 | 16.85 | 16.62 | 16.63 |
2017-02-15 | 16.82 | 16.93 | 16.75 | 16.84 |
2017-02-17 | 16.78 | 16.82 | 16.59 | 16.64 |
2017-02-23 | 16.78 | 16.83 | 16.64 | 16.69 |
2017-02-10 | 16.76 | 16.84 | 16.70 | 16.78 |
2017-02-06 | 16.75 | 16.78 | 16.66 | 16.66 |
2017-02-27 | 16.69 | 16.72 | 16.53 | 16.59 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.61 | 16.74 |
2017-02-09 | 16.68 | 16.77 | 16.59 | 16.72 |
2017-02-24 | 16.67 | 16.75 | 16.66 | 16.71 |
2017-01-23 | 16.66 | 16.69 | 16.51 | 16.57 |
2017-02-07 | 16.65 | 16.74 | 16.63 | 16.67 |
2017-02-08 | 16.63 | 16.69 | 16.54 | 16.67 |
2017-03-02 | 16.62 | 16.62 | 16.39 | 16.41 |
2017-02-20 | 16.62 | 16.98 | 16.62 | 16.91 |
2017-03-01 | 16.58 | 16.62 | 16.52 | 16.56 |
2017-02-28 | 16.58 | 16.67 | 16.53 | 16.59 |
2017-01-20 | 16.58 | 16.66 | 16.50 | 16.60 |
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-17 | 16.46 | 16.54 | 16.37 | 16.40 |
2017-01-19 | 16.43 | 16.64 | 16.43 | 16.54 |
2017-03-03 | 16.42 | 16.43 | 16.32 | 16.35 |
2017-01-18 | 16.42 | 16.55 | 16.36 | 16.48 |
2017-03-08 | 16.40 | 16.44 | 16.35 | 16.39 |
... | ... | ... | ... | ... |
2017-09-01 | 12.68 | 12.87 | 12.68 | 12.77 |
2017-06-28 | 12.66 | 12.91 | 12.60 | 12.79 |
2017-07-03 | 12.64 | 12.65 | 12.47 | 12.56 |
2017-06-26 | 12.64 | 12.82 | 12.55 | 12.64 |
2017-06-27 | 12.64 | 12.73 | 12.53 | 12.70 |
2017-08-14 | 12.62 | 12.63 | 12.52 | 12.56 |
2017-07-07 | 12.62 | 12.69 | 12.55 | 12.60 |
2017-07-06 | 12.62 | 12.72 | 12.51 | 12.66 |
2017-06-14 | 12.60 | 12.64 | 12.31 | 12.46 |
2017-07-10 | 12.59 | 12.64 | 12.52 | 12.53 |
2017-08-24 | 12.58 | 12.66 | 12.45 | 12.47 |
2017-08-15 | 12.58 | 12.76 | 12.56 | 12.59 |
2017-08-16 | 12.56 | 12.56 | 12.47 | 12.49 |
2017-07-04 | 12.55 | 12.58 | 12.41 | 12.55 |
2017-06-23 | 12.55 | 12.65 | 12.36 | 12.64 |
2017-08-22 | 12.51 | 12.53 | 12.41 | 12.43 |
2017-07-11 | 12.51 | 12.94 | 12.51 | 12.80 |
2017-08-18 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-08-21 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-07-05 | 12.50 | 12.65 | 12.47 | 12.62 |
2017-08-17 | 12.49 | 12.55 | 12.47 | 12.53 |
2017-08-25 | 12.47 | 12.80 | 12.47 | 12.78 |
2017-08-23 | 12.44 | 12.66 | 12.41 | 12.57 |
2017-06-20 | 12.41 | 12.44 | 12.22 | 12.29 |
2017-06-15 | 12.41 | 12.46 | 12.28 | 12.38 |
2017-06-21 | 12.40 | 12.40 | 12.26 | 12.34 |
2017-06-16 | 12.38 | 12.46 | 12.27 | 12.35 |
2017-06-19 | 12.36 | 12.43 | 12.30 | 12.40 |
2017-06-22 | 12.33 | 12.84 | 12.29 | 12.57 |
2017-05-25 | 11.75 | 12.93 | 11.72 | 12.93 |
174 rows × 4 columns
注意:sort_values函数内置参数有by和ascending,by参数是排序指定列,ascending是排序顺序,False是从大到小,True是从小到大。
依旧采用上个小节案例,继续讲述选择数据的八项基本操作。
1.选择一列数据,选取开盘价这列数据:
price['open']
date 2017-01-03 16.21 2017-01-04 16.29 2017-01-05 16.30 2017-01-06 16.30 2017-01-09 16.24 ... 2017-09-11 13.15 2017-09-12 13.00 2017-09-13 13.01 2017-09-14 12.88 2017-09-15 12.82 Name: open, Length: 174, dtype: float64
price.open
date 2017-01-03 16.21 2017-01-04 16.29 2017-01-05 16.30 2017-01-06 16.30 2017-01-09 16.24 ... 2017-09-11 13.15 2017-09-12 13.00 2017-09-13 13.01 2017-09-14 12.88 2017-09-15 12.82 Name: open, Length: 174, dtype: float64
同学们动手试试price.open~
它与price['open']是等效的!
2.选择多列数据:
price[['open','high','close']]
open | high | close | |
---|---|---|---|
date | |||
2017-01-03 | 16.21 | 16.44 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.30 |
2017-01-06 | 16.30 | 16.30 | 16.18 |
2017-01-09 | 16.24 | 16.29 | 16.20 |
2017-01-10 | 16.18 | 16.24 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.16 |
2017-01-12 | 16.18 | 16.20 | 16.12 |
2017-01-13 | 16.10 | 16.29 | 16.27 |
2017-01-16 | 16.23 | 16.60 | 16.56 |
2017-01-17 | 16.46 | 16.54 | 16.40 |
2017-01-18 | 16.42 | 16.55 | 16.48 |
2017-01-19 | 16.43 | 16.64 | 16.54 |
2017-01-20 | 16.58 | 16.66 | 16.60 |
2017-01-23 | 16.66 | 16.69 | 16.57 |
2017-01-24 | 16.58 | 16.70 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.74 |
2017-02-03 | 16.82 | 16.85 | 16.63 |
2017-02-06 | 16.75 | 16.78 | 16.66 |
2017-02-07 | 16.65 | 16.74 | 16.67 |
2017-02-08 | 16.63 | 16.69 | 16.67 |
2017-02-09 | 16.68 | 16.77 | 16.72 |
2017-02-10 | 16.76 | 16.84 | 16.78 |
2017-02-13 | 16.88 | 16.90 | 16.85 |
2017-02-14 | 16.87 | 16.88 | 16.75 |
2017-02-15 | 16.82 | 16.93 | 16.84 |
2017-02-16 | 16.88 | 16.91 | 16.78 |
2017-02-17 | 16.78 | 16.82 | 16.64 |
2017-02-20 | 16.62 | 16.98 | 16.91 |
... | ... | ... | ... |
2017-08-07 | 12.87 | 12.95 | 12.93 |
2017-08-08 | 12.91 | 12.93 | 12.87 |
2017-08-09 | 12.84 | 12.88 | 12.76 |
2017-08-10 | 12.74 | 12.88 | 12.79 |
2017-08-11 | 12.72 | 12.73 | 12.68 |
2017-08-14 | 12.62 | 12.63 | 12.56 |
2017-08-15 | 12.58 | 12.76 | 12.59 |
2017-08-16 | 12.56 | 12.56 | 12.49 |
2017-08-17 | 12.49 | 12.55 | 12.53 |
2017-08-18 | 12.50 | 12.57 | 12.51 |
2017-08-21 | 12.50 | 12.57 | 12.51 |
2017-08-22 | 12.51 | 12.53 | 12.43 |
2017-08-23 | 12.44 | 12.66 | 12.57 |
2017-08-24 | 12.58 | 12.66 | 12.47 |
2017-08-25 | 12.47 | 12.80 | 12.78 |
2017-08-28 | 12.83 | 12.96 | 12.89 |
2017-08-29 | 12.85 | 12.98 | 12.96 |
2017-08-30 | 12.91 | 12.98 | 12.87 |
2017-08-31 | 12.83 | 12.84 | 12.71 |
2017-09-01 | 12.68 | 12.87 | 12.77 |
2017-09-04 | 12.78 | 12.84 | 12.78 |
2017-09-05 | 12.78 | 13.09 | 13.03 |
2017-09-06 | 12.99 | 13.05 | 12.96 |
2017-09-07 | 12.94 | 13.02 | 12.85 |
2017-09-08 | 12.85 | 13.12 | 13.03 |
2017-09-11 | 13.15 | 13.15 | 12.99 |
2017-09-12 | 13.00 | 13.04 | 13.04 |
2017-09-13 | 13.01 | 13.01 | 12.87 |
2017-09-14 | 12.88 | 12.98 | 12.85 |
2017-09-15 | 12.82 | 12.86 | 12.83 |
174 rows × 3 columns
注意:price[['open','close']]中['open','close']是一个由两个字符串(列名)组成的列表,会自动对应到整个DataFrame表结构中,获取到相应的数据。
同学们试试price['open','close'],看看能不能获取到数据~
3.选择多行:
price[0:3]
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.18 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.30 |
4.按index选取多行:
price['2017-01-24':'2017-02-25']
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.61 | 16.74 |
2017-02-03 | 16.82 | 16.85 | 16.62 | 16.63 |
2017-02-06 | 16.75 | 16.78 | 16.66 | 16.66 |
2017-02-07 | 16.65 | 16.74 | 16.63 | 16.67 |
2017-02-08 | 16.63 | 16.69 | 16.54 | 16.67 |
2017-02-09 | 16.68 | 16.77 | 16.59 | 16.72 |
2017-02-10 | 16.76 | 16.84 | 16.70 | 16.78 |
2017-02-13 | 16.88 | 16.90 | 16.78 | 16.85 |
2017-02-14 | 16.87 | 16.88 | 16.74 | 16.75 |
2017-02-15 | 16.82 | 16.93 | 16.75 | 16.84 |
2017-02-16 | 16.88 | 16.91 | 16.76 | 16.78 |
2017-02-17 | 16.78 | 16.82 | 16.59 | 16.64 |
2017-02-20 | 16.62 | 16.98 | 16.62 | 16.91 |
2017-02-21 | 16.88 | 17.08 | 16.81 | 16.88 |
2017-02-22 | 16.88 | 16.90 | 16.73 | 16.75 |
2017-02-23 | 16.78 | 16.83 | 16.64 | 16.69 |
2017-02-24 | 16.67 | 16.75 | 16.66 | 16.71 |
5.使用标签选取数据:
price.loc[行标签,列标签]
price.loc['a':'b'] #选取 ab 两行数据
price.loc[:,'open'] #选取 open 列的数据
price.loc 的第一个参数是行标签,第二个参数为列标签,两个参数既可以是列表也可以是单个字符,如果两个参数都为列表则返回的是 DataFrame,否则,则为 Series。
price.loc['2017-01-24','open']
16.58
price.loc['2017-01-24':'2017-01-25']
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
price.loc[:, 'open']
date 2017-01-03 16.21 2017-01-04 16.29 2017-01-05 16.30 2017-01-06 16.30 2017-01-09 16.24 ... 2017-09-11 13.15 2017-09-12 13.00 2017-09-13 13.01 2017-09-14 12.88 2017-09-15 12.82 Name: open, Length: 174, dtype: float64
price.loc['2017-01-24':'2017-02-25',['open','close']]
open | close | |
---|---|---|
date | ||
2017-01-24 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.69 |
2017-01-26 | 16.69 | 16.74 |
2017-02-03 | 16.82 | 16.63 |
2017-02-06 | 16.75 | 16.66 |
2017-02-07 | 16.65 | 16.67 |
2017-02-08 | 16.63 | 16.67 |
2017-02-09 | 16.68 | 16.72 |
2017-02-10 | 16.76 | 16.78 |
2017-02-13 | 16.88 | 16.85 |
2017-02-14 | 16.87 | 16.75 |
2017-02-15 | 16.82 | 16.84 |
2017-02-16 | 16.88 | 16.78 |
2017-02-17 | 16.78 | 16.64 |
2017-02-20 | 16.62 | 16.91 |
2017-02-21 | 16.88 | 16.88 |
2017-02-22 | 16.88 | 16.75 |
2017-02-23 | 16.78 | 16.69 |
2017-02-24 | 16.67 | 16.71 |
6..使用位置选取数据:
price.iloc[行位置,列位置]
price.iloc[1,1] #选取第二行,第二列的值,返回的为单个值
price.iloc[[0,2],:] #选取第一行及第三行的数据
price.iloc[0:2,:] #选取第一行到第三行(不包含)的数据
price.iloc[:,1] #选取所有记录的第二列的值,返回的为一个Series
price.iloc[1,:] #选取第一行数据,返回的为一个Series
price.iloc[1,1] # 选取第二行,第二列的值,返回的为单个值
16.35
price.iloc[[0,2],:] # 选取第一行及第三行的数据
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.3 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.3 |
price.iloc[0:20,:] # 选取第一行到第三行(不包含)的数据
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.18 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.30 |
2017-01-06 | 16.30 | 16.30 | 16.13 | 16.18 |
2017-01-09 | 16.24 | 16.29 | 16.13 | 16.20 |
2017-01-10 | 16.18 | 16.24 | 16.14 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.15 | 16.16 |
2017-01-12 | 16.18 | 16.20 | 16.11 | 16.12 |
2017-01-13 | 16.10 | 16.29 | 16.10 | 16.27 |
2017-01-16 | 16.23 | 16.60 | 16.10 | 16.56 |
2017-01-17 | 16.46 | 16.54 | 16.37 | 16.40 |
2017-01-18 | 16.42 | 16.55 | 16.36 | 16.48 |
2017-01-19 | 16.43 | 16.64 | 16.43 | 16.54 |
2017-01-20 | 16.58 | 16.66 | 16.50 | 16.60 |
2017-01-23 | 16.66 | 16.69 | 16.51 | 16.57 |
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.61 | 16.74 |
2017-02-03 | 16.82 | 16.85 | 16.62 | 16.63 |
2017-02-06 | 16.75 | 16.78 | 16.66 | 16.66 |
price.iloc[:,1:3] # 选取所有记录的第一列的值,返回的为一个Series
high | low | |
---|---|---|
date | ||
2017-01-03 | 16.44 | 16.17 |
2017-01-04 | 16.35 | 16.18 |
2017-01-05 | 16.38 | 16.24 |
2017-01-06 | 16.30 | 16.13 |
2017-01-09 | 16.29 | 16.13 |
2017-01-10 | 16.24 | 16.14 |
2017-01-11 | 16.24 | 16.15 |
2017-01-12 | 16.20 | 16.11 |
2017-01-13 | 16.29 | 16.10 |
2017-01-16 | 16.60 | 16.10 |
2017-01-17 | 16.54 | 16.37 |
2017-01-18 | 16.55 | 16.36 |
2017-01-19 | 16.64 | 16.43 |
2017-01-20 | 16.66 | 16.50 |
2017-01-23 | 16.69 | 16.51 |
2017-01-24 | 16.70 | 16.58 |
2017-01-25 | 16.74 | 16.61 |
2017-01-26 | 16.84 | 16.61 |
2017-02-03 | 16.85 | 16.62 |
2017-02-06 | 16.78 | 16.66 |
2017-02-07 | 16.74 | 16.63 |
2017-02-08 | 16.69 | 16.54 |
2017-02-09 | 16.77 | 16.59 |
2017-02-10 | 16.84 | 16.70 |
2017-02-13 | 16.90 | 16.78 |
2017-02-14 | 16.88 | 16.74 |
2017-02-15 | 16.93 | 16.75 |
2017-02-16 | 16.91 | 16.76 |
2017-02-17 | 16.82 | 16.59 |
2017-02-20 | 16.98 | 16.62 |
... | ... | ... |
2017-08-07 | 12.95 | 12.83 |
2017-08-08 | 12.93 | 12.83 |
2017-08-09 | 12.88 | 12.75 |
2017-08-10 | 12.88 | 12.70 |
2017-08-11 | 12.73 | 12.46 |
2017-08-14 | 12.63 | 12.52 |
2017-08-15 | 12.76 | 12.56 |
2017-08-16 | 12.56 | 12.47 |
2017-08-17 | 12.55 | 12.47 |
2017-08-18 | 12.57 | 12.46 |
2017-08-21 | 12.57 | 12.46 |
2017-08-22 | 12.53 | 12.41 |
2017-08-23 | 12.66 | 12.41 |
2017-08-24 | 12.66 | 12.45 |
2017-08-25 | 12.80 | 12.47 |
2017-08-28 | 12.96 | 12.75 |
2017-08-29 | 12.98 | 12.77 |
2017-08-30 | 12.98 | 12.82 |
2017-08-31 | 12.84 | 12.65 |
2017-09-01 | 12.87 | 12.68 |
2017-09-04 | 12.84 | 12.65 |
2017-09-05 | 13.09 | 12.78 |
2017-09-06 | 13.05 | 12.90 |
2017-09-07 | 13.02 | 12.82 |
2017-09-08 | 13.12 | 12.83 |
2017-09-11 | 13.15 | 12.95 |
2017-09-12 | 13.04 | 12.91 |
2017-09-13 | 13.01 | 12.86 |
2017-09-14 | 12.98 | 12.81 |
2017-09-15 | 12.86 | 12.75 |
174 rows × 2 columns
price.iloc[1,:] # 选取第一行数据,返回的为一个Series
open 16.29 high 16.35 low 16.18 close 16.33 Name: 2017-01-04, dtype: float64
7.更广义的切片方式是使用.ix,它自动根据给到的索引类型判断是使用位置还是标签进行切片
price.ix[1,1]
price.ix['a':'b']
price.ix[1,1]
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated """Entry point for launching an IPython kernel. /opt/conda/lib/python3.6/site-packages/pandas/core/indexing.py:947: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated return getattr(section, self.name)[new_key]
16.35
price.ix['2017-01-24':'2017-01-25']
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated """Entry point for launching an IPython kernel.
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
price.ix['2017-01-24','open']
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated """Entry point for launching an IPython kernel.
16.58
price.ix[1,'open']
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated """Entry point for launching an IPython kernel.
16.29
price.ix['2018-01-24',0]
764.46000000000004
8.通过逻辑指针进行数据切片:
price[逻辑条件]
price[price.one >= 2] #单个逻辑条件
price[(price.one >=1 ) & (df.one < 3) ] #多个逻辑条件组合
price.open>16.5
date 2017-01-03 False 2017-01-04 False 2017-01-05 False 2017-01-06 False 2017-01-09 False ... 2017-09-11 False 2017-09-12 False 2017-09-13 False 2017-09-14 False 2017-09-15 False Name: open, Length: 174, dtype: bool
#筛选出 open 大于 16.5的数据
price[price.open > 16.5][['open','close']]
open | close | |
---|---|---|
date | ||
2017-01-20 | 16.58 | 16.60 |
2017-01-23 | 16.66 | 16.57 |
2017-01-24 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.69 |
2017-01-26 | 16.69 | 16.74 |
2017-02-03 | 16.82 | 16.63 |
2017-02-06 | 16.75 | 16.66 |
2017-02-07 | 16.65 | 16.67 |
2017-02-08 | 16.63 | 16.67 |
2017-02-09 | 16.68 | 16.72 |
2017-02-10 | 16.76 | 16.78 |
2017-02-13 | 16.88 | 16.85 |
2017-02-14 | 16.87 | 16.75 |
2017-02-15 | 16.82 | 16.84 |
2017-02-16 | 16.88 | 16.78 |
2017-02-17 | 16.78 | 16.64 |
2017-02-20 | 16.62 | 16.91 |
2017-02-21 | 16.88 | 16.88 |
2017-02-22 | 16.88 | 16.75 |
2017-02-23 | 16.78 | 16.69 |
2017-02-24 | 16.67 | 16.71 |
2017-02-27 | 16.69 | 16.59 |
2017-02-28 | 16.58 | 16.59 |
2017-03-01 | 16.58 | 16.56 |
2017-03-02 | 16.62 | 16.41 |
print((price.open > 16.4))
print((price.close < 16.6))
date 2017-01-03 False 2017-01-04 False 2017-01-05 False 2017-01-06 False 2017-01-09 False ... 2017-09-11 False 2017-09-12 False 2017-09-13 False 2017-09-14 False 2017-09-15 False Name: open, Length: 174, dtype: bool date 2017-01-03 True 2017-01-04 True 2017-01-05 True 2017-01-06 True 2017-01-09 True ... 2017-09-11 True 2017-09-12 True 2017-09-13 True 2017-09-14 True 2017-09-15 True Name: close, Length: 174, dtype: bool
#筛选出 open 大于 16.4 的数据,并且 close 小于 16.6 的数据
price[(price.open > 16.4) & (price.close < 16.6)]
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-17 | 16.46 | 16.54 | 16.37 | 16.40 |
2017-01-18 | 16.42 | 16.55 | 16.36 | 16.48 |
2017-01-19 | 16.43 | 16.64 | 16.43 | 16.54 |
2017-01-23 | 16.66 | 16.69 | 16.51 | 16.57 |
2017-02-27 | 16.69 | 16.72 | 16.53 | 16.59 |
2017-02-28 | 16.58 | 16.67 | 16.53 | 16.59 |
2017-03-01 | 16.58 | 16.62 | 16.52 | 16.56 |
2017-03-02 | 16.62 | 16.62 | 16.39 | 16.41 |
2017-03-03 | 16.42 | 16.43 | 16.32 | 16.35 |
#使用 条件过来更改数据。
price[price>16.5]
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | NaN | NaN | NaN | NaN |
2017-01-04 | NaN | NaN | NaN | NaN |
2017-01-05 | NaN | NaN | NaN | NaN |
2017-01-06 | NaN | NaN | NaN | NaN |
2017-01-09 | NaN | NaN | NaN | NaN |
2017-01-10 | NaN | NaN | NaN | NaN |
2017-01-11 | NaN | NaN | NaN | NaN |
2017-01-12 | NaN | NaN | NaN | NaN |
2017-01-13 | NaN | NaN | NaN | NaN |
2017-01-16 | NaN | 16.60 | NaN | 16.56 |
2017-01-17 | NaN | 16.54 | NaN | NaN |
2017-01-18 | NaN | 16.55 | NaN | NaN |
2017-01-19 | NaN | 16.64 | NaN | 16.54 |
2017-01-20 | 16.58 | 16.66 | NaN | 16.60 |
2017-01-23 | 16.66 | 16.69 | 16.51 | 16.57 |
2017-01-24 | 16.58 | 16.70 | 16.58 | 16.69 |
2017-01-25 | 16.69 | 16.74 | 16.61 | 16.69 |
2017-01-26 | 16.69 | 16.84 | 16.61 | 16.74 |
2017-02-03 | 16.82 | 16.85 | 16.62 | 16.63 |
2017-02-06 | 16.75 | 16.78 | 16.66 | 16.66 |
2017-02-07 | 16.65 | 16.74 | 16.63 | 16.67 |
2017-02-08 | 16.63 | 16.69 | 16.54 | 16.67 |
2017-02-09 | 16.68 | 16.77 | 16.59 | 16.72 |
2017-02-10 | 16.76 | 16.84 | 16.70 | 16.78 |
2017-02-13 | 16.88 | 16.90 | 16.78 | 16.85 |
2017-02-14 | 16.87 | 16.88 | 16.74 | 16.75 |
2017-02-15 | 16.82 | 16.93 | 16.75 | 16.84 |
2017-02-16 | 16.88 | 16.91 | 16.76 | 16.78 |
2017-02-17 | 16.78 | 16.82 | 16.59 | 16.64 |
2017-02-20 | 16.62 | 16.98 | 16.62 | 16.91 |
... | ... | ... | ... | ... |
2017-08-07 | NaN | NaN | NaN | NaN |
2017-08-08 | NaN | NaN | NaN | NaN |
2017-08-09 | NaN | NaN | NaN | NaN |
2017-08-10 | NaN | NaN | NaN | NaN |
2017-08-11 | NaN | NaN | NaN | NaN |
2017-08-14 | NaN | NaN | NaN | NaN |
2017-08-15 | NaN | NaN | NaN | NaN |
2017-08-16 | NaN | NaN | NaN | NaN |
2017-08-17 | NaN | NaN | NaN | NaN |
2017-08-18 | NaN | NaN | NaN | NaN |
2017-08-21 | NaN | NaN | NaN | NaN |
2017-08-22 | NaN | NaN | NaN | NaN |
2017-08-23 | NaN | NaN | NaN | NaN |
2017-08-24 | NaN | NaN | NaN | NaN |
2017-08-25 | NaN | NaN | NaN | NaN |
2017-08-28 | NaN | NaN | NaN | NaN |
2017-08-29 | NaN | NaN | NaN | NaN |
2017-08-30 | NaN | NaN | NaN | NaN |
2017-08-31 | NaN | NaN | NaN | NaN |
2017-09-01 | NaN | NaN | NaN | NaN |
2017-09-04 | NaN | NaN | NaN | NaN |
2017-09-05 | NaN | NaN | NaN | NaN |
2017-09-06 | NaN | NaN | NaN | NaN |
2017-09-07 | NaN | NaN | NaN | NaN |
2017-09-08 | NaN | NaN | NaN | NaN |
2017-09-11 | NaN | NaN | NaN | NaN |
2017-09-12 | NaN | NaN | NaN | NaN |
2017-09-13 | NaN | NaN | NaN | NaN |
2017-09-14 | NaN | NaN | NaN | NaN |
2017-09-15 | NaN | NaN | NaN | NaN |
174 rows × 4 columns
观察可以发现,price 中大于 16.5 的数都变为 NaN。
#我们还可以把大于 16.5 的数赋值为1.
price[price > 16.5] = 1
price
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-03 | 16.21 | 16.44 | 16.17 | 16.30 |
2017-01-04 | 16.29 | 16.35 | 16.18 | 16.33 |
2017-01-05 | 16.30 | 16.38 | 16.24 | 16.30 |
2017-01-06 | 16.30 | 16.30 | 16.13 | 16.18 |
2017-01-09 | 16.24 | 16.29 | 16.13 | 16.20 |
2017-01-10 | 16.18 | 16.24 | 16.14 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.15 | 16.16 |
2017-01-12 | 16.18 | 16.20 | 16.11 | 16.12 |
2017-01-13 | 16.10 | 16.29 | 16.10 | 16.27 |
2017-01-16 | 16.23 | 1.00 | 16.10 | 1.00 |
2017-01-17 | 16.46 | 1.00 | 16.37 | 16.40 |
2017-01-18 | 16.42 | 1.00 | 16.36 | 16.48 |
2017-01-19 | 16.43 | 1.00 | 16.43 | 1.00 |
2017-01-20 | 1.00 | 1.00 | 16.50 | 1.00 |
2017-01-23 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-24 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-25 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-26 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-03 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-06 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-07 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-08 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-09 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-10 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-13 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-14 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-15 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-16 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-17 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-20 | 1.00 | 1.00 | 1.00 | 1.00 |
... | ... | ... | ... | ... |
2017-08-07 | 12.87 | 12.95 | 12.83 | 12.93 |
2017-08-08 | 12.91 | 12.93 | 12.83 | 12.87 |
2017-08-09 | 12.84 | 12.88 | 12.75 | 12.76 |
2017-08-10 | 12.74 | 12.88 | 12.70 | 12.79 |
2017-08-11 | 12.72 | 12.73 | 12.46 | 12.68 |
2017-08-14 | 12.62 | 12.63 | 12.52 | 12.56 |
2017-08-15 | 12.58 | 12.76 | 12.56 | 12.59 |
2017-08-16 | 12.56 | 12.56 | 12.47 | 12.49 |
2017-08-17 | 12.49 | 12.55 | 12.47 | 12.53 |
2017-08-18 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-08-21 | 12.50 | 12.57 | 12.46 | 12.51 |
2017-08-22 | 12.51 | 12.53 | 12.41 | 12.43 |
2017-08-23 | 12.44 | 12.66 | 12.41 | 12.57 |
2017-08-24 | 12.58 | 12.66 | 12.45 | 12.47 |
2017-08-25 | 12.47 | 12.80 | 12.47 | 12.78 |
2017-08-28 | 12.83 | 12.96 | 12.75 | 12.89 |
2017-08-29 | 12.85 | 12.98 | 12.77 | 12.96 |
2017-08-30 | 12.91 | 12.98 | 12.82 | 12.87 |
2017-08-31 | 12.83 | 12.84 | 12.65 | 12.71 |
2017-09-01 | 12.68 | 12.87 | 12.68 | 12.77 |
2017-09-04 | 12.78 | 12.84 | 12.65 | 12.78 |
2017-09-05 | 12.78 | 13.09 | 12.78 | 13.03 |
2017-09-06 | 12.99 | 13.05 | 12.90 | 12.96 |
2017-09-07 | 12.94 | 13.02 | 12.82 | 12.85 |
2017-09-08 | 12.85 | 13.12 | 12.83 | 13.03 |
2017-09-11 | 13.15 | 13.15 | 12.95 | 12.99 |
2017-09-12 | 13.00 | 13.04 | 12.91 | 13.04 |
2017-09-13 | 13.01 | 13.01 | 12.86 | 12.87 |
2017-09-14 | 12.88 | 12.98 | 12.81 | 12.85 |
2017-09-15 | 12.82 | 12.86 | 12.75 | 12.83 |
174 rows × 4 columns
使用isin()方法来过滤在指定列中的数据,案例延续上面赋值后的price
# 选取 high 列中数为 1 和 774.00的数。
price[price['high'].isin([1,16.24])]
open | high | low | close | |
---|---|---|---|---|
date | ||||
2017-01-10 | 16.18 | 16.24 | 16.14 | 16.19 |
2017-01-11 | 16.24 | 16.24 | 16.15 | 16.16 |
2017-01-16 | 16.23 | 1.00 | 16.10 | 1.00 |
2017-01-17 | 16.46 | 1.00 | 16.37 | 16.40 |
2017-01-18 | 16.42 | 1.00 | 16.36 | 16.48 |
2017-01-19 | 16.43 | 1.00 | 16.43 | 1.00 |
2017-01-20 | 1.00 | 1.00 | 16.50 | 1.00 |
2017-01-23 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-24 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-25 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-01-26 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-03 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-06 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-07 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-08 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-09 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-10 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-13 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-14 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-15 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-16 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-17 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-20 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-21 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-22 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-23 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-24 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-27 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-02-28 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-03-01 | 1.00 | 1.00 | 1.00 | 1.00 |
2017-03-02 | 1.00 | 1.00 | 16.39 | 16.41 |
2017-03-06 | 16.37 | 1.00 | 16.35 | 16.41 |
510500.OF 南方中证500ETF
510510.OF 广发中证500ETF
510520.OF 诺安中证500ETF
510560.OF 国寿安保中证500ETF
510580.OF 易方达中证500
000905.SH 中证500
etf=['510500.OF','510510.OF','510520.OF','510560.OF','510580.OF']
获取各自2017年1月1日以来的日收益率
获得各自的年跟踪误差TE
TE=std(ETF日收益率-中证500日收益率)*sqrt(250)
std()
哪个做的最好?
idx=pd.read_csv('/resources/data/IDX_Idxtrd_000905.csv')
idx.columns=['Code','Date','Open','High','Low','Close']
idx=idx.set_index('Date')
idxret=idx['Close']/idx['Close'].shift()-1
idxret=idxret[idxret.index>='2018-01-01']
idxret
Date 2018-01-02 0.013024 2018-01-03 0.008847 2018-01-04 0.004585 2018-01-05 -0.000045 2018-01-08 0.004508 ... 2020-10-12 0.028816 2020-10-13 0.001562 2020-10-14 -0.006568 2020-10-15 -0.007426 2020-10-16 -0.004854 Name: Close, Length: 676, dtype: float64
etf=pd.read_csv('/resources/data/Fund_NAV.csv')
etf=etf.set_index('TradingDate')
etf
Symbol | AccumulativeNAV | |
---|---|---|
TradingDate | ||
2018-01-02 | 510560 | 0.6205 |
2018-01-02 | 510520 | 1.6321 |
2018-01-02 | 510580 | 1.0146 |
2018-01-02 | 510500 | 1.8709 |
2018-01-02 | 510510 | 1.7612 |
2018-01-03 | 510560 | 0.6259 |
2018-01-03 | 510520 | 1.6457 |
2018-01-03 | 510580 | 1.0233 |
2018-01-03 | 510500 | 1.8874 |
2018-01-03 | 510510 | 1.7764 |
2018-01-04 | 510520 | 1.6529 |
2018-01-04 | 510510 | 1.7844 |
2018-01-04 | 510560 | 0.6287 |
2018-01-04 | 510580 | 1.0278 |
2018-01-04 | 510500 | 1.8958 |
2018-01-05 | 510520 | 1.6534 |
2018-01-05 | 510560 | 0.6287 |
2018-01-05 | 510580 | 1.0277 |
2018-01-05 | 510510 | 1.7843 |
2018-01-05 | 510500 | 1.8957 |
2018-01-08 | 510560 | 0.6316 |
2018-01-08 | 510500 | 1.9041 |
2018-01-08 | 510520 | 1.6610 |
2018-01-08 | 510580 | 1.0320 |
2018-01-08 | 510510 | 1.7920 |
2018-01-09 | 510510 | 1.7920 |
2018-01-09 | 510520 | 1.6606 |
2018-01-09 | 510560 | 0.6315 |
2018-01-09 | 510580 | 1.0319 |
2018-01-09 | 510500 | 1.9039 |
... | ... | ... |
2020-09-29 | 510510 | 1.8165 |
2020-09-29 | 510560 | 0.6223 |
2020-09-30 | 510580 | 1.0389 |
2020-09-30 | 510510 | 1.8055 |
2020-09-30 | 510500 | 1.9114 |
2020-09-30 | 510560 | 0.6187 |
2020-10-09 | 510580 | 1.0667 |
2020-10-09 | 510510 | 1.8538 |
2020-10-09 | 510560 | 0.6350 |
2020-10-09 | 510500 | 1.9628 |
2020-10-12 | 510560 | 0.6532 |
2020-10-12 | 510580 | 1.0974 |
2020-10-12 | 510500 | 2.0197 |
2020-10-12 | 510510 | 1.9073 |
2020-10-13 | 510580 | 1.0991 |
2020-10-13 | 510500 | 2.0229 |
2020-10-13 | 510510 | 1.9102 |
2020-10-13 | 510560 | 0.6541 |
2020-10-14 | 510510 | 1.8980 |
2020-10-14 | 510580 | 1.0923 |
2020-10-14 | 510500 | 2.0099 |
2020-10-14 | 510560 | 0.6499 |
2020-10-15 | 510560 | 0.6451 |
2020-10-15 | 510580 | 1.0858 |
2020-10-15 | 510510 | 1.8841 |
2020-10-15 | 510500 | 1.9955 |
2020-10-16 | 510580 | 1.0806 |
2020-10-16 | 510510 | 1.8748 |
2020-10-16 | 510560 | 0.6420 |
2020-10-16 | 510500 | 1.9856 |
3170 rows × 2 columns
etfsymbols=[510500,510510,510560,510580]
etflist=[]
for i in etfsymbols:
e=pd.DataFrame(etf[etf['Symbol']==i]['AccumulativeNAV'])
e.columns=[str(i)]
etflist.append(e)
df=pd.concat(etflist,axis=1)
ret=df/df.shift()-1
ret.sub(idxret,axis=0).std()*sqrt(250)
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:7: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version of pandas will change to not sort by default. To accept the future behavior, pass 'sort=False'. To retain the current behavior and silence the warning, pass 'sort=True'. import sys
510500 0.004283 510510 0.005752 510560 0.005596 510580 0.015394 dtype: float64