Python數據結構與演算法
- 2020 年 1 月 2 日
- 筆記
Python數據結構與演算法
一、篩選數據
0x1 列表
列表解析
[x for x in data if x>= 0]
12 |
[x for x in data if x>= 0] |
---|
filter函數:
g = filter(lambda x : x >=0, data) 在python3中,得到的是構造器,要用list才可以得到結果 list(g)
1234 |
g = filter(lambda x : x >=0, data)在python3中,得到的是構造器,要用list才可以得到結果list(g) |
---|
0x2 字典
字典解析
{k:v for k, v in d.items() if v> 90} d = {'student%d' %i :randint(50,100) for i in range(1,21)} {k:v for k,v in d.items() if v>=90}
12345 |
{k:v for k, v in d.items() if v> 90} d = {'student%d' %i :randint(50,100) for i in range(1,21)}{k:v for k,v in d.items() if v>=90} |
---|
0x3 集合
集合解析
{x for x in s if x%3==0}
12 |
{x for x in s if x%3==0} |
---|
二、元組命名
student = ('mubai',20,'male','[email protected]') 訪問裡面的數據用下標索引,可讀性不好。
方案1:定義一系列數值常量或枚舉類型
NAME,AGE,SEX,EMAIL = range(4) student[NAME] //mubai
123 |
NAME,AGE,SEX,EMAIL = range(4)student[NAME] //mubai |
---|
方案2:使用標準庫中collections.namedtuple替代內置tuple
//使用枚舉 from enum import IntEnum class StudentEnum(IntEnum): NAME = 0 AGE = 1 SEX = 2 EMAIL = 3 student[StudentEnum.AGE] = 20 //使用collections.namedtuple from collections import namedtuple Student = namedtuple('Student',['name','age','sex','email']) s = Student('mubai',20,'male','[email protected]') //s.name = mubai
1234567891011121314 |
//使用枚舉from enum import IntEnumclass StudentEnum(IntEnum): NAME = 0 AGE = 1 SEX = 2 EMAIL = 3student[StudentEnum.AGE] = 20//使用collections.namedtuplefrom collections import namedtupleStudent = namedtuple('Student',['name','age','sex','email'])s = Student('mubai',20,'male','[email protected]')//s.name = mubai |
---|
三、字典排序
根據成績高低排名: {
'LiLei':79,
'Jim':88,
'Lucy':92,
...
}
將字典中的各項轉換為元組,使用內置函數sorted排序。
方案一:將字典中的項轉化為(值,鍵)元組。(列表解析或zip )
//隨機生成成績表 from random import randint g = {k: randint(60,100) for k in 'abcdefgh'} //列表解析 d = [(v,k) for k,v in g.items()] //zip函數 d = list(zip(g.values(),g.keys()) m = sorted(d,reverse=True)
123456789 |
//隨機生成成績表from random import randintg = {k: randint(60,100) for k in 'abcdefgh'}//列表解析d = [(v,k) for k,v in g.items()]//zip函數d = list(zip(g.values(),g.keys())m = sorted(d,reverse=True) |
---|
方案二:傳遞sorted函數的key參數
m = sorted(g.items(),key=lambda item:item[1],reverse=True)
12 |
m = sorted(g.items(),key=lambda item:item[1],reverse=True) |
---|
最終排名:
n = list(enumerate(m,1)) //對m進行排序,加序號 for i,(k,v) in n: //迭代n,增加排序 g[k] = (i,v) //{'a': (1, 100), 'b': (7, 66), 'c': (8, 65), …} o = {k:(i,v) for i,(k,v) in enumerate(m,1)}
12345 |
n = list(enumerate(m,1)) //對m進行排序,加序號for i,(k,v) in n: //迭代n,增加排序 g[k] = (i,v) //{'a': (1, 100), 'b': (7, 66), 'c': (8, 65), …}o = {k:(i,v) for i,(k,v) in enumerate(m,1)} |
---|
四、統計頻度
某隨機序列找到出現次數最高的3個元素,它們出現次數是多少?
方案一:將序列轉換為字典{元素:頻度},根據字典中的值排序。
data = [randint(0,20) for _ in range(30)] //新建一個字典,值全為0 d = dict.fromkeys(data,0) //迭代,值加一 for x in data: d[x] += 1 m = sorted([(v,k) for k,v in d.items()],reverse=True) sorted(((v,k) for k,v in d.items()),reverse=True)[:3] //生成器解析 取前三個 //不用sorted全部計算,採用堆 import heapq m = heapq.nlargest(3,((v,k) for k,v in d.items()))
123456789101112 |
data = [randint(0,20) for _ in range(30)]//新建一個字典,值全為0d = dict.fromkeys(data,0)//迭代,值加一for x in data: d[x] += 1m = sorted([(v,k) for k,v in d.items()],reverse=True)sorted(((v,k) for k,v in d.items()),reverse=True)[:3] //生成器解析 取前三個//不用sorted全部計算,採用堆import heapqm = heapq.nlargest(3,((v,k) for k,v in d.items())) |
---|
方案二:使用標準庫collections中的Counter對象。
from collections import Counter c = Counter(data) //統計詞頻 c.most_common(3) //選前三
1234 |
from collections import Counterc = Counter(data) //統計詞頻c.most_common(3) //選前三 |
---|
統計文本文件詞頻:
from collections import Counter text = open('a.txt').read() import re //正則表達式 word_list = re.split('W+',txt) //非字元切割文本 t = Counter(word_list) t.most_common(10)
1234567 |
from collections import Countertext = open('a.txt').read()import re //正則表達式word_list = re.split('W+',txt) //非字元切割文本t = Counter(word_list)t.most_common(10) |
---|
五、字典中的公共鍵
場景:指定球員多場球賽進球數。 模擬:隨機生成多個字典,找出公共建。
from random import randint,sample d = sample('abcdefgh',randint(3,6)) #進球的球員 d1 = {k:randint(1,4) for k in d} d2 = {k:randint(1,4) for k in d} d3 = {k:randint(1,4) for k in d} dl = [d1,d2,d3]
12345678910 |
from random import randint,sample d = sample('abcdefgh',randint(3,6)) #進球的球員 d1 = {k:randint(1,4) for k in d}d2 = {k:randint(1,4) for k in d}d3 = {k:randint(1,4) for k in d}dl = [d1,d2,d3] |
---|
方案一、迭代每個字典的鍵
//循環迭代 for k in d1: if k in d2 and k in d3: print(k) //列表解析 [k for k in d1 if k in d2 and k in d3] [k for k in dl[0] if all(map(lambda d:k in d, dl[1:]))]
123456789 |
//循環迭代for k in d1: if k in d2 and k in d3: print(k) //列表解析[k for k in d1 if k in d2 and k in d3][k for k in dl[0] if all(map(lambda d:k in d, dl[1:]))] |
---|
方案二、利用集合(set)的交集操作
Step1:使用字典的keys()方法,得到一個字典keys的集合. Step2:使用map函數,得到每個字典keys的集合. Step3:使用reduce函數,取所有字典的keys集合的交集.
# reduce()函數介紹 //10的階乘 from functools import reduce reduce(lambda a,b : a*b,range(1,11)) d1 = {k:randint(1,4) for k in d} d2 = {k:randint(1,4) for k in d} d3 = {k:randint(1,4) for k in d} d1&d3&d2 //下面是最終結果 reduce(lambda a,b:a&b,map(dict.keys,dl))
1234567891011 |
# reduce()函數介紹//10的階乘from functools import reducereduce(lambda a,b : a*b,range(1,11))d1 = {k:randint(1,4) for k in d}d2 = {k:randint(1,4) for k in d}d3 = {k:randint(1,4) for k in d}d1&d3&d2//下面是最終結果reduce(lambda a,b:a&b,map(dict.keys,dl)) |
---|
五、讓字典有序
Python3.5之前的字典不是有序的,存和取位置並不是對應好的。 案例:編寫根據排名獲取選手的函數介面
方案:使用標準庫collections中的OrderedDict
from collections import OrderedDict players = list('abcdefgh') from random import shuffle #洗牌函數 shuffle(players) od = OrderedDict() for i,p in enumerate(players,1): od[p] = i #根據名字查詢成績 def query_by_name(d,name): return d[name] from itertools import islice def query_by_order(d,a,b=null): a -= 1 if b is None: b = a + 1 return list(islice(od,a,b)) query_by_order(od,3) query_by_order(od,3,6)
123456789101112131415161718192021 |
from collections import OrderedDictplayers = list('abcdefgh')from random import shuffle #洗牌函數shuffle(players)od = OrderedDict()for i,p in enumerate(players,1): od[p] = i #根據名字查詢成績def query_by_name(d,name): return d[name] from itertools import islicedef query_by_order(d,a,b=null): a -= 1 if b is None: b = a + 1 return list(islice(od,a,b))query_by_order(od,3)query_by_order(od,3,6) |
---|
六、歷史記錄存儲
案例:現在我們製作了一個簡單的猜數字的小遊戲,如何添加歷史記錄功能,顯示用戶最近猜過的數字?
方案一:使用容量為n的隊列存儲歷史記錄
使用標準庫collections中的deque,它是一個雙端循環隊列
from collections import deque q = deque([],5) //[]初始化 隊列長度 q.append(1) //入隊 q.appendleft(2) //從左入隊 q.pop(1) //出隊
123456 |
from collections import dequeq = deque([],5) //[]初始化 隊列長度q.append(1) //入隊q.appendleft(2) //從左入隊q.pop(1) //出隊 |
---|
方案二:使用pickle模組將歷史記錄存儲到硬碟,以便下次啟動使用
pickle.dump(obj, file, [,protocol]) 註解:將對象obj保存到文件file中去。
import pickle pickle.dump(q.open('save.pkl','wb')) #以二進位方式寫入 p = pickle.load(open('save.pkl','rb')) #以二進位方式讀取
1234 |
import picklepickle.dump(q.open('save.pkl','wb')) #以二進位方式寫入p = pickle.load(open('save.pkl','rb')) #以二進位方式讀取 |
---|