Concurrent Programming Study Notes Based on Python: Threads

Study notes for 【Concurrent Programming Study Notes Based on Python】
Link:
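P1 Python's support for concurrent programming
Ways to speed up a program:
- Multithreading: the CPU and IO can work at the same time, so the CPU does not sit idle waiting for IO to complete;
- Multiprocessing: uses the cores of a multi-core CPU to run tasks truly in parallel;
- Asynchronous IO (coroutines): within a single thread, uses the same CPU/IO overlap to run functions asynchronously.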
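The multithreading point above can be sketched without any network access. This is a minimal illustration, not code from the course: `fake_io` is a hypothetical stand-in that uses `time.sleep` to imitate a blocking IO call, and the task count and delay are arbitrary.

```python
import threading
import time


def fake_io(seconds):
    # Hypothetical stand-in for a blocking IO call (e.g. a network request).
    # While a thread sleeps, other threads are free to run.
    time.sleep(seconds)


def run_sequential(n, seconds):
    # Run n simulated IO calls one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_io(seconds)
    return time.perf_counter() - start


def run_threaded(n, seconds):
    # Run n simulated IO calls in n threads and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential_cost = run_sequential(4, 0.1)
threaded_cost = run_threaded(4, 0.1)
print("sequential:", round(sequential_cost, 2), "sec")
print("threaded:  ", round(threaded_cost, 2), "sec")
```

The sequential run has to wait for each sleep in turn, while the threaded run waits for all of them at once, so the threaded time should be noticeably shorter. Exact numbers depend on the machine.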
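Implementation methods:
P2 Choosing how to run concurrently
What CPU-bound and IO-bound computation are
Comparison of multiprocessing, multithreading, and coroutines
How to choose the right technique for a task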
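P3 Global Interpreter Lock (GIL)
Two main reasons Python is slow
Compared with C/C++ and Java, Python is indeed slow; in some special scenarios the speed gap is 100~200x.
- Reason 1: it is a dynamically typed language that is interpreted as it runs;
- Reason 2: the GIL prevents threads from using multiple CPU cores in parallel.
What the GIL is
The Global Interpreter Lock (GIL) is a mechanism that a programming-language interpreter uses to synchronize threads, so that only one thread executes at any moment. Even on a multi-core processor, an interpreter with a GIL allows only one thread to execute at a time.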
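The effect described above can be observed with a small timing sketch. It is an illustration under assumptions, not part of the course material: `count_down` and the loop size `N` are made up for the example, and the timings depend on the machine and on the interpreter build.

```python
import threading
import time


def count_down(n):
    # Pure-Python CPU-bound loop with no IO, so a thread never waits on anything.
    while n > 0:
        n -= 1


N = 2_000_000  # arbitrary amount of work per call

# Two calls, one after the other, in the main thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential_cost = time.perf_counter() - start

# The same two calls, each in its own thread.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_cost = time.perf_counter() - start

print("sequential:", round(sequential_cost, 3), "sec")
print("threaded:  ", round(threaded_cost, 3), "sec")
```

On an interpreter that holds a GIL, the two threads take turns instead of running on two cores, so the threaded time is usually not much better than the sequential one.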
Why the GIL was introduced
How to work around the limits imposed by the GIL
P4 Multithreaded crawler
How to create threads
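# 1. Prepare a function
# (do_craw is a placeholder for the real work; it is not defined in this snippet)
def my_func(a, b):
    do_craw(a, b)

# 2. Create a thread
import threading
t = threading.Thread(target=my_func, args=(100, 200))

# 3. Start the thread
t.start()

# 4. Wait for it to finish
t.join()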
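The four steps above can be tried end to end with a function that needs no crawler. This is a minimal sketch: `add_and_store`, the `results` dictionary and the thread names are hypothetical and chosen only so the outcome can be checked.

```python
import threading

results = {}                      # shared between threads
results_lock = threading.Lock()   # protects writes to results


def add_and_store(a, b):
    # 1. The function each thread runs: compute a value and record it.
    total = a + b
    with results_lock:
        results[(a, b)] = total


# 2. Create one thread per pair of arguments.
pairs = [(100, 200), (1, 2), (3, 4)]
threads = [
    threading.Thread(target=add_and_store, args=pair, name=f"worker-{i}")
    for i, pair in enumerate(pairs)
]

# 3. Start every thread.
for t in threads:
    t.start()

# 4. Wait for every thread to finish before reading the results.
for t in threads:
    t.join()

print(results)
```

A thread's target cannot hand back a return value directly, which is why the sketch stores results in a shared dictionary guarded by a lock.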
Rewriting the crawler from single-threaded to multithreaded
# blog_spider.py
import requests
from bs4 import BeautifulSoup

# List of cnblogs URLs
urls = [
    f"https://www.cnblogs.com/#p{page}"
    for page in range(1, 50 + 1)
]

# Fetch a page, print the url and the content length, return the page text
def craw(url):
    r = requests.get(url)
    print(url, len(r.text))
    return r.text

# Parsing function, also used by the producer-consumer pattern below:
# returns a list of (href, title) tuples for the post links on the page
def parse(html):
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", class_="post-item-title")
    return [(link["href"], link.get_text()) for link in links]

if __name__ == "__main__":
    for result in parse(craw(urls[2])):
        print(result)
# 01.multi_thread_craw.py
import blog_spider
import threading
import time

# Single-threaded crawler
def single_thread():
    print("single_thread begin")
    for url in blog_spider.urls:
        blog_spider.craw(url)
    print("single_thread end")

# Multithreaded crawler
def multi_thread():
    print("multi_thread begin")
    threads = []
    for url in blog_spider.urls:
        # Pass blog_spider.craw without parentheses: the function is handed
        # to the thread, not called here
        threads.append(
            threading.Thread(target=blog_spider.craw, args=(url,))
        )
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print("multi_thread end")

if __name__ == "__main__":
    start = time.time()
    single_thread()
    end = time.time()
    print("single thread cost:", end - start, "sec")

    start = time.time()
    multi_thread()
    end = time.time()
    print("multi thread cost:", end - start, "sec")
Speed comparison: single-threaded crawler vs. multithreaded crawler

single_thread begin
single_thread end
single thread cost: 7. sec
multi_thread begin
multi_thread end
multi thread cost: 0. sec
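P5 Producer-consumer crawler
Multi-component technical architecture
Architecture of the producer-consumer crawler
queue.Queue for passing data between threads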
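The basic `queue.Queue` operations can be tried in a single thread first. This is a small sketch for illustration: the `maxsize=2` bound and the item values are assumptions made for the demo, whereas the crawler in these notes uses unbounded queues.

```python
import queue

# A bounded queue so that the "full" case can be shown (assumed size).
q = queue.Queue(maxsize=2)

q.put("a")                  # add items; blocks only when the queue is full
q.put("b")
size_when_full = q.qsize()  # approximate number of queued items
is_full = q.full()

# With a timeout, put() raises queue.Full instead of blocking forever.
try:
    q.put("c", timeout=0.1)
    overflow_blocked = False
except queue.Full:
    overflow_blocked = True

first = q.get()             # items come out in FIFO order
second = q.get()

# With block=False, get() raises queue.Empty instead of waiting.
try:
    q.get(block=False)
    empty_raised = False
except queue.Empty:
    empty_raised = True

print(first, second, size_when_full, is_full, overflow_blocked, empty_raised)
```

By default `put()` and `get()` block until they can proceed, which is what lets producer and consumer threads wait on each other without extra locking.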
Implementing the producer-consumer crawler in code

import queue
import blog_spider
import time
import random
import threading

# Producer: take a URL, download the page, put the HTML on html_queue
def do_craw(url_queue: queue.Queue, html_queue: queue.Queue):
    while True:
        url = url_queue.get()
        html = blog_spider.craw(url)
        html_queue.put(html)
        print(threading.current_thread().name, f"craw {url}",
              "url_queue.size=", url_queue.qsize())
        time.sleep(random.randint(1, 2))

# Consumer: take HTML from html_queue, parse it, write the results to fout
def do_parse(html_queue: queue.Queue, fout):
    while True:
        html = html_queue.get()
        results = blog_spider.parse(html)
        for result in results:
            fout.write(str(result) + "\n")
        print(threading.current_thread().name, "results.size", len(results),
              "html_queue.size=", html_queue.qsize())
        time.sleep(random.randint(1, 2))

if __name__ == "__main__":
    url_queue = queue.Queue()
    html_queue = queue.Queue()
    for url in blog_spider.urls:
        url_queue.put(url)

    # Three producer threads
    for idx in range(3):
        t = threading.Thread(target=do_craw, args=(url_queue, html_queue),
                             name=f"craw{idx}")
        t.start()

    # Two consumer threads writing to the same file
    # Note: both loops run forever, so this script does not exit on its own
    # and fout is never closed.
    fout = open("02.data.txt", "w", encoding='utf-8')
    for idx in range(2):
        t = threading.Thread(target=do_parse, args=(html_queue, fout),
                             name=f"parse{idx}")
        t.start()
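The worker loops above never exit. One common way to let such a pipeline finish is to push a sentinel value through each queue; the sketch below shows that idea without any network access. It is a variation, not the course's code: `STOP`, `producer`, `consumer`, `run` and the fake HTML string are all hypothetical names introduced for the example.

```python
import queue
import threading

STOP = object()  # sentinel: tells a worker that no more items will arrive


def producer(url_queue, html_queue):
    # Stand-in for do_craw: turns each URL into a fake HTML string.
    while True:
        url = url_queue.get()
        if url is STOP:
            break
        html_queue.put(f"<html>{url}</html>")


def consumer(html_queue, results, lock):
    # Stand-in for do_parse: records the length of each page.
    while True:
        html = html_queue.get()
        if html is STOP:
            break
        with lock:
            results.append(len(html))


def run(urls, n_producers=3, n_consumers=2):
    url_queue, html_queue = queue.Queue(), queue.Queue()
    results, lock = [], threading.Lock()

    for url in urls:
        url_queue.put(url)
    for _ in range(n_producers):   # one sentinel per producer, after all URLs
        url_queue.put(STOP)

    producers = [threading.Thread(target=producer, args=(url_queue, html_queue))
                 for _ in range(n_producers)]
    consumers = [threading.Thread(target=consumer, args=(html_queue, results, lock))
                 for _ in range(n_consumers)]
    for t in producers + consumers:
        t.start()

    for t in producers:            # all pages are queued once producers finish
        t.join()
    for _ in range(n_consumers):   # then one sentinel per consumer
        html_queue.put(STOP)
    for t in consumers:
        t.join()
    return results


page_lengths = run([f"url{i}" for i in range(10)])
print(sorted(page_lengths))
```

Because the sentinels are queued behind the real items, every URL is processed before any worker stops, and `run` returns only after all threads have ended.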