Scraping content after cookie login
```python
import requests
from bs4 import BeautifulSoup

# Request headers: a desktop User-Agent plus the Cookie header copied from a
# logged-in browser session, so the server treats our requests as logged in.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    'Cookie': '__root_domain_v=.ipmph.com; _qddaz=QD.ky39uq.kayj8f.k9m46lu7; UM_distinctid=171c9f3608f367-0db723f0697f73-7373667-e1000-171c9f36090528; Hm_lvt_ec31b23a3a54fb0e85df69fc93bd5de9=1591235564,1593342093,1593394314,1593396522; CNZZDATA1268665702=515724504-1588225736-%7C1593404446; jeesite.session.id=7d33e84d68ce41539397c6bdb2d5c795; JSESSIONID=FFF51BE6759340B300F01CE5479B41C2; Hm_lpvt_ec31b23a3a54fb0e85df69fc93bd5de9=1593407445',
}

session = requests.Session()

# Visit the login page first so the session picks up any server-side state.
logurl = 'http:/**/login?rw_lczsPath=http%3A%2F%2Fccdas.ipmph.com%2F'
response = session.get(logurl, headers=headers)

url = 'http://XXX/Id=11284'
# url = 'http://XXX/Id=31569'
msg = session.get(url, headers=headers)

sop = BeautifulSoup(msg.text, 'html.parser')
# Article title, with all whitespace characters stripped out.
name = sop.find('div', class_='gu_det_left_top').text \
    .replace(' ', '').replace('\n', '').replace('\t', '').replace('\r', '')
div = sop.find('div', class_='gu_det_left_con').text

# Save the raw page bytes to a .doc file named after the article title
# (Word opens the HTML content directly).
with open('{}.doc'.format(name), 'wb') as f:
    f.write(msg.content)

# Alternative: collect each section heading and its body text separately.
# vvv = sop.find_all('div', class_='gu_det_left_con_h1')
# kkk = sop.find_all('div', class_='gu_det_left_con_main')
# dic, ls1, ls2 = [], [], []
# for v in vvv:
#     title = v.text.replace(' ', '').replace('\n', '').replace('\t', '').replace('\r', '')
#     ls1.append(title)
# for k in kkk:
#     ls2.append(k.text)
# for i in range(len(ls1)):
#     dic.append(ls1[i] + ':>>\n' + ls2[i])
```
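The commented-out portion pairs each section heading (`gu_det_left_con_h1`) with its body (`gu_det_left_con_main`). A minimal, self-contained sketch of that pairing, run against an invented HTML snippet that mimics the page structure (only the class names are taken from the page above):

```python
from bs4 import BeautifulSoup

# Invented sample HTML imitating the article page's structure.
html = """
<div class="gu_det_left_con_h1">概述</div>
<div class="gu_det_left_con_main">Body one.</div>
<div class="gu_det_left_con_h1">治疗</div>
<div class="gu_det_left_con_main">Body two.</div>
"""

soup = BeautifulSoup(html, 'html.parser')
titles = [d.get_text(strip=True) for d in soup.find_all('div', class_='gu_det_left_con_h1')]
bodies = [d.get_text(strip=True) for d in soup.find_all('div', class_='gu_det_left_con_main')]

# Pair each heading with its body, as the commented-out loop does;
# each entry has the form '<heading>:>>\n<body>'.
sections = [t + ':>>\n' + b for t, b in zip(titles, bodies)]
```

Note that `zip` assumes headings and bodies appear in matching order and equal number; a page with a stray heading would silently drop the last entries.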
大诚
August 3, 2022, 10:30