25. The Zen of Python
Published: 2019-06-27


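Requirements:
Scrape the English and Chinese versions of the Zen of Python from the web page "你好,蜘蛛侠!" ("Hello, Spider-Man!") and print them.
 
Goals:
Practice using selenium to scrape information from a dynamic web page.
Practice using selenium together with BeautifulSoup.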
 
 
URL: https://localprod.pandateacher.com/python-manuscript/hello-spiderman/
 
Method 1: Using selenium
 
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()

driver.get('https://localprod.pandateacher.com/python-manuscript/hello-spiderman/')
time.sleep(2)  # give the page time to load

# Click the button with class 'sub'; the content is read after this click
button = driver.find_element(By.CLASS_NAME, 'sub')
button.click()
time.sleep(1)  # give the new content time to appear

# Each 'content' element holds one version: a title (h1) and the text (p)
python_zens = driver.find_elements(By.CLASS_NAME, 'content')

for python_zen in python_zens:
    print(python_zen.find_element(By.TAG_NAME, 'h1').text, end='\n\n')
    print(python_zen.find_element(By.TAG_NAME, 'p').text, end='\n\n')

driver.close()

Output:

The Zen of Python

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Python之禅

优美胜于丑陋
明了胜于晦涩
简洁胜于复杂
复杂胜于凌乱
扁平胜于嵌套
间隔胜于紧凑
可读性很重要
即便假借特例的实用性之名,也不可违背这些规则
不要包容所有错误,除非你确定需要这样做
当存在多种可能,不要尝试去猜测
而是尽量找一种,最好是唯一一种明显的解决方案
虽然这并不容易,因为你不是 Python 之父
做也许好过不做,但不假思索就动手还不如不做
如果你无法向人描述你的方案,那肯定不是一个好方案;反之亦然
命名空间是一种绝妙的理念,我们应当多加利用
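
The scripts in this post pause with fixed time.sleep calls, which either wait longer than needed or fail when the page is slow. Selenium provides WebDriverWait and expected_conditions for this purpose; the sketch below only illustrates the underlying polling idea in plain Python so it can run without a browser. The helper name wait_until and the stand-in function content_loaded are hypothetical and not part of Selenium or the original post.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    `wait_until` is a hypothetical helper, not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose content only appears on the third poll.
attempts = []


def content_loaded():
    attempts.append(1)
    return ["The Zen of Python", "Python之禅"] if len(attempts) >= 3 else []


titles = wait_until(content_loaded, timeout=2.0, poll_interval=0.01)
print(titles)
```

With a real driver, the condition could be something like lambda: driver.find_elements(By.CLASS_NAME, 'content'), since find_elements returns an empty list (which is falsy) until matching elements exist.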
 
Method 2: Using selenium with BeautifulSoup
 
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time

driver = webdriver.Chrome()

driver.get('https://localprod.pandateacher.com/python-manuscript/hello-spiderman/')
time.sleep(2)  # give the page time to load

# Click the button with class 'sub'; the content is read after this click
button = driver.find_element(By.CLASS_NAME, 'sub')
button.click()
time.sleep(1)  # give the new content time to appear

# Hand the rendered HTML over to BeautifulSoup for parsing
pagesource = driver.page_source

soup = BeautifulSoup(pagesource, 'html.parser')
items = soup.find_all(class_='content')
for item in items:
    print('\n\t' + item.find('h1').text)
    print(item.find('p').text)

driver.close()

Output:

        The Zen of Python

            Beautiful is better than ugly.
            Explicit is better than implicit.
            Simple is better than complex.
            Complex is better than complicated.
            Flat is better than nested.
            Sparse is better than dense.
            Readability counts.
            Special cases aren't special enough to break the rules.
            Although practicality beats purity.
            Errors should never pass silently.
            Unless explicitly silenced.
            In the face of ambiguity, refuse the temptation to guess.
            There should be one-- and preferably only one --obvious way to do it.
            Although that way may not be obvious at first unless you're Dutch.
            Now is better than never.
            Although never is often better than *right* now.
            If the implementation is hard to explain, it's a bad idea.
            If the implementation is easy to explain, it may be a good idea.
            Namespaces are one honking great idea -- let's do more of those!

        Python之禅

            优美胜于丑陋
            明了胜于晦涩
            简洁胜于复杂
            复杂胜于凌乱
            扁平胜于嵌套
            间隔胜于紧凑
            可读性很重要
            即便假借特例的实用性之名,也不可违背这些规则
            不要包容所有错误,除非你确定需要这样做
            当存在多种可能,不要尝试去猜测
            而是尽量找一种,最好是唯一一种明显的解决方案
            虽然这并不容易,因为你不是 Python 之父
            做也许好过不做,但不假思索就动手还不如不做
            如果你无法向人描述你的方案,那肯定不是一个好方案;反之亦然
            命名空间是一种绝妙的理念,我们应当多加利用
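
The Method 2 output is indented while the Method 1 output is not. A likely reason is that BeautifulSoup's .text returns the text exactly as it sits in the page source, leading whitespace included, whereas selenium's .text returns the text as rendered in the browser. Below is a minimal sketch of one way to tidy the BeautifulSoup text, assuming it has already been extracted. The helper clean_lines is hypothetical, and sample is made-up input modeled on the output above.

```python
def clean_lines(raw_text):
    """Return the non-empty lines of `raw_text` with surrounding whitespace removed."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


# Made-up sample shaped like the indented text returned in Method 2.
sample = (
    "\n"
    "            Beautiful is better than ugly.\n"
    "            Explicit is better than implicit.\n"
    "        "
)

for line in clean_lines(sample):
    print(line)
```

In the Method 2 loop this could be applied as clean_lines(item.find('p').text) before printing, which should give output closer to that of Method 1.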

 

Reposted from: https://www.cnblogs.com/www1707/p/10850638.html
