This article walks through how to implement passive information gathering in Python. The material is fairly detailed, so it is shared here as a reference; after reading it you should have a working grasp of the relevant techniques.
Overview:
Passive information gathering extracts information about a target's assets through search engines, social networks, and similar channels; typical tasks include IP lookups, Whois lookups, and subdomain collection. Because passive gathering never interacts with the target, information can be mined without ever touching the target system.
Main methods: DNS resolution, subdomain enumeration, email harvesting, and so on.
DNS Resolution:
1. Overview:
DNS (Domain Name System) is a distributed network directory service whose main job is translating between domain names and IP addresses. It lets users reach sites by name instead of memorizing the long numeric IP addresses that machines read directly.
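DNS holds more than name-to-address (A) records; record types such as MX (mail servers) and NS (name servers) are also valuable during reconnaissance. Below is a minimal sketch of querying several record types, assuming the third-party dnspython package is available (pip install dnspython); the article itself otherwise sticks to the standard library:

import dns.resolver  # third-party: pip install dnspython (an assumption, not used elsewhere in this article)

# Query a few record types for one domain; each answer set may hold several records
for rtype in ('A', 'MX', 'NS'):
    try:
        answers = dns.resolver.resolve('baidu.com', rtype)
    except dns.resolver.NoAnswer:
        continue  # the domain simply has no records of this type
    for rdata in answers:
        print(rtype, rdata.to_text())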
2. IP Lookup:
An IP lookup resolves the hostname in a URL to its IP address. The gethostbyname() function in the standard socket library returns the IP for a given domain name.
Code:
import socket

ip = socket.gethostbyname('www.baidu.com')
print(ip)
Output:
39.156.66.14
(The exact address varies by region and over time, since large sites answer from many servers.)
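A single domain frequently maps to several addresses. The standard library's socket.gethostbyname_ex() returns the canonical name, any aliases, and the full address list in one call, which helps spot CDN fronting; a short sketch:

import socket

# Returns a (canonical_name, alias_list, ip_address_list) tuple
name, aliases, ips = socket.gethostbyname_ex('www.baidu.com')
print(name, aliases, ips)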
3. Whois Lookup:
Whois is a query/response protocol for looking up the registration details behind a domain name. A Whois service acts like a database: it tells you whether a domain is already registered and returns details such as the registrant and the registrar.
The python-whois module performs Whois lookups from Python.
Code:
from whois import whois

data = whois('www.baidu.com')
print(data)
Output:
{
  "domain_name": ["BAIDU.COM", "baidu.com"],
  "registrar": "MarkMonitor, Inc.",
  "whois_server": "whois.markmonitor.com",
  "referral_url": null,
  "updated_date": ["2020-12-09 04:04:41", "2021-04-07 12:52:21"],
  "creation_date": ["1999-10-11 11:05:17", "1999-10-11 04:05:17"],
  "expiration_date": ["2026-10-11 11:05:17", "2026-10-11 00:00:00"],
  "name_servers": ["NS1.BAIDU.COM", "NS2.BAIDU.COM", "NS3.BAIDU.COM", "NS4.BAIDU.COM", "NS7.BAIDU.COM", "ns3.baidu.com", "ns2.baidu.com", "ns7.baidu.com", "ns1.baidu.com", "ns4.baidu.com"],
  "status": [
    "clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited",
    "serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited",
    "serverTransferProhibited https://icann.org/epp#serverTransferProhibited",
    "serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited",
    "clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)",
    "clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)",
    "clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)",
    "serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)",
    "serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)",
    "serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)"
  ],
  "emails": ["abusecomplaints@markmonitor.com", "whoisrequest@markmonitor.com"],
  "dnssec": "unsigned",
  "name": null,
  "org": "Beijing Baidu Netcom Science Technology Co., Ltd.",
  "address": null,
  "city": null,
  "state": "Beijing",
  "zipcode": null,
  "country": "CN"
}
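The object python-whois returns exposes each parsed field as an attribute, so single values can be read directly instead of printing the whole record (field names match the output above; a short sketch):

from whois import whois

data = whois('www.baidu.com')
# Read individual fields rather than dumping the full record
print(data.registrar)
print(data.expiration_date)
print(data.name_servers)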
Subdomain Enumeration:
1. Overview:
Domain names form a hierarchy: top-level domains, second-level domains, third-level domains, and so on.
A subdomain is a domain one level below its parent domain.
During testing, if no vulnerabilities turn up on the target's main site, the usual next step is to enumerate the target's subdomains.
There are many ways to enumerate subdomains, for example search engine queries, subdomain brute forcing, and dictionary lookups; a dictionary-based sketch follows below.
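Here is what the dictionary approach might look like in its simplest form: resolve each candidate name from a wordlist and keep the ones that answer. A minimal sketch; the wordlist below is illustrative, real tools ship lists with thousands of entries:

import socket

# Tiny illustrative wordlist of common host names
words = ['www', 'mail', 'ftp', 'admin', 'test']

def brute_subdomains(domain, wordlist):
    found = []
    for word in wordlist:
        candidate = word + '.' + domain
        try:
            # gethostbyname() raises socket.gaierror when no record exists
            ip = socket.gethostbyname(candidate)
            found.append((candidate, ip))
            print(candidate, '->', ip)
        except socket.gaierror:
            pass
    return found

brute_subdomains('baidu.com', words)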
2. Writing a simple subdomain enumeration tool in Python:
(using https://cn.bing.com/ as the search engine)
Code:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

def bing_search(site, pages):
    subdomains = []  # discovered subdomains, stored as a list
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Referer': 'https://cn.bing.com/',
    }
    for page in range(1, int(pages) + 1):
        # Page through the results of a "site:<domain>" query
        url = ('https://cn.bing.com/search?q=site%3A' + site
               + '&first=' + str((page - 1) * 10))
        resp = requests.get(url, headers=headers, timeout=8)
        soup = BeautifulSoup(resp.text, 'html.parser')
        # Result titles sit in <h2><a href=...>; adjust if Bing's markup changes
        for h2 in soup.find_all('h2'):
            link = h2.find('a')
            if link is None:
                continue
            domain = urlparse(link.get('href', '')).netloc
            if domain and domain not in subdomains:
                subdomains.append(domain)
                print(domain)
    return subdomains
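Usage is a single call, e.g. bing_search('baidu.com', 5), which prints each newly seen hostname while walking the result pages. Keep the page count modest: search engines rate-limit automated queries, and an empty page usually means the engine has started serving a CAPTCHA.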
3. Full code (an email-harvesting tool built on the same search engine technique, covering both Bing and Baidu):
import sys
import getopt
import re
import requests
from bs4 import BeautifulSoup


# Entry point: parse the command-line arguments supplied by the user
def start(argv):
    url = ""
    pages = ""
    if len(sys.argv) < 2:
        print("use -h for help\n")
        sys.exit()
    try:
        banner()
        opts, args = getopt.getopt(argv, "u:p:h")
    except getopt.GetoptError:
        print('Error: invalid argument')
        sys.exit()
    for opt, arg in opts:
        if opt == "-u":
            url = arg
        elif opt == "-p":
            pages = arg
        elif opt == "-h":
            usage()
    launcher(url, pages)


# Banner
def banner():
    print('\033[1:34m ################################ \033[0m\n')
    print('\033[1:34m          3cH0 - Nu1L             \033[0m\n')
    print('\033[1:34m ################################ \033[0m\n')


# Usage rules
def usage():
    print('-h: --help  show this help;')
    print('-u: --url   target domain;')
    print('-p: --pages number of result pages to crawl;')
    print('eg: python demo.py -u "www.baidu.com" -p 100' + '\n')
    sys.exit()


# Main harvesting loop: query both engines for each keyword and page
def launcher(url, pages):
    email_num = []
    key_words = ['email', 'mail', 'mailbox', '邮件', '邮箱', 'postbox']  # the Chinese terms mean "mail"/"mailbox"
    for page in range(1, int(pages) + 1):
        for key_word in key_words:
            bing_emails = bing_search(url, page, key_word)
            baidu_emails = baidu_search(url, page, key_word)
            sum_emails = bing_emails + baidu_emails
            for email in sum_emails:
                if email not in email_num:  # de-duplicate before saving
                    print(email)
                    with open('data.txt', 'a+') as f:
                        f.write(email + '\n')
                    email_num.append(email)


# Bing search
def bing_search(url, page, key_word):
    referer = "http://cn.bing.com/search?q=email+site%3abaidu.com&sp=-1&pq=emailsite%3abaidu.com&first=1&FORM=PERE1"
    conn = requests.session()
    bing_url = ("http://cn.bing.com/search?q=" + key_word + "+site%3a" + url
                + "&qa=n&sp=-1&pq=" + key_word + "site%3a" + url
                + "&first=" + str((page - 1) * 10) + "&FORM=PERE1")
    conn.get('http://cn.bing.com', headers=headers(referer))  # warm up session cookies
    r = conn.get(bing_url, stream=True, headers=headers(referer), timeout=8)
    return search_email(r.text)


# Baidu search: fetch each result page and scrape it for addresses
def baidu_search(url, page, key_word):
    email_list = []
    referer = "https://www.baidu.com/s?wd=email+site%3Abaidu.com&pn=1"
    baidu_url = ("https://www.baidu.com/s?wd=" + key_word + "+site%3A" + url
                 + "&pn=" + str((page - 1) * 10))
    conn = requests.session()
    conn.get(baidu_url, headers=headers(referer))
    r = conn.get(baidu_url, headers=headers(referer))
    soup = BeautifulSoup(r.text, 'lxml')
    tagh4 = soup.find_all('h4')  # result titles; adjust the tag if Baidu's markup changes
    for h4 in tagh4:
        link = h4.find('a')
        if link is None:
            continue
        href = link.get('href')
        try:
            r = requests.get(href, headers=headers(referer), timeout=8)
            emails = search_email(r.text)
        except Exception:
            emails = []  # skip results that fail to load
        for email in emails:
            email_list.append(email)
    return email_list


# Extract email addresses from HTML with a regular expression
def search_email(html):
    return re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", html, re.I)


# Build request headers with a browser User-Agent and the given Referer
def headers(referer):
    return {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Referer': referer
    }


if __name__ == '__main__':
    try:
        start(sys.argv[1:])
    except KeyboardInterrupt:
        print("interrupted by user, killing all threads ...")
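Assuming the script is saved as demo.py (an illustrative name), a run looks like: python demo.py -u baidu.com -p 10. Every address found is printed and appended to data.txt in the working directory. As with the subdomain tool, both Bing and Baidu throttle automated queries, so large page counts may yield diminishing results.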
That wraps up how to implement passive information gathering in Python. Hopefully the material above is of some help and adds to your knowledge. If you found the article worthwhile, share it so more people can see it.