Connecting to Hive from Python

1. Packages to install

Install three packages: pyhive, thrift, and sasl (pip install is usually enough).
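After installing, it can help to confirm that all three packages are visible to the interpreter you plan to run. The following is a minimal sketch using only the standard library; the helper name missing_packages is made up for this illustration and is not part of any of these packages.

```python
import importlib.util

# Top-level import names of the three packages listed above.
REQUIRED = ["pyhive", "thrift", "sasl"]


def missing_packages(names):
    """Return the top-level package names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If sasl shows up as missing, the notes that follow describe the usual cause.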

Problems encountered so far:

Installing sasl:

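(1) Installing sasl requires downloading a .whl file from the site below and choosing the build that matches your Python version: https://www.lfd.uci.edu/~gohlke/pythonlibs/

Once the file is downloaded, put it in the directory your terminal is currently in, then run pip install xxx.whl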

(2) Some machines report that related dependencies or library files are missing. In that case, install Visual C++ Build Tools. For details, see: https://go.microsoft.com/fwlink/?LinkId=691126
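For item (1), "matching your Python version" in practice means matching the tags in the wheel's file name. The sketch below prints the tags to look for. It assumes a CPython interpreter on Windows (the platform the notes above appear to target), and the function name wheel_tags is hypothetical.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_tag) for the running interpreter.

    Assumes CPython on Windows; other platforms use different platform tags.
    """
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # Pointer size tells us whether this is a 32-bit or 64-bit interpreter.
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    python_tag, platform_tag = wheel_tags()
    print(f"Look for a .whl whose name contains {python_tag} and {platform_tag}")
```

Choose the wheel whose file name contains both tags, then install it with pip install as described in item (1).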

2. Connecting and querying (example)

# An example
import pandas as pd
from pyhive import hive
import thrift  # not used directly below; pyhive depends on it
import sasl    # not used directly below; pyhive depends on it

conn = hive.Connection(host="xxx.xxx.xx.xxx", port=10000, username="your_username")

# Run the query
cursor = conn.cursor()
cursor.execute("SELECT x.* FROM table_name x WHERE date_format(start_date, 'yyyy-MM-dd') >= '2021-12-31'")
results = cursor.fetchall()

# Store the results in a DataFrame
df = pd.DataFrame(results)
print(df)

# Close the connection
cursor.close()
conn.close()
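The example above builds the DataFrame without column names and only closes the cursor if nothing fails. One possible refinement is sketched below. The helper names column_names and query_to_dataframe are hypothetical. The sketch assumes conn is a pyhive connection like the one above, and that the server may return column names prefixed with the table alias (for example x.start_date), which depends on the Hive setting hive.resultset.use.unique.column.names.

```python
def column_names(description):
    """Extract column names from a DB-API cursor.description.

    Drops a leading 'alias.' prefix if the server included one.
    """
    return [col[0].split(".")[-1] for col in description]


def query_to_dataframe(conn, sql, params=None):
    """Run a query on an open connection and return a labeled DataFrame."""
    import pandas as pd  # imported here so column_names has no dependencies

    cursor = conn.cursor()
    try:
        cursor.execute(sql, params)
        rows = cursor.fetchall()
        return pd.DataFrame(rows, columns=column_names(cursor.description))
    finally:
        # Runs whether or not the query succeeded.
        cursor.close()
```

With an open connection this could be called as df = query_to_dataframe(conn, "SELECT x.* FROM table_name x"), followed by conn.close() once all queries are done.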