This article shows how to connect to Hive and Impala from Python, first with PyHive and then with the Impyla client.
Using the PyHive hive module
pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive
[root@ip-172-31-40-242 ~]# more testpyhive.py
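from pyhive import hive
# Connect to HiveServer2 (default port 10000) and select the working database
conn = hive.Connection(host='xxxxxxx', port=10000, database='collection', username='')
cursor = conn.cursor()
cursor.execute('select * from tb_partition limit 10')
# fetchall() returns every row of the result set as a tuple
for result in cursor.fetchall():
    print(result)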
[root@ip-172-31-40-242 ~]# python testpyhive.py
(u'1', u'2', u'201707')
(u'1', u'2', u'201707')
(u'123', None, u'201709')
(u'123', u'456', u'201709')
(u'45678', u'456', u'201709')
(u'123', None, u'201709')
(u'123', u'456', u'201709')
(u'45678', u'456', u'201709')
(u'123', None, u'201709')
(u'123', u'456', u'201709')
Official documentation: https://pypi.org/project/PyHive/
How to connect to Hive and Impala with the Python Impyla client
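The script above loads the whole result set with fetchall(). For larger tables it can be gentler on memory to pull rows in batches. The following is a minimal sketch of that idea, assuming only a DB-API 2.0 connection such as the one hive.Connection returns. The helper name iter_rows and the batch_size parameter are hypothetical, and sqlite3 is used purely as a stand-in so the example runs without a Hive cluster.

```python
import sqlite3
from contextlib import closing


def iter_rows(conn, sql, batch_size=1000):
    """Yield rows from a DB-API 2.0 connection, fetching batch_size rows at a time.

    iter_rows is a hypothetical helper, not part of PyHive.
    """
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            for row in batch:
                yield row


# Stand-in for hive.Connection(...): an in-memory SQLite database with
# a few rows shaped like the tb_partition output shown above.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE tb_partition (a TEXT, b TEXT, dt TEXT)")
demo_conn.executemany(
    "INSERT INTO tb_partition VALUES (?, ?, ?)",
    [("1", "2", "201707"), ("123", None, "201709"), ("123", "456", "201709")],
)

rows = list(iter_rows(demo_conn, "SELECT * FROM tb_partition LIMIT 10", batch_size=2))
for row in rows:
    print(row)
```

With a real cluster, the same helper would be called with the conn object created by hive.Connection in place of demo_conn.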
# -*- coding:utf-8 -*-
# Requires the impyla package: pip install impyla
from impala.dbapi import connect

# Connect to HiveServer2 (port 10000) using PLAIN authentication
conn = connect(host='172.31.46.109', port=10000, database='collection', auth_mechanism='PLAIN')
print(conn)
cursor = conn.cursor()
# Optional session settings. param is commented out, so the execute call must stay disabled too.
# param = '''SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;SET hive.support.concurrency=true;'''
# cursor.execute(param)
cursor.execute('SELECT uid FROM redefine_collection where uid=4028 limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)

## Connecting to Impala from Python (ImpalaTest.py)
## The same client talks to Impala on port 21050.
## from impala.dbapi import connect
## conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
## print(conn)
## cursor = conn.cursor()
## cursor.execute('show databases')
## print(cursor.description)  # prints the result set's schema
## results = cursor.fetchall()
## print(results)
## cursor.execute('SELECT * FROM test limit 10')
## print(cursor.description)  # prints the result set's schema
## results = cursor.fetchall()
## print(results)
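The script prints cursor.description to show the schema of the result set. Under DB-API 2.0 the first item of each description entry is the column name, so the schema can be combined with the fetched rows. The sketch below illustrates this under the assumption that the cursor follows DB-API 2.0, as Impyla and PyHive cursors do. The helper name fetch_dicts is hypothetical, and sqlite3 again stands in for a live Hive or Impala connection.

```python
import sqlite3


def fetch_dicts(cursor):
    """Return the remaining rows of an executed query as dicts keyed by column name.

    fetch_dicts is a hypothetical helper. It relies only on DB-API 2.0:
    the first item of each cursor.description entry is the column name.
    """
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in for the Impyla connection: an in-memory SQLite table with the
# same column name as the query in the script above.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE redefine_collection (uid INTEGER)")
demo_cursor.execute("INSERT INTO redefine_collection VALUES (4028)")
demo_cursor.execute("SELECT uid FROM redefine_collection WHERE uid = 4028 LIMIT 10")

records = fetch_dicts(demo_cursor)
print(records)  # prints [{'uid': 4028}]
```

Against a real cluster, fetch_dicts would be called on the cursor right after cursor.execute(...), in place of the plain fetchall() call.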