Background
Databend now supports a Hive catalog for running Hive queries. This document shows how to set up a Databend-Hive environment and run Hive SQL.
How to set up a Databend-Hive cluster
HiveServer, the Hive metastore, and HDFS are assumed to be pre-installed.
- Download a Databend release with Hive support, or build from source:
## make sure JAVA_HOME is set
export JAVA_HOME=/path/to/java
export LD_LIBRARY_PATH=${JAVA_HOME}/lib/server:${LD_LIBRARY_PATH}
cargo build --features hive,storage-hdfs
- Set up a Databend cluster; refer to deploying-databend.
- Add the Hive catalog and HDFS storage to databend-query.toml:
[storage]
type = "hdfs"
[storage.hdfs]
# HDFS NameNode address, such as 127.0.0.1:8020
name_node = "xx"
root = ""
[catalogs.hive]
type = "hive"
# Hive metastore address, such as 127.0.0.1:9083
address = "xx"
- Run databend-query with the Java and Hadoop environment variables set:
export HADOOP_HOME=xxx
# such as /Library/Java/JavaVirtualMachines/openjdk-11.jdk/Contents/Home
export JAVA_HOME=xxx
export LD_LIBRARY_PATH=$JAVA_HOME/lib/server:$LD_LIBRARY_PATH
./bin/databend-query -c ./databend-query.toml > query.log 2>&1 &
- Configure the Hive-related settings from a MySQL client:
set global sql_dialect = 'hive';
Suggested settings:
-- for users in China
set global timezone = 'Asia/Shanghai';
set global max_execute_time = 180000;
-- add support for the Hive nvl function
create FUNCTION nvl as (a,b) -> ifnull(a,b);
- Query Hive data using a MySQL client or a MySQL JDBC client. Note: Hive tables must be referred to as hive.db.table:
select * from hive.$db.$table limit 10;
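The launch and query steps above can be wrapped in a couple of small shell helpers. The sketch below is illustrative only: `check_hive_env` and `hive_select_sql` are hypothetical names, not part of Databend, and the check assumes the JVM library directory is `$JAVA_HOME/lib/server`, as in the LD_LIBRARY_PATH setting above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; neither function ships with Databend.

# Verify the environment that databend-query needs for Hive/HDFS access.
# Returns 0 when everything looks set, 1 otherwise.
check_hive_env() {
  local missing=0
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    missing=1
  elif [ ! -d "${JAVA_HOME}/lib/server" ]; then
    # Assumption: the JVM shared library lives in this directory.
    echo "no lib/server directory under JAVA_HOME=${JAVA_HOME}" >&2
    missing=1
  fi
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    missing=1
  fi
  return "$missing"
}

# Build a select statement that uses the required hive.db.table form.
# Arguments: database, table, optional row limit (defaults to 10).
hive_select_sql() {
  local db="$1" table="$2" limit="${3:-10}"
  printf 'select * from hive.%s.%s limit %s;' "$db" "$table" "$limit"
}
```

One possible use is `check_hive_env && ./bin/databend-query -c ./databend-query.toml`, followed later by `mysql -h127.0.0.1 -P3307 -uroot -e "$(hive_select_sql mydb mytable)"`. The host, port, and user in that command are assumptions; replace them with whatever your query node's MySQL handler is configured to use.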
Limitations
- Only Parquet tables are supported; ORC and text tables are not.
- The Hive data types struct, map, and decimal are not supported.
- Only Hive select queries are supported; DDL, insert, and other DML statements are not.
- Hive UDFs are not supported, and only a limited set of Hive functions is available.
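Given the type restrictions above, it may be useful to screen a table's column types before querying it through Databend. The following is a rough sketch, not a Databend feature: `hive_type_supported` is a hypothetical helper that does a plain substring match on the type name, so it also rejects nested types such as an array of maps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a Hive column type avoids the
# struct, map, and decimal types listed as unsupported, 1 otherwise.
hive_type_supported() {
  local t
  # Lowercase the type so DECIMAL(10,2) and decimal(10,2) are treated alike.
  t=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$t" in
    # Substring match, so nested uses (e.g. array<map<...>>) are caught too.
    *struct*|*map*|*decimal*) return 1 ;;
    *) return 0 ;;
  esac
}
```

For instance, `hive_type_supported 'decimal(10,2)' || echo "column type not supported"` would print the warning, while `hive_type_supported bigint` succeeds silently.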
Hive support is currently in beta. Please feel free to report bugs and suggestions in the Databend issues.