Notes on connecting to Spark SQL on EMR over JDBC.
- Create an EMR cluster
- Log in to the master node with ssh
$ ssh -i ~/mykey.pem hadoop@ec2-**-***-***-**.ap-northeast-1.compute.amazonaws.com
- Start the Thrift server
$ sudo -u spark /usr/lib/spark/sbin/start-thriftserver.sh
- Try connecting
$ sudo -u spark /usr/lib/spark/bin/beeline
Beeline version 1.2.1-spark2-amzn-0 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10001
Connecting to jdbc:hive2://localhost:10001
Enter username for jdbc:hive2://localhost:10001: hadoop
Enter password for jdbc:hive2://localhost:10001:<Return>
18/03/21 10:51:29 INFO Utils: Supplied authorities: localhost:10001
18/03/21 10:51:29 INFO Utils: Resolved authority: localhost:10001
18/03/21 10:51:29 INFO HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10001
Connected to: Spark SQL (version 2.2.1)
Driver: Hive JDBC (version 1.2.1-spark2-amzn-0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10001> show tables;
+-----------+------------+--------------+--+
| database  | tableName  | isTemporary  |
+-----------+------------+--------------+--+
+-----------+------------+--------------+--+
No rows selected (1.201 seconds)
0: jdbc:hive2://localhost:10001>
- Disconnect
0: jdbc:hive2://localhost:10001> !exit
Closing: 0: jdbc:hive2://localhost:10001
Error: Error while cleaning up the server resources (state=,code=0)
Connection is already closed.
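
The interactive session above can probably also be run non-interactively, which is handy for a quick check from a script. The following is only a sketch under assumptions: it reuses the beeline path, port 10001, and the user name hadoop from the session above, and it assumes an empty password is accepted, as it was at the interactive prompt. The function names jdbc_url and run_query are made up for this memo; -u, -n and -e are standard beeline options (JDBC URL, user name, query to execute).

```shell
#!/usr/bin/env bash
# Sketch: run a single query through beeline without the interactive prompt.
# Assumptions (not verified beyond the session in this memo):
#   - the Thrift server is already running and listening on localhost:10001
#   - user "hadoop" with an empty password is accepted

BEELINE=/usr/lib/spark/bin/beeline

# Hypothetical helper: build a HiveServer2-style JDBC URL from host and port.
jdbc_url() {
  printf 'jdbc:hive2://%s:%s' "$1" "$2"
}

# Hypothetical helper: execute one SQL statement via beeline and exit.
#   -u  JDBC URL to connect to
#   -n  user name
#   -e  query to execute
run_query() {
  sudo -u spark "$BEELINE" -u "$(jdbc_url localhost 10001)" -n hadoop -e "$1"
}

# Usage (on the master node):
#   run_query 'show tables;'
```

Sourcing the file only defines the two functions; nothing connects until run_query is called.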