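The hive-metastore and hive-server2 services both started normally.
Connecting to hive-server2 with beeline worked.
The query "show tables" ran fine.
The query "select count(imei) from test" failed with an exception. A closer look at the log /var/log/hive/hive-server2.log showed: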
2013-06-14 07:32:52,945 ERROR exec.Task (SessionState.java:printError(421)) - Job Submission failed with exception 'java.io.IOException(Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.)'
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:478)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:457)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:426)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1374)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1160)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:973)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:116)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:194)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:190)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1193)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1178)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2013-06-14 07:32:52,947 ERROR ql.Driver (SessionState.java:printError(421)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

BEELINE:
0: jdbc:hive2://16.187.94.200> select max(dt) from table;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask (state=08S01,code=1)
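This explains why show tables succeeds while the second statement fails: the former only needs to query the metadata, whereas the latter has to launch a MapReduce job, and that is where the "Cannot initialize Cluster" error is raised.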
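When a trace like this is buried in a long log, it can help to list only the ERROR lines first. The following is a minimal sketch; the function name show_errors is made up for illustration, and the " ERROR " match assumes the log4j layout seen in this post.

```shell
#!/bin/sh
# Print ERROR-level lines (prefixed with their line numbers) from a log file.
# Assumes the layout "<date> <time> ERROR <logger> ..." shown in this post.
show_errors() {
    logfile="$1"
    grep -n ' ERROR ' "$logfile"
}

# Example, using the log path from this post:
# show_errors /var/log/hive/hive-server2.log | tail -n 20
```

The line numbers make it easy to jump to the full stack trace in an editor afterwards.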
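According to the documentation, the fix is to export the MapReduce environment variable in the file /etc/default/hive-server2 (the path below is the MRv1 installation directory):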
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
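One way to apply this without adding the line twice is sketched below. The helper name ensure_line is hypothetical, and the restart command in the comment assumes the init script is called hive-server2, as the service name in this post suggests.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    file="$1"
    line="$2"
    # -q quiet, -x whole-line match, -F fixed string, -e protects a leading dash
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (needs root; path and value are the ones from this post):
# ensure_line /etc/default/hive-server2 'export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce'
# service hive-server2 restart   # assumed service name; restart so the variable is picked up
```

Running the helper a second time leaves the file unchanged, so it is safe to keep in a provisioning script.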
References:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_18_11.html
https://groups.google.com/a/cloudera.org/forum/#!msg/cdh-user/Zs4X2AcMqRQ/QCpn5VsCskAJ
The following exception appeared when executing an insert overwrite ... select ... from statement:
2013-09-26 15:49:07,167 ERROR jdbc.JDBCStatsPublisher (JDBCStatsPublisher.java:init(281)) - Error during JDBC initialization.
java.sql.SQLException: Failed to create database 'TempStatsStore', see the next exception for details.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection30.<init>(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
    at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:582)
    at java.sql.DriverManager.getConnection(DriverManager.java:207)
    at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.init(JDBCStatsPublisher.java:265)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:436)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:131)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:209)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:191)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1373)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1358)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.sql.SQLException: Failed to create database 'TempStatsStore', see the next exception for details.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 33 more
Caused by: java.sql.SQLException: Directory /TempStatsStore cannot be created.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
    ... 30 more
Caused by: ERROR XBM0H: Directory /TempStatsStore cannot be created.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.services.monitor.StorageFactoryService$9.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.services.monitor.StorageFactoryService.createServiceRoot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.createPersistentService(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.createPersistentService(Unknown Source)
    ... 30 more
Follow-up exception:
2013-09-26 15:49:20,368 INFO exec.Task (SessionState.java:printInfo(418)) - [Warning] could not update stats.Failed with exception StatsAggregator connect failed jdbc:derby
org.apache.hadoop.hive.ql.metadata.HiveException: StatsAggregator connect failed jdbc:derby
    at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:286)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:250)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:131)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:209)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:191)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1373)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1358)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
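Explanation: this exception occurs when an INSERT OVERWRITE statement is run through hive-server2 connected to hive-metastore. With automatic statistics gathering enabled, Hive tries to publish statistics to an embedded Derby database named TempStatsStore, and the trace shows Derby failing to create the directory /TempStatsStore, presumably because the service user cannot write to that location.
Solution:
Set hive.stats.autogather to false. The hive.stats feature did not seem very important for my use, so I simply set the following property to get rid of the exception.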
<property>
  <name>hive.stats.autogather</name>
  <value>false</value>
  <description>A flag to gather statistics automatically during the INSERT OVERWRITE command.</description>
</property>
That is one option. The other is to point the hive.stats configuration at a MySQL database instead:
<property>
  <name>hive.stats.dbclass</name>
  <!--value>jdbc:derby</value-->
  <value>jdbc:mysql</value>
  <description>The default database that stores temporary hive statistics.</description>
</property>
<property>
  <name>hive.stats.autogather</name>
  <value>true</value>
  <description>A flag to gather statistics automatically during the INSERT OVERWRITE command.</description>
</property>
<property>
  <name>hive.stats.jdbcdriver</name>
  <!-- <value>org.apache.derby.jdbc.EmbeddedDriver</value>-->
  <value>com.mysql.jdbc.Driver</value>
  <description>The JDBC driver for the database that stores temporary hive statistics.</description>
</property>
<property>
  <name>hive.stats.dbconnectionstring</name>
  <!--<value>jdbc:derby:;databaseName=TempStatsStore;create=true</value>-->
  <value>jdbc:mysql://mymaster:3306/TempStatsStore</value>
</property>
After this change, insert overwrite ran normally again.
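The MySQL variant only works if the MySQL JDBC driver jar is visible to Hive. A rough check is sketched below; the function name has_mysql_connector, the jar name pattern and the library directory in the example are all assumptions, not something taken from the setup described above.

```shell
#!/bin/sh
# Return 0 if a MySQL Connector/J jar is present in the given directory.
# The jar name pattern is an assumption; adjust it to your driver package.
has_mysql_connector() {
    libdir="$1"
    for jar in "$libdir"/mysql-connector-java*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -f fails.
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Example (directory is an assumption based on a packaged Hive layout):
# has_mysql_connector /usr/lib/hive/lib || echo "MySQL JDBC driver jar not found"
```

If the jar is missing, the stats publisher would be expected to fail at JDBC initialization in much the same place as the Derby trace earlier in this post.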