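Problem description: I wrote a Scala program that reads a table from HBase and prints its data. In the output, only the id field is correct; every other field shows wrong values. I can't see anything wrong with the code, so I'm posting it here for everyone to check.
Code: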
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{Result, Scan}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object Hbase2Rdd {
  def main(args: Array[String]): Unit = {
    // The Spark master URL is the only command-line argument
    val Array(master) = args
    val sparkConf = new SparkConf().setAppName("Hbase2Rdd").setMaster(master)
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.property.clientPort", "2181")
    conf.set("hbase.zookeeper.quorum", "192.168.xxx.xxx")
    conf.set("hbase.master", "192.168.xxx.xxx:60000")
    val scan = new Scan
    val tableName = "orders"
    conf.set(TableInputFormat.INPUT_TABLE, tableName)
    /* val proto = ProtobufUtil.toScan(scan)
    val ScanToString = Base64.encodeBytes(proto.toByteArray)
    conf.set(TableInputFormat.SCAN, ScanToString)*/
    // Read the data and convert it to an RDD
    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
    val count = hBaseRDD.count()
    println("counts : " + count)
    //hBaseRDD.saveAsTextFile(path)
    import sqlContext.implicits._
    val order = hBaseRDD.map(record => (
      // Look up each column by column family and column name
      Bytes.toString(record._2.getValue(Bytes.toBytes("cf"), Bytes.toBytes("orderId"))),
      Bytes.toInt(record._2.getValue(Bytes.toBytes("cf"), Bytes.toBytes("createTime"))),
      Bytes.toInt(record._2.getValue(Bytes.toBytes("cf"), Bytes.toBytes("modifiedtime"))),
      Bytes.toInt(record._2.getValue(Bytes.toBytes("cf"), Bytes.toBytes("status")))
    )).toDF("orderId", "createTime", "modifiedtime", "status")
    order.registerTempTable("orderssss")
    val frame = sqlContext.sql("select orderId,createTime,modifiedtime,status from orderssss")
    frame.show()
    sc.stop()
  }
}
Printed output:
Data in HBase:
Can anyone tell me why the printed data does not match what is stored in HBase?
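One thing that may be worth checking, as a guess only: orderId is read with Bytes.toString and comes out right, while the three fields read with Bytes.toInt do not. Bytes.toInt interprets a value as a 4-byte big-endian integer, so if those cells were actually written as text (for example via the HBase shell, which is an assumption since the post does not say how the data was loaded), the numbers would come out wrong. The sketch below uses only the JDK to imitate that decoding; the sample value "1500000000" and the names DecodeDemo, readAsInt and readAsText are made up for illustration.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object DecodeDemo {
  // Imitates a 4-byte big-endian int decode (ByteBuffer is big-endian by default).
  def readAsInt(raw: Array[Byte]): Int = ByteBuffer.wrap(raw, 0, 4).getInt()

  // Decodes the bytes as UTF-8 text instead.
  def readAsText(raw: Array[Byte]): String = new String(raw, StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // Hypothetical cell value: a timestamp stored as the string "1500000000".
    val raw = "1500000000".getBytes(StandardCharsets.UTF_8)
    println(readAsInt(raw))         // 825569328: the characters "1500" reinterpreted as an int
    println(readAsText(raw).toLong) // 1500000000
  }
}
```

If that assumption holds, the equivalent change in the posted code would be to read those columns with Bytes.toString and convert the resulting string to a number afterwards, rather than calling Bytes.toInt on the raw bytes.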