SuperDove posted on 2016-10-20 16:30:58

Pig study notes: ERROR 0: Scalar has more than one row in the output.

1. Create two tables, student and sc, and insert data (the tables and data are adapted from an online source)
CREATE TABLE student
(
        sid int(8),
        sname varchar(32),
        sage int(8),
        ssex varchar(8),
        PRIMARY KEY (sid)
)ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

CREATE TABLE sc
(
        sid int(8),
        cid int(8),
        score int(8),
        PRIMARY KEY (sid,cid)
)DEFAULT CHARSET=utf8;

insert into student select 1,'刘一',18,'男' union all
select 2,'钱二',19,'女' union all
select 3,'张三',17,'男' union all
select 4,'李四',18,'女' union all
select 5,'王五',17,'男' union all
select 6,'赵六',19,'女' ;

insert into sc
select 1,1,56 union all
select 1,2,78 union all
select 1,3,67 union all
select 1,4,58 union all
select 2,1,79 union all
select 2,2,81 union all
select 2,3,92 union all
select 2,4,68 union all
select 3,1,91 union all
select 3,2,47 union all
select 3,3,88 union all
select 3,4,56 union all
select 4,2,88 union all
select 4,3,90 union all
select 4,4,93 union all
select 5,1,46 union all
select 5,3,78 union all
select 5,4,53 union all
select 6,1,35 union all
select 6,2,68 union all
select 6,4,71 ;

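2. Use Sqoop to export the data of these two tables from MySQL, and rename the resulting files to student.txt and sc.txt
Along the way I hit a Sqoop problem: I had taken the MySQL data from the web and had not set a primary key on the tables, so Sqoop could not export the data, and the log contained no error. The only message shown was: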
sqoop:000> start job --jid 4
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

3. Now start the Pig shell
Load the data of the two tables:
student = load '/tmp/pig/student.txt' using PigStorage(',') as (sid:int,sname:chararray,sage:int,ssex:chararray);
sc = load '/tmp/pig/sc.txt' using PigStorage(',') as (sid:int,cid:int,score:int);
Then I ran a join query:
A = join student by sid,sc by sid;
A2 = foreach A generate student.sid;

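This is where the pitfall is: Pig accepts the script without any compile error, but when it executes, it fails with the following error: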
Error: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Scalar has more than one row in the output. 1st : (1,'刘一',18,'男'), 2nd :(2,'钱二',19,'女') (common cause: "JOIN" then "FOREACH ... GENERATE foo.bar" should be "foo::bar" )
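
For context, relation.field is Pig's scalar syntax: it treats a whole relation as a single value, so it is only valid when that relation has exactly one row. A minimal sketch of a legitimate use, assuming the sc relation loaded above (the aliases sc_all, avg_all and above_avg are made up for illustration):

```pig
-- Assumes sc is loaded as above: (sid:int, cid:int, score:int).
-- GROUP ... ALL puts every row of sc into one group, so the
-- aggregate below produces a relation with exactly one row.
sc_all = GROUP sc ALL;

-- Inside this FOREACH, sc.score projects the score column of the
-- grouped bag; it is not a scalar reference.
avg_all = FOREACH sc_all GENERATE AVG(sc.score) AS avg_score;

-- avg_all has a single row, so avg_all.avg_score works as a scalar.
above_avg = FILTER sc BY score > avg_all.avg_score;
DUMP above_avg;
```

In the failing statement, student has six rows, so student.sid cannot be reduced to one value, which is what the message "Scalar has more than one row" is reporting.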

Next I tried changing it to A2 = foreach A generate student:sid;
This time Pig rejected the statement before running it, with ERROR 1200: <line 33, column 24>Syntax error, unexpected symbol at or near 'student'

Finally, after searching for "Scalar has more than one row in the output", I changed student.sid to student::sid, and the job ran successfully without errors:
A2 = foreach A generate student::sid;
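The reason: after a join, each field in A is named with its source relation and :: as a prefix, while student.sid is read as a scalar lookup on the whole student relation, which fails because student has more than one row. The output of describe A shows the prefixed names:
describe A;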
A: {student::sid: int,student::sname: chararray,student::sage: int,student::ssex: chararray,sc::sid: int,sc::cid: int,sc::score: int}
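
Given that schema, there are a few ways to refer to fields of A. The following is a sketch, assuming student, sc and A are defined as above; the aliases B and C are made up for illustration:

```pig
-- sid exists on both sides of the join, so it needs the
-- alias:: prefix to say which one is meant.
-- sname and score each appear only once in A, so the short
-- names resolve without a prefix.
B = FOREACH A GENERATE student::sid AS sid, sname, score;

-- Positional references also work: $0 is the first field of A
-- (student::sid) and $6 is the seventh (sc::score).
C = FOREACH A GENERATE $0, $6;

-- Check the resulting field names before using them further.
DESCRIBE B;
DESCRIBE C;
```

Running DESCRIBE right after a JOIN is a cheap way to see the exact field names before writing the FOREACH.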



