[mw_shl_code=sql,true] CREATE TABLE words_array AS SELECT tweet_id AS id, split(text,' ') AS words FROM tweets;
[/mw_shl_code]
- Create a Hive table that explodes the word arrays into one row per word
[mw_shl_code=sql,true] CREATE TABLE tweet_word AS SELECT id AS id, word FROM words_array LATERAL VIEW explode(words) w as word;
[/mw_shl_code]
- Create a new table by joining the table above with the sentiment dictionary
[mw_shl_code=sql,true] CREATE TABLE word_join AS SELECT tweet_word.id, tweet_word.word, sentiment_dictionary.rating FROM tweet_word LEFT OUTER JOIN sentiment_dictionary ON (tweet_word.word=sentiment_dictionary.word);
[/mw_shl_code]
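To see what the split, explode and join steps do to a single row, here is a small throwaway query. It is only a sketch: the inline sample sentence and the one-entry dictionary are made up for illustration and are not part of the real tweets or sentiment_dictionary tables, and it assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).
[mw_shl_code=sql,true] -- Illustration only: made-up sample row and dictionary entry
SELECT tw.id, tw.word, d.rating
FROM (
  -- split the text into an array, then explode it into one row per word
  SELECT t.id, w.word
  FROM (SELECT 1 AS id, split('good bad day', ' ') AS words) t
  LATERAL VIEW explode(t.words) w AS word
) tw
-- LEFT OUTER JOIN keeps words that have no dictionary match
LEFT OUTER JOIN (SELECT 'good' AS word, 1 AS rating) d
ON (tw.word = d.word);
[/mw_shl_code]
The query should return one row per word. Because of the LEFT OUTER JOIN, words that are not in the dictionary are kept with a NULL rating instead of being dropped. Unlike the CREATE TABLE statements, it does not create a new table.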
In Atlas, the lineage graph for the word_join table produced by the steps above looks like this: