Hive. Part of Hadoop Ecosystems MapReduce Runtime (Dist. Programming Framework) Hadoop Distributed File System (HDFS) Zookeeper (Coordination) Hbase (Column.


1 Hive

2 Part of the Hadoop Ecosystem
MapReduce Runtime (Distributed Programming Framework)
Hadoop Distributed File System (HDFS)
ZooKeeper (Coordination)
HBase (Column NoSQL DB)
Sqoop/Flume (Data Integration)
Oozie (Job Workflow & Scheduling)
Pig/Hive (Analytical Language)
Hue (Web Console)
Mahout (Data Mining)

3 Data import
Commands that use Sqoop to load the movie data stored in MySQL onto HDFS (for testing purposes).
Import the movie table into HDFS:
sqoop import --connect jdbc:mysql://localhost/movielens --table movie --fields-terminated-by '\t' --username training --password training
Import the movierating table into HDFS:
sqoop import --connect jdbc:mysql://localhost/movielens --table movierating --fields-terminated-by '\t' --username training --password training

4 Create Table & Load Data
Create the movie table.
Load the movie data into the movie table.
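
The statements for this slide are not part of the transcript. The following is a minimal HiveQL sketch of what creating and loading the table could look like. Only the column names id and name are confirmed by the join queries later in the deck; their types, their order, and the HDFS path are assumptions (Sqoop writes to a directory named after the table under the user's HDFS home directory by default, which would be /user/training/movie for a user named training).

```sql
-- Assumption: id and name are the first two tab-separated fields, in this order.
-- Any further columns in the MySQL source table would be appended after them.
CREATE TABLE movie (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';   -- matches --fields-terminated-by '\t' in the Sqoop import

-- Assumption: the Sqoop output sits in the default target directory.
-- LOAD DATA INPATH moves the files from that HDFS location into the table's directory.
LOAD DATA INPATH '/user/training/movie' INTO TABLE movie;
```

Because Hive's delimited text format maps fields to columns by position, a column order that differs from the imported files would produce wrong or NULL values rather than an error.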

5 Describe & Select
View the basic structure of the movie table.
View only 5 rows out of all the data in the movie table.
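
The two steps on this slide can be sketched in HiveQL as follows, assuming a table named movie already exists:

```sql
-- Show the column names and types of the movie table.
DESCRIBE movie;

-- Show 5 rows from the movie table.
SELECT * FROM movie LIMIT 5;
```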

6 Data
movierating table
movie table

7 Where
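
The query shown on this slide is not part of the transcript. As an illustration only, a WHERE clause on the movie table might look like the sketch below. The columns id and name come from the join queries later in the deck; the filter values are hypothetical.

```sql
-- Filter on a numeric column (hypothetical threshold).
SELECT id, name FROM movie WHERE id <= 5;

-- Filter on a string column with a pattern (hypothetical prefix).
SELECT id, name FROM movie WHERE name LIKE 'Toy%' LIMIT 5;
```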

8 Join
Create and load a table for the movie ratings, to be used in the join.
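
The statements for this slide are not part of the transcript. A minimal HiveQL sketch, under stated assumptions: the columns movieid and rating are confirmed by the join queries in this deck, while the userid column, the column order, the types, and the HDFS path are assumptions that would have to be checked against the MySQL source table.

```sql
-- Assumption: the imported file has the fields userid, movieid, rating, in this order.
CREATE TABLE movierating (
  userid  INT,
  movieid INT,
  rating  INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';   -- matches --fields-terminated-by '\t' in the Sqoop import

-- Assumption: Sqoop wrote to its default target directory for the user training.
LOAD DATA INPATH '/user/training/movierating' INTO TABLE movierating;
```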

9 Join
Join the movie table and the movierating table using movieid as the key.
Extract the movie name and rating for only 5 rows of the joined result:
select movie.name, movierating.rating from movie join movierating on (movie.id = movierating.movieid) limit 5;
Compute the average rating for each movie:
select movie.name, avg(movierating.rating) from movie join movierating on (movie.id = movierating.movieid) group by movie.name limit 5;
Sort the average ratings in descending order:
select movie.name, avg(movierating.rating) c5 from movie join movierating on (movie.id = movierating.movieid) group by movie.name order by c5 desc limit 5;

