Ch. 16 Design and Business Intelligence

Presenter: Chanyoung Oh, Parallel Software Design Lab

Design and Business Intelligence
- Good dimensional design matters, but so does the accessibility of the information.
- That access is provided through a variety of tools, referred to as business intelligence (BI) tools.
- Each piece of information delivered to an end user is called a report.
- The tools do not require the person developing a report to write queries.

Business Intelligence Tools
- Numerous formats: chart, table, dashboard widget, …
- Variety of channels: computer, mobile devices, telephones, …
- Varied access paradigms: on-demand, scheduled, exception alerts

Schema-driven SQL Generators
- The most basic example:
  - Identify the tables that will be used.
  - Specify the columns that are used to lay out the report.
  - The tool generates SQL based on what the developer has added to the canvas.
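The basic generation step can be illustrated with a minimal sketch. It assumes a hypothetical `Canvas` structure and hypothetical table and column names (`order_facts`, `product`, `day`); no real BI product is modeled. The point is only that the generator fills in one fixed template from what is on the canvas.

```python
from dataclasses import dataclass


@dataclass
class Canvas:
    """Hypothetical model of what a developer has placed on the report canvas."""
    fact_table: str
    dimensions: list  # (table, column) pairs used as report headings
    facts: list       # additive fact columns in fact_table
    joins: dict       # dimension table -> join condition to the fact table


def generate_sql(canvas: Canvas) -> str:
    """Fill one fixed SELECT / JOIN / GROUP BY template from the canvas."""
    dim_cols = [f"{table}.{column}" for table, column in canvas.dimensions]
    fact_cols = [f"SUM({canvas.fact_table}.{f}) AS {f}" for f in canvas.facts]
    # Join each dimension table once, in the order it first appears.
    dim_tables = list(dict.fromkeys(table for table, _ in canvas.dimensions))
    lines = ["SELECT " + ", ".join(dim_cols + fact_cols),
             "FROM " + canvas.fact_table]
    lines += [f"JOIN {t} ON {canvas.joins[t]}" for t in dim_tables]
    if dim_cols:
        lines.append("GROUP BY " + ", ".join(dim_cols))
    return "\n".join(lines)


# Illustrative canvas contents (all names are assumptions).
canvas = Canvas(
    fact_table="order_facts",
    dimensions=[("product", "brand"), ("day", "month")],
    facts=["order_dollars"],
    joins={
        "product": "order_facts.product_key = product.product_key",
        "day": "order_facts.day_key = day.day_key",
    },
)
sql = generate_sql(canvas)
print(sql)
```

Because the template never changes, the generator can only produce queries of this one shape, which is where the limitations discussed later come from.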

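Semantic Layers
- A business view of information on top of the technical view.
- The business view is the set of things that can be reported on, i.e., what the user sees in the tool.
- The semantic layer is what connects the business view to the actual physical structure.
- Through it, users can work with the tool without knowing the structure of the underlying database.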

Semantic Layers
- The semantic layer is defined by a developer.
- The query generator fetches data based on the information in the semantic layer.
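As a rough sketch of the idea, a semantic layer can be pictured as a lookup from business names to physical expressions. The mapping, the join path, and the sample tables below are all assumptions made for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3

# Hypothetical semantic layer: business names mapped to physical expressions.
SEMANTIC_LAYER = {
    "Brand": ("dimension", "product.brand"),
    "Order Dollars": ("fact", "SUM(order_facts.order_dollars)"),
}
# Fixed join path for this one-star example (also an assumption).
FROM_CLAUSE = ("order_facts JOIN product "
               "ON order_facts.product_key = product.product_key")


def build_query(business_names):
    """Translate business names into SQL using the semantic layer."""
    select, group_by = [], []
    for name in business_names:
        kind, expr = SEMANTIC_LAYER[name]
        select.append(f'{expr} AS "{name}"')
        if kind == "dimension":
            group_by.append(expr)
    sql = f"SELECT {', '.join(select)} FROM {FROM_CLAUSE}"
    if group_by:
        cols = ", ".join(group_by)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE order_facts (product_key INTEGER, order_dollars REAL);
INSERT INTO product VALUES (1, 'Acme'), (2, 'Acme'), (3, 'Zenith');
INSERT INTO order_facts VALUES (1, 100.0), (2, 50.0), (3, 70.0);
""")

# The report author only ever sees "Brand" and "Order Dollars".
report_rows = conn.execute(build_query(["Brand", "Order Dollars"])).fetchall()
print(report_rows)
```

The report author never types a table or column name; everything physical is resolved through the mapping a developer maintains.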

The Limitations of SQL Generators
- SQL generators generate SQL that always follows some standard format, or template; they are not intelligent.
- The queries that can be generated are a function of the product, and their accuracy depends on how it is configured.
- These constraints determine the kinds of queries that can be generated.
- Two major limitations:
  - Inability to generate a desired query
  - Ability to generate an undesired query

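Inability to generate a desired query
- No tool is capable of generating an appropriate query for every situation.
- Some of these problems, however, can be solved through schema design, e.g., adding a merged fact table to support drill-across.
- Changing the schema for the sake of a specific tool is generally not recommended, but some flexibility is needed depending on the situation.
- A derived schema can sometimes be implemented purely by adding tables, without changing the existing schema.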

Ability to generate an undesired query
- A bank account balance is a value that changes over time, yet a generated query may aggregate it across dates.
- Someone using the tool for the first time may misread the resulting information.
- Here too, a derived schema can be the solution: use a sliced fact table so that only the current period's balance is accessible, while experienced users can still reach the original data.
- This approach requires multiple semantic layers (one for novices, one for experts).

Guidelines for the semantic layer
- Features to avoid
  - Renaming attributes
  - Creating virtual attributes
  - Relying on subqueries
- Features to use
  - Compute nonadditive facts
  - Isolate dimension roles
  - Simplify presentation

Features to avoid
- The dimensional design itself should be rich and understandable.
- Renaming attributes
  - Names in the business view may be clearer, but names at the dimensional design level must be clear as well.
  - Do not treat semantic-layer naming as a crutch (a user-friendly translation) for unclear names in the schema.

Features to avoid
- Creating virtual attributes
  - Do not use the semantic layer as a way of saving space.
  - The dimensional design should include useful combinations and translations of elements, even if they are redundant.
  - For example, concatenating last name and first name at query time instead of storing the full name can seriously hurt performance once it is embedded in a complex query.
- Relying on subqueries
  - Frequently used subqueries (e.g., categorization based on previous behavior) should be built into the design as behavioral dimensions.

Features to use
- Compute nonadditive facts
  - e.g., margin rate = sum(margin_dollars) / sum(order_dollars)
  - A nonadditive fact built from fully additive components is best computed at query time.
- Isolate dimension roles
  - When one dimension table is used in several roles (i.e., it is referenced more than once), expose each role separately in the semantic layer by aliasing.
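The margin rate formula above can be checked with a small sketch. The `order_facts` table and its two rows are made up for illustration; the comparison shows why the ratio has to be taken after the additive components are summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (product TEXT, order_dollars REAL, margin_dollars REAL);
INSERT INTO order_facts VALUES ('A', 100.0, 10.0), ('A', 300.0, 90.0);
""")

# Intended calculation: sum the additive components first, then divide.
ratio_of_sums = conn.execute(
    "SELECT SUM(margin_dollars) / SUM(order_dollars) FROM order_facts"
).fetchone()[0]

# Misleading alternative: averaging row-level rates ignores order size.
avg_of_ratios = conn.execute(
    "SELECT AVG(margin_dollars / order_dollars) FROM order_facts"
).fetchone()[0]

print(ratio_of_sums)  # 100 margin dollars over 400 order dollars
print(avg_of_ratios)  # mean of the two row-level rates, 0.10 and 0.30
```

The two numbers differ because the larger order carries the higher margin; only the ratio of sums weights each order by its dollars.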

Features to use
- Simplify presentation
  - Organize attributes into folders and similar groupings so that end users can understand them easily.

Working with SQL-Generating BI Tools
- It is important to know the capabilities of the BI tool.
- They drive decisions about adding derived schemas or views.
- They also determine the balance between guarding users and analytic flexibility.

Multiple stars – Drilling across
- Combining information from more than one fact table:
  - Aggregate the facts from each star, grouping each result set by the same dimensions.
  - Merge the result sets.
- Some configuration is usually required so that the tool invokes this mechanism (i.e., a multi-pass query) automatically.
- If the tool cannot drill across, a merged fact table must be added.
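The two drill-across steps can be sketched as follows. The tables are simplified assumptions (the product dimension is collapsed into a text column for brevity), and the merge is done in Python rather than by any particular tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (product TEXT, quantity_ordered INTEGER);
CREATE TABLE shipment_facts (product TEXT, quantity_shipped INTEGER);
INSERT INTO order_facts VALUES ('widget', 10), ('widget', 5), ('gadget', 7);
INSERT INTO shipment_facts VALUES ('widget', 12), ('gizmo', 3);
""")


def aggregate(table, fact):
    """One pass: summarize a single star at the common grain (product)."""
    sql = f"SELECT product, SUM({fact}) FROM {table} GROUP BY product"
    return dict(conn.execute(sql).fetchall())


def drill_across():
    """Run one pass per star, then merge the result sets (full outer join)."""
    ordered = aggregate("order_facts", "quantity_ordered")
    shipped = aggregate("shipment_facts", "quantity_shipped")
    products = sorted(set(ordered) | set(shipped))
    return {p: (ordered.get(p, 0), shipped.get(p, 0)) for p in products}


merged = drill_across()
print(merged)

# For contrast: joining the two fact tables in a single query repeats each
# shipment row once per matching order row and drops unmatched products.
naive = conn.execute("""
    SELECT o.product, SUM(o.quantity_ordered), SUM(s.quantity_shipped)
    FROM order_facts o JOIN shipment_facts s ON o.product = s.product
    GROUP BY o.product
""").fetchall()
print(naive)
```

In the naive join the shipped quantity for `widget` is counted twice, which is why the aggregation has to happen per star before the merge.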

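Multiple stars – Queries with no facts
- A query with no facts is called a cross-browse query.
- There may be more than one way to relate several dimensions to one another.
- Handing the generator a plain list of data items is not enough: e.g., given a date and a product, the SQL generator cannot tell whether the user is asking about shipment_fact or order_fact.
- This can be solved by building a separate semantic layer for each star,
- or by creating aliases for the shared dimensions, e.g., ordered_product_name, shipped_product_name.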

Multiple stars – More than one way to compare processes
- Ways to compare orders and shipments:
  - By calendar date (the activity that occurred on each date)
  - By order date (the status of each order)
- When the user places date, quantity_ordered, and quantity_shipped on the canvas, it is impossible to determine which query is intended.
- Use two merged fact tables, or separate the semantic layers.
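A small sketch makes the ambiguity concrete. The tables and rows are invented for illustration, and `order_facts` is assumed to hold one row per order so that the join in the second interpretation does not repeat shipment rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (order_id INTEGER, order_date TEXT, quantity_ordered INTEGER);
CREATE TABLE shipment_facts (order_id INTEGER, ship_date TEXT, quantity_shipped INTEGER);
INSERT INTO order_facts VALUES (1, '2024-01-01', 10), (2, '2024-01-02', 4);
INSERT INTO shipment_facts VALUES
    (1, '2024-01-02', 6), (1, '2024-01-03', 4), (2, '2024-01-03', 4);
""")


def totals(sql):
    """Run a two-column (date, total) query and return it as a dict."""
    return dict(conn.execute(sql).fetchall())


def merge(ordered, shipped):
    """Merge two result sets on date, filling missing values with 0."""
    dates = sorted(set(ordered) | set(shipped))
    return [(d, ordered.get(d, 0), shipped.get(d, 0)) for d in dates]


ordered = totals(
    "SELECT order_date, SUM(quantity_ordered) FROM order_facts GROUP BY order_date")

# Interpretation 1: activity on each calendar date (shipments by ship date).
by_activity_date = merge(ordered, totals(
    "SELECT ship_date, SUM(quantity_shipped) FROM shipment_facts GROUP BY ship_date"))

# Interpretation 2: status of orders (shipments rolled up to their order date).
by_order_date = merge(ordered, totals(
    "SELECT o.order_date, SUM(s.quantity_shipped) "
    "FROM shipment_facts s JOIN order_facts o ON o.order_id = s.order_id "
    "GROUP BY o.order_date"))

print(by_activity_date)
print(by_order_date)
```

Both results have the same three column headings, so nothing on the canvas tells the generator which one the user wants.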

Multiple stars – Conformed dimensions
- The BI tool is not aware of conformed attributes, so it cannot drill across automatically.
- Possible responses:
  - Implement a snowflake schema (this affects the schema design).
  - Build a merged fact table.

Semi-additivity – Using semi-additive facts
- Fact tables for periodic snapshots contain semi-additive facts.
- An account balance is semi-additive: additive across banks, accounts, and so on, but non-additive across time.
- Constrain the query to a specific instance of the non-additive dimension, e.g., the sum of bank balances on a particular date.
- Or group the query results by instances of the non-additive dimension, e.g., the sum of bank balances ordered by date.
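The two safe query shapes, and the unsafe one, can be sketched against a toy snapshot table. The table name `account_balance_facts` and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_balance_facts (day TEXT, account TEXT, balance REAL);
INSERT INTO account_balance_facts VALUES
    ('2024-01-01', 'A', 100.0), ('2024-01-01', 'B', 50.0),
    ('2024-01-02', 'A', 120.0), ('2024-01-02', 'B', 50.0);
""")

# Safe shape 1: constrain the query to one instance of the time dimension.
total_on_day = conn.execute(
    "SELECT SUM(balance) FROM account_balance_facts WHERE day = ?",
    ("2024-01-02",),
).fetchone()[0]

# Safe shape 2: group the results by the time dimension.
total_by_day = conn.execute(
    "SELECT day, SUM(balance) FROM account_balance_facts "
    "GROUP BY day ORDER BY day"
).fetchall()

# Unsafe shape: summing across days adds the same money more than once.
summed_across_days = conn.execute(
    "SELECT SUM(balance) FROM account_balance_facts"
).fetchone()[0]

print(total_on_day, total_by_day, summed_across_days)
```

The last figure is larger than the money held on any single day, which is exactly the kind of undesired query a generator will happily produce.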

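Semi-additivity
- The problem can be avoided by using a sliced fact table to keep users away from the other periods.
- Experts are given a different semantic layer that allows access.
- When the tool itself blocks such sums, it may also block cases where a sum really is needed, such as computing an average.
- A separate, special function is then provided for that operation (averaging).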
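A rough sketch of the novice/expert split follows. A database view stands in for the sliced fact table here, which is an assumption made to keep the example short; the view name and the sample rows are also invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_balance_facts (day TEXT, account TEXT, balance REAL);
INSERT INTO account_balance_facts VALUES
    ('2024-01-01', 'A', 100.0), ('2024-01-01', 'B', 50.0),
    ('2024-01-02', 'A', 120.0), ('2024-01-02', 'B', 50.0);

-- Novice layer: only the most recent snapshot is visible.
CREATE VIEW current_balance_facts AS
    SELECT * FROM account_balance_facts
    WHERE day = (SELECT MAX(day) FROM account_balance_facts);
""")

# Novices can sum freely because only one period is exposed.
novice_total = conn.execute(
    "SELECT SUM(balance) FROM current_balance_facts"
).fetchone()[0]

# Experts query the full table. An average daily total needs a sum across
# time internally, divided by the number of periods.
average_daily_total = conn.execute(
    "SELECT SUM(balance) / COUNT(DISTINCT day) FROM account_balance_facts"
).fetchone()[0]

print(novice_total, average_daily_total)
```

The expert query shows why a blanket ban on summing over time is too strict: the average is legitimate, yet it contains exactly the sum a guarded tool would refuse.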

Browse queries – Mini-dimensions
- When there is a shortcut between the main dimension and the mini-dimension, there can be multiple ways to relate the dimensions.

Browse queries – Mini-dimensions
- Resolve the ambiguity by aliasing according to role.

Bridge tables
- Bridge tables allow a single fact to refer to more than one instance of a dimension (many-to-many; see Ch. 9 and 10).
- The sum of order_dollars for each salesperson != the grand total when more than one person took part in a sale.
- This is known as an impact table.

Bridge tables
- Designate a primary salesperson so that each order has exactly one salesperson; this is the configuration for novice users.
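Both the double counting and the primary-salesperson workaround can be sketched with a toy bridge. The table layout and the `is_primary` flag are assumptions for illustration, not the design from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (order_id INTEGER, group_key INTEGER, order_dollars REAL);
CREATE TABLE sales_group_bridge (group_key INTEGER, salesperson TEXT, is_primary INTEGER);
INSERT INTO order_facts VALUES (1, 10, 100.0), (2, 20, 200.0);
-- Order 1 was sold by Ann (primary) and Bob; order 2 by Bob alone.
INSERT INTO sales_group_bridge VALUES (10, 'Ann', 1), (10, 'Bob', 0), (20, 'Bob', 1);
""")

PER_PERSON = """
SELECT b.salesperson, SUM(f.order_dollars)
FROM order_facts f JOIN sales_group_bridge b ON b.group_key = f.group_key
{where}
GROUP BY b.salesperson ORDER BY b.salesperson
"""

grand_total = conn.execute("SELECT SUM(order_dollars) FROM order_facts").fetchone()[0]

# Through the full bridge, order 1 is credited to both Ann and Bob.
via_bridge = conn.execute(PER_PERSON.format(where="")).fetchall()

# Restricting to the primary salesperson gives each order exactly one owner.
primary_only = conn.execute(
    PER_PERSON.format(where="WHERE b.is_primary = 1")).fetchall()

print(grand_total, via_bridge, primary_only)
```

Only the primary-only rows add up to the grand total, which is what makes that configuration safe to hand to novices.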

Hierarchy bridge tables
- Two configurations: one used for looking down the hierarchy, the other for looking up.
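The two configurations differ only in which side of the bridge is joined to the fact table. The sketch below assumes a three-level company chain A > B > C and a bridge that holds one row per superior/subordinate pair, including a zero-level row for each company itself; all names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (company TEXT, order_dollars REAL);
CREATE TABLE company_hierarchy_bridge
    (superior TEXT, subordinate TEXT, levels_removed INTEGER);
INSERT INTO order_facts VALUES ('A', 10.0), ('B', 20.0), ('C', 30.0);
INSERT INTO company_hierarchy_bridge VALUES
    ('A', 'A', 0), ('A', 'B', 1), ('A', 'C', 2),
    ('B', 'B', 0), ('B', 'C', 1),
    ('C', 'C', 0);
""")


def orders_at_or_below(company):
    """Looking down: facts join to the subordinate side of the bridge."""
    sql = """SELECT SUM(f.order_dollars)
             FROM order_facts f
             JOIN company_hierarchy_bridge b ON f.company = b.subordinate
             WHERE b.superior = ?"""
    return conn.execute(sql, (company,)).fetchone()[0]


def orders_at_or_above(company):
    """Looking up: facts join to the superior side of the bridge."""
    sql = """SELECT SUM(f.order_dollars)
             FROM order_facts f
             JOIN company_hierarchy_bridge b ON f.company = b.superior
             WHERE b.subordinate = ?"""
    return conn.execute(sql, (company,)).fetchone()[0]


print(orders_at_or_below("B"), orders_at_or_above("B"))
```

A SQL generator has to be told which of the two join directions applies, which is why each direction needs its own configuration.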

Cube-centric Business Intelligence
- Some BI tools support interaction with multidimensional databases rather than with a semantic layer.
- Multiple semantic layers may seem confusing, whereas multiple cubes are natural.
- Different cubes are provided for different purposes:
  - orders -> order cube, shipments -> shipment cube
  - comparing orders and shipments -> order & shipment cube

Cube-centric Business Intelligence
- The cube's precomputed values can strengthen the "online" quality of OLAP.
- On the other hand, performance degrades as the number of attributes grows, so cubes are less scalable.
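The trade-off can be illustrated with a naive sketch that precomputes a total for every combination of attributes. Real multidimensional engines are more selective about what they materialize; the function `build_cube` and the sample rows are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dims, fact):
    """Precompute SUM(fact) for every subset of dims (2**len(dims) group-bys)."""
    cube = {}
    for size in range(len(dims) + 1):
        for grouping in combinations(dims, size):
            totals = defaultdict(float)
            for row in rows:
                totals[tuple(row[d] for d in grouping)] += row[fact]
            cube[grouping] = dict(totals)
    return cube


rows = [
    {"brand": "Acme", "month": "Jan", "region": "East", "order_dollars": 100.0},
    {"brand": "Acme", "month": "Feb", "region": "West", "order_dollars": 50.0},
    {"brand": "Zenith", "month": "Jan", "region": "East", "order_dollars": 70.0},
]
cube = build_cube(rows, ["brand", "month", "region"], "order_dollars")

# Answering a query is now a dictionary lookup rather than a scan.
acme_total = cube[("brand",)][("Acme",)]
grand_total = cube[()][()]

# The number of group-bys to precompute doubles with each added attribute.
groupings_needed = [2 ** n for n in (3, 10, 20)]
print(acme_total, grand_total, len(cube), groupings_needed)
```

Lookups are immediate, but three attributes already require eight precomputed groupings and twenty would require over a million, which is the scalability concern in miniature.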

Multiple cubes
- Drilling into multiple cubes is less of a concern: just merge the cubes into a new one.
- How to deal with factless queries is also far less of a concern: once a cube is selected, the scope of analysis is determined.
- When there are several ways to compare two processes (e.g., orders and shipments), prepare a merged table for each case.
- Safety vs. flexibility: safe but limited cubes for novices, flexible but dangerous cubes for experts.

Multiple cubes
- A new concern: cube-based approaches are less scalable.
- Many ‘targeted’ cubes are needed, which can result in an over-proliferation of cubes.
- The designer should carefully select the appropriate mix of cubes and avoid one cube per report.

Auto-generation of cubes
- As with SQL generation, cube generation requires care.
- Easy to generate: cubes that summarize the base data without transforming its dimensional structure, such as aggregating or slicing.
- Difficult to generate: a merged cube, which must be constructed according to the standard drill-across process.
- Keep in mind that manual control may be needed.

Hierarchy of attributes
- Some tools use attribute hierarchies to control the drilling process.
- Competing hierarchies (e.g., calendar year vs. fiscal year) will require separate cubes.