SQL 서버 성능 문제 해결 및 Locking Internals

Slides:



Advertisements
Similar presentations
Oracle DB 구조 및 트랜잭션 관리 이경화 Database 의 구조 Program Global Area (PGA) Instance Database Buffer Cache Redo Log Buffer Library Cache Shared.
Advertisements

1 Orange Part II WareValley. 2 Loader Tool 3 Loader Tool 실행.
SQL Server 2000 트랜잭션과 잠금 데브피아 세미나
DB2 Information Management DB2 UDB CLP Command Summary.
SAP Tuning 실무 교육 목 차 1. SAP Architecture 의 이해 2. Monitoring 3. Tuning 방법 결정 (DB or ABAP) 4. Performance Trace (DB) 5. Run Time Analysis (CPU)
Linux Debugging issues
C++ Tutorial 1 서강대학교 데이터베이스 연구실.
Chapter 7 ARP and RARP.
Introduction to Django
기술 표준 6대 필수 기술 요소에 대해 지정한 그룹 IT 기술 표준에 따라 DBMS는 MS SQL과 Oracle에 대해 검토 함 구분 OS DBMS WAS Web Sever 검토대상 종합의견 x86 기반 OS(64bit 권장) 성능, 안정성 및 HW의 확장성 향상으로.
데이터 모델링 방법론 2003년 03월.
IT Application Development Dept. Financial Team May 24, 2005
4. 데이터 기능 유형.
SAP QUERY SAP R/3 4.6C.
전자정부 서비스 운영을 위한 SLA 적용 방안 남기찬 교수 서강대학교 아웃소싱연구센터 (
SQLDiag를 이용한 SQL Server 2005 성능진단
Operating Systems Overview
AWR DB 보고서 분석.
SQL Server 2005 데이터베이스 가용성 강화 측면에서 본 데이터베이스 미러링과 스냅샷, 복제
MySQL performance Xhark 김재홍.
Toad for Oracle 설치 방법.
Toad for SQL Server 제품 소개서 – 프로넷소프트㈜.
7장 : 캐시와 메모리.
 midi LOGGER GL220   신제품 소개 Dec, 2011.
McGraw-Hill Technology Education
Uniprocessor Scheduling
12. 데이터베이스 설계.
InstallShield Professional Services ( Services Pack / Education / Consulting ) ㈜소프트뱅크 커머스.
오라클 데이터베이스 성능 튜닝.
SmileEDI 가입 안내서 1. SmileEDI `회원가입 절차 -SmileEDI 접속 방법-
장윤석과장 Technology Specialist (주)한국마이크로소프트
On the computation of multidimensional Aggregates
MySQL 기본 사용법.
6장. 물리적 데이터베이스 설계 물리적 데이터베이스 설계
ER-Win 사용 방법.
SQL Server 2000 세미나 Profiler를 이용한 문제해결
(Network Transaction Application Server)
Lecture #3 프로세스(Process).
DataStage 운영자 지침서 Operator’s Guide
운영체제 (Operating Systems)
목차 회사소개 회사현황 시스템 구성도 SQL Server 사용 로드맵 프로젝트 개요 DB 마이그레이션
SQL Server 성능 개선 방법론 장윤석 과장 기술사업부 ㈜한국마이크로소프트 [제목과 문서속성 값을 입력해 주세요.
강사: 이종인 다우 교육원 전임강사 / 온디멘드 수석 컨설턴트 / FMG 수석 컨설턴트
SQL Server 7.0 세미나 (Performance Tuning)
제4장 유닉스 쉘 숙명여대 창병모 2011 가을.
정보처리기사 8조 신원철 양진원 유민호 이기목 김다연 윤현경 임수빈 조현진.
SYSMAC GATEWAY 간이 매뉴얼.
운영체제 (Operating Systems) (Memory Management Strategies)
Introduction to Programming Language
JFS operation HP Korea / Operations JFS operation.
SmileEDI 가입 안내서 1. SmileEDI `회원가입 절차 -SmileEDI 접속 방법-
Chapter 12 Memory Organization
시스템 분석 및 설계 글로컬 IT 학과 김정기.
Machine architecture Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Chapter 2.
Operating System Multiple Access Chatting Program using Multithread
SQLDiag를 이용한 SQL Server 2005 성능진단
Signature, Strong Typing
Signature, Strong Typing
How I Approach Tuning a SQL Statement
Chapter 30 Creating Web Audio Files
점화와 응용 (Recurrence and Its Applications)
Machine architecture Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Chapter 2.
1. 관계 데이터 모델 (1) 관계 데이터 모델 정의 ① 논리적인 데이터 모델에서 데이터간의 관계를 기본키(primary key) 와 이를 참조하는 외래키(foreign key)로 표현하는 데이터 모델 ② 개체 집합에 대한 속성 관계를 표현하기 위해 개체를 테이블(table)
Machine architecture Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Chapter 2.
I/O Management and Disk Scheduling
ARENA Basic Process Techniques
SQL Server Reporting Services Feature
Chapter 4. Energy and Potential
Chapter 7: Deadlocks.
가상 기억장치 (Virtual Memory)
Presentation transcript:

SQL 서버 성능 문제 해결 및 Locking Internals [제목과 문서속성 값을 입력해 주세요. 문서속성 값은 파일 메뉴>속성에서 입력해 주시면 됩니다.] [하단의 슬라이드 노트 내용 입력란에 슬라이드별 스크립트를 삽입해 주시기 바랍니다.] 김종구 대리 기술지원부 마이크로소프트

강사 소개 김종구/마이크로소프트 기술지원부 (2000 ~ 현재) Infrastructure RRE SQL Support Engineer KT NeOSS, Auction, 삼성생명 등 주요 사이트 기술지원 “SQL Memory Architecture” 등 TechNet 세미나 진행 [강사 소개 부분은 필수 사항은 아니나 되도록이면 사진, 경력 등을 기입해 주시기 바랍니다.]

목적 성능 관련 문제 발생 시 데이터 수집을 위한 PSSDIAG 툴 소개 성능 문제 해결을 위한 전반적인 방법론 소개 및 관련 지식 전달 Locking Internal에 대한 지식 전달

대상 기술범위 PSSDIAG Tool Read80Trace Tool SQL Server Performance Troubleshooting Methodology Locking Internals [세션에서 다루는 기술범위] [예 : Windows Sever 2003 성능]

이 주제를 이해하는 데 필요한 지식 Level 200 - sysprocesses 의 waittype 및 lastwaittype Performance 문제 해결을 위한 기본 데이터에 대한 이해 - sysprocesses 의 waittype 및 lastwaittype - blocking Monitoring방법 - 성능 로그 사용법 및 주요 counter - Profiler 사용법 [이 세션을 쉽게 이해를 하기 위해서 필요한 지식이나 정보를 입력해주세요.] [Level은 다음과 같은 기준으로 입력합니다.] - 레벨 100: 초보자 – 개념, 기능, 장점, 향후 발전 방향 - 레벨 200: 중급자 – 100 내용에서 좀더 자세히 - 레벨 300: 고급자 – 사용경험과 아키텍처, 상세설명 및 성능, 마이그레이션, 설치, 개발 - 레벨 400: 전문가 – 사용경험 위주의 상세한 설명으로 전문가 대상의 상세한 설명 Level 200

목차 PSSDiag SQL Server Performance Troubleshooting SQL Server Locking Internals and Troubleshooting

PSSDiag

What is PSSDiag? Wrapper around data collection APIs commonly used in PSS, particularly SQL Server Support (Profiler, blocking script, Perfmon/Sysmon, SQLDIAG, event logs, etc) Designed to provide double-click simplicity and reduce user error Get all the needed data the first time, collected at the same time See KB 830232 for information and download location

Components GUI configuration utility (PSSDiagConfig.EXE) Configure types of data to collect Save configuration in XML document Can also manage the collector service Collector (PSSDiag.EXE) Collector app, consumes configuration file created by GUI Can run as a service or a console app

GUI configuration utility Typical use: Select the target version of SQL Server Supply authentication mode and relevant info Select and configure the diagnostics you want to collect Click Start The GUI configures and starts the collector service for you Any output messages from the collector will be displayed in the GUI Diagnostic files will be written to the output folder (.\OUTPUT by default)

Collector (PSSDiag.EXE) Can run as a console app or as a service (GUI always runs it as a service) Can compress files using NTFS or ZIP compression via /Cn parameter Logging out while running as a console will stop data collection Works fine from Terminal Services session Set the output folder via /O parameter To uninstall, run PSSDiag /U to uninstall the service (if you’ve installed it), then delete the files extracted from the archive

Collector (PSSDiag.EXE) Optional command line params: /Cn – (/C1 for NTFS background compression, /C2 for ZIP compression at shutdown) /Nn – erase, overwrite, or rename output folder /B YYYYMMDD_HH:MM:SS – Start time /E YYYYMMDD_HH:MM:SS – Automatic shutdown time /G – Generic mode. Disables SQL Server-centric mode to permit collection on machines without SQL Server installed /R – register as a service /U – deregister service

Collecting Data From a Clustered SQL Instance Two options here: Use the default machine name (“.”) when running on a cluster node, and PSSDiag will collect data from all SQL Server virtual servers on the cluster Supply a virtual SQL Server name for the machine name (leave the instance name set to “*”) and PSSDiag will collect from that virtual server only

Collecting Data from a Remote Server Supply the machine name when starting the GUI Configuration is disabled when connecting to a remote machine Output files will be written to remote machine Output path must exist or be creatable Run the collector as a console app if you wish to capture diagnostic files to the local machine: Profiler trace is always a server-side collection Output path must exist on local machine and the SQL Server Never capture Profiler to a UNC or network drive .TRC files will be copied to local OUTPUT dir on shutdown Other data types (blocker, perfmon, etc) will be captured wherever the collector console app is running

Scheduling Collection Start/Stop When running as a service: Schedule an NT job to run PSSDiagControl START to start the service Schedule an NT job to run PSSDiagControl STOP to stop it When running as a console app: Schedule the console app to start via the NT scheduler Use /E or /B parameters if start/stop time and day (or relative time) is known Schedule an NT job to create a file named PSSDiag.STOP in the output folder Don’t KILL or you will leave Profiler trace running on SQL - see KB 283786 for manually stop and delete trace.

Collecting Data for Extended Periods Run as a service so that you can log out of the console (the GUI needn’t keep running) Can schedule start/stop times and delete/rename old output folder Consider using /C1 to minimize space used by rolled over .TRC and .BLG files The fact that .TRC must always be collected on the server makes remote collection not very effective for minimizing disk space use on the server

PSSDiag Impact on Server Performance Impact of PSSDiag.EXE itself is negligible Perf impact of collection equals sum of costs of diagnostics being collected Generally dominated by cost of Profiler tracing (use the “Detailed Performance” trace template only when you actually need it) Blocking detection script and Perfmon shouldn’t have a significant impact

Troubleshooting Intent is to save time – don’t let it become an obstacle Main collector log is ##PSSDiag.LOG All console output also written to application event log Some Perfmon counter errors are normal (e.g. “Could not add counter: XYZ. - The specified object is not found on the system.”) All scripts (and profiler trace) are started via osql.exe. Script output files are .OUT files with names like “##server__Run_sp_trace.OUT”

Demo - Using PSSDIAG Tool [90분 강의는 3~4개의 데모가, 60분 강의는 1~2개의 데모가 반드시 포함되어야 합니다.] [데모 주제 아래의 [작업 1][작업 2]는 데모의 흐름, Task 목록 등을 기입해 주시면 됩니다.] [데모별로 상기의 슬라이드가 데모 전에 나와야 합니다.]

SQL Server Performance Troubleshooting

Agenda Methodology Resource bottlenecks Determining your bottleneck Which queries are responsible Tuning the identified queries

Methodology System performance is the result of aggregate performance of all queries At a high level what type of bottleneck does system have Find the queries using the most of that resource Is resource being used efficiently Always another bottleneck

Common DB Bottlenecks Synchronization (Locks/Latches) CPU IO Memory 224453 is good KB for this CPU Single query/single CPU Single query/parallel Aggregate query load over all CPUs IO Insufficient memory or poor access path? Memory SQL Server throttles the number of concurrently executing queries with sorts/hashes

Performance Monitor Synchronization CPU Locks: Lock Waits/sec, Lock Wait Time (ms) Latches: Total Latch Wait Time, Latch Waits/sec CPU Sustained rates at 75+ percent Compiles/sec, Recompiles/sec

Performance Monitor II IO SQL Server View Page Reads/sec Readahead pages/sec Checkpoint & Lazywrites/sec fn_virtualfilestats Operating System View Avg Disk sec/Read or Write Disk Queue Length is often NOT a good indicator Memory Memory Grants Pending Max/Granted Workspace Memory

DBCC SQLPERF(WAITSTATS) Number of waits & total wait time for each waittype Example PAGEIOLATCH_SH 64.0 7748.0 761.0 PAGEIOLATCH_UP 24.0 2381.0 10.0 PAGEIOLATCH_EX 21.0 2274.0 60.0 KB 822101 for description of the various waittypes Take delta between snapshots, or clear with DBCC SQLPERF(WAITSTATS, CLEAR)

SQL Server Trace (Profiler) Use the sp_trace procedures instead of the GUI Significantly less performance impact Won’t “drop” events if rate is high Write trace files to fast drive(s) Configure what you want to trace in GUI and use File – Script Trace option PSS prefers PSSDiag option for one step collection

Analyzing Trace Data Use GUI option to sort by a column Use fn_trace_gettable to load and query the data Problems Time consuming and generally requires you to have a specific problem in mind Individual queries identified may not be relevant to the problem Too manual—easy to miss things

Introducing Read80Trace All text is “normalized” to remove comments, white space & parameters Database only stores the text of the first “unique” entry Detail data is loaded in normalized format to facilitate joins, reduce redundant data Connections Batches/UniqueBatches Statements/UniqueStatements Plans/UniquePlans See KB 887057 for download location

Demo Using Read80trace Tool - Queries using the most CPU [90분 강의는 3~4개의 데모가, 60분 강의는 1~2개의 데모가 반드시 포함되어야 합니다.] [데모 주제 아래의 [작업 1][작업 2]는 데모의 흐름, Task 목록 등을 기입해 주시면 됩니다.] [데모별로 상기의 슬라이드가 데모 전에 나와야 합니다.] Using Read80trace Tool - Queries using the most CPU - Query that changes execution plans - Comparing “good” trace with “bad” trace

Looking at Specific Queries Does performance change correlate with plan differences Different execution plan Different amount of work performed Majority of bad plans caused by poor cardinality estimates Use STATISTICS PROFILE to find the problematic part of plan

Using STATISTICS PROFILE Everything except Rows/Executes is compile time information Executes column reflects parallelism For example, scan that executes 4 times Compare Rows with (EstimateRows * EstimateExecutes) Find most deeply nested operator where the error originates; it propagates up the tree from there

Demo - Using STATISTICS PROFILE [90분 강의는 3~4개의 데모가, 60분 강의는 1~2개의 데모가 반드시 포함되어야 합니다.] [데모 주제 아래의 [작업 1][작업 2]는 데모의 흐름, Task 목록 등을 기입해 주시면 됩니다.] [데모별로 상기의 슬라이드가 데모 전에 나와야 합니다.] - Using STATISTICS PROFILE

Acceptable Cardinality Error Reasonable margin of error depends on operator Loop joins – within 2x range pretty reasonable Merge join – 5x is reasonable Hash joins – size of build input (first table below join) affects hash table memory size. Probe input (second table) doesn’t matter much Sorts – size affects memory grant and 2x is reasonable Differences in estimates may not be bugs Should the optimizer do better given the available statistics

Cardinality Estimation Histograms contain most useful information for predicates with literal au_lname = ‘Smith’ StartTime BETWEEN ‘2003-01-01’ AND ‘2003-12-31’ Density information ColA = ‘x’ and ColB = ‘y’ Equijoins Equality predicates with variables Auto create statistics only creates single column statistics Other estimates usually based on fixed selectivity estimates Percentage based on the comparison operator See Inside SQL Server 2000 for table of values

Auto Statistics Samples a percentage of the data Minimum sampling ~4MB of data Maximum a function of rows in table If you have issues with bad plans Update statistics (sampled) – if this fixes it then histogram is probably out of date or auto update not triggered soon enough for your query If fullscan required to fix a problem use DBCC SHOW_STATISTICS to see how much the density values differ

Limitations to Consider T-SQL variable (as opposed to parameter) Value not known at compile time so can’t use histogram Builtin functions No statistics available Multi-statement table-valued functions Table variables Temp table & recompile uses statistics

SQL Server Locking Internals and Troubleshooting

Introduction UMS Scheduling and Workers What is a SQL Resource How SQL Server really waits on a ‘resource’ Blocking Crabbing Fetch Rates Physical vs Logical Protection Scans and Lock Classes Lock Escalation

UMS Scheduling User Mode Scheduling Precise Resource Usage Preemptive vs Non-Preemptive

SQL Server Workers What is a worker? Worker Pool Request bound to Worker for (Life Time) - Example Target Setting - Example Not Dynamic Division of Workers

Connection Bound To Scheduler Assignment: Scheduler with fewest users Life time: Bound for connection life time Workload Matters

A SQL Resource Reader / Writer Waiters list FIFO Maintained

Blocked / Blocking Blocked Worker tied up in block scenario Attentions / Query Timeout (Blocking) Diagnostics sysprocesses syslockinfo dbcc opentran - Example Profiler and SQLTrace xact_abort Performance Monitor dbcc sqlperf(waitstats) dbcc pss – Example PSSDiag

Resource Crabbing Maintains Data Stability Acquire Next Release Previous

Client Fetch Rate Matters Sending Results How Crabbing Applies Mobile Links Lock Scope Never Perfect Preemptive Network Writes

Batch Size Can Matter Touching Several Objects External Logic XProcs COM Objects The Transaction Log Sector Alignment Flush to LSN

DTC Locking All Under One Roof SQL 7.0 Behavior SQL 2000 Behavior Deadlocking SPID = -1 or SPID = -2

Physical versus Logical Protection Physical = Latch Reading Page Into Memory Writing A Page To Disk Inserting A Row Internal Data Structures Logical = Lock Inserting / Modifying A Row

Scans (the SDES) Lock Classes What is an SDES? SDES and Query Plan - Example Parallel – Multiple Scans What is a Lock Class? Lock Class and Lock Escalation

Lock Escalation What is Lock Escalation? Lock Escalation Paths Escalation Boundaries Monitoring Lock Escalation Performance Monitor Profiler Work Tables Escalation Failures Trace Flag -T1211 Batch Size Matters

Helpful Tidbits Identity Property Update with Variable Assignment Attentions / Query Cancels Blocked Scheduler Workers Open, Fetch, and Close Implicit Transactions MAXDOP = 1 Locking Hints Isolation Levels

세션 요약 성능관련 문제 해결을 위한 데이터 수집 툴인 PSSDIAG에 대한 사용 방법 소개 성능문제 파악을 위한 주요 방법론 Locking Internal에 대한 개념 [세션 내용을 요약하여 고객께 다시 한번 정리하여 줍니다.]

참고 자료 (1) TechEd 2004 Korea - SQL Server Performance Toolbox (by 하성희) http://msdn.microsoft.com/library/en-us/tsqlref/ts_sys-p_3kmr.asp “SQL Server 2000 Performance Tuning with Waits and Queues” SQL Magazine (January 2004) By Tom Davidson “Inside SQL Server 2000” by Kalen Delaney “Microsoft SQL Server 2000 Performance Tuning Technical Reference “ by Edward Whalen http://sqldev.net/misc/waittypes.htm [참고하신 웹사이트나 관련 문서, 리포트 등의 출처를 기재하여 주십시오.]

참고자료 (2) Q224587 Troubleshooting Application Performance with SQL Server Q251004 How to Monitor SQL Server 2000 Blocking Q224453 Understanding and Resolving SQL Server 7.0 Blocking Problem Q243588 Troubleshooting Performance of Ad-Hoc Queries Q244455 Definition of Sysprocesses Waittype and Lastwaittype Fields for SQL Server 7.0 Q271509 INF: How to Monitor SQL Server 2000 Blocking Q224587 INF: Troubleshooting Application Performance with SQL Server Q243589 INF: Troubleshooting Slow-running Queries Q243586 INF: Troubleshooting Stored Procedure Recompilation Q822101 The waittype and lastwaittype fields in the sysprocesses table Q255596 sp_lock2 Returns Additional Data Q283696 Job to Monitor SQL Server 2000 Performance and Activity Q283784 How to View SQL Server 2000 Activity Data Q262499 INF: Using Output Parameters with sp_executesql Q283786 INF: How to Monitor SQL Server 2000 Traces