Searching for genes involved in common complex trait diseases
Hyoung Doo Shin (신형두)
Department of Life Science, Sogang University; National Research Laboratory; SNP Genetics, Inc.
SNP (Single Nucleotide Polymorphism)
SNPs are single-base variations in the genetic code that occur about every 1,000 bases along the three billion bases of the human genome.
A source of variation between individuals?
Human Genetic Variations
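Keys to Genome: SNP (Human Genome Project)
Factors: pathogens, physiology, environment
Individual diversity: individual differences in disease susceptibility; individual differences in drug response; individual differences in appearance and intelligence
Outcomes: disease genes, personalized medicine, drug targets, genetic diagnosis, disease conquest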
Common Complex Trait Diseases
Methodology of Genetic Epidemiology
- Study design
- Sample and clinical data collection
- Gene selection
- Genetic analysis
- Statistical analysis
High throughput SNP Genotyping
1. Infinium (up to 1M SNPs genotyped at one time)
2. BeadArray (1,536 SNPs genotyped at one time)
3. SNPlex (48 SNPs genotyped at one time)
4. TaqMan (only 3 hours for genotyping)
5. Single Base Extension (SBE)
Genetic Analysis
- Map of gene
- Software for LD calculation and haplotype construction
- LD block definition
- Selection of haplotype tagging SNPs
- HWE, heterozygosity
Example: Association Analysis of Novel TBX21 Variants with Asthma Phenotypes (Chung et al., 2003, Human Mutation, IF 6.134)
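The LD calculation step listed above can be illustrated with a minimal sketch. It assumes that phased two-SNP haplotype counts are already available. The function name `ld_stats` and the example counts are hypothetical and are not taken from the study or from any particular LD software.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from phased haplotype counts.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2,
    n_Ab counts haplotypes carrying A at SNP 1 and b at SNP 2, and so on.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / n  # frequency of allele B at SNP 2

    # Raw disequilibrium coefficient
    d = p_AB - p_A * p_B

    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Squared correlation between the two loci
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts: SNP pair in strong but incomplete LD
d, d_prime, r2 = ld_stats(n_AB=60, n_Ab=0, n_aB=20, n_ab=20)
print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

D' near 1 with a lower r² indicates that one haplotype is missing while the allele frequencies differ, which is one reason tagging-SNP selection usually relies on r² rather than on D' alone.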
Haplotype & LD analysis
[Figure: gene maps of ADH3 and ADH2 with the variants below]
+7462T>C (C104C), +7700G>A (T151T), +7721G>A (V158V), +9868G>A (R272Q), +13044A>G (I349V), +3170A>G (R48H), +13543A>G, -1064C>T, -325G>C, +4977A>G, +5280G>A, +5702A>G, -992C>G, +3377G>T, +3491G>A, -957C>A, +4076T>C
Statistical Analysis
- χ² tests
- Fisher's exact tests
- Logistic regression
- MHC
- Cox proportional hazards model
- Survival analysis
- DCA
Kaplan-Meier (K-M) survival analysis of IL10-ht2 (HCC) (Shin et al., Human Molecular Genetics)
Aldehyde Dehydrogenase 2 Gene Associated with Risk of Tuberculosis

| Genotype | TB | Control | Referent analysis(a): OR (95% CI) | P |
|---|---|---|---|---|
| Glu/Glu | 391 (82%) | 559 (70.2%) | 1 | - |
| Glu/Lys | 78 (16.4%) | 216 (27.1%) | 0.52 (0.39-0.69) | 0.000008 |
| Lys/Lys | 8 (1.7%) | 21 (2.6%) | 0.55 (0.24-1.24) | 0.15 |
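As a hedged illustration of the referent analysis, the sketch below computes a crude odds ratio with a Woolf (log-based) 95% confidence interval from the Glu/Lys and Glu/Glu counts in the table. The function name `odds_ratio_ci` is hypothetical. The slide does not state whether its estimates were adjusted for covariates, so a crude estimate is an assumption and may differ slightly from the reported values.

```python
import math


def odds_ratio_ci(case_exposed, case_ref, ctrl_exposed, ctrl_ref, z=1.96):
    """Crude odds ratio and Woolf confidence interval for a 2x2 table.

    'exposed' is the genotype of interest and 'ref' is the referent genotype.
    z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_exposed + 1 / case_ref + 1 / ctrl_exposed + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts from the table: Glu/Lys vs. Glu/Glu (referent), TB cases vs. controls
or_gl, lo_gl, hi_gl = odds_ratio_ci(78, 391, 216, 559)
print(f"Glu/Lys: OR={or_gl:.2f} ({lo_gl:.2f}-{hi_gl:.2f})")

# Lys/Lys vs. Glu/Glu (referent)
or_ll, lo_ll, hi_ll = odds_ratio_ci(8, 391, 21, 559)
print(f"Lys/Lys: OR={or_ll:.2f} ({lo_ll:.2f}-{hi_ll:.2f})")
```

An interval that excludes 1, as for Glu/Lys, corresponds to a significant protective association. The wide interval for Lys/Lys reflects the small number of homozygotes.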
New Generation of Human Genetics - Whole Genome SNP Study
High Throughput Genotyping – SNP Chip: 100K, 317K, 550K, 1M (now)
Chromosome Aberrations CNV Analysis
Identification of Chromosome Aberrations - Normal and B cell line DNA
[Figure panels: LOH (neutral); Regional Aberration (3N); Whole chr. Aberration (3N); Regional Aberration (1N or LOH loss)]
CNV Analysis Background
CNV Analysis – Current Status
The Database of Genomic Variants
# of CNV regions: 2,714
Redon et al. 2006, Nature: Int'l HapMap, 270 samples
Affymetrix/CGH/QPCR
# of CNV regions: 8,955
TaqMan/CGH/QPCR
Major CNV Detection Methods
1. Intensity: SNP Chip, CGH, Q-PCR…
2. Hardy-Weinberg Equilibrium (HWE)
3. Mendelian Error in Family
4. User Specific Detection Algorithms
CNV Analysis - X chromosome Markers
Copy number difference changes cluster position.
[Figure: genotype cluster plots for 2X (Female) and 1X (Male) samples]
CNV Analysis: HWE - Single/Common CNV Detection
Criterion: HWE (P<0.01)
Whole Genome HWE Test (n=327): 3,866 HWE<0.01, 801 CNVs detected
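The HWE screen can be sketched as a one-degree-of-freedom χ² goodness-of-fit test on genotype counts. A deletion allele makes hemizygous carriers look like homozygotes, which shows up as a heterozygote deficit. The function name `hwe_chi2_test` and the example counts are hypothetical. Only the sample size (n=327) and the P<0.01 threshold come from the slide, and the exact test used in the study is not stated.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical marker with a strong heterozygote deficit among 327 samples
chi2, p_value = hwe_chi2_test(n_aa=150, n_ab=27, n_bb=150)
flagged = p_value < 0.01  # threshold used on the slide
print(f"chi2={chi2:.1f}  P={p_value:.2e}  flagged={flagged}")
```

HWE deviation also arises from genotyping error or population stratification, so flagged markers are candidates that still need confirmation by another method.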
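CNV Analysis: Mendelian inconsistency - Family
[Figure: trio with a Mendelian inconsistency]
Father: called (AA), actual A/-
Mother: called (BB), actual BB
Son: called (BB), actual B/-
The called genotypes AA x BB cannot produce a BB son. A deletion allele ("-") passed from father to son explains the inconsistency.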
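A minimal sketch of the trio check follows. Genotypes are represented as two-character strings, with "-" standing for a deletion allele. The function name `is_mendelian_consistent` and this encoding are assumptions made for illustration.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child genotype can arise from the two parental genotypes.

    Each genotype is a two-character string such as 'AA', 'AB', or 'A-',
    where '-' denotes a deletion allele. Allele order within a genotype is ignored.
    """
    possible = {"".join(sorted(f + m)) for f, m in product(father, mother)}
    return "".join(sorted(child)) in possible


# Genotypes as called by the SNP chip: AA x BB -> BB is inconsistent
print(is_mendelian_consistent("AA", "BB", "BB"))

# Same trio once the deletion allele is modeled: A/- x BB -> B/- is consistent
print(is_mendelian_consistent("A-", "BB", "B-"))
```

Markers that fail the first check but pass once a deletion allele is allowed are candidate CNV loci in the family.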
CNV Analysis: Validation by Q-PCR
- 18 autosome CNVs were validated by QPCR (90%: 18/20)
- Example: Chr 6, rs12530252
Sample quantity (Q'ty), amount relative to the standard (average):
Samples DM211, DM233, DM254, DM255, DM274, DM285, DM286, DM287, DM347: 3.0, 2.6, 2.4, 2.5, 2.9, 2.7
Samples DM189, DM202, DM232, DM342, Father, Mother: 1.1, 1.2, 1.0, 1.4
Samples DM77, DM182, Son: 0.0
Copy-number classes shown: 2N, 1N, Del/Del
* Two CNV: SNP in probe
Validation of all CNVs by Illumina GoldenGate Assay (n=1,536)
CNV Analysis: Mendelian inconsistency
[Figure: cluster plots with family member labels F, M, S, D1, D2]
124 CNVs were detected (1 family, n=5).
Large-scale family data sets will be effective for detecting single-locus CNVs.
CNV Analysis: CNV Association Analysis
[Figure: genotype clusters AA, AB, BB, AD, BD, DD]

Allele Test of rs7424350 (n=400)

| Allele | Allele freq., DM | Allele freq., NC | χ²-test (allele) |
|---|---|---|---|
| A | 49 (12.3%) | 55 (13.8%) | P = 0.510 |
| B | 254 (63.5%) | 272 (68.3%) | P = 0.149 |
| D | 97 (24.3%) | 71 (17.8%) | P = 0.026 |

χ²-test global P-value = 0.0829

Association Analysis of rs7424350 among T2DM study subjects

| Model | Allele freq., Case | Allele freq., Control | Co-dominant OR (95% CI) | P | Dominant OR (95% CI) | P | Recessive OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|
| AA/AB/BB | 0.157 | 0.173 | 0.89 (0.55-1.43) | 0.62 | 0.92 (0.53-1.57) | 0.75 | 0.54 (0.10-3.02) | 0.49 |
| AA/AD/DD | 0.276 | 0.563 | 0.24 (0.08-0.73) | 0.01 | 0.13 (0.03-0.70) | 0.02 | 0.22 (0.04-1.38) | 0.11 |
| BB/BD/DD | 0.722 | 0.794 | 0.67 (0.46-0.98) | 0.04 | 0.13 (0.03-0.56) | 0.007 | 0.79 (0.50-1.24) | 0.30 |
| AA/(AD+AB)/(BB+BD+DD) | 0.123 | 0.138 | 0.87 (0.57-1.32) | 0.51 | 0.89 (0.57-1.41) | | 0.49 (0.09-2.72) | 0.42 |
| BB/(BD+AB)/(AA+AD+DD) | 0.635 | 0.683 | 0.80 (0.60-1.08) | 0.15 | 0.52 (0.27-0.98) | | 0.88 (0.59-1.30) | 0.52 |
| DD/(AD+BD)/(AA+AB+BB) | 0.243 | 0.178 | 1.48 (1.05-2.10) | 0.03 | 1.31 (0.87-1.96) | 0.19 | 7.99 (1.80-35.40) | 0.006 |
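The allele test can be sketched with Pearson χ² tests on the allele counts reported for rs7424350. Each allele is compared against the other two pooled (1 df), and the global test uses the full 3x2 table (2 df). The absence of a continuity correction is an assumption, as are the helper names. With these choices the P-values should come out close to those on the slide.

```python
import math

# (DM, NC) allele counts from the slide; "D" denotes the deletion allele
counts = {"A": (49, 55), "B": (254, 272), "D": (97, 71)}


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


def p_value_df2(chi2):
    """Upper-tail probability of the chi-square distribution with 2 df."""
    return math.exp(-chi2 / 2)


dm_total = sum(dm for dm, _ in counts.values())
nc_total = sum(nc for _, nc in counts.values())

# Per-allele tests: one allele vs. the other two pooled
per_allele_p = {}
for allele, (dm, nc) in counts.items():
    chi2 = chi2_2x2(dm, dm_total - dm, nc, nc_total - nc)
    per_allele_p[allele] = p_value_df1(chi2)

# Global test on the 3x2 table of allele by group
grand_total = dm_total + nc_total
chi2_global = 0.0
for dm, nc in counts.values():
    row_total = dm + nc
    for observed, col_total in ((dm, dm_total), (nc, nc_total)):
        expected = row_total * col_total / grand_total
        chi2_global += (observed - expected) ** 2 / expected
global_p = p_value_df2(chi2_global)

for allele, p in per_allele_p.items():
    print(f"allele {allele}: P = {p:.3f}")
print(f"global P = {global_p:.4f}")
```

Only the deletion allele D reaches nominal significance on its own, while the global test does not, which matches the pattern reported in the table.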
CNV Analysis: Korean CNV D/B
- Fields: rs#, chromosome, position, gene, discovery methods, CNV freq., cluster images
- Links to NCBI and Int'l HapMap websites
(Submitted to Human Mutation, 2007)
Whole Genome Association Study
Statistical Analysis: WGA, Chromosome 1
[Figure panels: T2DM (logistic), Cholesterol, HDL, BMI, Triglyceride, WHR, FBS, Insulin]
Statistical Analysis: Consideration - Current Criteria for Good Association Studies
- Large sample size
- Low P value
- Biological sense
- Replication
- High OR/RR/AF
Next Generation Genome Technology
- Next-generation Whole Genome Sequencing System
- 1 Gb of sequence analyzed per run
- Principle: Sequencing By Synthesis
Research Applications
- Sequencing & Resequencing: a fast and economical sequencing tool for analyzing an entire genome, a single chromosome, or a large DNA fragment containing several genes.
- Immunoprecipitate sequencing: the ChIP-Seq method, in which all DNA fragments recovered after chromatin immunoprecipitation are sequenced at once (TF, methylation).
- Small RNA discovery and analysis: Solexa sequencing technology can analyze 4 million small RNAs in one run, so the abundance and sequence of all small RNAs in a sample are analyzed at once.
- Digital gene expression: tag-based whole genome gene expression profiling. Here too, 4 million cDNA tags per sample are analyzed in one run, which is used to measure the expression level of each gene and to discover specific genes.
“From Genetics to Pharmacogenomics”
Searching for genes involved in common complex trait diseases
Hyoung Doo Shin (신형두)
Department of Life Science, Sogang University; SNP Genetics, Inc.; National Research Laboratory