
Data Source for Association Study

1. Method

We extracted all association studies published in peer-reviewed journals from the SchizophreniaGene database. The extracted information included gene annotations, study information (e.g. ethnic groups), the statistical analysis methods used in the association studies and their results, the number of cases and controls, the number of families (number of affected and unaffected family members), and the genotypes of each polymorphism. Using our previously developed combined odds ratio (OR) method (Sun et al. 2008), we first evaluated, for each gene, the risk allele of each marker based on its ORs, confidence intervals (CIs), and P values across multiple studies. We then calculated ORs using the risk alleles identified in this evaluation. The largest OR among the markers surveyed in each study was selected to represent the effect size in that association study. These OR values were then combined using the R package "meta", and a P value was obtained by a Z-test. This P value serves as a rough proxy for the strength of positive association evidence. Because a smaller P value indicates stronger evidence, we assigned a score of 3 to a gene whose P value is < 0.001, 2 to a gene whose P value is in [0.001, 0.05), and 0 otherwise.
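The pooling and scoring steps described above can be sketched roughly as follows. This is a minimal illustration in Python, not the original pipeline, which used the R package "meta". It assumes a fixed-effect inverse-variance model, standard errors recovered from 95% CIs, and a two-sided Z-test; the source does not state any of these choices. The function names `se_from_ci`, `pool_odds_ratios`, and `p_value_score` are hypothetical.

```python
import math


def se_from_ci(ci_low, ci_high, z_crit=1.959964):
    """Standard error of log(OR), recovered from a 95% CI (assumption)."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)


def pool_odds_ratios(studies):
    """Pool representative ORs with inverse-variance weights.

    studies: list of (odds_ratio, ci_low, ci_high), one tuple per study.
    Returns (pooled_or, z, two_sided_p). A fixed-effect model is assumed here.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        se = se_from_ci(ci_low, ci_high)
        weight = 1.0 / se ** 2          # inverse-variance weight
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    pooled_log_or = weighted_sum / total_weight
    z = pooled_log_or * math.sqrt(total_weight)   # pooled log(OR) / its SE
    p = math.erfc(abs(z) / math.sqrt(2.0))        # two-sided normal P value
    return math.exp(pooled_log_or), z, p


def p_value_score(p):
    """Map a combined P value to the score bands given in the text."""
    if p < 0.001:
        return 3
    if p < 0.05:
        return 2
    return 0
```

For example, `p_value_score(pool_odds_ratios(studies)[2])` would give the P-value-based score for one gene, where `studies` holds the largest OR (with its CI) from each association study of that gene.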

The combined OR method requires at least two representative markers in each study and at least two association studies in order to combine their representative OR values (Sun et al. 2008). Some genes with at least two positive association studies might therefore have been excluded by this procedure. Because replication remains a great challenge in schizophrenia research, we assigned a score of 2 to genes with at least two positive results and a score of 1 to those with only one positive result, to reflect differing extents of association. We applied this combinatory strategy (i.e. the P value from the combined OR method and the scores based on the number of positive association studies) to all genes that had at least one association report. Currently, we have 281 genes with assigned scores ranging from 1 to 3.
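One possible reading of this combinatory strategy is sketched below in Python. The text does not say how the two scores are merged, so taking the larger of the two is an assumption made here for illustration. The names `positive_study_score` and `association_score` are hypothetical, and `combined_p=None` stands for a gene to which the combined OR method could not be applied.

```python
def positive_study_score(n_positive):
    """Score based on the number of positive association studies."""
    if n_positive >= 2:
        return 2
    if n_positive == 1:
        return 1
    return 0


def association_score(n_positive, combined_p=None):
    """Combine the P-value-based score with the positive-study score.

    combined_p is None when the combined OR method was not applicable.
    Taking the maximum of the two scores is an assumption, not a rule
    stated in the text.
    """
    p_score = 0
    if combined_p is not None:
        if combined_p < 0.001:
            p_score = 3
        elif combined_p < 0.05:
            p_score = 2
    return max(p_score, positive_study_score(n_positive))
```

Under this reading, a gene with a single positive study and no combined P value scores 1, while a gene whose combined P value is below 0.001 scores 3 regardless of its positive-study count.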

2. Dataset Description

The association study data source yielded a gene set containing 281 records in total, of which 278 are protein-coding genes, 2 are miscRNA genes, and 1 has an unknown gene type. The following figure shows the distribution of the association-specific scores described above.

Figure 1. Score distribution of the gene set defined by association studies
References
  • Allen, N.C., Bagade, S., McQueen, M.B., Ioannidis, J.P., Kavvoura, F.K., Khoury, M.J., Tanzi, R.E., and Bertram, L. (2008) Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database. Nat. Genet. 40: 827-834.
  • Sun, J., Kuo, P.H., Riley, B.P., Kendler, K.S., and Zhao, Z. (2008) Candidate genes for schizophrenia: a survey of association studies and gene ranking. Am. J. Med. Genet. B Neuropsychiatr. Genet. 147B(7): 1173-1181.

