Type 1 diabetes (T1DM) is a chronic autoimmune disease in which the β cells of the islets of Langerhans are selectively destroyed, resulting in insulin deficiency and hyperglycemia. The disease develops in genetically susceptible individuals, most likely as a result of an environmental trigger. T1DM has an uneven geographical distribution; disease prevalence is highest in populations of white European origin and lowest in those of East Asian descent. A marked gradient in disease risk also exists in Europe, with higher prevalence of T1DM in northern countries, particularly Finland, compared with areas around the Mediterranean. This pattern could be attributed to genetic differences between the populations or to the presence/absence of environmental triggers.
The sibling relative risk (λs) is the ratio of the risk of a disease developing in a sibling of a proband to the risk in the general population. This index of familial clustering is often used as a measure of genetic effect, although it could also reflect the impact of shared environmental exposures. The λs for T1DM is 15 (sibling risk 6%/population risk 0.4%), one of the highest values observed among common complex diseases, suggesting a substantial inherited component to T1DM. This is further supported by the high concordance rate observed among monozygotic (MZ) twins (>50%), while concordance in dizygotic (DZ) twins is similar to that seen for other siblings (6–10%). A long-term follow-up study of initially discordant MZ twins showed that 65% of the nonproband co-twins developed T1DM by the age of 60 years, while 78% developed persistent islet autoimmunity, suggesting that genetic susceptibility persists for life. Concordance does not reach unity, however, and there is significant divergence between twins in the time taken to develop disease. This implies that environmental factors are also important. Epidemiologic evidence of a marked increase in T1DM incidence in many areas of the world over the past 20–30 years [5,6] also supports a significant contribution from environmental risk determinants, as do studies of migrant populations, which suggest that individuals from a low-prevalence country can increase their risk of developing T1DM if they move to an area of higher prevalence. This increase in risk may be attenuated by the individual’s genetic background, however, such that they do not acquire as high a disease risk as the indigenous population of their adopted country [8,9].
Conversely, children from high-risk populations who are born and raised in areas with lower T1DM prevalence (such as those of Sardinian ancestry raised in Italy) retain their higher risk of disease compared with the host population, underlining the importance of genetic makeup to T1DM risk.
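The λs value quoted above follows directly from its definition as a ratio of two risks. As a worked illustration, writing K_s for the risk in siblings of a proband and K for the risk in the general population (this notation is an assumption made here for clarity and is not taken from the text):

```latex
\lambda_s = \frac{K_s}{K} = \frac{0.06}{0.004} = 15
```

Because λs is a ratio, the same sibling risk would give a much smaller value for a disease that is more common in the general population.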
Type 1 diabetes is a polygenic disease, in which a large number of susceptibility loci contribute to overall disease risk. The susceptibility genes have low penetrance and, as a result, not all individuals judged to be “genetically at-risk” will develop the disease. Furthermore, diabetes can occur in the absence of known high-risk markers. The lack of a simple relationship between genotype and phenotype makes it difficult to identify disease genes at a population level, and studies rely on the detection of statistical associations that are unlikely to have occurred by chance. To date, over 50 genetic loci have been shown to be reproducibly associated with T1DM [11, http://t1dbase.org]. In most of these regions, however, fine-mapping is required to define the specific gene(s) involved and identify the causal variants. This chapter reviews our current understanding of the genetic basis of the disease and discusses the implications for disease prediction and the identification of novel targets for intervention in the development and progression of islet autoimmunity.
The HLA complex
The genes of the human leukocyte antigen (HLA) complex were first identified as important determinants of T1DM risk in the 1970s. Subsequent family studies comparing disease concordance rates between HLA-identical siblings and monozygotic twins suggested that the HLA genes are the major genetic contributor to T1DM, accounting for about half of the familial aggregation of the disease. Recent studies concur with this estimate and no other locus with such a substantial influence on disease risk has been identified [2,11].
The HLA complex (also known as the human major histocompatibility complex, MHC) maps to a 3.6 Mb region on chromosome 6p21.31 and consists of more than 200 identified genes, over half of which are known to be expressed. The HLA class I genes are located at the telomeric end of the complex, with the class II genes at the centromeric end. The 700 kb sequence between these gene clusters is commonly referred to as the HLA “class III” region, although it contains no classical HLA genes (Figure 30.1).
The HLA complex plays a crucial role in the immune response, in particular through the genes encoding the classical class I (HLA-A, -B, and -C) and class II (HLA-DR, -DQ, and -DP) molecules. The HLA-A, -B, and -C genes each encode a peptide chain that combines with β2 microglobulin to form the HLA class I molecules, expressed on the surface of all nucleated cells (Figure 30.2(a)). These molecules bind to peptides derived from endogenous antigens and present them for recognition by CD8-positive cytotoxic T cells. The class II loci are composed of pairs of genes, an A gene and a B gene, which encode an α and β peptide chain, respectively. These dimerize to form the HLA class II molecules, which are expressed only on the surface of specialized antigen-presenting cells, including monocytes, macrophages, and dendritic cells (Figure 30.2(b)). The DR, DQ, and DP molecules bind to peptides derived from exogenous antigens and present them for recognition by CD4-positive helper T cells (Figure 30.3). The process of antigen presentation is the first step in the activation of a T cell-mediated immune response to the antigen, and the HLA molecules therefore play a crucial role in both protection from pathogens and the development of autoimmunity.
The HLA genes are highly polymorphic, some having more than 200 known alleles. This sequence diversity is driven by a strong selective pressure that ensures the recognition of a wide range of antigens to optimize the immune response to a large variety of current and emerging pathogens. As such, the majority of the sequence polymorphism occurs in the gene regions encoding the peptide binding groove. The resulting polymorphic amino acid residues influence the shape and chemical properties of the groove and thereby dictate the repertoire of peptides that can be presented by a given HLA molecule.
Early studies of the HLA gene complex identified disease associations with alleles at individual loci. These genes are in strong linkage disequilibrium (LD), however, which means that recombination between different HLA loci is rare. As a result, combinations of alleles at different genes are inherited together more frequently than expected by chance. When inherited on the same chromosome, this allelic combination is known as a haplotype. For example, the DRB1*0301 allele is most frequently coinherited with DQA1*0501 and DQB1*0201, forming the DRB1*0301-DQA1*0501-DQB1*0201 haplotype. Haplotypes can only be determined directly in family studies, where transmission of allele combinations from parent to child can be recorded. Strong LD relationships between alleles, however, make it possible to assign haplotypes based on the likelihood of allele co-occurrence, using computer algorithms. Unfortunately such relationships also make it difficult to determine which loci make a genuine contribution to disease risk and which are associated with disease secondary to their coinheritance with other disease markers.
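The idea that alleles at two loci are "inherited together more frequently than expected by chance" can be made quantitative. The sketch below computes three commonly used LD statistics (D, D′, and r²) from allele and haplotype frequencies. The function name and the frequencies in the usage example are hypothetical and chosen purely for illustration; they are not measured values for any HLA haplotype.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for allele A at one locus and B at another.

    p_ab : frequency of haplotypes carrying both A and B
    p_a  : frequency of allele A
    p_b  : frequency of allele B
    """
    for p in (p_ab, p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")

    # D: excess of the observed haplotype frequency over the frequency
    # expected if the two alleles were inherited independently.
    d = p_ab - p_a * p_b

    # D': D scaled by the largest magnitude it could take given the
    # allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two alleles.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = (d * d) / denom if denom > 0 else 0.0

    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only: allele A at 12%,
    # allele B at 13%, and 11% of haplotypes carrying both.
    d, d_prime, r2 = linkage_disequilibrium(p_ab=0.11, p_a=0.12, p_b=0.13)
    print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

When the two alleles are independent, D is zero; values of D′ and r² close to 1 correspond to the strong LD described above, where an allele at one locus largely predicts the allele at the other.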
The major HLA-encoded susceptibility determinants for T1DM are the class II DR and DQ genes, although the DP genes and the class I genes also influence disease risk. The effect of a given allele/genotype on the risk of developing disease is generally indicated by the odds ratio (OR), which compares the odds of disease in individuals positive for the genetic variant with the odds in individuals lacking the variant. An OR of 1 indicates that the allele/genotype has no influence on disease risk; an OR greater than 1 suggests that the marker confers susceptibility to the disease, while an OR less than 1 suggests that it confers protection.
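As an illustration of how an OR might be calculated in practice, the sketch below takes the four cells of a 2×2 table (carriers and non-carriers of a variant among cases and controls) and returns the OR with an approximate 95% confidence interval. The interval uses the standard log-OR (Woolf) approximation, which is an assumption made here rather than a method described in the text, and the counts in the usage example are invented for illustration.

```python
import math


def odds_ratio(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio and approximate confidence interval from a 2x2 table.

    case_pos / case_neg : cases carrying / lacking the variant
    ctrl_pos / ctrl_neg : controls carrying / lacking the variant
    z                   : normal quantile (1.96 gives roughly a 95% interval)
    """
    cells = (case_pos, case_neg, ctrl_pos, ctrl_neg)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive for this estimate")

    # Cross-product ratio: odds among carriers divided by odds among
    # non-carriers.
    or_value = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)

    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


if __name__ == "__main__":
    # Invented counts: 60 of 100 cases and 20 of 100 controls carry the variant.
    or_value, lower, upper = odds_ratio(60, 40, 20, 80)
    print(f"OR = {or_value:.1f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these invented counts the OR is 6.0, which would be read as a susceptibility marker. A confidence interval that includes 1 would mean the data are also compatible with the variant having no effect on risk.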