Genetics of Osteoarthritis Literature Review

This review of scientific literature sets out to present and understand the important issues in research on the inherited genetic component of osteoarthritis (OA). It is limited to publications from 2010 onwards, with an emphasis on more recent work, and is thorough but not exhaustive. 

HOW IS OA CLINICALLY DEFINED FOR GENETIC STUDIES? 

OA is a heterogeneous combination of cartilage and/or bone features and/or clinical complaints. Various definitions of OA have been used in genetic studies (KL score, self-report, hospital diagnosis, surgical intervention), but none has emerged as a clear standard. Different joints (hip vs knee) may behave differently. 

WHAT DISEASES OR CONDITIONS ARE ASSOCIATED WITH OA AND NEED TO BE CONSIDERED IN ANY ANALYSIS? 

OA has clinical risk factors (female sex, joint loading activity, previous joint trauma, and smoking status). It may occur with other diseases and conditions that have a genetic component (type 2 diabetes, obesity, cardiovascular disease, depression, and osteoporosis) where it is difficult to sort out the causal relationships. For example, is there a genetic cause of both obesity and OA, or does obesity cause OA by putting more stress on the joints, or does OA contribute to obesity by making it difficult to exercise? 

WHAT ARE THE PARTICULAR CHALLENGES IN STUDYING A POLYGENIC DISEASE SUCH AS OA? 

The fact that OA is a polygenic disease has implications for how genetic studies must be conducted. Genome-Wide Association Studies (GWAS) with large sample sizes are necessary to detect the many lower risk variants. 

WHAT ARE THE SPECIAL STATISTICAL CONSIDERATIONS FOR GWAS? 

Adjustments must be made for relatedness, genetic drift or population stratification, and ethnicity.  Multiple testing inevitably results in a certain fraction of false positives, and although corrections can  be made, it is essential to replicate the results in another independent cohort. 

WHAT IS THE HERITABILITY OF OA? 

Narrow-sense heritability, defined as the proportion of trait variance that can be attributed to additive genetic factors, is 52% for hip, 15% for knee, and 24% for hip and/or knee OA. Earlier twin studies put the overall heritability of OA at about 50%.


WHAT ARE THE EXISTING CLINICAL-GENETIC MODELS? 

One of the best performing clinical-genetic models predicted radiographic progression in knee OA  patients. The clinical model alone had an AUC of 0.66, where 1 would be perfect classification and 0.5  random classification. A genetic score utilizing 8 of 23 candidate SNPs obtained an AUC of 0.78. The combined clinical-genetic model had an AUC of 0.82. There is plenty of opportunity to improve the  performance with more genes. A polygenic risk score that incorporated not only clinical variables but  also time would be an innovation. There are various choices for an endpoint, such as diagnosis of OA, progression to severe disease, or joint replacement surgery. 

WHICH GENES OR GENETIC VARIANTS HAVE BEEN IMPLICATED IN OA? 

The UK Biobank data, fully released in 2017, together with the UK Osteoarthritis Genetics Consortium (arcOGEN), have made possible large-scale GWA studies of polygenic diseases. We are fortunate to have a well-documented genetic study of OA listing 96 associated SNPs. Results included 64 biologic processes associated with OA, of which 46 were related to bone, cartilage, or chondrocyte morphology. Four genes (TGFB1, FGF18, CTSK, and IL11) are in clinical trials for OA or cartilage regeneration, and six others are potential therapeutic targets. 

HOW CAN WE CONFIRM OA SUSCEPTIBILITY LOCI? 

Confirmation can be obtained by replication in independent cohorts, comparing gene expression and  proteins in tissue from patients undergoing surgery, finding relevant rare-human-disease and animal model evidence, and using other sophisticated bioinformatic tools. 

BEYOND DNA, WHAT GENETIC PROCESSES OFFER OPPORTUNITIES FOR DIAGNOSTICS, PROGNOSTICS, AND  THERAPEUTICS? 

The DNA that we inherit is only the first chapter in a complicated process involving many other factors that eventually results in OA.

Venn Diagram Featuring the intersection of Genetics, Epigenetics, and Environment with phenotype at the intersection of all 3

WHAT CAN WE CONCLUDE ABOUT THE CLINICAL APPLICATIONS OF DNA DATA FOR OA?

The immediate clinical applications of genetics in OA will probably be to use biomarkers from blood,  urine, or joint tissue to detect and monitor the disease. These are not unrelated to the genotype of the  patient, in that DNA variants influence the downstream processes of RNA expression, protein  synthesis, and epigenetic regulation, and ultimately the individual's own disease phenotype. Biomarker  tests would be repeated at intervals, and trends in the results would be highly informative. 

A DNA test would be a one-time result, as an individual's genotype is fixed at birth. Although OA has a high degree of heritability, previous models that combined relatively few genetic loci with clinical information have had limited success, and are not to my knowledge being used in practice. Now, a polygenic risk score could utilize the large number of variants that have recently been discovered in GWA studies, and benefit from our improved understanding of the etiology of OA. 

The utility of a DNA test would be in identifying younger patients who have a familial predisposition  or susceptibility to OA. In slightly older patients with symptomatic OA, a polygenic risk score would be used to predict progression. There is also an age limitation on the utility of DNA testing; several  studies have shown that the hereditary component of OA becomes attenuated after age 65, when the  normal effects of aging on the joints take over. 

A polygenic risk score that was combined not only with clinical factors, but also with time, would be  an innovation. There are various choices for an endpoint, such as diagnosis of OA, progression to  severe disease, or joint replacement surgery.

Opinion Leaders and Institutions

Literature Review of DNA Studies in Osteoarthritis 

 

The goal of this review was to understand the important issues in genetic research for this particular  disease. Each of these questions is addressed in a separate section below. 

  1. How is OA clinically defined and how have those definitions been adapted to classify  cases and controls in genetic studies? 
  2. What diseases or conditions are associated with OA and need to be considered in any  analysis? 
  3. What are the particular challenges in studying a polygenic disease such as OA? 
  4. What are the special statistical considerations for a genome-wide association study  (GWAS)? 
  5. What is the heritability of OA? 
  6. What are the existing clinical-genetic models? 
  7. Which genes or genetic variants have been implicated in OA? 
  8. How can we confirm OA susceptibility loci? 
  9. Beyond DNA, what genetic processes offer opportunities for diagnostics, prognostics,  and therapeutics? 
  10. What can we conclude about the clinical applications of DNA data for OA? 

Finally, there are two supplementary tables containing a list of OA databases that may be available for  future studies, and a list of some of the opinion leaders in the field. 

1) DEFINITION OF OA 

Finding a reliable indicator of OA has proven difficult because it is such a heterogeneous combination of cartilage and/or bone features and/or clinical complaints. OA afflicts the movable joints in various ways: degeneration of the cartilage that provides a smooth lubricated surface for articulation, remodeling of the underlying bone, and buildup of synovial fluid with inflammation of the membrane lining the joint cavity. These features may or may not cause pain and functional impairment in the patient. 

Radiographic OA is commonly used to classify cases in epidemiological studies. The most widely used metric is the Kellgren and Lawrence (KL) system, but that is based on osteophytes, or bone spurs, not  on loss of cartilage and narrowing of joint space, which is more important in hip than in knee OA  (Valdes 2018). 

Self-reporting has been found to be reliable when validated against primary care records, correctly  identifying 93% of individuals who do not have OA, while diagnoses from hospitals are more accurate  (Zengini 2018). Zengini et al. conclude that the self-reported dataset yields a larger sample size with  more power to detect genetic associations, which makes up for uncertainties in the definition of cases.  On the other hand, the hospital dataset exhibits higher heritability and captures slightly different demographics. Both types of OA definitions have their uses. 

A study of UK primary care electronic health records found evidence of under-reporting in diagnosed  OA (Yu 2018). Even if symptoms of peripheral joint pain were included as a diagnosis, only 75% of  patients undergoing hip or knee replacements had recorded diagnoses in the 10 years leading up to their surgery. 

2) ASSOCIATION WITH OTHER DISEASES OR CONDITIONS 

Figure 1. Schematic diagram of the different types of risk factors for OA shows that genetics is only one part of the puzzle (Warner and Valdes 2016 Fig 1). 

OA has long been recognized to occur in conjunction with other diseases and conditions (Figure 1). In  any genetic study, it is imperative to adjust for these factors so that differences in the make-up of the  case and control groups will not confuse the issue. However, the relationship between these factors and  the etiology of OA is still largely unknown. For example, is there a genetic cause of both obesity and  OA, or does obesity cause OA by putting more stress on the joints, or does OA contribute to obesity by  making it difficult to exercise? One goal of GWA studies is to be able to separate these elements. 

CLINICAL RISK FACTORS: 

female sex 

joint loading activity 

previous joint trauma 

smoking status

DISEASES: 

Type 2 diabetes 

Obesity 

Cardiovascular disease 

Depression 

Osteoporosis 

AGE-RELATED MECHANISMS (Magnusson 2018; Trachana 2019): 

metabolic deregulation 

inflammation (inflammaging) 

aging cartilage (chondrocyte senescence) 

mitochondrial imbalance and redox imbalance 

environmental and lifestyle changes 

Genetic correlations between different OA sites and other diseases or conditions are diagrammed in Figure 2. Compared to knee OA, hip OA is more weakly associated with all of the body measures, but not significantly different with respect to lumbar spine bone mineral density. This supports the observation that hip OA is more heritable and less influenced by other physical factors.

Figure 2. Genetic correlations between OA and other traits and diseases (Tachmazidou 2019 Fig  1) 

3) POLYGENIC DISEASE 

OA is a polygenic disease (Warner and Valdes 2016; Zengini 2016; Valdes 2018) and this has  implications for how genetic studies must be conducted. 

Prior to 2000, family-based study designs were the primary means of investigating the genetic basis for disease. They were especially successful in finding relatively rare (<2%), highly penetrant (high-risk)  mutations in causal genes and understanding dominant or recessive modes of inheritance. These  techniques compare affected and unaffected family members. Some do not require obtaining DNA  from biologic specimens (twin studies, sibling correlations, familial aggregation, segregation analysis)  and others do (linkage analysis, association studies). 

With the advent of high-throughput microarrays and next generation sequencing (NGS), large-scale genome-wide association studies (GWAS) have allowed investigators to identify lower risk but more common variants. A polygenic disease such as OA results from the combined action and interaction of variants in multiple genes. Each variant has a relatively minor effect, and only a subset may be present in an affected individual. 

4) GENOME-WIDE ASSOCIATION STUDIES (GWAS) 

Whole genome scans are peculiar in that they interrogate millions of loci whose function is actually unknown. The pair of bases (A, C, T, or G) found on the two chromosome copies at a specific position in the DNA sequence varies between individuals. For example, different people may have AA, AT, or TT genotypes. This variation is normal in the population. A single nucleotide polymorphism (SNP) refers to a variant that is fairly common (>1%) whereas a rare variant is called a mutation. It can be confusing, because these terms – locus, SNP, nucleotide, base, base pair, variant, mutation, polymorphism, and allele – are sometimes used interchangeably in reference to a particular location in the genome. Another source of confusion is that people commonly talk about “missing a gene” when in fact they mean that there is a variant (or an insertion/deletion) at one location within the gene that changes or disables its function. 

The GWA study tests for correlation between one of the alleles (in our example, either A or T) and disease status. Association with disease is reported as an odds ratio (OR) for a case-control study. The OR for a polygenic risk score (PRS) tells how much the odds of being a case rather than a control are multiplied for each unit increase in the score; for a rare condition, it is approximately equivalent to a relative risk (RR). Another kind of study may look at the elapsed time from a starting point to an event, such as the time from OA diagnosis to total joint replacement. For this type of analysis, called survival analysis, the result is reported as a hazard ratio (HR). Unlike the OR or RR, the HR tells how much a unit increase in the polygenic risk score multiplies the instantaneous risk of experiencing the event. These measures (OR, RR, and HR) are relative terms; however, if we know the baseline risk, we can estimate absolute risk. 
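The relationship between an odds ratio, a relative risk, and an absolute risk can be illustrated with a short Python sketch. The allele counts and baseline risks below are hypothetical values chosen for illustration only; they are not taken from any study cited in this review.

```python
def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Odds ratio from a 2x2 case-control table of allele counts."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


def absolute_risk(baseline_risk, odds_ratio_value):
    """Absolute risk obtained by applying an odds ratio to a known baseline risk."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio_value
    return new_odds / (1.0 + new_odds)


# Hypothetical counts: the T allele is seen 300 times in cases and 200 times
# in controls, out of 1000 alleles in each group.
or_t = odds_ratio(300, 700, 200, 800)

# Hypothetical baseline risks: a rare condition (1%) and a common one (20%).
risk_rare = absolute_risk(0.01, or_t)
risk_common = absolute_risk(0.20, or_t)

# Relative risk implied in each setting.
rr_rare = risk_rare / 0.01
rr_common = risk_common / 0.20
```

With a rare outcome the implied relative risk stays close to the odds ratio, whereas with a common outcome the two drift apart, which is why the OR is only an approximation to the RR when the condition is rare.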

A SNP that correlates with disease is probably not causative itself, but instead located in the vicinity of  the variant that is responsible. In some cases, it is possible to determine that a SNP alters the DNA code so as to change the function of the gene. More commonly, a SNP is located in a non-coding intronic  region within the gene, upstream or downstream of the gene, or between two genes either of which  could be involved in the disease, and the mechanism of action is unknown. The credible set is a list of  SNPs most likely to encompass the location of the true causal variant, which may not have been probed by the SNP array. 

In order for a SNP to be informative, the allele must be fairly pervasive in the population unless the  effect is unusually strong. Usually the minor allele frequency is >5% (Zengini 2018 Supp Fig 8). This  means that most of the SNPs we study probably originated hundreds of thousands of years ago in order  to become common, and are not one of the spontaneous changes that are constantly occurring in the  human genome. 

Time-to-event data are usually not available in GWA studies, an exception being the UK Biobank  (Wenjian 2020). It is also computationally difficult to apply standard survival analysis techniques such  as Cox proportional-hazards on the scale of a GWAS, and Wenjian (2020) devised some more efficient methods.

Important concepts of GWAS design: 

  1. ALLELE FREQUENCY VS RISK RATIO VS SAMPLE SIZE 

Increasing any one of these three factors will increase the power to detect a correlation with  disease. Typically the risk associated with an individual variant in a polygenic disease is on the  order of 1.1 to 1.3 times the risk in the normal population. Sample sizes of at least 7000 cases  and 7000 controls are required to detect common variants (Valdes 2018). 
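The trade-off between allele frequency, effect size, and sample size can be sketched with a normal-approximation power calculation for a simple allelic test. This is an illustrative approximation, not the method used in the cited studies; the function names, the default significance threshold, and the example inputs in the usage note are assumptions.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by the control frequency and an allelic OR."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def approx_power(n_cases, n_controls, control_freq, odds_ratio, alpha=1e-7):
    """Approximate power of a two-sided test comparing allele frequencies.

    Uses a normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    p0 = control_freq
    p1 = case_allele_freq(control_freq, odds_ratio)
    # Each individual contributes two alleles.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)
```

For example, `approx_power(7000, 7000, 0.30, 1.2)` can be compared with the same call using a smaller sample, a lower allele frequency, or a smaller odds ratio to see how each factor reduces power.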

  2. RATIO OF CASES TO CONTROLS 

Most GWAS have a case-control design and include as many cases as possible. The ratio of cases to controls is chosen to achieve maximum power with the minimum sample size. 

  3. HARDY-WEINBERG EQUILIBRIUM 

Some investigators filter SNPs by testing whether the relative frequencies of each SNP's genotypes are in the expected proportions, meaning that there has been no genetic selection (Meng 2019). 
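A Hardy-Weinberg check of this kind can be sketched as a one-degree-of-freedom chi-square test on the three genotype counts of a biallelic SNP. This is a generic textbook version, not necessarily the filter applied in the cited study, and the genotype labels and counts in the usage note are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_at, n_tt):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for one biallelic SNP.

    Returns the chi-square statistic and its p-value.
    """
    n = n_aa + n_at + n_tt
    p = (2 * n_aa + n_at) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele T
    observed = (n_aa, n_at, n_tt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Calling `hwe_chi_square(25, 50, 25)` describes a SNP whose genotypes sit exactly at the expected proportions, while `hwe_chi_square(50, 0, 50)` describes one with no heterozygotes at all, which would be filtered out.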

  4. RELATEDNESS 

Statistical analysis depends on the assumption that the genotype of each individual is  independent. Therefore it is important to adjust for family relations, who are more likely to  share the same genotype, or more distant relatives who may not even be aware of the  connection. UK Biobank estimates the degree of relatedness or kinship coefficient between  each pair of individuals using the SNP data (Bycroft 2018). 

  5. GENETIC DRIFT OR POPULATION STRATIFICATION 

A population such as the UK may develop allele frequencies in certain geographic regions or  social groups that are different from the frequencies in the population as a whole. Genetic drift  occurs due to chance as individuals die or do not reproduce and an allele spontaneously  disappears over the course of generations. This can be adjusted for by a statistical method called Principal Components, which clusters individuals into genetically similar groups. 

  6. DISTINCT POPULATIONS 

Ethnicity and race leading to distinct populations are an important factor to consider. Allele  frequency may differ between ethnic or racial groups due to both genetic drift and natural  selection. Isolated populations may also develop sporadic mutations that are unique to that  group. Variants associated with OA have been found in the Icelandic, Japanese, Han Chinese,  and African-American populations that are rare or absent in Caucasians (Chapman 2012; Liu, Y 2017). 

  7. AGE 

Polygenic late-onset diseases such as OA show a decline in the predictive power of genetics with age, as the effects of genetic causes merge with those of age-related mechanisms that also deteriorate cartilage and cause inflammation. Individuals with higher polygenic risk scores tend to develop disease at an earlier age, thus decreasing the component of heritability in the remaining pool (Oliynyk 2019). This can be compensated for to some extent by stratifying the sample by age. In general, genetic effects will most easily be uncovered in cohorts under the age of 55 to 60 years. 

  8. REPLICATION OF RESULTS 

It is essential to replicate the results of a GWAS in another, independent cohort. There are two obvious reasons why a SNP may show a spurious association with disease, i.e. a false positive. One is due to multiple testing, whereby the more SNPs that are tested, the more likely it is to find a significant association by chance. This is corrected by the investigator setting a high bar for significance, conventionally a p-value less than 1 × 10⁻⁷ (Valdes 2018 Table 24.2). Depending on the number of signals, the genome-wide significance threshold may be even lower, such as the 3 × 10⁻⁸ threshold used by Tachmazidou (2019). 
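The logic behind such a stringent threshold can be sketched with a Bonferroni-style correction, in which the desired overall false-positive rate is divided by the number of independent tests. The figure of 500,000 independent tests is an assumption chosen for illustration because it reproduces a threshold of the order quoted above; the cited studies may have derived their thresholds differently.

```python
def bonferroni_threshold(family_wise_alpha, n_independent_tests):
    """Per-test p-value threshold that keeps the overall error rate at family_wise_alpha."""
    return family_wise_alpha / n_independent_tests


def expected_false_positives(per_test_alpha, n_independent_tests):
    """Number of associations expected by chance alone when no SNP is truly associated."""
    return per_test_alpha * n_independent_tests


# Assumed number of independent tests (hypothetical, for illustration).
N_TESTS = 500_000

threshold = bonferroni_threshold(0.05, N_TESTS)
naive_hits = expected_false_positives(0.05, N_TESTS)
corrected_hits = expected_false_positives(threshold, N_TESTS)
```

Under these assumptions, testing every SNP at the usual 0.05 level would be expected to produce thousands of chance findings, whereas the corrected threshold keeps the expected number well below one.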

The second pitfall is confounding by some other variable that is related to disease and also  associated with a SNP. For example, high BMI is definitely a risk factor for OA, and average  BMI varies between ethnicities. Suppose there are two ethnicities, A and B, and A tends to have higher BMI. If the cases and controls are sampled without regard to ethnicity, it is likely that the cases will include more individuals from A, who naturally tend to have higher BMI. The result  is that any SNP that is more common in Ethnicity A than Ethnicity B will falsely appear to be  associated with OA. Here the association between the SNP and OA is confounded by BMI. The  remedy is either to sample from just one ethnicity, or even better, to adjust for BMI in the  statistical model. 
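The two-ethnicity example can be made concrete with invented counts. In the sketch below the SNP has no effect within either group, yet pooling the groups produces an apparent association; the Mantel-Haenszel estimate, one standard stratified estimator, removes it. All counts are hypothetical, and stratifying by group stands in here for the covariate adjustment described above.

```python
def odds_ratio(case_carriers, case_non, control_carriers, control_non):
    """Odds ratio from a 2x2 table of SNP carriers and non-carriers."""
    return (case_carriers * control_non) / (case_non * control_carriers)


# Hypothetical 2x2 tables: (case carriers, case non-carriers,
#                           control carriers, control non-carriers).
# Group A: SNP common (60%), more cases. Group B: SNP rarer (20%), fewer cases.
strata = {
    "A": (180, 120, 60, 40),
    "B": (20, 80, 60, 240),
}

# Odds ratio within each group.
within = {name: odds_ratio(*table) for name, table in strata.items()}

# Odds ratio when the groups are pooled and ancestry is ignored.
pooled_table = [sum(table[i] for table in strata.values()) for i in range(4)]
pooled = odds_ratio(*pooled_table)

# Mantel-Haenszel odds ratio, which combines the within-group tables.
numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
mantel_haenszel = numerator / denominator
```

Here the within-group odds ratios and the Mantel-Haenszel estimate all equal 1, while the pooled odds ratio is well above 1, which is the spurious signal that adjustment is meant to remove.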

5) HERITABILITY 

Broadly speaking, heritability describes how much your DNA influences a certain trait or predisposes  you to develop a particular disease. Statistically, it is the proportion of the variation between  individuals that can be attributed to genetic differences. 

For a polygenic disease such as OA, we look at narrow-sense heritability (h²). This is defined as the proportion of trait variance that is due to additive genetic factors, i.e. the combination of many variants each conferring a small risk. The total narrow-sense heritability for hip and knee OA in the UK Biobank and arcOGEN cohorts was calculated as a function of allele frequency weighted by risk compared to the population prevalence (Tachmazidou 2019 Supp Table 17). Heritability is higher for hip OA than knee OA. This may be because the variants associated with hip OA are actually more harmful, or because the knees are more likely to be impacted by physical causes, such as injuries and obesity. 

51.9% for hip 

14.7% for knee 

24.2% for hip and/or knee 

22.5% for any site 

These estimates are consistent with results from earlier UK female twin studies that show higher  heritability of hip OA (Spector and MacGregor 2004). Overall, these studies estimated around 50%  heritability of OA. 

60% for hip 

39% for knee 

70% for spine 

6) CLINICAL-GENETIC MODELS 

A number of models have been devised using clinical features to predict OA incidence or progression  (Zhang 2011; Halilaj 2018; Joseph 2018; Magnusson 2019; Dunn 2020).

The first clinical-genetic model for OA was fit on 2158 Japanese case-control subjects using only three  susceptibility genes: ASPN, GDF5, and DVWA (Takahashi 2010). The number of risk alleles  possessed by a subject was used to predict knee OA. The predictive power of the genetic model by  itself was poor (AUC† 0.554), but improved by adjustment for gender, age, and BMI (AUC 0.742). 

† An Area Under the Curve (AUC) of 0.5 signifies random classification; an AUC of 1 is perfect. An AUC of 0.8 to 0.9 is generally considered good to excellent. The FDA has no formal requirement for the AUC of a predictive test; this AUC is distinct from the AUC reported in a bioequivalence claim from a pharmacokinetic study. 

A prognostic model for incident radiographic knee OA was developed in a prospective cohort of 2628  individuals < 65 years from the Rotterdam Study (Kerkhof 2014). The outcome was progression from a KL score of < 2 at baseline to KL ≥ 2 with a mean follow-up time of 9.4 years. A genetic score was  calculated as the unweighted sum of the risk alleles. The genetic score (AUC 0.65) was comparable to  the clinical score, which combined gender, age, and BMI (AUC 0.66). Incorporating the baseline KL  score, which captures mild degenerative features, dramatically improved the AUC to 0.86. Importantly, the genetic score was much less predictive in older patients ≥ 65 years (AUC 0.55). 
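An unweighted genetic score and its AUC can be sketched in a few lines of Python. The genotypes below are hypothetical risk-allele counts (0, 1, or 2 per SNP) for a handful of invented subjects; they are not data from the Rotterdam Study, and the pairwise definition of AUC used here is the generic rank-based one rather than the specific procedure of the cited paper.

```python
def genetic_score(genotypes):
    """Unweighted genetic score: total number of risk alleles across SNPs."""
    return sum(genotypes)


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as half, which makes this equal to the area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk-allele counts at three SNPs for three cases and three controls.
cases = [[2, 1, 1], [1, 1, 2], [0, 1, 1]]
controls = [[0, 0, 1], [1, 1, 0], [1, 0, 1]]

case_scores = [genetic_score(g) for g in cases]
control_scores = [genetic_score(g) for g in controls]
model_auc = auc(case_scores, control_scores)
```

A weighted score would replace the plain sum with a sum of risk-allele counts multiplied by per-SNP effect estimates, which is the usual form of a polygenic risk score.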

An exclusively genetic prognostic tool to predict radiographic progression in knee OA patients was developed in a retrospective cohort of 595 multisite Spanish patients ≥ 40 years with KL grades of 2 or 3 at time of diagnosis (Blanco 2015). The outcome was progression to a KL score of 4, or total knee replacement within 8 years of diagnosis, with a minimum follow-up time of 2 years. Clinical variables included gender, BMI, age, and OA in the contralateral knee or other joints. The clinical model alone had an AUC of 0.66. A genetic score utilizing 8 out of 23 candidate SNPs obtained an AUC of 0.78. Combining the genetic and clinical features increased the AUC to 0.82. The performance of this genetic model was considerably better than previous ones. This may be attributed to the fact that all the subjects were affected OA cases and the outcome was progression rather than incidence. 

In considering the required sample size to power a study, here are some estimates of the cumulative incidence for end-stage knee OA and knee joint replacement over a four-year period in a subpopulation of the OA Initiative (Table 1). 

Selected Incident rates for patients age 45-65 years

Caution: fitting a polygenic risk score in the same cohort where the SNPs were first identified  introduces a bias. It is likely that the estimates of the effects, i.e. coefficients, for individual SNPs will  by chance have been somewhat higher in the discovery cohort than in the population as a whole. 
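This bias, often called the winner's curse, can be illustrated with a small simulation. The sketch below assumes a single SNP with a fixed true effect, normally distributed estimation error, and a significance filter on the z-score; all parameter values are hypothetical and chosen only to make the effect visible.

```python
import random


def simulate_winners_curse(true_effect=0.10, std_error=0.05, z_threshold=3.0,
                           n_studies=20000, seed=1):
    """Simulate repeated discovery studies of one SNP with a fixed true effect.

    Returns the mean estimate over all studies, the mean estimate among the
    studies that passed the significance filter, and how many passed.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # Keep only the studies in which the SNP would have been "discovered".
    selected = [b for b in estimates if b / std_error > z_threshold]
    mean_all = sum(estimates) / len(estimates)
    mean_selected = sum(selected) / len(selected)
    return mean_all, mean_selected, len(selected)


mean_all, mean_selected, n_selected = simulate_winners_curse()
```

Averaged over all simulated studies the estimate is unbiased, but among the studies where the SNP reached significance the average estimate exceeds the true effect, which is why coefficients are best re-estimated in an independent cohort.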

7) GENES OR GENETIC VARIANTS 

UK Biobank first released genotype data for 150K participants in May 2015, followed by the full 500K cohort in July 2017 (Bycroft 2018). Thus it has only recently been possible to conduct genetic studies using this resource.

Eleftheria Zeggini PhD is the principal investigator on the largest genome wide association studies  (GWAS) of OA to date. The first study interrogated 16.5M loci in the UK Biobank data and replicated  results in the Icelandic deCODE cohort (Zengini 2018). OA cases were defined based on self-reported  status, hospital diagnosis, and joint-specificity of disease (knee and/or hip). One of the goals of this  study was to compare self-reported with hospital diagnosis, finding that there was strong correlation  between the two, and the increase in power due to larger sample size made up for the lower sensitivity  of self-reporting. A total of nine novel variants were identified (Figure 3), adding to the 21 previously  identified (Figure 4). The combined set of established OA loci accounted for 26.3% of trait variance.


Figure 3 shows the 9 new genes and previous 21 implicated in OA from this study, which was  conducted in UK Biobank and replicated with an independent Icelandic deCODE cohort.  Typically for a polygenic disease, most of the odds ratios (OR) are less than 1.5, and the risk  alleles are fairly common. The sibling relative risk ratio is the risk for the sibling of an affected  individual compared to the prevalence in the general population, and is an indicator of  heritability (Zengini 2018 Supp Fig 8). 

Figure 4 shows the 21 previously identified genes included in Zeggini 2012. This figure adds  information about the year of discovery and the affected joint. The strongest effects, i.e. largest  odds ratios, are associated with hip OA, which has greater heritability than knee or other joints  (Uhalte 2017 Fig 2). 

The second study extended the scope of the initial study with the goal of identifying new therapeutic  targets (Tachmazidou 2019). GWAS was performed separately in the UK Biobank and arcOGEN  cohorts, then a meta-analysis was performed across the two cohorts. As before, UK Biobank cases were defined based on self-reported or hospital diagnosis of OA at any site, and hospital diagnosis of knee  and/or hip OA. The arcOGEN cases were defined for hip and/or knee from total joint replacement  records or radiographic evidence of disease (controls were obtained from the United Kingdom  Household Longitudinal Study). A total of 64 signals were discovered, 52 of which were novel. This increased the number of established loci from 34 to 96, benefiting from the larger sample size and  number of SNPs. 

IMPORTANT FINDINGS FROM TACHMAZIDOU (2019): 

  1. Most comprehensive listing to date of genetic associations with hip and knee OA 
  2. Identified 64 biological processes associated with OA, of which 46 were related to bone,  cartilage, or chondrocyte morphology; enrichment for genes affecting bone development 
  3. Identified putative effector genes, and 8 out of the 10 were in concordance with animal studies  (Supp Table 21) 
  4. Found significant correlations between OA and obesity, cognition, smoking, bone mineral  density, and reproductive traits 
  5. Estimated total narrow sense heritability (14.7% for knee, 51.9% for hip, 24.2% for hip and/or  knee, and 22.5% for OA at any site) 
  6. Proposed new targets for OA drug discovery and viable options among existing therapeutics. Four genes (TGFB1, FGF18, CTSK, and IL11) are in clinical trials for OA or cartilage regeneration, and six others are potential therapeutic targets (Table 2). Caveat: CTSK and DIABLO have not been shown to be associated with predisposition for OA. 

A tabulation of the types of OA associated with the 96 variants identified to date suggests that it may  be easier to detect an association with hip OA (Table 2). 

Number of variants associated with different types of OA

Casalone (2018) used UK Biobank to replicate findings of a GWAS case-control study of total joint  replacement in the arcOGEN cohort that showed a significant association with an intronic variant in  GLIS3, which is expressed in cartilage. This analysis also reported a link between OA and diabetes that is not exerted through risk of obesity. 

Styrkarsdottir (2018) identified 22 loci associated with hip and knee OA, of which 16 were novel, from a meta-analysis of the Iceland deCODE and UK Biobank cohorts. They observed an interesting  association between OA risk and height: many of the variants were associated either positively or  negatively with height, suggesting a shared pathway affecting growth and differentiation of bone and  cartilage. 

Meng (2019) used the UK Biobank data to study genetic associations with pain, and detected two loci associated with knee pain in GDF5 and near COL27A1. The GDF5 locus has previously been implicated in OA. Meng (2020) also used the UK Biobank data to find genetic variants associated with neck or shoulder pain. 

Wenjian (2020) demonstrated some alternative methods for time-to-event data analysis on 12 common diseases tracked in UK Biobank, with age-of-onset as the event. He identified 38 loci that would have been missed with a logistic regression framework using a binary phenotype (OA yes or no), including 5 associated with osteoarthrosis. One of those was in MAPT and was also identified by Tachmazidou (2019), and the other four were in RFLNA, LRRC37A4P/MAPK8IP1P2, RP1L1/MIR4286, and KANSL1/LRRC37A. Wenjian's results imply that there may be some genes that specifically affect age-of-onset and possibly disease progression, i.e. time-to-event. 

8) CONFIRMATION 

Confirmation for the 96 variants associated with OA (Tachmazidou 2019) has been obtained by various types of evidence (Table 3). The latest set of 52 novel variants, together with 12 that were previously  reported, were tested in the independent arcOGEN cohort, although not all of them achieved  significance there (Tachmazidou 2019 Supp Table 3). Each variant was linked to functions of nearby  genes, which were determined by differences in gene expression and proteins in tissue from patients  undergoing surgery, relevant rare-human-disease and animal-model evidence, and other sophisticated  bioinformatic tools (Tachmazidou 2019 Supp Table 5). Most of the established variants have been  confirmed by replication in independent cohorts or across multiple ethnic populations. 

Summary of confirmatory evidence for variants associated with OA

Another approach to verifying genetic variants found by GWAS is to look for genes that cause related but more extreme phenotypes. The rationale is that these genes would also be involved in milder conditions such as OA. Tachmazidou (2019 Supp Table 20) lists nine such genes, six involved in monogenic bone diseases (COL27A1, FGFR3, GDF5, RNF135, TBX4, and TKT) and three in early-onset OA (COL11A1, COL11A2, and SMAD3). OA variants associate with four of these genes (COL27A1, GDF5, COL11A1, and SMAD3). 

Valdes (2018 Table 24.1) gives a partial list of genes associated with monogenic diseases that display early-onset OA or OA-like symptoms (COL2A1, COL11A1, COL11A2, COMP, COL9A2, COL9A3, MATN3, COL9A1, DYM, WISP3, AGC1, PAPSS2, MATN2, MMP13, and SEDL). OA variants associate with three of these genes (COL11A1, MATN3, and COL9A1). 

9) BEYOND DNA – RNA, PROTEINS, AND METABOLITES (AMINO ACIDS, LIPIDS, ETC.)  

The DNA that we inherit is only the first chapter in a complicated process involving many other factors that eventually results in OA. A great deal of genetic research has been done to understand these other  factors and elucidate diagnostic, prognostic, and therapeutic targets (Thysen 2015). Changes in the  DNA, flagged by SNP associations, can have multiple effects downstream by altering transcription and  protein function and influencing epigenetic mechanisms such as methylation and micro-RNA  recruitment. Huang (2015) provides an understandable discussion for complex diseases in general.  Epigenetic mechanisms are a major conduit through which OA gene variants affect OA and will be the  key to personalized medicine (Rice 2020; Simon and Jeffries 2017; Ramos and Meulenbelt 2017).  Functional genomics is the broad area connecting genotypes and phenotypes (Steinberg and Zeggini 2016). The three main areas are diagrammed in Figures 5 and 6 below. 

Transcriptome (expression) 

Proteome (biomarkers) 

Epigenome (DNA methylation, etc.)


Figure 5. Central dogma of molecular biology defines the usual flow of information in a cell from  DNA to RNA to protein (Central dogma).


Figure 6. Each cell has its own epigenetic signature, which reflects genotype and environmental  influence, and is ultimately reflected in the phenotype of the cell and organism. Thus, most  genetic findings must be considered in an epigenetic and environmental context (Dwivedi 2011  Fig 3). 

METHYLATION (REYNARD 2017) 

SNPs can influence DNA methylation, an epigenetic process that adds methyl groups to the  DNA molecule and typically acts to suppress gene transcription. This mechanism has been  demonstrated in OA genes involved in cartilage: GLT8D1, SUPT3H, ALDH1A2 and GDF5. 

INFLAMMATION (ROGERS 2015; SHEN 2017) 

Although OA has an inflammatory component, it appears that DNA variation in inflammatory  genes does not contribute to OA susceptibility. However, epigenetic processes involving  inflammation may be related to OA initiation and development, and may be a target for  therapeutic intervention. Of particular interest is that OA patients can be stratified according to  whether inflammation-related genes are involved in their disease, which may explain the  variable effectiveness of NSAIDs in reducing pain. 

BIOMARKERS (RUIZ-ROMERO 2018 REVIEW) 

Microarrays can be used to identify potential target genes from differential expression between  samples from healthy synovial membrane and early and late-stage OA (Li, Meng 2017). 

Budd (2018) focuses on biomarkers expressed in synovial fluid and blood that could be used to detect and monitor OA disease. These tests could be done at a pre-symptomatic stage long before changes to the joint become apparent on imaging. 

Joseph (2018a) examined associations between serum/urine biomarkers for OA and MRI-T2  evidence of early cartilage degeneration. This is an example of the utility of biomarkers in  detecting and monitoring OA. 

Li (Li, Ping 2019) performed a transcriptome-wide association study on UK Biobank data and  found multiple candidate genes for chondropathies (cartilage disease). 

Xiao (2019) reports on candidate proteins for a diagnostic urine test for OA. 

Liu (Liu, Li 2020) performed a large scale genetic correlation scan of human plasma proteins,  also on the UK Biobank data. 

10) CLINICAL APPLICATIONS 

The immediate clinical applications of genetics in OA will probably be to use biomarkers from blood,  urine, or joint tissue to detect and monitor the disease. These are not unrelated to the genotype of the  patient, in that DNA variants influence the downstream processes of RNA expression, protein  synthesis, and epigenetic regulation, and ultimately the individual's own disease phenotype. Biomarker  tests would be repeated at intervals, and trends in the results would be highly informative. 

A DNA test would be a one-time result, as an individual's genotype is fixed at birth. Although OA has a high degree of heritability, previous models that combined relatively few genetic loci with clinical information have had limited success, and are not to my knowledge being used in practice. Now, a polygenic risk score could utilize the large number of variants that have recently been discovered in GWA studies, and benefit from our improved understanding of the etiology of OA. A polygenic risk score that was combined not only with clinical factors, but also with time, would be an innovation. There are various choices for an endpoint, such as diagnosis of OA, progression to severe disease, or joint replacement surgery. 
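As a rough illustration of what a polygenic risk score is, the sketch below computes the most common simple form: a weighted sum of the number of risk alleles a person carries at each variant. It is a minimal example under stated assumptions, not the method of any study cited here. The function name, the variant identifiers, and the weights are hypothetical; in practice the weights would be per-allele effect estimates (for example log odds ratios) taken from a GWAS.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    # Only variants present in both the genotype and the weight table contribute.
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


# Hypothetical per-allele weights; these are NOT taken from any study.
example_weights = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": -0.02}

# Hypothetical genotype: number of risk alleles carried at each variant.
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(example_dosages, example_weights)
print(round(score, 3))
```

A score like this would then enter a model alongside clinical factors, and, as suggested above, a time-to-event model could be used when the endpoint is progression or joint replacement.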

The utility of a DNA test would be in identifying younger patients who have a familial predisposition  or susceptibility to OA. In slightly older patients with symptomatic OA, a polygenic risk score would  be used to predict progression. There is also an age limitation on the utility of DNA testing; several  studies have shown that the hereditary component of OA becomes attenuated after age 65, when the  normal effects of aging on the joints take over. 

This literature review found no studies of whether an individual's inherited DNA genotype could be  used directly to select either preventative or restorative treatments. It appears that for now, the therapy  is chosen to address an observed problem with the patient's joint. However, it is easy to speculate that  an inherited defect in a particular gene, for example, one involved in cartilage repair, might be reason  to select a treatment that targeted that gene specifically.  

SUMMARY 

  1. OA is a heterogeneous combination of cartilage and/or bone and/or clinical complaints. Various definitions of OA have been used in genetic studies (KL score, self-reported, hospital diagnosis, surgical intervention) but there does not appear to be a clear winner. There is also evidence that OA is under-reported, possibly because individuals can be pre-symptomatic, or do not seek care until their symptoms are severe.
  2. OA has clinical risk factors (female sex, joint loading activity, previous joint trauma, and smoking status). It may occur with other diseases and conditions that have a genetic component (type 2 diabetes, obesity, cardiovascular disease, depression, and osteoporosis) where it is difficult to sort out the causal relationships. For example, is there a genetic cause of both obesity and OA, or does obesity cause OA by putting more stress on the joints, or does OA contribute to obesity by making it difficult to exercise? All of these factors need to be included in a genetic model to eliminate confounding.
  3. The fact that OA is a polygenic disease has implications for how genetic studies must be conducted. GWA studies with large sample sizes are necessary to detect the many lower risk variants.
  4. GWAS designs have special considerations. Adjustments must be made for relatedness, genetic drift or population stratification, and ethnicity. Multiple testing inevitably results in a certain fraction of false positives, and although corrections can be made, it is essential to replicate the results in another independent cohort.
  5. Narrow sense heritability, defined as the proportion of variation between individuals that can be attributed to genetic differences, is 52% for hip, 15% for knee, and 24% for hip and/or knee. Overall, it is estimated that the heritability of OA is about 50%.
  6. One of the best performing clinical-genetic models predicted radiographic progression in knee OA patients. The clinical model alone had an AUC of 0.66, where 1 would be perfect classification and 0.5 random classification. A genetic score utilizing 8 of 23 candidate SNPs obtained an AUC of 0.78. The combined clinical-genetic model had an AUC of 0.82. The performance of this genetic model was considerably better than previous AUCs around 0.65. A possible lesson is that the outcome was progression rather than incidence. End-stage OA is also a better endpoint than surgery. (An important caution is that fitting a polygenic risk score in the same cohort where the SNPs were first identified introduces a bias. It is likely that the estimates of the effects, i.e. coefficients, for individual SNPs will by chance have been somewhat higher in the discovery cohort than in the population as a whole.)
  7. The UK Biobank data, fully released in 2017, together with the UK Osteoarthritis Genetics Consortium (arcOGEN), have made possible large-scale GWA studies of polygenic diseases. We are fortunate to have a well documented genetic study with sophisticated methodology listing 96 SNPs associated with OA (Tachmazidou 2019). Results included 64 biologic processes associated with OA, of which 46 were related to bone, cartilage, or chondrocyte morphology. Four genes (TGFB1, FGF18, CTSK, and IL11) are in clinical trials for OA or cartilage regeneration, and six others are potential therapeutic targets (Tachmazidou 2019 Table 2).
  8. Most of the 96 SNPs associated with OA have been confirmed either by replication in independent cohorts, or by looking at gene expression and proteins in tissue from patients undergoing surgery, relevant rare-human-disease and animal-model evidence, and using other sophisticated bioinformatic tools.
  9. Changes in the DNA, flagged by SNP associations, can have multiple effects downstream by altering transcription and protein function and influencing epigenetic mechanisms such as methylation and micro-RNA recruitment.
  10. The immediate clinical applications of genetics in OA will probably be to use biomarkers from blood, urine, or joint tissue to detect and monitor the disease. These are not unrelated to the genotype of the patient, in that DNA variants influence the downstream processes of RNA expression, protein synthesis, and epigenetic regulation, and ultimately the individual's own disease phenotype. Biomarker tests would be repeated at intervals, and trends in the results would be highly informative.

A DNA test would be a one-time result, as an individual's genotype is fixed at birth. Although OA has a high degree of heritability, previous models that combined relatively few genetic loci with clinical information have had limited success, and are not to my knowledge being used in practice. Now, a polygenic risk score could utilize the large number of variants that have recently been discovered in GWA studies, and benefit from our improved understanding of the etiology of OA. 

A polygenic risk score that was combined not only with clinical factors, but also with time, would be  an innovation. There are various choices for an endpoint, such as diagnosis of OA, progression to  severe disease, or joint replacement surgery. 

The utility of a DNA test would be in identifying younger patients who have a familial predisposition  or susceptibility to OA. In slightly older patients with symptomatic OA, a polygenic risk score would  be used to predict progression. There is also an age limitation on the utility of DNA testing; several  studies have shown that the hereditary component of OA becomes attenuated after age 65, when the  normal effects of aging on the joints take over. 

This literature review found no studies of whether an individual's inherited DNA genotype could be  used directly to select either preventative or restorative treatments. It appears that for now, the therapy  is chosen to address an observed problem in the patient's joint. However, it is easy to speculate that an  inherited defect in a particular gene, for example, one involved in cartilage repair, might be reason to  select a treatment that targeted that gene specifically rather than other genes in the same pathway.  
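The AUC figures quoted in the summary for the clinical and genetic models can be read as a probability: the chance that a randomly chosen patient who progressed is given a higher risk score than a randomly chosen patient who did not. The sketch below computes AUC directly from that definition. It is a minimal illustration; the function name and the example scores and labels are hypothetical and are not data from any cited study.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random control (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Compare every case with every control.
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores; label 1 = progressed, 0 = did not progress.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_labels = [1, 1, 1, 0, 0, 0]

example_auc = auc(example_scores, example_labels)
print(round(example_auc, 3))
```

A score that ranks every case above every control gives 1.0, and a score carrying no information gives about 0.5, which matches the scale described in the summary. Because of the discovery-cohort bias noted there, an AUC is more trustworthy when it is computed in a cohort that was not used to select the SNPs or estimate their weights.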

SUPPLEMENTARY TABLES

REFERENCES 

Blanco et al. (2015). Improved prediction of knee osteoarthritis progression by genetic polymorphisms: the Arthrotest Study, Rheumatology, 54(7):1236-1243. doi.org/10.1093/rheumatology/keu478 

Budd et al. (2018). Extracellular genomic biomarkers of osteoarthritis. Expert Rev Mol Diagn.  18(1):55-74. doi.org/10.1080/14737159.2018.1415757 

Bycroft et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature,  562:203-209. doi.org/10.1038/s41586-018-0579-z

Casalone et al. (2018). A novel variant in GLIS3 is associated with osteoarthritis. Ann Rheum Dis.,  77(4):620-623. doi:10.1136/annrheumdis-2017-211848 

Chapman et al. (2012). Genetic factors in OA pathogenesis. Bone, 51(2):258-64.  doi:10.1016/j.bone.2011.11.026 (copy not obtained) 

Dunn et al. (2020). Risk scoring for time to end-stage knee osteoarthritis: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage, 28(8):1020-1029. doi:10.1016/j.joca.2019.12.013 

Dwivedi et al. (2011). Beyond genetics: epigenetic code in chronic kidney disease. Kidney  International, 79(1):23-32. doi.org/10.1038/ki.2010.335 

Flechsenhar et al. (2019). Sample size calculations for detecting disease-modifying osteoarthritis drug  effects on the incidence of end-stage knee osteoarthritis in clinical trials: Data from the Osteoarthritis  Initiative. Semin Arthritis Rheum, 49(1):3-8. doi:10.1016/j.semarthrit.2018.12.002 

Halilaj et al. (2018). Modeling and predicting osteoarthritis progression: data from the osteoarthritis  initiative. Osteoarthritis Cartilage, 26(12):1643-1650. doi:10.1016/j.joca.2018.08.003 

Huang et al. (2015). Genetic study of complex diseases in the post-GWAS Era. Jour Genetics and  Genomics, 42(3):87-98. doi.org/10.1016/j.jgg.2015.02.001 

Joseph et al. (2018). Tool for osteoarthritis risk prediction (TOARP) over 8 years using baseline  clinical data, X-ray, and MRI: Data from the osteoarthritis initiative. J Magn Reson Imaging,  47(6):1517-1526. doi:10.1002/jmri.25892 

Joseph et al. (2018a). Associations between molecular biomarkers and MR-based cartilage composition and knee joint morphology: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage,  26(8):1070-1077. doi:10.1016/j.joca.2018.04.019 

Kerkhof et al. (2014). A genome-wide association study identifies an osteoarthritis susceptibility locus  on chromosome 7q22. Arthritis Rheum, 62(2):499–510. doi:10.1002/art.27184 

Li, Meng et al. (2017). Identification of potential target genes associated with the pathogenesis of  osteoarthritis using microarray based analysis. Mol Med Rep, 16(3):2799-2806.  doi:10.3892/mmr.2017.6928 

Li, Ping et al. (2019). Integrating transcriptome-wide study and mRNA expression profiles yields novel insights into the biological mechanism of chondropathies. Arthritis Res Ther, 21(1):194.  doi:10.1186/s13075-019-1978-8 

Liu, Li et al. (2020). Assessing the genetic relationships between osteoarthritis and human plasma  proteins: a large scale genetic correlation scan. Ann Transl Med, 8(11):677. doi:10.21037/atm-19-4643 

Liu, Y et al. (2017). Genetic Determinants of Radiographic Knee Osteoarthritis in African Americans. J Rheumatol. 44(11):1652-1658. doi:10.3899/jrheum.161488 

Magnusson et al. (2018). A naturally aging knee, or development of early knee osteoarthritis? Osteoarthritis Cartilage, 26(11):1447-1452. doi:10.1016/j.joca.2018.04.020 

Meng et al. (2019). A genome-wide association study identifies that the GDF5 and COL27A1 genes are associated with knee pain in UK Biobank (N = 171,516). doi.org/10.1101/525147 

Meng et al. (2020). A genome-wide association study finds genetic variants associated with neck or  shoulder pain in UK Biobank, Human Molecular Genetics, 29(8):1396-1404.  doi.org/10.1093/hmg/ddaa058 

Oliynyk (2019). Age-related late-onset disease heritability patterns and implications for genome-wide association studies. PeerJ, 7:e7168. doi.org/10.7717/peerj.7168 

Ramos and Meulenbelt (2017). The role of epigenetics in osteoarthritis: current perspective. Curr Opin Rheumatol. 29(1):119-129. doi:10.1097/BOR.0000000000000355 

Reynard (2017). Analysis of genetics and DNA methylation in osteoarthritis: What have we learnt  about the disease? Semin Cell Dev Biol, 62:57-66. doi: 10.1016/j.semcdb.2016.04.017 

Rice et al. (2020). Interplay between genetics and epigenetics in osteoarthritis. Nat Rev Rheumatol.  16(5):268-281. doi:10.1038/s41584-020-0407-3 

Rogers et al. (2015). The role of inflammation-related genes in osteoarthritis. Osteoarthritis Cartilage, 23(11):1933-1938. doi: 10.1016/j.joca.2015.01.003 

Ruiz-Romero et al. (2018). What did we learn from 'omics' studies in osteoarthritis. Curr Opin  Rheumatol. 30(1):114-120. doi:10.1097/BOR.0000000000000460 

Shen et al. (2017). Inflammation and epigenetic regulation in osteoarthritis. Connect Tissue Res, 58(1):49-63. doi: 10.1080/03008207.2016.1208655 

Simon and Jeffries (2017). The Epigenomic Landscape in Osteoarthritis. Curr Rheumatol Rep, 19:30. doi.org/10.1007/s11926-017-0661-9 

Spector and MacGregor (2004). Risk factors for osteoarthritis: genetics. Osteoarthritis and Cartilage,  12:S39-S44. 

Steinberg and Zeggini (2016). Functional genomics in osteoarthritis: Past, present, and future. J Orthop Res, 34(7):1105-1110. doi: 10.1002/jor.23296 

Styrkarsdottir et al. (2018). Meta-analysis of Icelandic and UK data sets identifies missense variants in  SMO, IL11, COL11A1 and 13 more new loci associated with osteoarthritis. Nat Genet, 50(12):1681- 1687. doi: 10.1038/s41588-018-0247-0 

Tachmazidou et al. (2019). Identification of new therapeutic targets for osteoarthritis through genome-wide analyses of UK Biobank data. Nature Genetics, 51(2):230-236. doi.org/10.1038/s41588-018-0327-1 

Takahashi et al. (2010). Prediction model for knee osteoarthritis based on genetic and clinical  information. Arthritis Res Ther, 12:R187. doi:10.1186/ar3157

Thysen et al. (2015). Targets, models and challenges in osteoarthritis research. Dis Model Mech,  8(1):17-30. doi: 10.1242/dmm.016881 

Trachana et al. (2019). Understanding the role of chondrocytes in osteoarthritis: utilizing proteomics.  Expert Rev Proteomics, 16(3):201-213. doi:10.1080/14789450.2019.1571918 

Uhalte et al. (2017). Pathways to understanding the genomic aetiology of osteoarthritis. Hum Mol  Genet, 26(R2):R193-R201. doi:10.1093/hmg/ddx302 

Valdes (2018). Chapter 24 - Osteoarthritis: Genetic studies of monogenic and complex forms in  Genetics of bone biology and skeletal disease (second edition). Academic Press, 421-438. doi.org/10.1016/B978-0-12-804182-6.00024-1 

Warner and Valdes (2016). The Genetics of Osteoarthritis: A Review. J. Funct. Morphol. Kinesiol., 1,  140-153. doi.org/10.3390/jfmk1010140 

Wenjian et al. (2020). A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank. Am Jour Human Gen, 107:222-233. doi.org/10.1016/j.ajhg.2020.06.003 

Xiao et al. (2019). Urine Proteomics Profiling and Functional Characterization of Knee Osteoarthritis  Using iTRAQ Technology. Horm Metab Res, 51(11):735-740. doi:10.1055/a-1012-8571 

Yu et al. (2018). Underrecording of osteoarthritis in United Kingdom primary care electronic health  record data. Clin Epidemiol. 10:1195-1201. doi:10.2147/CLEP.S160059 

Zeggini et al. (2012). Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study. Lancet, 380(9844):815-823. doi:10.1016/S0140-6736(12)60681-3 

Zengini et al. (2016). The Genetic Epidemiological Landscape of Hip and Knee Osteoarthritis: Where  Are We Now and Where Are We Going? J Rheumatol, 43(2):260-266. doi: 10.3899/jrheum.150710 

Zengini et al. (2018). Genome-wide analyses using UK Biobank data provide insights into the genetic  architecture of osteoarthritis. Nat Genet, 50(4):549-558. doi:10.1038/s41588-018-0079-y 

Zhang et al. (2011). Nottingham knee osteoarthritis risk prediction models. Annals of the Rheumatic  Diseases, 70(9):1599-1604. doi:10.1136/ard.2011.149807
