Transcription factor IIH, or TFIIH, pronounced “TF two H,” is a veritable workhorse among the protein complexes that control human cell activity. It plays critical roles both in transcription — the highly regulated enzymatic synthesis of RNA from a DNA template — and in the repair of damaged DNA. But how can one protein assembly participate in two such vastly different and extremely important genomic tasks?
A team of researchers led by chemistry professor Ivaylo Ivanov of Georgia State University used the Summit supercomputer at the Department of Energy’s Oak Ridge National Laboratory to help answer that question. By conducting multiple molecular dynamics simulations of TFIIH in transcription and DNA repair-competent states and then contrasting the structural mechanisms at work, Ivanov and his team made an interesting discovery: TFIIH is a shapeshifter, reconfiguring itself to meet the demands of each task.
Unraveling the inner workings of TFIIH at the interface of transcription and DNA repair is key for understanding the origins of genetic disorders caused by mutations — hereditary diseases such as xeroderma pigmentosum, trichothiodystrophy and Cockayne syndrome. The GSU team published its results in the journal Nature Communications.
“This project illustrates how versatile protein assemblies can be, given they participate in vastly different cellular processes. Understanding how genetic mutations impair the function of TFIIH is the first step in designing therapeutic strategies such as gene editing,” Ivanov said.
The project’s findings are only the latest in Ivanov’s ongoing research into the molecular machinery of gene expression using the supercomputers at the Oak Ridge Leadership Computing Facility, a DOE Office of Science user facility at ORNL.
Transcription initiation vs. DNA repair
The structure of TFIIH has been mapped through cryo-electron microscopy, but understanding its functional dynamics during transcription initiation and DNA repair required the GSU team to model the large-scale dynamics of systems of nearly 2 million atoms — with multiple copies running simultaneously.
“We often rely on chain-of-replicas simulations to describe large-scale conformational changes in biomolecular complexes,” Ivanov said. “To carry out these types of simulations, you must be able to run many replicas of the simulation system at the same time. This only becomes possible if you have a large number of GPU nodes available, such as on Summit. In one case, we used about 70 replicas, so the computational cost to delineate any of these mechanisms rises very quickly.”
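To make the idea of a chain of replicas concrete, here is a minimal sketch of one common way to seed such a chain: linearly interpolating between two end-state conformations so that each replica starts at a different point along the path. The article does not say how the GSU team built or refined its paths, so the function name `interpolate_replicas` and the two-atom toy coordinates are illustrative assumptions only.

```python
def interpolate_replicas(start, end, n_replicas):
    """Return n_replicas conformations evenly spaced from start to end, inclusive.

    start, end: lists of (x, y, z) tuples with one entry per atom.
    This only seeds a path; a real chain-of-replicas method would then
    refine every replica with molecular dynamics.
    """
    if n_replicas < 2:
        raise ValueError("need at least two replicas (the two end states)")
    chain = []
    for r in range(n_replicas):
        t = r / (n_replicas - 1)  # 0.0 at the first end state, 1.0 at the last
        chain.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ])
    return chain


# Hypothetical two-atom toy system; the real systems have nearly 2 million atoms.
open_state = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
closed_state = [(0.0, 0.0, 0.0), (1.0, 4.0, 0.0)]

chain = interpolate_replicas(open_state, closed_state, n_replicas=5)
print(chain[2][1])  # position of the second atom in the middle replica
```

Because every replica is a full copy of the system, the cost scales with the replica count: about 70 replicas of a nearly 2-million-atom system means on the order of 140 million atoms being simulated at once.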
TFIIH is an integral component of the transcription preinitiation complex, or PIC, an assembly of proteins vital to gene expression that Ivanov and his team had previously modeled on Summit. As its name indicates, the PIC helps trigger transcription, the process in which a gene’s DNA sequence is copied into messenger RNA. The mRNA then carries that genetic information into the cell’s cytoplasm, where it is translated into a protein that can carry out its encoded function, e.g., preventing disease or supplying energy.
“TFIIH is the component of the assembly that contains the molecular motor that unwinds the duplex DNA at a specific location in the genome and pushes it toward the RNA polymerase active site. Without this initial DNA unwinding to expose the template strand, transcription really wouldn't work,” Ivanov said.
Transcription factor IIH is also a key constituent of the protein machinery that performs nucleotide excision repair, or NER, a versatile DNA repair pathway that removes a wide range of genomic lesions caused by ultraviolet light, chemotherapy treatments and exposure to environmental carcinogens.
The team focused on how two of TFIIH’s subunits, XPB and XPD, act differently to reshape the DNA. XPB and XPD sit at the edges of TFIIH’s horseshoe-shaped assembly. In transcription initiation, the horseshoe has an open conformation, with XPB serving as the active component that unwinds DNA. XPD, meanwhile, fulfills a purely structural role: DNA is directed away from it, and its DNA binding groove is blocked.
“XPD is regulated in a way to prevent it from processing DNA. Another subunit of TFIIH called p62 serves a regulatory role — it inserts itself into the DNA binding groove of XPD and blocks its function,” Ivanov said.
However, when scanning for lesions during DNA repair — either nucleotide excision repair or transcription-coupled NER — TFIIH adopts a closed conformation, and the roles of XPB and XPD are reversed.
“Previously, we had modeled TFIIH dynamics within the PIC, which allowed us to partition the complex into functional modules,” Ivanov said. “We noticed that, intriguingly, the interfaces between functional modules harbored most of the disease-associated TFIIH mutations. However, at the time, we didn’t have the simulations of the nucleotide excision repair competent state, and that provided an incomplete picture of what the TFIIH was doing in DNA repair.”
The new, detailed picture of TFIIH’s mechanical dynamics provides insights into the principal motions that allow TFIIH to remodel DNA in transcription initiation versus nucleotide excision repair. Those insights could inform efforts to treat genetic disorders linked to TFIIH mutations.
“Innovative computational approaches, such as those described in this report, enliven static images of biological machines and enrich dynamic views of how they work,” said Manju Hingorani, a program director in the National Science Foundation’s Directorate for Biological Sciences. “In this case, new knowledge of how a protein complex reshapes and self-regulates to allow repair of damaged DNA and restore cellular function can explain how defects in the process cause disease.”
Dynamic structural analysis
The GSU team used graph algorithms to partition TFIIH’s protein network into strongly connected components, allowing them to identify dynamic modules, the pieces that move together. In turn, the resulting network models showed how the modules move with respect to other parts of the structure.
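As a rough illustration of partitioning a protein network into modules, the sketch below keeps only strongly coupled pairs of residues and then groups residues into connected components with a breadth-first search. This is a deliberately simple stand-in, not the team's algorithm: the article does not name the graph method used, and the function `connected_modules`, the six-node network and its edge weights are all hypothetical.

```python
from collections import deque


def connected_modules(n_nodes, weighted_edges, threshold):
    """Group nodes into modules.

    Edges with weight >= threshold are kept; nodes that remain linked,
    directly or indirectly, end up in the same module.
    """
    adjacency = {i: set() for i in range(n_nodes)}
    for a, b, weight in weighted_edges:
        if weight >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    modules = []
    for start in range(n_nodes):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        queue = deque([start])
        seen.add(start)
        module = []
        while queue:
            node = queue.popleft()
            module.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        modules.append(sorted(module))
    return modules


# Hypothetical toy network: 6 residues with made-up coupling strengths.
toy_edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.2), (3, 4, 0.85), (4, 5, 0.7)]

modules = connected_modules(6, toy_edges, threshold=0.5)
print(modules)  # [[0, 1, 2], [3, 4, 5]]
```

The weak 2-3 link falls below the threshold, so the toy network splits into two modules. Lowering the threshold merges them into one, which is the kind of sensitivity a real analysis would need to examine.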
“Now we can compare and contrast the functional dynamics of TFIIH when it is active in transcription versus when it is active in nucleotide excision repair,” Ivanov said. “All of a sudden, you see communities that were previously locked together begin to open up and participate in motions that you would not have anticipated just by looking at the transcription competent state.”
The researchers can also map different types of information onto the protein network model, such as dynamic correlations or contact probabilities. This allows them to focus on the important interfaces that are changing in the respective structural transitions and analyze them in detail. It then becomes possible to classify the mutations of different disease phenotypes based on where they sit in TFIIH’s structure and the dynamic roles that they play.
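The two quantities mentioned above can be sketched on a toy trajectory. A contact probability is the fraction of simulation frames in which two sites sit within a cutoff distance, and a dynamic correlation measures whether their displacements from their average positions point the same way. The function names, the cutoff and the three-site, four-frame trajectory below are assumptions made for illustration, not the team's data or code.

```python
import math


def contact_probability(frames, i, j, cutoff):
    """Fraction of frames in which sites i and j lie within `cutoff` of each other."""
    hits = sum(1 for frame in frames if math.dist(frame[i], frame[j]) <= cutoff)
    return hits / len(frames)


def displacement_correlation(frames, i, j):
    """Normalized covariance of the displacements of sites i and j.

    +1 means the sites move together, -1 means they move in opposite
    directions, and values near 0 mean their motions are unrelated.
    """
    n = len(frames)
    dim = len(frames[0][i])
    mean_i = [sum(f[i][k] for f in frames) / n for k in range(dim)]
    mean_j = [sum(f[j][k] for f in frames) / n for k in range(dim)]
    cross = var_i = var_j = 0.0
    for f in frames:
        for k in range(dim):
            di = f[i][k] - mean_i[k]
            dj = f[j][k] - mean_j[k]
            cross += di * dj
            var_i += di * di
            var_j += dj * dj
    return cross / math.sqrt(var_i * var_j)


# Hypothetical trajectory: sites 0 and 1 drift together, site 2 moves the other way.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]

print(contact_probability(frames, 1, 2, cutoff=1.5))  # 0.25
print(displacement_correlation(frames, 0, 1))         # 1.0
print(displacement_correlation(frames, 0, 2))         # -1.0
```

Values like these could serve as edge weights on a protein network, so that interfaces whose contacts or correlations change between two states stand out.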
“Having these different dynamic ensembles in the transcription case versus the NER competent state, you can do a very detailed analysis of how patient mutations for various genetic disorders are positioned with respect to the dynamic communities that we have identified,” Ivanov said. “Basically, there is a possibility — by understanding the mechanisms of transcription and NER in TFIIH — to direct its function toward one or the other pathway.”
This study was funded by the National Science Foundation’s Directorate for Biological Sciences, the National Institute of Environmental Health Sciences and the National Cancer Institute. An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment Program at the OLCF.
UT-Battelle manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science. — Coury Turczyn