RETROTRANSPOSONS AS MODELS FOR THE
REPLICATION OF RETROVIRUSES

Henry L. Levin, PhD, Head, Section on Eukaryotic Transposable Elements

Angela Atwood-Moore, BA, Senior Research Assistant

Tracy Ripmaster, PhD, Research Assistant

Min-Kyeong Kim, PhD, Postdoctoral Fellow

Hirotaka Ebina, PhD, Visiting Fellow

Young-Eun Leem, PhD, Visiting Fellow

Xu Lin, PhD, Visiting Fellow

Amnon Hizi, PhD, ORISE Fellow¹

Kenechi Ejebe, BA, Postbaccalaureate Fellow

Marc Heincelman, BA, Postbaccalaureate Fellow

Robert Judson, BA, Postbaccalaureate Fellow

Felice Kelly, BA, Postbaccalaureate Fellow

Julie McClure, BA, Postbaccalaureate Fellow

Jay Myung, BA, Postbaccalaureate Fellow

Christopher Plymire, BA, Postbaccalaureate Fellow

The prevalence of retrovirus-caused diseases such as AIDS and leukemia has intensified the need to understand the mechanisms of retrovirus replication. Our primary objectives are to understand how reverse transcription of viral mRNA occurs and how the cDNA products are integrated into the genome of infected cells. Given their similarity to retroviruses, LTR-retrotransposons (long terminal repeat retrotransposons) are important models for retrovirus replication. We are studying the Tf1 element, an LTR-retrotransposon of the fission yeast Schizosaccharomyces pombe. During the synthesis of cDNA, reverse transcriptase (RT) generates a series of highly specific intermediates. We identify which amino acids of RT recognize specific cDNA intermediates by screening large numbers of mutant transposons with genetic assays that measure cDNA intermediates. After reverse transcription, the mature cDNA is transported within a large preintegration complex into the nucleus. Using a battery of molecular techniques, we determine which factors mediate the traffic of Tf1 into the nucleus. Once in the nucleus, the cDNA is integrated at positions controlled by interactions with integrase (IN) and features of chromatin structure. The integration of Tf1 occurs specifically into pol II promoters. We study the structures of IN and of the target sites that are responsible for this interesting mechanism.

Integration preference of Tf1 for pol II promoters

The complete DNA sequence of the genome of S. pombe provided the opportunity to investigate the entire complement of transposable elements (TEs) and their chromosomal distribution. Our analysis identified 186 pre-existing insertions of Tf transposons. We found that 96 percent of all insertions were located in intergenic regions of the genome that contained pol II promoters. In addition, the LTRs were clustered within 300 nucleotides of the 5´ ends of the open reading frames (ORFs). The bias in the position of Tf sequences was highly similar in pattern and magnitude to the positions of insertions resulting from the induction of Tf1 transposition under laboratory conditions. Such extensive similarities strongly suggest that the biases in the position of Tf sequences result from preferences of the integration process. The data indicate that Tf elements recognize and insert upstream of RNA polymerase II promoters.
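The kind of positional survey described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the study: the `Orf` record, the function names, and all coordinates are hypothetical and made up for illustration, and the sketch assumes a single linear chromosome with annotated ORF boundaries.

```python
from typing import NamedTuple, Optional


class Orf(NamedTuple):
    start: int   # leftmost coordinate (inclusive)
    end: int     # rightmost coordinate (inclusive)
    strand: str  # "+" or "-"


def is_intergenic(pos: int, orfs: list[Orf]) -> bool:
    """True if the position falls outside every annotated ORF."""
    return not any(o.start <= pos <= o.end for o in orfs)


def upstream_distance(pos: int, orf: Orf) -> Optional[int]:
    """Distance from pos to the ORF's 5' end if pos lies upstream of it, else None."""
    d = orf.start - pos if orf.strand == "+" else pos - orf.end
    return d if d > 0 else None


def summarize(insertions: list[int], orfs: list[Orf], window: int = 300) -> dict:
    """Fraction of insertions that are intergenic, and fraction lying
    within `window` nucleotides upstream of some ORF's 5' end."""
    intergenic = [p for p in insertions if is_intergenic(p, orfs)]
    near = 0
    for p in intergenic:
        dists = [d for o in orfs if (d := upstream_distance(p, o)) is not None]
        if dists and min(dists) <= window:
            near += 1
    n = len(insertions)
    return {
        "intergenic_fraction": len(intergenic) / n,
        "within_window_fraction": near / n,
    }


# Made-up toy annotation and insertion coordinates (not real S. pombe data).
toy_orfs = [Orf(1000, 2000, "+"), Orf(3000, 4000, "-")]
toy_insertions = [850, 1500, 4100, 2500]
toy_summary = summarize(toy_insertions, toy_orfs)
print(toy_summary)
```

The strand is taken into account so that "upstream" always means 5´ of the ORF, regardless of its orientation on the chromosome.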

To define the determinants of the target sites, we developed an in vivo assay for integration by using a plasmid that contained ade6 as the target. To cause target plasmids with insertions to gain resistance to kanamycin, we expressed a version of Tf1 that contained a neo gene. When Tf1-neo was expressed, the plasmid with ade6 served as an efficient target for integration. To determine which regions of ade6 were important for integration, we created a series of deletions and tested each plasmid as a target. While the entire ORF of ade6 could be removed without reducing integration, deletion of the sequence upstream of the ORF caused integration to drop significantly. Deletions in the 405-nucleotide region upstream of ade6 revealed that only the central third was required for efficient integration. We determined the positions of insertion by isolating kanamycin-resistant plasmids and sequencing the DNA flanking Tf1. We isolated 50 separate insertions in the intact target plasmid and found that 95 percent occurred within 300 nucleotides of the ade6 ORF. All insertions occurred within the same central third of the upstream region that was found to be necessary for efficient integration. In addition, the insertions clustered into four sites separated by 30 nucleotides each. To explore the significance of these four clusters, we used micrococcal nuclease to scan the intergenic region for transcription. The clusters of integration corresponded closely to sites of micrococcal nuclease supersensitivity, implying that Tf1 integration in the promoter of ade6 is positioned by transcription factors. Together, the data indicate that elements of the ade6 promoter are necessary and sufficient for recognition by Tf1. Additional experiments using fbp1 as a target support the conclusion that pol II promoters are the determinants of Tf1 integration. Over 85 percent of the insertions in the plasmid with fbp1 occurred in the promoter region.

Bowen N, Jordan I, Epstein J, Wood V, Levin H. Retrotransposons and their recognition of pol II promoters: a comprehensive survey of the transposable elements derived from the complete genome sequence of Schizosaccharomyces pombe. Genome Res 2003;13:1984-1997.

Kelly F, Levin HL. The evolution of retrotransposons in Schizosaccharomyces pombe. Cytogenet Genome Res 2005;110:566-574.

The GPY domain and the chromodomain of Tf1 integrase

The IN of Tf1 contains Zn-finger and catalytic domains similar to those of retroviral INs and, at its C-terminus, possesses a GPY domain and a chromodomain. Although the function of GPY domains is unknown, they are present in the INs of a few retroviruses and retrotransposons. Chromodomains are present in many chromatin-modifying factors and bind directly to histone H3 in specific nucleosomes, suggesting that Tf1 integration may be mediated by an interaction between IN and specific nucleosomes at pol II promoters.

By generating single amino acid substitutions in highly conserved residues, we tested whether the GPY domain and chromodomain of Tf1 participate in transposition. We expressed copies of Tf1 with these mutations in S. pombe and tested for transposition activity. Three alanine mutations in the conserved residues of the GPY domain each caused substantial reductions in integration. Likewise, the alanine mutations in the conserved residues of the chromodomain created equally low levels of transposition. Immunoblots revealed that none of these mutations reduced the expression of IN. Thus, the domains possessed specific activities that were critical for Tf1 integration. We used the same mutations to test whether the domains contributed to the position of integration. Results from the target plasmid assay revealed that the mutations caused a modest but reproducible reduction in the proportion of inserts that occurred in the promoter of ade6. The experiments indicate that both the GPY domain and the chromodomain play a critical role in integration but suggest that they serve a lesser function in positioning insertion.

To identify the function of the chromodomain, we examined the biochemical activities of IN both with and without the chromodomain (CH–). Using a six-his tag and Ni resin, we purified both proteins from E. coli. Because the INs of retroviruses are extremely insoluble and difficult to concentrate, we were surprised to find that the full-length IN of Tf1 was easy to purify and could be readily isolated in concentrations of 20 mg/ml. We tested the purified proteins for catalytic activity by using oligonucleotide substrates that mimic the ends of the transposon. We used one set of substrates to measure the reverse of integration, or so-called “disintegration,” activity. The reaction was highly sensitive and revealed that both IN and CH– possessed high levels of activity. We used various substrates to measure the integration activities of the proteins. Here, too, we observed high levels of activity, as indicated by the ability of the enzymes to insert oligonucleotides into each other. The most surprising result of the disintegration and integration assays was that CH– was seven times more active than IN, indicating that the chromodomain possessed inhibitory activity.

The INs of retroviruses have an absolute requirement for the dinucleotide CA at the 3´ termini of their cDNA. Oligonucleotides altered to test whether Tf1 IN required specific sequences at the 3´ ends revealed that the IN did indeed require the presence of the dinucleotide CA at the 3´ ends. We also tested whether the chromodomain contributed to the recognition of the terminal dinucleotide. The CH– protein exhibited a surprising relaxation of the sequence requirement at the 3´ end of the donor DNA. Taken together, the results indicate that the chromodomain functions as a negative regulator of integration and as a specificity factor for the donor DNA.

The INs of retroviruses possess processing activity that removes two terminal nucleotides from the 3´ ends of the cDNA to produce termini with the CA dinucleotide. Using oligonucleotides that mimic the ends of the cDNA, we assayed the Tf1 IN for processing activity. The IN had strong processing activity that removed between two and five additional nucleotides from the 3´ end of the oligonucleotides in vitro, suggesting that the several nucleotides 3´ of the CA dinucleotide are likely removed by IN in vivo.
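The terminal-sequence logic behind these assays can be shown with a small Python sketch. It only illustrates the bookkeeping (does a donor strand end in CA, and how many nucleotides lie 3´ of the last CA); it does not model enzyme activity. The function names and the oligonucleotide sequences are hypothetical, and the sketch assumes that the last CA in the strand is the relevant one.

```python
from typing import Optional


def ends_with_ca(seq: str) -> bool:
    """True if the strand, written 5'->3', terminates in the dinucleotide CA."""
    return seq.upper().endswith("CA")


def nucleotides_beyond_ca(seq: str) -> Optional[int]:
    """Number of nucleotides 3' of the last CA dinucleotide.

    Returns 0 for an already processed end and None if the strand
    contains no CA at all.
    """
    idx = seq.upper().rfind("CA")
    if idx == -1:
        return None
    return len(seq) - (idx + 2)


# Made-up example oligonucleotides (not the substrates used in the assays).
for oligo in ["ACGTCA", "ACGTCAGT", "ACGTGG"]:
    print(oligo, ends_with_ca(oligo), nucleotides_beyond_ca(oligo))
```

In this framing, a substrate that reports a value greater than zero would need to be trimmed by that many nucleotides before its 3´ end presents the CA dinucleotide.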

The results of the processing and integration assays demonstrate that the IN of Tf1 has the same activities as retroviral INs and is therefore an excellent model for the IN of HIV-1. In addition, the high solubility of the Tf1 IN suggests that the protein may form crystals, allowing the first high-resolution structure of an intact IN to be determined and thus significantly expanding our understanding of retrovirus integration.

Hizi A, Levin H. The integrase of the LTR-retrotransposon Tf1 has a chromodomain that modulates integration activities. J Biol Chem 2005;280:39086-39094.

Specific recognition and cleavage of the plus strand primer by reverse transcriptase

Reverse transcription of retroviruses and LTR-retrotransposons involves a complex sequence of reactions that produce several critical intermediate products, including the initial minus strand product of cDNA called a strong stop, the extended minus strand, the plus strand primer (PPT), the plus strand strong stop, and, ultimately, the full-length double-stranded cDNA. The synthesis of these intermediates requires both the DNA polymerization and RNase H activities of RT. The RNase H domain must degrade the RNA once it is used as a template and then remove the PPT once it has primed plus strand synthesis. Although much is known about the amino acids that catalyze the DNA synthesis and RNA degradation, less is known about the residues and structures that are required for the recognition and removal of the PPT.

By screening large numbers of RT mutants, we identified residues of Tf1 RT that recognize specific intermediates of cDNA. A combination of genetic assays and physical analyses identified a set of 35 mutations that inhibited integration without reducing reverse transcription. Our experiments focused on a cluster of mutations in RNase H that included a region with five single amino acid substitutions in a five–amino acid segment. Surprisingly, the mutations in RNase H did not reduce the levels of full-length double-stranded cDNA. Crystallographic studies by Edward Arnold and colleagues indicated that the corresponding residues of HIV-1 RT interact directly with the PPT. This observation led us to test whether the mutations in the RNase H of Tf1 were defective for either the recognition or processing of the PPT. A defect in the position of the PPT cleavage would alter the sequences at the 3´ end of the minus strand and, as a result, have a drastic impact on the ability of IN to catalyze strand transfer. We used ligation-mediated PCR to determine the sequence from the 3´ ends of the minus strand cDNA of Tf1 particles. We examined approximately 100 cDNA sequences produced by each RNase H mutant. The mutations clearly increased the levels of cDNA that retained the PPT at the 3´ end. The results demonstrated that the cluster of residues we identified in RNase H had the specific function of processing the PPT.
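As a rough illustration of how such a set of 3´-end sequences might be tallied, the Python sketch below counts the fraction of reads that still carry a PPT-derived sequence at their 3´ end. It is a sketch under stated assumptions, not the procedure used in the study: the PPT string is a made-up placeholder rather than the Tf1 PPT, the reads are invented, and each read is assumed to be oriented so that retained PPT sequence appears as a suffix.

```python
def fraction_retaining_ppt(reads: list[str], ppt: str) -> float:
    """Fraction of reads whose 3' end still carries the given PPT sequence."""
    if not reads:
        raise ValueError("no reads supplied")
    retained = sum(1 for r in reads if r.upper().endswith(ppt.upper()))
    return retained / len(reads)


# Hypothetical placeholder sequence; NOT the actual Tf1 PPT.
PLACEHOLDER_PPT = "AGGGGAG"

# Made-up reads standing in for sequenced 3' ends from one mutant.
toy_reads = ["TTACAGGGGAG", "TTACA", "GGTACA", "CCTACAGGGGAG"]

toy_fraction = fraction_retaining_ppt(toy_reads, PLACEHOLDER_PPT)
print(toy_fraction)
```

Comparing this fraction between wild-type and mutant samples would be one simple way to express an increase in PPT retention.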

During the selection of the primer, RNase H cleaves on either side of the PPT. In addition, RNase H cleaves the PPT to remove the primer after initiation of the reverse transcription of the plus strand. Alterations in either the selection or removal of the PPT could result in cDNA that retains the PPT sequence at the 3´ end. A defect in the selection of the PPT would result in changes in the DNA sequence at the 3´ and 5´ ends of the cDNA while a reduction in PPT removal would not alter the DNA sequence at the 5´ end of the cDNA. To distinguish between these two possibilities, we analyzed the 5´ ends of the cDNA by primer extension. In examining the two mutants that produced the highest levels of PPT at the 3´ end of the minus strand, we found no changes in the sequence at the 5´ end of the plus strand. The results provide strong evidence that the mutations in RNase H specifically inhibited the removal of the PPT RNA from the 5´ end of the plus strand. Thus, our data identified a cluster of conserved amino acids in RNase H that have the specific function of removing the PPT.

Atwood-Moore A, Ejebe K, Levin HL. Specific recognition and cleavage of the plus strand primer by reverse transcriptase. J Virol 2005;79:14863-14875.

Nuclear localization and particle assembly of Tf1

The IN and cDNA of retroviruses form a preintegration complex (PIC) that must access the nucleus to perform integration. Because HIV-1 infects nondividing cells, the PIC must enter the nucleus through the nuclear pore complex (NPC). The matrix, virus protein R, IN, and DNA flap structure of the viral cDNA are among the components that have nuclear-localizing activity and may contribute to the nuclear import of the PIC. In addition to its ability to traffic through the NPC of nondividing cells, the PIC of HIV-1 can be imported into the nucleus of dividing cells while the nuclear envelope remains intact, evidence underscoring the possibility that mechanisms of nuclear import may be important for the propagation of other retroviruses.

In an effort to model the import of HIV-1 into the nucleus, we examined the retrotransposon Tf1 in S. pombe. The Gag and cDNA of Tf1 enter the nucleus only after cells reach the stationary phase of growth. Previous studies identified a nuclear localization signal (NLS) in the N-terminus of Gag that is required for transposition. Mutations in the NLS cause a severe defect in the nuclear localization of Gag and the cDNA. In separate experiments, we found that Nup124p, a factor of the nuclear pore, had a specific activity required for nuclear import of Tf1. Mutations in nup124 cause a significant defect in the import of Tf1 Gag and, surprisingly, do not reduce the import of other proteins. The results of two-hybrid analyses and precipitation studies revealed an interaction between the N-terminus of Nup124p and the Gag of Tf1. We proposed that the binding of Gag to Nup124p could mediate the nuclear import of Tf1.

By studying the import of large proteins consisting of sections of Gag fused to GFP and lacZ, we further explored the function of Nup124p in the import of Tf1 Gag. Interestingly, Nup124p was required for import of the first 50 amino acids of Gag fused to GFP-lacZ. The requirement for Nup124p was mapped to residues 10 through 30 of Gag. To understand how the Gag of Tf1 is imported into the nucleus and to determine the role of Nup124p in the import process, we introduced five independent mutations in Gag residues 10 through 30. The transposition activities of the Tf1 elements containing the five alanine mutations A1, A2, A3, A4, and A5 were significantly less than that of the wild-type Tf1, indicating that the region of Gag residues 10 through 30 is critical for transposition.

The localization of Gag by indirect immunofluorescence revealed that Gag with any of the A1, A2, or A3 mutations was not imported into the nucleus. The residues are adjacent to the NLS and may contribute to its activity. We tested whether the mutations A1, A2, or A3 altered the interaction between Gag and Nup124p. The results of precipitation experiments showed that the mutations did not reduce binding with Nup124p. In addition, the interaction with Nup124p was mapped to 50 amino acids at the C-terminus of Gag.

Surprisingly, in both the log and stationary phases of cell growth, Gag bearing mutation A4 or A5 localized in the nucleus, and this localization did not depend on Nup124p. The reduced localization of Gag in the nucleus caused by mutations A1, A2, or A3 and the premature localization of Gag in the nucleus of log-phase cells caused by mutations A4 or A5 may be attributable to changes in the structure of the virus-like particles (VLPs). The assembly of Gag, RT, IN, and cDNA into VLPs can be detected as large macromolecular complexes. By subjecting cell extracts to gradient sedimentation, we found that Gag-A4 and Gag-A5 were defective for particle formation. Electron micrographs confirmed these results. Therefore, the ability of Gag to enter the nucleus correlates with the loss of particle structure, suggesting that, at least in stationary-phase cells, the import of Tf1 into the nucleus is impeded by its particle structure and that Nup124p is required to overcome the block.

Kim M, Claiborn K, Levin H. The long terminal repeat-containing retrotransposon Tf1 possesses amino acids in Gag that regulate nuclear localization and particle formation. J Virol 2005;79:9540-9555.

Teysset L, Dang V, Kim M, Levin H. An LTR-retrotransposon of Schizosaccharomyces pombe expresses a Gag-like protein that assembles into virus-like particles and mediates reverse transcription. J Virol 2003;77:5451-5463.

¹ Oak Ridge Senior Fellow Program

² Katy Claiborn, BA, former Postbaccalaureate Fellow

For further information, contact henry_levin@nih.gov.
