Viruses: Genes Gone Rogue

What Are Viruses?

Viruses are microscopic parasites, which only really do anything when they're inside host cells. Depending on your mood, you might call them intracellular parasites, mobile genetic elements, or freeloading gits.

Anatomy of a virus. The now infamous SARS-CoV-2 has around 30,000 nucleic acid bases which make up 15 genes. The RNA is encased in a protective capsid and an envelope studded with spike proteins.

More than 140,000 virus species were recently identified in the human gut, half of which had never been seen before. These are bacteriophages: viruses that exploit our gut bacteria.

Viruses possess incredible diversity, from the smallest (adeno-associated viruses are just 25 nanometres across) to the largest (mimiviruses are 700 nanometres across, visible under a light microscope).

So-called giant viruses (aka giruses) are actually bigger than some bacteria, and prey on single-celled organisms like amoebas and algae.

In fact, different viruses infect all three domains of life: archaea, bacteria, and eukarya. We are eukaryotes, as are all plants, fungi, and animals.

Viruses are classified by their general structure. At their core, all viruses are made of RNA or DNA surrounded by a protective capsid. Some also have an outer envelope.

Virus types by shape: helical, polyhedral, spherical, complex

Viruses are the epitome of selfish genes. They're either helical (rod-shaped), polyhedral (multi-sided), spherical (multi-sided with an envelope), or complex (robot aliens).

When they infect us, viruses damage our cells and trigger an immune response, both of which create the symptoms of infectious disease.

Virus | Disease | Genome
Rhinovirus | Common Cold | RNA
Adenovirus | Common Cold, Pneumonia | DNA
Coronavirus | Common Cold, SARS, COVID | RNA
Rubella Virus | German Measles | RNA
Variola Virus | Smallpox | DNA
Norovirus | Gastroenteritis | RNA

But let's examine viruses in lesser known terms. Where do viruses come from? How do they hack our DNA? And how do they mutate so fast during a pandemic?

Are Viruses Alive?

Traditionally, biologists said no: viruses are not alive because they lack the equipment to metabolise, grow, and self-replicate.

But we also know that viruses possess genes, allowing them to adapt and evolve. And they share the same genetic code as living cells, suggesting they branch from the universal tree of life.

So how did viruses become so different as to be relegated to the world of the undead? Consider the three classical hypotheses of viral origins: the Virus First, Reduction, and Escape hypotheses.

Virus Origins: The Virus First, Regressive, and Escape Hypotheses

Classical hypotheses of viral origins. (1) The Virus First Hypothesis says viruses preceded all cellular life. (2) The Reduction Hypothesis says some primitive cells spun off into the first viruses, while others became modern cells. (3) The Escape Hypothesis says viruses are genetic packets that adapted to survive outside modern cells.

Recent comparisons of viral and cellular proteins reveal intricate overlaps in their proteomes (protein sets). This favours the reduction hypothesis, which says that billions of years ago, simple parasitic cells went through reductive evolution.

These parasitic cells dropped all the standard cell equipment to evolve fantastically streamlined genomes (gene sets). And the first viruses were born.

Most of today's viruses have just 4-200 genes. This compares to 180-12,000 genes in bacteria. Humans have around 20,000 genes. And water fleas have a remarkable 31,000 genes.

So the question—are viruses alive—is somewhat open. We might think of viruses in the wild as dormant, coming to life only when they enter cells. While some simply redirect our biological machinery, others set up compartmentalised virus factories where they metabolise and reproduce with autonomy.

It's life, Jim, but not as we know it.

The Great Cell Hijack

What do viruses do inside our cells? And are all infected cells doomed?

Cells are the multipurpose biological factories that make up our tissues. They're up to 1,000 times bigger than viruses, and have complex internal structures bustling with organelles and enzymes.

Our complete DNA set lives in the nucleus of virtually every cell in the body, serving as a kind of recipe book. It informs the production of all cell components, as well as the essential molecules secreted beyond.

Animal Cell Diagram Cartoon Style

The basic features of an animal cell.

DNA is expressed on a continuous basis. It's broadly a two-step process, beginning with snippets of DNA being transcribed (copied) into messenger RNA within the nucleus.

The mRNA then makes its way to the cell cytoplasm to be translated into chains of amino acids. These chains fold and combine to form proteins: the essential molecules of life.

Congratulations, you've survived another second.

The Central Dogma of DNA expression explains how DNA works using transcription, RNA processing, and translation

The Central Dogma of DNA expression. Double-stranded DNA is transcribed into single-stranded RNA for translation into proteins. See How Does DNA Work?
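
If you like to see things in code, here's a minimal sketch of that two-step process. It's a toy, not a bioinformatics tool: the gene is a made-up sequence, the codon table lists only a handful of the 64 real codons, and RNA processing is skipped entirely.

```python
# Toy model of gene expression: DNA -> mRNA -> protein.
# Only a few codons from the standard genetic code are listed here.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": None,   # stop codons carry no amino acid
    "UAG": None,
    "UGA": None,
}


def transcribe(coding_dna):
    """Copy the coding strand of DNA into mRNA by swapping T for U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read mRNA three bases at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon: the chain is finished
            break
        protein.append(amino_acid)
    return protein


gene = "ATGTTTGGCGCTAAATAA"  # hypothetical sequence, not a real gene
mrna = transcribe(gene)
print(mrna)             # AUGUUUGGCGCUAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Ala', 'Lys']
```

The real chain of amino acids would then fold into a working protein, which is a far harder problem than this lookup suggests.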

Proteins are large, complex molecules that keep us alive, whether they're retained for work within the cell or secreted around the body. Hormones, antibodies, and enzymes are all types of proteins.

Being devious little wretches, viruses sneak right into this pathway. Many species drop off their genes for translation in the cytoplasm.

Crucially, human RNA and viral RNA are made of the same nucleic acid bases (adenine, cytosine, guanine, and uracil). They also adhere to the same genetic code, which translates bases to amino acids. And this is how the covert infiltration goes undetected.

The viral pathway in cells: how coronavirus infects cells

The pathway of a coronavirus in cells. (1) A virion binds with a cell receptor to gain entry. (2) The entire unit is engulfed inside a lipid bubble called an endosome, which it escapes to (3) release its RNA into the cytoplasm. (4) The RNA is translated into chains of amino acids which fold up to form proteins. (5) The viral proteins self-assemble into new virions and are packaged into a secretory vesicle for (6) exocytosis out of the cell.

Cells that permit infection are doomed. Viral genes are translated at the expense of host genes, potentially damaging or destroying the cell in the process. Then the nanoscale army continues on its path of destruction.

On infecting a cell, viruses leave behind calling cards: molecular structures called antigens that the immune system uses to identify the invader.

After a few days, killer T-cells detect these markers and destroy the contaminated cells. Meanwhile, B-cells release antibodies into the blood and mucous membranes, which bind to and neutralise roaming virions. But this adaptive immune response takes time.

Until then, the viral army scales to extraordinary volumes. For instance, at peak COVID infection, just one millilitre of saliva contains 200 million SARS-CoV-2 virions. So-called supercarriers can host up to 6 trillion virions within every ml of saliva.

Even so, viruses are so tiny that the entire mass of SARS-CoV-2 in the global population amounts to just 1-10kg.

How Viruses Integrate with Our DNA

Most viruses deposit their genes in the cell cytoplasm, and don't appear to interact with our master DNA at all. But there are exceptions.

Retroviruses are hell-bent on eternal life, integrating their DNA alongside our own within the nucleus. Clinically, the most significant retrovirus is HIV, the virus that causes AIDS.

As part of its infection cycle, HIV uses an on-board enzyme called reverse transcriptase to convert its RNA into DNA. It then navigates to the nucleus and injects its DNA through a nuclear pore complex.

Here, the viral DNA integrates permanently with the cell DNA to become a provirus. Nestled among active genes, the provirus genes are expressed to reproduce HIV for life.

How retroviruses integrate their RNA into human DNA in cells

The retrovirus infection cycle. (1) The retrovirus binds to a cell receptor to gain entry. (2) Reverse transcriptase converts the viral genome from RNA to DNA. (3) The capsid injects the viral DNA along with an enzyme called integrase into the nucleus. (4) Integrase catalyses the insertion of viral DNA at target sites to create a permanent store. (5) To replicate, the provirus is then transcribed back into mRNA which is (6) exported out of the nucleus. (7) The mRNA is translated into viral proteins which self-assemble and (8) exit the cell.
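
Steps 2, 4, and 5 of the cycle above can be sketched as simple string operations. This is a heavy simplification: both sequences are invented, only one DNA strand is tracked, and the insertion site is picked by hand rather than chosen by integrase.

```python
# Toy model of a retrovirus writing itself into host DNA.
def reverse_transcribe(viral_rna):
    """RNA -> DNA, written as a single sense strand (U becomes T)."""
    return viral_rna.replace("U", "T")


def integrate(host_dna, viral_dna, site):
    """Splice the viral DNA into the host DNA at the given position."""
    return host_dna[:site] + viral_dna + host_dna[site:]


def transcribe(dna):
    """DNA -> RNA (T becomes U), as the host cell would do."""
    return dna.replace("T", "U")


viral_rna = "GGUCUCUCUGG"  # hypothetical viral genome
host_dna = "ATGCCGTAAGCT"  # hypothetical stretch of host DNA
site = 6                   # hand-picked insertion point (assumption)

viral_dna = reverse_transcribe(viral_rna)
infected_dna = integrate(host_dna, viral_dna, site)
print(infected_dna)  # ATGCCGGGTCTCTCTGGTAAGCT

# The provirus now sits inside the host sequence. Transcribing that
# stretch gives back the original viral RNA.
provirus = infected_dna[site:site + len(viral_dna)]
print(transcribe(provirus) == viral_rna)  # True
```

The round trip at the end is the point: once the provirus is in place, the cell's own transcription machinery reproduces the viral genome.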

HIV targets immune cells, ultimately leading to acquired immunodeficiency syndrome (AIDS). But retroviruses can attack other cells too. When a retrovirus infects germ cells (sperm and eggs), it can hitch a ride within the genome of future generations.

While HIV is thought to have jumped the species barrier from chimpanzees to humans in the 1920s, other retroviruses have been infecting our animal ancestors for around half a billion years.

This is why retroviral genes make up 8% of the human genome today, with a further 40% in question. Ancient infections have stuffed our genetic cache full of foreign sequences.

But if natural selection prunes away non-beneficial genes, why are proviral sequences still with us today?

Jumping Genes

When viral genes remain intact long enough, they can actually be co-opted for new purposes in the host.

For instance, we know that certain retrovirus genes were activated in mammals 130 million years ago. They gave us novel proteins which, in aiding fusion between cells, facilitated the evolution of the placenta.

Over time, however, mistakes during our DNA replication see viral genes mutate within our genome. When the sequences no longer code for proteins, we're left with strings of non-coding DNA cluttering up our genetic bank.

Scrambled or unscrambled, these viral elements have a propensity to replicate over and over within our genome, which helps explain their abundance today. Of the 3 billion bases in human DNA, up to 1.4 billion are suspected of having viral origin.
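
That 1.4 billion figure lines up with the percentages quoted earlier (8% retroviral, plus a further 40% in question). A quick check, using only the article's own round numbers:

```python
# Back-of-envelope check using the round figures quoted in this article.
GENOME_BASES = 3_000_000_000
CONFIRMED_PERCENT = 8    # retroviral sequences
SUSPECTED_PERCENT = 40   # further sequences "in question"

confirmed_bases = GENOME_BASES * CONFIRMED_PERCENT // 100
upper_bound = GENOME_BASES * (CONFIRMED_PERCENT + SUSPECTED_PERCENT) // 100

print(f"{confirmed_bases:,}")  # 240,000,000
print(f"{upper_bound:,}")      # 1,440,000,000
```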

Once allotted to the realm of "junk" DNA, these relatively short snippets of As, Cs, Gs, and Ts may actually provide a genetic sandbox from which novel genes can emerge.

What's more, viral sequences can wander around within our DNA, earning them the moniker of "jumping genes". We also have jumping genes of non-viral origin.

Jumping genes can copy and paste themselves throughout our DNA over and over. The precise landing site of these transposable elements determines whether they change our genetic code in a positive, neutral, or negative way.

Jumping genes or transposable elements can interrupt our DNA to cause disease

Jumping genes can produce novel coding sequences, alter the regulation of nearby genes, or directly interrupt the sequence of healthy coding genes.
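
Here's a rough sketch of why the landing site matters. The genome layout below is entirely hypothetical, and the three effect labels are a simplification of the outcomes described above.

```python
import random

# Hypothetical genome layout: (start, end, kind), with end exclusive.
REGIONS = [
    (0, 40, "non-coding"),
    (40, 50, "regulatory"),
    (50, 80, "coding"),
    (80, 120, "non-coding"),
]

EFFECT = {
    "non-coding": "likely neutral",
    "regulatory": "may alter regulation of the nearby gene",
    "coding": "interrupts the coding sequence",
}


def classify_landing(position):
    """Return the kind of region a transposable element lands in."""
    for start, end, kind in REGIONS:
        if start <= position < end:
            return kind
    raise ValueError("position is outside the genome")


def copy_and_paste(genome, element_start, element_len, target):
    """Paste a copy of an element at target; the original stays put."""
    element = genome[element_start:element_start + element_len]
    return genome[:target] + element + genome[target:]


# Drop five copies at random positions and report the likely effect.
rng = random.Random(0)
for _ in range(5):
    position = rng.randrange(120)
    kind = classify_landing(position)
    print(position, kind, "->", EFFECT[kind])
```

Because most of this made-up genome is non-coding, most random landings are harmless, which mirrors the real picture in broad strokes.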

This is how the restless mobilisation of viral elements can start, stop, and change the expression of our genes.

In evolutionary terms, this can be hugely beneficial. But as individuals, we're nature's guinea pigs.

Viruses have also littered our genome with extra promoter sequences, which serve as "on" switches at the start of coding genes. We've co-opted many viral promoters for good in our evolution. But when they interact with oncogenes in an individual, a single promoter can cause cancer.

Researchers are finding a growing number of mechanisms by which these self-appointed gene managers can trigger diseases like ALS, MS, haemophilia, and schizophrenia.

How often do jumping genes move around in our DNA? That's a big question. Here's one answer.

Consider the most abundant mobile element, Alu, which comprises around 10% of our DNA. At 300 base pairs long, Alu has copied and pasted itself a million times since it took up residence in our genome 65 million years ago.
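
The numbers hang together if you multiply them out. The sketch below uses the article's round figures. The last line is only a crude long-run average of copies that survived in our lineage, not a measured jump rate.

```python
# Rough arithmetic on the Alu figures quoted above.
ALU_LENGTH_BP = 300
ALU_COPIES = 1_000_000
GENOME_BASES = 3_000_000_000
YEARS_IN_GENOME = 65_000_000

alu_bases = ALU_LENGTH_BP * ALU_COPIES
share_of_genome = alu_bases / GENOME_BASES
years_per_surviving_copy = YEARS_IN_GENOME / ALU_COPIES

print(alu_bases)                   # 300000000
print(f"{share_of_genome:.0%}")    # 10%
print(years_per_surviving_copy)    # 65.0
```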

These wandering elements are so active that Alu germline insertions are estimated to affect 1 in 20 births.

While Alu sequences are known to trigger blood and neurological disorders, most of the time they land in non-coding regions of DNA. What's more, in the grand scheme of evolution they've discovered beneficial roles by serving as helpful gene regulators.

How Viral Variants Evolve

Let's emerge from this rabbit hole, squinting and confused in the light of day, and return to good old non-integrating viruses. The kind that very much prefer to threaten our existence in real time. The kind like SARS-CoV-2.

How has the novel coronavirus mutated during the pandemic? Why is the Delta variant more transmissible? Is the Omicron variant more pathogenic? It's time to examine viral mutation and the infamous spike protein.

The novel coronavirus is covered in around 24 glycoprotein spikes which serve as biological entry keys to our cells.

When we zoom in on an individual spike protein, we see a cluster of amino acids at one end called the receptor-binding domain. If the spike protein is the key, then the RBD is the unique combination of notches that complements cell locks (ACE2 receptors).

When the spike makes contact with a cell receptor, the RBD oscillates, like jiggling the key in the lock. On successful binding, the cell draws the virion into the cytoplasm via endocytosis.

At the same time, the spike protein shapeshifts. From a prefusion state, it folds and breaks apart to take on the molecular configuration known as the postfusion state.

The prefusion and postfusion states of the SARS-CoV-2 spike protein, illustrating the RBD and NTD, and S1 and S2 subunits

The shapeshifting spike protein. (1) The prefusion state sees the receptor-binding domain (RBD) oscillate on contact with ACE2 cell receptors. Likewise, the N-terminal domain (NTD) can interact with AXL cell receptors, giving SARS-CoV-2 at least two front door keys. (2) Receptor binding triggers enzymes to cleave the spike protein into two postfusion subunits: S1 subunits float free, while the S2 subunit remains anchored to the cell membrane.

Earlier we talked about antigens: the molecular markers which our immune system uses to identify invaders. During a SARS-CoV-2 infection, around 90% of our antibodies target the RBD on the spike protein, making it the primary antigen.

This prompted researchers to use spike protein genes in COVID vaccines, with Pfizer-BioNTech, Moderna, and Johnson & Johnson modifying the genes to produce the prefusion spike only.

Whether we're infected or vaccinated, our antibodies are geared significantly toward the RBD on the spike protein, creating an evolutionary pressure on the virus to evolve different receptor-binding domains.

Every COVID infection provides SARS-CoV-2 with the opportunity to mutate. In fact, mutations are so common that we can trace the genetic signature of the virus and link cases to relatively small clusters of infections.
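
The idea behind tracing a genetic signature can be sketched by counting the differences between viral sequences: the fewer the differences, the more closely two cases are likely to be linked. The four sequences below are invented and absurdly short. Real genomic surveillance aligns full genomes of around 30,000 bases and builds family trees from them.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical viral sequences sampled from four cases.
samples = {
    "case_A": "ACGTACGTAC",
    "case_B": "ACGTACGTAT",
    "case_C": "ACGAACGTAT",
    "case_D": "TCGTTCGGAC",
}


def closest_relative(name, samples):
    """Find the other sample with the fewest differences from this one."""
    others = (n for n in samples if n != name)
    return min(others, key=lambda n: hamming(samples[name], samples[n]))


print(hamming(samples["case_A"], samples["case_B"]))  # 1
print(closest_relative("case_C", samples))            # case_B
print(closest_relative("case_D", samples))            # case_A
```

In this toy data, cases A, B, and C sit within a mutation or two of each other and plausibly form one cluster, while case D is the odd one out.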

While many mutations have no effect, key changes to the structure of the spike protein have created antigenic drift. In other words, the antigen has changed. And our antibodies against the original wildtype variant are now less effective at neutralising the Delta and Omicron variants.

The Delta variant has eight mutations on the spike protein, including two mutations on the RBD. They give it superior receptor-binding strength, while evolving away from our wildtype antibodies.

Fortunately, Delta's immune escape is incomplete. Our outdated antibodies still take the Delta variant down, but they can be slower to do so.

All else being equal, outdated antibodies are still better than no antibodies at all against Delta. We can see this by comparing COVID hospitalisation rates between vaccinated and unvaccinated adults.

Will Omicron and future variants become deadlier? Smith's law of declining virulence says no. Natural selection favours viral mutants that have longer incubation periods and higher transmission rates.

The same isn't true for increased fatality; a dead host is no good for a virus, individually or as an evolving species. Milder variants simply outcompete deadlier ones by spreading to more people.

Having said that, the counter-intuitive reality is that a virus with 50% more transmissibility can ultimately kill more people than a virus with 50% more pathogenicity. More infections equals more deaths, even if the fatality rate is unchanged.
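
A worked example makes the point. Every number below is an illustrative assumption rather than a measured value: 10,000 current infections, a reproduction number of 1.1, a fatality rate of 0.8%, and five generations of spread (roughly a month).

```python
def deaths_in_final_generation(initial_cases, r, fatality_rate, generations):
    """Deaths among the people infected in the last generation of spread."""
    cases = initial_cases * r ** generations  # exponential growth
    return cases * fatality_rate              # linear in the fatality rate


# All inputs are illustrative assumptions, not real-world estimates.
baseline = deaths_in_final_generation(10_000, 1.1, 0.008, 5)
deadlier = deaths_in_final_generation(10_000, 1.1, 0.012, 5)             # +50% fatality
more_transmissible = deaths_in_final_generation(10_000, 1.65, 0.008, 5)  # +50% spread

print(round(baseline), round(deadlier), round(more_transmissible))
# 129 193 978
```

Fatality scales the death toll once, while transmissibility compounds with every generation. That's why the more transmissible variant ends up far deadlier in absolute terms.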

In November 2021, the Omicron variant was detected in South Africa after an exponential rise in cases, half of which were attributable to the new mutant. While Omicron has already spread internationally, travel bans have been implemented to slow its spread while we learn more.

But we do know that Omicron has a whopping 30 mutations on the spike protein, including 15 in the receptor-binding domain.

Omicron vs Delta vs Alpha Variant RBD Spike Mutations

The Omicron variant has 15 mutations on the receptor-binding domain (RBD), significantly more than the Alpha and Delta variants.
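
Spike mutations are usually written like N501Y: the original amino acid, its position in the protein, then the replacement. Sorting them into RBD and non-RBD is then a matter of checking the position. The sketch below assumes the RBD spans spike positions 319 to 541, a commonly cited range that isn't taken from this article. The five mutations are a small sample reported for Omicron, not the full list.

```python
import re

# Assumed RBD boundaries on the spike protein (commonly cited range).
RBD_START, RBD_END = 319, 541


def position(mutation):
    """Extract the amino acid position from a label such as 'N501Y'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {mutation}")
    return int(match.group(1))


def in_rbd(mutation):
    """True if the mutation falls inside the assumed RBD range."""
    return RBD_START <= position(mutation) <= RBD_END


examples = ["K417N", "E484A", "N501Y", "D614G", "P681H"]
for mutation in examples:
    where = "RBD" if in_rbd(mutation) else "elsewhere on the spike"
    print(mutation, "->", where)
```

This only handles simple substitutions. Deletions and insertions use different labels and would need their own parsing.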

These mutations make Omicron more infectious than Delta, outcompeting it at an extraordinary rate. Omicron has superior receptor-binding strength, translating to faster reproduction and a heavier viral load.

Breakthrough infections are being tracked to determine if Omicron has greater immune escape among previously infected and vaccinated people.

After two years, the pandemic is still raging. While scientists and politicians scramble to make the best strategic moves, viral mutation is just one small but integral part of the complex pandemic dynamics.

Pandemic dynamics include population size, vaccine efficacy, vaccine coverage, vaccine uptake, antibody duration, antigenic drift, immune escape, disease severity, transmission rates, incubation period, and social behaviour

Scratching the surface of pandemic dynamics.

Final Thoughts

Viruses are everywhere. In supermarkets. In labs. In bacteria. In our DNA. We even use them in medicine: viruses can be adapted to carry genetic material to our cells, whether applied therapeutically in gene therapy or prophylactically in genetic vaccines.

Viruses have been around for billions of years and, looking at the state of our genome, will be with us for a long time to come. Good or bad, dead or alive, if there's one thing you can say about viruses... it's that they're spectacularly successful at what we're all ultimately programmed to do: replicate our genes.

Becky Casale is the founder, keyboard smasher, and drinks lady at Science Me. If you like her content, please take a hot second to share it with your favourite people. If you don't like it, why not punish your enemies by sharing it with them?