Orphan Genes Part 3: De Novo Gene Origination- What are the Odds?

In Part 1 of this series we discussed the discovery of orphan genes, and in Part 2 we tracked the evolutionist response: initial rejection of their very existence, giving way to reluctant acceptance after repeated, undeniable confirmation. Since evolutionists operate under the assumption that evolution is true, this acceptance required them to propose naturalistic mechanisms by which these genes could emerge “de novo” (or “from scratch”) in the genome. The plausibility of these proposals is the focus of this final installment.

A Trip Down Memory Lane

It’s not as if the ways in which genetic diversity arises in the genome had never been considered. The reluctance of evolutionary science to embrace the existence of orphan genes is completely understandable given the historical conclusions drawn regarding de novo gene origination. Francois Jacob, a giant in his field (Head of the Dept. of Cell Genetics at Institut Pasteur in 1960 and winner of the 1965 Nobel Prize in Physiology or Medicine), emphatically denounced de novo gene origination in his 1977 work Evolution and Tinkering:

“Evolution does not produce novelties from scratch. It works on what already exists, either transforming a system to give it new functions or combining several systems to produce a more elaborate one.” He continued, “The probability that a functional protein would appear de novo by random association of amino acids is practically zero. In organisms as complex and integrated as those that were already living a long time ago, creation of entirely new nucleotide sequences could not be of any importance in the production of new information.” (emphasis mine)

Francois Jacob (via Wikipedia)

A Second Look at the Junk Pile

Confronted with new facts, evolutionists turned to re-examine what they had previously considered a DNA garbage heap. The majority of DNA (99%) is non-coding, meaning that it doesn’t provide instructions for making proteins. As this NIH article explains:

“Scientists once thought noncoding DNA was ‘junk,’ with no known purpose. However, it is becoming clear that at least some of it is integral to the function of cells, particularly the control of gene activity. For example, noncoding DNA contains sequences that act as regulatory elements, determining when and where genes are turned on and off. Such elements provide sites for specialized proteins (called transcription factors) to attach (bind) and either activate or repress the process by which the information from genes is turned into proteins (transcription).”

According to the same source, types of regulatory elements found in junk DNA include promoters, enhancers, silencers, and insulators. Since the revelation that this junk DNA is not actually useless is fairly recent, it’s not surprising that “the identity of regulatory elements and other functional regions of noncoding DNA is not completely understood.”

Proposed Models

How could this “junk” DNA give rise to de novo origination of genes? This McLysaght/Guerzoni study concludes:

“We may thus imagine two scenarios: one where an arbitrary ORF appears in a locus of significant transcription (‘RNA first’) and one where a cryptic, arbitrary ORF experiences some low, perhaps sporadic, transcription (‘ORF first’).”

The authors go on to state, “Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations.”

Of course, these conceptual models derive from what Dr. Kevin Anderson (writing for AiG) terms “historical reconstructions,” which by their very nature “are only as good as the assumptions of the reconstruction.” In this case the assumption is evolution via mutation. No other possibility is considered.

Emily Singer writes in her article for Quanta, “The junk DNA must accumulate mutations that allow it to be read by the cell or converted into RNA, as well as regulatory components that signify when and where the gene should be active. And like a sentence, the gene must have a beginning and an end…In addition, the RNA or protein produced by the gene must be useful.”
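Singer’s “beginning and an end” requirement can be put in rough numbers. The sketch below is only an illustration under simplifying assumptions that are mine, not Singer’s or Anderson’s: every one of the 64 possible codons is treated as equally likely, and the 3 stop codons of the standard genetic code are the only thing that can end a reading frame. The function name `prob_no_stop` is made up for this example. It estimates just one narrow ingredient, the chance that a random stretch of DNA stays “open” for a given length, and says nothing about whether the sequence is transcribed, regulated, or useful.

```python
def prob_no_stop(n_codons, stop_codons=3, total_codons=64):
    """Chance that n_codons random codons in a row contain no stop codon.

    Assumption (for illustration only): all 64 codons are equally likely
    and independent, which real genomes do not strictly satisfy.
    """
    p_sense = (total_codons - stop_codons) / total_codons  # 61/64 per codon
    return p_sense ** n_codons


if __name__ == "__main__":
    # Print the estimate for a short, a medium, and a longer reading frame.
    for n in (10, 100, 300):
        print(f"{n:>4} codons: {prob_no_stop(n):.6g}")
```

Under these assumptions the chance shrinks geometrically as the required length grows, which is the arithmetic behind the “beginning and end” hurdle alone, before any of Singer’s other requirements are counted.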

What are the Odds?

Possibility is one thing. Plausibility is entirely another. On the likelihood of such a scenario occurring Singer notes, “…creating a gene from a random DNA sequence appears as likely as dumping a jar of Scrabble tiles onto the floor and expecting the letters to spell out a coherent sentence.”

If these are the odds of even one gene emerging de novo from junk DNA, what then must be the odds of such an event taking place over and over in every living species? Furthermore, trends indicate that scientists may merely have uncovered the tip of the iceberg when it comes to the number of de novo genes. Singer writes, “As scientists…are implementing new gene discovery technologies…the number of de novo genes might explode.”

More Problems…

Statistical improbability isn’t the only issue with de novo gene origination via mutation. Dr. Anderson writes, “If it takes at least seven mutations to transform a functional gene into a different gene, then it would require far more mutations to truly evolve a de novo gene…the more mutations required, the greater the potential that some will be harmful.” Evolutionists recognize this issue as well. Joanna Masel, a University of Arizona biologist studying how evolution might avoid this pitfall, explains: “Proteins have a strong tendency to misfold and cause havoc. It’s hard to see how to get a new protein out of random sequence when you expect random sequences to cause so much trouble.”

Closely related to the issue of mutations is the amount of time it would take these mutations to result in the de novo emergence of a gene. Dr. Anderson writes, “…the time needed to transform a functional gene into a different gene bursts the evolutionary timescale. The de novo formation of new genes takes this problem to even greater magnitudes. Humans, for example, are supposed to have evolved from a primate ancestor in just 4–6 million years. Even by the most generous calculations, this is insufficient time for the de novo construction of the hundreds of human orphan genes.”

De novo gene origination is just one piece of the puzzle. Singer poses the next, equally confounding question: “how de novo genes get incorporated into the complex network of reactions that drive the cell.” And that’s not the only concern: “Evidence suggests that a portion of de novo genes quickly become essential. About 20 percent of new genes in fruit flies appear to be required for survival.” She continues, “It’s as if a bicycle spontaneously grew a new part and rapidly incorporated it into its machinery, even though the bike was working fine without it.”


Evolutionists routinely disparage creation science by labeling it pseudoscience and calling foul based on Biblical bias. However, the case of orphan genes is an excellent example of the hypocrisy of such a claim. Secular science operates under its own bias: faith in evolution. Dr. Anderson aptly describes the evolutionist view of de novo gene origination: “This conclusion is not based on observational data, but rather on evolutionary necessity. The presumption of evolution is so prevalent in biology that it trumps everything else, even if it means depending upon events with a ‘practically zero’ chance of occurring.”

While orphan genes are definitely a wrench in evolutionary theory, Dr. Anderson notes that orphan genes fit “within a biblical creation model, where humans, animals, plants were created with a fully functional genome. Since this initial creation, subsequent changes in the genome have introduced many mutations and other alterations to the DNA. Some of these have even provided a specific (and likely limited) adaptive benefit. Yet these benefits result from degenerative mutations, not the formation of new genes.”

Orphan Genes Part 2: Evolutionists’ Response

In Part 1 of this series we discussed how the relatively recent technological advances in DNA sequencing led to some very unexpected findings, in particular the discovery that “orphan genes” are prolific. Given the evolutionary assumption of shared genes among all living things with changes occurring incrementally over vast eras of time, these mystery genes directly contradict, at a foundational level, any scenario predicted by evolutionary theory. Therefore, such evidence requires a very serious response. As we’ll see, the explanations evolutionists offer have certainly been revelatory, but not from a scientific standpoint. What has been revealed is a highly unscientific, faith-based commitment to the theory of evolution.

Nelson Velasco Debate

In the 2014 design vs evolution debate between Paul Nelson (Discovery Institute) and Joel Velasco (Texas Tech), the subject of orphan genes arose. Velasco’s 5 points are perfectly representative of the initial evolutionist response. Cornelius Hunter, writing for Evolution News, recounts Velasco’s arguments:

  1. “… there isn’t much to be concerned with here because ‘Every other puzzle we’ve ever encountered in the last 150 years has made us even more certain of a fact that we already knew, that we’re all related.’”
  2. “…the whole orphan problem is contrived, as it is nothing more than a semantic misunderstanding — a confusion of terms…”
  3. “… many of the orphans are so categorized merely because the search for similar sequence is done only in ‘very distantly related’ species.”
  4. “…orphans are really nothing more than a gap in our knowledge… the more we know about a species, the more the orphan problem goes away. And which species do we know the most about? Ourselves of course…: ‘…How many orphan genes are in humans?… Zero.’”
  5. “…while new orphans are discovered with each new genome that is decoded, the trend is slowing and is suggestive that in the long run relatives for these orphans will be found.”

As you can see, Velasco doesn’t offer a scientific explanation for the existence of orphan genes. Instead, he frames his case around faith that an explanation which fits evolutionary theory will eventually arise. This reflects how reluctant evolutionists initially were to even concede that orphan genes legitimately existed in numbers large enough to warrant discussion. Velasco’s answer is a catch-all attempt to cover all the bases. Hunter sums up the inadequacy of Velasco’s view:

“So to summarize Velasco’s position, the orphan problem will be solved so don’t worry about it, but actually orphans are not a problem at all but rather a semantic misunderstanding, but on the other hand the orphan problem is a consequence of incomplete genomic data, but actually on the other hand the problem is a consequence of insufficient knowledge about the species, and in any case even though the number of known orphans keeps on rising, they will eventually go away because the orphans as a percentage of the overall genomic data (which has been exploding exponentially) are going down.”

Velasco’s 4th Point

The one point listed that most resembles an actual argument is Velasco’s 4th. Is it true that the human genome, the one we know most about, doesn’t have any orphan genes?

The short answer is no.

A 2007 study by the Lander group did indeed reject thousands of proposed orphan genes that had been identified within the human genome, but its authors noted that not every proposed orphan could be rejected. In fact, this 2015 study “identified 634 human-specific genes” that appear to have arisen de novo in the human genome. Most telling, however, is why the Lander study rejected the majority of the orphans:

“If the orphans represent valid human protein-coding genes, we would have to conclude that the vast majority of the orphans were born after the divergence from chimpanzee. Such a model would require a prodigious rate of gene birth in mammalian lineages and a ferocious rate of gene death erasing the huge number of genes born before the divergence from chimpanzee. We reject such a model as wholly implausible. We thus conclude that the vast majority of orphans are simply randomly occurring ORFs that do not represent protein-coding genes…” (emphasis mine)

On what grounds would such a model be considered “wholly implausible”? Apparently, because their existence cannot be plausibly explained within the constraints of evolutionary theory. Hunter notes the following:

“This is what philosophers refer to as theory-ladenness…There was no scientific evidence that those human sequences, identified as orphans, were ‘spurious.’ The methods used in the Lander study were full of evolutionary assumptions. The results entirely hinged on evolution. Although the paper did not explicitly state this, without the assumption of evolution no such conclusions could have been made. Although the paper authoritatively concluded that the vast majority of the orphans in the human genome were spurious, this was not an empirical observation or inference…”

On Second Thought…

Over time evolutionists have been forced to accept that orphan genes do in fact exist in numbers great enough to require a revamping of long held beliefs regarding the formation of genes. In other words, the evolutionists’ explanation of the origin of genes had to… evolve.

Since a designed genome is not an option for evolutionists, an alternative explanation for the existence of these orphan genes had to be considered. In late 2014, D. Tautz published the following conclusion in his essay, The discovery of de novo gene evolution:

“Genes can evolve via duplication and divergence mechanisms, but also de novo out of non-coding intergenic sequences. This latter mechanism has only recently become fully appreciated, while the former mechanism was an almost exclusive dogma for quite some time. This essay explores the history of this development: why a view developed, with the alternative hardly being explored. Because of the prevailing view, an important aspect of the nature of genes and their evolutionary origin escaped our attention. Evidence is now rapidly accumulating that de novo evolution is a very active mechanism for generating novelty in the genome, and this will require a new look at how genes arise and become functional.” (emphasis mine)

With evolution assumed, Tautz concludes that new genes must be able to arise “from scratch” (“de novo”) from non-coding sequences. He also makes three admissions: (1) evolutionists have only recently been forced, by the discovery of these orphan genes, to abandon the exclusive evolutionary dogma that dominated genetic understanding; (2) this dogma blinded them to the nature and origin of genes; (3) they will have to figure out how these genes could exist.

As the 20th-century evolutionist Theodosius Dobzhansky famously said, “Nothing in biology makes sense except in the light of evolution.” This is the bias mainstream science operates under. The theory of evolution is never questioned; it is an assumed foundational truth. Since orphan genes are now acknowledged to exist, evolutionists assume that there must be a naturalistic mechanism to explain new genes appearing from scratch in the genome.

What Next?

Evolutionists are left with the task of explaining a naturalistic mechanism by which these “de novo” genes come to exist. In Part 3, we’ll take a look at the plausibility of these proposed mechanisms.

Orphan Genes Part 1: Unexpected Results in DNA Sequencing

In his book, Why Evolution is True, biologist Jerry A. Coyne describes modern evolutionary theory with the following statement:

“Life on earth evolved gradually beginning with one primitive species—perhaps a self-replicating molecule—that lived more than 3.5 billion years ago; it then branched out over time, throwing off many new and diverse species.”

Based on this foundational principle, various species are expected to share similar (or homologous) structures due to common ancestry. Berkeley’s Evolution 101 page notes, “Evolutionary theory predicts that related organisms will share similarities that are derived from common ancestors. Similar characteristics due to relatedness are known as homologies.”

Wikipedia’s homology entry lists the following example: “…the forelimbs of vertebrates, where the wings of bats, the arms of primates, the front flippers of whales and the forelegs of dogs and horses are all derived from the same ancestral tetrapod structure.”

Image via wikipedia: “The principle of homology: The biological relationships (shown by colours) of the bones in the forelimbs of vertebrates were used by Charles Darwin as an argument in favor of evolution.”

The Berkeley source provides the following example in the case of plants:

However, the relatively recent advent of DNA sequencing has produced some very unexpected results, surprisingly contrary to this prediction of evolutionary theory.

DNA Sequencing- We’ve Come a Long Way Baby

DNA is the blueprint, or instruction manual, containing the instructions which make every species unique. DNA is defined as, “…a thread-like chain of nucleotides carrying the genetic instructions used in the growth, development, functioning and reproduction of all known living organisms and many viruses.”

In the 1970s scientists developed the ability to “read” this DNA instruction manual in a process called DNA sequencing. This was a monumental scientific breakthrough. Britannica defines DNA sequencing and reveals its significance with the following entry:

“…technique used to determine the nucleotide sequence of DNA… The nucleotide sequence is the most fundamental level of knowledge of a gene or genome. It is the blueprint that contains the instructions for building an organism, and no understanding of genetic function or evolution could be complete without obtaining this information.” (emphasis mine)

The last sentence is crucial. While Darwin and his predecessors could hypothesize that common ancestry is the most reasonable explanation for similarities shared among various species based on visual comparison, common ancestry cannot be proven and the very concept of evolution cannot be understood without the ability to “read” an organism’s instruction manual (DNA).

For several decades DNA sequencing was a very slow and expensive process. However, the Human Genome Project, initiated in 1990 and completed in 2003, had a revolutionary effect. The goal of this international project, to map the entire human genome, spurred tremendous technological advances in gene sequencing which have continued far beyond the project’s completion. James Heather concludes his History of Sequencing DNA by stating, “Over the years, innovations in sequencing protocols, molecular biology and automation increased the technological capabilities of sequencing while decreasing the cost, allowing the reading of DNA hundreds of basepairs in length, massively parallelized to produce gigabases of data in one run.”

It is this burgeoning wealth of genetics information that has revealed the “mystery” of orphan genes.

What are Orphan Genes and Why are They Problematic for Evolutionary Theory?

Cornelius Hunter, writing for Evolution News, provides the following definition, “The term orphan refers to a DNA open reading frame, or ORF, without any known similar sequence in other species or lineages. Hence ORFan, or ‘orphan.’” The author of this article in Uncommon Descent explains orphan genes this way, “Orphan genes are presumed protein coding genes that exist in only one species and have such non-similarity to anything in any other species they are called orphans…”
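To make the term “open reading frame” concrete, here is a minimal sketch of how one might scan a DNA string for ORFs. It is an illustration only, not the method used by Hunter or by any study cited in this series. The function name `find_orfs`, the `min_codons` cutoff, and the toy sequence are all made up for the example, and it assumes the standard start codon (ATG), the three standard stop codons, and the forward strand only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each forward-strand ORF.

    An ORF here is an ATG, followed by codons read three letters at a time,
    up to and including the first in-frame stop codon. min_codons is a
    hypothetical cutoff on the number of codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # the three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # the "beginning" of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # the "end"; look for the next ATG
    return orfs


if __name__ == "__main__":
    toy = "CCATGAAACCCGGGTAGCC"  # made-up sequence, not real genomic data
    print(find_orfs(toy))
```

Finding an ORF is only the first step. An orphan, in Hunter’s sense, is an ORF like this one for which a search of other species and lineages turns up no similar sequence, and that comparison step is not shown here.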

Why is this troubling? If the theory of evolution and (by default) common ancestry are true, a coding gene that is species-specific, with no recognizable counterpart in other species, should be an extreme rarity. Ann Gauger writes in Orphan Genes: A Guide for the Perplexed, “The working assumption had been that, given common descent and the fact that most housekeeping genes are shared among living things, and the assumption hitherto that evolution occurs by incremental small changes, orphan genes…should be rare if not non-existent.”

So, just how common are they? This 2009 study published in Trends in Genetics found, “Comparative genome analyses indicate that every taxonomic group so far studied contains 10-20% of genes that lack recognizable homologs in other species.” According to Richard Buggs (writing for Ecology and Nature), researchers originally believed that the mystery of these orphan genes would be resolved over time as more genomes were sequenced, finding precursors for the sequences that are now categorized as orphans. However, the opposite has proven true.

For example, Dr. Jeffrey Tompkins discusses ants: “When comparing the ant genes to other insects, researchers discovered 28,581 genes that were unique only to ants and not found in other insects. While the various ant species shared many groups of genes, only 64 genes were common to all seven ant species…The researchers concluded that on average, each ant species contained 1,715 unique genes—orphan genes.”

In Buggs’ 2017 ash tree genome paper published in Nature, he and his colleagues report that of the over 38,000 protein-coding genes found, “…one quarter (9,604) were unique to ash. On the basis of our research so far, I cannot suggest shared evolutionary ancestry for these genes with those in ten other plants we compared ash to: coffee, grape, loblolly pine, monkey flower, poplar, tomato, Amborella, Arabidopsis, barrel medic, and bladderwort. This is despite the fact that monkey flower and bladderwort are in the same taxonomic order (Lamiales) as ash.”

Not only are orphan genes common, they also appear to be functional. Dr. Tompkins writes, “These orphan genes are also being found to be particularly important for specific biological adaptations that correspond with ecological niches in relation to the creature’s interaction with its environment. The problem for the evolutionary model of animal origins is the fact that these DNA sequences appear suddenly and fully functional without any trace of evolutionary ancestry (DNA sequence precursors in other seemingly related organisms).”


Orphan genes are certainly a fly in the ointment for evolutionary theory, but no surprise to either creation science or intelligent design. As Gauger points out, “Then there is the elephant in the room that evolutionary biologists don’t want to acknowledge. Perhaps we see so many species- and clade-specific orphan genes because they are uniquely designed for species- and clade-specific functions. Certainly, this runs contrary to the expectation of common descent.”

In Part 2 of this series, we’ll take a look at how evolutionary biology responds to orphan genes.