Thinking about DNA and genomes

How 'big' is the human genome?
Explaining the size of the human genome to my 3-year-old daughter

December 26, 2017

One thing I've learned as a parent is that, if you have a bit of patience and a good sense of humor, it can be very entertaining (and challenging!) to explain difficult concepts to young children. In fact, it can be downright hilarious. So I relish opportunities to explain some facts of life, a.k.a. biology, to my three-year-old daughter whenever I can.

My child has a fondness for stamping. I guess I did, too, when I was a kid. I always enjoyed going to my dad's workplace and using those old date-adjustable stampers, or the ones with useful business terms on them, e.g. "For Deposit Only" or "Fragile".

For Christmas, my daughter got a new alphabet stamp set; it was love at first sight. Prior to receiving it, she was a bit obsessed with getting her hand stamped at school and after tumbling class. As parents, we just needed to tie her enthusiasm for stamps to something educational. Her grandmother had a great idea, which was to get her the alphabet stamp set that you can see at right. Seeing her with this set is seeing one of her dreams come true. So Christmas was a success (thanks to Grandma). As you can see, it has all the letters of the alphabet--uppercase and lowercase--as well as a few bonus characters.

This morning, she asked me (very nicely) if I would like to do some stamping with her. "Of course!" I said. We were busily stamping away, and I showed her how we could string letters into words, like her name for instance. 
How many 'letters' in the human genome?
Answer: There are approximately 3,234,830,000 nucleotides in the human genome.

For this example, I rounded to 3 billion bases.
How many letters are we stamping in a minute?
We are stamping 60 letters a minute, which gives 3,600 letters an hour.
How many letters in a 15-hour "stamp" work day?
If you stamped for 15 hours a day, you would stamp 54,000 letters in a day.
How long to stamp out the human genome?
If you stamped for 15 hours a day for 60,000 days straight, you would stamp out the entire human genome.
In years, that works out to roughly 165 years of stamping 15 hours a day to get through the ~3 billion letters in the human genome.
Then my inner biologist got the better of me (how could I resist?), and I strung together some As, Ts, Cs, and Gs--more or less at random.

(In case you're wondering, yes, that is dot-matrix printer paper.)
These four letters are shorthand for the four molecules (nucleotides) that make up the central information storage property of DNA: Adenine, Cytosine, Guanine, and Thymine. The letters are strung together one after another. In the case of our human genome, there are 22 large stringy cassettes (plus two "X" cassettes or an "X" and a "Y" cassette); each cell in your body has a pair of each of these stringy behemoths.

Let's say you get bored one day and you really want to know what the particular order (sequence) of some number of nucleotides is at some location in your genome. So you swab your mouth to get some saliva, from which you will get some DNA. This is ever more popular now in the 23andMe era. If you had your own in-house lab, you could process the DNA to get the answer to your question. You would simply use a common method that lets you color each nucleotide base of the particular region you are interested in, and the results would look something like this: 
That's cool and all, but the question I posed to my daughter was:

How long would it take you to stamp out your own genome given you knew the entire order of bases (and you had endless supplies of time, energy, and ink)?

We discussed this as fathers and 3-year-old daughters do, and we did some back-of-the-envelope calculations to obtain a first approximation. The answer really gets at the heart of just how enormous our genome is.

The first thing we need is rate: How fast can we stamp each letter? OK, so let's say we are expert stampers and have some assistance. Perhaps we have someone reading the bases out to us and we are just stamping away. I estimated, generously, one letter per second.

The second thing we need is genome "size": How many As, Ts, Cs, and Gs are strung along one strand of DNA in the human genome (assuming all chromosomes were tied together end-to-end)? The answer is huge: about 3 billion bases. Well, that's not that big though, right? I mean, US government debt is north of 19,000 billion (19 trillion) dollars.

With rate and size estimated, we can get to our calculation.

(1) If you can stamp 60 letters in a minute, then you can stamp 3,600 letters in an hour (60 x 60). 

(2) If you worked for 10 hours a day, then you could stamp 36,000 letters in a day (10 x 3,600). Not bad. But let's say you stamp 15 hours a day; then you could stamp 54,000 letters in a day (36,000 x 1.5)! Yes, that is good progress, right?

OK so you are stamping 1 letter per second for 15 hours a day, getting through 54,000 letters per day at this pace.

(3) If you worked for 10 days at this pace, you would stamp 540,000 letters over this time (54,000 x 10). Likewise, if you stamped for 60 days straight, you would get through a bit more than 3 million letters (54,000 x 10 x 6 = 3,240,000). Now we are getting somewhere!

(4) Since there are 3 billion bases, to get to that mark, you would need to stamp for 60 days 1,000 times, or 60,000 days!

(5) As there are approximately 365.25 days in a year, there are 3,652.5 days in 10 years (365.25 x 10). And approximately 36,525 days in 100 years (3,652.5 x 10). So to get near 60,000 days, we would need to stamp for approximately 150 years (36,525 x 1.5 gives ~55,000 days). You could add a few more 10 year intervals to get to 60,000.

Thus (and approximately, of course!), to stamp out the entire length of one strand of the human genome, if the large segments called chromosomes were tied end-to-end and if we stamped continuously for 15 hours a day every day, we calculated we would need to stamp for approximately 165 years.
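For anyone who would rather let a computer do the envelope math, here is a minimal Python sketch of the same calculation. The function name `stamping_time` and its parameters are my own inventions for illustration; the only inputs are the assumptions already stated above (one letter per second, 15 hours a day, 365.25 days a year) and the two genome sizes mentioned in this post.

```python
SECONDS_PER_HOUR = 3_600
DAYS_PER_YEAR = 365.25


def stamping_time(genome_size, letters_per_second=1.0, hours_per_day=15.0):
    """Return (days, years) needed to stamp `genome_size` letters.

    Assumes a constant stamping rate and no days off (hypothetical helper,
    not part of any library).
    """
    letters_per_day = letters_per_second * SECONDS_PER_HOUR * hours_per_day
    days = genome_size / letters_per_day
    years = days / DAYS_PER_YEAR
    return days, years


# The two genome sizes used in this post: the rounded figure and the fuller count.
for label, size in [
    ("rounded to 3 billion", 3_000_000_000),
    ("approximately 3,234,830,000", 3_234_830_000),
]:
    days, years = stamping_time(size)
    print(f"{label}: {days:,.0f} days, about {years:.0f} years")
```

With the rounded 3 billion bases this comes out to roughly 152 years, and with the fuller count of about 3.23 billion it comes out to roughly 164 years, which is where the "approximately 165 years" figure lands. Changing `letters_per_second` or `hours_per_day` shows how sensitive the answer is to how fast (and how long) you are willing to stamp.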

"Wow, Dad, that is forever," said my young daughter, scientist-in-training.

Now we can better appreciate what a Herculean effort the Human Genome Project was. And that doesn't even take into account the technological hurdles of that effort. 
Today we take for granted how routine it is to determine every letter along the entire length of an organism's genome. We now do it regularly for humans and other organisms. It is truly a stunning feat of human achievement.

So my daughter and I have a new appreciation for just how enormous the human genome is. And we did all of that while taking part in one of her new favorite activities: stamping.
Note 1: There have been many excellent analogies circulated through the years to gain an appreciation of how enormous the human genome is. There is likely a better version out there; in fact, I know there is. This one just arose in our conversation today. Feel free to share your own below.

Note 2: This was subconsciously inspired by Rob Phillips at Caltech.