Cataloguing a world of species with the Barcode of Life

Charles Darwin's expedition on the HMS Beagle led to many of the world's species being catalogued. Paul Hebert has embarked on a similar modern-day journey - a journey that began in his own backyard. His Consortium for the Barcode of Life is now aiming to establish a species bar code for all living matter.

Ask Paul Hebert to describe his place in the world of biology and he insouciantly answers, "I am a panhandler for species bar-coding."

"You mean you are a good salesperson," the interviewer tries to correct him.

"Okay, that's probably more polite, but I still feel like a guy sitting there with a tin cup," Hebert continues.

Whoa. The man whom the world has taken to calling the Linnaeus of Guelph. The man whose discovery is quite literally revolutionizing species identification. The same man who is the Canada Research Chair in Molecular Biodiversity, the 2003 Premier's Research Excellence Award winner, the Fellow of the Royal Society of Canada and the director of the Biodiversity Institute of Ontario unabashedly sees himself as a person whose promotion and fundraising for his science have been, are and must be in the future shameless.

That should really come as no surprise. Hebert's chutzpah - his passionate belief in and passionate promotion of what he does - more than almost anything explains why Ontario is now the world leader in using DNA to determine what is and is not a species. Why the $10 million facility that Hebert heads is arguably the best in the world and is currently receiving a steady stream of specimens to provide species designations across the planet. And why he is in ongoing discussions with Internet giant Google to see how they might interact and support his research.

One dream is to create a handheld machine that anyone on a nature trip could, in a few minutes, use to determine the species of everything they are seeing. Along with this, the device would hook into a database where everything known about the plant or animal would be conveyed back to them.

But to understand all this, let's back up to the late 1990s. That's when Hebert started to wonder if he could use the increasingly sophisticated tools of modern genetic analysis to extract a DNA diagnostic for a species. Think of the species equivalent of the techniques that have been revolutionizing forensics by permitting scientists to identify criminals from their spit, sperm, hair or any other DNA-bearing material.

Up until then, species identification was rooted in the 250-year-old ideas of Swede Carolus Linnaeus. His research effectively stated that if something looked different and acted different from everything else you knew about, you called it a new species.

While there had been talk of genetic species identification by others before Hebert, the general consensus was that while it might be possible, doing it was going to require a long and arduous effort. "I don't think there was anyone else who supposed that you could take a single gene region and tell apart effectively all animal species. Nobody anticipated that," Hebert reflects today.

Undeterred by skepticism, Hebert decided to see what would be revealed if you looked at differences in a gene in the mitochondria - tiny structures that lie outside the cell's nucleus and carry their own small set of DNA. Hebert reasoned that nuclear DNA was too complicated - he likened it to a hectic, genetic New York City - and too slowly changing to provide an easy measure of species differentiation. Mitochondrial DNA, on the other hand, was fast-changing but small enough to focus attention on - Hebert likened it to the genetic equivalent of a bustling small town.

So he fixed his gaze on what is known as the CO1 gene and started to examine what its genetic signature looked like in different species. While his technology might have been 21st-century, his species collection technique mimicked great 19th-century naturalists like Charles Darwin. But instead of circling the globe in search of species to catalogue, Hebert's voyage of the Beagle took place in his own lit-up backyard in Guelph. There he started catching the butterflies and moths that came flittering by. He eventually caught 200 different species. And when he looked at their CO1 genes, he could see a clear pattern of DNA differentiation that exactly followed the classical species divisions on which lepidopterists had previously agreed.
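The kind of pattern Hebert saw can be pictured with a small sketch. The sequences below are invented toy strings, not real CO1 data, and the specimen labels are hypothetical; the point is only to show what it means for differences within a species to be small while differences between species are large.

```python
from itertools import combinations

# Toy, invented sequences (NOT real CO1 data); the labels are hypothetical.
specimens = {
    "moth_A_1": "ACGTACGTACGTACGTACGT",
    "moth_A_2": "ACGTACGTACGTACGTACGA",
    "moth_B_1": "ACGAACCTACGTTCGTACTT",
    "moth_B_2": "ACGAACCTACGTTCGTACTA",
}


def percent_difference(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def species_of(label: str) -> str:
    """Strip the trailing specimen number, e.g. 'moth_A_1' -> 'moth_A'."""
    return label.rsplit("_", 1)[0]


within, between = [], []
for (name1, seq1), (name2, seq2) in combinations(specimens.items(), 2):
    diff = percent_difference(seq1, seq2)
    if species_of(name1) == species_of(name2):
        within.append(diff)
    else:
        between.append(diff)

print("largest within-species difference:", max(within))
print("smallest between-species difference:", min(between))
```

In this toy data every pair from the same species differs less than any pair from different species. That gap is the sort of signal that lets a single gene region sort specimens; real analyses use much longer sequences and more careful statistics.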

Interesting, but Hebert feared this discriminatory capacity could be restricted to some butterfly and moth species. So he then looked at DNA from all of North America's identified bird species and - shazam, kaboom - not only did these, too, show DNA species differences, they also indicated that there were probably four additional species that nobody had identified by just looking at bird forms and bird behaviours. Even before he published his results, another shazam and kaboom occurred. A scientific presentation given in eastern Canada inspired the New York-based Sloan Foundation to come up with a million dollars to set up what is now called the Consortium for the Barcode of Life (CBOL). It aims to establish a species bar code for all living matter.

Why call it a species bar code, you might ask. In addition to being one of the world's pre-eminent scientific panhan- oops - salespeople, Hebert also understands the sometimes paramount virtue of metaphor in making science understandable to the general public. He was walking one day through a grocery store thinking about his efforts and suddenly was taken by what he calls a "gee-whiz fact."

"You have all these products on the shelf, and they have these codes that identify them. But these codes are incredibly short - 11 different numbers really. If short strings were able to tell apart every supermarket product, well, gee, shouldn't there be a similar combinatorial diversity in DNA that tells species apart?"

So "genetic species identification" became "DNA bar-coding" or sometimes "the Barcode of Life."
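Hebert's supermarket reasoning is easy to check with back-of-the-envelope arithmetic. The sketch below assumes an idealized code in which every position can vary freely, which real DNA does not do; the function names are made up for illustration, and the species figure is the "100 million-plus" estimate quoted in this article.

```python
UPC_DIGITS = 11                  # the "11 different numbers" in the quote
DNA_LETTERS = 4                  # A, C, G and T
SPECIES_ESTIMATE = 100_000_000   # "100 million-plus" suspected species


def capacity(alphabet_size: int, length: int) -> int:
    """Number of distinct codes of a given length over a given alphabet."""
    return alphabet_size ** length


def min_length(alphabet_size: int, items: int) -> int:
    """Shortest code length that gives at least `items` distinct codes."""
    length = 0
    while capacity(alphabet_size, length) < items:
        length += 1
    return length


print(capacity(10, UPC_DIGITS))                    # distinct 11-digit codes
print(min_length(DNA_LETTERS, SPECIES_ESTIMATE))   # DNA letters needed in the ideal case
```

In this idealized case 11 digits allow 100 billion product codes, and as few as 14 DNA letters would be enough to give each of 100 million species its own code. A real gene region is far longer than that, which leaves ample room even though neighbouring species share most of their sequence.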

To date, more than 150 organizations in 50 countries have come together in CBOL to bar-code as many of the world's suspected 100 million-plus species as possible. This bar-coding could be completed over the next few years. They will be able to do this because the CO1 gene serves as a species bar code not only in birds, butterflies and moths; it appears to do so in all - one should emphasize that - all animals. It also works for many fungi, micro-algae and protists. Plants have proven to be more difficult to identify because, for a variety of reasons, the CO1 gene doesn't distinguish their species. However, an intense search is under way in a number of laboratories to identify the gene, or more likely the combination of genes, that will bar-code plants.

Hebert's own research budget now includes pledges of $35 million, some of which has gone into his new approximately 1,400-square-metre laboratory in Guelph and its state-of-the-art robotic sequencing technology. When the laboratory, sometimes called a "bar-coding factory," reaches its technological potential, it should be able to identify half a million specimens a year. By way of comparison, the Linnaean "if it looks like a duck and quacks like a duck, it is a duck" methodology has identified only about 1.7 million species in the last 250 years. But in many ways, scientific success is also proving as difficult to manage as business success. The Guelph facility officially opened in 2007 but is already too small for the 40-odd people who now work in it.
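The comparison between the bar-coding factory and the Linnaean method can be put in rough numbers. This is only a sketch using the figures quoted above, and it glosses over an important difference: the factory counts specimens, many of which belong to the same species, while the Linnaean tally counts species.

```python
# Figures quoted in the article; the variable names are made up for illustration.
LINNAEAN_SPECIES = 1_700_000           # species identified by traditional methods
LINNAEAN_YEARS = 250                   # years over which they were identified
FACTORY_SPECIMENS_PER_YEAR = 500_000   # projected capacity of the Guelph facility

# Average yearly output of the traditional approach.
linnaean_per_year = LINNAEAN_SPECIES / LINNAEAN_YEARS

# How many times more identifications per year the facility could make.
ratio = FACTORY_SPECIMENS_PER_YEAR / linnaean_per_year

print(f"Traditional method: about {linnaean_per_year:,.0f} species a year")
print(f"Bar-coding facility: roughly {ratio:.0f} times as many identifications a year")
```

On these figures the traditional approach averages about 6,800 species a year, so the facility at full capacity would handle on the order of 70 times as many identifications annually.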

So Hebert is about to put in an application for a facility twice the size of what he now inhabits. More chutzpah, more brazenness, but that pales in comparison to IBOL, the proposed $150 million International Barcode of Life Project that Hebert has been pushing with all his salesman and panhandler abilities. The goal of the project is to bring together a number of species-rich countries - think almost everywhere in the tropics - with technology-rich countries to bar-code five million specimens in five years. This information could, among other things, be used to identify species in ecosystems that are threatened with extinction.

Did we also mention that in the less than five years since its birth, species bar-coding is already spawning clear applications outside of academic taxonomy?

A study in the U.S. used the technique to identify the species of birds that had flown into planes at airports. Since what was left was often just a smear of blood and feathers, officials before bar-coding generally had to use a "guess and golly" approach to decide which birds had to be controlled to make flying safer.

A study using DNA bar codes discovered that a significant percentage of fillets sold in New York City fish markets weren't what the sellers said they were. Cheaper cuts and sometimes endangered species were being snuck into the mix. In response the U.S. Food and Drug Administration is in the process of creating a databank of DNA bar codes for all commercial fish.

Then there is the problem of invasive species, often just larvae, coming in with fruit, flowers and other agricultural shipments. The present methods can take days to identify what some insect or animal is - if, indeed, given the generic look of many larvae, they can identify them at all. By then the imported item is often not fit for sale.

DNA bar-coding can be completed in only a few hours, which is why Flowers Canada, the Ontario Wheat Board, the Ontario Soybean Growers and the Ontario Greenhouse Vegetable Growers are also sponsoring bar-coding research to create DNA bar codes for agricultural pests.

The explosion of interest in a field that has remained essentially closed and academic for 250 years has been startling. But equally startling are Hebert's reflections on what it all means. First, timing is everything.

"I entered academic science in 1976 and had a rather happy rise in my research grants, up to $120,000 dollars or so per year. Then it topped out. I could not get more money and so for most of the 1990s I spent my time being distracted from science because you could not get money in Canada to do much research. So I did things like create a digital media group just to be able to do something interesting. It was a total distraction from my science interest. Why would you do that, Paul, you might ask. Well, either I had to leave Canada and make myself busy somewhere else or if I stayed in Canada, then I had to find ways to keep myself occupied. So that's what I did. I busied myself," he says.

"In that context, if you had said as this idea was starting to bubble up that in less than a decade, 'You'll be trying to lead a $150 million project that's going to codify this many things,' I'd have said that's the way it should be but in the world I live in, that's completely impossible," says Hebert.

Indeed the cost of his entire backyard moth and butterfly capture program, the root of the bar-coding revolution, was entirely underwritten by a certain Dr. Paul Hebert. Now funding, both provincial and federal, has become widely available.

"And now the playing field has changed! Now we can dream in Technicolor and make colour movies! We can now, I think, compete dollar for dollar with the largest nations on the planet," he enthuses about what the recent influx of technology and facilities has done for him.

Alex Smith, a research scientist at the Guelph facility, has another image to describe the change. "We scientists are the seeds, but unless there is a garden, you can cast seeds about as much as you want and they still fall on barren ground. This facility, built by funding by OIT, CFI and others, is our garden."

And another way the change in scientific opportunity has manifested itself is that whenever staff positions related to DNA bar-coding open up at Guelph, Hebert and the university find themselves besieged with applications from cradles of discovery like Harvard, the Smithsonian Tropical Research Institute and the Natural History Museum in London. It's not money per se that draws them.

"People want to come here because they see that this is a boat that's heading off on a really interesting voyage and they want to be on-board," says Hebert and one can almost hear tones of a modern Darwin rising in his voice.

And the future? If the province and the country continue to support DNA bar-coding, it has the chance to become that rarest of things - a national brand that lives through time. "My view is that hardly ever does a nation get to capture one of the crown jewels of human understanding, and we should be bold in investing to make sure that this understanding develops here in Canada," Hebert says.

"A thousand years from now if we do things right, people will be saying that we're still using that species identification system that they developed way back there all those years ago in that place called Canada."

In the nearer term, one has to begin thinking about using the intellectual capital that is flowing out of bar-coding. It may well be that in the foreseeable future the actual extracting of DNA from legs or hair or fins or leaves can be done more cheaply in places like China and Mexico. But, says Hebert, someplace is going to have to be thinking about organizing that data. "It's like what I think is happening with the big couture houses in Italy and France. They don't actually make the dresses they sell there. Their contribution is in the brain stuff, in their feel for design. I think we're starting to build a bar-coding culture here in Canada that understands how to deal with these data. The whole field is very young. Really, we're in diapers today," he says.

"But at some quite foreseeable point down the road, what will really electrify people will be thinking about and using the information we gather. It will be organizing this information. The question is will that be done by Google? Will that be done by Google Canada? Who will do it? The question is how do we as a province and a country take advantage of what we have learned, of the revolution in understanding biology we have started?"

And come to think of it, that doesn't sound like a bar-coding panhandler or salesman. That sounds like a bar-coding realist.
