But while the publication of the human genome holds great scientific and medical promise, serious challenges have to be faced, from research systems and costs to intellectual property regimes.
The human genome - that is, the sequence of some 3 billion pieces of information that constitute the physical recipe for making a human being - was published in February 2001 following years of effort by public and private research institutions. Innumerable commentators have pronounced this event to be a milestone in the history of science and medicine, and so it is. Humans now join that select company of two dozen or so species whose DNA blueprints are known. This is a decidedly motley group, including the worm Caenorhabditis elegans, the fruit fly Drosophila melanogaster, the cholera bacterium Vibrio cholerae, and the common weed Arabidopsis thaliana.
The human sequence can be useful in and of itself (it is a basis, for example, for studying genetic susceptibility to diseases) but a key message to policymakers is that sequencing is merely the beginning of a much bigger quest to determine the functions of the proteins whose composition is encoded by DNA. It is functional information about the immensely complex and finely balanced interactions of large protein molecules (and other components of living cells) that will produce the ultimate rewards of the genomics revolution: innovative medical tests, drugs, and therapies.
This new quest, sometimes known as "post-genomics", will require very large investments in infrastructure and training, and it will pose tricky ethical, legal, and political problems. Government officials, legislators, judges, academics and members of the public will all be called upon to wrestle with some difficult problems in the years ahead. A preview of these challenges and opportunities can be had by examining one of several new sub-fields of modern biology: "structural genomics". This was recently the subject of an intergovernmental consultation by the OECD Global Science Forum.
Knowledge of the three-dimensional physical structures of proteins is vitally important, since structure is closely linked to biological function. For example, knowledge of the atomic configuration of a key protein component of HIV (the enzyme reverse transcriptase) has allowed scientists to design a small molecule that latches onto the protein and interferes with its function, thus slowing the progress of the disease.
The availability of complete genomes has inspired some scientists to propose that the corresponding structural information should be obtained for hundreds of thousands of proteins. Hence the term "structural genomics". Interestingly, structural genomics initiatives have arisen almost simultaneously in the academic and industrial communities. Since these undertakings would be very costly and time-consuming, the obvious questions arise: what are the most appropriate roles for the public and private sectors, and how should public money be spent to optimise desirable outcomes - advancing science, promoting economic growth, and delivering the fruits of research to the public?
Post-genomics may also be a major source of future industrial competitiveness and wealth creation. Not surprisingly, public authorities are concerned about getting left behind. US researchers, benefiting from generous government programmes, have built up a strong lead in the field, with their Japanese colleagues not far behind, while Europeans are searching for ways to advance via the right mix of national and EU-based programmes. Three aspects of structural genomics deserve the special attention of science policymakers: research infrastructure (facilities and equipment), the scope of the research, and intellectual property rights (IPR).
Take research infrastructure first. Structural analysis of proteins consists of a series of complex and difficult steps. It typically takes from several weeks to a few months to determine the structure of a single protein of medium size, starting from the simple knowledge of the corresponding gene sequence. The work must be done by a doctoral-level scientist, using sophisticated instrumentation and high-performance computers. To get an idea of the scale of the challenge posed by structural genomics, consider that the genomes of even simple organisms encode information for thousands of distinct proteins (for humans, this number probably exceeds 100,000). Two principal experimental methods are in use: X-ray crystallography and nuclear magnetic resonance (NMR).
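The scale described above can be made concrete with a rough calculation. The sketch below is only an illustration: it assumes one scientist works on one structure at a time, reads "several weeks" as 3 weeks and "a few months" as 12 weeks, and assumes 48 working weeks per year. None of these figures comes from the article, and the function name is hypothetical.

```python
# Back-of-envelope estimate; every figure below is an illustrative assumption.

WORKING_WEEKS_PER_YEAR = 48  # assumed, not taken from the article


def scientist_years(protein_count: int, weeks_per_structure: float) -> float:
    """Total effort if each structure occupies one scientist full-time."""
    return protein_count * weeks_per_structure / WORKING_WEEKS_PER_YEAR


if __name__ == "__main__":
    proteins = 100_000  # the article's rough figure for human proteins
    # Assumed readings of "several weeks" (3) and "a few months" (12).
    for weeks in (3, 12):
        years = scientist_years(proteins, weeks)
        print(f"{weeks} weeks per structure: about {years:,.0f} scientist-years")
```

Under these assumptions the total comes to between 6,250 and 25,000 scientist-years for 100,000 proteins, which helps explain why the choice of proteins and the sharing of facilities matter so much.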
X-rays for structural measurements are generated in electron storage rings which can cost several hundred million dollars. They are built and operated by governments, and each machine can service dozens of experiments simultaneously. University-based scientists do not have to pay for X-rays if their work has been approved and funded by the appropriate government agency. Private companies can perform proprietary research at an X-ray facility, provided they pay a fair share of the operating cost. They may also finance the construction of special purpose equipment (a "beam line") for their own exclusive use.
But to what extent should governments use public funds to build large research infrastructures for use by private companies? During the next few years, policymakers and laboratory officials may be faced with difficult questions like this, about allocating X-ray sources between paying and non-paying users. In the long run, science and technology may provide a solution. "Free-electron lasers", currently under development in several countries, could generate X-ray beams with intensities several orders of magnitude higher than those from storage rings. That would provide more than enough X-rays to go around, although the facilities themselves would remain extremely expensive.
The second big challenge concerns the scope of structural genomics. Because structural analysis is so expensive and time-consuming, great care must be taken in choosing the right set of proteins for study. Industrial researchers are more likely to focus on molecules that are linked to diseases (for example, viral enzymes) since these may be promising "drug targets". Academic researchers may be more inclined to study proteins that provide insight into broader questions, like cellular metabolism or evolutionary theory. There is no clear boundary between these lines of inquiry. Some co-ordinating mechanism, possibly involving all interested governments, may need to be established to promote exchange of information about which proteins are being analysed, and to avoid unnecessary duplication of effort.
The results of industrial R&D are not always published, and this can lead to peculiar effects. The genome of rice was recently sequenced by a private company, but not made fully available to the scientific community. Therefore, a publicly financed rice-sequencing project is proceeding (at great expense to the international taxpayer) with the goal of providing the public with the same information that already exists in a private data bank. Similar situations can be expected in the case of structural genomics. Private companies may, however, be moving in the direction of more openness and transparency. Discussions are currently under way towards the establishment of industrial consortia that will jointly undertake high-throughput structural work on very large numbers of proteins, and will publish most of the results. While this approach may seem to weaken the competitive advantage of each member of the consortium, it also protects all the partners from having a competitor stumble onto a major discovery without the others knowing. This is a safe business strategy, considering that structure-determination is just one of the long and costly steps involved in developing a drug and bringing it to market. Such arrangements could well involve partnerships with publicly funded institutions.
Intellectual property rights (IPR) form the third key challenge. Questions surrounding the patentability of the results of genomic research are controversial and complex, and the courts will be busy for some time before a consistent set of rules emerges. The degree to which protein structures are themselves patentable is not fully resolved. Neither is the connection to any underlying patent on the genetic sequence of the same protein. To some extent, IPR represents an obstacle to the advance of structural genomics, since many researchers may be understandably reluctant to put potentially lucrative information into the public domain. It is not clear whether the standards agreed to by scientists and institutions that participated in the Human Genome Project (which put great emphasis on the rapid release of raw data) would be easily transferable to a publicly funded structural genomics project. The differences between patent regulations in Europe, the United States and Japan, for example with regard to the "grace period" that applies between releasing results and applying for patent protection, complicate matters still further.
Interestingly, advances in science itself may render obsolete some of the difficult questions that surround the patentability of genes and protein structures. The granting of a patent is contingent on the "novelty" and "non-obviousness" of the discovery. Many scientists hope one day to be able to derive protein structures computationally from genomic sequences alone, thus saving enormous amounts of time, money and effort. Should this happen, structure determination would become routine and inexpensive, and a structure obtained in this way might be hard to present as non-obvious. This would radically alter the legal environment within which modern biology is developing. So while the publishing of the human genome was a giant step, it was just the beginning.
©OECD Observer No 226/227, Summer 2001