Biological structure and function result in general from the interaction of simple biological entities of same or different nature. Relations are therefore essential in biology. As a consequence, graphs occupy an important place in the algorithms that exist or are required to address a wide range of biological problems.
Trees are a special class of graphs that have already been extensively used as modelling tools for evolutionary events (edges in this case correspond in general to time) or planar representations of 3D molecular structures (mainly of RNAs).
Even in the case of evolution, trees are however often not the most appropriate model as possible exchanges of genetic material between species located at different branches of the tree of life confer a graph-like structure to evolution. Graphs are also required to model full 3D structures (RNAs but also proteins or DNA), the network of gene expression inside an organism, metabolic pathways, host-parasite relations etc.
As in the case of strings, many classical problems in graphs, such as, for instance, graph isomorphism, maximal common subgraph, network flow, have applications to computational biology. However, allowed variations introduce an approximability aspect into such classical problems that have to be accounted for by using new metrics. These will differ depending on:
what the graphs are modelling: evolution, 3D structures, genetic or metabolic networks, etc.
how they are modelling it,
what is being measured.
Inference problems are also of primordial importance with the added difficulty in relation to string inference problems that characterising "unexpected" behaviour is even harder to appropriately and accurately assess in the case of graphs.
It is important to infer unexpected characterics of the graphs modelling existing relations, but also to explore potential evolutionary, structural or systems pathways that have not been used by nature. This introduces a new problem that in general, in terms of modelling and algorithms, differs from simple graph traversals or enumeration by the complexity of the objective function one seeks to optimize.