11.3 Solving the Problem

The measure of connectedness I will use to color the pedigree is, for each animal, the proportion of animals in the pedigree that are descended from it. Doing this takes three steps:
  1. Compute the proportion of animals in the pedigree that are descended from each animal in the pedigree; the values will be stored in a dictionary keyed by animal IDs.
  2. Map the proportion of descendants from decimal values on the interval [0,1] to RGB triples.
  3. Use the RGB triples to set the fill color for nodes.
There is no existing function for the first item, but the pyp_network module has a function, find_descendants(), that identifies all of the descendants of an animal. Dividing the length of the list of descendants by the number of animals in the pedigree gives the proportion of animals in the pedigree descended from that animal. The color_pedigree() function creates a dictionary and loops over the pedigree to compute the proportions. It also calls draw_colored_pedigree(), which is a modified version of pyp_graphics.draw_pedigree(), to draw the pedigree with colored nodes.
##
# color_pedigree() forms a graph object from a pedigree object and
# determines the proportion of animals in a pedigree that are
# descendants of each animal in the pedigree.  The results are used
# to feed draw_colored_pedigree().
# @param pedobj A PyPedal pedigree object.
# @return A 1 for success and a 0 for failure.
# @defreturn integer
def color_pedigree(pedobj):
    _pedgraph = pyp_network.ped_to_graph(pedobj)
    _dprop = {}
    # Walk the pedigree and compute proportion of animals in the
    # pedigree that are descended from each animal.
    for _p in pedobj.pedigree:
        _dcount = pyp_network.find_descendants(_pedgraph,_p.animalID,[])
        if len(_dcount) < 1:
            _dprop[_p.animalID] = 0.0
        else:
            _dprop[_p.animalID] = float(len(_dcount)) / \
                float(pedobj.metadata.num_records)
    del(_pedgraph)
    _gfilename = '%s_colored' % \
        (pyp_utils.string_to_table_name(pedobj.metadata.name))
    draw_colored_pedigree(pedobj, _dprop, gfilename=_gfilename,
        gtitle='Colored Pedigree', gorient='p', gname=1, gdirec='',
        gfontsize=12, garrow=0, gtitloc='b')
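The proportion calculation can also be illustrated outside of PyPedal. The following is only a minimal sketch: it assumes the pedigree is held in a plain dictionary that maps each animal ID to a list of its direct offspring, and the names all_descendants() and descendant_proportions() are hypothetical helpers, not part of pyp_network.

```python
def all_descendants(offspring, animal):
    """Return the set of every descendant of `animal`.

    offspring: dict mapping each animal ID to a list of its direct
    offspring (a hypothetical stand-in for the PyPedal graph object).
    """
    found = set()
    to_visit = list(offspring.get(animal, []))
    while to_visit:
        current = to_visit.pop()
        if current not in found:
            found.add(current)
            to_visit.extend(offspring.get(current, []))
    return found


def descendant_proportions(offspring):
    """Map each animal ID to (number of descendants) / (number of animals)."""
    n_animals = float(len(offspring))
    return {animal: len(all_descendants(offspring, animal)) / n_animals
            for animal in offspring}


# Toy pedigree: 1 and 2 are the parents of 3; 3 is a parent of 4 and 5.
toy = {1: [3], 2: [3], 3: [4, 5], 4: [], 5: []}
proportions = descendant_proportions(toy)
```

In this toy pedigree the founders 1 and 2 each have three descendants out of five animals, animal 3 has two, and animals 4 and 5 have none, so the values that would feed the coloring step range from 0.0 up to 0.6.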
pyp_graphics.draw_pedigree() was copied into pyp_jbc, renamed to draw_colored_pedigree(), and modified to draw colored nodes. Two basic changes were made to accomplish that: the function was altered to accept a dictionary of weights to be used for coloring, and code for actually coloring the nodes was written. The first change was simply the addition of a new required parameter, shading, to the function header. The second change required a little more work. For each animal in the pedigree, the descendant proportion is looked up in the shading dictionary and passed to get_color_32(), which converts it into an RGB triple. The style of the node representing that animal is then set to filled, and its color attribute is set to the triple. The hardest part of creating this routine was determining where changes should be made when modifying pyp_graphics.draw_pedigree().
##
# draw_colored_pedigree() uses the pydot bindings to the graphviz library
# to produce a directed graph of your pedigree with paths of inheritance
# as edges and animals as nodes.  If there is more than one generation in
# the pedigree as determined by the 'gen' attributes of the animals in the
# pedigree, draw_colored_pedigree() will use subgraphs to try to group
# animals in the same generation together in the drawing.  Nodes will be
# colored using the values in the shading dictionary.
# @param pedobj A PyPedal pedigree object.
# @param shading A dictionary mapping animal IDs to levels that will be
#                used to color nodes.
# ...
# @return A 1 for success and a 0 for failure.
# @defreturn integer
def draw_colored_pedigree(pedobj, shading, gfilename='pedigree', \
    gtitle='My_Pedigree', gformat='jpg', gsize='f', gdot='1', gorient='l', \
    gdirec='', gname=0, gfontsize=10, garrow=1, gtitloc='b', gtitjust='c'):

    from pyp_utils import string_to_table_name
    _gtitle = string_to_table_name(gtitle)
    ...
    # If we do not have any generations, we have to draw a less-nice graph.
    if len(gens) <= 1:
        for _m in pedobj.pedigree:
            ...
            _an_node = pydot.Node(_node_name)
            ...
            _color = get_color_32(shading[_m.animalID],0.0,1.0)
            _an_node.set_style('filled')
            _an_node.set_color(_color)
            ...
    # Otherwise we can draw a nice graph.
    ...
        ...
            for _m in pedobj.pedigree:
                ...
                _an_node = pydot.Node(_node_name)
                ...
                _color = get_color_32(shading[_m.animalID])
                _an_node.set_style('filled')
                _an_node.set_color(_color)
                ...
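The effect of the node-coloring step can be seen without pydot by writing the Graphviz DOT source directly. This is only a sketch under stated assumptions: pedigree_to_dot() is a hypothetical helper, the pedigree is assumed to be a dictionary mapping each animal ID to a (sire, dam) pair, and the colors are assumed to have been computed already. It sets fillcolor explicitly; the listing above sets only color, which Graphviz also uses as the fill when fillcolor is not defined.

```python
def pedigree_to_dot(parents, colors, title='pedigree'):
    """Build DOT source for a pedigree with filled, colored nodes.

    parents: dict mapping animal ID -> (sire ID, dam ID); None = unknown.
    colors:  dict mapping animal ID -> '#rrggbb' color string.
    Both structures are illustrative, not PyPedal objects.
    """
    lines = ['digraph "%s" {' % title]
    # One filled node per animal.
    for animal in sorted(parents):
        lines.append('    "%s" [style=filled, fillcolor="%s"];'
                     % (animal, colors[animal]))
    # One edge from each known parent to its offspring.
    for animal in sorted(parents):
        for parent in parents[animal]:
            if parent is not None:
                lines.append('    "%s" -> "%s";' % (parent, animal))
    lines.append('}')
    return '\n'.join(lines)


toy_parents = {'A': (None, None), 'B': (None, None), 'C': ('A', 'B')}
toy_colors = {'A': '#ff0000', 'B': '#ff0000', 'C': '#00ffff'}
dot_source = pedigree_to_dot(toy_parents, toy_colors)
```

The resulting string can be saved to a file and rendered with the Graphviz dot program.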
The get_color_32() function is a modified version of pyp_graphics.rmuller_get_color() that returns RGB triplets of the form "#1a2b3c", which are required by the program that renders the graphs. This is another example of how code reuse can reduce development time.
##
# get_color_32() converts a float value to one of a continuous range of colors
# using recipe 9.10 from the Python Cookbook.
# @param a Float value to convert to a color.
# @param cmin Minimum value in array (0.0 by default).
# @param cmax Maximum value in array (1.0 by default).
# @return An RGB triplet as a string of the form '#rrggbb'.
# @defreturn string
def get_color_32(a,cmin=0.0,cmax=1.0):
    try:
        a = float(a-cmin)/(cmax-cmin)
    except ZeroDivisionError:
        a=0.5 # cmax == cmin
    blue = min((max((4*(0.75-a),0.)),1.))
    red = min((max((4*(a-0.25),0.)),1.))
    green = min((max((4*math.fabs(a-0.5)-1.,0)),1.))
    # '%02x' zero-pads each channel to two hexadecimal digits.
    _r = '%02x' % int(255*red)
    _g = '%02x' % int(255*green)
    _b = '%02x' % int(255*blue)
    _triple = '#%s%s%s' % (_r,_g,_b)
    return _triple
This change will probably be rolled into rmuller_get_color() so that the form of the returned triplet is user-selectable.
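To get a feel for which colors the mapping produces, it helps to evaluate it at a few proportions. The sketch below restates the same arithmetic as get_color_32() in a standalone function; the name value_to_hex() is hypothetical and is used only so the example runs without PyPedal.

```python
import math


def value_to_hex(a, cmin=0.0, cmax=1.0):
    """Map a value in [cmin, cmax] to a '#rrggbb' string."""
    try:
        a = float(a - cmin) / (cmax - cmin)
    except ZeroDivisionError:
        a = 0.5  # cmax == cmin
    # Each channel is a piecewise-linear function of a, clamped to [0, 1].
    blue = min(max(4 * (0.75 - a), 0.0), 1.0)
    red = min(max(4 * (a - 0.25), 0.0), 1.0)
    green = min(max(4 * math.fabs(a - 0.5) - 1.0, 0.0), 1.0)
    return '#%02x%02x%02x' % (int(255 * red), int(255 * green),
                              int(255 * blue))


for proportion in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(proportion, value_to_hex(proportion))
# 0.0 #00ffff
# 0.25 #0000ff
# 0.5 #ff00ff
# 0.75 #ff0000
# 1.0 #ffff00
```

With this mapping, animals with no descendants are drawn in cyan, and the color moves through blue, magenta, and red toward yellow as the proportion of descendants approaches 1.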

The program new_jbc.py demonstrates use of the new pyp_jbc.color_pedigree() routine:

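# Assumes PyPedal is installed as a package that provides these modules.
from PyPedal import pyp_newclasses, pyp_jbc

options = {}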
options['renumber'] = 1
options['sepchar'] = '\t'
options['missing_parent'] = 'animal0'

if __name__=='__main__':
    options['pedfile'] = 'new_ids2.ped'
    options['pedformat'] = 'ASD'
    options['pedname'] = 'Boichard Pedigree'
    example = pyp_newclasses.loadPedigree(options)
    pyp_jbc.color_pedigree(example)
The resulting colorized pedigree can be seen in Figure 11.1. Each of the nodes is colored according to the proportion of animals in the complete pedigree descended from a given animal. Clearly there is still room for improvement; for example, there is no key provided in the image so that you can see how colors map to proportions. Implementation of a key is left as an exercise for the reader.
Figure 11.1: Colorized version of the pedigree in Figure 9.2