PyPedal 2 CHANGELOG
===================

"---" indicates a known bug;
"+++" indicates a feature addition;
"***" indicates an API change or a major bugfix;
"'''" indicates a minor bugfix or feature enhancement.
"???" indicates a possible problem (i.e. bug) that has not been verified.
"XXX" indicates a feature that has been deprecated or removed.
"!!!" indicates a feature that is planned or stubbed, but is not yet working.
"###" indicates a note or idea I don't want to forget.

CHANGES in PyPedal 2.0.3 (Vetinari)
===================================
''' 05/15/2012 Added a new pedigree option, "has_header", which automatically treats the first row of a pedigree file as a comment, so the file does not have to be modified to add a comment character. Its default value is 0.
''' 05/15/2012 Modified pyp_newclasses/NewPedigree::preprocess() and pyp_newclasses/NewPedigree::__init__() to handle the new has_header option.
''' 05/15/2012 Minor documentation updates, including an example on reading CSV files, and easier-to-find warnings about the need to enclose commas in double quotation marks when setting sepchar in a configuration file.

CHANGES in PyPedal 2.0.2 (Vetinari)
===================================
''' 05/04/2012 Modified pyp_network/ped_to_graph() to fix the brokenness in pyp_newclasses/fromgraph(). The function now properly uses DiGraph node attributes for storing and retrieving information. I'm not sure how the code ever worked in the first place.
??? 03/22/2011 There may be a bug in the effective_founders_lacy() function in pyp_metrics.py that results in a KeyError. This could be due to a cast that is not being performed. I'm trying to get a pedigree in which the problem manifests itself. Thanks to Blair Kelly for the bug report.
''' 03/22/2011 Fixed a small bug in pyp_network.ped_to_graph() that was a result of the XDiGraph to DiGraph change. Thanks to Blair Kelly for the bug report and patch.
*** 03/22/2011 The NetworkX API changed a little, and the XDiGraph class no longer exists. All instances of XDiGraph in pyp_network.py and pyp_newclasses.py have been changed to DiGraph. This means that you MUST install NetworkX 0.99 or later! Thanks to Blair Kelly for the bug report and patch.
''' 03/22/2011 Changed the AUTHORS.txt file to remove e-mail addresses to better protect the privacy of contributors.

CHANGES in PyPedal 2.0.1 (Vetinari)
===================================
''' 12/30/2010 Wrapped a stray debugging message in pyp_nrm/inbreeding() so that it will only print when the debugging_messages option is set.
*** 12/30/2010 An incorrect index was being used in pyp_nrm/inbreeding() to match up COI with animal IDs when COI are read from an existing NRM. The procedure was looping from 0 to (number of animals - 1), but 1 was being subtracted from the index when reaching into the pedigree to assign the COI. This means that the COI for animal 0 (first in the pedigree) was being assigned to animal -1 (the last animal in the pedigree), the COI for animal 1 was being assigned to animal 0, etc. This bug affected any inbreeding calculations based on an attached NRM, regardless of the pedformat.
--- 12/30/2010 Found a bug in pyp_nrm/inbreeding() that affects ASD pedigrees to which an NRM is attached at load time. When the COI are copied from the NRM to the results dictionary the wrong COI is being associated with an animal ID.
''' 12/30/2010 Added a new flag, force, to pyp_nrm/inbreeding() to override the use of an attached NRM for finding COI. This is needed for debugging a possible problem with correctly mapping IDs in ASD pedigrees when NRM are attached.

CHANGES in PyPedal 2.0.0 (Vetinari)
===================================
''' 09/29/2010 The 'Input and Output' chapter in the manual probably won't get any stranger than it is now.
''' 09/29/2010 Updated the 'High Level Overview' chapter in the manual to include several new pedigree options that were not previously documented, as well as default values that have changed.
''' 09/28/2010 Updated the 'Input and Output' chapter in the manual to cover NewPedigree::save().
''' 09/28/2010 Had to recreate documentation updates from the latest PDF version of the manual because incautious rsync-ing apparently caused the most recent version of the LaTeX files to be overwritten with an earlier version. Crumbs.
''' 09/28/2010 NewPedigree::save() has been tested and is working as intended.
''' 09/28/2010 The default values of the keywords missing_name, missing_herd, and missing_breed in NewPedigree::__init__.kw() have been changed to use underscores rather than spaces. This prevents people from unintentionally munging up pedigree files written with NewPedigree::save() that use a space (' ') for the sepchar. If you don't like this you can override the defaults when a pedigree is instantiated.
''' 09/28/2010 A little more clean-up in the API documentation generated by Doxygen. Now only the PyPedal modules are included. Previously, ADOdb and a few other stray bits that don't have any Doxygen markup were being included.
*** 09/22/2010 Major clean-up in the API documentation generated by Doxygen.
''' 09/20/2010 As part of adding the new save(), the pedformat_codes list was moved from NewPedigree::preprocess() to NewPedigree::__init__(). This avoids the need to keep multiple copies of the list.
!!! 09/20/2010 Added a new version of the save() method to NewPedigree for writing custom pedigree files.
XXX 09/20/2010 NewPedigree::save() has been renamed NewPedigree::oldsave().
*** 09/17/2010 Fixed the disappearing logfile problem, which was caused by emitting log messages in modules, e.g., pyp_nrm, that were imported by pyp_newclasses.
The messages were reporting when psyco could not be imported, and I don't know why that caused a problem with later logging, but it did. The problem only manifests when logging messages are sent outside of functions, in this case in the header of the file during the module importation process. Logging messages sent from functions that are called after a pedigree is instantiated work fine. The problem is fixed, but not really solved, since I still do not properly understand it.
''' 09/17/2010 Changed a number of warning messages to info messages. If the psyco module cannot be found it is not a problem -- PyPedal will still work properly -- and calling that a warning is somewhat misleading.
''' 09/17/2010 Fixed a typo in pyp_nrm.py that prevented a module import message from being logged properly.
+++ 09/17/2010 Added a new keyword, foundercoi, to NewPedigree::__init__() which tells pyp_nrm.fast_a_matrix() whether or not to use coefficients of inbreeding from the pedigree to augment the diagonals for founders. If the user does not provide coefficients of inbreeding in the pedigree file then the diagonals will remain 1 because fa defaults to 0.
''' 09/01/2010 Turned off some debugging messages that accidentally were left on.

CHANGES in PyPedal 2.0.0rc9 (Vetinari)
======================================
+++ 09/01/2010 Updated documentation to include information about the new GENES I/O.
+++ 09/01/2010 Added the NewPedigree::savegenes() method to support easy export to GENES studbook files.
*** 09/01/2010 The function pyp_io.save_to_genes() is now working. It has been only lightly tested because I don't have good test cases, so be careful when using it.
''' 08/31/2010 Docstring work and improved comments in most modules.
''' 08/31/2010 Restored try-except blocks in many functions that had been disabled for debugging.
This means that people need to actually look at their log files when they have problems because, for the most part, there won't be traces from unhandled exceptions dumped to STDOUT.
*** 08/31/2010 The function pyp_io.read_from_genes() actually now works, so the 'genesfile' pedsource also works. You can test it using the SAMPLE.DBF file that Robert Lacy distributes with GENES 1.20.
*** 08/31/2010 The function pyp_io.save_from_genes() now works.

CHANGES in PyPedal 2.0.0rc8 (Vetinari)
======================================
+++ 01/05/2009 Added a new value of pedsource, 'genesfile', which loads a pedigree from the DBF file format used by GENES 1.20 (http://www.vortex9.org/genes.html), which is an accessory program to the SPARKS software for studbook management. GENES also can be used as a stand-alone program for pedigree analysis.
+++ 01/05/2009 Added two functions to pyp_io: read_from_genes() parses GENES 1.2 .DBF files and write_to_genes() writes NewPedigree objects to a GENES 1.2 .DBF file.
''' 10/22/2008 pyp_nrm/fast_a_matrix() and pyp_nrm/fast_a_matrix_r() now check for PySparse, and default to using dense NumPy arrays if sparse matrices are not available.
''' 10/22/2008 Apparently you can't write to the upper triangle of a sparse matrix: "spmatrix.error: write operation to upper triangle of symmetric matrix". Clearly, this needs to be fixed.
--- 05/30/2008 Apparently you can't write to the upper triangle of a sparse matrix: "spmatrix.error: write operation to upper triangle of symmetric matrix". Clearly, this needs to be fixed.
''' 05/30/2008 Custom SQL queries are allowed in NewPedigree::load() to accommodate pre-existing databases that are not in ASDx format using the new 'database_sql' option proposed by Matthew Kelly.
''' 05/30/2008 Added a new option, 'matrix_type', that's used by the inbreeding routines in pyp_nrm.py. It takes the values 'dense' or 'sparse', which is passed to fast_a_matrix() to specify that dense or sparse matrices be used.
This may help when very large pedigrees are processed. Modified inbreeding(), inbreeding_tabular(), inbreeding_vanraden(), and partial_inbreeding() to use 'matrix_type'.
!!! 05/??/2008 Matthew Kelly code to handle NULLs in database result sets.
''' 05/15/2008 Added an entry for new_sqlite.py to the example programs table in the Appendix of the manual.
''' 05/15/2008 Documented the decomposition routines in pyp_nrm, as well as examples/new_decompose.py.
*** 05/15/2008 Changed the default renumbering option from 0 (do not renumber) to 1 (renumber) in pyp_newclasses::__init__().
''' 05/15/2008 Added material to the manual discussing when and why pedigrees should be renumbered.
+++ 05/13/2008 Added a new example program, new_sqlite.py, to demonstrate how to load pedigrees from databases.
''' 05/13/2008 Made a small change to NewPedigree::preprocess() to use the code provided by Matthew Kelly for loading from a database. Apparently the dbstream.pop() didn't work for him on OS/X 10.4. Tested it against new_db.py and it seems to work.
''' 05/13/2008 Added code to pyp_metrics/relationship() to warn the user when the pedigree they've provided is not renumbered. Also added a keyword, 'renumber', that will let the routine renumber the pedigree if it has not already been renumbered; its default is 'False' to preserve original behavior. Thanks to Matthew Kelly for noting that the original behavior was unexpected.
''' 05/09/2008 Modified pyp_graphics/plotxy() so that it sorts the values to be plotted by the keys in the input dictionary. The line plots now work as expected.
''' 05/09/2008 Fixed a couple of typos in logging and error messages in pyp_db.
''' 05/09/2008 examples/new_graphics.py wasn't really broken, but pyp_reports/mean_metric_by() was. I fixed it to work with the new API in pyp_db. The graph it produces is correct in stating that the mean COI by birth year is 0.0 -- there are no inbred animals in the pedigree file.
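
The key-sorting fix described in the 05/09/2008 plotxy() entry can be illustrated with a minimal sketch. This is not the PyPedal code; the function name sorted_xy and the sample data are hypothetical, and the sketch only shows why a line plot needs its points ordered by dictionary key.

```python
# Hypothetical sketch, not the pyp_graphics/plotxy() implementation.
def sorted_xy(values):
    """Return x and y lists ordered by the keys of the input dictionary."""
    xs = sorted(values)              # dictionary keys in ascending order
    ys = [values[x] for x in xs]     # values in matching order
    return xs, ys


# Mean COI by birth year, deliberately inserted out of order.
mean_coi = {2003: 0.02, 2001: 0.0, 2002: 0.01}
xs, ys = sorted_xy(mean_coi)
print(xs, ys)
```

Without the sort, a line plot connects the points in whatever order the dictionary yields them, which can produce a zig-zag instead of a trend line.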

CHANGES in PyPedal 2.0.0rc7 (Vetinari)
======================================
''' 05/07/2008 Added a check to pyp_graphics/draw_pedigree() to catch empty gtitles, which cause dot to crash. If the gtitle is empty then it is set equal to the 'pedname' attribute of the NewPedigree object.
''' 05/07/2008 Fixed pyp_network/get_node_betweenness() to use the correct betweenness function from NetworkX.
*** 05/07/2008 Made lots of changes to pyp_graphics/draw_pedigree() to get it working with pydot 1.0.2. I believe that it now works again in that it will draw pedigrees, but they're not looking great. This area will receive attention for PyPedal 2.0.1.
''' 05/07/2008 Updated the example program new_renumbering.py to use the correct syntax to load a pedigree.
''' 05/07/2008 Added messages to the example program new_options.py to indicate which pedigree load steps are supposed to fail.
''' 05/07/2008 Removed some unnecessary casts from pyp_jbc/new_draw_colored_pedigree().
''' 05/07/2008 inbreeding(), inbreeding_tabular(), and inbreeding_vanraden() in pyp_nrm now all correctly initialize reldict to include all fields. The r_nonzero_sum field had been left out of the initialization.
''' 05/07/2008 Added userField handling to the printme() and stringme() methods of pyp_newclasses/NewAnimal.
''' 05/07/2008 Corrected pedigree format string in new_ids.ini.
''' 05/07/2008 Checked all example programs by hand to make sure that they're consistent with the current API and feature-set to avoid the kinds of problems that Matthieu (and probably others) have had with the examples being very out-of-date.
''' 05/07/2008 The nrm attribute of pyp_newclasses/NewAMatrix is now set to False on instantiation so that the instance can easily be queried for the existence of an NRM.
+++ 05/07/2008 Added a new example program, new_decompose.py, to demonstrate the use of the routines in pyp_nrm for decomposing A such that A = TDT', as well as the code for directly forming A-inverse with or without inbreeding.
*** 05/06/2008 Fixed several functions in pyp_nrm (a_decompose, form_d_nof, a_inverse_dnf, and a_inverse_df) so that they're using the correct Numpy dtypes.
*** 05/06/2008 Fixed several functions in pyp_nrm (a_decompose, form_d_nof, a_inverse_dnf, and a_inverse_df) that were still trying to use the 'num_recs' attribute of the pedigree metadata, which was renamed to 'num_records' quite a while back, but which was not noted in This File.
''' 05/06/2008 Fixed a spelling error in the psyco import blocks in pyp_graphics, pyp_metrics, pyp_networks, and pyp_nrm.
''' 05/06/2008 Updated examples/new_amatrix.py to use Numpy instead of Numarray, as well as to remove calls to deprecated methods. Thanks to Matthieu Authier for reporting this and other important bugs.
''' 05/06/2008 Removed an extra space from the metadata output that's been bugging me for a Long Time Now.
''' 05/06/2008 Added import block for psyco in pyp_nrm.

CHANGES in PyPedal 2.0.0rc6 (Vetinari)
======================================
''' 05/01/2008 When I added the psyco hooks in pyp_graphics, pyp_metrics, and pyp_networks I also added logging hooks. The calls were at the start of the file, in the psyco import statement, and those modules are loaded in pyp_newclasses before the logging object is created and configured, so all of the logging messages ended up going to the wrong place (the console). I fixed it by changing the logging messages about psyco import to simple print statements.
''' 05/01/2008 Added the parameter 'quiet' to pyp_metrics/effective_founder_genomes().
+++ 05/01/2008 Added pyp_utils/founder_allele_dict(), which returns a dictionary whose keys are the unique founder alleles in the pedigree and whose values are 0.0.
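
A minimal sketch of the kind of dictionary the founder_allele_dict() entry describes, reading that entry as "keys are the unique founder alleles, values start at 0.0". The function name, the allele labels, and the list-of-strings input are all hypothetical; the real routine's signature and allele representation are not documented in this file.

```python
# Hypothetical sketch; the input format and names are assumptions,
# not the pyp_utils/founder_allele_dict() implementation.
def allele_dict_sketch(founder_alleles):
    """Map each unique founder allele to an initial value of 0.0."""
    # dict.fromkeys() keeps one key per distinct allele, so duplicates
    # in the input collapse automatically.
    return dict.fromkeys(founder_alleles, 0.0)


alleles = ['1s', '1d', '2s', '2d', '1s']   # '1s' appears twice on purpose
counts = allele_dict_sketch(alleles)
print(counts)
```

A dictionary shaped like this is convenient as an accumulator, for example when tallying how often each founder allele survives in a gene drop.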
''' 05/01/2008 pyp_utils/renumber() now takes a parameter 'cleanmap' that indicates whether or not the ID map files created during renumbering should be cleaned up (deleted) or left alone once renumbering is complete. The inbreeding routine already uses pyp_utils/delete_id_map() for this, and it was just an oversight here.
''' 04/28/2008 Added the parameters chrometype and heterogametic to pyp_metrics/effective_founder_genomes() to allow gene dropping of sex chromosomes.

CHANGES in PyPedal 2.0.0rc5 (Vetinari)
======================================
''' 04/11/2008 Added the parameter debugLoad to the loadPedigree() function in pyp_newclasses.py. It defaults to False; when True it loads the pedigree without using a try-except block so that errors are not suppressed.
''' 04/11/2008 Fixed pyp_network/mean_geodesic() to properly catch an exception raised by NetworkX when no path exists between two nodes in a graph.

CHANGES in PyPedal 2.0.0rc4 (Vetinari)
======================================
''' 04/09/2008 More fidgeting with setup.py to get setup.py and the whole installation process working more easily. In order to do that I had to remove all of the required packages from setup.py, which means that the user will have to install Numpy etc. before installing PyPedal. I'm apparently not quite bright enough to get setuptools working correctly. Well, OK, the actual problem is that several of the dependencies were crashing on compile. Windows users can use Enthought's distribution as a way of easily getting most of the required packages, and I believe that there are Ubuntu packages for everything, too. People not using Windows or Debian-based Linux will just have to do a little more work.
''' 04/09/2008 Updated the manual to describe the installation process in more detail.

CHANGES in PyPedal 2.0.0rc3 (Vetinari)
======================================
''' 03/28/2008 Various fidgeting with setup.py, MANIFEST.in, and __init__.py to (hopefully) get ADOdb working correctly out-of-the-box.
''' 03/28/2008 Changed the keyword dbtable_name to database_table so that database-related keywords all begin with "database". That, and it annoyed me that "dbtable" was different from everything else. Chalk up another victory for conformity!
''' 03/28/2008 Updated the manual to include a more thorough discussion of working with databases.
''' 03/28/2008 Updated pyp_newclasses/NewPedigree so that pedigree loading and saving to/from databases now work with ADOdb.
*** 03/28/2008 Rewrote the database and reports modules to use ADOdb for Python (http://phplens.com/lens/adodb/adodb-py-docs.htm) as the database abstraction library rather than SQLAlchemy. I also bundled the ADOdb files into PyPedal so that you don't have to download it separately. Currently, PyPedal supports MySQL, SQLite, and Postgres.
''' 03/12/2008 Updated the docstring for pyp_metrics/mating_coi() and pyp_metrics/mating_coi_group() so that they show the correct default value for gens as 0.
*** 03/12/2008 Fixed a bug in pyp_metrics/inbreeding_vanraden() in which sibs were not being assigned the correct COI when they already had a full-sib with a known COI.

CHANGES in PyPedal 2.0.0rc2 (Vetinari)
======================================
### 03/11/2008 The relationship metadata returned by pyp_metrics/inbreeding() are not guaranteed to be correct when method = 'vanraden' is used. This is because inbreeding_vanraden() uses a speed-up when there are full-sibs in the pedigree to avoid repeating calculations. The metadata should be pretty accurate for pedigrees with few or no full-sibs. The summary statistics will not be very accurate in the case of pedigrees that contain lots of full-sibs. I'm not sure that this is so much a bug as an optimization tradeoff.
''' 03/11/2008 Fixed a bug in pyp_metrics/inbreeding() and pyp_metrics/inbreeding_vanraden() so that non-zero averages are correctly calculated.
''' 03/11/2008 Fixed a minor bug in pyp_metrics/inbreeding_vanraden() so that the correct relationship count is returned.
''' 03/10/2008 Fixed pyp_newclasses/NewAMatrix::load() so that it uses correct keyword arguments for NumPy rather than Numarray.
''' 03/10/2008 Fixed a dictionary lookup error in pyp_newclasses/NewAMatrix::save().
--- 03/10/2008 Removed pyp_newclasses/NewAMatrix::info() because I misunderstood just what numpy.info() was returning.
''' 03/10/2008 pyp_metrics/relationship() now checks the pedigree to see if it has an attached NRM; if so, relationship() will look up the coefficient of relationship rather than extracting pedigrees and calling fast_a_matrix(). This should buy a little performance by avoiding unnecessary recalculations.
''' 03/10/2008 When output is requested pyp_nrm/inbreeding() will now write original IDs to the file if the pedigree format contains 'asd', or names if the pedigree format contains 'ASD', which makes it much easier to match output to animals in ASD pedigrees, in which original IDs are meaningful only internally.
''' 03/10/2008 Changed the default value of the cleanmaps parameter of pyp_nrm/inbreeding_vanraden() from 0 to 1; this means that ID maps created for inbreeding calculations will be deleted, rather than left lying around.
*** 03/10/2008 Fixed a very nasty misplaced-parenthesis bug in pyp_nrm/fast_a_matrix() that affected animals with both parents unknown. Thanks to Dan Cieslak for bringing this to my attention. Fixed the same bug in pyp_nrm/fast_partial_a_matrix(), which was derived from fast_a_matrix().
''' 03/05/2008 Fixed a bug in pyp_metrics/descendants() that prevented all offspring from being properly enumerated.
''' 03/05/2008 NewPedigree::simulate() now uses numpy.random for generating random variates rather than the random module.
*** 03/05/2008 Fixed a bug in pyp_metrics/mating_coi_group() in which only the last mating was retained for an individual because the group of matings was stored in a dictionary keyed by animals. The fix required an API change such that mating_coi_group() now takes a list of matings of the form "parent1_parent2", rather than a dictionary. Also fixed a few other small bugs in the function.
+++ 03/05/2008 Added a new chapter to the manual, 'Working with Pedigrees'.
''' 03/04/2008 Fixed a small bug in NewPedigree::addanimal() ('userfield' not in missing dictionary).

CHANGES in PyPedal 2.0.0rc1 (Vetinari)
======================================
+++ 03/04/2008 Added a new chapter to the manual, 'Input and Output'.
+++ 03/04/2008 Added a new method, tostream(), to NewPedigree that returns a text stream version of a pedigree.
''' 03/04/2008 Fixed a small bug in NewPedigree::fromgraph() ('userfield' not in missing dictionary).
+++ 03/04/2008 Loading pedigrees from databases using pedsource = 'db' now works. ASDx-formatted pedigrees are loaded from the database and table specified in the pedigree options database_name and dbtable_name. NewPedigree::preprocess() was modified to support this as well.
+++ 03/04/2008 Added pyp_newclasses/NewPedigree::savedb(), which saves a pedigree to a database table in ASDx format for NewAnimals and LightAnimals. This method uses the pedigree options database_name and dbtable_name. Existing databases and tables will be silently overwritten and data in them lost!

CHANGES in PyPedal 2.0.0b23 (Vetinari)
======================================
XXX 03/03/2008 After careful consideration I decided that the GUI is a non-starter and ripped it all out. So there you are.
''' 03/03/2008 Added a new parameter, 'output', to pyp_nrm/inbreeding() to allow suppression of output files.
''' 02/29/2008 Added a new pedsource, pedstream, and made the necessary changes to pyp_newclasses/loadPedigree(), NewPedigree::__init__() and NewPedigree::preprocess() to support reading pedigrees from text strings rather than files. It works ONLY for ASD-formatted pedigrees with comma-delimited animal IDs and '\n'-separated lines. This was a request made by Dan Cieslak, who would like to use PyPedal as a web service and wants to not have to write temporary text files all over the place.
''' 02/20/2008 Put the family ID into the herd field in GEDCOM file import.
*** 02/19/2008 The GEDCOM functions in pyp_io (read_from_gedcom(), write_from_gedcom(), and write_to_gedcom()) now work correctly. The name problem detailed below on 01/17/2008 was worked around by storing the name in a user-defined field and using 'u' in the pedigree format.
*** 02/13/2008 Made changes to several methods of the PedigreeMetadata class so that they handle LightAnimal pedigrees correctly. The fixed routines used attributes, such as founder, that the LightAnimal class lacks. IMPORTANT: most of the methods now use the Python set() function on the results of a list comprehension to get the unique elements in the list. This means that Python 2.4 is now the earliest Python version that can run PyPedal.
''' 02/12/2008 Fixed NewAnimal::__init__() so that half-founders (animals with one unknown parent) contribute one "novel" allele.
+++ 02/12/2008 Added the routine pyp_metrics/ballou_ancestral_inbreeding() which calculates ancestral inbreeding coefficients using the recursion equation of Ballou (1997).
''' 02/12/2008 Modified NewPedigree::preprocess() to flag pedigrees as renumbered if the pedigree file includes coefficients of inbreeding (has 'f' in the format string). This will prevent unnecessary calculations in routines that operate on coefficients of inbreeding.
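
The set()-over-a-list-comprehension idiom mentioned in the 02/13/2008 PedigreeMetadata entry looks roughly like the sketch below. The record layout (animal, sire, dam tuples), the use of 0 for an unknown parent, and the function name are hypothetical and are not taken from the PyPedal sources.

```python
# Hypothetical sketch of the idiom; not the PedigreeMetadata code.
def unique_sires(records, missing_parent=0):
    """Return the set of distinct known sire IDs in a pedigree."""
    # The list comprehension collects every known sire, repeats included;
    # set() then reduces that list to its unique elements.
    return set([sire for (_animal, sire, _dam) in records
                if sire != missing_parent])


# (animal_id, sire_id, dam_id); 0 marks an unknown parent.
pedigree = [(1, 0, 0), (2, 0, 0), (3, 1, 2), (4, 1, 2), (5, 3, 4)]
print(unique_sires(pedigree))
```

The built-in set type first appeared in Python 2.4, which is why adopting this idiom raised the minimum supported Python version.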
+++ 02/11/2008 Added the routine pyp_metrics/dropped_ancestral_inbreeding() which calculates ancestral inbreeding coefficients using the gene dropping method of Suwanlee et al. (2007).
''' 02/08/2008 Updated the documentation to cover the calculation of partial inbreeding.
+++ 02/08/2008 Added the procedures pyp_nrm/partial_inbreeding() and pyp_nrm/fast_partial_a_matrix(). They are used for the calculation of coefficients of partial inbreeding. Results have been validated using the pedigree in Figure 2 of Gulisija and Crow (2007).
''' 02/07/2008 pyp_utils/reorder() has been modified so that founders (animals with unknown parents) are always at the beginning of the pedigree. This prevents the partial inbreeding code from driving me to madness as surely as if I had seen a Great Old One hovering over Beltsville.
+++ 02/05/2008 Since PySparse has been updated to work under Numpy as of the 1.0.1 release I've re-enabled it in a couple of places in pyp_nrm and pyp_metrics. I've also added checks to make sure that only 'dense' and 'sparse' are passed as values to the method parameter. The method defaults to 'dense' when an invalid value is passed.
### 01/18/2008 For now I am going to add the GEDCOM ID to the name field, e.g. "John /Cole/ (I1)", and pull them back out when exporting to GEDCOM. I'm sure that some nasty two- or three-layer dictionary wizardry can be used to index the pedigree by original GEDCOM IDs.
!!! 01/17/2008 Found a pretty nasty bug that affects ASD pedigrees, and thus the import and export of GEDCOM data. ASD pedigrees use the 'name' field to store the original animal ID read from the input file, which is then hashed. The hashed value is placed in the animalID and originalID fields. When the pedigree is renumbered the new [renumbered] ID takes the place of the animal ID. Fine. But now there's no place to store the original string that was read as the animal ID from the input file.
The problem never arose before since I never loaded a pedigree that had both names AND string-IDs. I'll have to take a close look at things before I can proceed.
''' 01/17/2008 Fixed a small bug in NewPedigree::preprocess() that resulted in attempted assignments to non-existent dictionaries in pedigrees with names. Now namemap and namebackmap are created right after the pedigree format code is checked.
''' 01/17/2008 Fixed the pedigree format codes so that 'y' indicates birth year and 'b' indicates birth date, rather than the insanely-confusing reverse of that. Why did I think that was a good idea?
''' 01/17/2008 Fixed NewAnimal::__init__() so that it uses the missing_sex option rather than a hard-coded value of 'u'.
''' 01/15/2008 Fixed PedigreeMetadata::nuf() so that it handles LightAnimal pedigrees correctly. The problem was that LightAnimal objects don't have founder flags so the sire and dam need to be compared to the missing_parent keyword in order to identify the founders. Both cases now use list comprehensions.
+++ 01/14/2008 Added a new value of pedsource, 'gedcomfile', which loads a pedigree from a GEDCOM 5.5 file, a format commonly used by programs for human genealogy.
+++ 01/14/2008 Added several functions to pyp_io: read_from_gedcom() parses GEDCOM 5.5 files, write_from_gedcom() writes GEDCOM 5.5-sourced data from read_from_gedcom() to an ASD file, and write_to_gedcom() writes a NewPedigree object to a GEDCOM 5.5 file. Note that only a few GEDCOM tags are actually supported -- just enough to get individual, sire, dam, sex and birthdate (if known) into PyPedal. The GEDCOM format is a pain in the neck to parse. It would probably be easiest to dump everything into SQLite and then use SQL to put everything together the way it should be. Maybe later.
+++ 01/02/2008 Added pyp_utils/founders_from_list() which takes a list of NewAnimal objects and returns a list of animalIDs that represent founders in that pedigree.
Note the use of a list comprehension -- more to come soon.

CHANGES in PyPedal 2.0.0b22 (Vetinari)
======================================
*** 06/26/2007 More bugfixes to pyp_utils/reorder(). The orderdict and orderbackdict dictionaries are now correctly updated when an animal in the pedigree is moved. This fixed cases where the pedigree was not correctly reordered. When this happened, pyp_utils/renumber() caught KeyErrors while looking up sires/dams and set parents to unknown, ignoring known relationships in the pedigree. George Wiggans's persistent reports that the relationships being calculated were just not right led to the root cause of the problem. Thanks, George!
XXX 06/25/2007 Removed pyp_utils/reorder_list().
+++ 06/21/2007 pyp_utils/reorder() will now write error messages to the logfile and STDOUT if a pedigree could not be renumbered in renumber_max_rounds rounds of iteration.
+++ 06/21/2007 Added new parameter, max_rounds, to pyp_utils/reorder(). pyp_newclasses/NewPedigree passes renumber_max_rounds to reorder, but the routines in pyp_nrm/* do not.
+++ 06/21/2007 Added new option, renumber_max_rounds, to pyp_newclasses/NewPedigree::__init__(). The default value is 100 rounds.
*** 06/21/2007 Completely rewrote pyp_utils/reorder(), which has been the bane of my existence and source of most PyPedal bugs for a Long Time Now.
''' 06/21/2007 Fixed bug in pyp_nrm/inbreeding() so that it correctly works with pre-calculated NRM.
+++ 06/20/2007 Added a new function, pyp_io/write_ijk(), which saves an NRM to disk in ijk format, where i and j are animal IDs and k is either (1. + the coefficient of inbreeding) when (i == j) or the coefficient of relationship when (i != j).
''' 06/20/2007 Rewrote an error message in pyp_newclasses/NewPedigree::preprocess() so that it now makes sense.

CHANGES in PyPedal 2.0.0b22 (Vetinari)
======================================
''' 06/01/2007 Applied attribute lookup optimization to pyp_metrics/pedigree_completeness().
''' 06/01/2007 Applied some attribute lookup optimization to pyp_metrics/a_effective_ancestors_definite().
''' 06/01/2007 Applied attribute lookup optimization to pyp_metrics/a_effective_founders_boichard().
''' 06/01/2007 Applied attribute lookup optimization to pyp_metrics/a_effective_founders_lacy(). Removed hard-coded missing parent value and replaced with kw['missing_parent'].
''' 06/01/2007 Fixed a small bug in pyp_utils.set_ancestor_flag() (filename out-of-scope for except block error reporting).
''' 06/01/2007 pyp_utils/reorder() has been rewritten to use a dictionary rather than a list to track animal indices. Local variables are used in the inner loop to avoid the overhead of repeated attribute lookups. The new version seems to be faster than the original, and I think that the losing-animals-between-rounds bug may finally be eliminated. The original version of reorder() that used lists for tracking animal locations has been renamed to reorder_list() and will be removed sometime down the road.
''' 05/07/2007 Made a minor change to NewPedigree::preprocess() so that it will now catch animals with an ID that's the same as the pedigree's missing parent indicator. Any such records will be skipped, the event logged, and a message written to the console.
*** 05/03/2007 Under 64-bit Linux an animal is being lost between the end of the first pass of pyp_utils/reorder() and the second pass. The length of the order list is the same. Why is an animal getting lost? I did not figure out why a given animal was getting lost, but I did fix the overall problem. The code originally created a new version of the list that stored the order in which animals appeared in the pedigree for each pass through the pedigree. This was unnecessary, and once I eliminated that step the problem with the missing cow went away. I'm sending the new version to MK for testing.
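
The "attribute lookup optimization" named in the 06/01/2007 entries is a general Python idiom: values reached through attribute or dictionary lookups are bound to local names once, outside the inner loop. A minimal sketch follows, using hypothetical Animal and Pedigree classes rather than the PyPedal ones.

```python
# Hypothetical classes for illustration; not the pyp_newclasses types.
class Animal:
    def __init__(self, animal_id, sire_id, dam_id):
        self.animalID = animal_id
        self.sireID = sire_id
        self.damID = dam_id


class Pedigree:
    def __init__(self, animals, kw):
        self.animals = animals
        self.kw = kw


def count_known_parents(pedobj):
    """Count parent slots that are not the missing-parent code."""
    animals = pedobj.animals                  # attribute looked up once
    missing = pedobj.kw['missing_parent']     # not re-fetched per animal
    known = 0
    for animal in animals:
        sire, dam = animal.sireID, animal.damID   # one lookup each
        if sire != missing:
            known += 1
        if dam != missing:
            known += 1
    return known


ped = Pedigree([Animal(1, 0, 0), Animal(2, 0, 0), Animal(3, 1, 2)],
               {'missing_parent': 0})
print(count_known_parents(ped))
```

The gain per iteration is small, so the idiom only pays off in hot loops over large pedigrees, which is where the entries above apply it.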

CHANGES in PyPedal 2.0.0b21 (Vetinari)
======================================
*** 04/06/2007 Fixed a bug in the NewAnimal::string_to_int() and LightAnimal::string_to_int() methods. The bug (reported by Matt Kelly) manifested on Windows XP/Python 2.5 and Mac OS X platforms and resulted in the failure of pyp_utils/reorder() to successfully reorder pedigrees when the 'ASD' pedigree format code was used. I believe that the problem was related to the value of sys.maxint, which varies between 32-bit and 64-bit platforms. string_to_int() now uses a hash calculated using the Python MD5 module as the main hashing method, and the old method as a backup should the first throw an exception (such as could be caused by an unsuccessful cast of a long hexadecimal value to a (long) integer). In that case, the old method will now use a hard-coded value in calculating the hash rather than sys.maxint. This bug did not affect pedigrees coded 'asd'.

CHANGES in PyPedal 2.0.0b20 (Vetinari)
======================================
''' 03/22/2007 Minor documentation fixes and additions.
''' 03/22/2007 Added a new method, savegraph(), to pyp_newclasses/NewPedigree for saving graphs to adjacency matrices.
''' 03/22/2007 Added a new value of pedsource, 'graphfile', which can load a pedigree stored as a directed graph in a text file in an adjacency list.
''' 03/22/2007 Added a new argument to pyp_utils/renumber(), animaltype, that allows for proper handling of NewAnimal versus LightAnimal instances.
*** 03/22/2007 Fixed a bug in pyp_newclasses/NewPedigree::preprocess() that prevented correct loading of pedigrees using the LightAnimal class. No namemap or namebackmap can be formed using LightAnimal objects because they have no name property. Also made related changes in pyp_db, pyp_nrm, and pyp_utils, most notably to pyp_utils/renumber().
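
The idea behind the 04/06/2007 string_to_int() fix, hashing a string ID with MD5 so that the result does not depend on a platform-specific value such as sys.maxint, can be sketched as below. This is a modern-Python illustration, not PyPedal's code: the use of hashlib, the UTF-8 encoding, and the reduction to 63 bits are assumptions made for the example.

```python
import hashlib


def string_to_int(name, bits=63):
    """Map a string ID to a non-negative integer using an MD5 digest.

    The digest is the same on every platform, so the integer does not
    change between 32-bit and 64-bit builds. The 'bits' cap is an
    assumption for this sketch, not a documented PyPedal setting.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % (2 ** bits)


animal_key = string_to_int("ANJHL5876DC")
print(animal_key == string_to_int("ANJHL5876DC"))  # same input, same key
```

Any hash reduced to a fixed number of bits can still collide in principle, so a lookup that must be exact should keep the original string alongside the integer.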

CHANGES in PyPedal 2.0.0b19 (Vetinari)
======================================
''' 03/14/2007 Removed another dependency on Numarray and replaced it with a NumPy call instead. Thanks to Matt Kelly for the bug report. That should completely remove all dependencies on Numarray.

CHANGES in PyPedal 2.0.0b18 (Vetinari)
======================================
''' 03/14/2007 Removed a dependency on Numarray in pyp_utils.renumber() and replaced it with a NumPy call instead. Thanks to Matt Kelly for the bug report.
''' 12/13/2006 Changed PedigreeMetadata::nuf() to assign animal IDs to the unique founder list rather than original IDs. This change fixes indexing problems when using pedigree metadata on renumbered pedigrees.

CHANGES in PyPedal 2.0.0b17 (Vetinari)
======================================
''' 11/03/2006 pyp_jbc.color_pedigree() now takes an additional argument, drawer (new|old), that indicates whether draw_colored_pedigree() or new_draw_colored_pedigree() should be used. An easy PyGraphviz installer is not available for Windows and this lets Windows users use the available graph layout library without having to change the code in color_pedigree().
''' 09/15/2006 Installation instructions updated.
''' 11/03/2006 More changes to packaging to try and make installation on Windows easier.
''' 11/03/2006 Added missing file, ez_setup.py, to MANIFEST.in.

CHANGES in PyPedal 2.0.0b16 (Vetinari)
======================================
''' 11/02/2006 Changed the MANIFEST.in file used for packaging releases to include only the PS and PDF manuals from the documents directory.
''' 10/29/2006 Fixed a bug in pyp_newclasses/NewPedigree::simulate() so that pedigree format code "u" is correctly handled.

CHANGES in PyPedal 2.0.0b15 (Vetinari)
======================================
''' 09/18/2006 new_draw_pedigree() uses rectangles to indicate known males, circles to indicate known females, and octagons to indicate animals of unknown sex.
''' 09/18/2006 Added new_draw_colored_pedigree() to pyp_jbc as an eventual replacement for draw_colored_pedigree(). I plan to see if I can roll the pedigree coloring functionality into new_draw_pedigree() so that there is only one piece of code to maintain.
+++ 09/18/2006 Added a new function to pyp_graphics, new_draw_pedigree(), that uses pygraphviz to produce dot files for graph visualization. The code is a lot cleaner than the draw_pedigree() routine, which is based on pydot, and I plan to replace draw_pedigree() with new_draw_pedigree().
*** 09/18/2006 Fixed a big bug in pyp_utils.assign_sexes() that was causing pedigree loading to fail.
+++ 09/18/2006 Added a new keyword, default_fontsize, to specify the default font size used in pyp_graphics. This saves a lot of cruft in, e.g., draw_pedigree() and makes it easier for the user to see what's going on with font sizes. If the font size cannot be cast to an integer, it is set to the default value of 10. Font sizes less than zero are set to the default of 10.
+++ 09/15/2006 Added a new keyword, nrm_format, to specify if an NRM written to a file should be saved as text or binary. Array elements in text files are separated by sepchar. This makes it possible to export an NRM so that it can be read into, e.g., Octave.
''' 09/15/2006 Updated pyp_newclasses/NewAMatrix::form_a_matrix() so that the keywords dictionary is passed to pyp_nrm/fast_a_matrix(). This fixes a bug experienced when users instantiated a NewAMatrix object after a pedigree had already been created.
''' 09/15/2006 Updated pyp_newclasses/NewAMatrix::info() so that it uses the correct numpy function for returning array information (note this may only work with the most recent version of NumPy, the 1.0 RC).
+++ 09/15/2006 Modified pyp_newclasses/NewAMatrix::save() so that users can store NRM in either binary or text formats (the latter is useful for getting an NRM into Octave).
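
The default_fontsize rule in the 09/18/2006 entry (fall back to 10 when the value cannot be cast to an integer or is less than zero) might look like the sketch below. The helper name clean_fontsize() is hypothetical and the code is not taken from pyp_graphics.

```python
DEFAULT_FONTSIZE = 10


def clean_fontsize(value, default=DEFAULT_FONTSIZE):
    """Return value as an integer font size, or default if it is unusable.

    Hypothetical helper that mirrors the rule stated in the changelog:
    values that cannot be cast to int, and values below zero, both fall
    back to the default.
    """
    try:
        size = int(value)
    except (TypeError, ValueError):
        return default
    if size < 0:
        return default
    return size


print(clean_fontsize("12"))   # 12
print(clean_fontsize("big"))  # 10
print(clean_fontsize(-4))     # 10
```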
+++ 09/15/2006 Clean-up in pyp_graphics so that the module can be imported even if matplotlib (or one of its dependencies) is not present. This means that there is import cruft in all of the functions that use matplotlib, but it also means that draw_pedigree and some of the other routines are useable even when matplotlib is not available.
''' 09/15/2006 Lots of documentation updates.
+++ 06/20/2006 Added a new pedigree format code, "u", which contains a user-defined field as a string. This field can be used to mark or label records, for example, as having associated DNA samples or not.
+++ 06/13/2006 Added pyp_network.get_closeness_centrality(), which returns a dictionary of the closeness centrality (1/(average distance to all nodes from n)) for each node in the graph.
+++ 06/09/2006 Added pyp_network.mean_degree_centrality(), which calculates mean in- and out-degree centralities for directed graphs and simple degree-centralities for undirected graphs. If the normalize flag is set, each node's centralities are weighted by the number of edges in the (di)graph.
+++ 06/09/2006 Added pyp_network.dyad_census(), which calculates the number of null, asymmetric, and mutual edges between all pairs of nodes in a directed graph.
+++ 06/09/2006 Added pyp_network.graph_density(), which calculates the density of a digraph, which is the ratio of edges in the graph to the maximum possible number of edges.
+++ 06/09/2006 Added pyp_network.mean_geodesic(), which calculates the mean geodesic (shortest) distance between two vertices in a network.
+++ 06/09/2006 Added pyp_network.get_node_degree_histograms(), which returns a dictionary containing histograms of the number of vertices (nodes) in pg with a given number of incoming, outgoing, or total edges.
+++ 06/09/2006 Added pyp_network.get_node_degrees(), which returns a dictionary containing the number of vertices (nodes) in pg with a given number of incoming, outgoing, or total edges.
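
Two of the 06/09/2006 network measures, digraph density and the dyad census, are easy to state in plain Python. The sketch below works on a list of (parent, offspring) edge tuples rather than on a NetworkX graph, and it assumes a graph without self-loops; the signatures are illustrative and do not match the pyp_network functions.

```python
from itertools import combinations


def graph_density(nodes, edges):
    """Ratio of edges present to the n * (n - 1) possible directed edges."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(set(edges)) / float(n * (n - 1))


def dyad_census(nodes, edges):
    """Count mutual, asymmetric, and null dyads over all pairs of nodes."""
    edge_set = set(edges)
    census = {"mutual": 0, "asymmetric": 0, "null": 0}
    for a, b in combinations(nodes, 2):
        forward = (a, b) in edge_set
        backward = (b, a) in edge_set
        if forward and backward:
            census["mutual"] += 1
        elif forward or backward:
            census["asymmetric"] += 1
        else:
            census["null"] += 1
    return census


# A three-animal pedigree: edges run from each parent to the offspring.
nodes = ["sire", "dam", "calf"]
edges = [("sire", "calf"), ("dam", "calf")]
density = graph_density(nodes, edges)
census = dyad_census(nodes, edges)
print(density, census)
```

Because an animal cannot be its own ancestor, a valid pedigree graph has no mutual dyads; a nonzero mutual count would point to a data error.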
''' 06/09/2006 Fixed a bug in pyp_network.ped_to_graph() that prevented animals with two parents and no offspring from being added to the graph.
''' 06/08/2006 Changed NewAnimal::__init__() so that originalHerd is set correctly.
+++ 06/08/2006 Added code to pyp_utils.renumber() to update the sons, daus, and unks dicts for each animal after they've been renumbered. The keys are updated to the offspring's renumbered IDs, while the values remain the offspring's original animalID.
''' 06/08/2006 The printme(), stringme() and dictme() methods of NewAnimal now handle birth dates.
''' 06/08/2006 Changed NewAnimal::__init__() so that the default animal name in 'asd' pedigrees is the animal ID rather than the value of the 'missing_name' option. This means that the name attribute of animal objects will correspond to the value stored in the namemap for that pedigree. When all animals with no name are assigned 'missing_name' the namemap gets screwed up and the 'missing_name' key in namemap resolves only to the most-recently-added animal.

CHANGES in PyPedal 2.0.0b14 (Vetinari)
======================================
--- 05/16/2006 There is a problem with logfile creation when loading a pedigree from a digraph. I'm not sure what the problem is, but I'm punting a fix until b15.
''' 05/16/2006 pyp_network/ped_to_graph() now uses the XDiGraph class for creating graphs. Edges between sires and offspring and dams and offspring are labelled as 's' and 'd', respectively. This allows NewPedigree::fromgraph() to distinguish between sires and dams.
+++ 05/16/2006 Added the method NewPedigree::fromgraph() which loads a PyPedal pedigree from an XDiGraph object.
''' 05/16/2006 Updated LightAnimal::string_to_int() to use the same hashing algorithm as NewAnimal::string_to_int().
+++ 05/16/2006 Added a new method, dictme(), to the NewAnimal and LightAnimal classes. dictme() returns the attributes of an animal object as a dictionary.
''' 05/16/2006 Fixed the founder assignment code in NewAnimal::__init__(), which was not correctly identifying founders in ASD pedigrees.
''' 05/16/2006 Removed some redundant code from the sire and dam assignment code in NewAnimal::__init__().
''' 05/16/2006 Fixed an edge case in NewPedigree::preprocess() in which unknown parents were being added to ASD pedigrees as actual animals.
+++ 05/16/2006 Added a new procedure, pyp_utils.subpedigree(), which takes a NewPedigree object and list of animal IDs and returns a NewPedigree object containing only the animals in the animals list.

CHANGES in PyPedal 2.0.0b13 (Vetinari)
======================================
''' 05/15/2006 I tried using a fancy list comprehension (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560) in pyp_metrics/relationship() to make sure that there are no duplicate animals in the pedigree, and it seemed to work in some cases and not in others. For now, I've replaced it with a much-less-sexy pair of loops using a dictionary tracking "seen" animals that works nicely. This should fix side-effect bugs in pyp_metrics_coi() as well.
''' 05/15/2006 Fixed NewPedigree::addanimal() to correctly perform sire and dam lookups using the backmap and namebackmap dictionaries. This fixes a problem in pyp_metrics/mating_coi() in which the new (default) algorithm was returning 0. because addanimal() was handling an exception poorly.

CHANGES in PyPedal 2.0.0b12 (Vetinari)
======================================
''' 05/15/2006 Fixed a bug in pyp_metrics/relationship() that resulted in incorrect relationships in some circumstances. Now a pedigree is formed for both anim_a and anim_b, and a list comprehension is used to remove duplicate IDs. This should result in correct relationships being computed in all cases.
This fix also handles an edge case in which an animal would not be included in the pedigree used for the calculation if anim_a and anim_b were not related; this was handled correctly by the try...except block, but only accidentally.

CHANGES in PyPedal 2.0.0b11 (Vetinari)
======================================
+++ 05/12/2006 Added PyPedalError and PyPedalPedigreeInputFileNameError classes. Now the exception raised in NewPedigree::__init__() actually works sort-of correctly.
''' 05/12/2006 Did some more work in pyp_newclasses/NewAnimal::__init__() to make sure that string methods are not called on non-strings.
*** 05/12/2006 pyp_newclasses/NewAnimal::string_to_int() now uses a new hashing algorithm taken from "Character String Keys" in "Data Structures and Algorithms with Object-Oriented Design Patterns in Python" by Bruno R. Preiss: http://www.brpreiss.com/books/opus7/html/page220.html#progstrnga. This fixes the collision problem with the previous algorithm (but any algorithm that produces, say, 32-bit integers will collide under the right circumstances).
--- 05/12/2006 OK, the problem with pyp_utils/reorder() has been identified: Ori Peleg's hashing approach used in pyp_newclasses/NewAnimal::string_to_int() is colliding. For example, the strings 'ANJHL5876DC' and 'ONNLDF348RC' hash to the same value.
''' 05/11/2006 Made a change to pyp_newclasses/NewAnimal::__init__() so that animal, sire, and dam names are correctly assigned even if integral IDs are passed rather than strings.
''' 05/11/2006 pyp_newclasses/NewPedigree::preprocess() no longer casts animal, sire, and dam IDs to strings when adding pedigree entries for missing sires and dams. The cast was breaking the founder assignment code in pyp_newclasses/NewAnimal::__init__().
''' 05/11/2006 Turned off a bunch of debugging messages left on in pyp_utils/reorder().
''' 05/10/2006 pyp_utils/assign_offspring() was rather thoroughly broken due to errors introduced when the new object model conversion was made.

CHANGES in PyPedal 2.0.0b10 (Vetinari)
======================================
*** 04/28/2006 pyp_utils/reorder() is fixed. The original code was supposed to make a copy of an animal object to keep things straight when moving an animal ahead of its sire or dam. The code was using an "a = b" type of operation rather than using the copy module to make a true copy. That's been fixed and the beef cattle pedigree provided by Matthew Kelly can be successfully reordered. w00t!
--- 04/27/2006 In pyp_utils/reorder() there appears to be a problem with properly reordering IDs from ASD pedigrees. I THINK that it may be a lookup problem -- when a sire or dam appearing only in an offspring record is added to the pedigree their ID does not get mapped correctly when looking up the index in reorder(). More details. I think that it's definitely a problem with ASD pedigrees. Here's what happens. You specify ASD. Even if an animal ID in the pedigree file is an integer, it gets hashed to an integer as though it were a string because, well, it is at that time. Then later, the reorder routine tries to determine the sire's location in the pedigree using the index() method on a list. But...that lookup fails because the animal object was not updated to include the sire's hashed ID -- it still has the original ID. I think.
*** 04/27/2006 Okay, I think that I've squashed another tricky bug with IDs. Now when pedigree entries are created for sires or dams appearing only as a parent, the animal ID and missing parent IDs (used for the sire and dam fields) are cast to strings before being passed to NewAnimal::__init__().
''' 04/27/2006 Removed pointless casting from NewAnimal::printme().
XXX 04/27/2006 The 'id_first' and 'id_last' attributes added back in Beta 7 have been removed. They were not actually needed for their intended purpose and were causing breakage with ASD pedigrees.
''' 04/27/2006 Corrected docstrings in pyp_db/loadPedigreeTable().
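
The difference behind the 04/28/2006 reorder() fix, binding a second name to the same object versus making a true copy with the copy module, is shown in the sketch below. The Animal class is a stand-in invented for the example and is not PyPedal's NewAnimal.

```python
import copy


class Animal(object):
    """Tiny stand-in for an animal record (hypothetical, for illustration)."""

    def __init__(self, animal_id, sire_id):
        self.animal_id = animal_id
        self.sire_id = sire_id


calf = Animal(5, 9)

alias = calf                # "a = b": a second name for the SAME object
snapshot = copy.copy(calf)  # a separate object with the same attribute values

calf.animal_id = 1          # e.g. the animal is moved during reordering

print(alias.animal_id)      # 1 -- the alias sees the change
print(snapshot.animal_id)   # 5 -- the copy keeps the old value
```

copy.copy() is a shallow copy, which is enough when the attributes are plain integers or strings; objects that hold mutable containers would need copy.deepcopy() to be fully independent.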
''' 04/27/2006 NewAnimal::__init__() now correctly assigns values to the 'herd' attribute when defaulting to the value specified in the 'missing_herd' attribute. This was causing breakage in pyp_db/loadPedigreeTable().
''' 04/27/2006 NewAnimal::__init__() uses a cast to make sure that animal, sire, and dam IDs are integers, rather than strings, when 'asd' formats are used. This fixes a problem with comparing sire and dam IDs to the default missing parent value of 0; lines read from the pedigree file are treated as strings, which means that any value derived from, say, split()-ing the input line is also a string. IDs read using the 'asd' format are required to be integers, but this was not being enforced by a cast.
*** 04/27/2006 Major audit of code in all pyp_ modules to ensure that code uses the 'missing_parent' attribute instead of comparing sire and dam IDs to 0. This eliminates lots of potential bugs. I hope.

CHANGES in PyPedal 2.0.0b9 (Vetinari)
=====================================
''' 04/26/2006 pyp_jbc/draw_colored_pedigree() now correctly handles user-specified missing parent codes.
''' 04/26/2006 pyp_network/ped_to_graph() now correctly handles user-specified missing parent codes.
''' 04/26/2006 Fixed PedigreeMetadata::nud() and PedigreeMetadata::nus() so that they correctly handle user-specified missing parent codes.
''' 04/26/2006 Added a new parameter, missingparent, to pyp_utils/reorder(). reorder() can now correctly handle user-specified missing parent codes.
''' 04/26/2006 Made a small change in the way that NewPedigree::preprocess() handles user-supplied missing parent codes. Sire and dam IDs are now compared to kw['missing_parent'] instead of being cast to integers and compared to 0.
''' 04/26/2006 Fixed a typo in pyp_metrics/mating_coi_group() that caused a fatal error on module importation.
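
The string-versus-integer problem from the 04/27/2006 entries, and the move to comparing against a configurable missing parent code, can be reproduced in a few lines. The helper names and the space separator below are assumptions for the example; they are not PyPedal functions.

```python
def parse_asd_line(line, sepchar=" "):
    """Split an 'animal sire dam' record and cast every ID to int."""
    animal, sire, dam = [int(field) for field in line.strip().split(sepchar)]
    return animal, sire, dam


def has_known_sire(record, missing_parent=0):
    """Compare the sire ID to the missing parent code, not a literal 0."""
    return record[1] != missing_parent


raw_fields = "12 0 7".split()
print(raw_fields[1] == 0)       # False: the string '0' is not the integer 0

record = parse_asd_line("12 0 7\n")
print(has_known_sire(record))   # False: sire 0 is the missing parent code
print(has_known_sire(record, missing_parent=-1))  # True under another code
```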

CHANGES in PyPedal 2.0.0b8 (Vetinari)
=====================================
''' 04/14/2006 Much tweaking of NewPedigree::addanimal() and NewPedigree::delanimal() to get them working correctly with string names (ASD). addanimal() now correctly renumbers the dummy animal as well.
''' 04/14/2006 Added a new keyword, 'newanimal_caller', to NewPedigree. It is for INTERNAL USE ONLY and is needed so that addanimal() correctly works with ASD pedigrees.
+++ 04/14/2006 Added new function, pyp_metrics/mating_coi_group(), that is used to identify the minimum-inbreeding matings from a set of proposed parents.
+++ 04/14/2006 Added a 'namebackmap' to the NewPedigree class that is the reverse of the sadly-undocumented 'namemap' that was added in Beta 7.
''' 04/14/2006 The algorithm in NewAnimal::string_to_int() was replaced with a much better algorithm inspired by Ori Peleg's "Pseudo-random string to float conversion" recipe in the ActiveState Python Cookbook at: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391413 The new algorithm returns much smaller integers than the old, stupid algorithm I was using before, which would overflow if the name (string) passed to the method was too long.

CHANGES in PyPedal 2.0.0b7 (Vetinari)
=====================================
''' 04/13/2006 Cleaned-up NewAnimal::__init__() so that missing sires and dams are assigned IDs of "missing_parent" and names of "missing_name". Animals with unknown herds receive a default of 'u' with the 'h' format code and "missing_herd" with the 'H' format code.
''' 04/13/2006 Fixed pyp_metrics/pedigree_completeness() to correctly call pyp_nrm.recurse_pedigree_n().
''' 04/13/2006 Added two new keywords, "pedcomp" and "pedcomp_gens" to NewPedigree. When "pedcomp" is 1, pedigree completeness will be calculated for "pedcomp_gens" generations.
*** 04/12/2006 Fixed a pretty nasty bug in the preprocessing of animal records where IDs are strings (ASD) that caused sires and dams without their own pedigree file entries to have wrong sire and dam IDs themselves; the missing parent codes were incorrectly being converted to IDs.
+++ 04/12/2006 Added a new report, pdf3GenPed(), to the reports module. It takes an animal ID, or a list of animal IDs, and produces a PDF with a three-generation pedigree for each animal in the list on a separate page.
''' 04/11/2006 Added three new keywords, 'missing_name', 'missing_herd', and 'missing_breed', to pyp_newclasses/NewPedigree. Updated NewAnimal.__init__() to assign missing names, herds, and breeds using these new keywords.
''' 04/11/2006 Added a _frame_height attribute to the _pdfCalcs dictionary in pyp_reports.
XXX 04/10/2006 Removed support for PySparse until it is in sync with NumPy rather than Numeric. No one is using it, as far as I know. If you really, really need/want to use it, you'll have to hack the code. It's still in pyp_nrm, it's just been disabled by a hard-coded value of the 'method' argument.
*** 04/10/2006 Replaced dependency on Numarray with NumPy to gain faster performance on small matrices. All of the code should be using NumPy now rather than Numeric.
*** 04/10/2006 pyp_metrics/mating_coi() now takes a 'gens' argument. When gens is -1, it calculates the expected COI by taking half of the parents' coefficient of relationship. When gens is 0, it inserts a dummy animal in the pedigree and computes the COI using the full pedigree. When gens is >0, the COI of a dummy animal is calculated using only gens generations of the pedigree.
''' 04/10/2006 Updated pyp_metrics/relationship() to work correctly with the reorder*(), renumber*(), and fast_a_matrix*() routines in pyp_nrm. It wasn't updated months back when I made some API changes.
''' 04/10/2006 Fixed some bugs in pyp_network/find_ancestors() by casting animal IDs to integers.
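
The gens = -1 shortcut in the 04/10/2006 mating_coi() entry amounts to one line of arithmetic: the expected COI of an offspring is half the relationship between its parents. The sketch below assumes that the relationship is the sire-dam off-diagonal element of the NRM; the function name is hypothetical and is not the PyPedal API.

```python
def expected_offspring_coi(parent_relationship):
    """Expected COI of an offspring: half the parents' relationship.

    parent_relationship is assumed to be the sire-dam element of the
    numerator relationship matrix (hypothetical helper for illustration).
    """
    return 0.5 * parent_relationship


# Non-inbred full sibs have a relationship of 0.5, half sibs 0.25.
print(expected_offspring_coi(0.5))   # 0.25
print(expected_offspring_coi(0.25))  # 0.125
print(expected_offspring_coi(0.0))   # 0.0 for unrelated parents
```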
+++ 04/10/2006 Updated examples/new_methods.py to include tests of NewPedigree::addanimal() and NewPedigree::delanimal().
+++ 04/10/2006 Added 'id_first' and 'id_last' attributes to the NewPedigree keywords dictionary. They store, as integers, the first and last animal IDs in the pedigree, respectively. They can be used to automate the addition of new animals to the pedigree by NewPedigree::addanimal(). They should also probably go in the metadata, but I want to make sure that they are always attached to the NewPedigree.
+++ 04/10/2006 Added the pyp_newclasses/NewPedigree::addanimal() method that is used for adding animals to pedigrees. Not safe for use on un-renumbered pedigrees!
+++ 04/10/2006 Added the pyp_newclasses/NewPedigree::delanimal() method that is used for deleting animals from pedigrees. Not safe for use on un-renumbered pedigrees!
''' 04/10/2006 Fixed several typos in PedigreeMetadata::printme() due to injudicious cutting-and-pasting.
??? 02/16/2006 There might be a problem if a user specifies that more than one column should be skipped using the 'Z' pedigree format code.

CHANGES in PyPedal 2.0.0b6 (Vetinari)
=====================================
''' 02/09/2006 Added a new parameter, 'gshowall', to pyp_graphics/draw_pedigree(). When 1, draws all animals in pedigree, including those with no ancestors or descendants. When 0, only draws animals with at least one known parent or child.
''' 02/09/2006 The 'missing_parent' option in pyp_newclasses/NewPedigree now defaults to the number 0, not the string '0'.
''' 02/09/2006 pyp_newclasses/NewPedigree::renumber() now passes the 'missing_parent' option to pyp_utils/renumber().
''' 02/09/2006 Because pyp_utils/renumber() does not take a pedobj, it did not correctly renumber pedigrees that used a missing parent indicator other than 0. This was fixed by adding a new parameter, 'missingparent', to the function call; its default value is 0.
''' 02/09/2006 pyp_utils.renumber() now changes the 'name' attribute in animal objects to match the renumbered ID if the originalID and the name are the same. This fixes a tricky problem in pyp_graphics/draw_pedigree() in which pedigree drawings were incorrect for both un-renumbered and renumbered pedigrees.
--- 02/09/2006 pyp_graphics/draw_pedigree() does not correctly render pedigrees that have not been renumbered b/c it uses animal, sire, and dam ID as indices into the pedigree. A warning is now written to the screen (if enabled) and the logfile whenever an un-renumbered pedigree is passed to draw_pedigree().
''' 02/09/2006 pyp_graphics/draw_pedigree() now uses the 'missing_parent' pedigree option to suppress drawing unknown parents rather than using a hard-coded value of 0.
''' 02/08/2006 pyp_graphics/draw_pedigree() now gets the list of unique generations from the pedigree metadata rather than looping over the pedigree to form its own list from scratch.
''' 02/08/2006 Added a new recognized value to pedigree_summary, 2. If pedigree_summary > 1 then PedigreeMetadata::printme() displays some additional details, such as unique sire, dam, and birthyear lists.
''' 02/08/2006 pedfile now defaults to 'simulated_pedigree' for simulated pedigrees; prevents an exception being thrown when the filetag is formed.
+++ 02/07/2006 Major work on pyp_newclasses/NewPedigree.simulate() -- it works now.
''' 02/06/2006 Cleaned-up some PythonDoc strings in the NewAnimal() and LightAnimal() classes to more clearly differentiate between the two classes.
''' 02/06/2006 Fixed the n__() methods in the PedigreeMetadata() class so that they handle pedigrees of LightAnimal() objects by returning counts of 0 and empty dictionaries of distinct levels.
''' 02/06/2006 Fixed the printme() and stringme() methods in the NewAnimal() class so that they no longer alter object attributes by casting.
+++ 02/06/2006 Added a new class, LightAnimal(), for use with the graph routines in pyp_network.
It is a scaled-down version of NewAnimal() that does not track as many attributes.
+++ 02/06/2006 Added a new pedigree option, animal_type, that indicates which animal class should be used to instantiate the animal records, NewAnimal() or LightAnimal(). Recognized values are 'new' and 'light'.
### 12/21/2005 I'm trying to determine whether or not there is still a big performance difference between Numeric and numarray for small matrices. If there is, I am going to roll back the requirement for numarray to a requirement for Numeric instead. This is a big issue in, e.g., pyp_nrm.inbreeding_vanraden() that creates lots of small matrices.
''' 12/21/2005 Fixed bugs in pyp_nrm/fast_a_matrix() and fast_a_matrix_r() that caused an exception to be thrown whenever debugging messages were turned on (careless use of 'pedobj').

CHANGES in PyPedal 2.0.0b5 (Vetinari)
=====================================
*** 12/19/2005 pyp_metrics/common_ancestors() now works correctly with pedigree objects.
''' 12/19/2005 pyp_metrics/a_effective_ancestors_definite() now writes a warning to the logfile whenever there are no distinct generations in the pedigree.
''' 12/19/2005 pyp_nrm.inbreeding() now writes a message to the logfile whenever a pedigree is dispatched to pyp_nrm.inbreeding_vanraden() instead of pyp_nrm.inbreeding_tabular() because of its length.
''' 12/19/2005 pyp_nrm.inbreeding_tabular() now honors the 'nrm_method' pedigree option when forming NRM.
''' 12/19/2005 pyp_nrm.inbreeding_tabular() accepts the rels option.
''' 12/19/2005 pyp_nrm.inbreeding_vanraden() now honors the 'nrm_method' pedigree option when forming NRM. An example of why this is necessary may be seen by studying the horse pedigree used in examples/new_inbreeding2.py and running the analysis with 'nrm_method' set to 'nrm' and 'frm' in turn.
*** 12/19/2005 Fixed a bug in pyp_nrm/fast_a_matrix_r() that caused an exception to be thrown whenever the routine was entered (careless use of 'pedobj').
''' 12/16/2005 pyp_nrm.inbreeding_vanraden() accepts the rels option.
''' 12/16/2005 pyp_nrm.inbreeding() now has a new option, rels, that causes the routine to return a second dictionary containing summary statistics on coefficients of relationship in a pedigree.
''' 12/15/2005 Cleaned up pyp_nrm.inbreeding() so that sensible values are returned for all metadata when there are no inbred animals in the pedigree.
+++ 12/15/2005 Added a new pedigree option, pedigree_summary, that indicates whether or not the pedigree loading details and summary are printed to STDOUT.
+++ 12/15/2005 pyp_nrm.inbreeding() now takes an argument, gens, that specifies how many generations are to be used in calculating COI. pyp_nrm.inbreeding_vanraden() and pyp_nrm.inbreeding_tabular() have both been modified to accept the gens option. For an example, see examples/new_inbreeding2.py.
''' 12/14/2005 Most routines in pyp_graphics now write messages to the logfile when there are problems, as well as to STDOUT if requested.
--- 12/14/2005 The pyp_graphics routines that use matplotlib are failing with the error: "AttributeError: 'file' object has no attribute 'rfind'" when pylab.savefig() is called. I haven't been able to find a fix yet, but I think it's only a problem on the 64-bit build. User feedback is really needed here.
''' 12/13/2005 Fixed a minor bug in pyp_graphics.plot_line_xy().
''' 12/13/2005 Added an optimization to pyp_graphics.rmuller_spy_matrix_pil() for square matrices, but it's still pretty slow for large NRM.
*** 12/13/2005 Added a _backmap to pyp_nrm.inbreeding_vanraden() for reverse lookups (renumberedID => originalID). That fixes a bug that did not show up in small pedigrees in which all of the animals were extracted into the same subpedigree, but which did affect larger pedigrees.
''' 12/13/2005 pyp_nrm.inbreeding_vanraden() now uses pyp_network.find_ancestors() to form subpedigrees.
I don't have any strict metrics to prove it, but I believe that there's a performance gain of at least an order of magnitude over using pyp_nrm.recurse_pedigree().
''' 12/13/2005 pyp_nrm.inbreeding_vanraden() no longer reads ID maps from disc using pyp_utils.load_id_map().
''' 12/13/2005 pyp_utils.renumber() now accepts an argument, returnmap. If returnmap is 0, the default, pyp_utils.renumber() does not write idmaps to disc, returning them on exit along with the renumbered pedigree. Maps are only written to disc when returnmap is not 0. This improves the performance of, e.g., pyp_nrm.inbreeding_vanraden() by eliminating a lot of disc reads/writes.
''' 12/13/2005 pyp_nrm.inbreeding_vanraden() now honors the slow_reorder keyword.
''' 12/13/2005 Fixed a minor bug in pyp_newclasses/NewAMatrix.form_a_matrix() that resulted in a default keyword being assigned the wrong key.
''' 12/13/2005 All programs in the examples/ subdirectory have been fixed to use pyp_newclasses.loadPedigree() for loading pedigrees. The old two-step process still works, and that will not change.
''' 12/13/2005 pyp_newclasses.NewPedigree.__init__() now uses the dict() method on Dict4Ini to convert options read from a configuration file to an actual Python dictionary.
+++ 12/12/2005 Dict4Ini 0.4 (http://wiki.woodpecker.org.cn/moin/Dict4Ini) has been added to the distribution. PyPedal can now read its options from a configuration file or from a dictionary that is created by the user and passed to pyp_newclasses.NewPedigree(). The support in pyp_newclasses.NewPedigree.__init__() is not a perfect solution because it can let a naked exception propagate back to the user, but __init__() methods can only return None, so there is no nice way to pass back a message if an exception is thrown. If dict4ini.DictIni(kwfile) fails, then kw will be an empty dictionary. The user will only see this when the load() method is called on the NewPedigree instance.
The way around this is to use the pyp_newclasses.loadPedigree() convenience function.
+++ 12/12/2005 Added the convenience function pyp_newclasses.loadPedigree(), which wraps pedigree object instantiation and loading into one call. examples/new_options.py demonstrates its usage.
''' 12/09/2005 After running PyLint on the tree I went all David Bruce Banner and added docstrings to everything, even though in most cases they are just copies of the PythonDoc comments.
''' 12/07/2005 All programs in the examples/ subdirectory have been tested and are known good.
--- 12/07/2005 Despite lots of poking around, upgrading, and tweaking of version numbers I still cannot get a clean build on a binary RPM using setuptools. Only three packages (networkx, testoob, and pydot) even have correct links in the Python Cheese Shop so that setuptools can automatically download them. For now, I am only going to distribute the source tarball and a binary .egg for AMD64 machines.
''' 12/06/2005 Cleaned up the examples/ subdirectory. Removed old programs, updated all programs to use correct imports and the current API. Removed old pedigree and output files.
''' 12/06/2005 Major cleanup in setup.py and MANIFEST.in.
''' 12/06/2005 networkx now has a Python Cheese Shop entry. It has been added to the setuptools script.

CHANGES in PyPedal 2.0.0b4 (Vetinari)
=====================================
''' 12/06/2005 Added PythonDoc strings to the __init__() methods for the NewPedigree() and NewAMatrix() classes.
+++ 12/05/2005 Added a printme() method to pyp_newclasses/NewAMatrix.
''' 12/05/2005 The default for gtitle in pyp_graphics/draw_pedigree() has been changed from 'My_Pedigree' to ''.
''' 12/05/2005 The default page orientation in pyp_graphics/draw_pedigree() has been changed from landscape ('l') to portrait ('p').
''' 12/05/2005 pyp_graphics/draw_pedigree() can now form syntactically-correct Dot files when gtitle is an empty string, e.g. ''. In those cases, no label is added to the graph.
*** 12/05/2005 Fixed three major bugs in pyp_newclasses/NewPedigree.load() that caused the pedigree to be overwritten with a function return status flag in some cases.
''' 12/05/2005 Updated OPTIONS.txt so that it includes documentation of all pedigree options.

CHANGES in PyPedal 2.0.0b3 (Vetinari)
=====================================
''' 11/29/2005 pyp_db/createPedigreeTable() has been updated such that columns are created for pyp_newclasses.NewAnimal() herd, originalHerd, and gencoeff attributes.
+++ 11/29/2005 Added pyp_db/tableDropTable(), which drops a table from the database. It is similar to tableDropRows(), but tableDropRows() only drops the contents of a table, not the table itself. Use caution when calling tableDropTable() as it will delete data!
### 11/29/2005 Python bindings for the Boost Graph Library are now available. I looked at them briefly, but have no plans of replacing NetworkX with BGL any time soon. I will revisit this decision as BGL coverage improves.
''' 11/29/2005 Added pyp_reports/pdfMeanMetricBy(), which produces a PDF version of the results dictionary created by pyp_reports/meanMetricBy(). This was done as much to provide another example of creating user-defined printed reports as anything.
''' 11/29/2005 pyp_reports/_pdfCreateTitlePage() now automagically wraps long lines. This behavior is currently hard-coded. It may need tweaking to look okay on A4 paper or if you change the default typeface.
''' 11/29/2005 Added a printoptions() method to pyp_newclasses/NewPedigree() so that the state of a pedigree can be more easily inspected.
*** 11/29/2005 pyp_reports/meanMetricBy() now checks to see if the pedigree on which it's working has been loaded into the database. If it has not been loaded, it's loaded and a message to that effect is written to the log. This prevents meanMetricBy() from crashing if the user does not explicitly load the pedigree into the database by using pyp_db/loadPedigreeTable().
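
The check-then-load pattern in the meanMetricBy() entry above can be sketched with nothing but the standard sqlite3 module. This is an illustration and not the code in pyp_db or pyp_reports: the helper names, the three-column table layout, and the table name are assumptions. The table name is assumed to already be safe for SQL, which is the kind of string pyp_utils/string_to_table_name() is meant to produce.

```python
import sqlite3


def table_exists(conn, table_name):
    """Return True if a table with this name exists in the SQLite database."""
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def ensure_pedigree_table(conn, table_name, animals):
    """Create and fill the table only if it is missing.

    Returns True if the table had to be loaded and False if it was already
    there. table_name is interpolated into the SQL, so it must be trusted.
    """
    if table_exists(conn, table_name):
        return False
    conn.execute(
        'CREATE TABLE "%s" (animal INTEGER, sire INTEGER, dam INTEGER)' % table_name
    )
    conn.executemany('INSERT INTO "%s" VALUES (?, ?, ?)' % table_name, animals)
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
animals = [(1, 0, 0), (2, 0, 0), (3, 1, 2)]
if ensure_pedigree_table(conn, "example_ped", animals):
    print("pedigree was not in the database; loaded it")
```

A report routine that calls something like ensure_pedigree_table() first can then run its queries without caring whether the user loaded the pedigree beforehand.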
+++ 11/28/2005 Continuing work on the documentation. There have been major additions to the manual lately. It is now, hopefully, much friendlier.
''' 11/28/2005 Added two new options: default_unit and default_report; they are documented in OPTIONS.txt.
''' 11/23/2005 Changed pyp_demog/set_base_year() to use 1900 as the default year for setting the global BASE_DEMOGRAPHIC_YEAR.
''' 11/23/2005 Added three new options: missing_bdate, missing_byear, and paper_size. They are documented in OPTIONS.txt.
+++ 11/23/2005 Added pyp_reports_templates, which contains Platypus templates for use with ReportLab.
+++ 11/23/2005 I have added pyp_reports/pdfPedigreeMetadata() to the reporting module. It uses ReportLab (http://www.reportlab.org/) to generate a PDF summary of a pedigree. That report is not of great interest, but it is useful for demonstrating how a user might create custom PDF reports using pyp_reports and ReportLab. I know, the API is supposed to be frozen, but the reports module was just too light without this.
+++ 11/23/2005 ReportLab (http://www.reportlab.org/) has been added as a dependency.

CHANGES in PyPedal 2.0.0b2 (Vetinari)
=====================================
+++ 11/22/2005 A current HTML version of the manual is now included in the distribution.
*** 11/22/2005 Fixed a typo in pyp_newclasses/NewAnimal.__init__() that caused generations to be coded as unknown ('-999') even when they were included in the pedigree file. This bug was found because it broke pyp_metrics/a_effective_ancestors_definite().
*** 11/22/2005 Cleaned-up several cases where arguments for output strings were not enclosed in parentheses when they needed to be, and would cause crashes.
*** 11/22/2005 pyp_network/founder_descendants() was renamed to get_founder_descendants() so that there is not a name collision with pyp_metrics/founder_descendants().
*** 11/22/2005 Fixed a bug in pyp_metrics/effective_founders_lacy() that was preventing non-renumbered pedigrees from being renumbered correctly.
*** 11/22/2005 The descendants() and founder_descendants() functions have been added back to pyp_metrics. When they were removed on 11/07/2005 they broke effective_founders_lacy(), but I did not catch that problem until today.
''' 11/21/2005 pyp_metrics/inbreeding() now checks to see if there is a NRM attached to a pedobj. If there is, the COI are taken from the diagonal of that NRM.
''' 11/21/2005 While investigating poor performance in pyp_metrics/inbreeding_vanraden() I added a lookup table that is used to detect full-sibs. Once the NRM for one full sib has been computed the coefficient of inbreeding for that sib is assigned to any successive full sibs in the pedigree file. This eliminates a lot of excessive computation, and improves performance, but inbreeding_vanraden() is still much slower than I think it should be. My current thinking is that the algorithm as I've implemented it is much faster when working with very deep pedigrees, where a large number of COI can be computed in a single round, than with a large number of small families that require lots of reordering and renumbering calls.
+++ 11/16/2005 A new module, pyp_jbc.py, has been added to the distribution. It accompanies the example of how to add new features to PyPedal that has been added to the manual.
+++ 11/16/2005 A joint copyright assignment form based on the Sun Joint Copyright Assignment used for OpenOffice.org contributions has been written and is included as pypedal_copyright_assignment.pdf.
+++ 11/16/2005 A new module, pyp_template.py, has been added to the distribution. It provides a template to be used in creating new user-defined modules.

CHANGES in PyPedal 2.0.0b1 (Vetinari)
=====================================
*** 11/14/2005 PyPedal 2.0.0a20 has been rebranded 2.0.0b1 and we're going to the first Beta version of PyPedal 2, "Vetinari".
There will be no release tagged 2.0.0a20; all of those changes are incorporated into 2.0.0b1.

CHANGES in PyPedal 2.0.0a20
===========================
*** 11/09/2005 Extensive work has been done on the documentation, including updating the API, greatly expanding the installation instructions, and providing some new HOWTOs.
--- 11/09/2005 Closed bug 1151280 (Sex code assignment broken with un-renumbered pedigrees). This routine needs to be handed a reordered and renumbered pedigree, and this is noted in the API docs. This is not a bug so much as a limitation of the program. It may be fixed in a future version but is low priority right now.
*** 11/09/2005 Fixed bug 1151282 (sons/daus/unks not renumbered). This bug is fairly complicated and needs some explanation. If you load a pedigree and renumber it out-of-the-date, i.e. using the renumber option, there is no problem. When a pedigree is loaded, pyp_utils/assign_offspring is always called by the preprocessor, and this call is made after any calls to pyp_utils/renumber(). In this case, the offspring lists contain renumbered IDs. However, if you do not renumber a pedigree when you initially load it, and you call pyp_utils/renumber() later, the offspring lists are NOT updated. However, setting pedobj.kw['renumber']=1 and calling pedobj.renumber() will always correctly renumber both the pedigree and the offspring lists.
''' 11/08/2005 Added a new argument to pyp_graphics/draw_pedigree(). gtitjust indicates if the title should be center- ('c'), left- ('l'), or right-justified ('r').
''' 11/08/2005 Added a new argument to pyp_graphics/draw_pedigree(). gtitloc is used to indicate if the graph title should be placed above ('t') or below ('b') the image.
''' 11/08/2005 Fixed pyp_newclasses/PedigreeMetadata.nuherds() so that it counts and returns unique herds based on the originalHerd attribute and does not count the default level ('u') for unknown herds.
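
The counting rule in the nuherds() entry above amounts to taking a set of herd codes and leaving out the unknown level. A minimal sketch follows, using plain dictionaries as stand-ins for NewAnimal objects. The function name and the data layout are assumptions for illustration and are not the PedigreeMetadata implementation.

```python
def count_unique_herds(animals, unknown="u"):
    """Return (count, sorted list) of unique herds, ignoring the unknown code.

    `animals` is a list of dicts with an 'originalHerd' key; these dicts are
    hypothetical stand-ins for PyPedal animal records.
    """
    herds = sorted(
        {a["originalHerd"] for a in animals if a["originalHerd"] != unknown}
    )
    return len(herds), herds


animals = [
    {"name": "A", "originalHerd": "north"},
    {"name": "B", "originalHerd": "u"},
    {"name": "C", "originalHerd": "south"},
    {"name": "D", "originalHerd": "north"},
]
print(count_unique_herds(animals))
```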
XXX 11/07/2005 pyp_metrics/partial_inbreeding(), which never worked correctly, has been removed from PyPedal. In the end, while I did get the computational details worked out, I couldn't really find a way to use the partial CoI in a typical analysis. Sometimes it's best to let something go.
*** 11/07/2005 pyp_newclasses/PedigreeMetadata() now includes a count and list of unique herds. The fileme(), printme(), and stringme() methods were updated to deal with them. A method to identify and count unique herds, nuherds(), was added to the class.
*** 11/07/2005 Fixed a typo in pyp_newclasses/PedigreeMetadata.fileme() that would have caused crashes.
XXX 11/07/2005 pyp_metrics/num_traced_gens() and num_equiv_gens() have been removed. The more I squinted at the (very) brief writeup on them in Valera et al. (2005) the more they looked like a way to compute pedigree completeness, which is already available in pyp_metrics/pedigree_completeness().
*** 11/07/2005 pyp_metrics/founder_descendants() and descendants() have been moved into the pyp_network module. descendants() has been replaced by pyp_network/find_descendants(). Trust me, this is best for everyone. NetworkX makes it much easier to get ancestor and descendants lists for this kind of work. It also makes it easy to count the number of steps (generations) between animals. The spaghetti code in pyp_metrics was a disaster.
+++ 11/07/2005 Added two new attributes, herd and originalHerd, to NewAnimal objects. They can be populated using the new pedigree format codes 'h' (for integers) and 'H' (for strings). This field is supposed to be used for management groups, be they herds or flocks or kennels, etc. The stringme() and printme() methods have been updated to include these attributes.
''' 11/07/2005 Added to pyp_graphics/rmuller_pcolor_matrix_pil() a dictionary in which colors can be cached. This should reduce the number of calls to pyp_graphics/rmuller_get_color().
This should lead to dramatic performance improvements because rmuller_pcolor_matrix_pil() currently calls rmuller_get_color() for each of the n**2 elements of A. The cache will reduce that to the number of unique values in A, which should be much smaller.
''' 11/03/2005 Added a new argument to pyp_graphics/draw_pedigree(). garrow is a flag used to indicate that arrowheads should be drawn on pedigrees to indicate the direction of gene flow (1) or that they should not be drawn (0).
''' 11/03/2005 Made changes to the code that creates _gtitle in pyp_graphics/draw_pedigree(). _gtitle is used internally to name graphs and is not displayed to the user. _gtitle cannot have any special characters in it, but it is formed from the gtitle argument to draw_pedigree(). pyp_utils/string_to_table_name() is now used to make sure that _gtitle is a valid graph name.
''' 11/03/2005 Two new attributes, sireName and damName, were added to NewAnimal objects. The stringme() and printme() methods were updated to include them.
*** 11/03/2005 Some bugs in handling the ASD pedigree format codes have been fixed. Most of the changes were to ensure that ID maps are updated with names instead of hashed ID values. The program examples/new_doug.py demonstrates the use of strings for animal, sire, and dam IDs.
--- 11/02/2005 There are problems in pyp_utils/reorder() when processing a pedigree with animal, sire, and dam IDs as strings. Maybe a hashing problem? Ah! I see. The problem is that missing sires and dams are not handled correctly when the ASD codes are used. For an example, delete Pachesi from doug.ped and take a look at things.
''' 11/02/2005 Added four new arguments to pyp_graphics/draw_pedigree(). gorient controls the orientation of the pedigree on the page: 'p' = portrait and 'l' = landscape. gdirec controls the direction of "flow" from parents to offspring: 'TB': top-bottom, 'LR': left-right, 'RL': right-left.
gname indicates whether names (1) or ID numbers (0) should be used to label nodes. gfontsize sets the size of the typeface to be used in node labels.
''' 11/02/2005 pyp_graphics/draw_pedigree() now draws square boxes around males and ovals around females in pedigrees where sexes are known. The sample program is examples/new_graphics2.py.
+++ 11/02/2005 Added support for Pattie's (1965) generation coefficients. This includes a new pedigree format code ('p') as well as new logic in pyp_utils/set_generations(). If generation coefficients are provided in the pedigree, they are loaded into the new gencoeff attribute of NewAnimal objects. If they are not provided but you would like to infer them, set the 'gen_coeff' option to 1 and call set_generations(); the full coefficient will be assigned to gencoeff and igen will be set to the gencoeff rounded to the nearest whole integer.
''' 10/28/2005 Updated the setuptools script to require all external modules that have entries in the Python Cheese Shop. I think that the only non-Cheese Shop extensions are NetworkX and PySparse.
--- 10/27/2005 The graph display feature in the GUI is broken. I am going to wait on fixing it while I decide whether or not the GUI is, in the end, worth the trouble. wxPython is a real pain in my untutored neck when it comes to documentation, and the only other option that looks worthwhile is PyQt, which is a real mess if I care at all about people on Windows using it. In fact, you should probably just avoid the GUI altogether.

CHANGES in PyPedal 2.0.0a19
===========================
+++ 10/25/2005 Added new functions to pyp_network for identifying ancestors, identifying descendants, identifying immediate family members (defined as parents and offspring, and does not include siblings), and identifying influential progeny based on the number of progeny they produce.
*** 10/20/2005 Fixed pyp_network/ped_to_graph() so that graphs are now ordered in the correct direction.
Before this fix, the graph was ordered backwards, so that offspring preceded parents in the graph.
!!! 10/20/2005 It looks like most of the stuff in pyp_demog can be moved into pyp_reporting. I'm going to think about this. The demographics reports may be moved. The code would be a lot cleaner if those reports were queries against the database rather than formed from walking the pedigree.
''' 10/20/2005 Added a new option, oid, to pyp_network/ped_to_graph().
''' 09/23/2005 There is now a simple document history in the GUI. The "View log" feature in pyp_gui has been rewritten so that it displays the logfile associated with the loaded pedigree rather than presenting the user with a file selection dialogue.
*** 09/23/2005 I decided to back out of using Wax and just stick with plain old wxPython for the GUI. After messing with it for a couple of days I have decided that (i) Wax has a lot of potential and (ii) it's just not quite there yet in terms of functionality and, more importantly, documentation. If I am going to spend hours in the wxPython docs anyway I'll just write with that tool, despite the much-despised IDs that I have to assign, pass, and debug.
+++ 09/21/2005 I have added a new PyPedal file, pyp_gui_metrics, which contains convenience functions for entries in the Metrics menu to reduce repetitive code in pyp_gui.
*** 09/21/2005 pyp_metrics/a_effective_founders_lacy() and pyp_metrics/effective_founders_lacy() now return a dictionary that contains summary statistics, including the effective founder number. This change breaks at least one example program and may break at least one unit test.
''' 09/21/2005 Added a new option, log_long_filenames, that indicates whether or not logfile names should include datestamps. The default is to not use them.
+++ 09/20/2005 Added pyp_io/summary_inbreeding() which returns a string representation of the data contained in the 'metadata' dictionary contained in the output dictionary returned by pyp_nrm/pyp_inbreeding().
+++ 09/20/2005 I've been messing with Wax and pyp_gui today. I have added a new PyPedal file, pyp_gui_graphs, to package the classes subclassed from Dialog, such as PyPedalGraphDialogInbreeding(). Oh, sure, I tried to have a single PyPedalGraphDialog() class that I could use for any graph by passing titles and filenames, but I could not get it to work correctly. If anyone wants to try and fix the dreaded-and-deadly "TypeError: __init__() got multiple values for keyword argument 'pgdTitle'" problem they are welcome to it. For now it is easiest to just have a subclass for each graph that I am going to draw.
''' 09/20/2005 Cleanups in pyp_db so that messages are only printed to STDOUT when the 'messages' option is set to 'verbose'.
*** 09/20/2005 pyp_nrm/inbreeding() now returns a dictionary that contains two dictionaries: 'metadata', which contains summary statistics for the CoI in the pedigree, and 'fx', which contains the actual CoI for each animal. This change breaks at least one example program and may break at least one unit test.
+++ 09/19/2005 Added a new function, plot_line_xy(), to pyp_graphics. plot_line_xy produces a plot of the values in an input dictionary by levels of the keys in the dictionary. It can take the dictionary returned by pyp_reports/meanMetricBy() and produce a simple chart from it.
''' 09/19/2005 Cleaned up logging in pyp_db/loadPedigreeDatabase().
''' 09/19/2005 Automatically-generated logfile names now include datestamps.
+++ 09/19/2005 Added a new function, pyp_datestamp(), to pyp_utils which returns a datestamp of the form YYYYMMDDHHMMSS.
''' 09/19/2005 Added a new option, log_ped_lines, that indicates how many lines of the pedigree file should be written to the logfile for debugging. The default is zero. Any value other than a non-negative integer is set to 0 and a warning is written to the log.
''' 09/16/2005 pyp_newclasses/NewPedigree.save() has been modified so that it will save CoI whenever they have been computed for a pedigree.
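
For reference, a datestamp of the form YYYYMMDDHHMMSS, as described in the 09/19/2005 entries above, can be produced with time.strftime(). The sketch below is an illustration and not the body of pyp_utils/pyp_datestamp(). The function names and the logfile naming pattern are assumptions.

```python
import time


def datestamp(when=None):
    """Return a YYYYMMDDHHMMSS string for `when` (a struct_time).

    The current local time is used if `when` is None.
    """
    if when is None:
        when = time.localtime()
    return time.strftime("%Y%m%d%H%M%S", when)


def stamped_logfile_name(basename, when=None):
    """Hypothetical naming scheme: <basename>_<datestamp>.log"""
    return "%s_%s.log" % (basename, datestamp(when))


print(stamped_logfile_name("example_ped"))
```

Because the fields run from most to least significant, names built this way sort chronologically in a directory listing.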
''' 09/16/2005 pyp_nrm/inbreeding_vanraden() now writes some summary statistics to the logfile, and the screen when requested, whenever a round of processing included at least 1% of the animals in the pedigree.
''' 09/16/2005 Modified pyp_nrm/inbreeding_vanraden() to stop overwriting known CoI in the fx{} dictionary unnecessarily. This may not buy much in terms of efficiency, but why make lots of writes when you don't need to?
''' 09/16/2005 pyp_nrm/inbreeding() now calls pyp_nrm/inbreeding_vanraden() for pedigrees of 1,000 animals or more rather than 10,000.
''' 09/16/2005 Modified pyp_nrm/inbreeding() to set the f_computed flag before returning when it has been successful.
''' 09/16/2005 Added a new option, f_computed, that indicates whether or not CoI have been computed for animals in the current pedigree. If the pedigree format string includes 'f' this will be set to 1; it is also set to 1 on a successful return from pyp_nrm/inbreeding().
''' 09/14/2005 Added some code in pyp_io/unpickle_pedigree() to prevent the addition of a .pkl extension to filenames that already have extensions. This solves the dreaded-and-deadly Double Pickle Problem.
??? 09/14/2005 pyp_io/pickle_pedigree() and pyp_io/unpickle_pedigree() seem to work on Python 2.4.1 compiled from source for 64-bit processors. More investigation is needed to see why this works on the 366 but not the 440.
''' 09/14/2005 Fixed a typo in an option name in pyp_newclasses/NewAMatrix().
+++ 08/23/2005 Added a new module, pyp_network, for experimenting with the NetworkX graph library for Python (https://networkx.lanl.gov/). This provides a way to represent pedigrees (for example) as algebraic graphs and may provide a nice way to get around some of the problems I am having trying to code routines such as pyp_metrics/num_traced_gens().
''' 08/23/2005 pyp_utils/assign_offspring() now checks the pedigree format string to see if animal sexes were provided.
If they were, offspring are assigned to their parents' correct sons or daughters list rather than to the unknowns list.
+++ 08/22/2005 pyp_nrm/fast_a_matrix() and pyp_nrm/fast_a_matrix_r() now take an optional argument, 'method', that indicates whether a dense ('dense') or sparse ('sparse') matrix should be used for storing the NRM. The sparse matrix support is provided by PySparse (http://pysparse.sourceforge.net/), and there are source code and binary (Python 2.4 for Windows 32) versions available for download. This should allow PyPedal to manipulate larger NRM than is possible with the dense matrices provided by Numarray.
''' 08/22/2005 Several routines that did not return any values before now return some result (dictionaries of summary statistics, 0/1 on failure/success, etc.). In addition, many routines were not guaranteed to return the value(s) specified in the docstrings. This has been fixed with the judicious use of try/except blocks and default values, such as initially empty lists and dictionaries. The end result should be more stability from the user's perspective due to fewer instances of behavior contrary to the documentation.
''' 08/22/2005 Almost all routines now write messages to the logfile when they are entered and exited. Exceptions are made for, e.g., pad_id() which would result in an entry being made for each animal in the pedigree. try/except blocks are used to make sure that things don't fail if no logfile has been created, for example if someone is using an odd PyPedal routine or two out-of-context as it were.
''' 08/22/2005 Cleaned-up the output-to-file code. There were lots of places where output was created as a string in one line and written to an output file on a second line. This needless separation of powers was eliminated.
''' 08/22/2005 Lots of work has been done on the documentation. LaTeX is the One True Way. Repent, non-believers, or Many Bad Things will befall you in the next life!
''' 08/19/2005 Added a new pedigree format code, Z, that can be used to skip columns when reading a pedigree file.
''' 08/04/2005 pyp_newclasses/NewPedigree.preprocess() now checks to see if the datalines read from the pedigree file contain the same number of columns as specified in the pedigree format string. If there is a mis-match an error message is written to the console and PyPedal halts.
''' 08/04/2005 Fixed a possible bug in pyp_nrm/inbreeding_vanraden() caused by a recent change in the arguments expected by pyp_nrm/fast_a_matrix().
??? 08/04/2005 Added a new function, unpickle_pedigree(), to pyp_io that unserializes (unpickles) a pedigree using the cPickle module. It is used to load a pedigree from a pickled file created with the pickle_pedigree() function.
--- 08/04/2005 Added a new function, pickle_pedigree(), to pyp_io that serializes (pickles) pedigrees using the cPickle module. Unfortunately, the call to the dump function in cPickle is throwing an exception I cannot decipher. I have e-mailed David S. to bug him for ideas.
''' 08/04/2005 More work on pyp_db and pyp_reports.
''' 08/02/2005 Minor bugfixes in pyp_nrm.
''' 08/02/2005 Some minor work on pyp_db and pyp_reports.
+++ 07/22/2005 Added a new function, string_to_table_name(), to pyp_utils. It is used to produce strings that are safe for use as SQLite table names.
''' 07/22/2005 Added two new options, database_name and dbtable_name, for support of the new SQLite/pyp_reports features.
+++ 07/21/2005 Added new modules pyp_db and pyp_reports. pyp_db is an optional module that can use SQLite (http://www.sqlite.org/) and the Python bindings to SQLite, pysqlite (http://initd.org/tracker/pysqlite) to store (and retrieve) pedigrees in a simple database. pyp_reports uses pyp_db to prepare summary reports, produce figures, etc. Neither of these modules is much use without SQLite.
The thinking here is that there are lots of reports that are pretty trivial to produce with SQL, but which require lots of looping over PyPedal pedigrees. So I decided to give this a whirl. SQLite, "a small C library that implements a self-contained, embeddable, zero-configuration SQL database engine" is in the public domain.
??? 07/21/2005 It seems like FTP upload to Sourceforge is still FUBARed. I don't know when I will get the 2.0.0a18 tarballs uploaded.

CHANGES in PyPedal 2.0.0a18
===========================
''' 07/19/2005 pyp_metrics/min_max_f() now works, at least sort-of.
+++ 07/19/2005 Added a new function, pyp_utils.sort_dict_by_values(), which sorts dictionaries by their values and keys within values. It returns a list of tuples in sorted order.
''' 07/19/2005 pyp_metrics/a_coefficients() and pyp_metrics/fast_a_coefficients() have been updated to check for an attached NRM when processing a pedigree. If kw['form_nrm'] is '0' they will form the NRM from scratch; otherwise they use the NRM attached to the pedobj. They return a dictionary of individual non-zero COI.
''' 07/19/2005 Rewrote part of pyp_newclasses/NewAMatrix. There is now a single method, NewAMatrix.form_a_matrix(), for creating NRM. The option 'nrm_method' is used to determine whether or not to correct for parental inbreeding. The default is no correction.
''' 07/19/2005 pyp_newclasses/NewPedigree.preprocess() checks for conflicts between the sepchar and alleles_sepchar options when allelotypes are provided in an input file. In case of a conflict warnings are written to the console and the logfile, and the allelotypes are ignored.
+++ 07/19/2005 Added a new option, form_nrm, that will result in the formation of a NRM as an instance of a NewAMatrix object that is attached to your NewPedigree instance. This is probably best avoided with large pedigrees, at least until I have tested it further.
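
The ordering described for pyp_utils.sort_dict_by_values() in the 07/19/2005 entry above (by value, then by key within equal values) can be written in one line in current Python. This is a sketch of that ordering, not the PyPedal routine; the function name and the (key, value) layout of the returned tuples are assumptions.

```python
def sorted_by_value_then_key(d):
    """Return (key, value) tuples ordered by value, breaking ties by key."""
    return sorted(d.items(), key=lambda item: (item[1], item[0]))


coi = {"b": 0.25, "a": 0.25, "c": 0.0}
print(sorted_by_value_then_key(coi))
# [('c', 0.0), ('a', 0.25), ('b', 0.25)]
```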
+++ 07/19/2005 Changed from using distutils to using setuptools in an effort to make installation simpler.
--- 06/21/2005 pyp_metrics/min_max_f() must never have been tested -- there is no way, looking at the code, that it can work as written.
''' 06/21/2005 Cleaned-up several functions in pyp_metrics so that they obey the 'quiet' form of kw['messages'].
''' 06/21/2005 Many small bug/typo fixes in pyp_metrics thanks to the unit tests.
XXX 06/21/2005 Removed pyp_metrics/a_effective_ancestors() because the two subroutines that it called return different values (a single value versus a tuple). There is no particularly good reason to hide them from the user, anyway.
+++ 06/20/2005 Added a unit testing framework using TestOOB (http://testoob.sourceforge.net/).
''' 06/20/2005 Fixed a bug in pyp_newclasses/NewPedigree.load() in which the wrong argument was passed to pyp_utils/assign_offspring().
''' 06/20/2005 Fixed a bug in pyp_newclasses/NewPedigree.renumber() in which the wrong argument was passed to pyp_utils/assign_offspring().
''' 06/20/2005 Fixed a typo in pyp_nrm/fast_a_matrix() that caused programs to crash.
''' 06/20/2005 Changed option, 'is_renumbered', that allows the user to specify whether or not the pedigree is already renumbered, to 'pedigree_is_renumbered'.
''' 06/20/2005 Hackage on pyp_nrm/fast_a_matrix() so that it no longer takes a PyPedal pedigree object. pyp_metrics/effective_founders_lacy() uses this routine to form NRMs from "subpedigrees", which are lists of animals rather than instances of PyPedal pedigree objects.

CHANGES in PyPedal 2.0.0a17
===========================
''' 05/19/2005 Updated all routines in pyp_nrm to conform to new object model.
''' 05/19/2005 Updated all routines in pyp_metrics to conform to new object model.
''' 05/19/2005 Updated all routines in pyp_io to conform to new object model.
XXX 05/19/2005 pyp_io/id_map_from_file() has been removed. Similar functionality is provided by pyp_utils/load_id_map().
XXX 05/19/2005 pyp_io/a_matrix_from_text_file() has been removed. This procedure was stubbed and never written.
XXX 05/19/2005 pyp_io/a_matrix_to_file() and pyp_io/a_matrix_from_file() have been removed. Similar functionality is provided by the load() and save() methods of pyp_newclasses/NewAMatrix.
''' 05/19/2005 Updated all routines in pyp_graphics to conform to new object model. Note that the routines that need a NRM are still using "raw" numarray matrices rather than instances of the NewAMatrix class.
''' 05/19/2005 Updated all routines in pyp_demog to conform to new object model.
''' 05/19/2005 Added a new option, 'debug_messages', that indicates whether or not PyPedal should print debugging information.
''' 05/13/2005 Updated all routines in pyp_utils to conform to new object model.
''' 05/13/2005 pyp_utils/set_ancestor_flag() has been updated to take only a single argument, an instance of a NewPedigree object. Keyword options in the 'NewPedigree.kw' dictionary are used to control messages and I/O. The logging module is used for recording operations. The documentation has been updated.
''' 05/13/2005 Added a new option, 'file_io', that tells routines that can write results to output files to do so and to put messages in the program log to that effect.
XXX 05/13/2005 pyp_utils/new_preprocess() has been removed.
XXX 05/13/2005 pyp_utils/preprocess() has been removed.
XXX 05/13/2005 pyp_utils/load() has been removed.
''' 05/13/2005 Updated __version__.py.
''' 05/06/2005 Added support for psyco metaclasses to pyp_newclasses. If the psyco optimizing compiler for Python is installed on your system all of the methods in the classes defined in pyp_newclasses will automatically be bound by psyco. More details may be found in the psyco documentation (http://psyco.sourceforge.net/).
+++ 05/06/2005 Added two new routines to pyp_graphics: spy_matrix_pylab() and pcolor_matrix_pylab().
They are matplotlib implementations of the rmuller_spy_matrix_pil() and rmuller_pcolor_matrix_pil() functions, respectively. They are not well-tested, spy_matrix_pylab() seems to only use greyscale at the moment, and the output from pcolor_matrix_pylab() is rotated 90-degrees from what it should be. Patches are welcome.
+++ 05/04/2005 Added support for animal/sire/dam IDs as strings. There are new pedigree format codes (A, S, D) corresponding to these formats. The 'sepchar' character should NOT appear in the ID string; if it does, breakage will occur.
''' 05/04/2005 Added a new method, string_to_int(), to pyp_newclasses/NewAnimal() that converts any Python string to a string composed of the ASCII values of each character in the original string. The new string can be cast to an integer. I prefer this to the Python hash() function because the hash() function can return negative values.

CHANGES in PyPedal 2.0.0a16
===========================
''' 05/03/2005 Finally started updating the non-API documentation.
''' 05/03/2005 Added a new option, 'missing_parent', that allows the user to specify the value in the pedigree used to indicate missing/unknown parents. This defaults to '0', and whenever a parent ID matching the 'missing_parent' value is encountered '0' is recorded in the offspring's record.
''' 05/03/2005 Rewrote the script which autogenerates the documentation so that it can pull the HTML API into the LaTeX document used to create the PS and PDF manuals.
+++ 05/03/2005 Added a new class, NewAMatrix, to pyp_newclasses. This class is a wrapper around a Numarray matrix that provides convenience methods for saving and loading numerator relationship matrices.

CHANGES in PyPedal 2.0.0a15
===========================
''' 04/28/2005 Some minor cleanup in all files to enforce consistent usage of pyp_utils/pyp_nice_time() for date/time reporting.
''' 04/28/2005 Fixed a typo in pyp_newclasses/NewAnimal.__init__() that caused an error when trying to read coefficients of inbreeding from a pedigree file.
''' 04/28/2005 Added a new option, 'pedigree_is_renumbered', which indicates whether or not a pedigree has been renumbered. This should NOT be confused with the 'renumber' flag, which indicates that a pedigree should be renumbered. This is an informational flag, not a command flag, and is not documented in OPTIONS.
+++ 04/28/2005 Added a new method, renumber(), to pyp_newclasses/NewPedigree. This is used by pyp_newclasses/NewPedigree.load() and pyp_metrics/effective_founders_lacy() to renumber pedigrees. It will eventually be used by any routine that needs to renumber the pedigree.
''' 04/28/2005 pyp_metrics/effective_founders_lacy() will renumber a pedigree passed to it if the pedigree's 'renumber' flag is set to 0, update the ID maps, and assign offspring. These actions are noted in the logfile and fix the impact of the pyp_utils/assign_offspring() side-effect. A better solution is probably to run pyp_utils/assign_offspring() in the background whenever a pedigree is renumbered.
--- 04/28/2005 While investigating Edward H. Hagen's bug report I found a problem in pyp_utils/assign_offspring(). If you pass it an un-renumbered pedigree, offspring were getting assigned to their parents' unks dictionaries but the dictionaries were not being "cleaned out" before the updating. This resulted in two sets of offspring IDs in the same dictionary, the original IDs and the renumbered IDs. Further downstream, for example in pyp_metrics/effective_founders_lacy(), this causes problems with pyp_metrics/descendants() and pyp_metrics/founder_descendants() such that incorrect answers are returned.
''' 04/28/2005 Edward H. Hagen reported an error in pyp_metrics/effective_founders_lacy() that has been fixed.
When a renumbered pedigree is used, the renumbered founder IDs need to be looked up from the idmap dictionary in the NewPedigree object. A similar change was made to pyp_metrics/founder_descendants().

CHANGES in PyPedal 2.0.0a14
===========================
''' 04/27/2005 An OPTIONS file, which describes the keyword options that PyPedal currently understands, is now included in the distribution file.
''' 04/27/2005 pyp_newclasses/NewPedigree.load() calls pyp_utils/reorder() instead of pyp_utils/fast_reorder() if the input pedigree file does not contain birth year or birth date and if you set the option 'slow_reorder' to 1. The new default behavior is to use the slower, but more-likely-to-be-correct, reorder() routine unless you are more concerned with speed than correctness. The pad_id() method in pyp_newclasses/NewAnimal uses the animal ID and birth year to form an ID used by pyp_utils/fast_reorder() for quick sorting; if your pedigree file is numbered such that offspring always have larger IDs than their parents and your birthyears (if provided) are correct (that is, parents always born BEFORE offspring) then pyp_utils/fast_reorder() works fine, so it is not completely useless. If you do not provide birthyears in your pedigree file but your parent IDs are always smaller than your animal IDs you will likewise be okay. Messy (i.e. real) pedigrees are likely to have errors that could give incorrect results with fast_reorder(). Large pedigrees should be reordered and renumbered and written to a file. That way you only have to pay the performance penalty for slow, but correct, renumbering once.
*** 04/27/2005 Much rewriting of pyp_utils/reorder(). In fact, the routine has been completely rewritten. While it is still noticeably slower than pyp_utils/fast_reorder(), it is guaranteed to put animals in the correct order. Barring pedigree errors, of course.
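### Note on the 04/27/2005 reorder() entries: a minimal sketch of what "correct order" means, namely that every known parent appears before its offspring. This is not the rewritten pyp_utils/reorder(); the dictionary layout, the function name, and the '0' missing-parent code used here are assumptions for illustration.

```python
def reorder_parents_first(pedigree, missing="0"):
    """Return animal IDs ordered so known parents precede their offspring.

    `pedigree` maps animal ID -> (sire ID, dam ID). Hypothetical layout,
    not PyPedal's own data structure.
    """
    ordered, placed = [], set()
    remaining = list(pedigree)
    while remaining:
        deferred = []
        for animal in remaining:
            # Only parents that have their own record constrain the order.
            parents = [p for p in pedigree[animal]
                       if p != missing and p in pedigree]
            if all(p in placed for p in parents):
                ordered.append(animal)
                placed.add(animal)
            else:
                deferred.append(animal)
        # A full pass that places nobody means an animal is its own ancestor.
        if len(deferred) == len(remaining):
            raise ValueError("pedigree contains a loop; cannot reorder")
        remaining = deferred
    return ordered
```

The repeated passes make this quadratic in the worst case, which is the kind of trade-off described above: slower than sorting on a padded ID, but it does not depend on ID numbering or birth years being right.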
--- 04/27/2005 There is now a known bug with pyp_utils/fast_reorder(): in pedigrees with no birth years or birth dates, animals whose parents' ID numbers are larger than their own are reordered incorrectly. This first manifested itself in screwed-up inbreeding calculations.
''' 04/27/2005 A new pedigree format code, 'A', has been added to support alleles.
''' 04/27/2005 The PEDIGREE_FORMAT_CODES file is now included in the distribution file.

CHANGES in PyPedal 2.0.0a13
===========================
''' 04/26/2005 pyp_newclasses/NewPedigree.save() accepts an option, idformat, that specifies which animal, sire, and dam IDs are written. The 'o' (original) option writes a pedigree with the original IDs as read from the original input pedigree file. The 'r' (renumbered) option will write a pedigree file containing renumbered animal, sire, and dam IDs.
''' 04/26/2005 pyp_newclasses/NewPedigree.save() accepts an option, outformat, that specifies how the saved pedigree is written. The 'o' (original) option writes a pedigree with the same pedformat as the original input pedigree file; this is useful if you have computed CoI, inferred sex, and that kind of thing. The 'l' (long) option will write a pedigree file containing all known fields in the animal object for which there are pedigree format codes (see the file PEDIGREE_FORMAT_CODES).
''' 04/26/2005 In pyp_newclasses/NewPedigree.__init__() the default logfile name is now .log.
''' 04/26/2005 Some changes were made to layout options in pyp_graphics/draw_pedigree(). Pedigrees are now drawn landscaped on US letter-sized pages (8.5 in x 11 in) and will, in theory, be tiled across pages if they cannot fit on a single page. This does not work as well as hoped, but I am working on it.
''' 04/26/2005 pyp_graphics/draw_pedigree() now takes an optional parameter, gdot, that tells draw_pedigree() whether or not to write the raw (dot language) representation of the pedigree to a file.
Code is written to a file named _pedigree.dot.
''' 04/25/2005 pyp_graphics/draw_pedigree() now takes an optional parameter, gsize, that tells draw_pedigree() whether or not to write the raw (dot language) representation of the pedigree to a file.
''' 04/25/2005 pyp_graphics/draw_pedigree() now takes an optional parameter, gsize, that specifies the size of the resulting graphic: 'f' (default) produces as large a graph as necessary to accommodate the layout and 'l' produces a diagram scaled to fit on a letter-sized sheet of paper.
+++ 04/25/2005 Added a new method, save(), to pyp_newclasses/NewPedigree(). This long-overdue feature lets you easily save a pedigree after, for example, computing CoI. It eliminates the need to perform time-consuming computations on pedigrees every time they are accessed by making it easy to store a "large format" PyPedal pedigree.
*** 04/25/2005 Fixed a bug in pyp_newclasses/NewPedigree.preprocess() in which records for sires and dams that appear in a pedigree, but which do not have individual entries in the pedigree file, were assigned birth years of 0 when dummy records were inserted into the pedigree. This was causing pyp_newclasses/NewAnimal.pad_id() to return a munged up paddedID that caused problems in pyp_utils/fast_reorder(). Tricky problem to find, that was.
''' 04/25/2005 Made a small change to pyp_newclasses/NewPedigree.preprocess() so that blank lines are caught and handled correctly. Before this fix a blank line with, say, an embedded TAB character would cause a fatal error because it was treated as a "regular" record.

CHANGES in PyPedal 2.0.0a12
===========================
''' 04/19/2005 Rolled back changes to pyp_newclasses/NewAnimal.pad_id() in response to a bug report that I could not duplicate.

CHANGES in PyPedal 2.0.0a11
===========================
??? 04/15/2005 I think that pyp_graphics/draw_pedigree() may be inserting a spurious node when drawing the pedigree, but I have not yet figured out where it is happening.
''' 04/15/2005 Removed references to "species" from pyp_newclasses/NewAnimal.printme() and pyp_newclasses/NewAnimal.stringme().
''' 04/15/2005 Tweaked pyp_newclasses/NewAnimal.pad_id() so that it casts values to INTs before concatenating them.
*** 04/15/2005 pyp_newclasses/NewPedigree.preprocess has been fixed to handle parents that do not have their own entry in the pedigree file. They are added to the pedigree with an unknown sire and dam.
''' 04/15/2005 Changed pyp_nrm/inbreeding() so that the output file written contains the original ID, the renumbered ID, and the CoI (in that order).
''' 04/15/2005 Added a dictionary, "backmap", to pyp_newclasses/NewPedigree that maps renumbered IDs (keys) to original IDs (values). It is the reverse direction of that provided by idmap.
+++ 04/15/2005 Added pyp_graphics/plot_pct_founders_by_year() to plot the frequency of founders in each birth year. NOTE: This requires matplotlib (http://matplotlib.sourceforge.net/). If matplotlib is not installed/cannot be imported, a value of 0 is returned.
''' 04/15/2005 Fixed pyp_graphics/draw_pedigree() so that it labels animals with their original IDs instead of their renumbered IDs.
''' 04/15/2005 Fixed pyp_graphics/draw_pedigree() so that it displays the gtitle.
''' 04/14/2005 Fixed a typo in pyp_newclasses/NewAnimal.__init__() that broke proper birthyear assignment.
+++ 04/14/2005 Added pyp_graphics/plot_founders_by_year() to write a histogram of number-of-founders by year of birth. NOTE: This requires matplotlib (http://matplotlib.sourceforge.net/). If matplotlib is not installed/cannot be imported, a value of 0 is returned.
''' 04/14/2005 Changed pyp_demog/BASE_DEMOGRAPHIC_YEAR from 1950 to 1900. This brings it in line with the default birthyear of 1900 used in pyp_newclasses.
+++ 04/14/2005 Added pyp_demog/founders_by_year() which provides a dictionary, keyed by birthyear, of the number of founders with each birthyear.
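### Note on the 04/14/2005 founders_by_year() entry: a minimal sketch of the kind of dictionary it describes. The record layout, the '0' missing-parent code, and the rule that a founder is an animal with both parents unknown are assumptions of this sketch, not taken from pyp_demog.

```python
def founders_by_year(animals, missing="0"):
    """Count founders per birth year.

    `animals` maps animal ID -> (sire ID, dam ID, birth year); this layout
    is hypothetical. A founder is assumed to have both parents unknown.
    """
    counts = {}
    for sire, dam, birth_year in animals.values():
        if sire == missing and dam == missing:
            counts[birth_year] = counts.get(birth_year, 0) + 1
    return counts
```

Animals with the default birth year of 1900 would all land in a single key, which is worth keeping in mind when reading a plot built from such counts.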

CHANGES in PyPedal 2.0.0a10
===========================
''' 04/14/2005 Fixed a typo in the MANIFEST.in file used to roll the distribution. The __init__.py file should be included now.
''' 04/14/2005 Added __version__.py to the distribution.
XXX 04/14/2005 Disabled gettext functionality in pyp_classes after receiving a report of problems under FreeBSD (thanks to Thomas von Hassel).

CHANGES in PyPedal 2.0.0a9
==========================
''' 03/30/2005 pyp_io/pyp_file_header() and pyp_io/pyp_file_footer() now work.
*** 03/30/2005 Added pyp_metrics/effective_founders_lacy(), which is a re-write of pyp_metrics/a_effective_founders_lacy() that works with the new object model. Correctness was verified by comparing results against Table 3 in Lacy (1989) and Tables I & II in Boichard et al. (1997). You can use examples/new_lacy.py to verify the results.
*** 03/30/2005 Fixed a nasty bug in pyp_metrics/a_effective_ancestors_definite() that was due to an indentation screwup when moving from one editor to another. Correctness was verified by comparing results against Tables I and II in Boichard et al. (1997). You can use examples/new_lacy.py to verify the results.
''' 03/30/2005 Added pyp_utils/pyp_nice_time() which returns the current date and time as a nicely-formatted string.
''' 03/29/2005 Added pyp_metrics/descendants() and pyp_metrics/founder_descendants() to support the rewritten pyp_metrics/effective_founders_lacy() routine.
''' 03/29/2005 Added pyp_utils/assign_offspring(), which adds offspring of an animal to that animal's 'unks' list.
!!! 03/28/2005 Stubbed pyp_io/pyp_file_header() and pyp_io/pyp_file_footer() in preparation for standardizing the output files written by PyPedal.
+++ 03/04/2005 Added pyp_graphics module. It currently includes three functions from the ASPN Python Cookbook (http://aspn.activestate.com/ASPN/Cookbook/Python/) for visualizing the sparsity and the elements of matrices.
I have also moved the draw_pedigree() function from pyp_utils to pyp_graphics. From now on, any functions related to visualization will go in pyp_graphics.
--- 02/24/2005 It looks like the sons and daus lists get screwed up when the pedigree is renumbered, but I think that it is a consequence of the item below.
--- 02/24/2005 When a pedigree that needs renumbering is read, pyp_utils/preprocess() throws an exception when trying to assign sex codes because it uses the sire's and dam's original IDs as keys. This represents fundamental breakage in the ordering of events in pedigree creation. I have sort-of hacked around this for the moment, but the bug is still there.
''' 02/23/2005 Added a new pedigree format code, asdgb, to pyp_utils/preprocess().
+++ 02/23/2005 Added pyp_metrics/generation_lengths_all() which computes the average generation interval in years for each of the four selection paths (sire-son, sire-daughter, dam-son, and dam-daughter) for all births of a parent's offspring.
+++ 02/23/2005 Added pyp_utils/assign_sexes() which iterates over a renumbered PyPedal pedigree to update sexes of sires and dams based on knowledge of their sons and daughters. This seems to catch cases that are missed in pyp_utils/preprocess(), which needs to be cleaned up.
''' 02/23/2005 Upon further examination, it seems like males and females are being correctly assigned. Hm...OK. Fixed a bug in pyp_utils/preprocess() that incorrectly assigned sires and dams with unknown parents to the sons and daus lists of the last animal in the pedigree. This was fixed by casting to an INT before a comparison with 0.
--- 02/11/2005 See examples/generations.py -- sons and daughters are not being correctly assigned to foo.sons and foo.daus.
--- 02/11/2005 Need to fix a bug in pyp_utils/new_preprocess() in which unknown sires and dams (animals with IDs of 0) were being put into male, female, son, and daughter lists.
''' 02/11/2005 Fixed a bug in pyp_utils/preprocess() in which unknown sires and dams (animals with IDs of 0) were being put into male, female, son, and daughter lists.
+++ 02/11/2005 Added pyp_metrics/generation_lengths() which computes the average generation interval in years for each of the four selection paths (sire-son, sire-daughter, dam-son, and dam-daughter) for the oldest (first-born) of parents.
!!! 02/11/2005 Added pyp_metrics/num_traced_gens(), pyp_metrics/num_equiv_gens(), and pyp_metrics/pyp_partial_inbreeding().
XXX 02/11/2005 Lots of code cleanup in pyp_classes. Removed pad_id() and renamed pad_id_new() to pad_id().
XXX 02/11/2005 Removed the originalID and species attributes from the Animal() class.

CHANGES in PyPedal 2.0.0a8
==========================
+++ 11/01/2004 Started working on an output-rendering framework that will easily allow for writing strings as HTML or text, depending on a variable set in pypedal.conf. Right now, use PYPEDAL_OUTPUT_TYPE, which is hard-coded in pyp_classes.
*** 07/21/2004 Major overhaul of pyp_utils/preprocess() pedigree format code handling.

CHANGES in PyPedal 2.0.0a7
==========================
''' 08/12/2004 Changed pyp_metrics/fast_a_coefficients() to catch exceptions when no relationship matrix is provided and the pedigree is too large for fast_a_matrix() to compute one. In these cases, a value of -999.9 is returned.
''' 08/12/2004 Changed pyp_metrics/a_effective_ancestors_indefinite() to catch exceptions when no relationship matrix is provided and the pedigree is too large for fast_a_matrix() to compute one. In these cases, a value of -999.9 is returned.
''' 08/12/2004 Changed pyp_metrics/a_effective_ancestors_definite() to catch exceptions when no relationship matrix is provided and the pedigree is too large for fast_a_matrix() to compute one. In these cases, a value of -999.9 is returned.
''' 08/12/2004 Changed pyp_metrics/a_effective_founders_boichard() to catch exceptions when no relationship matrix is provided and the pedigree is too large for fast_a_matrix() to compute one. In these cases, a value of -999.9 is returned.
''' 08/12/2004 Changed pyp_metrics/a_effective_founders_lacy() to catch exceptions when no relationship matrix is provided and the pedigree is too large for fast_a_matrix() to compute one. In these cases, a value of -999.9 is returned.
''' 08/12/2004 Made changes to pyp_metrics/a_coefficients() to catch exceptions in fast_a_matrix() or fast_a_matrix_r() when they cannot allocate a matrix. When an exception is caught all successive computations are performed on a 1x1 matrix whose value is -999.9. This is kind of hacky, but will prevent many problems.
''' 08/12/2004 Added summary statistics (mean/min/max) to the pyp_nrm/inbreeding() routine.
*** 08/12/2004 Changed pyp_classes/Pedigree.nus() to use dictionaries instead of lists; Changed pyp_classes/Pedigree.nud() to use dictionaries instead of lists; Changed pyp_classes/Pedigree.nug() to use dictionaries instead of lists; Changed pyp_classes/Pedigree.nuy() to use dictionaries instead of lists; Changed pyp_classes/Pedigree.nuf() to use dictionaries instead of lists. There are, as always, huge gains in large pedigrees from doing this. Why? Because, silly rabbit, you avoid looping over increasingly-large arrays for every animal in the pedigree. It is not a big win on a small pedigree, but on, e.g., an 800,000 animal pedigree it makes a very significant difference.
*** 08/02/2004 Changed pyp_utils/renumber() so that it checks sire and dam birthyears before renumbering. If the child has an earlier birthdate than a parent, that parent is set to unknown, '0'. This is a temporary fix pending a rewrite of the actual pedigree component of PyPedal. I am thinking that a dictionary of animal objects might be a better way to handle things than a simple list.
If everything was in a dictionary, for example, then it would be simple to check the sire and dam birthyears using a key->value lookup. As is, there is no reliable way to check those sorts of things unless the pedigree has been reordered and renumbered.
''' 07/31/2004 Added a new pedigree format code, asdbx, to pyp_utils/preprocess().
''' 07/31/2004 Changed pyp_classes/Animal() so that the default birthyear is 1900.
''' 07/31/2004 Added debug statements to several routines in pyp_utils.
*** 07/29/2004 Added pyp_utils/new_preprocess() which is the major rewrite of the pedigree format code handling that I have been promising for a while.
''' 07/21/2004 Added a species attribute to the Animal() class which defaults to 'u'.
+++ 07/21/2004 Added pyp_utils/reverse_string() to reverse a string. Useful when you have a string on which you cannot readily use string.split().
+++ 07/21/2004 Added pyp_demog/age_distribution() for computing the distribution of ages in a population.
+++ 07/21/2004 Added pyp_utils/simple_histogram_dictionary() for creating a simple text-based histogram from a dictionary of integral counts.
''' 07/21/2004 Changed Animal/__init__() so that birthyears default to -999 when they are not specified in the pedigree file. This was done to support age computations in the demographics module.
''' 07/21/2004 Added age and alive attributes to the Animal() class which default to -999.
+++ 07/21/2004 Added a new file, pyp_demog.py, which contains some routines for demographic computations, such as age distributions. There are going to be some potentially hairy issues with date handling. Maybe. If I don't get lazy and just say that everything is on a year basis.
''' 07/21/2004 Added a stub file, pyp_peel.py, for forthcoming support for pedigree peeling.
''' 07/20/2004 Added some notes to pyp_utils/preprocess() detailing an idea for greatly improving the way in which pedigree format strings are handled. No code has been written yet, but the idea is on the table.
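### Note on the 07/21/2004 simple_histogram_dictionary() entry: a minimal sketch of a histogram drawn from a dictionary of integer counts. The function name, the output layout, and the `symbol` parameter are assumptions; the real routine in pyp_utils may format its output differently.

```python
def simple_histogram(counts, symbol="*"):
    """Render {key: integer count} as one bar of symbols per key.

    Hypothetical helper for illustration; keys are shown in sorted order.
    """
    lines = []
    for key in sorted(counts):
        # One symbol per unit of count, preceded by the key and a TAB.
        lines.append("%s\t%s" % (key, symbol * counts[key]))
    return "\n".join(lines)
```

Called on a dictionary such as the one produced by an age distribution, this gives a quick look at the shape of the data without needing matplotlib.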
*** 07/20/2004 I *think* that pyp_metrics/pedigree_completeness() works correctly now.
''' 07/20/2004 Added a breedcode attribute to the Animal() class which defaults to 'u'.
''' 07/20/2004 Fixed recurse_pedigree_n() so that it recurses to the correct depth.
+++ 07/20/2004 Added recurse_pedigree_onesided() to pyp_nrm. It recurses to return the complete sire or dam side of an animal's pedigree.
*** 06/11/2004 Added stringme() methods to the Animal() and Pedigree() classes to support integration with the GUI. The output returned is identical to the printme() methods.
+++ 06/11/2004 Started working on a GUI for PyPedal, pyp_gui. It requires that you have the wxPython toolkit installed. How to do that is up to you.
*** 05/26/2004 Added pyp_nrm/recurse_pedigree_n(), which returns a pedigree of a specified depth.
''' 05/26/2004 Fixed pyp_classes/Animal() so that animal names are actually assigned correctly in __init__().
*** 05/26/2004 Added pyp_utils/set_generation() to infer the generation to which each individual in the pedigree belongs. This was added to make pyp_metrics/pedigree_completeness() easier to code as the igen is really just a count of the depth of an individual's pedigree.
''' 05/26/2004 Added an igen (inferred generation) attribute to the Animal() class which defaults to -999. A non-negative value will be assigned to this attribute by pyp_utils/set_generation().
''' 05/26/2004 Added a pedcomp attribute to the Animal() class which defaults to -999.9. A non-negative value will be assigned to this attribute by pyp_metrics/pedigree_completeness().
--- 05/26/2004 There is a bug in pyp_utils/renumber() such that the offspring stored in myped[i].sons, myped[i].daus, and myped[i].unks are not updated to reflect changes in animal IDs when a pedigree is renumbered.
--- 05/26/2004 There is still a bug in pyp_utils/preprocess() where the sex of an animal is assigned based on a "best guess".
*** 05/26/2004 Added pyp_utils/load_pedigree(), which is a wrapper around several pedigree processing routines. It is a convenient way to roll several common operations (load, reorder, renumber, etc.) into a single call.
''' 05/26/2004 Updated pyp_utils/renumber() so that the renumberedID attribute is set as each animal is renumbered.
''' 05/26/2004 Added originalID and renumberedID properties to the Animal() class with the eventual goal of eliminating ID maps from the renumbering code. originalID defaults to animalID and renumberedID defaults to -999.
''' 05/26/2004 Made small changes to Animal.printme() method to add new attributes.
''' 05/26/2004 Changed pyp_metrics/effective_founder_genomes() so that the quiet flag suppresses all output to stdio.

CHANGES in PyPedal 2.0.0a6
==========================
*** 05/25/2004 Added pyp_metrics/effective_founder_genomes() for running gene-drop simulations on a pedigree to determine the effective number of founder genomes as defined in Lacy (1989) and Boichard et al. (1997).
*** 05/25/2004 Added pyp_metrics/assign_founder_alleles() to be used for setting-up gene-drop simulations on pedigrees for which no founder alleles are provided in the input file.
*** 05/25/2004 Added a new pedigree format code, 'asdt', to pyp_classes/preprocess() to support simple pedigrees with genotype data (two alleles only).
''' 05/25/2004 Added an alleles attribute to pyp_classes/Animal() to support gene dropping.
*** 05/25/2004 Added pyp_utils/sort_dict_by_keys() to return a dictionary where the keys are sorted in ascending order (from "Python Cookbook", P. 39).

CHANGES in PyPedal 2.0.0a5
==========================
*** 05/06/2004 Added pyp_nrm/fast_a_coefficients() for testing some loop optimization.
*** 05/06/2004 Moved some code in pyp_utils/preprocess() outside of a loop in which it did not belong for a HUGE win in performance!
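### Note on the 05/25/2004 gene-drop entries: a minimal sketch of one gene-drop replicate, not the code in pyp_metrics. Everything here is an assumption made for illustration: the list-of-index-pairs pedigree layout, giving each unknown parent a fresh unique allele, and computing the effective number of founder genomes as 1 / (2 * sum of squared founder-allele frequencies) in the chosen cohort.

```python
import random


def gene_drop(parents, rng):
    """Run one gene-drop replicate.

    `parents` lists (sire, dam) index pairs for animals 0..n-1, ordered with
    parents before offspring; None marks an unknown parent (hypothetical layout).
    Returns a list of (allele, allele) tuples, one per animal.
    """
    genotypes = []
    next_allele = 0
    for sire, dam in parents:
        pair = []
        for parent in (sire, dam):
            if parent is None:
                # Unknown parent: hand out a new, unique founder allele.
                pair.append(next_allele)
                next_allele += 1
            else:
                # Known parent: transmit one of its two alleles at random.
                pair.append(rng.choice(genotypes[parent]))
        genotypes.append(tuple(pair))
    return genotypes


def effective_founder_genomes(genotypes, cohort):
    """Effective number of founder genomes in `cohort` for one replicate."""
    alleles = [allele for i in cohort for allele in genotypes[i]]
    total = float(len(alleles))
    freq = {}
    for allele in alleles:
        freq[allele] = freq.get(allele, 0) + 1
    return 1.0 / (2.0 * sum((c / total) ** 2 for c in freq.values()))
```

A single replicate is noisy, so in practice the value would be averaged over many replicates, each driven by its own draws from `rng` (for example `random.Random(seed)`).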
''' 05/06/2004 Made changes to pyp_utils/preprocess() to support the changed attribute types in pyp_classes/Animal/__init__().
''' 05/06/2004 Changed self.sons, self.daus, and self.unks from lists to dictionaries.
''' 05/03/2004 Tweaked pyp_classes/Pedigree/nus() and pyp_classes/Pedigree/nud() so that the counts computed do NOT include unknown sires or dams.
''' 04/27/2004 Added some "try...except" code to pyp_utils/preprocess() so that non-renumbered pedigrees do not cause the sex assignment code to halt the program.
''' 04/27/2004 Added a "method" parameter to pyp_metrics/a_coefficients() so that the user can specify which type of relationship they would like -- the NRM using pyp_nrm/fast_a_matrix() or the complete (inbreeding-adjusted) RM using pyp_nrm/fast_a_matrix_r(). Method takes the values 'frm' (full relationship matrix) or 'nrm' (numerator relationship matrix).
+++ 04/27/2004 Added pyp_nrm/fast_a_matrix_r(), which corrects the relationships in A for the inbreeding of the parents. The A matrix returned by fast_a_matrix_r is, therefore, NOT a numerator relationship matrix. It is a matrix of coefficients of relationship.

CHANGES in PyPedal 2.0.0a4
==========================
''' 04/23/2004 Possibly corrected a subtle bug in the Animal.pad_id_new method that resulted in incorrect sorting in some cases.
+++ 04/23/2004 Added pyp_metrics/mating_coi(), which computes the coefficient of inbreeding of offspring that would result from a mating between two animals.
+++ 04/23/2004 Added pyp_metrics/relationship(), which computes the coefficient of relationship between two animals.
''' 04/23/2004 Added three new attributes to Animal() objects: self.sons, self.daus, and self.unks, which are lists to store renumbered animalIDs of sons and daughters of an animal, as well as the IDs of offspring with unknown sex.
''' 04/20/2004 Added a 'name' attribute to the Animal() object to accommodate, e.g., dog breeders.
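### Note on the 04/23/2004 and 04/27/2004 entries: a minimal sketch of how a numerator relationship matrix relates to mating_coi() and relationship(). It uses the standard tabular method rather than PyPedal's fast_a_matrix() code; the function signatures and the list-of-index-pairs pedigree layout are assumptions for illustration.

```python
from math import sqrt


def tabular_nrm(parents):
    """Build the numerator relationship matrix A by the tabular method.

    `parents` lists (sire, dam) index pairs for animals 0..n-1, ordered with
    parents before offspring; None marks an unknown parent (hypothetical layout).
    """
    n = len(parents)
    a = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(parents):
        # Diagonal: 1 + F_i, where F_i is half the parents' relationship.
        both_known = s is not None and d is not None
        a[i][i] = 1.0 + (0.5 * a[s][d] if both_known else 0.0)
        for j in range(i):
            # Off-diagonal: average of j's relationships with i's parents.
            value = 0.0
            if s is not None:
                value += 0.5 * a[j][s]
            if d is not None:
                value += 0.5 * a[j][d]
            a[i][j] = a[j][i] = value
    return a


def mating_coi(a, i, j):
    """CoI of a hypothetical offspring of i and j: half their relationship in A."""
    return 0.5 * a[i][j]


def relationship(a, i, j):
    """Wright's coefficient of relationship: a_ij scaled by both diagonals."""
    return a[i][j] / sqrt(a[i][i] * a[j][j])
```

The scaling by the diagonal elements in relationship() is why a matrix of such coefficients is not itself a numerator relationship matrix: its diagonal is always 1, so the inbreeding information carried by A is removed.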
+++ 04/20/2004 Added a new procedure, pyp_utils/draw_pedigree(), to draw pedigrees using the pydot interface to Graphviz. If the necessary modules are not installed the procedure will return a result of '0' rather than exploding. :-)
''' 04/20/2004 Beginnings of a tutorial in the PyPedal manual.
''' 04/19/2004 Corrected a minor bug in pyp_nrm/inbreeding_tabular() that resulted in negative CoI being written to the returned dictionary.
''' 04/19/2004 Enhanced pyp_nrm/inbreeding() to update Animal() instances with the CoI computed by that routine.
''' 04/19/2004 Enhanced pyp_utils/preprocess() to assign sex codes to Animal() instances based on the inferred sex iff no sex code was specified in the pedigree file.

CHANGES in PyPedal 2.0.0a3
==========================
+++ 04/19/2004 Added a new routine, pyp_metrics/a_effective_ancestors, that will call either a_effective_ancestors_definite() or a_effective_ancestors_indefinite() depending on the size of the pedigree passed in. Currently, the cutoff is 1,000.
+++ 04/19/2004 Added a new routine, pyp_metrics/a_effective_ancestors_indefinite(), that attempts to estimate upper and lower bounds for f_a in large pedigrees rather than computing all contributions explicitly. a_effective_ancestors_indefinite() is NOT WELL TESTED. There are almost certainly bugs; the routine does not iterate. All I can really tell you for sure is that it sometimes returns values that are extreme underestimates of f_a. It is supposed to work reasonably well on large pedigrees rather than small ones.
*** 04/19/2004 FINALLY fixed all known bugs in the tragically-written pyp_metrics/a_effective_ancestors_definite() routine!
+++ 04/16/2004 Added pyp_utils/set_ancestor_flag() to be used to set ancestor flags.
+++ 04/16/2004 Added an ancestor flag to pyp_classes/Animal/__init__().
*** 04/15/2004 Fixed bugs in pyp_metrics/a_effective_founders_lacy() and pyp_metrics/a_effective_founders_boichard() that were introduced by changes in pyp_utils/preprocess().
*** 04/15/2004 Changed pyp_utils/preprocess() so that pedigree entries are not made for unknown parents by the "add parent records to the pedigree if they are not already there" routine.
+++ 04/15/2004 Added pyp_metrics/common_ancestors() which returns a list of all of the ancestors that two animals share in common.
+++ 04/15/2004 Added pyp_metrics/related_animals() which recurses through a pedigree to build a list of all animals related to a given animal, if any.

CHANGES in PyPedal 2.0.0a2
==========================
***/+++ 04/??/2004 Refactored and added new code to pyp_nrm to support VanRaden's iterative method for computing CoI in large pedigrees: inbreeding(), inbreeding_vanraden(), recurse_pedigree(), and inbreeding_tabular().

CHANGES in PyPedal 2.0.0a1
==========================
--- 03/31/2004 pyp_utils/fast_a_matrix blows up when passed a pedigree of size 80,000 or so;
+++ 03/31/2004 pyp_utils/pedigree_range was added -- allows the easy creation of a pedigree containing animals 1 through from a large pedigree. This will be used to determine how large a pedigree PyPedal can currently handle;
+++ 03/31/2004 pyp_utils/preprocess rewritten to use dictionary lookups instead of list lookups -- improved the performance of this routine by about 2 orders of magnitude;
+++ 03/31/2004 pyp_utils/preprocess now accepts a delimiter to accommodate pedigree files that are not CSV;
+++ 03/31/2004 pyp_utils/preprocess now properly handles base animals that do not have an entry in the pedigree file, that is, who only appear as a sire or dam in another animal's record;
*** 04/06/2003 Complete rewrite of PyPedal begun. Major changes include incorporation of metadata into the pedigree object.
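### Note on the 04/15/2004 common_ancestors() entry: a minimal sketch of finding the ancestors two animals share. This is not the pyp_metrics code; the dictionary layout, the helper name ancestors(), the '0' missing-parent code, and the sorted return value are assumptions for illustration.

```python
def ancestors(pedigree, animal, missing="0"):
    """Collect every known ancestor of `animal`.

    `pedigree` maps animal ID -> (sire ID, dam ID); hypothetical layout.
    """
    found = set()
    stack = [animal]
    while stack:
        current = stack.pop()
        # Animals without their own record are treated as having unknown parents.
        for parent in pedigree.get(current, (missing, missing)):
            if parent != missing and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def common_ancestors(pedigree, first, second, missing="0"):
    """Return the sorted list of ancestors shared by two animals."""
    shared = ancestors(pedigree, first, missing) & ancestors(pedigree, second, missing)
    return sorted(shared)
```

An explicit stack is used instead of recursion so that deep pedigrees do not run into Python's recursion limit.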

CHANGES in PyPedal 0.0.1
========================
*** First version released to the general public;
--- a_effective_founders_boichard() does not return correct answers. I have not yet found the error in my implementation of Boichard's algorithm;
--- a_effective_ancestors_definite() does not return correct answers. I have not yet found the error in my implementation of Boichard's algorithm;
??? a_effective_founders_lacy() is believed to work correctly;
--- a_matrix() is deprecated in favor of fast_a_matrix(). It will return a properly-formed numerator relationship matrix, but it is extremely slow (orders-of-magnitude slower than fast_a_matrix());
--- reorder() is deprecated in favor of fast_reorder();
??? Neither reorder() nor fast_reorder() should be used on a pedigree returned by renumber() unless the results are checked very carefully. In some cases, renumbered pedigrees are reordered incorrectly. This is due to a bug in the ID padding algorithm which is believed to be fixed, but more testing is needed.