Robust Phylogenetic Inference over Parallel and Distributed Digital Evolution Systems

Retention visualization for hereditary stratigraphy policy.

The capability to detect phylogenetic cues within digital evolution has become increasingly necessary in both applied and scientific contexts. These cues unlock post hoc insight into evolutionary history, particularly with respect to ecology and selection pressure, and can also be harnessed to drive digital evolution algorithms as they unfold. However, parallel and distributed evaluation complicates, among other concerns, maintenance of an evolutionary record. Existing phylogenetic record keeping requires inerrant and complete collation of birth and death reports within a centralized data structure. Such perfect tracking approaches are brittle to data loss or corruption and impose communication overhead.

A phylogenetic inference approach, as opposed to phylogenetic tracking, has potential to improve scalability and robustness. Under such a model, history is estimated from comparison of available extant genomes — aligning with the familiar paradigm of phylogenetic work in wet biology. However, this raises the question of how best to design digital genomes to facilitate phylogenetic inference.

This work introduces a new technique, called hereditary stratigraphy, that works by attaching a set of immutable historical “checkpoints” — referred to as strata — as an annotation on evolving genomes. Checkpoints can be strategically discarded to reduce annotation size at the cost of increasing inference uncertainty. An accompanying software library, hstrat, provides a plug-and-play implementation of hereditary stratigraphy that can be incorporated into any digital evolution system.
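To make the checkpoint idea concrete, the following is a minimal toy sketch in Python. It does not use or reproduce the hstrat API: the `ToyColumn` class, its `stride` retention rule, and the `mrca_generation_bounds` helper are all hypothetical names invented for illustration. It assumes each stratum is a random 64-bit value stamped once per generation, and that a simple policy keeps only generations divisible by `stride` plus the newest one.

```python
import random


class ToyColumn:
    """Hypothetical genome annotation (illustration only, not the hstrat API)."""

    def __init__(self, rng, stride=4, strata=None):
        self.rng = rng
        self.stride = stride
        if strata is None:
            strata = {0: rng.getrandbits(64)}  # founding checkpoint
        # Maps generation -> random "differentia" stamped at that generation.
        self.strata = dict(strata)
        self.generation = max(self.strata)

    def clone_descendant(self):
        """Inherit the parent's strata, append a new one, then prune."""
        child = ToyColumn(self.rng, self.stride, self.strata)
        child.generation = self.generation + 1
        child.strata[child.generation] = self.rng.getrandbits(64)
        # Assumed retention policy: keep every stride-th generation and
        # the newest stratum; discard everything else to bound size.
        child.strata = {
            g: d
            for g, d in child.strata.items()
            if g % child.stride == 0 or g == child.generation
        }
        return child


def mrca_generation_bounds(a, b):
    """Return (last matching generation, first mismatching generation).

    The most recent common ancestor lies in the half-open range
    [lower, upper). A lower bound of None means no shared ancestry
    was detected.
    """
    shared = sorted(set(a.strata) & set(b.strata))
    lower = None
    for g in shared:
        if a.strata[g] != b.strata[g]:
            return lower, g
        lower = g
    return lower, min(a.generation, b.generation) + 1


rng = random.Random(1)
ancestor = ToyColumn(rng)
for _ in range(10):  # shared history up to generation 10
    ancestor = ancestor.clone_descendant()

left = right = ancestor
for _ in range(7):  # two lineages diverge for 7 generations
    left = left.clone_descendant()
    right = right.clone_descendant()

bounds = mrca_generation_bounds(left, right)
print(bounds)
```

In this run the two lineages truly diverge after generation 10, and the comparison brackets that point between generations 8 and 12. The gap reflects the discarded strata: a smaller `stride` would tighten the estimate at the cost of a larger annotation.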

Publications & Software

2025 Downstream: efficient cross-platform algorithms for fixed-capacity stream downsampling
arXiv


Authors	Connor Yang, Joey Wagner, Emily Dolson, Luis Zaman, Matthew Andres Moreno
Date	June 17th, 2025
DOI	10.48550/arXiv.2506.12975
Venue	arXiv

Abstract

Due to ongoing accrual over long durations, a defining characteristic of real-world data streams is the requirement for rolling, often real-time, mechanisms to coarsen or summarize stream history. One common data structure for this purpose is the ring buffer, which maintains a running downsample comprising most recent stream data. In some downsampling scenarios, however, it can instead be necessary to maintain data items spanning the entirety of elapsed stream history. Fortunately, approaches generalizing the ring buffer mechanism have been devised to support alternate downsample compositions, while maintaining the ring buffer’s update efficiency and optimal use of memory capacity. The Downstream library implements algorithms supporting three such downsampling generalizations: (1) “steady,” which curates data evenly spaced across the stream history; (2) “stretched,” which prioritizes older data; and (3) “tilted,” which prioritizes recent data. To enable a broad spectrum of applications ranging from embedded devices to high-performance computing nodes and AI/ML hardware accelerators, Downstream supports multiple programming languages, including C++, Rust, Python, Zig, and the Cerebras Software Language. For seamless interoperation, the library incorporates distribution through multiple packaging frameworks, extensive cross-implementation testing, and cross-implementation documentation.
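As a rough illustration of the contrast drawn in the abstract, the sketch below compares a ring buffer with a naive fixed-capacity scheme that keeps evenly spaced items. Neither function is taken from the Downstream library, and both names are hypothetical. The "steady" variant here uses simple stride doubling, a swapped-in textbook technique that leaves slots empty after each thinning step, so it should not be read as Downstream's algorithm, which the abstract describes as making optimal use of memory capacity.

```python
def ring_buffer_downsample(stream, capacity):
    """Keep only the most recent `capacity` items, overwriting the oldest."""
    buffer = []
    for t, item in enumerate(stream):
        if len(buffer) < capacity:
            buffer.append(item)
        else:
            buffer[t % capacity] = item  # slot of the oldest item
    return buffer  # storage order, not necessarily arrival order


def naive_steady_downsample(stream, capacity):
    """Keep evenly spaced items across the whole stream (naive sketch).

    Every `stride`-th item is retained; when the buffer fills, every
    other retained item is dropped and the stride doubles.
    """
    kept, stride = [], 1
    for t, item in enumerate(stream):
        if t % stride != 0:
            continue
        if len(kept) == capacity:
            kept = kept[::2]  # thin out retained items
            stride *= 2
            if t % stride != 0:
                continue  # current item no longer falls on the stride
        kept.append(item)
    return kept


recent = ring_buffer_downsample(range(16), capacity=4)
spread = naive_steady_downsample(range(16), capacity=4)
print(recent)
print(spread)
```

For a 16-item stream and capacity 4, the ring buffer ends up holding items 12 through 15, while the stride-doubling variant holds items 0, 4, 8, and 12, spanning the full elapsed history.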


Authors	Matthew Andres Moreno, Santiago Rodriguez-Papa, Emily Dolson
Date	May 1st, 2025
DOI	10.1162/artl_a_00470
Venue	Artificial Life

Authors	Matthew Andres Moreno, Anika Ranjan, Emily Dolson, Luis Zaman
Date	December 5th, 2024
DOI	10.1109/ALIFE-CIS64968.2025.10979833
Venue	2025 IEEE Symposium on Computational Intelligence in Artificial Life and Cooperative Intelligent Systems

Authors	Matthew Andres Moreno, Connor Yang
Date	December 5th, 2024
Venue	Python package published via PyPI

Authors	Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman
Date	November 16th, 2024
Venue	The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24)

Authors	Matthew Andres Moreno, Mark T. Holder, Jeet Sukumaran
Date	September 23rd, 2024
DOI	10.21105/joss.06943
Venue	Journal of Open Source Software

Authors	Matthew Andres Moreno, Luis Zaman, Emily Dolson
Date	September 10th, 2024
DOI	10.48550/arXiv.2409.06199
Venue	arXiv

Authors	Emily Dolson, Santiago Rodriguez-Papa, Matthew Andres Moreno
Date	May 15th, 2024
DOI	10.48550/arXiv.2405.09389
Venue	arXiv

Authors	Matthew Andres Moreno
Date	February 18th, 2024
Venue	Genetic Programming Theory and Practice XX

Authors	Matthew Andres Moreno, Emily Dolson, Santiago Rodriguez-Papa
Date	July 24th, 2023
DOI	10.1162/isal_a_00694
Venue	The 2023 Conference on Artificial Life

Authors	Matthew Andres Moreno, Emily Dolson, Charles Ofria
Date	November 7th, 2022
DOI	10.21105/joss.04866
Venue	Journal of Open Source Software