With the rise of improved sequencing technologies, genomics is expanding from a single-reference-per-species paradigm into a more comprehensive pan-genome approach in which multiple individuals are represented and analyzed together. Here we introduce splitMEM, a novel O(n log n) time and space algorithm that directly constructs the compressed de Bruijn graph for a pan-genome of total length n. To achieve this time complexity, we augment the suffix tree with suffix skips, a new construct that allows us to traverse several suffix links in constant time, and use them to efficiently decompose maximal exact matches (MEMs) during a suffix tree traversal.
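To make the target data structure concrete, the sketch below builds the node labels of a compressed de Bruijn graph the naive way: it enumerates every k-mer, links k-mers that are adjacent in some input sequence, and then merges non-branching chains into single nodes. This is not the splitMEM algorithm (it uses no suffix tree, suffix skips, or MEM decomposition) and is only meant to show what the output looks like on a toy input. The function name `compressed_de_bruijn` and the example sequences are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def compressed_de_bruijn(sequences, k):
    """Return sorted node labels of a compressed de Bruijn graph.

    Naive illustration only: this is NOT splitMEM. Nodes are k-mers,
    edges join k-mers adjacent in some sequence, and maximal
    non-branching chains are merged into one node each.
    """
    succ = defaultdict(set)  # k-mer -> set of k-mers that follow it
    pred = defaultdict(set)  # k-mer -> set of k-mers that precede it
    kmers = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmers.add(seq[i:i + k])
            if i > 0:
                a, b = seq[i - 1:i - 1 + k], seq[i:i + k]
                succ[a].add(b)
                pred[b].add(a)

    def starts_chain(kmer):
        # A k-mer starts a chain unless it has exactly one predecessor
        # and that predecessor has exactly one successor.
        p = pred.get(kmer, set())
        if len(p) != 1:
            return True
        (q,) = p
        return len(succ[q]) != 1

    seen = set()

    def walk(start):
        # Extend the chain while the path stays non-branching.
        label, cur = start, start
        seen.add(cur)
        while len(succ.get(cur, ())) == 1:
            (nxt,) = succ[cur]
            if len(pred[nxt]) != 1 or nxt in seen:
                break
            label += nxt[-1]
            cur = nxt
            seen.add(cur)
        return label

    nodes = [walk(km) for km in sorted(kmers) if starts_chain(km)]
    # Any k-mers left over lie on isolated cycles with no branching.
    for km in sorted(kmers):
        if km not in seen:
            nodes.append(walk(km))
    return sorted(nodes)


if __name__ == "__main__":
    # Two toy "genomes" that share the segment ACGG.
    for label in compressed_de_bruijn(["TTACGGA", "CCACGGT"], k=3):
        print(label)
```

In this toy input the shared segment ACGG, which is also the maximal exact match between the two sequences, ends up as a single node, while the sequence-specific flanks become separate nodes. The naive approach touches every k-mer explicitly; the abstract's contribution is obtaining the same graph directly from a suffix tree in O(n log n) time and space.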
Categories
Bio-Informatics
License
Apache License V2.0