
Open-source Software

Listed below are open-source software projects that I either (a) authored or co-authored, or (b) advised on or contributed to in design or implementation. Also included is a list of my minor updates to other open-source software.

General Purpose Software

sope

Rust library for MPI. Based on the mxx C++ library, it provides simplified, type-safe bindings to common MPI operations with error handling. It also includes a collection of scalable, high-performance standard algorithms for parallel distributed-memory architectures, such as sorting and distribution. Available at github.

Awitree

A simple interface to jstree, a JavaScript library for constructing trees, built on the AnyWidget protocol. This library can be used to construct interactive trees within Jupyter and Marimo notebooks. Python/JavaScript sources are available at github.

Research Software (as Author/Co-author)

ParEnsNet

Constructs ensemble gene networks for large-scale single-cell data in parallel. Implemented in Rust, this library includes a suite of parallel gene network construction algorithms: PIDC, GRNBoost, and MI-based methods. Source available at github.

SCEMENT

Software for scalable and memory-efficient integration of large-scale single-cell RNA-sequencing data. Implemented in C++ with a Python wrapper. Source available at https://github.com/AluruLab/scement.

Apache Airavata Cerebrum

An integrated computational neuroscience framework whose goal is to build computational neuroscience models from large single-cell datasets drawn from brain atlases. Python source is available at github.

EnGRaiN

EnGRaiN is the first supervised machine learning method to construct ensemble networks using small training datasets of true positives and true negatives among gene pairs. Python source available at github.

ParFastAAI

A C++ implementation of a parallel algorithm for fast computation of Average Amino-acid Identity for bacterial and archaeal genomes. Available at https://github.com/AluruLab/parfastaai.

gbrunner

Utilities and runner scripts for generating gene regulatory networks with arboreto, XGBoost and LightGBM. Python source available at github.

Ardmore

Ardmore is a suite of docker containers and aggregation scripts for simulated runs that enables reproducibility in evaluation of gene network construction. Source available at github.

ADYAR

ADYAR implements the alignment-free heuristic for fast sequence comparisons with applications to phylogeny reconstruction. C++ source available at github.

ALFRED and ALFRED-G

ALFRED implements an alignment-free method to compute the average common substring measure with k mismatches, and is used for phylogenetic inference. ALFRED-G is a greedy alignment-free distance estimator for phylogenetic inference, and implements an approximate algorithm that takes linear time. C++ sources are available here. It is based on the metric of average common substring, computed as described in the research paper: doi: 10.1089/cmb.2015.0217.
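To make the measure concrete, the following is a brute-force Python sketch of an average common substring score with up to k mismatches. It is only an illustration of the definition as commonly stated (for each position in one sequence, the longest match against any position in the other, averaged over positions). It is not ALFRED's or ALFRED-G's algorithm, and the function names are hypothetical.

```python
def longest_match_with_mismatches(x, i, y, j, k):
    """Length of the longest common extension of x[i:] and y[j:]
    that contains at most k mismatching positions."""
    length = 0
    mismatches = 0
    while i + length < len(x) and j + length < len(y):
        if x[i + length] != y[j + length]:
            mismatches += 1
            if mismatches > k:
                break
        length += 1
    return length


def average_common_substring(x, y, k=0):
    """Mean, over every start position i in x, of the longest substring
    starting at i that matches somewhere in y with at most k mismatches.

    Brute force: time grows with len(x) * len(y) * match length, so this
    is suitable only for short illustrative inputs.
    """
    if not x:
        return 0.0
    total = 0
    for i in range(len(x)):
        total += max(
            (longest_match_with_mismatches(x, i, y, j, k) for j in range(len(y))),
            default=0,
        )
    return total / len(x)
```

Larger scores indicate more shared sequence content. Turning such a score into a distance requires a normalization step, which is left out of this sketch.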

Parallel kMCS

Parallel kMCS implements an efficient parallel algorithm that, given a database of strings, identifies all pairs of strings that have common substrings, while allowing a limited number of mismatches. C++ sources available at bitbucket.

PSbEC

PSbEC is a parallel spectrum-based error correction software for big genomic datasets. It can be used as a framework to parallelize any spectrum-based error correction software. C++ sources are available at http://srirampc.net/psbec.html.

Research Software (as an Advisor/Collaborator)

ramBLe

ramBLe is a parallel framework for Bayesian Network Learning. C++ sources available at github. This software was selected as one of the three finalists for inclusion as a reproducibility challenge in the SC21 Student Cluster Competition (SCC).

ParsiMoNe

Parallel implementation in C++ for construction of Module Networks.

MCPNet

MCPNet is a gene regulatory network (GRN) reconstruction tool that identifies long-range indirect interactions based on a novel metric called MCP score. MCP score uses maximum-capacity-path, a graph theoretical measure, to quantify the relative strengths of direct and indirect gene-gene interactions. MCPNet is implemented in C++ and is parallelized for multi-core and MPI multi-node environments. It is designed to reconstruct networks in unsupervised and semi-supervised manners. C++ sources are available at github.
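For readers unfamiliar with the maximum-capacity-path measure, here is a minimal Python sketch of the generic graph problem: among all paths between two nodes, find the one whose weakest edge is as strong as possible. It uses a standard Dijkstra-style search and is not MCPNet's implementation or its MCP score. The function name and the toy graph are assumptions made for illustration.

```python
import heapq


def max_capacity_path(graph, source, target):
    """Return the bottleneck capacity of the widest path from source to target.

    graph: dict mapping node -> dict of neighbor -> non-negative edge weight.
    Returns 0.0 if target cannot be reached from source.
    """
    best = {source: float("inf")}      # best known bottleneck per node
    heap = [(-float("inf"), source)]   # max-heap emulated with negated capacities
    while heap:
        neg_cap, node = heapq.heappop(heap)
        cap = -neg_cap
        if node == target:
            return cap
        if cap < best.get(node, 0.0):
            continue                   # stale heap entry, a wider path was found
        for nbr, weight in graph.get(node, {}).items():
            new_cap = min(cap, weight) # a path is only as strong as its weakest edge
            if new_cap > best.get(nbr, 0.0):
                best[nbr] = new_cap
                heapq.heappush(heap, (-new_cap, nbr))
    return 0.0


# Hypothetical toy network: edge weights stand in for pairwise association strengths.
toy_graph = {
    "A": {"B": 0.9, "C": 0.2},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.2, "B": 0.8},
}
widest = max_capacity_path(toy_graph, "A", "C")
```

In the toy graph, the direct edge between A and C has weight 0.2, while the indirect route through B has a bottleneck of 0.8, so the indirect path is the stronger one under this measure.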

Other Updates

hdf5-rust

A fork of metno/hdf5-rust that includes MPI parallel read/write of HDF5 1D/2D datasets.

dualrnaseq

A fork of nf-core/dualrnaseq that enables long-running quantification processes and fixes annotation bugs.