Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Bioinformatics

Articles 1 - 30 of 258

Full-Text Articles in Life Sciences

Deep Sequencing Of Pre-Translational Mrnps Reveals Hidden Flux Through Evolutionarily Conserved As-Nmd Pathways, Carrie Kovalak, Mihir Metkar, Melissa J. Moore Nov 2019

University of Massachusetts Medical School Faculty Publications

Background The ability to generate multiple mRNA isoforms from a single gene by alternative splicing (AS) is crucial for the regulation of eukaryotic gene expression. Because different mRNA isoforms can have widely differing decay rates, however, the flux through competing AS pathways cannot be determined by traditional RNA-Seq data alone. Further, some mRNA isoforms with extremely short half-lives, such as those subject to translation-dependent nonsense-mediated decay (AS-NMD), may be completely overlooked in even the most extensive RNA-Seq analyses.

Results RNA immunoprecipitation in tandem (RIPiT) of exon junction complex (EJC) components allows for the purification of post-splicing mRNA-protein particles (mRNPs) not ...


Wormcat: An Online Tool For Annotation And Visualization Of Caenorhabditis Elegans Genome-Scale Data, Amy D. Holdorf, Daniel P. Higgins, Anne C. Hart, Peter R. Boag, Gregory J. Pazour, Albertha J. M. Walhout, Amy K. Walker Nov 2019

University of Massachusetts Medical School Faculty Publications

The emergence of large gene expression datasets has revealed the need for improved tools to identify enriched gene categories and visualize enrichment patterns. While Gene Ontology (GO) provides a valuable tool for gene set enrichment analysis, it has several limitations. First, it is difficult to graphically compare multiple GO analyses. Second, genes from some model systems are not well represented. For example, around 30% of Caenorhabditis elegans genes are missing from analysis in commonly used databases. To allow categorization and visualization of enriched C. elegans gene sets in different types of genome-scale data, we developed WormCat, a web-based tool that ...


Molecular Phylogeny Implemented In An Introductory Plant Classification Course, Chao Cai, Jo Ann Banks Oct 2019

Libraries Faculty and Staff Presentations

Plant classification is one of the core components in undergraduate programs related to plant sciences. Traditionally, plant classification courses primarily introduce morphology-based taxonomy because of practical needs in the field. However, the publication of new plant classification systems by the Angiosperm Phylogeny Group (APG) using molecular phylogeny methods has led to a trend toward using molecular evidence (DNA barcodes) for plant identification. In our introductory plant classification course, we included a two-week module (lectures and labs) to introduce key concepts and fundamental skills in molecular phylogeny. Week 1 included concepts of evolutionary tree thinking, data mining in NCBI using BLAST search, and ...


The Genome-Wide Multi-Layered Architecture Of Chromosome Pairing In Early Drosophila Embryos, Jelena Erceg, Jumana Alhaj Abed, Anton Goloborodko, Bryan R. Lajoie, Geoffrey Fudenberg, Nezar Abdennur, Maxim Imakaev, Ruth B. Mccole, Son C. Nguyen, Wren Saylor, Eric F. Joyce, T. Niroshini Senaratne, Mohammed A. Hannan, Guy Nir, Job Dekker, Leonid A. Mirny, C-Ting Wu Oct 2019

Program in Systems Biology Publications and Presentations

Genome organization involves cis and trans chromosomal interactions, both implicated in gene regulation, development, and disease. Here, we focus on trans interactions in Drosophila, where homologous chromosomes are paired in somatic cells from embryogenesis through adulthood. We first address long-standing questions regarding the structure of embryonic homolog pairing and, to this end, develop a haplotype-resolved Hi-C approach to minimize homolog misassignment and thus robustly distinguish trans-homolog from cis contacts. This computational approach, which we call Ohm, reveals pairing to be surprisingly structured genome-wide, with trans-homolog domains, compartments, and interaction peaks, many coinciding with analogous cis features. We also find a ...


Formulation Of Hybrid Knowledge-Based/Molecular Mechanics Potentials For Protein Structure Refinement And A Novel Graph Theoretical Protein Structure Comparison And Analysis Technique, Aaron Maus Aug 2019

University of New Orleans Theses and Dissertations

Proteins are the fundamental machinery that enables the functions of life. It is critical to understand them not just for basic biology, but also to enable medical advances. The field of protein structure prediction is concerned with developing computational techniques to predict protein structure and function from a protein’s amino acid sequence alone, which is encoded directly in DNA. Despite much progress since the first computational models in the late 1960s, techniques for the prediction of protein structure still cannot reliably produce structures of high enough accuracy to enable desired applications such as rational drug design. Protein structure refinement ...


Entrna: A Framework To Predict Rna Foldability, Congzhe Su, Jefferey D. Weir, Fei Zhang, Hao Yan, Teresa Wu Jul 2019

Faculty Publications

RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role.


Designing Computational Biology Workflows With Perl - Part 2, Esma Yildirim May 2019

Open Educational Resources

This material briefly reintroduces the DNA double helix structure, explains SNP and INDEL mutations in genes, and describes the FASTA, FASTQ, BAM and VCF file formats. It also explains the index creation, alignment, sorting, duplicate marking and variant calling steps of a simple preprocessing workflow and how to write a Perl script to automate the execution of these steps on a Virtual Machine Image.


Designing Computational Biology Workflows With Perl - Part 1, Esma Yildirim May 2019

Open Educational Resources

This material introduces the AWS console interface and describes how to create an instance on AWS with the VMI provided and connect to that machine instance using the SSH protocol. Once connected, students are required to write a script that enters the data folder, which includes gene-sequencing input files, and prints the first five lines of each file remotely. The same exercise can be applied if the VMI is installed on a local machine using virtualization software (e.g. Oracle VirtualBox). In this case, the Terminal program of the VMI can be used to do the exercise.
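
The exercise described above (enter a data folder and print the first five lines of each file) can be sketched as follows. The course itself uses Perl; this is a minimal Python sketch of the same idea, not the course's script. The function names `head_of_each_file` and `print_heads` and the folder name in the usage note are hypothetical, and the sketch assumes text formats such as FASTA or FASTQ (binary files such as BAM would print as unreadable characters).

```python
from itertools import islice
from pathlib import Path


def head_of_each_file(folder, n=5):
    """Return {file name: first n lines} for every regular file in folder."""
    heads = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        # errors="replace" keeps the loop going if a file is not valid text
        with path.open("r", errors="replace") as handle:
            heads[path.name] = [line.rstrip("\n") for line in islice(handle, n)]
    return heads


def print_heads(folder, n=5):
    """Print a small header followed by the first n lines of each file."""
    for name, lines in head_of_each_file(folder, n).items():
        print(f"==> {name} <==")
        for line in lines:
            print(line)
```

On the remote instance, something like `print_heads("data")` would then be run from the directory that contains the (assumed) `data` folder. Reading with `islice` stops after five lines, so large sequencing files are never loaded in full.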


Computational Genomic Models For Spatio-Temporal Investigation Of Early Lung Cancer Pathology, Smruthy Sivakumar May 2019

UT GSBS Dissertations and Theses (Open Access)

Lung cancer, of which non-small cell lung cancer (NSCLC) is the most common form, is the second most prevalent cancer and the leading cause of cancer-related deaths. NSCLCs primarily comprise adenocarcinomas (LUAD) and squamous cell carcinomas (LUSC). Advances in early detection and prevention have been limited by the lack of early-stage biomarkers and targets. A comprehensive molecular characterization of premalignant lesions and tumor-adjacent normal tissue can aid in better understanding NSCLC pathogenesis. However, these investigations are further challenged by limited tissue availability and low cellular fractions of detectable somatic mutations.

Therefore, there is a dearth of knowledge about the pathogenesis ...


Designing Computational Biology Workflows With Perl - Part 1, Esma Yildirim May 2019

Open Educational Resources

This material introduces Linux file system structures and demonstrates how to use commands to communicate with the operating system through a Terminal program. Basic program structures and the system() function of Perl are discussed. A brief introduction to gene-sequencing terminology and file formats is given.


Designing Computational Biology Workflows With Perl - Part 2, Esma Yildirim May 2019

Open Educational Resources

This material introduces the AWS console interface and describes how to create an instance on AWS with the VMI provided and connect to that machine instance using the SSH protocol. Once connected, students are required to write a script that automates the tasks of creating VCF files from two different sample genomes belonging to E. coli microorganisms by using the FASTA and FASTQ files in the input folder of the virtual machine. The same exercise can be applied if the VMI is installed on a local machine using virtualization software (e.g. Oracle VirtualBox). In this case, the Terminal program ...


Designing Computational Biology Workflows With Perl - Part 1 & 2, Esma Yildirim May 2019

Open Educational Resources

This manual guides the instructor in combining the partial files of the virtual machine image to construct the sequencer.ova file. It is accompanied by the partial files of the virtual machine image.


Contribution Of Retrotransposons To Breast Cancer Malignancy, Isaac D. Raplee Apr 2019

Graduate Theses and Dissertations

The components contributing to cancer progression, especially the transition from early-stage to invasive disease, are unknown. Consequently, the biological reasons why some patients diagnosed with atypia and ductal carcinoma in situ (DCIS) never progress to invasive breast cancer are unclear. The “one gene at a time” approach does not sufficiently predict progression. To elucidate the early-stage progression to invasive ductal cancer, an expression signature analysis of transcripts and transposable elements in micropunched samples of formalin-fixed, paraffin-embedded (FFPE) tissue was conducted. A bioinformatics pipeline to analyze poor quality, short reads (>36 nts) from RNA-Seq data was created to compare the ...


Mrub_3018 Is Orthologous To E. Coli B2759 (Casb), Kyle Parker, Dr. Lori Scott Feb 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We studied the biological activity of the Mrub_3018 gene, which we hypothesize is orthologous to E. coli gene B2759. We predicted that Mrub_3018 (DNA coordinates 3057916…3058524) encodes the protein CasB. CasB is a protein in the CRISPR CASCADE that will function as a structural protein. When the rest of the proteins form an “S” formation, CasB will connect the front and back of the “S”, creating a backbone for the structure. It will help bind ...


Mrub_3015 Is Orthologous To The B2757 Gene Found In Escherichia Coli Coding For Casd, Ramona Collins, Dr. Lori Scott Feb 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological function of the gene Mrub_3015, which we hypothesize is a component of the CRISPR-Cas prokaryotic defense system. We predict that Mrub_3015 (DNA coordinates 3055550...3056245) encodes the CRISPR-associated protein cas5, which is integral in maintaining the crRNA-DNA structure, keeping the complex from base pairing with the target phage DNA. Our hypothesis is supported by identical hits for Mrub_3015 and b2527 to the KEGG, Pfam, TIGRfam, CDD and PDB databases as well as ...


Effects Of Temperature On Crispr/Cas System, Eddie Beckom, Dr. Lori Scott Jan 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the effect of temperature on the complexity of CRISPR/Cas systems in bacterial organisms across temperature classifications. We predict that temperature extremes would result in CRISPR/Cas systems with multiple operons, repeating cas genes, and complex systems. CRISPR/Cas systems can be classified into three types with a number of subtypes based on the CRISPR-associated genes, cas genes, present in a given organism. Our hypothesis is supported by the presence of multiple operons in thermophilic ...


Mrub_3014 Is Orthologous To B2756, Samir Abdelkarim, Dr. Lori Scott Jan 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological function of the gene Mrub_3014, which we hypothesize is a component of the CRISPR-Cas prokaryotic defense system. We predict that Mrub_3014 (DNA coordinates 3054943..3055575) encodes the CRISPR-associated protein Cse3/CasE, which functions as an endonuclease. Our hypothesis is supported by identical hits for Mrub_3014 and b2756 to the KEGG, Pfam, TIGRfam, CDD and PDB databases, as well as a low E-value for a pairwise NCBI BLAST comparison. Both protein products are predicted to ...


M. Ruber Mrub_3013 Is Orthologous To E. Coli B2755, Laura Butcher, Dr. Lori Scott Jan 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological function of the gene with the M. ruber locus tag Mrub_3013, which we hypothesize is orthologous to b2755 in E. coli K12 MG1655 (a.k.a. Cas1) and is a component of the CRISPR-Cas prokaryotic defense system in M. ruber. We predict that Mrub_3013 (DNA coordinates 3,053,978-3,054,940) encodes the protein Cas1, which as part of the CRISPR-Cas system ...


An Investigation Into The Relationship Between Mrub_3013, Mrub_1477, And Mrub_0224: Are They Paralogs?, Melette Devore, Dr. Lori Scott Jan 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological function of mrub_3013 and the nature of its relationship with mrub_1477 and mrub_0224. We hypothesized that mrub_3013 is orthologous to b2755 in E. coli K12 MG1655 (a.k.a. cas1). We predict that mrub_3013 encodes the enzyme Cas1, which is involved in spacer acquisition in the CRISPR-Cas prokaryotic defense system. Our hypothesis is supported by identical hits for b2755, mrub_3013, mrub_1477, and mrub_0224 from the CDD and Pfam databases and highly similar hits ...


Mrub_3020, A Paralog Of Mrub_1489, Is Orthologous To E. Coli Casc (Locus Tag B2761), Alfred Dei-Ampeh, Dr. Lori Scott Jan 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological functions of two genes: mrub_3020 and mrub_1489. We make two hypotheses in this investigation: a) mrub_3020 is orthologous to the gene b2761 in E. coli K12 MG1655 (a.k.a. casC); b) mrub_1489 is a paralog of mrub_3020. We also predict that the two genes encode unique proteins: mrub_3020 with DNA coordinates 3060491…3063190 encodes a CRISPR-associated helicase (Cas3) that supports the Cascade complex of the CRISPR-Cas adaptive immune system by ...


Algorithms For Synteny-Based Phylostratigraphy And Gene Origin Classification, Zebulun Arendsee Jan 2019

Graduate Theses and Dissertations

With every newly sequenced species we discover hundreds of novel protein coding genes. Many of these "orphan" genes have been experimentally proven to have dramatic functions in development, sexual dimorphism, pathogen resistance, and social traits like symbiosis. Whereas in the past, researchers viewed genes as the product of continuous variation acting on ancient material, we now know that novel genes may arise de novo from non-genic sequence. Thus evolutionary experimentation is not limited to tweaking existing genes or their regulatory patterns. Any orphan genes that arose in the distant past should appear today as lineage-specific genes (or gene families). The ...


Evaluating Predixcan’S Ability To Predict Differential Expression Between Alcoholics And Non-Alcoholics, John E. Drake Jr Jan 2019

Theses and Dissertations

PrediXcan is a recent software tool for the imputation of gene expression from genotype data alone. Using an overlapping set of transcriptome datasets from postmortem brain tissues of donors with alcohol use disorder and neurotypical controls, which were generated by two different platforms (i.e., Arraystar and Affymetrix), and an additional unrelated transcriptome dataset from lung tissue, we sought to evaluate PrediXcan’s ability to impute gene expression and identify differentially expressed genes. From the Arraystar platform, 1.3% of matched genes between the measured and imputed expression had a Pearson correlation ≥ 0.5. Our attempt to replicate this finding using ...
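
The summary statistic quoted above (the share of matched genes whose measured and imputed expression have a Pearson correlation ≥ 0.5) can be illustrated with a minimal sketch. This is not the thesis's code or PrediXcan's API; the function names `pearson` and `fraction_well_imputed`, the dictionary layout (gene name mapped to per-sample expression values in the same sample order), and the toy numbers are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant vector has no defined correlation
    return cov / sqrt(var_x * var_y)


def fraction_well_imputed(measured, imputed, threshold=0.5):
    """Fraction of genes present in both inputs with correlation >= threshold."""
    matched = sorted(set(measured) & set(imputed))
    if not matched:
        return 0.0
    hits = 0
    for gene in matched:
        r = pearson(measured[gene], imputed[gene])
        if r is not None and r >= threshold:
            hits += 1
    return hits / len(matched)


# Toy data (hypothetical): two matched genes, one only in the measured set.
measured = {"geneA": [1, 2, 3, 4], "geneB": [1, 2, 3, 4], "geneC": [5, 6, 7, 8]}
imputed = {"geneA": [2, 4, 6, 8], "geneB": [4, 3, 2, 1]}
print(fraction_well_imputed(measured, imputed))
```

Genes with an undefined correlation are counted as matched but not as hits in this sketch; a real analysis might instead exclude them from the denominator.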


Exploring Strategies To Integrate Disparate Bioinformatics Datasets, Charbel Bader Fakhry Jan 2019

Walden Dissertations and Doctoral Studies

Distinct bioinformatics datasets make it challenging for bioinformatics specialists to locate the required datasets and unify their format for result extraction. The purpose of this single case study was to explore strategies to integrate distinct bioinformatics datasets. The technology acceptance model was used as the conceptual framework to understand the perceived usefulness and ease of use of integrating bioinformatics datasets. The population of this study included bioinformatics specialists of a research institution in Lebanon that has strategies to integrate distinct bioinformatics datasets. The data collection process included interviews with 6 bioinformatics specialists and reviewing 27 organizational documents relating to integrating ...


Saccharomyces Genome Database & Uniprot Bioinformatics Analysis, Ray A. Enke Dec 2018

Ray Enke Ph.D.

This in-class activity introduces basic bioinformatics analysis using the Saccharomyces Genome Database (SGD) and the UniProt Database. The yeast URA3 gene is studied in this activity; however, any other yeast gene can be substituted. This activity is designed for novice instructors and students for implementation into core biology lecture or lab courses.


Individualized Clinical Practice Guidelines For Pressure Injury Management: Development Of An Integrated Multi-Modal Biomedical Information Resource, Kathie M. Bogie, Guo-Qiang Zhang, Steven K. Roggenkamp, Ningzhou Zeng, Jacinta Seton, Shiqiang Tao, Arielle L. Bloostein, Jiayang Sun Sep 2018

Institute of Biomedical Informatics Faculty Publications

Background: Pressure ulcers (PU) and deep tissue injuries (DTI), collectively known as pressure injuries, are serious complications causing staggering costs and human suffering, with over 200 reported risk factors from many domains. Primary pressure injury prevention seeks to prevent the first incidence, while secondary PU/DTI prevention aims to decrease chronic recurrence. Clinical practice guidelines (CPG) combine evidence-based practice and expert opinion to aid clinicians in the goal of achieving best practices for primary and secondary prevention. The correction of all risk factors can be both overwhelming and impractical to implement in clinical practice. There is a need to develop ...


Microbial Ecology Of South Florida Surface Waters: Examining The Potential For Anthropogenic Influences, Chase P. Donnelly Aug 2018

HCNSO Student Theses and Dissertations

South Florida contains one of the largest subtropical wetlands in the world, and yet not much is known about the microbes that live in these surface waters. These microbes play an important role in chemical cycling and maintaining good water quality for both human and ecosystem health. The hydrology of Florida’s surface waters is tightly regulated with the use of canal and levee systems run by the US Army Corps of Engineers and The South Florida Water Management District. These canals run through the Everglades, agriculture, and urban environments to control water levels in Lake Okeechobee, the Water Conservation ...


Genome Analysis Of Multiple Mycobacteriophage, Emily Kerstiens, Kari Clase, Yi Li, Gillian Smith, Sarah Bell Aug 2018

The Summer Undergraduate Research Fellowship (SURF) Symposium

Bacteriophage are viruses that infect and kill bacteria. They can be used as treatments for antibiotic-resistant bacterial infections, but more knowledge is needed about phage and how they interact with bacteria in order to develop safe and effective phage therapy treatments. This study examines the genomes of eighteen mycobacteriophage that were isolated from the environment on and surrounding Purdue University. Phage genomes were annotated using several bioinformatics software tools, including DNA Master, GeneMark, and PECAAN. Evidence was examined to determine the correct location within the genome and the potential function. Approximately two thousand genes were annotated in this study. A ...


Deciphering The Role Of Human Arylamine N-Acetyltransferase 1 (Nat1) In Breast Cancer Cell Metabolism Using A Systems Biology Approach., Samantha Marie Carlisle Aug 2018

Electronic Theses and Dissertations

Background: Human arylamine N-acetyltransferase 1 (NAT1) is a phase II xenobiotic metabolizing enzyme found in almost all tissues. NAT1 can additionally hydrolyze acetyl-coenzyme A (acetyl-CoA) in the absence of an arylamine substrate. NAT1 expression varies inter-individually and is elevated in several cancers including estrogen receptor positive (ER+) breast cancers. Additionally, multiple studies have shown the knockdown of NAT1, by both small molecule inhibition and siRNA methods, in breast cancer cells leads to decreased invasive ability and proliferation and decreased anchorage-independent colony formation. However, the exact mechanism by which NAT1 expression affects cancer risk and progression remains unclear. Additionally, consequences ...


Omega: A Software Tool For The Management, Analysis, And Dissemination Of Intracellular Trafficking Data That Incorporates Motion Type Classification And Quality Control, Alessandro Rigano, Vanni Galli, Jasmine M. Clark, Lara E. Pereira, Loris Grossi, Jeremy Luban, Raffaello Giulietti, Tiziano Leidi, Eric Hunter, Mario Valle, Ivo F. Sbalzarini, Caterina Strambio-De-Castilla Jun 2018

University of Massachusetts Medical School Faculty Publications

MOTIVATION: Particle tracking coupled with time-lapse microscopy is critical for understanding the dynamics of intracellular processes of clinical importance. Spurred on by advances in the spatiotemporal resolution of microscopy and automated computational methods, this field is increasingly amenable to multi-dimensional high-throughput data collection schemes (Snijder et al, 2012). Typically, complex particle tracking datasets generated by individual laboratories are produced with incompatible methodologies that preclude comparison to each other. There is therefore an unmet need for data management systems that facilitate data standardization, meta-analysis, and structured data dissemination. The integration of analysis, visualization, and quality control capabilities into such systems would ...


A Study Of Scalability And Cost-Effectiveness Of Large-Scale Scientific Applications Over Heterogeneous Computing Environment, Arghya K. Das Jun 2018

LSU Doctoral Dissertations

Recent advances in large-scale experimental facilities ushered in an era of data-driven science. These large-scale data increase the opportunity to answer many fundamental questions in basic science. However, these data pose new challenges to the scientific community in terms of their optimal processing and transfer. Consequently, scientists are in dire need of robust high performance computing (HPC) solutions that can scale with terabytes of data.

In this thesis, I address the challenges in three major aspects of scientific big data processing as follows: 1) Developing scalable software and algorithms for data- and compute-intensive scientific applications. 2) Proposing new cluster architectures ...