Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Full-Text Articles in Life Sciences

In Silico Identification Of Genetic Mutations Conferring Resistance To Acetohydroxyacid Synthase Inhibitors: A Case Study Of Kochia Scoparia, Yan Li, Michael D. Netherland, Chaoyang Zhang, Huixiao Hong, Ping Gong May 2019

Faculty Publications

Mutations that confer herbicide resistance are a primary concern for herbicide-based chemical control of invasive plants and are often under-characterized structurally and functionally. As the outcome of selection pressure, resistance mutations usually result from repeated long-term applications of herbicides with the same mode of action and are discovered through extensive field trials. Here we used acetohydroxyacid synthase (AHAS) of Kochia scoparia (KsAHAS) as an example to demonstrate that, given the sequence of a target protein, the impact of genetic mutations on ligand binding could be evaluated and resistance mutations could be identified using a biophysics-based computational approach. Briefly, the ...


Predicting Protein Residue-Residue Contacts Using Random Forests And Deep Networks, Joseph Luttrell Iv, Tong Liu, Chaoyang Zhang, Zheng Wang Mar 2019

Faculty Publications

Background: The ability to predict which pairs of amino acid residues in a protein are in contact with each other offers many advantages for various areas of research that focus on proteins. For example, contact prediction can be used to reduce the computational complexity of predicting the structure of proteins and even to help identify functionally important regions of proteins. These predictions are becoming especially important given the relatively low number of experimentally determined protein structures compared to the amount of available protein sequence data.

Results: Here we have developed and benchmarked a set of machine learning methods for performing ...


Similarities And Differences Between Variants Called With Human Reference Genome Hg19 Or Hg38, Bohu Pan, Rebecca Kusko, Wenming Xiao, Yuantin Zheng, Zhichao Liu, Chunlin Xiao, Sugunadevi Sakkiah, Wenjing Guo, Ping Gong, Chaoyang Zhang, Weigong Ge, Leming Shi, Weida Tong, Huixiao Hong Mar 2019

Faculty Publications

Background: Reference genome selection is a prerequisite for successful analysis of next generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed.

Results: We conducted analysis comparing the SNVs identified based on HG19 vs HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and HG38. Lastly ...


Pasnet: Pathway-Associated Sparse Deep Neural Network For Prognosis Prediction From High-Throughput Data, Jie Hao, Youngsoon Kim, Tae-Kyung Kim, Mingon Kang Dec 2018

Faculty Publications

Background: Predicting prognosis in patients from large-scale genomic data is a fundamentally challenging problem in genomic medicine. However, the prognosis still remains poor in many diseases. The poor prognosis may be caused by the high complexity of biological systems, where multiple biological components and their hierarchical relationships are involved. Moreover, it is challenging to develop robust computational solutions with high-dimension, low-sample size data.

Results: In this study, we propose a Pathway-Associated Sparse Deep Neural Network (PASNet) that not only predicts patients’ prognoses but also describes complex biological processes regarding biological pathways for prognosis. PASNet models a multilayered, hierarchical biological system of genes ...


Deep Learning Architectures For Multi-Label Classification Of Intelligent Health Risk Prediction, Andrew Maxwell, Runzhi Li, Bei Yang, Heng Weng, Aihua Ou, Huixiao Hong, Zhaoxian Zhou, Ping Gong, Chaoyang Zhang Dec 2017

Faculty Publications

No abstract provided.


Integrative Approach For Inference Of Gene Regulatory Networks Using Lasso-Based Random Featuring And Application To Psychiatric Disorders, Dongchul Kim, Mingon Kang, Ashis Biswas, Chunyu Liu Aug 2016

Faculty Publications

Background: Inferring gene regulatory networks is one of the most interesting research areas in systems biology. Many inference methods have been developed using a variety of computational models and approaches. However, there are two issues to solve. First, depending on the structural or computational model of the inference method, the results tend to be inconsistent due to the innately different advantages and limitations of the methods. Therefore, combining dissimilar approaches is needed to overcome the limitations of standalone methods through complementary integration. Second, sparse linear regression that is penalized by the regularization ...


Prokaryotic Diversity In The Rhizosphere Of Organic, Intensive, And Transitional Coffee Farms In Brazil, Adam Caldwell, Livia Silva, Cynthia Da Silva, Cleber Ouverney Jun 2015

Faculty Publications

Despite a continuous rise in consumption of coffee over the past 60 years and recent studies showing positive benefits linked to human health, intensive coffee farming practices have been associated with environmental damage, risks to human health, and reductions in biodiversity. In contrast, organic farming has become an increasingly popular alternative, with both environmental and health benefits. This study aimed to characterize and determine the differences in the prokaryotic soil microbiology of three Brazilian coffee farms: one practicing intensive farming, one practicing organic farming, and one undergoing a transition from intensive to organic practices. Soil samples were collected from 20 ...


Proceedings Of The 2014 Midsouth Computational Biology And Bioinformatics Society (Mcbios) Conference, Jonathan D. Wren, Mikhail G. Dozmorov, Dennis Burian, Andy Perkins, Chaoyang Zhang, Peter Hoyt, Rakesh Kaundal Oct 2014

Faculty Publications

No abstract provided.


Smoq: A Tool For Predicting The Absolute Residue-Specific Quality Of A Single Protein Model With Support Vector Machine, Renzhi Cao, Zheng Wang, Yiheng Wang, Jianlin Cheng Apr 2014

Faculty Publications

Background: It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet particularly needed for using, ranking and refining protein models.

Results: We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make ...


A Course-Based Research Experience: How Benefits Change With Increased Investment In Instructional Time, Christopher D. Shaffer, Consuelo J. Alvarez, April E. Bednarski, David Dunbar, Anya L. Goodman, Catherine Reinke, Anne G. Rosenwald, Michael J. Wolyniak, Cheryl Bailey, Daron Barnard, Christopher Bazinet, Dale L. Beach, James E.J. Bedard, Satish Bhalla, John Braverman, Martin Burg, Vidya Chandrasekaran, Hui-Min Chung, Kari Clase, Randall J. Dejong, Justin R. Diangelo, Chunguang Du, Todd T. Eckdahl, Heather Eisler, Julia A. Emerson, Amy Frary, Donald Frohlich, Yuying Gosser, Shubha Govind, Adam Haberman, Amy T. Hark, Charles Hauser, Arlene Hoogewerf, Laura L.M. Hoopes, Carina E. Howell, Diana Johnson, Christopher J. Jones, Lisa Kadlec, Marian Kaehler, S. Catherine Silver Key, Adam Kleinschmit, Nighat P. Kokan, Olga Kopp, Gary Kuleck, Judith Leatherman, Jane Lopilato, Christy Mackinnon, Juan Carlos Martinez-Cruzado, Gerard Mcneil, Stephanie Mel, Hemlata Mistry, Alexis Nagengast, Paul Overvoorde, Don W. Paetkau, Susan Parrish, Celeste N. Peterson, Mary Preuss, Laura K. Reed, Dennis Revie, Srebrenka Robic, Jennifer Roecklein-Canfield, Michael R. Rubin, Kenneth Saville, Stephanie Schroeder, Karim Sharif, Mary Shaw, Gary Skuse, Christopher D. Smith, Mary A. Smith, Sheryl T. Smith, Eric Spana, Mary Spratt, Aparna Sreenivasan, Joyce Stamm, Paul Szauter, Jeffrey S. Thompson, Matthew Wawersik, James Youngblom, Leming Zhou, Elaine R. Mardis, Jeremy Buhler, Wilson Leung, David Lopatto, Sarah C.R. Elgin Jan 2014

Faculty Publications

There is widespread agreement that science, technology, engineering, and mathematics programs should provide undergraduates with research experience. Practical issues and limited resources, however, make this a challenge. We have developed a bioinformatics project that provides a course-based research experience for students at a diverse group of schools and offers the opportunity to tailor this experience to local curriculum and institution-specific student needs. We assessed both attitude and knowledge gains, looking for insights into how students respond given this wide range of curricular and institutional variables. While different approaches all appear to result in learning gains, we find that a significant ...


Seqnls: Nuclear Localization Signal Prediction Based On Frequent Pattern Mining And Linear Motif Scoring, J.-R. Lin, Jianjun Hu Jan 2013

Faculty Publications

Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm, SeqNLS, to effectively identify potential NLS patterns without being constrained by the limitation of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking.

The experiment results on ...


Mtbindingsim: Simulate Protein Binding To Microtubules, Julia T. Philip, Charles H. Pence, Holly V. Goodson Jan 2012

Faculty Publications

Summary: Many protein–protein interactions are more complex than can be accounted for by 1:1 binding models. However, biochemists have few tools available to help them recognize and predict the behaviors of these more complicated systems, making it difficult to design experiments that distinguish between possible binding models. MTBindingSim provides researchers with an environment in which they can rapidly compare different models of binding for a given scenario. It is written specifically with microtubule polymers in mind, but many of its models apply equally well to any polymer or any protein–protein interaction. MTBindingSim can thus both help in ...
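For orientation, the simple 1:1 binding model that the summary uses as its baseline can be written down in a few lines. The sketch below is not taken from MTBindingSim; the function names and the choice of Python are assumptions made for illustration. It solves the standard equilibrium A + B <-> AB for the complex concentration, given total concentrations and a dissociation constant in the same units.

```python
import math


def bound_complex_1to1(a_total, b_total, kd):
    """Equilibrium concentration of the complex AB for A + B <-> AB.

    Mass balance and Kd = [A][B]/[AB] give the quadratic
    [AB]^2 - (A_t + B_t + Kd)[AB] + A_t * B_t = 0.
    The smaller root is the physically meaningful one, because it
    cannot exceed either total concentration.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def fraction_a_bound(a_total, b_total, kd):
    """Fraction of species A found in the complex (0.0 if no A is present)."""
    if a_total == 0:
        return 0.0
    return bound_complex_1to1(a_total, b_total, kd) / a_total


# Hypothetical concentrations (arbitrary but consistent units).
ab = bound_complex_1to1(1.0, 1.0, 1.0)
frac = fraction_a_bound(1.0, 1.0, 1.0)
```

Binding data that cannot be fitted by this curve at any value of `kd` is one sign that a more complicated model, of the kind the summary refers to, may be needed.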


Minimalist Ensemble Algorithms For Genome-Wide Protein Localization Prediction, J.-R. Lin, A. M. Mondal, R. Liu, Jianjun Hu Jan 2012

Faculty Publications

Background

Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms.

Results

This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature ...


Refnetbuilder: A Platform For Construction Of Integrated Reference Gene Regulatory Networks From Expressed Sequence Tags, Ying Li, Ping Gong, Edward J. Perkins, Chaoyang Zhang, Nan Wang Oct 2011

Faculty Publications

Background: Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish only information for a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs).

Results: RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes and Genomes ...


The Proteogenomic Mapping Tool, William S. Sanders, Nan Wang, Susan M. Bridges, Brandon M. Malone, Yoginder S. Dandass, Fiona M. Mccarthy, Bindu Nanduri, Mark L. Lawrence, Shane C. Burgess Apr 2011

Faculty Publications

Background: High-throughput mass spectrometry (MS) proteomics data is increasingly being used to complement traditional structural genome annotation methods. To keep pace with the high speed of experimental data generation and to aid in structural genome annotation, experimentally observed peptides need to be mapped back to their source genome location quickly and exactly. Previously, the tools to do this have been limited to custom scripts designed by individual research groups to analyze their own data, are generally not widely available, and do not scale well with large eukaryotic genomes.

Results: The Proteogenomic Mapping Tool includes a Java implementation of the Aho-Corasick ...


Hemebind: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu Jan 2011

Faculty Publications

Background

Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues.

Results

Here ...


Computational Prediction Of Heme-Binding Residues By Exploiting Residue Interaction Network, R. Liu, Jianjun Hu Jan 2011

Faculty Publications

Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we proposed a novel method for heme-binding residue prediction by exploiting topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with the nodes with higher degree, closeness and betweenness, but lower clustering coefficient in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based ...
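The topological properties named in this abstract can be illustrated on a small graph. The following is a minimal sketch, not the HemeNet implementation: the toy network, the residue labels `R1` to `R5`, and the function names are all hypothetical, and how contacts between residues are defined is left outside the example. Betweenness is omitted to keep the sketch short.

```python
from collections import deque


def degree(graph, node):
    """Number of residues in direct contact with `node`."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs of `node` that are themselves in contact."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness(graph, node):
    """Reachable node count divided by the sum of shortest-path lengths."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over the unweighted graph
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical undirected residue interaction network (adjacency sets).
toy_network = {
    "R1": {"R2", "R3", "R4", "R5"},
    "R2": {"R1", "R3"},
    "R3": {"R1", "R2"},
    "R4": {"R1"},
    "R5": {"R1"},
}

hub_features = (
    degree(toy_network, "R1"),
    closeness(toy_network, "R1"),
    clustering_coefficient(toy_network, "R1"),
)
```

In this toy graph the hub `R1` has high degree and closeness but a low clustering coefficient, which is the pattern the abstract associates with key residues in heme-binding regions.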


Prediction Of Discontinuous B-Cell Epitopes Using Logistic Regression And Structural Information, R. Liu, Jianjun Hu Jan 2011

Faculty Publications

Computational prediction of discontinuous B-cell epitopes remains challenging, but it is an important task in vaccine design. In this study, we developed a novel computational method to predict discontinuous epitope residues by combining the logistic regression model with two important structural features, B-factor and relative accessible surface area (RASA). We conducted five-fold cross-validation on a representative dataset composed of antigen structures bound with antibodies and independent testing on Epitome database, respectively. Experimental results indicate that besides the well-known RASA feature, B-factor can also be used to identify discontinuous epitopes. Furthermore, these two features are complementary and their combination can remarkably ...
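As a rough illustration of how a logistic regression model can combine two per-residue features, the sketch below fits one by plain gradient descent. It is not the authors' method or data: the feature values are invented, both features are assumed to be rescaled to [0, 1], and the toy labels arbitrarily assume that higher values of both features mark epitope residues.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit weights for two features plus a bias by batch gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        gw = [0.0, 0.0]
        gb = 0.0
        for (x1, x2), y in zip(samples, labels):
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # gradient of log loss
            gw[0] += err * x1
            gw[1] += err * x2
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a residue with features `x` is an epitope residue."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


# Hypothetical (normalized B-factor, RASA) pairs; 1 = epitope residue, 0 = not.
toy_features = [
    (0.90, 0.80), (0.80, 0.90), (0.70, 0.70), (0.85, 0.60),
    (0.10, 0.20), (0.20, 0.10), (0.30, 0.30), (0.15, 0.40),
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(toy_features, toy_labels)
```

Calling `predict_proba(w, b, (b_factor, rasa))` then scores a new residue. A real evaluation would use cross-validation and independent testing, as the abstract describes.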


Dynamics Of Protofibril Elongation And Association Involved In Aβ42 Peptide Aggregation In Alzheimer's Disease, Preetam Ghosh, Amit Kumar, Bhaswati Datta, Vijayaraghavan Rangachari Oct 2010

Faculty Publications

Background: The aggregates of a protein called ‘Aβ’ found in brains of Alzheimer’s patients are strongly believed to be the cause for neuronal death and cognitive decline. Among the different forms of Aβ aggregates, smaller aggregates called ‘soluble oligomers’ are increasingly believed to be the primary neurotoxic species responsible for early synaptic dysfunction. Since it is well known that Aβ aggregation is a nucleation-dependent process, it is widely believed that the toxic oligomers are intermediates to fibril formation, or what we call the ‘on-pathway’ products. Modeling of Aβ aggregation has been of intense investigation during the last ...


Time Lagged Information Theoretic Approaches To The Reverse Engineering Of Gene Regulatory Networks, Vijender Chaitankar, Preetam Ghosh, Edward J. Perkins, Ping Gong, Youping Deng, Chaoyang Zhang Oct 2010

Faculty Publications

Background: A number of models and algorithms have been proposed in the past for gene regulatory network (GRN) inference; however, none of them address the effects of the size of time-series microarray expression data in terms of the number of time-points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were implemented on different sizes of data generated by synthetic networks. Experiments show that the inference accuracy of these algorithms reaches a saturation point after a specific data size brought about by a ...
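The general idea behind a time-lagged information-theoretic score can be sketched as follows. This is an illustrative example only, not one of the three algorithms studied in the paper: the function names, the binary discretization, and the toy series are assumptions. The score is the mutual information between a candidate regulator at time t and a target at time t + lag.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete series."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def time_lagged_mi(regulator, target, lag):
    """Mutual information between regulator[t] and target[t + lag]."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(regulator):
        raise ValueError("lag must be in [0, len(series) - 1]")
    if lag == 0:
        return mutual_information(regulator, target)
    return mutual_information(regulator[:-lag], target[lag:])


# Hypothetical discretized expression levels (0 = low, 1 = high).
# The target simply repeats the regulator one time point later.
regulator = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
target = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]

mi_lag0 = time_lagged_mi(regulator, target, 0)
mi_lag1 = time_lagged_mi(regulator, target, 1)
```

On this toy pair the score is higher at lag 1 than at lag 0. Each extra lag leaves one fewer time point for the estimate, which is one way the number of time-points can limit inference.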


Quail Genomics: A Knowledgebase For Northern Bobwhite, Arun Rawat, Kurt A. Gust, Mohamed O. Elasri, Edward J. Perkins Oct 2010

Faculty Publications

Background

The Quail Genomics knowledgebase (http://www.quailgenomics.info) has been initiated to share and develop functional genomic data for Northern bobwhite (Colinus virginianus). This web-based platform has been designed to allow researchers to perform analysis and curate genomic information for this non-model species that has little supporting information in GenBank.

Description

A multi-tissue, normalized cDNA library generated for Northern bobwhite was sequenced using 454 Life Sciences next generation sequencing. The Quail Genomics knowledgebase represents the 478,142 raw ESTs generated from the sequencing effort in addition to assembled nucleotide and protein sequences including 21,980 unigenes annotated with meta-data ...


Incorporating Genomics And Bioinformatics Across The Life Sciences Curriculum, Jayna L. Ditty, Christopher A. Kvaal, Brad Goodner, Sharyn K. Freyermuth, Cheryl Bailey, Robert A. Britton, Stuart G. Gordon, Sabine Heinhorst, Kelyenne Reed, Zhaohui Xu, Erin R. Sanders-Lorenz, Seth Axen, Edwin Kim, Mitrick Johns, Kathleen Scott, Cheryl A. Kerfeld Aug 2010

Faculty Publications

No abstract provided.


Bayesmotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang Jan 2010

Faculty Publications

Background

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.

Methods

We formulated the protein sorting motif discovery problem as a classification problem ...


Feature Selection And Classification Of Maqc-Ii Breast Cancer And Multiple Myeloma Microarray Gene Expression Data, Qingzhong Liu, Andrew H. Sung, Zhongxue Chen, Jianzhong Liu, Xudong Huang, Youping Deng Dec 2009

Faculty Publications

Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to ...


Subcellular Localization Of Marine Bacterial Alkaline Phosphatases, H. Luo, Ronald Benner, R. A. Long, Jianjun Hu Jan 2009

Faculty Publications

Bacterial alkaline phosphatases (APases) are important enzymes in organophosphate utilization in the ocean. The subcellular localization of APases has significant ecological implications for marine biota but is largely unknown. The extensive metagenomic sequence databases from the Global Ocean Sampling Expedition provide an opportunity to address this question. A bioinformatics pipeline was developed to identify marine bacterial APases from the metagenomic databases, and a consensus classification algorithm was designed to predict their subcellular localizations. We identified 3,733 bacterial APase sequences (including PhoA, PhoD, and PhoX) and found that cytoplasmic (41%) and extracellular (30%) APases exceed their periplasmic (17%), outer membrane ...


Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. J. Zhou Jan 2009

Faculty Publications

Background

Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.

Results

In this study, we tested the feasibility of disease classification by integrating the large number of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank ...


Novel Implementation Of Conditional Co-Regulation By Graph Theory To Derive Co-Expressed Genes From Microarray Data, Arun Rawat, Georg J. Seifert, Youping Deng Aug 2008

Faculty Publications

Background

Most existing transcriptional databases like Comprehensive Systems-Biology Database (CSB.DB) and Arabidopsis Microarray Database and Analysis Toolbox (GENEVESTIGATOR) help to seek a shared biological role (similar pathways and biosynthetic cycles) based on correlation. These utilize conventional methods like Pearson correlation and Spearman rank correlation to calculate correlation among genes. However, not all genes are expressed in all conditions, and this leads to their exclusion from these transcriptional databases, which consist of experiments performed in varied conditions. This leads to incomplete studies of co-regulation among groups of genes that might be linked to the same or related biosynthetic pathway ...


Comparison Of Probabilistic Boolean Network And Dynamic Bayesian Network Approaches For Inferring Gene Regulatory Networks, Peng Li, Chaoyang Zhang, Edward J. Perkins, Ping Gong, Youping Deng Nov 2007

Faculty Publications

Background: The regulation of gene expression is achieved through gene regulatory networks (GRNs) in which collections of genes interact with one another and other substances in a cell. In order to understand the underlying function of organisms, it is necessary to study the behavior of genes in a gene regulatory network context. Several computational approaches are available for modeling gene regulatory networks with different datasets. In order to optimize modeling of GRN, these approaches must be compared and evaluated in terms of accuracy and efficiency.

Results: In this paper, two important computational approaches for modeling gene regulatory networks, probabilistic Boolean ...


Cloning, Analysis And Functional Annotation Of Expressed Sequence Tags From The Earthworm Eisenia Fetida, Mehdi Pirooznia, Ping Gong, Xin Guan, Laura S. Inouye, Kuan Yang, Edward J. Perkins, Youping Deng Nov 2007

Faculty Publications

Background

Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR.

Results

A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw ...


Structure And Function Predictions Of The Msa Protein In Staphylococcus Aureus, Vijayaraj Nagarajan, Mohamed O. Elasri Jan 2007

Faculty Publications

Background

Staphylococcus aureus is a human pathogen that causes a wide variety of life-threatening infections using a large number of virulence factors. One of the major global regulators used by S. aureus is the staphylococcal accessory regulator (sarA). We have identified and characterized a new gene (modulator of sarA: msa) that modulates the expression of sarA. Genetic and functional analysis shows that msa has a global effect on gene expression in S. aureus. However, the mechanism of Msa function is still unknown. Function predictions of Msa are complicated by the fact that it does not have a homologous partner in ...