  • Undergraduate Poster Abstracts
  • Computer/Information Sciences

    THU-701 LEXICON AND LANGUAGE MODELING OF EGYPTIAN DIALECT AUTOMATIC SPEECH RECOGNITION

    • Taha Merghani
    • Tuka Al Hanai
    • James Glass

    Taha Merghani¹, Tuka Al Hanai², James Glass².

    ¹Jackson State University, Jackson, MS, ²Massachusetts Institute of Technology, Cambridge, MA.

    The task of developing automatic speech recognition (ASR) systems for Arabic is challenging because of its many dialects, complicated morphology (the way words are put together), and ambiguous orthography (the way the language is written). We explored lexical, language, and acoustic models in the domain of resource-poor languages with limited data, focusing on the Aljazeera corpus, which comprises 12 hours of the Egyptian Arabic dialect. With the aid of Kaldi, a speech recognition toolkit, we trained several acoustic models, including Gaussian mixture model based hidden Markov models (GMM-HMM), deep neural networks (DNN), and long short-term memory recurrent neural networks (LSTM-RNN). In addition, we built lexical and language models, evaluating the use of graphemes and diacritized pronunciations in the lexicon. The diacritized lexicons and language models were generated using the MADAMIRA text processing toolkit for Egyptian Arabic, and we evaluated ASR performance using the word error rate (WER) metric.
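
    The WER metric mentioned above is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring code used in this work; the function name and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

    Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.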

    FRI-711 REDESIGNING THE USER INTERFACE OF THE OPENMSI WEBSITE

    • Ashley Cato
    • Benjamin Bowen
    • Oliver Ruebel

    Ashley Cato¹, Benjamin Bowen², Oliver Ruebel².

    ¹University of California, Merced, Merced, CA, ²Lawrence Berkeley National Laboratory, Berkeley, CA.

    Scientists are collaborating and sharing large amounts of data more than ever before. Often, scientists in specialized fields create dedicated websites tailored to the needs of their community. An example is OpenMSI, a web-based tool for visualization, analysis, and management of mass spectrometry imaging (MSI) data. With such sites, growth and adoption by increasing numbers of scientists mean that new ideas and usage patterns sprout, which in turn demand new capabilities from the site. Early adopters’ use of OpenMSI has led to it now hosting over 5 terabytes of data, a deluge that has become difficult for users to navigate. To address this problem, we redesigned the interface of the site, introducing new functionality for sorting files, displaying files, and navigating the site. The code base we implemented will be easier for others to administer, improve upon, and understand. The end result will be an updated website that promotes a better, more functional user experience.

    FRI-710 SIMULATIONS OF RED GIANT STAR CLUSTER WIND COLLISIONS

    • Jose Lopez
    • Enrico Ramirez-Ruiz
    • Melinda Soares-Furtado

    Jose Lopez¹, Enrico Ramirez-Ruiz¹, Melinda Soares-Furtado².

    ¹University of California, Santa Cruz, Santa Cruz, CA, ²Princeton University, Princeton, NJ.

    Modern advances in technical science require sophisticated computer simulations to make further explorations possible. The simulations should be able to depict physical phenomena that are difficult to observe directly, either because of extremely large or small length scales or because of time scales ranging from microseconds to millions of years. The collisions between the stellar winds of red giant stars in clusters present such a scenario: they are difficult to observe directly and must be studied using simulations. The FLASH hydrodynamic physics simulator was used to evolve the structure of stellar winds from a cluster of red giants over time. FLASH produces a series of HDF5 files containing vast arrays of physical parameters representing individual snapshots of these events over the desired time interval. For the sake of physical fidelity and comprehensiveness, the files were rendered and analyzed at different camera angles using yt, a Python package tailored specifically for visualization of astrophysics simulations. We employed yt to demonstrate manipulations of the HDF5 files and stitched the renderings together to create animations that are both useful and aesthetically pleasing. The animation created for the star cluster simulation shows how the winds expand over time. The ultimate goal of this project is to create a program that would allow the user to stop the animation at any point in time and manipulate the viewing angle in 3D space in any way desired by means of input more intuitive than keystrokes; doing this on a 3D screen would be ideal.

    THU-710 ANALYZING ADAPTIVE MODULATION IN SPINAL MOTOR NEURONS USING MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS

    • Karla Miletti
    • Joseph Lombardo
    • Melissa Harrington
    • Tomasz Smolinski

    Karla Miletti, Joseph Lombardo, Melissa Harrington, Tomasz Smolinski.

    Delaware State University, Dover, DE.

    Activity-dependent plasticity (ADP) has long been presumed to be a property of the brain. However, recent work indicates that ADP also occurs in spinal motor neurons. Understanding how spinal motoneuron output can be modified by both increased and decreased activity is thus important work with real-world implications. We hypothesized that alterations in the function of the Kv7.2 channel (which carries the M current) and changes in axonal initial segment (AIS) properties are the primary mechanisms of adaptation of spinal motoneurons to prolonged network activation. This hypothesis is supported by the literature and our own experimental results. To test our hypothesis, we developed a realistic computational model of spinal motoneuron activity before and after persistent network activation. As the starting point, we utilized a reconstructed spinal motoneuron morphology of neonatal mice together with the detailed specification of the active and passive somatodendritic and axonal properties derived from a rodent cortical neuron model. Then, we adjusted the model parameter values to match experimental recordings using a multi-objective evolutionary algorithm (MOEA). The algorithm matches multiple selection criteria simultaneously (e.g., input resistance and current threshold) and generates entire collections of neuronal models that can be mined for rules describing the phenomena captured by the models (for instance, co-regulations between ionic conductances). Furthermore, since the MOEA generates 2 independent databases of models (i.e., before and after persistent activation), we are able to directly compare the phenomena discovered by our data mining process in each dataset, thus elucidating the mechanisms underlying plasticity.
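
    Matching several selection criteria at once usually means keeping the candidate models that are not dominated by any other candidate, i.e., the Pareto front. The sketch below illustrates only that selection step for minimized error objectives. It is not the MOEA used in this work, and the model names and error values are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(errors: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, err in errors.items()
        if not any(
            dominates(other_err, err)
            for other_name, other_err in errors.items()
            if other_name != name
        )
    ]


# Hypothetical candidates: (input-resistance error, current-threshold error).
candidates = {
    "model_a": (0.10, 0.40),
    "model_b": (0.30, 0.20),
    "model_c": (0.35, 0.45),  # worse than model_a on both objectives
    "model_d": (0.10, 0.40),  # ties model_a, so neither dominates the other
}
front = pareto_front(candidates)
```

    A full evolutionary algorithm would repeat this selection over many generations of mutated and recombined parameter sets; the surviving front is the collection of models that can then be mined for rules.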

    THU-711 A COMPASS FOR NAVIGATING BIOIMAGING'S BIG DATA

    • Elizabeth Arikawa
    • Benjamin Bowen
    • Oliver Rubel

    Elizabeth Arikawa¹, Benjamin Bowen², Oliver Rubel².

    ¹University of California, Merced, Merced, CA, ²Lawrence Berkeley National Laboratory, Berkeley, CA.

    As mass spectrometry imaging continues to grow as a field, proper documentation of data becomes increasingly important to researchers. Metadata (i.e., data about data) includes information associated with objects for the purpose of description, administration, and preservation. OpenMSI is a web-based visualization, analysis, and management system for mass spectrometry imaging data. As the site’s user base increases, the need for useful metadata becomes increasingly relevant. OpenMSI currently lacks a way of collecting and storing metadata, limiting the site’s growth. Using Django, a web development framework, we created a metadata app with a standardized form and database model. A database enables the data to be managed efficiently and to be stored for years to come. Django provides a powerful and efficient way to create and manipulate form data. By creating a hierarchy for the data, we separated the data into 3 forms: Project, Experiment, and Assay. To make the form adaptable to the future needs of OpenMSI, we used South, a database migration app, to allow fields to be added, removed, and/or modified. With this new metadata app, OpenMSI will be able to host more images and continue to grow its user base for years to come.
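
    The three-level hierarchy described above can be pictured as nested records. The sketch below uses plain Python dataclasses so that it runs without Django; it is not the OpenMSI schema. The nesting order (a Project holds Experiments, an Experiment holds Assays) and every field name are assumptions for illustration. In a Django app, each class would typically be a model linked to its parent by a foreign key.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    name: str
    instrument: str = ""  # hypothetical metadata field


@dataclass
class Experiment:
    name: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Project:
    title: str
    experiments: List[Experiment] = field(default_factory=list)

    def assay_count(self) -> int:
        """Total number of assays across all experiments in this project."""
        return sum(len(experiment.assays) for experiment in self.experiments)


# Hypothetical records showing how the levels nest.
project = Project(
    title="Example MSI project",
    experiments=[
        Experiment("experiment 1", [Assay("assay 1a"), Assay("assay 1b")]),
        Experiment("experiment 2", [Assay("assay 2a")]),
    ],
)
total_assays = project.assay_count()
```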

    FRI-700 TEST-DRIVEN DEVELOPMENT AND FUNCTIONALITY IMPROVEMENTS TO GRNMAP, A GENE REGULATORY NETWORK MODELING APPLICATION

    • Trixie Anne Roque
    • Kam Dahlquist
    • Ben Fitzpatrick
    • John David Dionisio

    Trixie Anne Roque, Kam Dahlquist, Ben Fitzpatrick, John David Dionisio.

    Loyola Marymount University, Los Angeles, CA.

    A gene regulatory network (GRN) consists of a set of transcription factors that regulate the expression levels of genes encoding other transcription factors. The dynamics of a GRN describe how gene expression in the network changes over time. GRNmap is a complex MATLAB software package that uses ordinary differential equations to model the dynamics of medium-scale GRNs from budding yeast, Saccharomyces cerevisiae. The program estimates production rates, expression thresholds, and regulatory weights for each transcription factor in the network based on DNA microarray data, using forward simulations of model dynamics. Since v1.0, we have made design changes, added new features, fixed bugs, implemented a testing framework, and created documentation. For example, GRNmap now accepts and outputs Excel worksheets with more descriptive names, computes the standard deviations of the log2 expression data, and outputs an optimization diagnostics sheet that includes both actual and theoretical minimum least squares errors. We have also designed 16 manual input sheet tests to uncover and fix bugs as model and algorithm development progresses. We incorporated these tests into an automated testing framework that will speed debugging and prevent future code regressions. We have added documentation to our website and wiki and constructed a UML activity diagram to document the program's overall flow and how each function processes information. The source code and executable (which contain demo files and can run without a MATLAB license) for the updated version 1.2 are available for download at http://kdahlquist.github.io/GRNmap/ under the BSD open source license.
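
    To make the idea of a forward simulation concrete, the sketch below integrates one common form of GRN model: sigmoidal production driven by weighted regulator levels, minus linear degradation. This is an illustrative Python sketch, not GRNmap's MATLAB code. The exact equation, the degradation rates, the forward Euler integrator, and all parameter values are assumptions.

```python
import math
from typing import List


def simulate_grn(
    x0: List[float],
    weights: List[List[float]],  # weights[i][j]: effect of gene j on gene i
    production: List[float],     # maximum production rate of each gene
    threshold: List[float],      # expression threshold of each gene
    degradation: List[float],    # assumed linear degradation rate of each gene
    dt: float = 0.01,
    steps: int = 1000,
) -> List[List[float]]:
    """Forward Euler integration; returns the state at every time step."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        rates = []
        for i in range(len(x)):
            regulatory_input = sum(weights[i][j] * x[j] for j in range(len(x)))
            sigmoid = 1.0 / (1.0 + math.exp(-(regulatory_input - threshold[i])))
            rates.append(production[i] * sigmoid - degradation[i] * x[i])
        x = [xi + dt * rate for xi, rate in zip(x, rates)]
        trajectory.append(list(x))
    return trajectory


# Single unregulated gene: dx/dt = 2 * 0.5 - x, so x relaxes toward 1.
trajectory = simulate_grn(
    x0=[0.0], weights=[[0.0]], production=[2.0], threshold=[0.0],
    degradation=[1.0], dt=0.01, steps=2000,
)
final_level = trajectory[-1][0]
```

    Parameter estimation then amounts to adjusting the production rates, thresholds, and weights until the simulated trajectory matches the expression data in a least squares sense.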

    THU-700 A MATHEMATICAL MODEL TO STUDY THE JOINT EFFECTS OF GENETICS AND DIET ON OBESITY

    • Victoria Kelley
    • Fangyuan Hong
    • Kevin Molina
    • Demetrius Rhodes
    • Karen Rios-Soto

    Victoria Kelley², Fangyuan Hong¹, Kevin Molina³, Demetrius Rhodes⁴, Karen Rios-Soto³.

    ¹Mount Holyoke College, South Hadley, MA, ²James Madison University, Harrisonburg, VA, ³University of Puerto Rico, Mayaguez Campus, Mayaguez, PR, ⁴University of South Carolina-Beaufort, Bluffton, SC.

    Obesity has become one of the most pervasive epidemics facing North America today. Obesity is correlated with health threats such as diabetes and cardiovascular diseases that increase an individual’s mortality risk. Previous studies show that a particular single nucleotide polymorphism (SNP), rs9939609, in the fat mass and obesity-associated (FTO) gene is associated with obesity. A poor choice of diet and nutrition may also lead to obesity. In this study, we build a system of non-linear ordinary differential equations that considers both genetic and environmental effects on populations with 3 distinct genotypes (AA, Aa, and aa). The autosomal dominant allele is A; therefore, individuals who have the genotypes AA and Aa express the FTO gene. Equilibria analysis and simulation results show that over a long period of time, when the birth frequency of each genotype is dependent on current allele frequencies, the proportion of the population with the dominant allele goes to 0; that is, the dominant allele A is outbred by the recessive allele. Simulation results show that having the allele A has a stronger impact on obesity than the diet environment. The effects of environmental factors on the dynamics of obesity are negligible at best. Fitness and genetic selection trump any environmental bias. This study provides new insight into the synergistic impact of genetics and diet on obesity, which is rarely studied by traditional biological tools, such as GWAS.
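
    One standard way to make genotype birth frequencies depend on current allele frequencies is the Hardy-Weinberg relation, which assumes random mating. The abstract does not state which rule the model uses, so the sketch below is an assumption for illustration only, and the function names are hypothetical.

```python
from typing import Dict


def allele_frequency(n_AA: float, n_Aa: float, n_aa: float) -> float:
    """Frequency of allele A given the current genotype counts."""
    total = n_AA + n_Aa + n_aa
    if total <= 0:
        raise ValueError("population must be non-empty")
    # Each AA individual carries two copies of A; each Aa individual carries one.
    return (2 * n_AA + n_Aa) / (2 * total)


def birth_fractions(p: float) -> Dict[str, float]:
    """Genotype fractions among newborns under random mating (Hardy-Weinberg)."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


p = allele_frequency(25, 50, 25)
fractions = birth_fractions(p)
```

    In an ODE model, these fractions would multiply the total birth rate to give the inflow into each genotype compartment, so a falling frequency of A feeds back into fewer AA and Aa births.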

    FRI-701 ORTHOLOGS AND ISOFORMS AND CONSERVATION OF FUNCTION

    • Daniela Perry
    • Harold Drabkin
    • Judith Blake

    Daniela Perry, Harold Drabkin, Judith Blake.

    The Jackson Laboratory, Bar Harbor, ME.

    Investigation of the structure and evolution of genomes can help in the interpretation of human biology through the use of comparative genomics. There is much to learn about the function of human and mouse genes and their gene products. In this analysis, we used a specific set of yeast genes, derived from a study that tested for complementation by their human counterparts, along with bioinformatics tools to extend the yeast study to include mouse genes and to examine the impact of isoforms on functionality. Although most human/yeast gene pairs had one associated mouse ortholog, in one anomalous case a unique 2:1 mouse-to-human ortholog relationship existed. Investigation of the isoform landscape of mouse, human, and yeast on a genomic scale allowed comparison of the study dataset to the genomic background. By querying the protein ontology database and developing alignment programs to sort through the data, we found that only 50% of our gene set had multiple protein forms, which is significant when compared to the ~80% of the entire human genome that is alternatively spliced. Unexpectedly, in the noncomplementing gene set, we found that about 11% of human protein forms showed higher sequence similarity to the yeast gene than the form originally used. As a result, we propose a cDNA complementation assay using the more similar isoform, which may result in complementation. By looking at datasets from a computational viewpoint, we suggest experiments that will result in a more comprehensive understanding of the human genome and its recent and distant ancestors.

    FRI-702 BIG-DATA ANALYTICS AND VISUALIZATION OF WIRELESS CARRIER SPEEDS IN CALIFORNIA

    • Louis Romero
    • Evan Schwander

    Louis Romero¹, Evan Schwander².

    ¹Hartnell College, Salinas, CA, ²School of Computing and Design, California State University, Monterey Bay, Seaside, CA.

    An analysis of the performance and coverage of mobile wireless broadband data services in California will give the general population an accurate, visual representation of the services provided throughout the state. Once analyzed, the data will indicate upload and download bandwidths measured using East and West Coast servers from various locations across California. To this end, field-test data is acquired from remote servers, which contain data from a collection project sponsored by the California Public Utilities Commission, and analyzed to calculate network speeds and visualize them. Raw data in text files is processed through a comma-separated values (CSV) file generator. Once a CSV file is generated, the data is further processed using the Python programming language. MySQL, a relational database management system, is used to store the processed data, which is accessed and managed using Structured Query Language (SQL), a special-purpose programming language. PHP (Hypertext Preprocessor), a general-purpose programming language, is then used to output an Extensible Markup Language (XML) file that describes specific data in the database. Hypertext Markup Language (HTML) is used thereafter to display the processed data as a Google Map with markers containing network data, accessible on the web. Such analysis of upload and download speeds will inform the public of quantified performance, helping to ensure the quality and integrity of wireless carriers and possibly leading to policy reform, repricing of services, or plans to build more cellular towers, which could lead to a higher quality of life in California.
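
    The CSV-to-XML portion of the pipeline can be sketched in a few lines. The example below uses Python's standard library in place of the MySQL and PHP steps so that it is self-contained; it is not the project's code. The column names, element names, carrier labels, and sample values are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_marker_xml(csv_text: str) -> str:
    """Convert CSV measurement rows into an XML document with one marker per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element("markers")
    for row in reader:
        # Attribute values stay as strings; a map page would parse them as needed.
        ET.SubElement(
            root,
            "marker",
            {
                "carrier": row["carrier"],
                "lat": row["lat"],
                "lng": row["lng"],
                "download_kbps": row["download_kbps"],
                "upload_kbps": row["upload_kbps"],
            },
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical field-test rows.
sample_csv = (
    "carrier,lat,lng,download_kbps,upload_kbps\n"
    "CarrierA,36.6777,-121.6555,8200,2100\n"
    "CarrierB,36.6002,-121.8947,5400,1300\n"
)
marker_xml = rows_to_marker_xml(sample_csv)
```

    A web page can then fetch this XML and place one map marker per `marker` element, using the latitude and longitude attributes for position and the speed attributes for the marker's info window.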

    THU-702 LOW COST REAL TIME AUTONOMOUS REMOTE MONITORING PLATFORM

    • Joseph Rodriguez
    • Pedro M. Maldonado
    • Lora Harris
    • Jamie Pierson

    Joseph Rodriguez¹, Pedro M. Maldonado¹, Lora Harris², Jamie Pierson³.

    ¹Universidad Metropolitana, Puerto Rico, San Juan, PR, ²University of Maryland Chesapeake Biological Laboratory, Solomons, MD, ³Horn Point Laboratory, University of Maryland Center for Environmental Science, Cambridge, MD.

    Environmental scientists need to gather multiple parameters during specific time periods to answer their research questions. Most available monitoring systems are very expensive, closed systems, which limits the potential to scale up research projects. We developed a low-cost, autonomous, real-time monitoring platform that is both open hardware/software and easy to build, deploy, manage, and maintain. The hardware is built with off-the-shelf components and a credit-card-sized computer called Raspberry Pi, running an open source operating system (Raspbian). The system runs off a set of batteries and a solar panel, which makes it ideal for remote locations. The software is divided into 3 parts: a framework for abstracting the sensors (initializing, polling, and communications), written in Python with a fully object-oriented design, making it easy for new sensors to be added with minimal code changes; a web front-end application for managing the entire system; and a data store (database) framework for local and remote data retrieval and reporting services. Connectivity to the system can be accomplished through a Wi-Fi or cellular internet connection. Scientists are being forced to do more with less. In response, our platform will provide them with an accessible, modular, low-cost, and efficient monitoring system that can improve the process of data gathering. Currently, we are waiting for permits from the Department of Natural Resources in Puerto Rico to be able to deploy the platform at the Laguna Grande Bioluminescence Lagoon in Fajardo, PR.
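
    An object-oriented sensor abstraction of the kind described above might look like the sketch below, where each new sensor only has to implement one method. This is not the platform's actual code. The class names, method names, and the fake sensor used in place of real hardware are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class Sensor(ABC):
    """Common interface so the polling loop does not depend on sensor details."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ready = False

    def initialize(self) -> None:
        """Prepare the sensor; real drivers would open a bus or serial port here."""
        self.ready = True

    @abstractmethod
    def read(self) -> float:
        """Return one measurement."""


class FakeTemperatureSensor(Sensor):
    """Stand-in for a hardware driver; replays a fixed list of readings."""

    def __init__(self, name: str, readings: Iterable[float]) -> None:
        super().__init__(name)
        self._readings = iter(readings)

    def read(self) -> float:
        return next(self._readings)


def poll_once(sensors: List[Sensor]) -> Dict[str, float]:
    """Initialize any sensor that is not ready, then collect one reading from each."""
    readings = {}
    for sensor in sensors:
        if not sensor.ready:
            sensor.initialize()
        readings[sensor.name] = sensor.read()
    return readings


sensors = [FakeTemperatureSensor("water_temp_c", [27.5, 27.6])]
first_poll = poll_once(sensors)
second_poll = poll_once(sensors)
```

    Adding a new sensor type then means writing one subclass with its own `read` method; the polling loop, storage, and web front end stay unchanged.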