Network for Integrating Bioinformatics into Life Sciences Education: Conference 2019 Results

Molecular phylogeny implemented in an introductory plant classification course

Chao Cai¹, Jody Banks¹

¹Purdue University

Plant classification is one of the core components of undergraduate programs in the plant sciences. Traditionally, plant classification courses have primarily introduced morphology-based taxonomy because of practical needs in the field. However, the publication of new plant classification systems by the Angiosperm Phylogeny Group (APG) using molecular phylogeny methods has led to a trend toward using molecular evidence (DNA barcodes) for plant identification. In our introductory plant classification course, we included a two-week module (lectures and labs) to introduce key concepts and fundamental skills in molecular phylogeny. Week 1 included concepts of evolutionary tree thinking, data mining in NCBI using BLAST searches, and phylogenetic tree building. Week 2 introduced concepts of DNA sequencing and barcoding for plant identification. Students selected their own plants and sequenced their DNA barcodes, which were then used in the final exam for practice and summative assessment. One challenge we constantly face is the increasing difficulty of finding diverse sequences using BLAST because of the fast-growing number of sequenced angiosperm genomes.

Bioinformatics for Biologists: Have a BLAST and beyond!

Jessica Siltberg-Liberles¹, Rocio Benabentos², Sreyasi Biswas²

¹Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, Florida ²STEM Transformation Institute, Florida International University, Miami, Florida

We present our ongoing work to develop and implement a self-assessment of bioinformatics for undergraduate life science students in Bioinformatics for Biologists, an active-learning course that includes an authentic course-based undergraduate research experience (CURE). The course follows a partially flipped instructional mode and is divided into two sections. In the first section, students learn to do bioinformatics through activities and interactive lectures. In the second section, students work in groups and apply their acquired bioinformatics skills to a novel course-based research project. The course is designed to emphasize not only the application of bioinformatics skills in a novel research context but also the importance of collaboration, manuscript preparation, peer review, and reproducibility of bioinformatics results. The course is currently taught with the support of six undergraduate Learning and Writing Assistants. To assess our learning goals, we have developed and implemented a survey targeting the bioinformatics core competencies. Moreover, we have developed a concept inventory quiz to measure students’ conceptual understanding of common bioinformatics algorithms and their biological applications. Here, we present preliminary data demonstrating that students gain substantial self-confidence in their bioinformatics core competencies after taking this course.

Hemoglobin – connecting DNA sequence to structure and function

Keith A. Johnson

Department of Biology, Bradley University, Peoria, IL

This three-part case study is aimed at connecting students’ understanding of DNA mutations, their effects on the encoded polypeptide sequence and structure, and the impact of a mutation on the function of the protein. The case study uses the tetrameric hemoglobin protein to illustrate the impact of mutations on the DNA sequence and protein function. The first part directs students to compare the α- and β-globin DNA and polypeptide sequences using BLAST and multiple sequence alignments. In the second part, students compare the nucleic acid sequences of the globin genes with the mutated sickle-cell anemia sequence, determine which globin sequence has been altered, and identify the type of mutation that occurs (at the DNA and amino acid levels). They are also directed to OMIM.org to find more information about the disease. Connecting the mutation to protein structure is an important part of understanding how mutations affect function, so in the third part the students observe a three-dimensional model (PDB file), examine the location of the sickle-cell mutation, and discuss its impact on protein structure and function. Additional worksheets can be generated to follow up on other globinopathies and protein function.
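The comparison in the second part of the case study can be sketched in a few lines of Python. This is only an illustration, not the worksheet's actual procedure: the two fragments below are meant to represent the first nine codons of the β-globin coding sequence and its sickle-cell variant and should be checked against the GenBank record before classroom use. The function names `translate` and `point_mutations` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def point_mutations(ref, alt):
    """List substitutions between two equal-length coding sequences.

    Each entry is (nucleotide position, ref base, alt base,
    codon number, ref amino acid, alt amino acid), all 1-based.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must have the same length")
    changes = []
    for i, (r, a) in enumerate(zip(ref, alt)):
        if r != a:
            start = i - i % 3  # first base of the affected codon
            changes.append((i + 1, r, a, i // 3 + 1,
                            CODON_TABLE[ref[start:start + 3]],
                            CODON_TABLE[alt[start:start + 3]]))
    return changes


# Illustrative fragments (assumed, verify against GenBank before use).
normal = "ATGGTGCATCTGACTCCTGAGGAGAAG"
sickle = "ATGGTGCATCTGACTCCTGTGGAGAAG"

for pos, r, a, codon, aa_ref, aa_alt in point_mutations(normal, sickle):
    kind = "silent" if aa_ref == aa_alt else "nonsense" if aa_alt == "*" else "missense"
    print(f"nt {pos}: {r}>{a}, codon {codon}: {aa_ref}>{aa_alt} ({kind})")
```

Codon numbers here count the initiator methionine as codon 1, so the classic "position 6" glutamate-to-valine change appears as codon 7.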

Yeti or Not – Do They Exist?

Keith A. Johnson

Department of Biology, Bradley University, Peoria, IL

Through this two-part bioinformatics case study, students will be led through the forensic analysis of putative Yeti artifacts based on published findings. In the first part of the case study, students will review the inheritance patterns of nuclear and mitochondrial DNA. Students will also review and compare prokaryotic DNA replication with the technique of Sanger dideoxy sequencing. This part of the case study can easily be given during a classroom lecture period. In the second (take-home) part of the case study, students are introduced to the GenBank database of sequences and the BLAST algorithm. Students are led through a BLAST search using short, published DNA fragments from purported Yeti artifacts. The sequences are compared with the GenBank database to determine whether there is a match and are then used to create a multiple sequence alignment (Clustal Omega) and a relationship tree. In a ‘Breaking News’ addition, a separate sequence is provided that currently does not match any sequence in the database but is clearly related. Students are asked to perform a BLAST search on this ‘new’ sequence and draw conclusions. The assessment of the second part has been performed through a course website.
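The idea behind the database comparison can be sketched with a toy example. This is not BLAST, which uses heuristic local alignment against a large database. It is a minimal stand-in that scores a query against a handful of pre-aligned reference fragments by percent identity. All sequences and species labels below are invented for illustration and are not taken from the published Yeti data.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match(query, references):
    """Return the name of the closest reference and all percent-identity scores."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Made-up 20-nucleotide fragments (hypothetical, for illustration only).
references = {
    "bear": "ACGTTGCAAGCTTAGGCTAA",
    "horse": "ACGATGCTAGCTAAGGCAAA",
    "human": "TCGATGGTAGCTAACGCAAT",
}
query = "ACGTTGCAAGCTTAGGCTAT"

best, scores = best_match(query, references)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1f}% identity")
print("closest reference:", best)
```

A query that is close to a reference without matching it exactly, as in the ‘Breaking News’ sequence, would appear here as a high but imperfect identity score.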

Immuno-biotechnology and bioinformatics in Community Colleges

Todd M. Smith¹, Sandra G. Porter¹,², Dina Kovarik²

¹Digital World Biology and ²Shoreline Community College

Immuno-biotechnology is one of the fastest-growing areas in the field of biotechnology. Digital World Biology’s Biotech-Careers.org database has over 700 companies that are involved in immunology. Next-generation DNA sequencing and other technologies have significantly increased the use of computing technologies to decipher the meaning of large datasets and predict interactions between immune receptors and their targets.

Technologies like immune profiling and targeted cancer therapies are leading to job growth and demands for new skills and knowledge in biomanufacturing, quality systems, and immuno-bioinformatics. In response to these new demands, Shoreline Community College (Shoreline, WA) is developing an immuno-biotechnology certificate that includes a five-week course on immuno-bioinformatics.

The immuno-bioinformatics course includes exercises in immune profiling, vaccine development, and operating bioinformatics programs using a command-line interface. Students will explore T-cell receptor datasets from early-stage breast cancer samples using Adaptive Biotechnologies’ (Seattle, WA) immunoSEQ Analyzer public server. Next, they will use the IEDB (Immune Epitope Database) in conjunction with Molecule World (Digital World Biology) to predict antigens from sequences and to learn the differences between the continuous and discontinuous epitopes that are recognized by T-cell receptors and antibodies. Finally, students will use cloud computing (CyVerse) and IgBLAST (NCBI) to gain additional experience.

Gotta Catch ‘Em All - Using Pokémon to Introduce Students to Phylogenetics and Bioinformatics

Janelle Nunez-Castilla and Jessica Siltberg-Liberles

Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA

At Florida International University, a large public minority-serving institution with 4000 biology majors, we implemented an evolution lab in General Biology II that exposes introductory biology students to bioinformatics. The narrative of the lab suggests that Pokémon have been discovered in the wild and that we need to determine where they fit in a broader evolutionary context. Students use morphological traits to complete a character matrix and build a phylogeny as their initial hypothesis for how the Pokémon relate to one another. Different student groups are then given protein sequence data said to correspond to a specific Pokémon (but actually from an animal), which they use to run a BLAST search, build a multiple sequence alignment, and generate a basic phylogeny. Using their results, students determine how the different Pokémon are related and compare their morphology-based phylogeny with their protein-based phylogeny. A post-assessment shows that most students had never heard of bioinformatics prior to the lab. Additionally, through the lab most students grasp the importance of bioinformatics in biology, but only about half think that they will need to use bioinformatics in the future.
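The step from a character matrix to a first tree hypothesis can be sketched as follows. This is an assumption-laden illustration rather than the lab's actual method: it counts trait mismatches between taxa and clusters them by average linkage (UPGMA), whereas the lab may well use a parsimony-style approach. The taxa `A` to `D` and the five traits are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of characters at which two taxa differ."""
    return sum(x != y for x, y in zip(a, b))


def average_distance(members_a, members_b, matrix):
    """Mean pairwise distance between the members of two clusters (UPGMA linkage)."""
    total = sum(hamming(matrix[x], matrix[y]) for x in members_a for y in members_b)
    return total / (len(members_a) * len(members_b))


def upgma_topology(matrix):
    """Return a Newick-style topology (no branch lengths) from a character matrix."""
    # Map each cluster's Newick label to the original taxa it contains.
    clusters = {name: [name] for name in matrix}
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]], matrix),
        )
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[f"({a},{b})"] = merged
    return next(iter(clusters)) + ";"


# Hypothetical character matrix: 1 = trait present, 0 = trait absent.
# Columns (assumed): wings, tail, fur, fins, legs.
character_matrix = {
    "A": (1, 1, 0, 0, 1),
    "B": (1, 1, 0, 0, 0),
    "C": (0, 1, 1, 0, 1),
    "D": (0, 0, 1, 1, 1),
}

print(upgma_topology(character_matrix))
```

The resulting topology can then be compared by eye with a tree built from the protein sequences, which is the comparison the students make in the lab.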

Faculty Mentoring Networks Facilitate the Infusion of Bioinformatics into a Developmental Biology Course

Barbara Murdoch

Eastern Connecticut State University

Having little working knowledge of bioinformatics, but wanting to incorporate bioinformatics into my biology courses, I sought support through a Faculty Mentoring Network. The Faculty Mentoring Network supplied peer-reviewed bioinformatics modules as my guide and provided helpful discussions on how to deliver and assess the content in class. I tailored the modules to align with my own course material and student learning objectives for a Developmental Biology course. The students not only appeared to enjoy these bioinformatics exercises but also made learning gains, as assessed via pre/post surveys. The Faculty Mentoring Network provided the much-needed impetus to create and incorporate bioinformatics content into my biology course.

The Genomics Education Partnership: A community of practice that enhances research opportunities for students and faculty at diverse institutions

Charles Hauser¹, Christopher Jones², Anne Rosenwald³, Wilson Leung⁴, Sarah C.R. Elgin⁴, Laura K. Reed⁵ and the Genomics Education Partnership

¹St. Edward’s University, Austin TX; ²Moravian College, Bethlehem, PA; ³Georgetown University, Washington, DC; ⁴Washington University in St. Louis, MO; ⁵University of Alabama - Tuscaloosa, AL

Since 2006, the Genomics Education Partnership (GEP) has incorporated authentic genomics research experiences into the undergraduate curriculum, introducing thousands of students to eukaryotic gene structure, comparative genomics, and genome evolution. Our 100+ participating institutions include community colleges, primarily undergraduate institutions, minority-serving institutions (MSIs), historically black colleges and universities, and research-intensive PhD-granting institutions. For many faculty and their students, the accessible, immersive curriculum and custom bioinformatics tools represent a unique opportunity to participate in research. GEP has partnered with Galaxy to develop G-OnRamp, an open-source platform for constructing UCSC Assembly Hubs and JBrowse/Apollo genome browsers for eukaryotic genomes. This has enabled crowd-sourced gene annotation for more varied research projects to be incorporated into the GEP portfolio, including investigations of the evolution of venom in parasitoid wasps and of the evolution of insulin pathway genes across 27 Drosophila genomes. Our ongoing work in science education finds that a bioinformatics CURE fosters “formative frustration,” in which students can safely fail in their original analysis, adjust, recover, and succeed. This iterative process allows students to gain deeper insights into annotation and occurs quickly within an inexpensive, online framework. Supported by NSF IUSE-1431407 to SCRE, NSF IUSE-1915544 to LKR, and NIH IPERT-1R25GM130517-01 to LKR.

HITS: A network to create inquiry-based case studies that make high-throughput approaches and discovery accessible

Carlos C. Goller¹ and Sabrina Robertson²

¹North Carolina State University

²The University of North Carolina at Chapel Hill

Modern molecular biology techniques increasingly use automation and miniaturization to test numerous samples or conditions simultaneously. High-throughput (HT) approaches include massively parallel sequencing of DNA, synthesis of numerous nucleic acids and peptides, automated microscopy, microfluidics for single-cell analyses, small-molecule screening using robotics, and genome-scale phenotypic characterization using CRISPR/Cas9 gene editing technologies. These approaches produce a wealth of results, often labeled ‘big data.’ However, there are few educational case studies that address authentic high-throughput approaches using real data. We believe that well-designed, accessible educational case studies focusing on HT approaches and using original datasets empower students to learn current approaches and exercise quantitative reasoning in data analyses.

We created the NSF-funded High-throughput Discovery Science & Inquiry-based Case Studies for Today’s Students (HITS) Research Coordination Network to address this gap. HITS brings together interdisciplinary groups of HT researchers and instructors to produce authentic HT case studies that can be implemented in a variety of courses, allowing students to analyze real data and learn valuable quantitative skills. Since 2018, twenty-one faculty Case Fellows, numerous case study experts, and HT researchers have formed interdisciplinary groups to design, improve, and implement HT case studies. Using QUBES, groups have created novel cases for broad curricula.

Genome Solver: Student Learning in Bioinformatics After Faculty Training

Vinayak Mathur#, Gaurav S. Arora^, and Anne G. Rosenwald*

#Department of Biology, Cabrini University, Radnor, PA

^Department of Science, Technology, and Mathematics, Gallaudet University, Washington, DC

*Department of Biology, Georgetown University, Washington, DC

Bioinformatics offers students opportunities to engage in interdisciplinary research, even under low-resource conditions, since only a computer and an internet connection are required. There are tens of thousands of bacterial sequences now available for examination. Our Genome Solver Project teaches faculty how to acquire these sequences and some basic web-based tools that can be employed to study them, with the aim that faculty will include these new skills in their courses. Does this teacher training result in student learning gains in core bioinformatics concepts?

To address this question, we designed a quiz and asked faculty to have their students take the quiz before instruction in bioinformatics and then again afterwards. The current dataset, which encompasses data from ~650 students at six diverse institutions, demonstrates that quiz scores improved significantly (p < 0.5) after the intervention. The gains observed were similar at each school over multiple semesters. We also conducted phone interviews with individual faculty to learn more about their implementation practices. These included questions about the amount of time devoted to Genome Solver and other bioinformatics materials, active learning practices, and the extent to which students engaged in authentic research in the course. Our data indicate that faculty training in bioinformatics results in positive outcomes for their students.
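One common way to test a pre/post improvement of this kind is a paired t-test on each student's score difference. The abstract does not say which test was used, so the sketch below is an assumption. The scores are invented for illustration and are not Genome Solver data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("each student needs both a pre and a post score")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical quiz scores for eight students (not real data).
pre_scores = [4, 5, 3, 6, 5, 4, 7, 5]
post_scores = [6, 7, 5, 7, 8, 5, 9, 6]

t_stat, dof = paired_t(pre_scores, post_scores)
print(f"mean gain = {mean(b - a for a, b in zip(pre_scores, post_scores)):.2f}")
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The t statistic is then compared with the critical value of the t distribution for the given degrees of freedom, or converted to a p-value with a statistics package.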

The Genomics Education Alliance (GEA)

Anne G. Rosenwald¹, Vince Buonaccorsi², Douglas Chalker³, Rochelle Tractenberg¹, Jason Williams⁴

¹Georgetown University, ²Juniata College, ³Washington University in St. Louis, ⁴Cold Spring Harbor Laboratories

GEA is a group of life science educators with experience in engaging students in genomics-based Classroom-based Undergraduate Research Experiences (CUREs). Because genomics CUREs are an effective way for students to learn key concepts and the practice of science, we have come together to identify and overcome common barriers and so put such experiences within the reach of all life science faculty and students. To achieve this goal, GEA will curate curriculum and assessment materials and make them freely available on the CyVerse platform. We are now piloting a set of materials for use by educators in their classrooms in three areas: examining gene sequence similarities using BLAST, understanding gene structure by using a eukaryotic genome browser, and investigating gene expression by using basic tools for RNA-seq analysis. Eventually, we will curate a wide variety of materials, both from our existing CUREs and from new resources we create, and make available the compute resources necessary for use in the classroom. Ultimately, we aim to facilitate efforts by faculty who build their own genomics CUREs using our optimized resources. We are currently recruiting faculty to test our pilot resources; let us know if you’re interested. Supported by NSF RCN-UBE grant # DBI 1827130.