2 Probabilistic Reasoning and Bioinformatics

This chapter reviews statistics and probability concepts and implements many of them in Python. Python scripts are then used for a preliminary examination of the randomness of genomic (virus) sequence data. A short review of Linux OS setup (with Python automatically installed) and Python syntax is given in Appendix A.

Numerous prior book, journal, and patent publications by the author [168] are drawn upon extensively throughout the text. Almost all of the journal publications are open access. These publications can typically be found online at either the author's personal website (www.meta‐logos.com) or with one of the following online publishers: www.m‐hikari.com or bmcbioinformatics.biomedcentral.com.

2.1 Python Shell Scripting

A “fair” die has equal probability of rolling a 1, 2, 3, 4, 5, or 6, i.e. a probability of 1/6 for each of the outcomes. Notice that the discrete probabilities for the different outcomes sum to 1. This is always the case for probabilities describing a complete set of outcomes.
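As a quick check of the sum-to-one property, the fair die can be written out as a small table of probabilities. This is only a minimal sketch and not the book's own script; the name fair_die is chosen here for illustration, and exact fractions are used so that the sum is not affected by floating-point rounding.

```python
from fractions import Fraction

# Fair six-sided die: each face 1..6 has probability 1/6.
# (fair_die is an illustrative name, not taken from the text.)
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Probabilities over a complete set of outcomes must sum to exactly 1.
total = sum(fair_die.values())
print(total)  # 1
```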

A “loaded” die has a non‐uniform distribution. For example, with probability 0.5 of rolling a “6” and uniform probability over the other five rolls, the loaded die has die_roll_probability = (1/10,1/10,1/10,1/10,1/10,1/2).
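The loaded die can be explored the same way. The sketch below is an illustration under stated assumptions, not a reproduction of prog1.py: it reuses the tuple die_roll_probability from the text, while faces, expected_roll, the fixed seed, and the sample size of 10,000 are choices made here. It checks that the distribution is normalized, computes the expected roll, and simulates rolls with the standard-library random.choices function.

```python
import random
from fractions import Fraction

faces = (1, 2, 3, 4, 5, 6)

# Loaded die from the text: 1/10 for faces 1-5 and 1/2 for face 6.
die_roll_probability = (Fraction(1, 10),) * 5 + (Fraction(1, 2),)

# A valid discrete distribution still sums to 1.
assert sum(die_roll_probability) == 1

# Expected roll: sum of face * probability (9/2 here, versus 7/2 for a fair die).
expected_roll = sum(f * p for f, p in zip(faces, die_roll_probability))

# Simulate rolls; the seed and sample size are arbitrary illustrative choices.
rng = random.Random(0)
weights = [float(p) for p in die_roll_probability]
rolls = rng.choices(faces, weights=weights, k=10_000)
fraction_of_sixes = rolls.count(6) / len(rolls)

print(expected_roll)        # 9/2
print(fraction_of_sixes)    # close to 0.5
```

With more simulated rolls, the observed fraction of sixes should settle closer to the stated probability of 0.5.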

The first program to be discussed is named prog1.py and will introduce the notion of discrete probability distributions in the context of rolling the familiar six‐sided die. Comments in Python ...
