Generating a DNA sequence

Let's generate a DNA sequence by executing the following steps:

  1. Generate a list of DNA sequences, loop through them, and split each sequence into individual nucleotides, which serve as the input for our algorithm.
  2. We remove the tab characters, append the class assignment, and add the nucleotides to the dataset, as follows:
    sequences = list(data.loc[:, 'Sequence'])
    dataset = {}

    for i, seq in enumerate(sequences):
        nucleotides = list(seq)
        nucleotides = [x for x in nucleotides if x != '\t']
        nucleotides.append(classes[i])
        dataset[i] = nucleotides

    print(dataset[0])
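The two steps can also be tried without the chapter's data file. The following is a minimal sketch, not the book's code: the two sequences and their labels are made up for illustration, plain Python lists stand in for the `data` 'Sequence' column and for `classes`, and `build_dataset` is a hypothetical helper name.

```python
# Hypothetical stand-ins for the chapter's inputs; the real dataset is not
# shown in this excerpt, so these values are invented for illustration.
sequences = ["\t\ttactag", "\t\tgccatt"]
classes = ["+", "-"]


def build_dataset(sequences, classes):
    """Map each row index to its nucleotides followed by its class label."""
    dataset = {}
    for i, seq in enumerate(sequences):
        # Split the string into characters and drop the tab characters.
        nucleotides = [x for x in seq if x != "\t"]
        # Append the class assignment as the last element of the row.
        nucleotides.append(classes[i])
        dataset[i] = nucleotides
    return dataset


dataset = build_dataset(sequences, classes)
print(dataset[0])  # ['t', 'a', 'c', 't', 'a', 'g', '+']
```

Each value in `dataset` is one row: the nucleotides in order, with the class label in the final position.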

We now have all of our different columns. Each column contains either an individual nucleotide or a base pair. The nucleotides are thymine ...
