This helps us convert the predictions into a more interpretable format. Let's read the test data into a DataFrame and insert each column of predictions under its matching label:
test_df = pd.read_csv("data/test.csv")
for i, col in enumerate(["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]):
    test_df[col] = test_preds[:, i]
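The assignment above relies on `test_preds` having one row per test example and one column per label, in the same order as the label list. As a minimal sketch of that assumption, the helper below (`attach_predictions` is a hypothetical name, and the toy DataFrame and prediction values are made up for illustration) checks the shape before writing the columns:

```python
import numpy as np
import pandas as pd

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def attach_predictions(df, preds, label_cols=LABEL_COLS):
    """Return a copy of df with one prediction column per label (hypothetical helper)."""
    preds = np.asarray(preds)
    expected = (len(df), len(label_cols))
    # Fail early if the predictions do not line up with the rows and labels.
    if preds.shape != expected:
        raise ValueError(f"expected predictions of shape {expected}, got {preds.shape}")
    out = df.copy()
    for i, col in enumerate(label_cols):
        out[col] = preds[:, i]
    return out


# Toy stand-ins for the real test set and model output (illustrative values only).
demo_df = pd.DataFrame({"id": ["a", "b"], "comment_text": ["first", "second"]})
demo_preds = np.array([
    [0.9, 0.1, 0.8, 0.0, 0.7, 0.2],
    [0.1, 0.0, 0.2, 0.0, 0.1, 0.0],
])
demo_out = attach_predictions(demo_df, demo_preds)
print(demo_out)
```

Working on a copy leaves the original DataFrame untouched, which can be convenient when comparing predictions from several models against the same test set.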
Now we can preview the first few rows of the DataFrame:
test_df.head(3)
We get the following output:
| | id | comment_text | toxic | severe_toxic | obscene | threat | insult | identity_hate |
---|---|---|---|---|---|---|---|---|
0 | 00001cee341fdb12 | Yo bitch Ja Rule is more succesful then you'll... | 0.629146 | 0.116721 | 0.438606 | 0.156848 | 0.139696 | 0.388736 |
1 | 0000247867823ef7 | == From RfC == \r\n\r\n The title is fine as i... ... |