APPENDIX D
How to Start Understanding What the Annual Reports Say About Digital with the Help of Natural Language Processing (NLP)
To carry the planned NLP concepts into the empirical approach, I developed a new concept that I call “quantification levels.” The idea is to assign a higher level the more specific or concrete the outcome is assumed to be. For this book, I ended up using only the two extreme ends (Level 1 and Level 3), but in the original research, a careful second‐stage preprocessing of all reports built the foundation for the analysis along all three levels. (See Figure D.1.)
Level 1 analyzed, separately for each 10‐K filing date, the number of occurrences of digital transformation language dictionary terms per category in the “raw” reports and, for normalization purposes, the relative percentage of these occurrences versus total words, as a proxy for digital transformation outcomes. In simple words, it counts how often digital terms appear in a report and makes this number comparable across reports by adjusting it for document length.
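To make the Level 1 idea concrete, here is a minimal Python sketch of counting dictionary terms per category and normalizing by document length. It is an illustration under stated assumptions, not the procedure used in the original research: the dictionary `DIGITAL_TERMS`, its categories and terms, the tokenization rule, and the function names are all hypothetical, and the real analysis ran on preprocessed 10‐K reports rather than a toy sentence.

```python
import re

# Hypothetical dictionary: categories and terms are illustrative only,
# not the actual digital transformation language dictionary.
DIGITAL_TERMS = {
    "technology": ["cloud", "artificial intelligence", "machine learning"],
    "commerce": ["e-commerce", "digital channel"],
}

# Assumed tokenization: lowercase words, keeping internal hyphens/apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:[-'][a-z0-9]+)*")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def count_phrase(tokens, phrase_tokens):
    """Count how often a (possibly multiword) term occurs in the token list."""
    n = len(phrase_tokens)
    if n == 0:
        return 0
    return sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase_tokens
    )


def level1_counts(report_text, dictionary):
    """Return, per category, the raw count and the percentage of total words."""
    tokens = tokenize(report_text)
    total_words = len(tokens)
    result = {}
    for category, terms in dictionary.items():
        count = sum(count_phrase(tokens, tokenize(term)) for term in terms)
        pct = 100.0 * count / total_words if total_words else 0.0
        result[category] = {"count": count, "pct_of_words": pct}
    return result


# Toy stand-in for the text of one report (one 10-K filing date).
sample = (
    "Our cloud platform uses machine learning. "
    "E-commerce sales grew as the cloud matured."
)

for category, stats in level1_counts(sample, DIGITAL_TERMS).items():
    print(category, stats["count"], round(stats["pct_of_words"], 2))
```

In this sketch, `level1_counts` would be called once per report, so each filing date gets its own counts and percentages. Dividing by the total word count is what makes a long report and a short report comparable.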
Level 2 (not used for this book) analyzed, separately for each 10‐K filing date, the number of occurrences of digital transformation language dictionary terms per category that appear in connection with explicit statements on monetary or timing impacts, and the relative percentage of these occurrences (versus total Level 1 occurrences), as a proxy for digital transformation outcomes. In simple words, it counts the percentage ...