Corporate data consists of structured data and unstructured data. Unstructured data, in turn, consists of repetitive data and nonrepetitive data, and the separation between the two can be called the “great divide”. Repetitive Big Data centers on Hadoop, where most of the activity consists of data management functions for very large amounts of data. Nonrepetitive data is organized around textual disambiguation, which includes such functions as subdocument processing, inline contextualization, taxonomical resolution, acronym resolution, standardization, stop word processing, homographic resolution, and proximity resolution, among others.
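To make a few of these functions concrete, the following is a minimal sketch of stop word processing, acronym resolution, and taxonomical resolution applied to a short piece of text. It is an illustration under stated assumptions, not a description of any particular product: the lookup tables (ACRONYMS, STOP_WORDS, TAXONOMY) and the function names are hypothetical, and a real system would draw on much larger, curated vocabularies.

```python
import re

# Hypothetical lookup tables, invented for this illustration only.
ACRONYMS = {"hr": "human resources", "po": "purchase order"}
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "for"}
TAXONOMY = {"ford": "car", "honda": "car", "oak": "tree"}


def tokenize(text):
    """Standardize case and split the text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def resolve_acronyms(tokens):
    """Expand each known acronym into the words it stands for."""
    expanded = []
    for token in tokens:
        expanded.extend(ACRONYMS.get(token, token).split())
    return expanded


def remove_stop_words(tokens):
    """Drop words that carry little meaning on their own."""
    return [token for token in tokens if token not in STOP_WORDS]


def classify(tokens):
    """Attach a taxonomy category to every token the taxonomy knows."""
    return [(token, TAXONOMY[token]) for token in tokens if token in TAXONOMY]


def disambiguate(text):
    """Run the three steps in sequence and return tokens plus categories."""
    tokens = remove_stop_words(resolve_acronyms(tokenize(text)))
    return {"tokens": tokens, "classified": classify(tokens)}


if __name__ == "__main__":
    print(disambiguate("The PO for a Ford is sent to HR"))
```

In this sketch acronyms are expanded before stop words are removed, so that an expansion containing a stop word would be cleaned up as well. The ordering is a design choice made for the example, not a rule taken from the text above.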