Noise in data can come from many sources, but it is often not a significant issue, because most machine learning techniques are resilient to noisy datasets. Noise can come from environmental factors (for instance, an air conditioner compressor turning on at random and causing signal noise in a nearby sensor). It can come from transcription errors (somebody recorded the wrong data point, selected the wrong option in a survey, or an OCR algorithm read a 3 as an 8). Or it can be inherent to the data itself (such as temperature recordings, which follow a seasonal pattern but fluctuate noisily from day to day).
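To make the temperature example concrete, here is a minimal sketch of noise that is inherent to the data. Everything in it is an assumption made for illustration: the data is simulated rather than measured, the seasonal pattern is modeled as a simple yearly sine wave, the daily noise is Gaussian, and the function names `simulate_daily_temperatures` and `centered_moving_average` are hypothetical. A moving average is only one of many ways to separate a slow pattern from day-to-day fluctuation.

```python
import math
import random


def simulate_daily_temperatures(days=730, seed=0):
    """Hypothetical data: a yearly sine cycle plus Gaussian day-to-day noise."""
    rng = random.Random(seed)
    temps = []
    for day in range(days):
        # Assumed seasonal pattern: 15 degrees on average, swinging by 10.
        seasonal = 15.0 + 10.0 * math.sin(2 * math.pi * day / 365.0)
        # Assumed daily noise: standard deviation of 3 degrees.
        temps.append(seasonal + rng.gauss(0.0, 3.0))
    return temps


def centered_moving_average(values, window=31):
    """Average each point with its neighbors; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


temps = simulate_daily_temperatures()
smoothed = centered_moving_average(temps)
```

In this sketch, `temps` plays the role of the raw recordings and `smoothed` approximates the underlying seasonal pattern. A wider window removes more of the daily noise but also flattens real short-term changes, so the window size is a judgment call rather than a fixed rule.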
Noise in categorical data can also be caused by category labels that aren't normalized, such as images that are tagged ...