
Predictive Analytics For Dummies

by Anasse Bari, Mohamed Chaouchi, Tommy Jung

March 2014

Beginner to intermediate

360 pages

8h 58m

English

For Dummies


Introduction
About This Book · Foolish Assumptions · Icons Used in This Book · Beyond the Book · Where to Go from Here
Part I: Getting Started with Predictive Analytics
Chapter 1: Entering the Arena
Exploring Predictive Analytics · Mining data · Highlighting the model · Adding Business Value · Endless opportunities · Empowering your organization · Starting a Predictive Analytic Project · Business knowledge · Data-science team and technology · The Data · Surveying the Marketplace · Responding to big data · Working with big data
Chapter 2: Predictive Analytics in the Wild
Online Marketing and Retail · Recommender systems · Implementing a Recommender System · Collaborative filtering · Content-based filtering · Hybrid recommender systems · Target Marketing · Targeting using predictive modeling · Uplift modeling · Predictive Analytics Fight Fraud and Crime · Content and Text Analytics
Chapter 3: Exploring Your Data Types and Associated Techniques
Recognizing Your Data Types · Structured and unstructured data · Static and streamed data · Identifying Data Categories · Attitudinal data · Behavioral data · Demographic data · Generating Predictive Analytics · Data-driven analytics · User-driven analytics · Connecting to Related Disciplines · Statistics · Data mining · Machine learning
Chapter 4: Complexities of Data
Finding Value in Your Data · Delving into your data · Data validity · Data variety · Constantly Changing Data · Data velocity · High volume of data · Complexities in Searching Your Data · Keyword-based search · Semantic-based search · Differentiating Business Intelligence from Big-Data Analytics · Visualization of Raw Data · Identifying data attributes · Exploring data visualization · Tabular visualizations · Bar charts · Pie charts · Graph charts · Word clouds as representations · Line graphs · Flocking birds representation
Part II: Incorporating Algorithms in Your Models
Chapter 5: Applying Models
Modeling Data · Models and simulation · Categorizing models · Describing and summarizing data · Making better business decisions · Healthcare Analytics Case Studies · Google search queries as epidemic predictors · Cancer survivability predictors · Social and Marketing Analytics Case Studies · Tweets as predictors for the stock market · Target store predicts pregnant women · Twitter-based predictors of earthquakes · Twitter-based predictors of political campaign outcomes
Chapter 6: Identifying Similarities in Data
Explaining Data Clustering · Motivation · Converting Raw Data into a Matrix · Creating a matrix of terms in documents · Term selection · Identifying K-Groups in Your Data · K-means clustering algorithm · Clustering by nearest neighbors · Finding Associations Among Data Items · Applying Biologically Inspired Clustering Techniques · Birds flocking · Ant colonies
Chapter 7: Predicting the Future Using Data Classification
Explaining Data Classification · Lending · Marketing · Healthcare · What’s next? · Introducing Data Classification to Your Business · Exploring the Data-Classification Process · Using Data Classification to Predict the Future · Decision trees · Support vector machine · Naïve Bayes classification algorithm · Neural networks · The Markov Model · Linear regression · Ensemble Methods to Boost Prediction Accuracy

Part III: Developing a Roadmap
Chapter 8: Convincing Your Management to Adopt Predictive Analytics
Making the Business Case · Benefits to the business · Gathering Support from Stakeholders · Working with your sponsors · Getting business and operations buy-in · Getting IT buy-in · Rapid prototyping · Presenting Your Proposal
Chapter 9: Preparing Data
Listing the Business Objectives · Identifying related objectives · Collecting user requirements · Processing Your Data · Identifying the data · Cleaning the data · Generating any derived data · Reducing the dimensionality of your data · Structuring Your Data · Extracting, transforming and loading your data · Keeping the data up to date · Outlining testing and test data
Chapter 10: Building a Predictive Model
Getting Started · Defining your business objectives · Preparing your data · Choosing an algorithm · Developing and Testing the Model · Developing the model · Testing the model · Evaluating the model · Going Live with the Model · Deploying the model · Monitoring and maintaining the model
Chapter 11: Visualization of Analytical Results
Visualization As a Predictive Tool · Why visualization matters · Getting the benefits of visualization · Dealing with complexities · Evaluating Your Visualization · How relevant is this picture? · How interpretable is the picture? · Is the picture simple enough? · Does the picture lead to new insights? · Visualizing Your Model’s Analytical Results · Visualizing hidden groupings in your data · Visualizing data classification results · Visualizing outliers in your data · Visualization of Decision Trees · Visualizing predictions · Other Types of Visualizations in Predictive Analytics · Bird-flocking behavior data visualization
Part IV: Programming Predictive Analytics
Chapter 12: Creating Basic Prediction Examples
Installing the Software Packages · Installing Python · Installing the machine-learning module · Installing the dependencies · Preparing the Data · Getting the sample dataset · Labeling your data · Making Predictions Using Classification Algorithms · Creating a supervised learning model with SVM · Creating a supervised learning model with logistic regression · Comparing two classification models
Chapter 13: Creating Basic Examples of Unsupervised Predictions
Getting the Sample Dataset · Using Clustering Algorithms to Make Predictions · Comparing two clustering models · Creating an unsupervised learning model with K-means · Creating an unsupervised learning model with DBSCAN
Chapter 14: Predictive Modeling with R
Programming in R · Installing R · Installing RStudio · Getting familiar with the environment · Learning just a bit of R · Making Predictions Using R · Predicting using regression · Using classification to predict
Chapter 15: Avoiding Analysis Traps
Data Challenges · Outlining the limitations of the data · Dealing with extreme cases (outliers) · Data smoothing · Curve fitting · Keeping the assumptions to a minimum · Analysis Challenges · Supervised analytics · Relying on only one analysis · Describing the limitations of the model · Avoiding non-scalable models · Scoring your predictions accurately
Chapter 16: Targeting Big Data
Major Technological Trends in Predictive Analytics · Exploring predictive analytics as a service · Aggregating distributed data for analysis · Real-time data-driven analytics · Applying Open-Source Tools to Big Data · Apache Hadoop · Apache Mahout · Building a Rapid Prototype of Your Predictive Analytics Model · Prototyping for predictive analytics · Testing your predictive analytics model
Part V: The Part of Tens
Chapter 17: Ten Reasons to Implement Predictive Analytics
Outlining Business Goals · Knowing Your Data · Organizing Your Data · Satisfying Your Customers · Reducing Operational Costs · Increasing Returns on Investments (ROI) · Increasing Confidence · Making Informed Decisions · Gaining Competitive Edge · Improving the Business
Chapter 18: Ten Steps to Build a Predictive Analytic Model
Building a Predictive Analytics Team · Getting business expertise on board · Firing up IT and math expertise · Setting the Business Objectives · Preparing Your Data · Sampling Your Data · Avoiding “Garbage In, Garbage Out” · Keeping it simple isn’t stupid · Data preparation puts the good stuff in · Creating Quick Victories · Fostering Change in Your Organization · Building Deployable Models · Evaluating Your Model · Updating Your Model
About the Authors
Cheat Sheet
More Dummies Products

Content preview from Predictive Analytics For Dummies

Chapter 6

Identifying Similarities in Data

In This Chapter

Clustering data

Identifying hidden groups of similar information in your data

Finding associations among data items

Organizing data with biologically inspired clustering

There is so much data around us that it can feel overwhelming. Large amounts of information are constantly being generated, organized, analyzed, and stored. Data clustering is a process that helps you make sense of this flood of data by discovering hidden groupings of similar data items. In essence, clustering gives you a description of your data that says it contains x groups of similar data objects.

Clustering, in the form of grouping similar things, is part of our daily activities. You use clustering any time you group similar items together. For example, when you store groceries in your fridge, you group the vegetables by themselves in the crisper, put frozen foods in their own section (the freezer), and so on. When you organize currency in your wallet, you arrange the bills by denomination: larger with larger, smaller with ...
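
To make the idea of discovering hidden groupings concrete, here is a minimal sketch of one clustering technique, k-means, applied to a single column of numbers. It is not the authors' code: the function name `kmeans_1d`, the sample `amounts`, and the starting centers are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def kmeans_1d(values, centers, max_iter=100):
    """Group numbers around their nearest center (illustrative 1-D k-means)."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each value joins the group of its closest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break  # centers stopped moving, so the grouping is stable
        centers = new_centers
    return groups, centers


# Hypothetical data: small and large amounts mixed together.
amounts = [1, 2, 3, 2, 50, 52, 49, 51]
groups, centers = kmeans_1d(amounts, centers=[0, 10])
print(groups)   # two groups: the small amounts and the large amounts
print(centers)  # the average value of each group
```

The caller has to say how many groups to look for (here two, through the two starting centers), and the starting centers can affect the result on less clearly separated data.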



Publisher Resources

ISBN: 9781118729410
