Book description
Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.
Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve, rather than rely on tools to think for you.
 Use graphics to describe data with one, two, or dozens of variables
 Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
 Mine data with computationally intensive methods such as simulation and clustering
 Make your conclusions understandable through reports, dashboards, and other metrics programs
 Understand financial calculations, including the time value of money
 Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
 Become familiar with different open source programming environments for data analysis
"Finally, a concise reference for understanding how to conquer piles of data." - Austin King, Senior Web Developer, Mozilla
"An indispensable text for aspiring data scientists." - Michael E. Driscoll, CEO/Founder, Dataspora
Table of contents
 Dedication
 A Note Regarding Supplemental Files
 Preface
 1. Introduction

I. Graphics: Looking at Data
 2. A Single Variable: Shape and Distribution
 3. Two Variables: Establishing Relationships
 4. Time As a Variable: Time-Series Analysis
 5. More Than Two Variables: Graphical Multivariate Analysis
 6. Intermezzo: A Data Analysis Session

II. Analytics: Modeling Data
 7. Guesstimation and the Back of the Envelope
 8. Models from Scaling Arguments
 9. Arguments from Probability Models
 10. What You Really Need to Know About Classical Statistics
 11. Intermezzo: Mythbusting—Bigfoot, Least Squares, and All That

III. Computation: Mining Data
 12. Simulations
 13. Finding Clusters
 14. Seeing the Forest for the Trees: Finding Important Attributes
 15. Intermezzo: When More Is Different

IV. Applications: Using Data
 16. Reporting, Business Intelligence, and Dashboards
 17. Financial Calculations and Modeling
 18. Predictive Analytics
 19. Epilogue: Facts Are Not Reality
 A. Programming Environments for Scientific Computation and Data Analysis
 B. Results from Calculus
 C. Working with Data
 D. About the Author
 Index
 About the Author
 Colophon
 Copyright
Product information
 Title: Data Analysis with Open Source Tools
 Author(s):
 Release date: November 2010
 Publisher(s): O'Reilly Media, Inc.
 ISBN: 9780596802356