
KNIME Essentials

Name: KNIME Essentials
Author: Gábor Bakos
ISBN: 9781849699211

by Gábor Bakos

October 2013

Beginner

148 pages

3h 38m

English

Packt Publishing


KNIME Essentials
Table of Contents
KNIME Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for

Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Installing and Using KNIME
Few words about KNIME
Installing KNIME
Installation using the archive
KNIME for Windows
KNIME for Linux
KNIME for Mac OS X
Troubleshooting
KNIME terminologies
Organizing your work
Nodes
Node lifecycle
Meta nodes
Ports
Data tables
Port view
Flow variables
Node views
HiLite
Eclipse concepts
Preferences
Logging
User interface
Getting started
Setting preferences
KNIME
Other preferences
Installing extensions
Workbench
Workflow handling
Node controls
HiLite
Variable flows
Meta nodes
Workflow lifecycle
Other views
Summary
2. Data Preprocessing
Importing data
Importing data from a database
Starting Java DB
Importing data from tabular files
Importing data from web services
REST services
Importing XML files
Importing models
Other formats
Public data sources
Regular expressions
Basic syntax
Partial versus whole match
Usage from Java
References and tools
Alternative pattern description
Transforming the shape
Filtering rows
Sampling
Appending tables
Less columns
Dimension reduction
More columns
GroupBy
Pivoting and Unpivoting
One2Many and Many2One
Cosmetic transformations
Renames
Changing the column order
Reordering the rows
The row ID
Transpose
Transforming values
Generic transformations
Java snippets
The Math Formula node
Conversion between types
Binning
Normalization
Text normalization
Regular expressions
Multiple columns
XML transformation
Time transformation
Smoothing
Data generation
Generating the grid
Constraints
Loops
Workflow customization
Case study – finding min-max in the next n rows
Case study – ranks within groups
Summary
3. Data Exploration
Computing statistics
Overview of visualizations
Visual guide for the views
Distance matrix
Using visual properties
Color
Size
Shape
KNIME views
HiLite
Use cases for HiLite
Row IDs
Extreme values
Basic KNIME views
The Box plots
Hierarchical clustering
Histograms
Interactive Table
The Lift chart
Lines
Pie charts
The Scatter plots
Spark Line Appender
Radar Plot Appender
The Scorer views
JFreeChart
The Bar charts
The Bubble chart
Heatmap
The Histogram chart
The Interval chart
The Line chart
The Pie chart
The Scatter plot
Open Street Map
3D Scatterplot
Other visualization nodes
The R plot, Python plot, and Matlab plot
The official R plots
The RapidMiner view
The HiTS visualization
Tips for HiLiting
Using Interactive HiLite Collector
Finding connections
Visualizing models
Further ideas
Summary
4. Reporting
Installation of the reporting extensions
Reporting concepts
Importing data
Sending data and images to a report
Importing from other sources
Joining data sets
Preferences
Using the designer
In visible views
Report properties
Report items
Label
Text
Binding
Dynamic text
Data
Image
Grid
List
Groups
Sorting
Filters
Table
Chart
Cross Tab
Setting up
Changing
Using data cubes
Quick Tools
Aggregation
Relative time period
Generating reports
Using colors
Using HiLite
Using workflow variables
Suggested readings
Summary
Index

Content preview from KNIME Essentials

Case study – ranks within groups

In this case study, we compute ranks (based on a certain order) within groups. This is a much easier task, but it can be very useful if you want to select outliers without the prior knowledge needed to define cut-off points. It can also be useful for summarizing historical data (for example, finding the three or five top hits that led the sales list the longest in different genres). The task becomes simpler still when we do not need the rank, only the extreme values. However, certain algorithms can use the rank values for better predictions, because we humans are biased toward the best options. For example, in a 100-minute race, the difference between the first and the fifth drivers is, hypothetically, one minute; that is it ...
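To make the idea of ranks within groups concrete, here is a minimal sketch in plain Python. It is not the KNIME workflow from the book, which the preview does not show; it only illustrates the underlying computation. The function names, the field names (`genre`, `title`, `weeks_at_top`), and the sample rows are all hypothetical.

```python
from collections import defaultdict


def rank_within_groups(rows, group_key, value_key, descending=True):
    """Return copies of the rows with a 1-based 'rank' field computed per group.

    Ties are not handled specially: tied rows receive consecutive ranks
    in their original input order (Python's sort is stable).
    """
    # Split the rows by the grouping column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    ranked = []
    for members in groups.values():
        # Order each group on its own, then number the rows from 1.
        ordered = sorted(members, key=lambda r: r[value_key], reverse=descending)
        for position, row in enumerate(ordered, start=1):
            ranked.append({**row, "rank": position})
    return ranked


def top_n_per_group(rows, group_key, value_key, n):
    """Keep only the rows whose rank within their group is at most n."""
    ranked = rank_within_groups(rows, group_key, value_key)
    return [row for row in ranked if row["rank"] <= n]


# Hypothetical sample data: weeks each title led a sales list, by genre.
sales = [
    {"genre": "rock", "title": "A", "weeks_at_top": 12},
    {"genre": "rock", "title": "B", "weeks_at_top": 30},
    {"genre": "rock", "title": "C", "weeks_at_top": 7},
    {"genre": "jazz", "title": "D", "weeks_at_top": 4},
    {"genre": "jazz", "title": "E", "weeks_at_top": 9},
]

for row in top_n_per_group(sales, "genre", "weeks_at_top", n=2):
    print(row["genre"], row["rank"], row["title"])
```

Because the rank is computed relative to the other rows of the same group, selecting `rank <= n` picks the extreme rows of every group without choosing a numeric cut-off in advance.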
