精通機器學習

by Aurélien Géron
April 2020
Intermediate to advanced
816 pages
18h 32m
Chinese
GoTop Information, Inc.
Content preview from 精通機器學習
Chapter 2: End-to-End Machine Learning Project
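First we import the ColumnTransformer class, then we get the list of numerical column names and the list of categorical column names, and then we construct a ColumnTransformer. The constructor requires a list of tuples, where each tuple contains a name, a transformer, and a list of names (or indices) of the columns that the transformer should be applied to. In this example, we specify that the numerical columns should be transformed using the num_pipeline defined earlier, and the categorical columns should be transformed using a OneHotEncoder. Finally, we apply this ColumnTransformer to the housing data: it applies each transformer to the appropriate columns and concatenates the outputs along the second axis (the transformers must return the same number of rows).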
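The steps described above can be sketched roughly as follows. This is a minimal illustration, not the book's own listing: the tiny housing DataFrame, its column names, and the contents of num_pipeline are stand-ins assumed here, because the earlier definition of num_pipeline and the real dataset are not part of this excerpt.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for the housing data (the real dataset is not in this excerpt).
housing = pd.DataFrame({
    "median_income": [8.3, 7.2, np.nan, 5.6],
    "housing_median_age": [41.0, 21.0, 52.0, 52.0],
    "ocean_proximity": ["NEAR BAY", "INLAND", "NEAR BAY", "NEAR OCEAN"],
})

# Assumed numerical pipeline: fill missing values with the median, then standardize.
num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("std_scaler", StandardScaler()),
])

# Lists of numerical and categorical column names.
num_attribs = ["median_income", "housing_median_age"]
cat_attribs = ["ocean_proximity"]

# Each tuple holds a name, a transformer, and the columns it applies to.
full_pipeline = ColumnTransformer([
    ("num", num_pipeline, num_attribs),
    ("cat", OneHotEncoder(), cat_attribs),
])

# Outputs are concatenated along the second axis:
# 2 scaled numerical columns followed by 3 one-hot columns.
housing_prepared = full_pipeline.fit_transform(housing)
print(housing_prepared.shape)
```

Every transformer returns one row per input row, which is what allows the outputs to be placed side by side.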
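Note that the OneHotEncoder returns a sparse matrix, while the num_pipeline returns a dense matrix. When there is such a mix of sparse and dense matrices, the ColumnTransformer estimates the density of the final matrix (that is, the ratio of nonzero cells), and it returns a sparse matrix if the density is lower than a given threshold (by default, sparse_threshold=0.3). In this example, it returns a dense matrix. And that's it: we have a preprocessing pipeline that takes the full housing data and applies the appropriate transformations to each column.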
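To make the density rule concrete, here is a small sketch on made-up data. The DataFrames, the column names value and label, and the helper prepare are all hypothetical. The numbers are chosen so that a categorical column with few categories yields a dense result, while one with many categories falls below the default threshold and yields a sparse result.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def prepare(df, sparse_threshold=0.3):
    """Hypothetical helper: scale 'value' (dense) and one-hot encode 'label' (sparse)."""
    ct = ColumnTransformer(
        [
            ("num", StandardScaler(), ["value"]),
            ("cat", OneHotEncoder(), ["label"]),
        ],
        sparse_threshold=sparse_threshold,
    )
    return ct.fit_transform(df)


# Two categories: 10 dense cells + 10 nonzero one-hot cells out of 30 total.
few = pd.DataFrame({"value": np.arange(10.0), "label": ["a", "b"] * 5})

# Ten categories: 10 dense cells + 10 nonzero one-hot cells out of 110 total.
many = pd.DataFrame({"value": np.arange(10.0),
                     "label": [f"c{i}" for i in range(10)]})

print(sparse.issparse(prepare(few)))    # estimated density about 0.67, above 0.3
print(sparse.issparse(prepare(many)))   # estimated density about 0.18, below 0.3
print(sparse.issparse(prepare(many, sparse_threshold=0)))  # threshold 0 forces dense
```

Setting sparse_threshold=0 is one way to always get a dense array back, which can be convenient when a later step does not accept sparse input.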
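Instead of using a transformer, you can specify the string "drop" if you want the columns to be dropped, or you can specify "passthrough" if you want the columns to be left untouched. By default, the remaining columns (that is, the ones that were not listed) will be dropped, but if you want these columns to be handled differently, you can set remainder ...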
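The following sketch shows "drop", "passthrough", and the default handling of unlisted columns on a made-up DataFrame. The preview text is cut off before it says what remainder can be set to; remainder="passthrough" is used here as one option scikit-learn accepts, not as a reconstruction of the missing sentence. The column names a to d are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Made-up data: four numerical columns.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 20.0, 30.0],
    "c": [100.0, 200.0, 300.0],
    "d": [7.0, 8.0, 9.0],
})

steps = [
    ("scale", StandardScaler(), ["a"]),   # transformed
    ("keep", "passthrough", ["b"]),       # left untouched
    ("discard", "drop", ["c"]),           # removed
]

# Default remainder: column "d" is not listed, so it is dropped.
out_default = ColumnTransformer(steps).fit_transform(df)
print(out_default.shape)   # columns: scaled a, b

# remainder="passthrough": unlisted column "d" is kept and appended at the end.
out_keep = ColumnTransformer(steps, remainder="passthrough").fit_transform(df)
print(out_keep.shape)      # columns: scaled a, b, d
```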
Publisher Resources

ISBN: 9789865024345