Latency improvement by various model compression methods (source: adapted from an image in the paper by Le et al.)
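The biggest performance gain came from quantization. Converting 32-bit floating-point numbers to 8-bit integers reduces latency by a factor of 7 and increases throughput by a factor of 8.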
These techniques look very promising for improving latency. Keep in mind, however, that none of the scenarios reports how output quality changed after the performance improvement.
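To make the 32-bit to 8-bit conversion concrete, here is a minimal sketch in plain Python. It assumes symmetric per-tensor quantization, which is only one of several possible schemes and is not necessarily the one used in the experiment above. The helper names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for illustration. The sketch shows the storage saving and the rounding error that quantization introduces, which is the kind of quality change the caveat above refers to.

```python
from array import array


def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range [-127, 127].
    codes = array("b", (max(-127, min(127, round(v / scale))) for v in values))
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, stored as 32-bit floats.
weights = array("f", [0.5, -1.27, 0.003, 1.0, -0.25])
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

bytes_fp32 = weights.itemsize * len(weights)  # 4 bytes per value
bytes_int8 = codes.itemsize * len(codes)      # 1 byte per value
# Largest absolute difference between an original and a restored weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"storage: {bytes_fp32} bytes -> {bytes_int8} bytes")
print(f"scale: {scale:.6f}, max rounding error: {max_error:.6f}")
```

Storage shrinks by a factor of 4, while each restored value can differ from the original by up to half a quantization step (`scale / 2`). Whether that error matters for a given model has to be measured on the model's actual outputs.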
7.4 Machine Learning on the Cloud and at the Edge
Whether to run the model's computation on the cloud or at the edge is also ...