Building Intelligent Analytics through Time Series Data
Time-series data is ubiquitous in IoT, retail, finance, and many other domains. For example, within the ecosystem of a leading global online retailer, hundreds of petabytes of time-series data are generated each day. Data at this scale presents major challenges for data management and analytics. In this training course, we will discuss how to build a deep analytics engine for large-scale time-series data that drives actionable business insights and helps companies across industries better understand time-series trends, discover anomalies, manage risks, and boost efficiency. Along the way, we will cover the latest trends and solutions to the key technical challenges in time-series data storage, querying, processing, feature engineering, machine learning and deep learning algorithm design, anomaly detection, and forecasting.
What you'll learn and how you can apply it
By the end of this live, hands-on, online course, you’ll understand:
- Key challenges in time-series data management and analytics
- Fundamentals of time-series databases and their key design trade-offs
- The basics of popular time-series techniques such as deep learning models (recurrent neural networks, or RNNs), regression analysis, robust Seasonal and Trend decomposition using Loess (STL), anomaly detection, and forecasting
- How to choose an analysis method that best fits the application scenario
And you’ll be able to:
- Pick the best time-series database product for your application
- Choose the right modeling and learning algorithm for your analysis
- Run the end-to-end analytics pipeline in a popular data science development environment
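To give a flavor of the anomaly detection topic listed above, here is a minimal sketch in plain Python. It is not the STL method named in the list (STL relies on Loess smoothing) and it is not taken from the course material. It is a simpler stand-in that assumes a known, fixed seasonal period, no trend, and a series covering at least one full period: it builds a per-phase median baseline and flags points whose residual has a large robust z-score based on the median absolute deviation (MAD). The function names, the synthetic data, and the 3.5 threshold are illustrative assumptions.

```python
from statistics import median


def seasonal_baseline(series, period):
    """Median value for each position (phase) within the seasonal cycle."""
    return [median(series[phase::period]) for phase in range(period)]


def detect_anomalies(series, period, threshold=3.5):
    """Return indices whose residual from the seasonal baseline is an outlier.

    Assumes an additive seasonal pattern with a fixed period and no trend.
    The threshold of 3.5 is a common rule of thumb, not a tuned value.
    """
    baseline = seasonal_baseline(series, period)
    residuals = [x - baseline[i % period] for i, x in enumerate(series)]
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    if mad == 0:
        # Degenerate case: no spread at all, so any deviation stands out.
        return [i for i, r in enumerate(residuals) if r != center]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i
        for i, r in enumerate(residuals)
        if abs(0.6745 * (r - center) / mad) > threshold
    ]


# Synthetic example: a weekly pattern with small deterministic jitter
# and one injected spike at index 20 (all values are made up).
pattern = [10, 12, 15, 11, 9, 8, 14]
series = [pattern[i % 7] + ((i * 3) % 5 - 2) * 0.1 for i in range(42)]
series[20] += 10

anomalies = detect_anomalies(series, period=7)
print(anomalies)
```

Medians are used instead of means so that the spike itself does not distort the baseline it is compared against. Real data with trend or drifting seasonality would need a proper decomposition step first.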
This training course is for you because...
- You’re a data scientist, machine learning engineer, or CTO
- You work with time-series data
- You want to become an expert in time-series data management and analytics
Prerequisites
- Basic knowledge of linear algebra and statistical modeling
- Basic understanding of machine learning is preferred but not required
- Familiarity with one of the popular programming languages for data science (e.g., Python/R/Matlab)
- Basic knowledge about databases is preferred but not strictly required
About your instructors
Dr. Sanjian Chen is a data science expert with deep knowledge of scalable machine learning algorithms. He has developed cutting-edge data-driven modeling techniques and autonomous systems in both academic and industry settings. He designed data-analytics solutions that drove numerous high-impact business decisions for multiple Fortune 500 companies across several industries, including retail, banking, automotive, and telecommunications. He is currently building cloud-based AI engines for high-performance distributed database systems that support scalable data analytics in multiple business areas. Dr. Chen is a frequently invited speaker at top international conferences, including the Strata Data Conference (San Francisco, London), the IEEE Cyber-Physical Systems Week (Chicago), the IFAC Conference on Analysis and Design of Hybrid Systems (Atlanta), and the IEEE International Conference on Healthcare Informatics (Philadelphia, Dallas). Dr. Chen received his Ph.D. in Computer and Information Science from the University of Pennsylvania, under the supervision of Professor Insup Lee (ACM Fellow, IEEE Fellow). He received two IEEE Best Paper Awards (IEEE RTSS 2012 and IEEE ISORC 2018). He has published over 25 papers in top journals and conferences, including 2 articles in the Proceedings of the IEEE (IF=9.1). He has served as an invited reviewer for numerous top international journals and conferences, e.g., IEEE Design & Test, IEEE Transactions on Computers, ACM Transactions on Cyber-Physical Systems, IEEE Transactions on Industrial Electronics, the IEEE RTSS conference, and the ACM HSCC conference.
Dr. Jian Chang is a data science expert and software system architect with expertise in machine learning and big-data systems. He has extensive experience leading innovation projects and R&D activities to promote data science best practices within large organizations. With deep domain knowledge of various vertical use cases (e.g., finance, telco, healthcare), he is currently working on cutting-edge applications of AI at the intersection of high-performance databases and IoT, focusing on unleashing the value of spatial-temporal data. He is also a frequent speaker at technology conferences, including the O’Reilly Strata AI Conference, NVIDIA GPU Technology Conference, Hadoop Summit, DataWorks Summit, Amazon AWS re:Invent, Global Big Data Conference, Global AI Conference, World IoT Expo, and Intel Partner Summit, where he has presented keynote talks and shared thoughts on technology leadership. Jian received his Ph.D. from the Department of Computer and Information Science (CIS), University of Pennsylvania, under the supervision of Professor Insup Lee (ACM Fellow, IEEE Fellow). He has published and presented research papers and posters at many top-tier conferences and journals, including ACM Computing Surveys, ACSAC, CEAS, EuroSec, FGCS, HiCoNS, HSCC, IEEE Systems Journal, MASHUPS, PST, SSS, TRUST, and WiVeC. He has also served as a reviewer for many highly reputable international journals and conferences.
The timeframes are only estimates and may vary according to how the class is progressing.
Introduction (15 minutes)
- Outline: Background, Motivations, Application Scenarios, Challenges
Data Foundation of Time-Series Analytics (45 minutes)
- Outline: Time-Series Database technologies (storage & query), Computation technologies (Pandas, Spark-Flint)
- Presentation & Instructor-led interactive demo
- Break (5 min)
Time-Series Analysis Methods and Interactive Tutorial (60 minutes)
- Outline: Popular algorithms and implementation packages (forecasting, anomaly detection, feature engineering), cutting-edge research advancements in time-series modeling
- Presentation & Instructor-led interactive demo
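As a rough illustration of two topics from the outline above, feature engineering and forecasting, the following plain-Python sketch builds lag features and produces a seasonal-naive forecast (repeat the last observed season). It is a baseline sketch under simple assumptions, not the course's own demo code. The function names and the toy numbers are hypothetical.

```python
def make_lag_features(series, lags):
    """Build (rows, targets) where each row holds series[t - lag] for each lag.

    Rows start at t = max(lags) so every lagged value exists.
    """
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return rows, targets


def seasonal_naive_forecast(history, period, horizon):
    """Forecast `horizon` steps by repeating the last full season of `history`."""
    if len(history) < period:
        raise ValueError("history must cover at least one full period")
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy data: two repetitions of a four-step cycle (made-up values).
history = [1, 2, 3, 4, 1, 2, 3, 4]
forecast = seasonal_naive_forecast(history, period=4, horizon=6)

# Lag-1 and lag-2 features for a short series.
rows, targets = make_lag_features([10, 20, 30, 40, 50], lags=[1, 2])

print(forecast)
print(rows, targets)
```

A seasonal-naive forecast is a useful yardstick: a learned model, such as a regression on the lag features or an RNN, is generally worth its complexity only if it beats this baseline on a held-out window, for example by mean absolute error.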