Review – Agenda 2021
Deep Learning World 2021
May 24-28, 2021 – Livestreamed
To view the full 7-track agenda for the six co-located conferences at Machine Learning Week, click here, or view the individual conference agendas here: PAW Business, PAW Financial, PAW Healthcare, PAW Industry 4.0, PAW Climate, or Deep Learning World.
Workshops - Wednesday, May 19th, 2021
Full-day: 7:15am – 2:30pm PDT
This one-day session surveys standard and advanced methods for predictive modeling (aka machine learning).
Full-day: 7:30am – 3:30pm PDT
Gain experience driving R for predictive modeling across real examples and data sets. Survey the pertinent modeling packages.
Full-day: 8:00am – 3:00pm PDT
This one-day workshop reviews major big data success stories that have transformed businesses and created new markets.
Workshops - Thursday, May 20th, 2021
Full-day: 8:00am – 3:00pm PDT
This workshop dives into the key ensemble approaches, including Bagging, Random Forests, and Stochastic Gradient Boosting.
Full-day: 8:00am – 3:00pm PDT
Python leads as a top machine learning solution – thanks largely to its extensive battery of powerful open source machine learning libraries. It’s also one of the most important, powerful programming languages in general.
Full-day: 8:00am – 3:00pm PDT
Machine learning improves operations only when its predictive models are deployed, integrated and acted upon – that is, only when you operationalize it.
Full-day: 8:30am – 3:30pm PDT
Gain the power to extract signals from big data on your own, without relying on data engineers and Hadoop specialists.
Workshops - Friday, May 21st, 2021
Full-day: 7:15am – 2:30pm PDT
This one-day session reveals the subtle mistakes analytics practitioners often make when facing a new challenge (the “deadly dozen”), and clearly explains the advanced methods seasoned experts use to avoid those pitfalls and build accurate and reliable models.
Deep Learning World - Virtual - Day 1 - Monday, May 24th, 2021
Amazon's vision is to be earth's most customer-centric company. This talk explores how the Alexa Hybrid Science team in Pittsburgh, PA applies a customer-centric lens to cutting-edge machine learning research. The team is responsible for developing on-device Alexa automatic speech recognition models to provide a faster, more reliable Alexa experience. Our research includes neural network compression techniques, end-to-end spoken language understanding and optimizing machine learning for edge devices.
The quality of any machine learning or deep learning model depends on the values that define the model structure and its corresponding hyperparameters. Many practitioners invest countless hours manually searching for the right model and hyperparameter values. Some use highly inefficient grid search methods. Others use simple random sampling, which actually works fairly well; but on its own, random sampling offers only a global search, and other sampling methods may be better suited to the job.
Why not use machine learning to automate the search for the best model?
This presentation details an advanced approach that uses both global and local search strategies that can be evaluated in parallel to ensure a quick and efficient exploration of the decision space. In the case of this presentation, a genetic algorithm (GA) will be examined for the global search because the selection and crossover aspects of a GA distinguish it from a purely random search. A generating set search (GSS) will be used to greedily search the local decision space.
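To make the two-stage idea concrete, here is a minimal sketch: a genetic algorithm for the global search plus a greedy generating-set-style poll for local refinement. The search space, toy objective, and all names are illustrative assumptions, not the presenter's actual setup.

```python
import random

# Hypothetical two-parameter search space (illustrative only).
LR_RANGE = (1e-4, 1e-1)
UNITS_RANGE = (16, 256)

def sample():
    return {"lr": random.uniform(*LR_RANGE),
            "units": random.randint(*UNITS_RANGE)}

def score(cfg):
    # Stand-in for a real validation metric (higher is better).
    return -(cfg["lr"] - 0.01) ** 2 - ((cfg["units"] - 128) / 256) ** 2

def crossover(a, b):
    # Uniform crossover: each hyperparameter comes from one parent.
    return {k: random.choice([a[k], b[k]]) for k in a}

def mutate(cfg, rate=0.2):
    out = dict(cfg)
    if random.random() < rate:
        out["lr"] = random.uniform(*LR_RANGE)
    if random.random() < rate:
        out["units"] = random.randint(*UNITS_RANGE)
    return out

# Global stage: genetic algorithm over a population of configurations.
population = [sample() for _ in range(20)]
for _ in range(10):
    population.sort(key=score, reverse=True)
    parents = population[:10]                      # selection
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(10)]
    population = parents + children

# Local stage: greedy generating-set-style poll around the best point,
# stepping along the continuous coordinate and shrinking on failure
# (a full GSS would poll every direction in the generating set).
best = max(population, key=score)
step = 0.01
while step > 1e-5:
    up = dict(best, lr=min(max(best["lr"] + step, LR_RANGE[0]), LR_RANGE[1]))
    down = dict(best, lr=min(max(best["lr"] - step, LR_RANGE[0]), LR_RANGE[1]))
    candidate = max((up, down), key=score)
    if score(candidate) > score(best):
        best = candidate
    else:
        step /= 2
print(best, score(best))
```

In practice each score() call would train and validate a model, and each generation's candidates can be evaluated in parallel, which is what makes the combined strategy quick and efficient.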
Each real-world application of machine learning owns a set of defining characteristics that distinguish it from other problems. In Nauto's case, we discovered that our driving data is remarkably uniform. By leveraging this trait, we were able to achieve massive optimizations in model accuracy and inference speed, directly improving Nauto's core safety offerings like Forward Collision Warning.
This talk will explore the importance of identifying domain-specific traits when deploying AI solutions to the world. It will show how leveraging the quirks of your problem can lead to optimizations that occasionally even contradict established conventions and common intuition.
Scanning Electron Microscope (SEM) images are the primary means used by expert engineers to identify and diagnose defects in semiconductor variable shape beam (VSB) mask writers. Deep learning (DL) offers an attractive alternative to this tedious process. However, extremely robust mask writers preclude collecting a large variety of SEM images to train DL models. Using digital twins that can mimic SEM images provides an exceptional way to synthesize ample DL training data. This talk will take a deep dive into synthesizing SEM images and leveraging them to build DL models for VSB mask writer defect analysis.
The increasing prevalence of high-fidelity industrial digital twins is providing a range of opportunities to apply Reinforcement Learning (RL) techniques outside of traditional academic examples and video games. While this trend is now well-established, most RL developments and deployments in the real world are done on an ad-hoc basis, with little consideration given to how to repeat and scale similar initiatives efficiently. In this session we will address these shortcomings, illustrated through our experience using RL to optimise the design of a state-of-the-art sailing boat for a prominent competition.

Building an agent to control the boat is a very complex RL task for several reasons: imperfect information, loosely defined goals with delayed rewards, and highly dynamic state and action spaces. In racing conditions, it takes a team of Olympic-level athletes to sail the boat and make it “fly” on its underwater wings (hydro-foiling). To control convergence variability and sampling efficiency, the working solution required a custom deep learning implementation of the Soft Actor-Critic RL algorithm, with state-of-the-art improvements such as experience replay buffer pooling, domain randomisation and curriculum learning.

Beyond describing solutions to traditional RL considerations, we will also focus on the underlying workflows and technology stack required to carry out a deep learning project of this technical complexity in a scalable way. We will use facets of Software 2.0, such as higher-level APIs and the automation of end-to-end model development tasks, to highlight our iterative choices and the optimisation opportunities along the machine learning pipeline and, ultimately, in the production system.
Join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
Deep Learning World - Virtual - Day 2 - Tuesday, May 25th, 2021
Grab your real coffee and share experiences virtually with your peers to explore the new challenges of operating in a largely virtual world. Just like a pre-show breakfast at an in-person conference, you’ll join a “round table” with seven fellow attendees and see where the conversation takes you.
In the field of speech recognition, the state of the art for generic conversations has reached superhuman levels. However, things are not nearly as good in specialized knowledge domains: conversations in which people with accents speak in noisy environments often result in high error rates. Given the low performance of speech recognizers on real data, it becomes imperative to customize the end-to-end probabilistic model. This session will focus on the fundamentals of speech recognition and how Cisco has moved from multiple phoneme-based models to a single end-to-end grapheme-based recurrent neural network architecture that can transcribe audio directly while also reducing latency by tweaking the model at inference time.
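For orientation, here is a minimal, hypothetical sketch of a grapheme-based end-to-end model of the kind described: an RNN maps acoustic frames to per-frame grapheme probabilities and is trained with CTC loss, so no frame-level alignment is needed. All sizes and names are illustrative assumptions; this is not Cisco's actual architecture.

```python
import torch
import torch.nn as nn

GRAPHEMES = 29  # 26 letters + space + apostrophe + CTC blank (illustrative)

class GraphemeASR(nn.Module):
    def __init__(self, n_mels=80, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_mels, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, GRAPHEMES)

    def forward(self, feats):                   # feats: (batch, time, n_mels)
        h, _ = self.rnn(feats)
        return self.out(h).log_softmax(-1)      # (batch, time, graphemes)

model = GraphemeASR()
ctc = nn.CTCLoss(blank=0)

feats = torch.randn(4, 200, 80)                 # 4 utterances, 200 frames each
targets = torch.randint(1, GRAPHEMES, (4, 30))  # grapheme label sequences
log_probs = model(feats).transpose(0, 1)        # CTC expects (time, batch, classes)
loss = ctc(log_probs, targets,
           torch.full((4,), 200, dtype=torch.long),   # input lengths
           torch.full((4,), 30, dtype=torch.long))    # target lengths
loss.backward()
```

The appeal of this design over a phoneme pipeline is that one network maps audio directly to characters, removing the pronunciation lexicon and simplifying deployment.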
Terms of service (ToS) violators at Stripe are merchants in our ecosystem who sell items and services prohibited by our terms of service. This talk will present how we built a multimodal deep learning system for ToS violation detection that combines text, images, and tabular data. We will also discuss how we enable interpretability of model predictions.
Image classification has been solved successfully for many tasks, e.g. using deep learning techniques. However, in many application scenarios, the set of classes from which input images are drawn is not completely known at modeling time. It is then important to a) reduce false positives during inference and b) enable the description of "unknown" image content during annotation. This case study shows how we addressed these issues at Miele for the recognition of food items, together with a concept for dealing with unknown classes during data annotation.
As online fraud increases in volume, malicious actors rely on automation to keep scaling. To spread false information, sell nonexistent products, or spend money from stolen credit cards, fraudsters use scripts and other automation tools to manage a large number of fake accounts. Such automation introduces a common thread in the profiles or communications of the accounts they control. We developed a deep learning model to detect suspicious patterns amongst different accounts. The model was successful in detecting coordinated attacks, even when deployed on a novel platform, increasing the detection of malicious actors by up to 80%.
Deep learning solutions are nowadays a standard tool in many technological fields. In microbiology specifically, this has become possible through Full Laboratory Automation.
The combination of these two game-changers, made available by COPAN, allows microbiologists to streamline their daily routine of preparing, incubating and evaluating thousands of samples (mainly negative ones), focusing their highly skilled attention on the most challenging and critical cases.
In this talk, Giovanni Turra, key member of the Imaging and Data Analysis team at COPAN, explains how deep learning supports and empowers the daily laboratory battle against diseases.
Join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
Deep Learning World - Virtual - Day 3 - Wednesday, May 26th, 2021
Grab your real coffee and share experiences virtually with your peers to explore the new challenges of operating in a largely virtual world. Just like a pre-show breakfast at an in-person conference, you’ll join a “round table” with seven fellow attendees and see where the conversation takes you.
Models generalize best when their complexity matches the problem. To avoid overfit, practitioners usually trade off accuracy with complexity, measured by the count of parameters. But this is surprisingly flawed. For example, a parameter is equivalent to one "degree of freedom" only for regression; it can be > 4 for decision trees, and < 1 for neural networks. Worse, a major source of complexity -- over-search -- remains hidden. The vast exploration of potential model structures leaves no trace on the final (perhaps simple-looking) model, but has outsized influence over whether it is trustworthy.
I’ll show how Generalized Degrees of Freedom (GDF, by Ye) can be used to measure the full complexity of algorithmic modeling. This allows one to fairly compare very different models and be more confident about out-of-sample accuracy. GDF also makes clear how seemingly complex ensemble models avoid overfit, and lastly, reveals a new type of outlier -- cases having high model influence.
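To make the estimation procedure concrete, below is a minimal sketch of the Monte Carlo approach to Ye's GDF: perturb the targets with small noise, refit the model, and measure how sensitively each fitted value responds to the noise added to its own target. The data, the tree learner, and all parameter values are illustrative assumptions, not the speaker's actual experiment.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=200)

def gdf(fit_predict, X, y, tau=0.1, trials=50):
    """Monte Carlo estimate of Generalized Degrees of Freedom.

    GDF ~ sum_i d E[yhat_i] / d y_i, estimated by regressing each
    fitted value on the noise added to its own target across trials.
    """
    deltas = rng.normal(scale=tau, size=(trials, len(y)))
    fits = np.array([fit_predict(X, y + d) for d in deltas])
    centered_d = deltas - deltas.mean(axis=0)
    centered_f = fits - fits.mean(axis=0)
    slopes = (centered_d * centered_f).sum(axis=0) / (centered_d ** 2).sum(axis=0)
    return slopes.sum()

def tree_fit_predict(X, y):
    # Refit the whole modeling procedure on the perturbed targets.
    return DecisionTreeRegressor(max_leaf_nodes=8).fit(X, y).predict(X)

print("estimated GDF:", gdf(tree_fit_predict, X, y))
```

A tree constrained to 8 leaves has only 7 split parameters, yet its estimated GDF is typically several times larger: exactly the hidden cost of over-search that parameter counting misses.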
Deep learning models for forecasting and planning have shown significant promise for handling multiple variables, uncovering hidden patterns, and producing accurate forecasts. However, as one might expect, deep learning models are also complex and rife with pitfalls. Since these techniques often seem like a ‘black box,’ managers from both technical and nontechnical backgrounds can find them hard to master.
In this session, Senior Data Scientist Javed Ahmed will focus on the intuition behind various deep learning approaches and explore how managers can tackle highly complex models by asking the right questions and evaluating the models with familiar tools.
Attendees at the Metis session will leave with the tools to:
● Identify types of forecasting applications that can benefit from deep learning
● Broadly understand deep learning approaches relevant to forecasting
● Understand pitfalls related to deep learning approaches, and why simpler models may work better
● Evaluate the results of a forecasting program
Graph embedding on very large graphs is a challenging task with many real-world applications such as fraud detection and cross-sell recommendation. The huge computational workload can only be handled efficiently by scalable distributed systems such as Spark or GPU-based deep learning frameworks such as PyTorch. Fugue is an open source Python framework which can manage various backend computing engines with consistent interfaces. We will demonstrate how we scaled up the transductive embedding algorithm “node2vec” using Spark to handle giant graphs in the Fugue framework. We will also show how we applied Fugue and Spark to the inductive embedding algorithm “GraphSAGE” for faster edge sampling.
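For readers unfamiliar with node2vec, the computational core that gets distributed is a biased second-order random walk. The following is a minimal, self-contained sketch of that walk on a toy graph; the graph, parameter values, and the omitted Fugue/Spark wiring and skip-gram training are illustrative assumptions.

```python
import random

# Toy graph as an adjacency dict; in practice this would be a giant
# edge list partitioned across a Spark cluster.
graph = {
    "a": ["b", "c"], "b": ["a", "c", "d"],
    "c": ["a", "b"], "d": ["b"],
}

def node2vec_walk(start, length, p=1.0, q=0.5):
    """One biased random walk, the per-node task node2vec distributes.

    p penalizes returning to the previous node; q controls how far
    the walk strays (the BFS-vs-DFS tradeoff; q < 1 favors exploring).
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = graph[cur]
        if len(walk) == 1:
            walk.append(random.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)     # return edge
            elif nxt in graph[prev]:
                weights.append(1.0)         # stays at distance 1 from prev
            else:
                weights.append(1.0 / q)     # moves to distance 2
        walk.append(random.choices(nbrs, weights=weights)[0])
    return walk

walks = [node2vec_walk(n, length=10) for n in graph for _ in range(5)]
# The collected walks are then fed to a skip-gram model (e.g. Word2Vec)
# to learn one embedding vector per node.
```

Because each walk is independent, the workload parallelizes naturally, which is what makes a consistent-interface framework like Fugue a good fit for running the same logic on Spark.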
Every year in Australia, ~200 people lose their lives and 100,000+ are seriously injured in potentially avoidable workplace incidents. The challenges presented by the extreme conditions and complex work environments typical of heavy industries need to be overcome. This talk presents an advanced computer vision edge AI/ML solution for these harsh environments. Details are presented on the journey from the R&D group of one of the world’s larger construction companies to a venture-capital-funded startup, along with the requisite non-trivial performance, market adoptability and technology scalability accomplishments, and the underpinning approaches for computer vision and edge AI/ML.
Take a break or join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
Online fraud is flourishing as online services extend to more industries, including financial service providers, insurance companies, online retailers, and social networks. Network traffic data is invaluable for identifying such malicious actors. In this talk, we will show how to train a self-supervised deep learning model using only unlabeled network logs. Deriving the representation from the "structure" of the bits in an IP address, the model can process novel entities not encountered in the training data. Results show that our proposed framework can identify anomalous network accesses up to 16 times better than Amazon SageMaker's state-of-the-art model.
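As an illustration of the bit-structure idea, here is one plausible minimal featurization of an IPv4 address. The presenters' exact encoding is not specified in the abstract, so treat this as an assumption-laden sketch.

```python
import numpy as np

def ip_to_bits(ip: str) -> np.ndarray:
    """Encode an IPv4 address as a 32-dim binary vector.

    Because the representation is derived from the address's bit
    structure rather than a lookup table of known IPs, a model built
    on it can process addresses never seen during training.
    """
    octets = [int(o) for o in ip.split(".")]
    bits = [(octet >> i) & 1 for octet in octets for i in range(7, -1, -1)]
    return np.array(bits, dtype=np.float32)

print(ip_to_bits("192.168.0.1"))
```

Nearby addresses (e.g. the same /24 subnet) share long bit prefixes, so the encoding lets the model generalize across related network entities without labels.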
Since 2018, Transformer models have revolutionized NLP by bridging the gap between universal and task-specific representations. Transformers accelerated the shift from a world in which each NLP application required training deep neural networks from scratch to the emergence of large pretrained models with great generalization capabilities, which are fine-tuned for downstream tasks with a limited amount of labelled data. Yet using Transformers for industrial applications poses various challenges. In this accessible presentation, Hadrien Van Lierde from Tencent-backed WeBank will describe how China’s leading digital bank uses Transformers to improve its customer service chatbot while maintaining low latency and high throughput.
Deep Learning World - Virtual - Day 4 - Thursday, May 27th, 2021
Grab your real coffee and share experiences virtually with your peers to explore the new challenges of operating in a largely virtual world. Just like a pre-show breakfast at an in-person conference, you’ll join a “round table” with seven fellow attendees and see where the conversation takes you.
Deep learning and computer vision are changing how product quality is improved in the manufacturing industry. We have used cutting-edge neural network architectures to identify the source of problems in products, improving product quality through better defect detection and categorization and enhancing the customer experience. The models are deployed in production and are generating strong results. This session will be valuable for attendees who want to understand the significance of deep learning in manufacturing, the process involved, and the challenges along the way.
Join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
For machine learning problems, we don't lack great tools; we lack a unified approach to using them for both prototyping and production. Fugue is a framework aiming to bridge this gap. In this talk, we go through a real deep learning example with preprocessing, training, and hyperparameter tuning. We will discuss the pain points and demonstrate how you can use Fugue to quickly iterate on small data and then scale up on a Spark cluster without code changes. We will also touch on some Lyft use cases and how Fugue changed the game.
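As a flavor of the "no code change" claim, here is a minimal sketch using Fugue's transform() entry point: the same type-annotated function runs locally on pandas or, by swapping the engine argument, on Spark. The column names and toy logic are illustrative, and the Spark call assumes a live SparkSession.

```python
import pandas as pd
from fugue import transform

# Toy preprocessing step; the same function body serves both modes.
def add_score(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["score"] = df["value"] * 2.0
    return df

df = pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})

# Prototype locally on pandas...
local = transform(df, add_score, schema="*,score:double")
print(local)

# ...then run the identical logic on a cluster by changing the engine:
# spark_df = transform(df, add_score, schema="*,score:double",
#                      engine=spark_session)
```

The design choice is that scaling concerns live in the engine argument rather than in the business logic, which is what makes the prototype-to-production path cheap.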
Take a break or join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
Many organizations deal with incoming customer problems either internally or externally. Frequently, these problems are handled one at a time by humans. We’ll discuss how at Google we are doing early detection of broad IT incidents before they become too large. We'll go through unsupervised learning techniques that, when combined with deep learning-based language modeling, create a powerful, robust system that has saved Googlers hundreds of thousands of hours of productivity.
Thomson Reuters produces timely financial news and alerts that are used by many of our corporate and institutional customers. These are extracted from multiple sources, including corporate disclosures, and presented as an ongoing series of alerts and bulletins. One of the major challenges in this kind of financial work is inaccurate or weak training data, which required us to employ various innovative data cleaning techniques. We also discuss our NLP-based production system, which makes strong use of a pretrained BERT model in combination with an unsupervised strategy, as well as some of the engineering challenges of making the system operate adequately in real time.
Deep Learning World - Virtual - Day 5 - Friday, May 28th, 2021
Deep Learning image classifiers represent a breakthrough in image recognition and classification tasks. However, they come with their own quirks and kinks: a "clean" product image can be easily classified, but what about a stock image of "a woman running in the rain"? Is it a raincoat? Her shoes? The phone cover? At Shopify we were tasked with mapping a large, complex, and "dirty" catalogue of products into duplicates, similar products, and categories. In this case study I will walk you through our journey and how we were able to harness the strengths of CNNs while avoiding the major pitfalls.
Join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
In today's world of elastic cloud/on-prem resources to support deployments, it is essential to be right-sized. We present a hybrid deep learning and statistical approach to model future demand. An accurate measure of incoming demand enables us to be right-sized while keeping guarantees on reliability, resiliency, and availability. We will share our experience dealing with demand volatility and how we temper it to enable actionability, along with the tradeoffs involved.
Take a break or join your fellow practitioners in this interactive session where you can exchange approaches to shared challenges and hear how your peers are tackling similar issues.
Combining automated speech recognition with natural language processing allows one to build semi-automatic note-taking apps. With minimal input from the call participants, such an app transcribes speech on the fly and extracts and highlights intents and entities, creating semi-structured meeting minutes ready to be exported to structured databases. For practical applications, it is important for the system to recognize a custom lexicon without expensive acoustic model retraining, solely by automatically building custom language models.
The number of Earth observation satellites has increased significantly, and these satellites are producing huge volumes of images at different spatial resolutions and spectral bands. Manually understanding all the pixels is nearly impossible, as it would take years simply to look at them. This talk will cover the challenges and approaches involved in understanding satellite imagery, which are different from those involved in working with other real-world images. At Orbital Insight, we transform multiple sources of geospatial data - including satellite images - into actionable insights to understand economic, societal, and environmental trends at global, regional, and hyper-local scales.