DreamWorks Animation Releases ForestFlow Machine Learning Model Server to the Open Source Community
Glendale, Calif. – January 31, 2020 – DreamWorks Animation, A Comcast Company (NASDAQ:CMCSA), today announced the open source release of ForestFlow, a Machine Learning (ML) Model Server which helps bridge Data Science Engineering and Machine Learning Operations (MLOps).
Deploying and operating ML models in production is a brittle process that tends to require cross-functional expertise from engineering and operations teams. ML models have an ever-evolving set of persistence formats and, because standards are lacking, deployments break when moved between clouds or between cloud and on-premises environments. ForestFlow aims to streamline this process by adopting open source standards and introducing a policy-based approach to Continuous Delivery and Lifecycle Management. This allows for the deployment and validation of real-time, online ML models, bringing ML as a Service (MLaaS) to the enterprise.
“Machine Learning is becoming an increasingly important tool and we are pleased to share this framework for effective management of models in this fast-changing field,” said Andrew Pearce, VP of Global Technology at DreamWorks. “This contribution underscores DreamWorks’ commitment to innovation and a healthy software community.”
Ahmad Alkilani, Manager of Big Data & Analytics and developer of ForestFlow, said, “We developed ForestFlow in response to our need to move ML models into production that affected the scheduling and placement of rendering jobs and the throughput of our rendering pipeline which has a material impact to our bottom line. The parameters we were most concerned with were first and foremost maintaining our own teams’ agility and keeping ML models fresh in response to changes in data, features, or simply the production tools that historical data was associated with. Another pillar for developing ForestFlow was the openness of the solution we chose. We were looking to minimize vendor lock-in having a solution that was amenable to on-premise and cloud deployments all the same while offloading deployment complexities from the job description of a Data Scientist. We want our team to focus on extracting the most value they can out of the data we have and not have to worry about operational concerns. We also needed a hands-off approach to quickly iterate and promote or demote models based on observed metrics of staleness and performance. Today, we retrain, deploy, score against and measure model performance on a daily basis without material impact to production workflows. This frees up our Data Scientists to continue to experiment and ultimately provide better business value to DreamWorks.”
ForestFlow is a scalable, policy-based, cloud-native machine learning model server with open API standards. It strives to balance the flexibility and ease of use it offers data scientists with reduced friction between data science, engineering and operations teams. ForestFlow sits at the heart of continuous deployment and lifecycle management of machine learning models. It also provides a means to test models in shadow mode, i.e., without impacting consumer-facing results. For inference, ForestFlow provides a basic REST HTTP API and is also compatible with the GraphPipe API specification for model serving and with GraphPipe client libraries. Deployment follows a similar approach: ForestFlow provides a basic REST HTTP API and is compatible with MLflow model definitions for the flavors that have been implemented. The policy-based approach allows users to deploy multiple machine learning models and perform A/B tests, canary deployments, and time- and performance-based onboarding and retiring of models, in addition to implementing complex routing scenarios such as performance-based traffic routing and shadowing production models with test models. ForestFlow can be deployed in stand-alone local mode or as a cluster of nodes that collectively coordinate and distribute work. Specifically, ForestFlow is built with simplicity in mind for local, single-node deployments, and it offers Kubernetes-native options for production-facing deployments that have resource scale-out requirements.
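As a rough illustration of the routing ideas described above (weighted traffic splits for canary or A/B deployments, plus shadow models that are scored but never served), the following Python sketch shows one way such a policy could work. It is not ForestFlow's API or implementation; the names Route, PolicyRouter, weight, shadow and shadow_log are hypothetical and exist only for this example.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A "model" here is just any callable that maps features to a score.
Model = Callable[[List[float]], float]


@dataclass
class Route:
    """Hypothetical routing entry for one deployed model."""
    name: str
    model: Model
    weight: float = 0.0   # relative share of live traffic
    shadow: bool = False  # scored on every request, never returned


@dataclass
class PolicyRouter:
    """Hypothetical router: weighted split over live models, plus shadows."""
    routes: List[Route]
    rng: random.Random = field(default_factory=random.Random)
    shadow_log: List[Tuple[str, float]] = field(default_factory=list)

    def predict(self, features: List[float]) -> Tuple[str, float]:
        live = [r for r in self.routes if not r.shadow and r.weight > 0]
        if not live:
            raise ValueError("no live route with positive weight")
        # Pick one live model in proportion to its weight (canary / A/B split).
        chosen = self.rng.choices(live, weights=[r.weight for r in live], k=1)[0]
        result = chosen.model(features)
        # Shadow models see the same request; their output is only recorded.
        for r in self.routes:
            if r.shadow:
                self.shadow_log.append((r.name, r.model(features)))
        return chosen.name, result


router = PolicyRouter(
    routes=[
        Route("stable", lambda x: sum(x), weight=0.9),
        Route("canary", lambda x: sum(x) * 1.1, weight=0.1),
        Route("candidate", lambda x: 0.0, shadow=True),
    ],
    rng=random.Random(0),
)
served_by, score = router.predict([1.0, 2.0])
```

In this sketch, promoting or retiring a model amounts to changing its weight or shadow flag, which is one simple reading of a policy-based approach; a real system would also track metrics to drive those changes.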
DreamWorks began development of ForestFlow in 2018. The code is available at https://github.com/dreamworksanimation/ForestFlow and has been released under the Apache License, Version 2.0.
DreamWorks Animation is committed to a vibrant Open Source community and is a Founding Member of the Academy Software Foundation (aswf.io).
About DreamWorks Animation
DreamWorks Animation (DWA), a division of the Universal Filmed Entertainment Group, within NBCUniversal, a subsidiary of Comcast Corporation, is a global family entertainment company with feature film and television brands. The company’s deep portfolio of intellectual property is supported by a robust, worldwide consumer products practice, which includes licensing and location-based entertainment venues around the world. DWA’s feature film heritage includes many of the world’s most beloved characters and franchises, including Shrek, Madagascar, Kung Fu Panda, How to Train Your Dragon, Trolls and The Boss Baby, which have amassed more than $15 billion in global box office receipts. DreamWorks Animation’s television business has quickly become one of the world’s leading producers of high-quality, animated family programming, reaching consumers in more than 190 countries. Creating a diverse array of original content in a variety of formats and delivering deep, fully immersive worlds served up with compelling characters, the prolific studio has garnered 25 Emmy® Awards since inception in 2013.