- Region: North/South America
- Industry: Manufacturing
- Solution Type: Data Analytics, Data Management, Digital Transformation
AWS Case Study: Amazon and Toyota: Infosys Modernises Toyota’s Vehicle Data Warehouse on AWS
Executive Summary
Every car that Toyota manufactures for operation in North America is meticulously logged in the company’s Vehicle Data Warehouse (VDW). The data includes lifecycle information regarding forecasted orders, planning, manufacturing, logistics, incentives, and stocks. Over time, the data volume grew while the processes and technologies maintaining the data warehouse aged. As a result, Toyota faced tight timelines for data processing and created an add-on process that extracted the data into a data lake for self-service analytics. Working with Infosys and leveraging Infosys Cobalt, Toyota modernised its VDW in a next-generation data lake on Amazon Web Services (AWS).
Ageing Data Warehouse Doesn’t Scale
Previously, Toyota operated its on-premises data warehouse as a capital investment, paying for maintenance and database technologies that were expensive, outdated, and not scalable. As an initial step toward modernisation, Toyota’s data platform team built an on-premises data lake using Hadoop and populated it with data from the VDW. However, this created a duplication of vehicle data and additional overhead, plus it delayed the availability of the data for self-service analytics for internal business users or applications built on top of the lake.
“Infosys helped us adopt DevOps for a lake house deployment, which introduced a robust software-development-lifecycle approach to effectively promote workflows and logic to production.”
– Baldev Marepally, Manager of Data Platforms, Toyota
Infosys Develops Lake House Architecture that Delivers Faster Insights
Together, Infosys and Toyota modernised the VDW by migrating to Amazon Redshift and leveraging a lake house architecture. The source data was converged directly into the data lake on AWS and all the legacy logic was converted to run directly in the data lake, providing enterprise-ready data marts for reporting. The modernisation reduced complexities, eliminated the duplication of data, improved data quality, and laid the groundwork for deeper analytics.
“This was the perfect opportunity for us to modernise in the AWS Cloud and position ourselves as another tenant in the lake,” said Baldev Marepally, Manager of Data Platforms at Toyota. “With a modernised data lake on AWS, all our needs around on-demand ephemeral processing and scalability are met, and the data we need is ready to be consumed by other applications much faster than before. Centralised governance and having the VDW as part of the data hub enable a quick turnaround for new use cases with just a few clicks.”
Infosys and the data platform team performed a one-time sync using AWS Database Migration Service to facilitate the migration and then began improving the reporting capabilities. Taking a big data approach, Infosys configured ingestion across multiple data sources and integrated data quality and governance processes. “Before, they faced constraints on-premises where their analysis was somewhat restricted,” said Radhakrishnan Bharathi, Program Director of Data Analytics and Digital Transformation at Infosys. “This lake house architecture on AWS brings scalability and a platform for data science where a lot of activities can happen.”
“Amazon Redshift provides a read-optimised layer, with really good performance that drives our data mart and analytical tools.”
– Baldev Marepally, Manager of Data Platforms, Toyota
Read-Optimised Redshift Layer Plays a Vital Role
For internal business users, the built-in concurrent scalability features of Redshift make it possible to perform additional ad-hoc analyses with guaranteed performance. In addition, Redshift Spectrum and the underlying fleet of warm capacity allow end users to bring in additional data assets by simply defining metadata and enabling access. “These capabilities will help our internal teams—from vehicle, logistics, inventory, and sales—make their operations even more efficient,” Marepally said. “The Amazon Redshift layer is about two times faster, especially for daily data loads and self-service analysis. Now these users are empowered to explore the VDW further and find new opportunities.”
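To illustrate what "simply defining metadata" can look like with Redshift Spectrum, the following is a minimal Python sketch that assembles a `CREATE EXTERNAL TABLE` statement for data already stored in Amazon S3. The schema name, table name, columns, and bucket path are hypothetical placeholders, not details from Toyota's environment, and the sketch only builds the SQL text; it does not connect to a cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalTable:
    """Metadata describing a data set that already lives in S3 (all values hypothetical)."""
    schema: str               # an external schema assumed to exist in Redshift
    name: str
    columns: dict[str, str]   # column name -> Redshift data type
    s3_location: str
    file_format: str = "PARQUET"


def create_external_table_sql(table: ExternalTable) -> str:
    """Build the DDL that registers the S3 data as a queryable external table."""
    cols = ",\n  ".join(f"{col} {typ}" for col, typ in table.columns.items())
    return (
        f"CREATE EXTERNAL TABLE {table.schema}.{table.name} (\n  {cols}\n)\n"
        f"STORED AS {table.file_format}\n"
        f"LOCATION '{table.s3_location}';"
    )


# Hypothetical example: a new data asset a business team wants to join with the VDW.
table = ExternalTable(
    schema="spectrum_demo",
    name="dealer_inventory",
    columns={"vin": "VARCHAR(17)", "dealer_id": "INTEGER", "received_on": "DATE"},
    s3_location="s3://example-bucket/dealer_inventory/",
)
ddl = create_external_table_sql(table)
print(ddl)
```

Once a statement like this has been run and access granted, the data can be queried in place without being loaded into the cluster, which is the property the paragraph above describes.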
DevOps Platform Automates Data Process
Modernising the VDW gave Toyota the chance to integrate DevOps pipelines to more easily deploy new features and capabilities. “For DevOps, the difference is day and night,” Marepally said. “Infosys built and enhanced reusable frameworks, which simplified development of the ingestion and transformation process, including batching and streaming. These frameworks have standardised the cloud migration process and helped complete it at a faster pace. The technologies adopted enabled scaling of resources and execution of workflows in parallel, which was difficult on-premises due to server capacity.”
Specifically, Infosys developed a framework that standardised the ingestion process using six config-driven templates. Instead of manually building new workflows and syncing all aspects of the environment, the Toyota team can now use templates that automate ingestion. Infosys also extended an existing transformation framework to work on AWS, which simplified maintenance, deployment, troubleshooting, and adoption.
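The case study does not describe the six templates themselves, so the following is only a minimal Python sketch of the general config-driven pattern: each source is described by a small configuration, and a registry maps the configured template name to reusable ingestion logic. The template names (`full_load`, `incremental`), the config keys, and the sample rows are all hypothetical and are not taken from the Infosys framework.

```python
from typing import Callable

Row = dict
Template = Callable[[dict, list], list]

# Registry of reusable ingestion templates, keyed by the name used in configs.
TEMPLATES: dict[str, Template] = {}


def template(name: str):
    """Decorator that registers an ingestion template under a config name."""
    def register(fn: Template) -> Template:
        TEMPLATES[name] = fn
        return fn
    return register


@template("full_load")  # hypothetical template: take every row from the source
def full_load(cfg: dict, rows: list) -> list:
    return list(rows)


@template("incremental")  # hypothetical template: take rows newer than a watermark
def incremental(cfg: dict, rows: list) -> list:
    key = cfg["watermark_column"]
    return [r for r in rows if r[key] > cfg["last_watermark"]]


def run_ingestion(cfg: dict, rows: list) -> list:
    """Select the template named in the config and apply it to the source rows."""
    try:
        handler = TEMPLATES[cfg["template"]]
    except KeyError:
        raise ValueError(f"unknown template: {cfg.get('template')!r}") from None
    return handler(cfg, rows)


# Onboarding a new source means writing a config, not a new workflow.
rows = [{"vin": "A", "updated": 1}, {"vin": "B", "updated": 5}]
config = {"template": "incremental", "watermark_column": "updated", "last_watermark": 3}
loaded = run_ingestion(config, rows)
print(loaded)
```

The design point is that workflow logic lives once in the templates, while per-source differences are confined to configuration that can be versioned and promoted through a DevOps pipeline.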
Endless Possibilities for Advanced Analytics and Data Science
With faster performance, plus a more accessible and scalable architecture, the possibilities for advanced analytics are virtually endless. “It’s open to people’s imaginations at this point,” Marepally said. “With the VDW in the lake, analysis doesn’t require duplicating or refreshing the data. Our end users can do advanced analytics, MLOps, or build a model.” The modernised architecture laid the foundation for deeper data science at Toyota, which will not only unlock new insights, but also have a ripple effect on downstream applications.