SERVERLESS DATA & ML PIPELINES

CLIENT BACKGROUND

The client was building a data platform for data science use cases. The goal was to ingest large volumes of data from disparate sources, transform and enrich the ingested data into a unified representation, and serve the result to machine learning models for training and prediction.

The data flow consisted of the following components:
- Ingestion Layer: A set of tools and pipelines that acquire data from different sources, such as relational databases, REST APIs, semi-structured file formats, studies and other documents.
- Data Lake: Centralized storage for all data assets with complete governance, including inventory, provenance, access control and audit.
- Batch Layer: A set of pipelines that cleanse, format, enrich and label data for subsequent ML model training.
- Feature Store: A single place to keep, curate and serve features to ML models.
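
As a rough illustration of how these components hand data to one another, here is a minimal in-memory sketch. It is not the client's implementation: the class and function names (DataLake, ingest, batch_cleanse, to_feature_store) and the record fields are hypothetical, and a real platform would use managed storage and governance services rather than Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataLake:
    """In-memory stand-in for centralized storage (hypothetical)."""
    assets: dict[str, list[dict[str, Any]]] = field(default_factory=dict)
    provenance: dict[str, str] = field(default_factory=dict)

    def put(self, name: str, records: list[dict[str, Any]], source: str) -> None:
        self.assets[name] = records
        self.provenance[name] = source  # remember where the asset came from


def ingest(lake: DataLake, name: str, source: str,
           raw_records: list[dict[str, Any]]) -> None:
    """Ingestion layer: land raw records in the lake unchanged."""
    lake.put(name, list(raw_records), source)


def batch_cleanse(lake: DataLake, name: str) -> list[dict[str, Any]]:
    """Batch layer: drop records with missing values, lowercase the keys."""
    cleaned = []
    for record in lake.assets[name]:
        if any(value is None for value in record.values()):
            continue
        cleaned.append({key.lower(): value for key, value in record.items()})
    return cleaned


def to_feature_store(store: dict[Any, dict[str, Any]], entity_key: str,
                     records: list[dict[str, Any]]) -> None:
    """Feature store: index the remaining fields by entity key."""
    for record in records:
        store[record[entity_key]] = {
            key: value for key, value in record.items() if key != entity_key
        }
```

A caller would create a DataLake, call ingest with raw records, and pass the output of batch_cleanse to to_feature_store together with the name of the key field.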

CHALLENGES

The client faced high costs for developing, deploying and, most importantly, operating the data platform, including the Data Lake, Ingestion and ML Pipelines.
The pipelines ran mostly on EC2 instances, which increased operating costs and made deploying and testing pipelines in lower environments time-consuming.

SOLUTION

The proposed solution split the monolithic ingestion service and the ML data preparation pipelines into a number of small, autonomous serverless functions.
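
To give a sense of what one such small, autonomous function might look like, the sketch below uses the standard AWS Lambda Python handler signature, handler(event, context). Everything else is an assumption made for illustration: the event shape (a "records" list), the normalize_record helper and the unified schema are hypothetical and are not taken from the client's system.

```python
import json
from typing import Any


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Map a source-specific record onto a unified shape (hypothetical schema)."""
    return {
        "source": record.get("source", "unknown"),
        "entity_id": str(record["id"]),
        "payload": {
            key: value
            for key, value in record.items()
            if key not in ("id", "source")
        },
    }


def handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
    """Lambda-style entry point that performs a single task: normalization."""
    records = event.get("records", [])  # assumed event shape
    normalized = [normalize_record(record) for record in records]
    return {
        "statusCode": 200,
        "body": json.dumps({"count": len(normalized), "records": normalized}),
    }
```

Keeping each function limited to one step means it can be deployed and tested on its own, which is the property the split relies on.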

SIMILAR CASES

[CICD] SERVERLESS DEPLOYMENT AUTOMATION

The client needed to automate and unify the deployment of serverless applications on AWS Lambda. Manual deployment caused several problems: environment setups differed between deployments (inconsistent versions of Serverless, Python, Node, etc.), and there was no single place to control environment changes.


[IaC] DATA SCIENCE INFRASTRUCTURE

The client needed to scale research collaboration within its data science team by moving research capabilities into cloud workloads, and to automate and unify the deployment of AWS resources in order to reduce the time and effort the data scientists need to build and test models.


[CICD] MOBILE CROSS PLATFORM

A low-code/no-code platform that allows small and medium businesses to use white-label mobile application solutions for business purposes. To meet functionality demands, frequent releases are highly important to the client.

SERHII YELCHENKO Delivery Director

We are a cloud-native company that sees cloud computing as the home for tech products. Our team of top-notch engineers specializes in cloud solutions: we develop scalable cloud-native applications, provide DevOps services that facilitate innovation and allow products to be released faster, and build reliable and secure cloud infrastructure for our clients in the US and Europe.
