Best Data Engineering
Updated daily. Rankings are calculated from verified user reviews, recency of updates, and community voting weighted by user reputation score.
Great Expectations is the leading open-source framework for data quality and validation. It allows data teams to define 'expectations' (unit tests for data) that ensure data meets specific quality standar...
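To make the "unit tests for data" idea concrete, here is a minimal plain-Python sketch. It does not use the Great Expectations API; the names `Expectation`, `expect_not_null`, `expect_between`, and `validate`, as well as the sample `orders` rows, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class Expectation:
    """A named, row-level check (hypothetical helper, not a library class)."""
    name: str
    check: Callable[[Row], bool]  # returns True when a row passes


def expect_not_null(column: str) -> Expectation:
    # Passes when the column is present and not None.
    return Expectation(f"{column} is not null",
                       lambda row: row.get(column) is not None)


def expect_between(column: str, low: float, high: float) -> Expectation:
    # Passes when the value is present and lies in [low, high].
    return Expectation(
        f"{column} between {low} and {high}",
        lambda row: row.get(column) is not None and low <= row[column] <= high,
    )


def validate(rows: list[Row], expectations: list[Expectation]) -> dict[str, int]:
    """Return the number of failing rows per expectation."""
    return {e.name: sum(1 for r in rows if not e.check(r)) for e in expectations}


# Illustrative sample data (assumed, not from any real dataset).
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -5.0},
    {"order_id": None, "amount": 10.0},
]

report = validate(orders, [expect_not_null("order_id"),
                           expect_between("amount", 0, 1000)])
print(report)
```

Each expectation fails on exactly one of the three sample rows, which is the kind of signal a data team would surface before loading the data downstream.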
Daasity is a data-as-a-service platform that provides ecommerce brands with a dedicated data warehouse. It is designed for companies that want full ownership of their data and the ability to build cus...
This certification validates skills in designing and implementing data processing and storage solutions on Azure. It covers topics like data ingestion, transformation, storage, and analysis using Azur...
While technically a transformation tool, dbt has become an essential component of the modern data integration stack. It allows data analysts and engineers to transform data inside their warehouse usin...
The Google Cloud Certified Professional Data Engineer certification validates expertise in designing and building data processing systems on Google Cloud. It covers topics such as data ingestion, tran...
StreamSets is a specialized platform for building and operating smart data pipelines. It excels in real-time streaming and complex data movement, making it ideal for high-velocity data environments. U...
Prefect is a modern workflow orchestration platform that focuses on developer experience and flexibility. Unlike traditional orchestrators, Prefect allows you to turn any Python function into a task,...
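The idea of turning an ordinary Python function into a task can be sketched with a plain decorator. This is not Prefect's implementation or API; the `task` decorator, the `task_runs` log, and the `clean` function below are hypothetical stand-ins that only record whether each call completed or failed.

```python
import functools
from typing import Any, Callable

# (task name, final state) for every call; a stand-in for an orchestrator's run history.
task_runs: list[tuple[str, str]] = []


def task(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a function so every call is recorded as a task run."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            result = fn(*args, **kwargs)
        except Exception:
            task_runs.append((fn.__name__, "failed"))
            raise  # let the caller see the original error
        task_runs.append((fn.__name__, "completed"))
        return result
    return wrapper


@task
def clean(values: list[str]) -> list[str]:
    # Ordinary business logic; the decorator adds the tracking.
    return [v.strip().lower() for v in values]


cleaned = clean(["  Alice", "BOB "])
print(cleaned, task_runs)
```

The function body stays unchanged; only the decorator adds orchestration concerns, which is the developer-experience point the description makes.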
Dagster is a data-aware orchestration platform that focuses on managing data assets rather than just tasks. By treating data as a first-class citizen, Dagster provides better visibility into data line...
This DataCamp track focuses on the skills needed for a data engineering role. It covers Python, SQL, cloud platforms (AWS), data warehousing, and ETL processes. The interactive coding environment and...
Ibis is a Python library that provides a unified, pandas-like interface for data manipulation across multiple backends, including DuckDB, BigQuery, Snowflake, and PostgreSQL. Its goal is to allow user...
Data Council is a conference focused on data engineering, data science, and machine learning. It features presentations from industry experts, case studies, and workshops. The event attracts data engi...
Polars-Lazy is the core engine behind the Polars library, focusing on query optimization and lazy evaluation. By building a query plan before executing it, Polars-Lazy can perform predicate pushdown,...
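Building a query plan before executing it, and then pushing predicates down, can be illustrated with a small plain-Python sketch. This is not Polars code and makes no claim about how Polars-Lazy is implemented; `LazyQuery`, its methods, and the sample `events` rows are hypothetical. The "optimizer" here is deliberately naive: it simply moves filters ahead of column selections, and it assumes every filtered column survives the selection.

```python
from typing import Any, Callable, Iterable, Iterator

Row = dict[str, Any]


def _filter(rows: Iterable[Row], column: str,
            predicate: Callable[[Any], bool]) -> Iterator[Row]:
    for row in rows:
        if predicate(row[column]):
            yield row


def _select(rows: Iterable[Row], columns: tuple[str, ...]) -> Iterator[Row]:
    for row in rows:
        yield {c: row[c] for c in columns}


class LazyQuery:
    """Records steps instead of running them; work happens only in collect()."""

    def __init__(self, source: Callable[[], Iterable[Row]]):
        self._source = source
        self._steps: list[tuple[str, Any]] = []

    def select(self, *columns: str) -> "LazyQuery":
        self._steps.append(("select", columns))
        return self

    def filter(self, column: str, predicate: Callable[[Any], bool]) -> "LazyQuery":
        self._steps.append(("filter", (column, predicate)))
        return self

    def optimized_plan(self) -> list[tuple[str, Any]]:
        # Naive predicate pushdown: a stable sort that runs filters first,
        # so later steps see fewer rows.
        return sorted(self._steps, key=lambda step: step[0] != "filter")

    def collect(self) -> list[Row]:
        rows: Iterable[Row] = self._source()
        for kind, arg in self.optimized_plan():
            if kind == "filter":
                rows = _filter(rows, *arg)
            else:
                rows = _select(rows, arg)
        return list(rows)


# Illustrative sample data (assumed).
events = [
    {"user": "a", "country": "DE", "amount": 10},
    {"user": "b", "country": "US", "amount": 99},
    {"user": "c", "country": "DE", "amount": 45},
]

query = (LazyQuery(lambda: events)
         .select("user", "amount")
         .filter("amount", lambda v: v > 20))
plan = [kind for kind, _ in query.optimized_plan()]
result = query.collect()
print(plan, result)
```

Although the filter was written after the selection, the plan runs it first, so the projection only touches rows that survive the predicate.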
Apache Airflow is the industry-standard platform for programmatically authoring, scheduling, and monitoring data workflows. It uses Python to define complex data pipelines as Directed Acyclic Graphs (...
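Defining a pipeline as a directed acyclic graph means each task declares what it depends on, and a scheduler derives a valid run order. The sketch below shows that idea with the standard-library `graphlib` module rather than Airflow itself; the task names and the `run_log` list are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable

run_log: list[str] = []  # records the order in which tasks actually ran


def make_task(name: str) -> Callable[[], None]:
    """Create a placeholder task that only logs its own name."""
    def run() -> None:
        run_log.append(name)
    return run


tasks = {name: make_task(name)
         for name in ("extract", "transform", "load", "report")}

# Each key depends on the tasks in its value set (its upstream tasks).
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields every task after all of its upstream tasks.
for name in TopologicalSorter(dependencies).static_order():
    tasks[name]()

print(run_log)
```

If the dependencies contained a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which is why such pipelines must be acyclic.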