Giorgos Myrianthous

6.6K Followers


Published in Towards Data Science

·Pinned

How to Skip Tasks in Airflow DAGs

Skipping tasks in Airflow DAGs based on specific conditions — Recently, I was attempting to add a new task in an existing Airflow DAG that would only run on specific days of the week. However, I was surprised to find that skipping tasks in Airflow isn’t as straightforward as I anticipated. In this article, I will demonstrate how to skip…

Python

9 min read



Published in Towards Data Science

·Pinned

ETL vs ELT: What’s the Difference?

A comparison between ETL and ELT in the context of Data Engineering — ETL (Extract-Transform-Load) and ELT (Extract-Load-Transform) are two terms commonly used in the realm of Data Engineering and more specifically in the context of data ingestion and transformation. While these terms are often used interchangeably, they refer to slightly different concepts and have different implications for the design of a data…

Programming

8 min read



Published in Towards Data Science

·Pinned

requirements.txt vs setup.py in Python

Understanding the purpose of requirements.txt, setup.py and setup.cfg in Python when developing and distributing packages: Managing dependencies in Python projects can be quite challenging, especially for people new to the language. When developing a new Python package, chances are you will also need to use other packages that help you write less code (in less time) so that you don’t have…

Python

7 min read



Published in Towards Data Science

·Pinned

Kafka No Longer Requires ZooKeeper

Version 2.8.0 Gives You Early Access to ZooKeeper-Less Kafka: Apache Kafka 2.8.0 is finally out and you can now get early access to KIP-500, which removes the Apache ZooKeeper dependency. Instead, Kafka now relies on an internal Raft quorum that can be activated through Kafka Raft metadata mode. The new feature simplifies cluster administration and infrastructure management and marks a…

Programming

5 min read



Published in Towards Data Science

·Pinned

Speeding Up the Conversion Between PySpark and Pandas DataFrames

Save time when converting large Spark DataFrames to Pandas: Converting a PySpark DataFrame to Pandas is quite trivial thanks to the toPandas() method; however, this is probably one of the most costly operations and must be used sparingly, especially when dealing with fairly large volumes of data. Why is it so costly? Pandas DataFrames are stored in-memory, which means that the operations over them are faster…

Python

3 min read



Published in Level Up Coding

·1 day ago

How To Fix ImportError: No module named openai

Diagnosing and fixing ImportError when importing the OpenAI package in Python modules: The recent launch of ChatGPT, the web interface to OpenAI’s GPT-3.5 models, has sparked a surge of interest from millions of users globally. The forthcoming release of GPT-4 from OpenAI is also generating excitement. …

Python

5 min read



Published in Level Up Coding

·Mar 20

How to Sign Commits With GPG Key

Autosigning Git commits with GPG keys on GitHub — Git is a popular version control system used by developers all over the world to manage their codebase efficiently. However, the security of Git has become a major concern in recent years, as cyber-attacks on software projects have become more common. …

Programming

5 min read



Published in Towards Data Science

·Mar 10

How To Improve The Performance of Python Functions

Speeding up frequently called functions in Python — In today’s world, where the amount of data being processed is growing at an unprecedented rate, having efficient and optimized code has become more important than ever. Python, being a popular programming language, offers several built-in tools to optimize the performance of your code. One of these tools is the…

Python

9 min read



Published in Level Up Coding

·Mar 8

HackerRank SQL Solution For Interviews Problem

Approaching and solving the ‘Interviews’ challenge on HackerRank with SQL — Consistent practice is essential for improving your SQL skills or refreshing your knowledge, especially when preparing for interviews. Fortunately, there are several platforms, including HackerRank, that offer a range of SQL challenges to help you enhance your problem-solving abilities. In this tutorial, we will focus on one of HackerRank’s medium-level…

Programming

5 min read



Published in Towards Data Science

·Mar 6

RANK() vs DENSE_RANK() vs ROW_NUMBER() in SQL

Understanding the difference between these window functions in SQL — In the world of SQL, a window function is a powerful construct that allows users to segment and manipulate data in precise ways. By grouping data based on specific columns and sorting criteria, window functions enable advanced computations within partitions. In this comprehensive tutorial, we will explore three of the…

Programming

6 min read


I write about Python, DataOps and MLOps

Following
  • Cassie Kozyrkov
  • Jim Kwik
  • Tim Denning
  • Conor O'Sullivan
  • Cole Hagen
