Photo by Noah Bogaard on

Converting a PySpark DataFrame to pandas is trivial thanks to the toPandas() method. However, it is one of the most costly operations in PySpark and should be used sparingly, especially when dealing with large volumes of data.

Why is it so costly?

Pandas DataFrames are stored in memory, which means that operations over them are faster…

Photo by Kelly Sikkema on Unsplash


Text summarisation refers to a set of Natural Language Processing (NLP) techniques that shorten an original text or transcription while preserving its key information.

Text summarisation is useful in contexts where there’s a need for consuming large chunks of…

Photo by Jason Rosewell on Unsplash


In one of my latest articles we explored how to perform offline speech recognition with the AssemblyAI API and Python. In other words, we uploaded the desired audio file to a hosting service and then used the transcript endpoint of the API to perform speech-to-text.

In today’s guide…

Photo by Marcus Dall Col on Unsplash


Amazon Elastic Compute Cloud (EC2) is a web service offering cloud compute capacity and is among the most popular offerings in Amazon Web Services. The service lets users rent virtual machines, store data on virtual drives, distribute load across different machines and scale services.

AWS offers different types of EC2…

Photo by Matt Botsford on Unsplash

What is Speech Recognition?

Speech Recognition, also known as Automatic Speech Recognition or Speech-to-Text, is a field at the intersection of Computer Science and Computational Linguistics that develops techniques enabling computer systems to process human speech and convert it into text. …

Photo by Wood Dan on Unsplash


We often need to apply functions over DataFrame columns or rows in order to update values or create new columns. The most commonly used methods for doing so in pandas are apply, map and applymap.

In today’s guide we are going to explore all three methods…
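As a rough illustration of how the three methods differ, here is a minimal sketch on a made-up two-column DataFrame (the column names and lambdas are assumptions chosen for the example, not taken from the article). Note that recent pandas releases (2.1 and later) deprecate DataFrame.applymap in favour of DataFrame.map, so the sketch picks whichever is available.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Series.map: element-wise transformation of a single column
doubled_a = df["a"].map(lambda x: x * 2)

# DataFrame.apply with axis=1: the function receives one row at a time
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Element-wise transformation of the whole DataFrame.
# pandas >= 2.1 exposes this as DataFrame.map; older versions as applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda x: x ** 2)

print(doubled_a.tolist())
print(row_sums.tolist())
print(squared)
```

In short, map works on one Series, apply works along an axis of the DataFrame, and applymap (or DataFrame.map) touches every cell.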

Photo by eberhard 🖐 grossgasteiger on Unsplash


In one of my recent articles I discussed the deployment models in Cloud Computing, namely the private, public and hybrid models. Apart from deployment models, it’s equally important to know the three different types of cloud computing and be able to recognise them.

In the past, every organisation had…

Photo by Steve Johnson on Unsplash


Counting the number of occurrences of list elements in Python is a fairly common task. In today’s short guide we will explore a few different ways of counting list items. More specifically, we will explore how to

  • count the number of occurrences of a single list item
  • count the…
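The teaser is cut off before naming any approach, so the following is only a minimal sketch of two standard-library options: list.count() for a single item, and collections.Counter for every item at once. The sample list is made up for illustration.

```python
from collections import Counter

# Hypothetical sample list, used only for illustration
items = ["a", "b", "a", "c", "a", "b"]

# Count the occurrences of a single item
single_count = items.count("a")

# Count the occurrences of every item in one pass
all_counts = Counter(items)

print(single_count)
print(all_counts)
```

list.count() scans the whole list on each call, so Counter is usually the better fit when counts for many different items are needed.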

Giorgos Myrianthous

Machine Learning Engineer | Python Developer |
