Apache Kafka 2.8.0 is finally out, and you can now get early access to KIP-500, which removes the Apache ZooKeeper dependency. Instead, Kafka now relies on an internal Raft quorum that can be activated through Kafka Raft metadata mode. The new feature simplifies cluster administration and infrastructure management and marks a new era for Kafka itself.
In this article, we are going to discuss why there was a need to remove the ZooKeeper dependency in the first place. Additionally, we will discuss how ZooKeeper has been replaced by KRaft mode as of version 2.8.0 …
Converting a PySpark DataFrame to Pandas is quite trivial thanks to the toPandas() method; however, this is probably one of the most costly operations and must be used sparingly, especially when dealing with a fairly large volume of data.
Pandas DataFrames are stored in memory, which means that operations over them are faster to execute; however, their size is limited by the memory of a single machine.
On the other hand, Spark DataFrames are distributed across the nodes of the Spark cluster, which consists of at least one machine, and thus the size of the DataFrames is limited by the size of…
Millions of people use Google to search and access information, but only a few take advantage of its full potential. Most people will type a bunch of keywords and let Google figure it out.
If you don’t know, Google it.
There are actually a few operators you can use in your search queries in order to narrow down the search results. In this article we’ll explore the most popular Google Search operators that can help you use Google more efficiently and get better results.
Python 3.9 has been around since late 2020 and it comes with a bunch of new syntax features and improvements. In this article, we are going to explore a few nice additions which were packed into this version. We will also discuss how to upgrade to Python 3.9 in case you want to.
As of Python 3.9, you can use the | operator to merge two or more dictionaries together. For duplicate keys, the rightmost dictionary takes precedence. This syntactic sugar is part of PEP 584.
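As a minimal sketch of the merge operator described above (the dictionary names and values are made up for illustration, and Python 3.9 or later is assumed):

```python
# Requires Python 3.9 or later (PEP 584).
# `defaults` and `overrides` are hypothetical example dictionaries.
defaults = {"host": "localhost", "port": 8080}
overrides = {"port": 9090, "debug": True}

# `|` builds a new dictionary; for the duplicate key "port",
# the value from the rightmost operand wins.
merged = defaults | overrides
print(merged)  # {'host': 'localhost', 'port': 9090, 'debug': True}

# `|=` is the in-place variant and updates the left-hand dictionary.
settings = dict(defaults)
settings |= overrides
```

Note that `|` leaves both operands untouched, whereas `|=` mutates the dictionary on its left.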
Tuples are among the most fundamental and widely used data structures in Python. What most people don’t know — or usually forget about — is that the language comes with an extension type called Named Tuple that is built on top of the core tuple type.
In this article, we are going to explore named tuples — a rarely used collection that enhances standard tuples. We will discuss their syntax, how to use them in your code and most importantly when.
Dictionaries offer key lookup and are a perfect fit for cases where we need to create key-value data structures using mnemonic…
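To make the idea concrete, here is a small sketch of a named tuple built with the standard library's collections.namedtuple. The Point type and its fields are hypothetical and exist only for illustration:

```python
from collections import namedtuple

# `Point` is a hypothetical record type used only for illustration.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1.5, y=-2.0)

# Fields can be read by name as well as by position.
print(p.x, p[1])

# A named tuple is still a regular tuple, so unpacking works as usual.
x, y = p

# Helper methods: convert to a dict, or build a modified copy
# (tuples are immutable, so `_replace` returns a new instance).
as_dict = p._asdict()
moved = p._replace(x=0.0)
```

Compared with a plain dictionary, the field names are fixed when the type is defined, and instances stay immutable like ordinary tuples.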
Python is a general-purpose programming language that can be used to build projects of any size and gives developers the means to write clear, logical code, even for large-scale projects.
The design philosophy of Python facilitates code readability by enforcing the use of indentation in order to define code blocks explicitly. Just having well-indented code, though, doesn’t necessarily mean that your code is also clear and well-written.
“Readability counts.” — The Zen of Python
One of the biggest advantages of Python compared to most other general-purpose programming languages is that it is less verbose. Sacrificing code readability for…
Version Control Systems, such as Git, are essential tools for versioning and archiving source code. Version control helps you keep track of the changes in the code. When a change is made, an error could be introduced too, but with source control tools, developers can roll back to a working state and compare it against the non-working piece of code. This minimizes the disruption to other team members who are probably working with the code and helps them collaborate efficiently.
Apart from code, data changes too.
Usually, Data Scientists need to access a range of datasets to complete a specific…
for loops are definitely some of the most commonly used statements. Python comes with a rarely used piece of syntactic sugar that allows else clauses to be used along with loop statements.
“Loop statements may have an else clause.” — Python docs
In this article, we are going to discuss the ability of loop statements to have else clauses. Additionally, we will explore how this particular syntactic sugar can be used in common programming constructs and make our code more readable and even more Pythonic.
In for:else statements, the else clause is executed upon the exhaustion of the iterable (i.e. when the…
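A short sketch of the construct follows. The function name find_first_negative is hypothetical and only serves to illustrate the behaviour: the else branch runs when the loop exhausts the iterable without hitting break.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none.

    Hypothetical helper used only to illustrate for/else.
    """
    for n in numbers:
        if n < 0:
            result = n
            break  # leaving the loop early skips the else clause
    else:
        # Reached only when the iterable was exhausted without a break,
        # including the case where the iterable is empty.
        result = None
    return result


print(find_first_negative([3, -1, 4]))
print(find_first_negative([1, 2, 3]))
```

Without the else clause, the same logic would typically need an extra flag variable to record whether the loop was interrupted.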
One of the most frustrating things you may have to deal with when generating an Excel file using Python is ending up with numerous columns that you cannot read because they are too narrow. Ideally, you should deliver spreadsheets in which all the columns are properly formatted and readable.
In this article, we are going to explore quick and easy ways one can use for
Stack Overflow is the leading community for developers where people are able to ask and answer programming-related questions. More than 21 million questions have been asked, more than 31 million answers have been provided, and more than 80 million comments have been made! I have to admit that most of the posts are pretty bad, but there are definitely tons of answers that are amazingly useful, well-written, and justified.
Apart from the very basic questions asking how to print a string with Python, Java, or Go, there are also numerous unpopular questions to which you’ll find some useful answers. …
Machine Learning Engineer | Python Developer