Learn how to find and start using a complex database to practice SQL

Image by Caspar Camille Rubin. Source: Unsplash

SQL has become one of the top skills for professionals. Every day, more companies start using relational databases. For this reason, professionals who want to grow in their careers understand that learning SQL is a must. Even professionals who have used SQL for a few years (like myself) can get confused by all the SQL commands and know that they need to practice. Another problem is that there are multiple relational database management systems, and many essential commands are not the same across them.

Here’s the problem. There are hundreds of online courses that can teach you how to use SQL. However, most of…


Learn how to use TextBlob to fix misspellings and improve the model for your Natural Language Processing project

Image by Pixabay. Source: Pexels

When working with Natural Language Processing, you will inevitably find misspelled words. Even the best writers in the world make typos. This article, for example, will be read at least three times before I push the Publish button. Still, it will have mistakes. It doesn’t matter if you are working with a dataset of product reviews or tweets; typos will be there. The problem is that this might affect the accuracy of NLP models because some important words can be missed if they are misspelled.

Luckily, TextBlob can fix this, and I will show you how you can apply it…


Learn how to run over 40 machine learning models using Lazy Predict for regression projects

Image by Malte Helmhold. Source: Unsplash

Let’s say you need to work on a regression machine learning project. You analyze your data, do some data cleaning, create a few dummy variables, and now it’s time to run a machine learning regression model. What are the top ten models that come to your mind? Most of you probably don’t even know that there are ten regression models out there. Don’t worry if you don’t know because, by the end of this article, you will be able to run not only ten machine learning regression models but over 40!

A few weeks ago, I wrote the How to…

Learn how to run multiple machine learning models using Lazy Predict.

Image by Keira Burton. Source: Pexels

When you start a new supervised machine learning project, one of the first steps is to analyze the data we have, understand what we are trying to accomplish, and determine which machine learning algorithms could help us achieve our goals. While the scikit-learn library makes our lives easier by making it possible to run models with a few lines of code, it can also be time-consuming when you need to test multiple models. …


In this blog, I walk you through my transition from Marketing to Data Science and why you should consider doing the same

Image by Karolina Grabowska. Source: Pexels

One of the questions that I get the most as a data scientist with a background in marketing and business intelligence is why I switched careers. That is interesting because I never decided to move away from marketing and into another field. It was an evolution of my job that brought me to data science. In fact, I have a Bachelor of Science in marketing, a type of marketing degree focused on data analysis and business intelligence. Thus, this transition was going to happen at some point in my career.

Since my day one working…


Create a complete data science project using PyCaret, from data cleaning to complex machine learning models

Image by Vlada Karpovich. Source: Pexels

A reader recently asked me in one of my blogs if I had tried PyCaret. I promised that I would try it, and I am so glad I did. PyCaret allows you to run a whole data science project, from data cleaning and dealing with class imbalance to hyper-tuned machine learning models, with two lines of code. Don't believe me? That's okay. I couldn't believe it either when I tried it the first time, but the fact is that it works. Let me show you PyCaret in action first, and then we can dive deeper into this library. For demonstration purposes, I will…


Learn how and why you should use the Lux library for your next Data Science project

Image by Diana Dima. Source: Unsplash

You just started your new Data Science project. You took a quick look at the dataset, and now it’s time for some exploratory data analysis. You need to create a few visualizations and figure out what you are trying to find out. You probably start writing the code using Matplotlib and Seaborn, two great libraries that can be very time-consuming to use. What if I told you that you could skip these steps and create visualizations with literally one line of code? That’s what Lux aims for.

Are you not convinced? Let me give you a demonstration, and then we can dive in and…

Hi there. Very good question. My next blog is actually going to be about PyCaret. I have not started it yet, so I don't know the answer. PyCaret is very new, so I'm also curious to know more about it. I will have some thoughts about it soon. Stay tuned!


Learn how to use lambda functions with Pandas — code along

Image by Vlada Karpovich. Source: Pexels

When I started learning Python, I remember how lost I felt when I got to the lambda functions section. I tried reading articles and watching videos about it, but it took me a while to finally learn how to create complex lambda functions for my data science projects. One of the biggest problems with articles on the internet is that they explain how lambda functions work and show a simple example that is easy to understand, but in real life, it’s never as easy as they make it look. For this reason, in this article, I will explain how to use lambda…


Learn how to run multiple machine learning models using lazy predict — code along

Image by Keira Burton. Source: Pexels

When starting a new supervised Machine Learning project, one of the first steps is to analyze the data, understand what we are trying to accomplish, and determine which machine learning algorithms could help us achieve our goals. While the scikit-learn library makes our lives easier by making it possible to run models with a few lines of code, it can also be time-consuming when you need to test multiple models. …

Ismael Araujo

Data Scientist // Machine Learning Engineer // Based in NYC // Astrophysics Enthusiast // https://www.linkedin.com/in/ismael-araujo/

