How to Apply an ML Model to a Large CSV in Python

As databases and computational requirements keep growing, it is important to know how to handle large amounts of data. This quick guide will teach you how to deploy an ML model on a potentially unlimited number of rows.

Giovanni Valdata
6 min read · Mar 9, 2022
Credits to fullvector @ Freepik.com

Introduction and Problem Statement

A simple definition of a data set is “a collection of data”. If you work in Data Science or Machine Learning, sooner or later you will face the challenge of deploying a model on a large number of rows. This article explores a possible, and hopefully definitive, solution to that problem. I worked on a project that required the information I’m about to share, and it was incredibly difficult to find and even harder to apply to my case. Many guides were not practical: most of the time they were a “list” of solutions that were never explored in depth. If you’re here, you’re probably familiar with the error “The kernel appears to have died. It will restart automatically”.
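To make the problem concrete, here is a minimal sketch of one common way to keep memory usage flat: stream the CSV in fixed-size batches and score each batch before reading the next. It is an illustration under stated assumptions, not necessarily the approach this guide settles on. The names `predict_batch`, `iter_batches` and `score_csv`, the columns `x1` and `x2`, and the threshold rule are all hypothetical stand-ins; in practice you would replace `predict_batch` with a call to your own trained model.

```python
import csv
import io
from itertools import islice


def predict_batch(rows):
    """Hypothetical stand-in for a real model: flags rows where x1 + x2 > 1."""
    return [int(float(r["x1"]) + float(r["x2"]) > 1.0) for r in rows]


def iter_batches(reader, batch_size):
    """Yield lists of at most `batch_size` rows, so only one batch is in memory."""
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def score_csv(src, dst, batch_size=1000):
    """Read rows from `src`, append a `prediction` column, write them to `dst`.

    `src` and `dst` are open text file objects. Returns the number of rows scored.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=list(reader.fieldnames) + ["prediction"])
    writer.writeheader()
    total = 0
    for batch in iter_batches(reader, batch_size):
        predictions = predict_batch(batch)
        for row, prediction in zip(batch, predictions):
            writer.writerow({**row, "prediction": prediction})
        total += len(batch)
    return total


if __name__ == "__main__":
    # Tiny in-memory example; with real files you would pass
    # open("input.csv", newline="") and open("output.csv", "w", newline="").
    source = io.StringIO("x1,x2\n0.2,0.3\n0.9,0.4\n0.5,0.5\n")
    target = io.StringIO()
    rows_scored = score_csv(source, target, batch_size=2)
    print(rows_scored)
    print(target.getvalue())
```

Because each batch is discarded as soon as its predictions are written, peak memory depends on `batch_size` rather than on the size of the file, which is what keeps the kernel from dying on very large inputs.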
