Laravel: Import Very Large CSV With Jobs and Queues

Have you ever tried to import 1 million rows from CSV into the DB? It's a tricky task that can end up with performance issues, timeouts, or just running out of server resources. In this tutorial, I will show you one way of handling this with Jobs and Queues.

In short, our goal is to make sure the import process is not interrupted by a timeout like this:

We need to split the import process into smaller chunks and run them individually to fix this. This is where Laravel Queues and Jobs come in handy. Let's see how we can use them to import an extensive dataset from a file.


Our Task

Here's the situation:

  • We have a CSV file with 1 million rows (or more)
  • We need to import this file into our database
  • Records that already exist in the database should be updated
  • Records that don't exist in the database should be created
  • We need to make sure that the import process is not interrupted by a timeout

Of course, we've chosen a CSV file for convenience, but you might expect to get a JSON request or an XML request from an API endpoint. The idea is the same.


The Cause of the Problem

The problem becomes obvious if we take a look at the code:

use Illuminate\Http\Request;
 
// ...
 
public function import(Request $request)
{
$file = $request->file('csv');
 
// Saving the file to storage for reading it as CSV
// Otherwise, it will break even faster.
$file = $file->store('csv', ['disk' => 'public']);
 
// Opening the file for reading
$fileStream = fopen(storage_path('app/public/' . $file), 'r');
 
$csvContents = [];
 
// Reading the file line by line into an array
while (($line = fgetcsv($fileStream)) !== false) {
$csvContents[] = $line;
}
 
// Closing the file stream
fclose($fileStream);
 
// Defining the indexes of the CSV
$name = 0;
$category = 1;
$description = 2;
$price = 3;
$stock = 4;
 
// Loading the categories into an array
$categories = ProductCategory::pluck('id', 'name');
 
$skipHeader = true;
// Attempt to import the CSV
foreach ($csvContents as $content) {
if ($skipHeader) {
// Skipping the header column (first row)
$skipHeader = false;
continue;
}
 
$productCategory = $categories[$content[$category]] ?? ProductCategory::create([
'name' => $content[$category]
])->id;
 
// Updating or creating the product based on the name and category
Product::updateOrCreate([
'name' => $content[$name],
'category_id' => $productCategory,
], [
'name' => $content[$name],
'category_id' => $productCategory,
'description' => $content[$description],
'price' => $content[$price],
'stock_left' => $content[$stock],
]);
}
 
// Deleting the file from storage (it was temporary!)
Storage::delete($file);
 
return "Imported!";
}

It should instantly raise a concern as we are iterating over each of the lines. It might be fine if the import size is small, but once we look at a more extensive dataset, it will take a long time to finish. And if the process takes longer than the limit of maximum PHP execution time, it will fail altogether.

And now, we come to the solution: Queues and Jobs. We will base the same import on queued Jobs, which will split the load into smaller chunks, ensuring they are executed individually.

Bonus: This will also make the import process contain a fail-safe mechanism, allowing us to retry the import if it fails for some reason.


Step 1. Creating the Job

Our first Job will be responsible for reading the CSV file and creating another Job for each line. Let's call it ProcessImportJob:

php artisan make:job ProcessImportJob

The Job will look like this:

app/Jobs/ProcessImportJob.php

use App\Jobs\ProductsImport\ProccessProductImportJob;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Storage;
 
class ProcessImportJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
 
// Giving a timeout of 20 minutes to the Job to process the file
public $timeout = 1200;
 
public function __construct(private string $file)
{
}
 
public function handle(): void
{
$fieldMap = [
'name' => 0,
'category' => 1,
'description' => 2,
'price' => 3,
'stock' => 4,
];
 
// Open the file for reading
$fileStream = fopen($this->file, 'r');
 
$skipHeader = true;
while (($line = fgetcsv($fileStream)) !== false) {
if ($skipHeader) {
// Skip the header
$skipHeader = false;
continue;
}
// For each line, we dispatch a job to process the line
dispatch(new ProcessProductImportJob($line, $fieldMap));
}
 
// Close the file
fclose($fileStream);
 
// Deletes the file after processing
unlink($this->file);
}
}

Next, we will create...

The full tutorial [8 mins, 1511 words] is only for Premium Members

Login Or Become a Premium Member for $129/year or $29/month
What else you will get:
  • 58 courses (1056 lessons, total 44 h 09 min)
  • 78 long-form tutorials (one new every week)
  • access to project repositories
  • access to private Discord

Recent Premium Tutorials