Example: Reading a Large CSV File in PHP for Database Import

Source: Internet
Author: User
Tags: spl, import, database

A CSV file with millions of rows can reach hundreds of megabytes, and a naive read of the whole file is likely to time out or hang.

Batch processing is necessary to successfully import data from a CSV file into a database.

The following function reads a specified number of rows from a CSV file, starting after a given offset:

The code is as follows:

/**
 * csv_get_lines Read a number of rows from a CSV file
 * @param string $csvfile CSV file path
 * @param int $lines number of rows to read
 * @param int $offset number of rows to skip; the row at this index is also discarded
 * @return array|false
 */
function csv_get_lines($csvfile, $lines, $offset = 0) {
    if (!$fp = fopen($csvfile, 'r')) {
        return false;
    }
    $i = $j = 0;
    // Skip $offset lines plus the line at the offset itself
    // (with $offset = 0 this discards the header row).
    while (false !== ($line = fgets($fp))) {
        if ($i++ < $offset) {
            continue;
        }
        break;
    }
    $data = array();
    // Stop at end of file instead of appending false to the result.
    while ($j++ < $lines && false !== ($row = fgetcsv($fp))) {
        $data[] = $row;
    }
    fclose($fp);
    return $data;
}
Usage:

$data = csv_get_lines('path/bigfile.csv', 2000000);

print_r($data);

The function locates its starting row by reading and discarding lines until the file pointer reaches the requested offset.
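
Because the starting row is found by skipping lines, every call to the function above rescans the file from the beginning. When the whole file is processed in order, one way around this is to keep a single handle open and hand out fixed-size batches. The following is a minimal sketch of that idea, not the article's method; csv_read_batches is a hypothetical helper name, and the comma, double-quote and backslash CSV settings are assumptions.

```php
<?php
/**
 * Yield successive batches of parsed CSV rows from one open file handle.
 * csv_read_batches() is a hypothetical helper, not part of the article.
 */
function csv_read_batches($csvfile, $batch_size)
{
    $fp = fopen($csvfile, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $csvfile");
    }
    try {
        $batch = array();
        // fgetcsv() returns false at end of file.
        while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
            if ($row === array(null)) {
                continue; // fgetcsv() reports a blank line as array(null)
            }
            $batch[] = $row;
            if (count($batch) === $batch_size) {
                yield $batch;
                $batch = array();
            }
        }
        if ($batch) {
            yield $batch; // final, possibly shorter batch
        }
    } finally {
        fclose($fp);
    }
}

// Usage (each batch holds at most 1000 rows):
// foreach (csv_read_batches('path/bigfile.csv', 1000) as $batch) {
//     // import $batch here
// }
```

Only one batch is held in memory at a time, and the file is read exactly once.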

How the data is then stored in the database is not covered in detail in this article.
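
Although storage is outside the scope of this article, a rough sketch may help show where a batch of rows could go next. The helper below builds one multi-row INSERT with placeholders for a batch. The name build_batch_insert, the table name users and the columns id and name are all hypothetical, and each row is assumed to contain exactly one value per column.

```php
<?php
/**
 * Build one multi-row INSERT statement with positional placeholders.
 * The table and column names are hypothetical and must come from trusted
 * code, never from the CSV file, because they are interpolated into the SQL.
 */
function build_batch_insert($table, array $columns, array $rows)
{
    // One "(?, ?, ...)" group per row.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $group))
    );
    // Flatten the rows into a single parameter list in placeholder order.
    $params = array();
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }
    return array($sql, $params);
}

// Usage with PDO ($pdo is an already connected PDO instance, assumed here):
// list($sql, $params) = build_batch_insert('users', array('id', 'name'), $batch);
// $pdo->beginTransaction();
// $pdo->prepare($sql)->execute($params);
// $pdo->commit();
```

Sending one statement per batch inside a transaction usually costs far fewer round trips than one INSERT per row, though the best batch size depends on the database.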

The function above was tested on a 500 MB file and ran smoothly, but it was noticeably slow on a 1 GB file, so a faster approach was needed.

Two problems remain for handling large files quickly and completely.

1. How can the total number of rows in a large CSV file be obtained quickly?

Approach 1: Read the entire file into memory and split it on line breaks to count the rows. This works for small files but is not feasible for large ones.

Approach 2: Traverse the file line by line with fgets and count the rows. This is better than approach 1, but large files may still time out.

Approach 3: Use the SplFileObject class to seek to the end of the file and read the total number of rows from SplFileObject::key. This approach is feasible and efficient.

Implementation:

The code is as follows:

$csv_file = 'path/bigfile.csv';
$spl_object = new SplFileObject($csv_file, 'rb');
// The size in bytes is far larger than any line number, so seek() runs to the end of the file.
$spl_object->seek(filesize($csv_file));
echo $spl_object->key();
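
For comparison, the traversal idea of approach 2 can also be sketched with fixed-size chunks instead of fgets, counting newline characters in each chunk. This is a variant for illustration rather than the article's code: count_lines_chunked is a hypothetical helper name and the 1 MB chunk size is an arbitrary assumption.

```php
<?php
/**
 * Count lines by reading fixed-size chunks and counting "\n" characters.
 * count_lines_chunked() is a hypothetical helper; it counts physical lines,
 * so a quoted CSV field containing a newline is counted as two lines.
 */
function count_lines_chunked($file, $chunk_size = 1048576)
{
    $fp = fopen($file, 'rb');
    if ($fp === false) {
        return false;
    }
    $lines = 0;
    $last = '';
    while (!feof($fp)) {
        $chunk = fread($fp, $chunk_size);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $lines += substr_count($chunk, "\n");
        $last = substr($chunk, -1); // remember the final byte seen so far
    }
    fclose($fp);
    // A final line without a trailing newline still counts as a line.
    if ($last !== '' && $last !== "\n") {
        $lines++;
    }
    return $lines;
}
```

Memory use stays bounded by the chunk size, whatever the size of the file.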

2. How can rows be read quickly from a large CSV file?

Again use PHP's SplFileObject class, whose seek method positions the file on a given line.

The code is as follows:

$csv_file = 'path/bigfile.csv';
$start = 100000; // start reading at line 100000
$num = 100; // read 100 rows
$data = array();
$spl_object = new SplFileObject($csv_file, 'rb');
$spl_object->seek($start);
// Stop after $num rows or at end of file, whichever comes first.
while ($num-- && !$spl_object->eof()) {
    $data[] = $spl_object->fgetcsv();
    $spl_object->next();
}
print_r($data);
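
The same positioned read can also be written with the SPL LimitIterator class, which calls seek on the file object and stops after a fixed number of rows. The following is a sketch under assumptions, not the article's code: csv_slice is a hypothetical helper name, line numbers are zero-based, and the comma, double-quote and backslash CSV settings are assumed.

```php
<?php
/**
 * Read up to $num parsed rows starting at zero-based line $start.
 * csv_slice() is a hypothetical helper built on SplFileObject and LimitIterator.
 */
function csv_slice($csv_file, $start, $num)
{
    $file = new SplFileObject($csv_file, 'rb');
    // READ_CSV makes current() return each line already parsed into an array.
    $file->setFlags(SplFileObject::READ_CSV);
    $file->setCsvControl(',', '"', '\\');

    $data = array();
    // LimitIterator seeks to $start and stops after $num rows or at end of file.
    foreach (new LimitIterator($file, $start, $num) as $row) {
        $data[] = $row;
    }
    return $data;
}

// Usage: read 100 rows starting at line 100000.
// $data = csv_slice('path/bigfile.csv', 100000, 100);
```

With the READ_CSV flag set, the loop body needs no explicit fgetcsv or next calls.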

Combining the two points above, here is a class for reading CSV files:

The code is as follows:

class CsvReader {
    private $csv_file;
    private $spl_object = null;
    private $error;

    public function __construct($csv_file = '') {
        if ($csv_file && file_exists($csv_file)) {
            $this->csv_file = $csv_file;
        }
    }

    public function set_csv_file($csv_file) {
        if (!$csv_file || !file_exists($csv_file)) {
            $this->error = 'File invalid';
            return false;
        }
        $this->csv_file = $csv_file;
        // Drop any previously opened file so the next read reopens it.
        $this->spl_object = null;
        return true;
    }

    public function get_csv_file() {
        return $this->csv_file;
    }

    private function _file_valid($file = '') {
        $file = $file ? $file : $this->csv_file;
        if (!$file || !file_exists($file)) {
            return false;
        }
        if (!is_readable($file)) {
            return false;
        }
        return true;
    }

    private function _open_file() {
        if (!$this->_file_valid()) {
            $this->error = 'File invalid';
            return false;
        }
        if ($this->spl_object == null) {
            $this->spl_object = new SplFileObject($this->csv_file, 'rb');
        }
        return true;
    }

    // Read $length rows starting at 1-based row $start; a $length of 0 reads to the end.
    public function get_data($length = 0, $start = 0) {
        if (!$this->_open_file()) {
            return false;
        }
        $length = $length ? $length : $this->get_lines();
        $start = $start - 1;
        $start = ($start < 0) ? 0 : $start;
        $data = array();
        $this->spl_object->seek($start);
        while ($length-- && !$this->spl_object->eof()) {
            $data[] = $this->spl_object->fgetcsv();
            $this->spl_object->next();
        }
        return $data;
    }

    // Return the total number of lines by seeking to the end of the file.
    public function get_lines() {
        if (!$this->_open_file()) {
            return false;
        }
        $this->spl_object->seek(filesize($this->csv_file));
        return $this->spl_object->key();
    }

    public function get_error() {
        return $this->error;
    }
}

The class is used as follows:

include('CsvReader.class.php');

$csv_file = 'path/bigfile.csv';

$csvreader = new CsvReader($csv_file);

$line_number = $csvreader->get_lines();

$data = $csvreader->get_data(10);

echo $line_number, chr(10);

print_r($data);

In fact, the CsvReader class above is not limited to large CSV files. It also works for other large text files, provided the fgetcsv call in the class is changed to current.
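
As an illustration of that change, the sketch below reads raw lines from a plain text file with current instead of fgetcsv. It is a standalone function rather than a modification of the class; text_slice is a hypothetical helper name, line numbers are zero-based, and the DROP_NEW_LINE flag is an added assumption so that returned lines carry no trailing line break.

```php
<?php
/**
 * Read up to $num raw text lines starting at zero-based line $start.
 * text_slice() is a hypothetical helper for non-CSV text files.
 */
function text_slice($text_file, $start, $num)
{
    $file = new SplFileObject($text_file, 'rb');
    // DROP_NEW_LINE strips the trailing line break from each line.
    $file->setFlags(SplFileObject::DROP_NEW_LINE);
    $file->seek($start);

    $lines = array();
    while ($num-- > 0 && $file->valid()) {
        $lines[] = $file->current(); // the raw line, not parsed as CSV
        $file->next();
    }
    return $lines;
}

// Usage: read 100 lines of a log file starting at line 100000.
// $lines = text_slice('path/bigfile.log', 100000, 100);
```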
