Inserting large amounts of data into MySQL with LOAD DATA INFILE

Suppose you want to do some performance testing on your application and you want to insert a large number of database records, what is the fastest way to do it? You could write some Java code that creates a PreparedStatement and reruns it with different values of the parameters. However, MySQL provides a much faster way – using LOAD DATA INFILE to read in the data from a CSV file. Suppose you have a customer table with the following columns:
  • id
  • email
  • password
  • title
  • first_name
  • last_name
Say you’d like to insert 500,000 customers into this table. You can generate a CSV with some dummy customer data with a few lines of Java code:
import java.io.File;
import java.io.FileWriter;

public class CreateCustomerDataFile {
  public static void main(String[] args) throws Exception {
    final File f = new File("customers.csv");
    final FileWriter fw = new FileWriter(f);
    final String template = ""john.smithX@fake.com","password","Mr","John","Smith"n";
    final StringBuilder sb = new StringBuilder();
    for (int i=1; i<=500000; i++) {
      // make the e-mail unique by replacing john.smithX with john.smith1 etc
      // clear the StringBuilder
      sb.setLength(0);
      // initialise it to the template
      sb.append(template);
      // insert a unique number
      fw.write(sb.replace(11, 12, String.valueOf(i)).toString());
    }
    fw.flush();
    fw.close();
  }
}
 
By default the MySQL LOAD DATA INFILE command will load from the directory associated with the database that you are trying to load into, so once you have created the file, copy it to that location. e.g. on a Windows box, if the database you are using is called ecommerce_site, the directory will be something like C:Program FilesMySQLMySQL Server 5.0dataecommerce_site.

Then start up the MySQL command prompt with: mysql -uadmin -ppassword Once in the command prompt, you can read in the CSV with: load data infile 'customers.csv' into table ecommerce_site.customer fields terminated by ',' enclosed by '"' lines terminated by 'n' (email,password,title,first_name,last_name); In this example you'll need to specify the field separator as the default separator is a tab. For more info on the command, see:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html.

This entry was posted in MySQL, Performance, Testing and tagged , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

HTML tags are not allowed.

517,978 Spambots Blocked by Simple Comments