PHPMongrator – Migrating from MySQL to MongoDB

After a long search, I did for myself a tool to migrate data to MongoDB.

It’s a recurring need, a lot of people look for it: a developer decides to migrate a MySQL project to MongoDB . But how to do it?

Actually it’s not so hard as it seems. Exports in CSV from MySQL and imports in MongoDB in the same way. But there is a big trap.

First of all, the datatype issue. If you import with CSV, all data will be imported into MongoDB as string, even if is integer, float or date. That’s bad, really bad.

It’s important to remember the paradigm change. So tables with classic relationships as below:

INVOICE < INVOICE_ITEMS > PRODUCT

has no meaning under MongoDB. The DE-normalization, redundance in application level are others features that make hard directly migrations. The database must be remodeled.

Considering this questions, I’ve created this tool to help in those situations (but I know it’s impossible to handle all situations).

Features:

  • Runs migration through a config file using YAML format (very simple to edit)
  • To start the config, has a php-cli dumper that reads the de RDBM and create a config file to begin
  • Support to datatype mapping (really imports date, boolean, integer, float and string BSON types)
  • Support to simple foreign keys relations
  • Support to many to many foreing keys relations (not automatic dumped, but easyl configured by hand if you want)
  • Foreing keys converted to MongoDB DBRef or not (see config file defatul->use_dbref)
  • Easy ignore tables to import

Let’s walk throught this features in details…

Runs migration through a config file using YAML format (very simple to edit)

The YAML format is a “relaxed XML” using a indented file that can be easyly transformed into an PHP array. It’s so much easier to edit a YAML file than a XML file or even a large PHP array.

To start the config, has a php-cli dumper that reads the de RDBM and create a config file to begin

I’t would be real hard to make de config file all by hand. So, the runDumper.php script (running in shell) creates a starting config file for you. It can be suficient or then you can adjust it. It asks you the source database connection info and dumps the YAML file, and than you can customize.

To make the things real easier, it’s possible to ignore tables or fields, or just delete its lines in the config file. Simple relations (foreign keys) are automatic maped IF YOU HAS THE CONSTRAINTS correctly configure in the source database.

Another feature is the possibility of rename target field names (target_name) and, eventualy, even the datatype, if does not fits to you (target_type).

Support to datatype mapping (really imports date, boolean, integer, float and string BSON types)

The PHPMongrationDriverMysql (wich extends the basic PHPMongrationBaseDriver class) maps the basic MySQL datatypes to MongoDB. So, in the YAML config file you can choose which type will be used to save the record into MongoDB.

In this way, date fields will be saved as MongoDate, and so on.

Support to simple foreign keys relations

This mapping is automatic if the table has well configured CONSTRAINTS. You can also parametrize if the relationship will use or not MongodDB dbrefs  (see the “default” session on config file).

To be possible doing such thing, all data is imported, even the original foreing key, and only after the data is passed again to make references to MongoID.

Support to many to many foreing keys relations (not automatic dumped, but easyl configured by hand if you want)

As each developer can use a diferent pattern to name his tables and relationships, it would be hard to guess all situations. Besides, this are the most common case of non-conversion – to accommodate in the new moddeling pattern.

So, the way used was simpler: manual config when is desired. In MongoDB a field could be an array, so it’s really easy to reference various documents as a “member” of a “master” document.

Starting in that point, it’s possible to choose in which collection to put the relationship, or for redundance to put in both collections.

Classic example:

user
id
username
password
group
id
description
user_group
user_id
group_id

It could be configured in “user”:

(...)
tables:
  user:
    many_references:
      groups:
        reference_table: user_group
        in_id: user_id
        out_id: group_id
        out_table: group
        out_table_id: id

And it’s possible to put in “group”, or both.

Important: soon I want to implement “embedded” import, so the reference table could have more info (e.g. INVOICE_ITEMS would have quantity, subtotal, and so on).

oreing keys converted to MongoDB DBRef or not (see config file defatul->use_dbref)

As explained before.

Easy ignore tables to import

The config field permits to list tables and fields to be ignored. See  options default->ignore_tables and tables->??->ignore_columns


I believe that this overview gives you a good picture of  the tool. But there’s a lot to improve…

Collaborate: https://github.com/rafaelgou/PHPMongrator

  • Aleks V2

    Parabéns pela iniciativa!
    Muito legal mesmo, só tem um detalhe: acho que transformar tabelas em coleções não é a melhor abordagem para migrar bases relacionais para MongoDB.
    Já pensou fazer algo, como um programinha que defina que tabelas farão parte de qual coleção e assim migrar para MongoDB?
    Só uma sugestão, se isso já for possível em sua solução, me perdoe a falta de atenção.

    Abrsço
    Ales

  • I see a lot of interesting content on your website.
    You have to spend a lot of time writing, i know how to save you a lot of
    time, there is a tool that creates unique, google friendly articles in couple of minutes, just type
    in google – k2 unlimited content

  • Oma

    Want to copy articles from other websites rewrite
    them in seconds and post on your site, or use for contextual backlinks?
    You can save a lot of writing work, just type in gogle:
    Daradess’s Rewriter