Making MySQL and NoSQL work together to solve a real-world big-data problem



  1. Really? Craigslist archives all of their posting data? So that fake posting I put up about selling my Viper in 2004 is still in a database somewhere for someone to go in and data mine? I thought there were all sorts of pieces of privacy rights legislation that required them to purge all of that data, not archive it. Apparently the opposite is true?

    But here's the point - Craigslist has a massive amount of data they've got to archive. We're talking over a billion posts, and when you get over a billion records of anything, that's when you're starting to talk about dealing with some big numbers. 

    As you might imagine, Craigslist has two separate systems for live posts and archived posts. But for many a year, that archive just mirrored the data structure of the MySQL servers handling their live data. Of course, any schema change on the front end then required a corresponding change on the back end, and that's when all hell would break loose. So Craigslist made a change - they went all NoSQL on the back end, but kept everything very relational on the front end. Crazy? Well, they were crazy enough to make it all work.
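To make the idea concrete, here is a rough sketch of why a schemaless archive sidesteps the mirrored-schema problem. It is not Craigslist's code: an in-memory sqlite3 database stands in for the live MySQL servers, a plain Python list of JSON strings stands in for the document store, and the `postings` table, its columns, and `archive_posting` are all made-up names for illustration.

```python
import json
import sqlite3


def archive_posting(conn, archive, posting_id):
    """Move one live row into the archive as a self-describing JSON document."""
    cursor = conn.execute("SELECT * FROM postings WHERE id = ?", (posting_id,))
    row = cursor.fetchone()
    # Column names come from the live schema at the moment of archiving,
    # so the archive never needs its own table definition.
    columns = [col[0] for col in cursor.description]
    archive.append(json.dumps(dict(zip(columns, row))))
    conn.execute("DELETE FROM postings WHERE id = ?", (posting_id,))


# Live side: sqlite3 is only a stand-in for MySQL (assumption).
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE postings (id INTEGER PRIMARY KEY, title TEXT)")
live.execute("INSERT INTO postings (id, title) VALUES (1, '2004 Viper for sale')")

# Archive side: a list of JSON strings stands in for a document store (assumption).
archive = []
archive_posting(live, archive, 1)

# A schema change on the live side requires no change to the archive.
live.execute("ALTER TABLE postings ADD COLUMN price INTEGER")
live.execute("INSERT INTO postings (id, title, price) VALUES (2, 'Couch', 50)")
archive_posting(live, archive, 2)

documents = [json.loads(doc) for doc in archive]
for doc in documents:
    print(doc)
```

The older document simply lacks the `price` field, and nothing on the archive side had to be altered when the live table changed. The trade-off is that whatever reads the archive has to cope with documents of different shapes.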

    Check out the following article, where TheServerSide interviews one of 10gen's marketing shills to find out more about how Craigslist used MySQL and NoSQL to solve their big-data problems.

    How NoSQL, MySQL and MongoDB worked together to solve a big-data problem


    Follow Cameron McKenzie on Twitter (@potemcam)



  2. Do you have parallel domain models and other related server-side components for MySQL and MongoDB? The schemas are still physically different because of the parallel databases. Interesting solution, but I would like to know more details about the nuts & bolts.