Database Or FileServer

  1. Database Or FileServer (7 messages)

    Hi,

    I have a requirement to store XML files. I was wondering which approach performs better: storing them in the database in a BLOB column, or keeping the files on a file server?

    Can anybody please suggest the way forward? Which one would be better, and why?

    Regards,
    Anirban

    Threaded Messages (7)

  2. Database Or FileServer

    If the file count is less than 100, use the file server; otherwise use the database. Performance will always be better with the file system if you can index the file path, i.e. something like ${DUMP_FOLDER}/${PK}.xml :-)
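
    A minimal sketch of that key-based layout, in Python for brevity. The folder name `xml_dump` and the function names are made up for illustration; the only idea taken from the post is that the path is computed from the primary key, so no lookup is needed.

```python
from pathlib import Path

# Hypothetical storage root; stands in for ${DUMP_FOLDER} from the post.
DUMP_FOLDER = Path("xml_dump")


def xml_path(pk: int, root: Path = DUMP_FOLDER) -> Path:
    """Build the file path for a record directly from its primary key."""
    return root / f"{pk}.xml"


def store_xml(pk: int, xml_text: str, root: Path = DUMP_FOLDER) -> Path:
    """Write the document to <root>/<pk>.xml, creating the folder if needed."""
    root.mkdir(parents=True, exist_ok=True)
    path = xml_path(pk, root)
    path.write_text(xml_text, encoding="utf-8")
    return path


def load_xml(pk: int, root: Path = DUMP_FOLDER) -> str:
    """Read the document back; the path is derived from the key alone."""
    return xml_path(pk, root).read_text(encoding="utf-8")
```

    Because the key alone determines the path, showing a document on demand is a single file read.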

    Since I don't know how you will use them, the answer is: it depends...
  3. Database Or FileServer

    Hi,

    The file count will always be greater than that. It may cross 10,000 at some point.

    But even in that case, if I maintain the index in a table, won't that be a faster approach? Storing and retrieving from the database is a bit more complex than keeping the file directly on the file server (DUMP_FOLDER).

    My requirement is just to store the XML file and show it on demand.

    Regards,
    Anirban
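
    The "index in a table, file on disk" idea could look roughly like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for whatever database is actually in use; the table name `xml_index`, its columns, and the function names are assumptions for illustration.

```python
import sqlite3
from pathlib import Path


def open_index(db_file: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the (hypothetical) index table exists."""
    conn = sqlite3.connect(db_file)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_index ("
        " doc_id INTEGER PRIMARY KEY,"
        " file_path TEXT NOT NULL)"
    )
    return conn


def store(conn: sqlite3.Connection, root: Path, doc_id: int, xml_text: str) -> None:
    """Write the file to disk, then record its location in the index table."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{doc_id}.xml"
    path.write_text(xml_text, encoding="utf-8")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO xml_index (doc_id, file_path) VALUES (?, ?)",
            (doc_id, str(path)),
        )


def load(conn: sqlite3.Connection, doc_id: int):
    """Look up the path in the index, then read the file; None if unknown."""
    row = conn.execute(
        "SELECT file_path FROM xml_index WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        return None
    return Path(row[0]).read_text(encoding="utf-8")
```

    Note that the file write and the index insert are two separate steps here, so a crash between them can leave a file without an index row.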
  4. Database Or FileServer

    Try an XML database like eXist or Berkeley DB XML.
  5. Database Or FileServer

    If your application is clustered, using a file system may be an issue. Is this a network-based file system?

    I've gone down the road of using file systems because of performance, but in the end I learned that the DB is the easiest to work with.
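
    For comparison, the BLOB-column option from the original question might be sketched as follows. SQLite (via Python's sqlite3 module) is only a stand-in here; in a clustered setup this would be a database server that all nodes share. The table name `xml_doc` and the function names are assumptions for illustration.

```python
import sqlite3


def open_store(db_file: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) document table."""
    conn = sqlite3.connect(db_file)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_doc ("
        " doc_id INTEGER PRIMARY KEY,"
        " body BLOB NOT NULL)"
    )
    return conn


def save_xml(conn: sqlite3.Connection, doc_id: int, xml_bytes: bytes) -> None:
    """Insert or overwrite the document as a BLOB in one transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO xml_doc (doc_id, body) VALUES (?, ?)",
            (doc_id, xml_bytes),
        )


def read_xml(conn: sqlite3.Connection, doc_id: int):
    """Return the stored bytes, or None if the id is unknown."""
    row = conn.execute(
        "SELECT body FROM xml_doc WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

    The document and its key live in a single transaction, which is part of what makes this option easy to work with.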
  6. Database Or FileServer

    Anirban,

    GemFire's Distributed Caching product from GemStone offers a highly optimized memory/disk storage solution for XML (called GFX). This includes optimized DOM representations (faster parsing, faster XPath querying, and lower memory footprint), optimized XML-Java Object Binding, easy distribution/mirroring across multiple servers, and configurable overflow from memory to disk based on a Least Recently Used (LRU) algorithm.

    Best of all, you can add the technology into your application without having to make major changes. This would certainly represent a much more advantageous way of storing XML (both memory and disk) than the two options you are considering already.
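
    To make the "overflow from memory to disk based on LRU" idea concrete, here is a generic sketch. It is not GemFire's API and says nothing about how that product works internally; the class `OverflowCache` and its methods are invented for illustration, and keys are assumed to be safe to use as file names.

```python
from collections import OrderedDict
from pathlib import Path


class OverflowCache:
    """Keep the most recently used documents in memory; spill the rest to disk."""

    def __init__(self, overflow_dir: Path, max_in_memory: int = 2):
        self._memory = OrderedDict()  # oldest entry first, newest last
        self._dir = overflow_dir
        self._max = max_in_memory
        self._dir.mkdir(parents=True, exist_ok=True)

    def _spill_path(self, key: str) -> Path:
        return self._dir / f"{key}.xml"

    def in_memory(self, key: str) -> bool:
        return key in self._memory

    def put(self, key: str, xml_text: str) -> None:
        """Store a document in memory and evict the least recently used if full."""
        self._memory[key] = xml_text
        self._memory.move_to_end(key)
        spilled = self._spill_path(key)
        if spilled.exists():
            spilled.unlink()  # drop any stale on-disk copy
        while len(self._memory) > self._max:
            old_key, old_text = self._memory.popitem(last=False)
            self._spill_path(old_key).write_text(old_text, encoding="utf-8")

    def get(self, key: str):
        """Return the document, promoting it back to memory if it was spilled."""
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key]
        spilled = self._spill_path(key)
        if spilled.exists():
            text = spilled.read_text(encoding="utf-8")
            self.put(key, text)
            return text
        return None
```

    With `max_in_memory=2`, putting three documents pushes the first one to disk, and reading it back brings it into memory again at the expense of the next-oldest entry.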

    Cheers,

    Gideon
    GemFire-The Enterprise Data Fabric
    http://www.gemstone.com
  7. It depends

    If there are not a lot of changes to the data, both solutions will have very similar performance, because the data will be cached in memory by the operating system's disk buffer, by the database, or by a database layer in your application. But if you have a lot of changes, a database would win because it is optimized for writing data to disk efficiently.
  8. what's the r/w pattern?

    What is the balance of read operations against write operations? If reads are frequent and writes are rare, you can consider posting your XML files on an Apache server and reading them from there using an input stream / URL connection combination. Apache offers state-of-the-art caching and can handle heavy loads. To post your XML files you can either use HTTP upload (performance is not great, but if you have few write operations it could be OK - up to you to estimate that) or share htdocs over the network and write the files there. Even if you use HTTP upload, I doubt it would be slower than uploading the BLOB to a database.
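
    The read side of that setup is small. A minimal sketch in Python, using the standard urllib module in place of Java's input stream / URL connection; the function name `fetch_xml` and the example URL are hypothetical.

```python
from urllib.request import urlopen


def fetch_xml(url: str, timeout: float = 10.0) -> bytes:
    """Read an XML document from a URL and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

    A call would look like `fetch_xml("http://files.example.com/xml/42.xml")`, where the host and path are placeholders for wherever the web server exposes the files.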

    If you write a lot and performance is important, I recommend you go for the file system solution. With that - and if you're on Linux / Unix - you can choose between various file systems: ext3, reiser, xfs, etc. Some do better with many small files, others do better with fewer large files... again, it's up to you to estimate the best solution.

    The database solution is definitely the last one I would go for (for complexity reasons).

    Hope this helps,
    Emil ( http://www.thekirschners.com/software/testare/testare.html )