Hadoop Distributed File System

Hadoop, like MogileFS, is an open source distributed file system.


Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.
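
To make the Map/Reduce idea concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop or its API; the function names (map_fragment, shuffle, reduce_counts, run_job) are made up for illustration. Each input fragment is mapped independently, which is what lets a real cluster run or re-run a fragment on any node.

```python
from collections import defaultdict


def map_fragment(text):
    """Map step: emit a (word, 1) pair for every word in one input fragment."""
    return [(word.lower(), 1) for word in text.split()]


def shuffle(pairs):
    """Group emitted values by key, between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)


def run_job(fragments):
    """Run map over every fragment, then shuffle and reduce the results.

    Here the fragments are processed in a simple loop. In a cluster each
    call to map_fragment could run on a different node, and could be
    repeated elsewhere if that node failed, because it depends only on
    its own input fragment.
    """
    pairs = []
    for fragment in fragments:
        pairs.extend(map_fragment(fragment))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


fragments = ["hadoop stores large files", "mogilefs stores small files"]
counts = run_job(fragments)
print(counts)
```

The important property is that map_fragment has no shared state, so re-executing a fragment after a failure gives the same pairs as the first attempt.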

MogileFS is designed for smaller files. Hadoop is designed for larger files.

GoogleFS and OneFS (used by MySpace) are proprietary file systems similar to Hadoop.
