Hadoop HDFS

Topics 

  • What is HDFS and Why Use It?
  • HDFS Architecture 
  • HDFS Features 
  • HDFS Commands 
  • HDFS Web UI 
  • Hue Web UI

What is HDFS and Why Use It?

What is HDFS? 

  • HDFS is a virtual FS (file system) built on top of the local FS of each machine in the cluster
    • Data written to HDFS is ultimately stored on the local FS of the distributed machines
  • You can't browse HDFS the way you browse the local FS; instead, use one of:
    • The HDFS commands (similar to local FS commands)
    • The HDFS Web UI
    • The available APIs
  • HDFS stores data as blocks in a replicated fashion 
    • Management and replication of blocks are handled by HDFS 
  • HDFS is the primary distributed storage used by Hadoop applications 
    • It provides scalability, reliability, and automatic distribution of data
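
The block-and-replica idea can be illustrated with a small toy model. This is only a sketch, not how HDFS is implemented: the function names and node names are hypothetical, the 128 MiB block size and replication factor of 3 are assumed values (they match common Hadoop defaults, but both are configurable), and the round-robin placement is a deliberate simplification of the real placement policy.

```python
MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB   # assumed block size; configurable in a real cluster
REPLICATION = 3          # assumed replication factor; also configurable


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file is cut into."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Map each block index to `replication` distinct nodes (round-robin).

    Hypothetical, simplified placement used only for illustration.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def surviving_replicas(placement, failed_node):
    """Return the placement with one failed node removed from every block."""
    return {
        block_id: [n for n in replicas if n != failed_node]
        for block_id, replicas in placement.items()
    }


# Example: a 300 MiB file stored on a hypothetical four-node cluster.
blocks = split_into_blocks(300 * MIB)
placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
after_failure = surviving_replicas(placement, "node2")

for block_id, replicas in placement.items():
    print(block_id, blocks[block_id] // MIB, "MiB", replicas)
```

In this model the 300 MiB file becomes two full 128 MiB blocks plus one 44 MiB block, and every block still has at least two copies after a single node is lost. That is the reliability property the slide refers to.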

 
