Hadoop HDFS
Topics
- What is HDFS and Why Use It?
- HDFS Architecture
- HDFS Features
- HDFS Commands
- HDFS Web UI
- Hue web UI
What is HDFS and Why Use It?
What is HDFS?
- HDFS is a virtual file system (FS) built on top of the local FS
- Data you write into HDFS is ultimately stored on the local FS of the machines in the cluster
- You can't browse HDFS the way you browse the local FS; instead, use one of the following:
  - The HDFS commands (similar to local FS commands)
  - The HDFS Web UI
  - The available APIs
- HDFS stores data as blocks in a replicated fashion
- Management and replication of blocks are handled by HDFS
- HDFS is the primary distributed storage used by Hadoop applications
- It provides scalability, reliability, and automatic distribution of data
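The idea of storing data as replicated blocks can be illustrated with a small toy model. The sketch below is not HDFS code: the function names, the node names, and the round-robin placement are hypothetical and exist only for illustration (real HDFS uses its own placement policy). It assumes a 128 MB block size and a replication factor of 3, which are the usual defaults in recent Hadoop releases but are configurable.

```python
# Toy model of block splitting and replication (illustration only, not HDFS code).

BLOCK_SIZE = 128 * 1024 * 1024  # assumption: 128 MB block size
REPLICATION = 3                 # assumption: replication factor of 3


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks needed for a file."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        sizes.append(remainder)
    return sizes


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin.

    This is a simplified stand-in for a real placement policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical machine names
    sizes = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
    for block_id, holders in place_replicas(len(sizes), nodes).items():
        print(block_id, sizes[block_id], holders)
```

In this model a 300 MB file becomes two full 128 MB blocks plus one 44 MB block, and each block is held by three different machines, so losing any single machine leaves at least two copies of every block.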