How to set up Apache Flume

Apache Flume is a distributed log collection system. It can store the collected logs on a local file system or on HDFS.

Flume has three components: the master, the collector, and the agent. The master node coordinates the nodes in the log cluster. The agent extracts logs and pushes them to the collector. The collector receives those logs and stores them. Flume can sink logs to different file systems, and users can develop their own sink plugins to support other log storage systems.

The following configuration makes Flume tail a log file and push the lines to a collector, which writes them to local storage or, alternatively, to HDFS.

Start Master

./flume master

Start Collector

Sink to local file system

./flume node -1 -n dump -c "dump: collectorSource() | collectorSink(\"/tmp/flume/collected\",\"server\");" -s

Sink to HDFS

./flume node -1 -n dump -c "dump: collectorSource() | collectorSink(\"hdfs://node0:9000/flume/collected\",\"server\");" -s
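
The nested quotes in the -c argument are easy to get wrong, because every double quote inside the configuration has to survive the shell. Here is a minimal sketch, assuming a POSIX shell, of one way to avoid hand-escaping. The helper name collector_spec is made up for illustration and is not part of Flume. It only prints the configuration string used above for a given sink directory.

```shell
#!/bin/sh
# collector_spec is a hypothetical helper, not part of Flume.
# It prints the collector configuration used in this post for a given
# sink directory (a local path or an hdfs:// URL).
collector_spec() {
    # $1 = sink directory; "server" is the second argument used in this post
    printf 'dump: collectorSource() | collectorSink("%s","server");' "$1"
}

# Print the two configurations from this post, one per line.
collector_spec /tmp/flume/collected; echo
collector_spec hdfs://node0:9000/flume/collected; echo
```

With this in place, the collector could be started as ./flume node -1 -n dump -c "$(collector_spec /tmp/flume/collected)" -s, so only one level of shell quoting is involved.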

Start an agent that tails a given log file

./flume node_nowatch -1 -s -n dump -c 'dump:tail("/home/hadoop/flume_log_gen_server/wso2as-4.5.0-SNAPSHOT/repository/logs/wso2carbon.log") | agentBESink("node0");'
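
To check the setup end to end, it can help to append a few recognisable lines to the tailed file and then look for them in the collector's output directory. The following is only a sketch using standard shell tools. The helper append_test_lines and the "flume-test" line format are made up for illustration, and the demo writes to a temporary file rather than to the real log.

```shell
#!/bin/sh
# append_test_lines is a hypothetical helper, not part of Flume.
# It appends COUNT numbered lines to FILE so they are easy to spot later.
append_test_lines() {
    file=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'flume-test line %d\n' "$i" >> "$file"
        i=$((i + 1))
    done
}

# Demo against a temporary file; point it at the tailed log for a real check.
demo_log=$(mktemp)
append_test_lines "$demo_log" 3
cat "$demo_log"
rm -f "$demo_log"
```

After running the helper against the real log, a search such as grep -r 'flume-test' /tmp/flume/collected should find the lines once the collector has written its files, assuming the local sink directory from above and an uncompressed output format.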