Masterless ACK: Enhance the Reliability and Performance of Flume

Yongkun Wang, Next Generation Search Group, Rakuten, Inc.

Flume Reliability

Flume provides an “End-to-End” mode to ensure data delivery:

- First, the agent writes the event to disk in a write-ahead log (WAL).
- An acknowledgment (ACK) is sent back to the originating agent after the destination receives the event.
- The agent can then remove the log entry for this event.
- If the ACK is not received before the timeout expires, the agent resends the event.

Successful delivery of the ACK is critical in this mode. Otherwise:

- The destination receives duplicate data because of timeout-triggered resends.
- The agent consumes a large amount of disk space because log entries are not removed.
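The end-to-end steps above can be sketched as a small bookkeeping class. This is a minimal in-memory illustration, not Flume's WAL implementation: `PendingLog`, its method names, and the millisecond timeout parameter are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for an agent's write-ahead log (not a Flume class). */
class PendingLog {
    // Event ID -> time (ms) at which the event was last sent.
    private final Map<String, Long> sentAtMillis = new LinkedHashMap<>();
    private final long ackTimeoutMillis;

    PendingLog(long ackTimeoutMillis) {
        this.ackTimeoutMillis = ackTimeoutMillis;
    }

    /** Record the event before it is sent downstream. */
    void append(String eventId, long nowMillis) {
        sentAtMillis.put(eventId, nowMillis);
    }

    /** An ACK arrived, so the log entry can be removed. Returns false for unknown IDs. */
    boolean acknowledge(String eventId) {
        return sentAtMillis.remove(eventId) != null;
    }

    /** Events whose ACK did not arrive within the timeout; their timers are restarted. */
    List<String> dueForResend(long nowMillis) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> entry : sentAtMillis.entrySet()) {
            if (nowMillis - entry.getValue() >= ackTimeoutMillis) {
                due.add(entry.getKey());
                entry.setValue(nowMillis);
            }
        }
        return due;
    }

    /** Number of entries still waiting for an ACK (this is what occupies disk in a real WAL). */
    int size() {
        return sentAtMillis.size();
    }
}
```

If ACKs are lost, `acknowledge` is never called: `size()` keeps growing and `dueForResend` keeps returning the same events, which corresponds to the two failure modes listed above.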


Yongkun Wang, NGS, ACT, DU, [email protected]

Problems of Current ACK Design

[Figure: current ACK path, with three numbered steps. Agents (source, sink) send data to collectors (source, sink), which write aggregated data to HDFS. Collectors send ACKs to the ACK queue on the master(s); agents check for ACKs at heartbeat time. Other labels: Liveness.]


The ACKs are not sent back directly, but via the master. Issues:

- Once the master crashes, all data without ACKs needs to be sent again.
- The master might become a bottleneck due to the large number of ACKs.
- The agent needs to wait until the next heartbeat to get the ACKs.

A multi-master scheme with replication has the same issues:

- ACKs can be lost during the replication interval.


Masterless ACK

- Send the ACKs to the previous Agent/Collector instead of sending them to the Master.
- Let the ACKs carry the route information: add a host list to each ACK (ACK | Host 1 | Host 2 | Host 3 | …).
- Reuse the Event connection.
- Collectors push ACKs back once the ACKs are ready; ACKs are not pulled back by the agent.


Masterless ACK Design (1)

For each Flume node, either agent or collector, start two threads:

- The Source starts a Distributor thread to distribute ACKs.
- The Sink starts a Receiver thread to wait for ACKs.

[Figure: a chain of Flume nodes, each with a Source that starts an Ack Distributor and a Sink that starts an Ack Receiver. ACKs flow between nodes in the opposite direction to events. When an Ack Receiver gets an ACK, it checks "Is destination?": YES hands the ACK to the WAL Manager, NO hands it to the Ack Distributor. The diagram also contains a "Collector sink?" check.]
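The receiver-side decision in the diagram can be sketched as follows. The slides do not say how a node decides that it is the destination, so this example assumes that an ACK with no hosts left in its list has arrived. `AckReceiverSketch` and its callbacks are hypothetical names, not classes from the Flume patch.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

/** Illustrative receiver-side routing decision; all names here are hypothetical. */
class AckReceiverSketch {
    // Stands in for the WAL Manager: called with the ACK ID whose log entry can be removed.
    private final Consumer<String> walManager;
    // Stands in for this node's Ack Distributor: called with the ACK ID and remaining hosts.
    private final BiConsumer<String, List<String>> distributor;

    AckReceiverSketch(Consumer<String> walManager,
                      BiConsumer<String, List<String>> distributor) {
        this.walManager = walManager;
        this.distributor = distributor;
    }

    /**
     * Handles one received ACK. Assumption: an empty host list means this node
     * is the destination. Returns true if the ACK ended here.
     */
    boolean onAck(String ackId, List<String> remainingHosts) {
        if (remainingHosts.isEmpty()) {
            walManager.accept(ackId);               // "YES" branch: WAL Manager
            return true;
        }
        distributor.accept(ackId, remainingHosts);  // "NO" branch: Ack Distributor
        return false;
    }
}
```

On an intermediate collector, `onAck` returns false and the ACK moves on through that node's distributor; on the originating agent, it returns true and the WAL entry can be dropped.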


Masterless ACK Design (2)

[Figure: Agents and Collectors on Host 1, Host 2, and Host 3, each with a Source (ACK Distributor) and a Sink (ACK Receiver). As an Event moves downstream, its host list grows from "Host 1" to "Host 1, Host 2". The ACK travels back carrying the reversed list "Host 2, Host 1", which shrinks to "Host 1" after the first hop.]

- When the connection between a sink and a source is established, the Distributor records the connection info.
- The Event records each host it passes through.
- The host list is passed to the ACK once the Event is saved at the destination.
- The ACK is sent back along the reversed host list.
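The route bookkeeping described in the points above can be sketched like this. It is a minimal illustration under the assumption that each sending node appends its own host name to the event; `RouteTrackingEvent` and its methods are hypothetical, and the real patch stores this information in Flume's own event and ACK types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative event that remembers the hosts it passed through (hypothetical class). */
class RouteTrackingEvent {
    final String id;
    private final List<String> passedHosts = new ArrayList<>();

    RouteTrackingEvent(String id) {
        this.id = id;
    }

    /** Called by each node that forwards the event downstream. */
    void recordHost(String host) {
        passedHosts.add(host);
    }

    /** Hosts in the order the event visited them. */
    List<String> passedHosts() {
        return Collections.unmodifiableList(passedHosts);
    }

    /** Host list for the ACK: the recorded hosts in reverse order, nearest hop first. */
    List<String> ackRoute() {
        List<String> route = new ArrayList<>(passedHosts);
        Collections.reverse(route);
        return route;
    }
}
```

For an event forwarded by Host 1 and then Host 2 before being saved on Host 3, `ackRoute()` yields Host 2 followed by Host 1, which matches the host list shown on the ACK in the figure.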

Implementation with Thrift

- Implemented in Flume-0.9.3.
- Masterless ACK is transported by thrift-0.6.0.
  - Should work with the default thrift-0.5.0.

[Figure: in a FlumeNode, the Source owns the ACK Distributor thread (with its connection list) and the Sink owns the ACK Receiver thread. A Thrift ACK Adaptor maps these to a Thrift ACK Distributor thread (with a Thrift connection list) and a Thrift ACK Receiver thread; Thrift ACKs travel over Thrift RPC.]


Reuse Connection with Thrift

When the Source is opened, get the connection and add it to the Distributor's connection list.

    // ThriftEventSource class
    synchronized public void open() throws IOException {
      try {
        // Start a new Distributor thread
        FlumeNode.getInstance().setAckDistributor(new ThriftAckDistributor());
        new Thread(FlumeNode.getInstance().getAckDistributor()).start();
        …
        // Override getProcessor() of Thrift and get the connection (Thrift client)
        // when the Event connection is established. The connection is enqueued
        // and used for the ACK transmission.
        TProcessorFactory processorFactory = new TProcessorFactory(null) {
          @Override
          public TProcessor getProcessor(TTransport trans) {
            // Add a client connection to the queue
            FlumeNode.getInstance().getAckDistributor().addClient(new AckServiceClient(trans));
            return new ThriftFlumeEventServer.Processor( … ));
          }
        };
        …
    }


Distribute the ACK

With the Connection List, the Ack Distributor will:

- Pop the next host from the ACK's host list,
- Compare the host name (IP is better?) with the hosts in the Connection List,
- Send the ACK to the corresponding host(s).

[Figure: an Ack Distributor with a Connection List (Conn (Host1), Conn (Host2), Conn (Host3)) and an Ack Queue holding Ack 1, Ack 2, and Ack 3, each with its own host list. The ACKs are sent over the matching connections. Other labels: Agent 1, Collector 1, Collector 2, Collector 3.]
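The three distributor steps above can be sketched as below. This is an illustration only: `AckClient` is a hypothetical stand-in for the Thrift client held in the Connection List, hosts are matched by exact host-name string (the slide itself leaves open whether IP would be better), and the real distributor runs as a thread over an ACK queue rather than being called directly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a client connection to one upstream host. */
interface AckClient {
    void pushAck(String ackId, List<String> remainingHosts);
}

/** Illustrative distributor: matches the popped host against recorded connections. */
class AckDistributorSketch {
    // Connection List: host name -> connections opened from that host.
    private final Map<String, List<AckClient>> connections = new HashMap<>();

    /** Called when an event connection from the given host is established. */
    void addClient(String host, AckClient client) {
        connections.computeIfAbsent(host, h -> new ArrayList<>()).add(client);
    }

    /**
     * Pops the next host from the ACK's host list and pushes the ACK, with the
     * remaining hosts, on every connection recorded for that host.
     * Returns the number of connections the ACK was sent on.
     */
    int distribute(String ackId, Deque<String> hostList) {
        String nextHost = hostList.pollFirst();
        if (nextHost == null) {
            return 0; // empty route: nothing to send
        }
        List<AckClient> clients =
                connections.getOrDefault(nextHost, Collections.<AckClient>emptyList());
        List<String> remaining = new ArrayList<>(hostList);
        for (AckClient client : clients) {
            client.pushAck(ackId, remaining);
        }
        return clients.size();
    }
}
```

A return value of 0 means no connection matched the popped host. How the real implementation handles that case is not covered in the slides.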


Test

Configuration (agent1 -> collector1 -> collector2):

    agent1: tail("/tmp/log") | agentSink("localhost", 35853);
    collector1: collectorSource(35853) | agentBESink("localhost", 35854);
    collector2: collectorSource(35854) | collectorSink("file:///tmp/temp", "test-");

- Use "agentBESink" for the middle nodes.
- Use a different port for each source/sink pair when all nodes are on the same host.
- When nodes are not on the same host, the same port can be used for all source/sink pairs, or any available port.

Log messages, in the order they appear on the slide (figure labels: agent1, collector1, collector2, Master):

    Push back ack: log.xxxxxxxxxxxxxxx
    Got ack: log.xxxxxxxxxxxxxxx
    Not destination. Send ack to distributor. Ack ID: log.xxxxxxxxxxxxxxx
    Push back ack: log.xxxxxxxxxxxxxxx
    Got ack: log.xxxxxxxxxxxxxxx
    Ack reaches destination. Ack ID: log.xxxxxxxxxxxxxxx


Thank you very much! Q&A Yongkun Wang [email protected] Next Generation Search Group, Rakuten, Inc.