
Transfer Learning for Automatic Botnet Detection

Dr. Samad Ahmadi, Dr. Prapa Rattadilok, Basil Alothman
Faculty of Technology, De Montfort University, The Gateway, Leicester LE1 9BH, Great Britain

Introduction and Objectives
● One of the most serious threats on the Internet is the existence of computer networks that consist of machines infected by malicious software tools.
● It is believed that about 20% of the computers that are connected to the Internet are part of infected networks.
● As a result, tens of billions of US dollars are lost in damages or stolen every year.
● Hence, researchers have been working actively to develop effective techniques to detect and protect against Botnets.
● The infected computers are known as Bots, and the networks they are part of are known as Botnets.
● Our goal is to develop a technique for accurate automatic identification of Botnet traffic.

Botnet Overview
● Botnets are networks formed by "enslaving" host computers, called Bots (derived from the word robot).
● These computers are controlled by one or more attackers, called BotMasters.
● Usually the intention is to perform malicious activities.
● Botnets can have different topologies:
1- Centralized Topology (Star & Distributed cluster).
2- Hierarchical Topology (Multiple C&C).
3- Decentralized Topology (Peer2Peer/Random/No C&C).
Fig3: DDoS Attack, showing how a botnet can attack a network

Current Methods
● Host-Based: these techniques analyze the behavior of the host machine, as it is the place where the Bot is running.
● Network-Based: these methods attempt to identify malicious behavior by analyzing network traffic.
● Traffic monitoring can be passive or active.
● Many network-based techniques try to be generic and involve multiple protocols and architectures.
● The problem is that it is not always correct to concatenate different datasets (i.e. datasets containing network traffic from different Botnets).
■ Datasets can have different distributions, which means that concatenating them can degrade the quality and predictive performance of machine learning models.
Fig1: Transfer Learning, which shows the Machine Learning lifecycle of the Labelled Data
Fig2: Converting a PCAP file to a CSV file so that the Learning System can read it

My hypothesis is
Predictive performance can be improved by using transfer learning techniques across datasets containing network traffic from different Botnets. This should be done instead of blindly concatenating datasets.

Proposed Method
● Utilize existing large datasets to enhance model quality.
● Download the Java source code of TrAdaBoost and integrate it into WEKA.
● Run experiments with relatively large Botnet datasets as source tasks and smaller Botnet datasets as target tasks.
● Leave One Out (LOO) approach. Suppose we have n datasets, where each dataset represents traffic flows from a Botnet. We leave one dataset out, concatenate the remaining (n-1) datasets, use them to create a Machine Learning model and then predict the dataset we left out. We evaluate the result and repeat the same approach for all datasets.
● Here we again leave one dataset out, but build n-1 Machine Learning models, such as Random Forest or Naive Bayes models, one for each of the remaining datasets. After that we predict the dataset we have left out by averaging the predictions of the n-1 models.
● Use the same technique as in the previous step, but compute a weighted average when making predictions. To assign a weight to each Machine Learning model, we use the similarity between the respective datasets.
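The leave-one-dataset-out ensemble described under Proposed Method (n-1 models, averaged or similarity-weighted predictions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the poster's implementation: a nearest-centroid scorer stands in for Random Forest or Naive Bayes, the `similarity` function (inverse distance between dataset feature means) is a hypothetical choice since the poster does not name a similarity measure, and the three tiny datasets are invented toy values, not real Botnet traffic.

```python
import math

def fit_centroids(rows):
    """Stand-in model (assumption): mean feature vector per class label."""
    sums, counts = {}, {}
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_bot_score(model, x):
    """Score in [0, 1]; higher means closer to the Bot (label 1) centroid."""
    d0 = math.dist(x, model[0])
    d1 = math.dist(x, model[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def dataset_mean(rows):
    """Mean feature vector of a whole dataset, ignoring labels."""
    dim = len(rows[0][0])
    return [sum(x[i] for x, _ in rows) / len(rows) for i in range(dim)]

def similarity(a, b):
    """Hypothetical similarity: 1 / (1 + distance between dataset means)."""
    return 1.0 / (1.0 + math.dist(dataset_mean(a), dataset_mean(b)))

def leave_one_out(datasets, weighted=False):
    """Hold out each dataset in turn; return accuracy per held-out dataset."""
    results = {}
    for held_out, target in datasets.items():
        sources = {k: v for k, v in datasets.items() if k != held_out}
        # One model per remaining dataset (n-1 models in total).
        models = {k: fit_centroids(v) for k, v in sources.items()}
        # Equal weights give a plain average; otherwise weight by similarity.
        weights = {k: similarity(v, target) if weighted else 1.0
                   for k, v in sources.items()}
        total = sum(weights.values())
        correct = 0
        for x, y in target:
            score = sum(weights[k] * predict_bot_score(models[k], x)
                        for k in models) / total
            correct += int((score >= 0.5) == (y == 1))
        results[held_out] = correct / len(target)
    return results

# Invented toy flows: ([feature_1, feature_2], label), label 1 = Bot traffic.
datasets = {
    "botnet_a": [([1.0, 1.0], 0), ([1.2, 0.8], 0), ([5.0, 5.0], 1), ([5.2, 4.8], 1)],
    "botnet_b": [([0.8, 1.1], 0), ([1.1, 1.3], 0), ([4.8, 5.1], 1), ([5.1, 5.3], 1)],
    "botnet_c": [([1.5, 1.0], 0), ([1.0, 1.5], 0), ([6.0, 5.5], 1), ([5.5, 6.0], 1)],
}

plain = leave_one_out(datasets)
weighted = leave_one_out(datasets, weighted=True)
print(plain)
print(weighted)
```

Calling `leave_one_out(datasets)` gives the plain-average variant and `weighted=True` gives the similarity-weighted variant. In a real experiment the stand-in scorer would be replaced by the chosen classifier and the similarity measure by whatever dataset-level measure the study adopts.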