Malware Attribution Theory, Code and Result
Who am I?
• Michael Boman, M.A.R.T. project
• Have been “playing around” with malware analysis “for a while”
• Working for FireEye
• This is a HOBBY project that I use my SPARE TIME to work on
Agenda
• Theory behind Malware Attribution
• Code to conduct Malware Attribution analysis
• Result of analysis
Theory
• Malware Attribution: tracking cyber spies - Greg Hoglund, Blackhat 2010 http://www.youtube.com/watch?v=k4Ry1trQhDk
What am I trying to do?
Move this way: from Binary towards Human
• Blacklists (Binary end)
• Net Recon, Command and Control
• Developer Fingerprints
• Tactics, Techniques, Procedures
• Social Cyberspace, DIGINT
• Physical Surveillance, HUMINT (Human end)
[Diagram] Layers: Physical Surveillance / HUMINT, Social Cyberspace / DIGINT, Developer Fingerprints, Tactics / Techniques / Procedures, Blacklists, Net Recon / Command and Control
[Diagram] Items: Actions / Intent, Installation / Deployment, CNA (spreader) / CNE (search & exfil tool), COMS, Defensive / Anti-forensic, Exploit, Shellcode, DNS, Command and Control Protocol, Encryption
Steps
• Step 0: Gather malware
• Step 1: Extract metadata from binary
• Step 2: Store metadata and binary in MongoDB
• Step 3: Analyze collected data
Step 0: Gather malware
• VirusShare (virusshare.com)
• OpenMalware (www.offensivecomputing.net)
• MalShare (www.malshare.com)
• CleanMX (support.clean-mx.de/clean-mx/viruses)
• Malware Domain List (www.malwaredomainlist.com/mdl.php)
Step 1: Extract metadata from binary
[Diagram: Development Steps. Stages: Source, Machine, Binary. Elements: Core “backbone” sourcecode, Tweaks & Mods, 3rd party sourcecode, 3rd party libraries, Compiler, Time, Runtime libraries, Paths, MAC Address, Malware, Packing]
Step 1: Extract metadata from binary
• Hashes (for sample identification)
  • md5, sha1, sha256, sha512, ssdeep etc.
• File type / Exif / PEiD
  • Compiler / Packer etc.
• PE Headers / Imports / Exports etc.
• VirusTotal results
• Tags
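As a rough illustration of the hashing item above, the cryptographic digests can be computed with Python's standard library. This is only a sketch: the function name `hash_sample` is made up for this example, and ssdeep is left out because it needs a third-party binding.

```python
import hashlib

def hash_sample(data: bytes) -> dict:
    """Return the cryptographic hashes used to identify a sample."""
    algorithms = ("md5", "sha1", "sha256", "sha512")
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}
```

The exact hashes identify a specific file; only a fuzzy hash such as ssdeep can relate samples that differ by a few bytes.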
Identifying compiler / packer
• PEiD
• Python
  • peutils.SignatureDatabase().match_all()
PE Header information
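To give an idea of what the PE header holds, here is a minimal standard-library sketch that reads the COFF file header fields, including the compile timestamp. The function name `read_pe_header` and the returned keys are assumptions for this example; a real pipeline would more likely use a full PE parser.

```python
import struct
from datetime import datetime, timezone

def read_pe_header(data: bytes) -> dict:
    """Parse the COFF file header of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "compile_time": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

The timestamp is written by the linker and can be forged, so it is best treated as one hint among several.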
VirusTotal Results
Tags
• User-supplied tags to identify sample source and behavior
• analyst / analyst-system supplied
Step 2: Store metadata and binary in MongoDB
Components
• Modified VXCage server
  • Stores malware & metadata in MongoDB instead of FS / ORDBMS
  • Collects a lot more metadata than the original
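A metadata record destined for MongoDB might look like the sketch below. The schema, field names and the function `build_sample_document` are hypothetical, not the ones used by the modified VXCage server.

```python
import hashlib
from datetime import datetime, timezone

def build_sample_document(data: bytes, tags: list[str]) -> dict:
    """Assemble one metadata record for a sample (hypothetical schema)."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "tags": sorted(set(tags)),  # de-duplicated, stable order
        "added": datetime.now(timezone.utc),
    }

# With pymongo installed, such a record could be stored with
# MongoClient()["vxcage"]["samples"].insert_one(doc); the database and
# collection names are assumptions made for this example.
```

A document store suits this kind of data because different file types yield different sets of metadata fields.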
VXCage REST API
• /malware/add
  • Add sample
• /malware/get/
  • Download sample. If no local sample, search other repos
• /malware/find
  • Search for sample by md5, sha256, ssdeep, tag, date
• /tags/list
  • List tags
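A client for these endpoints could be sketched with the standard library as follows. The server address, the use of form-encoded POST fields for /malware/find, and the identifier appended to /malware/get/ are assumptions for this example; the slides only name the endpoints and the search keys. The functions build requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical server address

def find_request(**criteria: str) -> Request:
    """Build a POST to /malware/find with form-encoded search criteria."""
    allowed = {"md5", "sha256", "ssdeep", "tag", "date"}
    unknown = set(criteria) - allowed
    if unknown:
        raise ValueError(f"unsupported search keys: {sorted(unknown)}")
    body = urlencode(criteria).encode("ascii")
    return Request(f"{BASE_URL}/malware/find", data=body, method="POST")

def get_request(sample_id: str) -> Request:
    """Build a GET for /malware/get/ followed by a sample identifier."""
    return Request(f"{BASE_URL}/malware/get/{sample_id}", method="GET")

def tags_request() -> Request:
    """Build a GET for /tags/list."""
    return Request(f"{BASE_URL}/tags/list", method="GET")
```

Passing one of these objects to `urllib.request.urlopen` would perform the call against a running server.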
Step 3: Analyze collected data
Identifying development environments
• Compiler / Linker / Libraries
• Strings
• Paths
• PE Translation header
• Compile times
• Number of times the software has been built
Cataloging behaviors
• Packers
• Encryption
• Anti-debugging
• Anti-VM
• Anti-forensics
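One simple way to catalog behaviors like these is to map imported API names to behavior tags. The table below is a deliberately short, hypothetical mapping and `catalog_behaviors` is a name made up for this sketch; a real catalog would need many more indicators than imports alone.

```python
# Hypothetical, deliberately short mapping from imported API names to tags.
BEHAVIOR_APIS = {
    "IsDebuggerPresent": "anti-debugging",
    "CheckRemoteDebuggerPresent": "anti-debugging",
    "CryptEncrypt": "encryption",
    "CryptDecrypt": "encryption",
}

def catalog_behaviors(imports):
    """Return the sorted behavior tags suggested by a sample's imports."""
    return sorted({BEHAVIOR_APIS[name] for name in imports if name in BEHAVIOR_APIS})
```

Import-based tagging misses packed samples, whose real imports are resolved only at run time.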
Result
Have I seen you before?
• Detects similar malware (based on SSDEEP fuzzy hashing)
Different MD5, 100% SSDeep match
SSDEEP Analysis
(3007)
SSDEEP Analysis
(851)
Challenges
• Party handshake problem:
  • 707k samples analyzed and counting (resulting in over 250 billion comparisons!)
• Need a better target (pre-)selection
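The handshake count and one possible pre-selection can be sketched as follows. Comparing every sample with every other needs n(n-1)/2 comparisons, roughly 2.5 × 10^11 for 707,000 samples. The pre-selection shown is an assumption about how the problem could be reduced, not the author's method: it relies on ssdeep only scoring two digests when their block sizes are equal or differ by a factor of two, so other pairs can be skipped up front.

```python
from collections import defaultdict

def pair_count(n: int) -> int:
    """Number of unordered pairs among n samples: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def block_size(ssdeep_hash: str) -> int:
    """An ssdeep digest has the form 'blocksize:hash1:hash2'."""
    return int(ssdeep_hash.split(":", 1)[0])

def candidate_pairs(hashes):
    """Yield pairs whose block sizes are equal or differ by a factor of two."""
    by_size = defaultdict(list)
    for h in hashes:
        by_size[block_size(h)].append(h)
    for size, group in by_size.items():
        for i, a in enumerate(group):
            # Same block size: every later member of the same bucket.
            for b in group[i + 1:]:
                yield a, b
            # Neighbouring bucket with twice the block size.
            for b in by_size.get(size * 2, ()):
                yield a, b
```

Each surviving pair would still be passed to the fuzzy-hash comparison; the buckets only remove pairs that could never score above zero.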
What compilers / packers are common?
1. "Borland Delphi 3.0 (???)", 54298
2. "Microsoft Visual C++ v6.0", 33364
3. "Microsoft Visual C++ 8", 28005
4. "Microsoft Visual Basic v5.0 - v6.0", 26573
5. "UPX v0.80 - v0.84", 22353
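A ranking like this one can be produced by counting signature names across samples. The sketch below assumes the names have already been pulled out of the stored metadata into a plain list; `top_signatures` is a name made up for this example.

```python
from collections import Counter

def top_signatures(signature_names, n=5):
    """Rank compiler/packer signature names by how many samples matched them."""
    return Counter(signature_names).most_common(n)
```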
Are there any unidentified packers?
• How to identify a packer
  • PE section is empty in binary, is writable and executable
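The heuristic above can be written as a small filter over section headers. The input shape, tuples of (name, size of raw data, characteristics), is an assumption for this sketch; the two flag values are the standard PE section characteristics for executable and writable memory.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def suspicious_sections(sections):
    """Return names of sections that are empty on disk yet writable and executable.

    sections: iterable of (name, size_of_raw_data, characteristics) tuples.
    """
    wanted = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    return [
        name
        for name, raw_size, flags in sections
        if raw_size == 0 and (flags & wanted) == wanted
    ]
```

A section with no bytes on disk that is both writable and executable is typically where an unpacking stub writes the real code at run time.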
How common are anti-debugging techniques?
• 31622 out of 531182 PE binaries use IsDebuggerPresent (6 %)
• Packed executables not counted
[Diagram: Analysis Coverage. Stages: Source, Machine, Binary. Elements: Core “backbone” sourcecode, Tweaks & Mods, 3rd party sourcecode, 3rd party libraries, Compiler, Time, Runtime libraries, Paths, MAC Address, Malware, Packing]
Future
What am I trying to do in the future
• Blacklists (Binary end)
• Net Recon, Command and Control
• Developer Fingerprints
• Tactics, Techniques, Procedures
• Social Cyberspace, DIGINT
• Physical Surveillance, HUMINT (Human end)
Expand scope of analysis
• +network
• +memory
• +os changes
• +behavior
What am I trying to do in the future
• More automation
• More modular design
• Solve the “Big Data” issue I am getting myself into (Hadoop?)
• More pretty graphs
Thank you
• Michael Boman
• @mboman
• http://blog.michaelboman.org
• Code available at https://github.com/mboman/vxcage