Best Methods for Using Internet Search and ... - Connected Cops

Internet Search and Analysis in Intelligence and Investigations
Tuesday, January 11, 2011, 7:30 AM – 8:45 AM

Ed Appel, Proprietor, iNameCheck

Presentation
• Quick Internet Overview
• Online Sources & Methods
• Legal, Policy & Privacy Guidelines
• Policy & Regulatory Issues

The Internet Is Essential for Investigations and Intelligence
• Accessible data
• Who’s online: 80%
• 30%+ power users
• Crime & misbehavior
• Due diligence
• Intelligence
• Vetting
• Investigations
Pew found all age groups online in significant percentages.

Internet Growth
[Chart: Internet users, in millions. Source: InternetWorldStats]
The numbers, however precise, show that Internet growth is rapid.
[Chart: IP traffic in petabytes/month. Source: Cisco]
US: #2 in world (after China) – 239,893,600 users of 310.2M population – 77.3%, per InternetWorldStats.com

The Internet Universe

MIT Internet Map 2007

November 3, 2003 Map of the Internet
An increasingly complex, interconnected galaxy of nodes is portrayed in these Internet maps by leading technologists.

San Diego Supercomputer Center I-Map 2008

San Diego Supercomputer Study of Internet Links

A billion or more people use the Internet daily, according to recent studies by SDSC research.

The map of the Internet, as built and described in a Nature Communications paper, shows the locations of Internet systems on the hyperbolic plane. Image courtesy of Dmitri Krioukov, SDSC/CAIDA

What’s on the Internet?
➢ Social Networking
➢ News & Blogs
➢ Maps & Locations
➢ Games & Hobbies
➢ Photos
➢ Video, Film, Music
➢ Libraries
➢ E-Commerce
➢ Advertising
➢ Private Websites
➢ Porn, Exploitation
➢ Illegal Sites
➢ Illegal Activities
➢ Illicit Activities
➢ Forbidden Activities
➢ Fantasy
➢ Humor
➢ Juvenile Delinquency
Wireless: Major Growth Area

What’s on the Internet?
• Public Records (Real Estate, Courts, Licenses, Businesses, Arrests, Liens, etc.)
• Residences, Building Occupants
• Telephones, Email, Mailing Addresses
• Genealogy, Births, Deaths
• Educational Institutions & Alumni
• Business & Executive Profiles
• Associations & Volunteer Organizations
• Private data vendors (Accurint, IRB, TLO)

Self-Descriptions in Online Profiles
• Yedo Da Meth Lover, 26, Colbert, Washington (MySpace)
• Lowlife, 26, Brownsville/Austin, TX, “Death to the New World Order” (MySpace)
• Crack Monkey, 21, Somerset, NJ, Rider grad (MySpace)
• Hacker Club (Facebook)
• Lynn, N. Seattle ecstasy dealer (MySpace)
• Angela, meth addict (MySpace)

Illicit Behavior Online: People We Trusted
• A Florida Assistant US Attorney was arrested in 2007 as he arrived in Detroit with a doll, earrings and Vaseline, for trying to arrange in Internet chats to have sex with a 5-year-old. He committed suicide in his cell in 2007.
• A DHS press spokesman, caught trying to induce a “14-year-old girl” (an undercover detective) to have sex, pled no contest in 2006.
• An Army Chief Warrant Officer, Director of the Army School of Information Technology, was arrested in 2010 for collecting and sharing child pornography over the Internet.
• A US military contractor in Baghdad hacked girls’ computers, extorted them for nude photos & sex tapes, tried to meet some for sex while on leave, and had over 4,000 victims when arrested. Serving a 30-year sentence, 2010.

Case Examples
A computer forensic analyst – part of the IT security department of a Fortune 500 firm – was found publicizing himself online as a profane, offensive “leader” of 5,000 players in a worldwide, popular massively multiplayer online fantasy sci-fi game – which led to discovery of his game playing all day, during both work and off-hours.

A new chief of research was found to have been disciplined by the FDA – 3 years prohibition from government contracting – for admitted scientific misconduct. While the FDA database did not show the 10-year-old sanctions, three FDA newsletters online reported them.

One lesson: What you don’t know about what’s online can hurt you.

Case Examples
~1,000 US Navy personnel were found using their Navy.mil email addresses as their MySpace user names. Many postings contain unsuitable material, including operational security issues.

A computer security man who pled guilty to operating a massive botnet that stole IDs and money was hired by a Santa Monica Internet search firm while he was awaiting sentencing. The firm failed to Google the convict.

Spc.  Bradley  Manning  Accused  in  Wikileaks  Case

Bradley Manning was reportedly despondent over losing a lover and disciplined for striking a soldier.

“Wikileaks” chief suspect Spc. Bradley Manning, 22, of Potomac, MD, was arrested in Kuwait and incarcerated at Quantico Marine Base, charged in July 2010 with leaking classified videos of US air strikes in Iraq to the Wikileaks website in April 2010. An online chat acquaintance, Adrian Lamo (previously convicted of computer hacking), told authorities and the press that Manning provided thousands of classified documents to Wikileaks. Julian Assange, Wikileaks’ founder, claimed the leaker exposed US military misdeeds. US government leaders voiced fear that US troops and informants would be killed based on secrets leaked, and defended the actions depicted. The classified documents posted by Wikileaks totaled 75 MB and numbered in the thousands.

Julian Assange, Wikileaks

Adrian Lamo ~2001

Leaked videos included US air strikes that killed civilians, including a Reuters reporter & driver

Manning’s charges include illegally transferring classified data to his PC, placing unauthorized software on military computers and delivering national defense info to an unauthorized party.

Internet Searching is Useful For:
• Cyber vetting – virtual neighborhoods
• Criminal & corporate investigations
• IP & asset protection (insider threat)
• Compliance
• Competitive intelligence
• Legal support
• Research (any topic)

Likely Findings
• History of malicious online activities: ~3-6%
• Derogatory information, e.g. past bad acts
– Arrests, convictions, lawsuits, bankruptcies, firing
• Misuse of “anonymous” virtual identity online
• Most likely: Verification of qualifications and eligibility for the position sought in vetting

Sources & Methods for Internet Searching
• Systems & Tools
• Search Engines & Metasearch
• Websites with Databases: “Dark Web”
• Automated Searching
Analysis is critical for the information to have value

Systems
• Search on the right computer
– Use a separate system for searching – malware risk
– Keep anti-virus, firewall, anti-malware up to date
• Protect your anonymity – you can be detected
• Protect the subject – don’t leave a trail
• Use fast systems, applications, enough memory

Applications
• Browser: Internet Explorer, Firefox, Chrome, Safari, Opera
• Browser settings, search engine integration
• PDF printer (e.g. Adobe Acrobat)
• Database or folders – retrievable files
• Search tools (internal, Internet)

Manual Searches
• Big 5 Search Engines – Live & Cached Results
– Google (YouTube) – PageRank: 100 factors
– Yahoo! 4B pages
– Microsoft (Bing)
– Ask (MyWebSearch) 3% of searches
– AOL (MapQuest)
• Popular (Social & Sales) websites
– eBay, Facebook, MySpace, Craigslist, Amazon

Other Search Engines
• All the Web - "live search" looks for terms as you type them
• AltaVista - A Yahoo property that's not what it used to be
• Exalead - Search engine from France
• FreeSearch - U.K. search engine
• Gigablast - Looks similar to Google, smaller database
• IceRocket
• Lycos
• Mamma (really a metasearch engine)
• Openfind - Emphasizes Chinese-language results
• WiseNut - Includes "Wise Guides" (topic groups)

Contemporary (“Web 2.0”) Search Tools
Twitter.com, Trackle.com, Monitter.com and Friendfeed.com – help find people & provide “right now” results

Specialized Searching (Examples)
• Blogs: blogsearch.google.com, icerocket.com, sphere.com, technorati.com, blogdigger.com
• IP addresses: SamSpade.org, whois.com, networksolutions.com, domaintools.com
• Reverse phone/address: Whitepages.com, anywho.com, verizon.com
• Public records: brbpub.com (county)
• Government: usa.gov

More Searches
• Advanced search (Boolean logic)
• Special features: images, videos, maps, news, blogs
• Country-based searching
• Translations (rough)
• Tracking: Google.com/alerts (emails)
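The Boolean logic behind advanced search can be sketched as a small query builder. This is a minimal illustration, not a feature of any particular engine: the function names are hypothetical, and the operator syntax used here (quoted phrases, OR, a leading minus to exclude) is widely supported but should be checked against each engine's own advanced-search help.

```python
def quote(term):
    """Wrap multi-word terms in quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def boolean_query(all_of=(), any_of=(), none_of=()):
    """Compose a query string from required, alternative and excluded terms.

    Assumed syntax: space = AND, OR between alternatives, leading '-' = NOT.
    """
    parts = [quote(t) for t in all_of]
    if any_of:
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    parts.extend("-" + quote(t) for t in none_of)
    return " ".join(parts)


if __name__ == "__main__":
    print(boolean_query(["Jack Doe"], ["Nevada", "Reno"], ["obituary"]))
```

Keeping the three term groups separate makes it easy to widen a search (add alternatives) or narrow it (add exclusions) without retyping the whole query.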

Tracking
• Google and other tools (Trackle.com) allow one to track:
– Changes in websites
– Appearance of terms on indexed pages
– Appearance of terms in Twitter & other places
– Blogs & news references to a term
• Tracking is important in protection of assets and following activities of rivals & adversaries
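Two of the tracking tasks above, detecting changes in a website and spotting the appearance of terms, can be sketched with the standard library alone. This is an assumption-laden illustration of the idea, not how Google Alerts or Trackle work: the function names are hypothetical, and the page text is assumed to have been fetched and saved already.

```python
import hashlib


def fingerprint(page_text):
    """Hash the page text after collapsing whitespace, so that trivial
    spacing differences do not register as a change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, page_text):
    """Compare today's text with the fingerprint stored from the last visit."""
    return fingerprint(page_text) != previous_fingerprint


def terms_found(page_text, terms):
    """Return the watched terms that appear in the page (case-insensitive)."""
    lowered = page_text.lower()
    return [t for t in terms if t.lower() in lowered]


if __name__ == "__main__":
    stored = fingerprint("Acme Corp announces new product")
    today = "Acme Corp announces product recall"
    print(has_changed(stored, today), terms_found(today, ["recall", "lawsuit"]))
```

Storing only the fingerprint is enough to notice that a page changed; to see what changed, the captured text itself has to be kept as well.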

Leveraging Search Engine Findings
• Identify websites that may hold more on topic
– Colleges, associations, groups, social sites
– Local press, hobbies, sports, high schools
• Identify subject’s activities that may lead to further searching
• Identify subject’s family and closest friends, who may post about the subject

Metasearch Engines
• Dogpile – http://www.dogpile.com/ – Google, Yahoo, Bing, Ask
• ixquick – http://www.ixquick.com/ – 11 sites
• Metasearch – http://www.metasearchengine.com/ – 27 sites
• Excite – http://www.excite.com/ – Google, Yahoo, Bing, Ask
• Infospace – http://www.infospace.com/ – Google, Yahoo, Bing, Ask, Twitter
• Addictomatic – http://addictomatic.com/ – Metasearch engine (23 sites)
• Metacrawler – http://www.metacrawler.com/ – 9 or more sites
• Search3 – http://www.search3.com/ – Google, Twitter, Bing, in columns

Notice  that  results  differ  in  order  &  number
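Because each engine returns results in a different order and number, a searcher combining them has to merge and de-duplicate. The sketch below shows one simple way to do that (round-robin interleaving by rank, with a crude URL normalization); it is an illustrative assumption, not the method used by any of the metasearch engines listed, and the engine names and URLs are placeholders.

```python
from itertools import zip_longest
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so near-identical links compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(results_by_engine):
    """Interleave ranked lists (rank 1 from each engine, then rank 2, ...),
    keeping one entry per page and recording which engines returned it."""
    merged, seen = [], {}
    for rank_row in zip_longest(*results_by_engine.values()):
        for engine, url in zip(results_by_engine, rank_row):
            if url is None:  # this engine returned fewer results
                continue
            key = normalize(url)
            if key not in seen:
                seen[key] = {"url": url, "engines": [engine]}
                merged.append(seen[key])
            elif engine not in seen[key]["engines"]:
                seen[key]["engines"].append(engine)
    return merged


if __name__ == "__main__":
    sample = {
        "engine_a": ["http://www.example.com/", "http://a.test/x"],
        "engine_b": ["http://example.com", "http://b.test/y", "http://a.test/x/"],
    }
    for item in merge_results(sample):
        print(item["url"], item["engines"])
```

Recording which engines returned each page is useful in its own right: a hit found by only one engine is a reminder not to rely on a single source.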

Cached Web Pages
Archive.org: Website content no longer online (Wayback Machine)
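A lookup for archived copies can be scripted. The sketch below builds a request URL for the Wayback Machine's availability endpoint and reads the "closest" snapshot out of a JSON reply; the helper names are hypothetical, no network call is made here, and the sample reply is a hand-written illustration of the response shape, so the current Internet Archive documentation should be checked before relying on it.

```python
import json
from urllib.parse import urlencode


def availability_query(page_url, timestamp=""):
    """Build the lookup URL; timestamp is optional, in YYYYMMDD form."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON reply, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


if __name__ == "__main__":
    print(availability_query("example.com", "20060101"))
    # Hand-written sample reply, for illustration only.
    sample = ('{"archived_snapshots": {"closest": {"available": true, '
              '"url": "http://web.archive.org/web/20060101/http://example.com/"}}}')
    print(closest_snapshot(sample))
```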

Invisible Web
[Diagram: the Internet]
Many online databases are not accessible to Google.

Variations in Name Searches: Examples
• Use different versions of a name:
– “John J. Doe” (full name in quotes)
– “Jack Doe” (nickname in quotes)
– “Jack Doe” Nevada (name in quotes + geographic location)
– “Jack Doe” IBM (name in quotes + job/industry/hobby)
– “Jack Doe” Purdue (name in quotes + school)
• Address – reverse address – J. Doe may work better than John Doe
• Phone Numbers
• Email Addresses
– [email protected]
– doe
– jjdoe@
– @jacksbar (used with smaller companies)
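The name variations above can be generated rather than typed by hand. This is a minimal sketch that follows the slide's pattern (quoted full name, quoted nickname, then each quoted name paired with a location, employer or school); the function name and parameters are hypothetical.

```python
def name_queries(first, last, middle_initial="", nicknames=(), contexts=()):
    """Return quoted-name queries, then each quoted name plus a context term."""
    if middle_initial:
        full = f"{first} {middle_initial}. {last}"
    else:
        full = f"{first} {last}"
    quoted = [f'"{full}"'] + [f'"{nick} {last}"' for nick in nicknames]
    queries = list(quoted)
    for name in quoted:
        for context in contexts:
            queries.append(f"{name} {context}")
    return queries


if __name__ == "__main__":
    for q in name_queries("John", "Doe", "J", ["Jack"], ["Nevada", "IBM", "Purdue"]):
        print(q)
```

Generating the list up front also gives a record of exactly which queries were run, which helps when the search process has to be documented later.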

Quick Anatomy of Google
• Google (YouTube) constantly spiders the Internet, hits pages about once every 30 days
• Caches & indexes about 10 billion pages, more than any other search engine
• Presents search results instantly, showing live and cached data links
• Presents results in “PageRank” order based on popularity (note: ads influence results)

The Internet:
➢ 506M websites
➢ 56B pages
Google has about 18% of pages indexed
[Diagram labels: Web, Google]

Searching Online Databases: Contents May Not Be Indexed by Search Engines
• PeopleFinders, zabasearch
• WhitePages.com, Anywho.com
• USA.gov
• USTaxCourt.gov
• BlogSearch.google, IceRocket, Sphere
• Yahoo message boards
• Whois, SamSpade.org
• Nsopr.gov
• SSNValidator.com
• USAF-locator.com
• Bop.gov/inmate
• AMA-assn.org, bms.org (MDs)
• RipoffReport.com
• RagingBull.com

Finding Search Tools
• Library of Congress: http://www.loc.gov/rr/ElectronicResources/subjects.php?subjectID=69
• List of Search Engines: http://www.pandia.com/powersearch
• Yahoo List: http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Search_Engines_and_Directories/

Search Automation
• Metasearch
• Copernic: www.copernic.com
• Corporate datamining tools
• Proprietary Software
Better COTS products are needed

Boolean Logic, Search Techniques Optimize Queries

Step-by-Step Approach
1. Search engines
– Individual (e.g. Google, Yahoo)
– Meta (DogPile, Metasearchengine)
2. Social Networks/Blog sites
3. Copernic
4. Automated searches
5. Follow-up searches

Keeping Up With The Internet
• Keep a spreadsheet with links to best sources
• Don’t rely on search engines alone
• Find new sites & drop those no longer useful
• Research what works best
• Use experts in Internet searching - outsource
• Train & equip internal Internet searchers

Procedures
• Plan – include subject-specific sites & terms
• Capture content, print into PDFs
• Include details (URLs, dates, specifics)
• Provide source for each item reported
• Log the process, if evidence results
• Do not include inappropriate data (Title VII)
• Include caveats about reliability in reports

Controversial Methods
• “Friending” subjects – in real or false identity
• Social engineering to elicit info about subject
• Emailing subject under a false identity
• “Pretexting” as the subject to elicit data from a company or someone who knows subject
• Identifying an anonymous emailer using hidden code
• “Lurking” in chat rooms

Large Scale Internet Intelligence
• Use automated search tools
• Capture & store on-line activities for reference
• Filter and scan results to find relevant data
• Analyze and report results along with other investigative sources
• Identify users: link real names to online IDs
• Be careful in using Internet data to ensure accuracy and fairness

Analyzing Search Results
• Attribution: Who uses a virtual identity, posts
• Verification: Proving or confirming online data
– Ultimate confirmation: admission of subject
• Filter non-identifiable, irrelevant references
• Evaluating the seriousness of findings
• How much searching is enough?
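Filtering non-identifiable references can be partly mechanized as a first pass. The sketch below keeps only snippets that mention the name together with at least one corroborating identifier (a city, employer or school known to belong to the subject). The threshold and function names are assumptions made for illustration; a match is a lead for the analyst to verify, not proof of attribution.

```python
def corroboration(text, identifiers):
    """Return the known identifiers that appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [i for i in identifiers if i.lower() in lowered]


def filter_references(snippets, name, identifiers, minimum=1):
    """Keep snippets naming the subject with at least `minimum` identifiers."""
    kept = []
    for snippet in snippets:
        if name.lower() not in snippet.lower():
            continue  # does not mention the subject at all
        matched = corroboration(snippet, identifiers)
        if len(matched) >= minimum:
            kept.append((snippet, matched))
    return kept


if __name__ == "__main__":
    found = [
        "Jack Doe of Reno, Nevada wins chess title",
        "Jack Doe, London banker",
        "Reno weather",
    ]
    for snippet, why in filter_references(found, "Jack Doe", ["Reno", "Purdue"]):
        print(snippet, why)
```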

Preserving Online Evidence
If you are not using computer forensic tools…
• Print relevant web pages (PDF files)
• Maintain securely (encryption, digital signatures)
• Keep long enough to meet legal obligations (then delete completely)
If the content can become evidence, keep a log and notes to support testimony about collection.
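The collection log mentioned above can be kept as structured records. The sketch below stores the URL, capture time, collector and a SHA-256 hash of the saved file, so that a later check can show the file has not been altered. The record fields and function names are hypothetical, and a hash is not a digital signature or a substitute for forensic tools; it is only a simple integrity check.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CaptureRecord:
    url: str
    captured_at: str  # ISO 8601 timestamp, UTC
    sha256: str       # hash of the saved file's bytes
    collector: str
    note: str = ""


def record_capture(url, content, collector, note="", when=None):
    """Create a log entry for one captured page (content = saved file bytes)."""
    when = when or datetime.now(timezone.utc)
    return CaptureRecord(
        url=url,
        captured_at=when.isoformat(),
        sha256=hashlib.sha256(content).hexdigest(),
        collector=collector,
        note=note,
    )


def verify(record, content):
    """True if the file's bytes still match the hash logged at capture time."""
    return hashlib.sha256(content).hexdigest() == record.sha256


if __name__ == "__main__":
    saved = b"%PDF-1.4 sample bytes"
    entry = record_capture("http://example.com/profile", saved, "analyst1")
    print(entry.captured_at, entry.sha256[:12], verify(entry, saved))
```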

Using Search Results
• Integrate into other reporting – with clear indication of source
• Remember: subject may not have posted item
• Fairness may demand verification of the data by the subject
• In vetting, it’s best to interview the subject about any questionable postings

Is Internet Vetting Legal?
Is Internet Information “Private?”
• Internet data is public, not private: plain view, published information
• No restriction on using published information
• Must abide by all legal requirements for other types of investigative information
• No current legal requirements for
– Advising the subject
– Using Internet searching, if not outsourced
Caveat: This does not constitute legal advice

Legal & Privacy Gold Standard
• Notice, consent: add to current forms
• Attribution, verification, subject interview, redress
• Assessing results as intelligence:
– Virtual ID might be used by someone else
– Online data may be fabricated, fantasy, altered
– Basis for subject interview, adjudication
• Meets FCRA & other legal requirements

Cyber Vetting Guidelines
• IACP-PERSEREC Project: Guidelines
– Cyber Vetting for Law Enforcement
– Cyber Vetting for National Security
– Cyber Posting for both above
• Nationwide series of focus groups, research
• Baseline considerations for establishing enterprise policies and procedures
PERSEREC: Defense Personnel Security Research Center, Monterey, CA
IACP: International Association of Chiefs of Police

IACP  Cyber  Vetting  Guidelines

Developing a Cybervetting Strategy for Law Enforcement, December 2010, IACP [Companion study for national security]
http://www.iacpsocialmedia.org/Portals/1/documents/CybervettingReport.pdf

Key Policy Issues
• Trained Internet investigators
• Outsourced (can address EEO issue)
• Internet search policies & procedures
– Liability if Internet searching is done improperly
• Defining sufficiency - completeness
• Utilizing results of searching

Issues with Private Investigators
• Licensing of cyber investigators
– Training
• Legal and ethical guidelines for cyber vetting
• Watching the watchers: regulators online
• Keeping up with the Internet

Forthcoming  Book:

Internet Searches for Vetting, Investigations and Open-Source Intelligence
By Edward J. Appel
Taylor & Francis
http://www.taylorandfrancis.com/books/details/9781439827512/
Scheduled publication January 14, 2011

…contains more details on topics discussed here, e.g. how to do cybervetting and investigations ethically & legally

Questions?

Contact Information:
Ed Appel, Proprietor, iNameCheck
(301) 524-8074
[email protected]
www.inamecheck.com
