1. Home
  2. >
  3. DDoS
Categories
DDoS Deflect Deflect Labs Press Release

Deflecting attacks against Israeli and Palestinian websites

DoS/DDoS attack report against Deflect protected websites between Oct 7 to Oct 22, 2023

INTRODUCTION

Violence that engulfed Israel and Gaza in recent weeks has permeated the digital commons as well. From horrifying footage of murder on our computer screens to hateful discourse throughout social media platforms. The Deflect infrastructure has for many years been a secure home for Israeli and Palestinian human rights groups, media and civic institutions. Deflect staff continue to apply our project’s principles and terms of service to ensure that the network is not used as a platform for promoting violence or hate. We also seek our clients’ explicit permission before publicizing their association with Deflect and reporting on attacks that aims to silence them.

Since Oct 7, 2023, Deflect recorded six significant DoS/DDoS attacks against Israeli human rights organizations (btselem.org) that culminated with 54 million attack events hitting our edge servers. We also recorded 11 significant DoS/DDoS attacks against the Palestinian news website (palestinechronicle.com), with a total of 7 million malicious hits in various attack formation.

COVERAGE

  1. This report covers only L7 HTTP/HTTPS logs. There may be more attack traffic below L7, but is not covered in this report. Therefore we don’t provide traffic size information (such as 1GB of traffic per second).
  2. Attack with a higher “Ban rate” might underestimate the original scale of the attack. As after the Deflect ban, attacking IP will be banned on the firewall level and preventing any further request from that IP to hit our server.
  3. Sites with different tech parameters may result in different logging behavior. Site with JS challenger constantly enabled, challenging every request but do not firewall ban IP that failed too many challenges, may result in more attack traffic logged.

METHODOLOGY

To identify attacks from normal traffic, we employ the following methodology:

  1. Identify if a spike of total traffic / ban log existed over a 24 hour window.
  2. Narrow down to that time range for anomaly, which often includes:
    1. Excessive request hitting certain URL (such as root /)
    2. Excessive request with identical User-Agent from different IPs
    3. Evenly distributed User-Agent / HTTP Method that is too perfect to be true
    4. Excessive unique query string (such as ?v={rand}) to avoid cache
  3. Confirm if top traffic IPs triggered any of our rate limiting rules.
  4. Cross check with Baskerville system, a machine learning  system that detects anomalous traffic.

ATTACK: BTSELEM.ORG

Parameters: JS Challenger: On / Fail challenger or hitting rate limit result: No ban

#DateStart (+0)Duration (s)HTTP ReqRPSUnique IPUnique bansBan rate
B110/9/202320:37:5199752,497,38052,64424516567.35%
B210/13/202315:37:262665291,19210911100.00%
B310/16/202315:02:08123146,0661,1861,8331,41677.25%
B410/16/202322:32:554831,068,4362,2113,6112,40366.55%
B510/18/20230:03:12141165,1711,1683,1332,75587.93%
B610/20/202313:24:30181133,9307392,6062,28187.53%

Chart A: Deflect / Banjax ban log visualization of attack #B1

Attack #B1 stands as the most potent attack documented in this report. It achieved an average Request Per Second (RPS) of 52,644. The top 6 originating IPs dispatched an average of 3 million requests within a 10-minute duration. The assailants deployed a “Randomized Nocache Flood” strategy, using varying query strings to bypass caching. Notably, the same query string was observed being used by different IPs from various global locations.

Attack #B2 originated from a single IP: 46.210.30.130. However, an apparent misconfiguration in the attacker’s tool resulted in all their requests being rejected by our server. 

Attack #B3 featured user-agent strings with minor variations in their version numbers, keeping a consistent foundational structure. Still, these weren’t entirely unique; the same user-agent string was detected being used by 37 different IPs.

Attack #B4 adopted a strategy akin to Attack #B3, but showcased a broader spectrum of user-agents and specifically targeted the /hebrew endpoint, as opposed to the website’s root directory (/).

Chart B: Baskerville Reaction to Attack #B4

Attack #B5 mirrored the tactics seen in Attack #B3 but employed a different set of user-agents.

Attack #B6 shared three identical User-agent string among the 2606 IPs.

ATTACK: PALESTINECHRONICLE.COM

Parameters: Js Challenger: Off / Hitting rate limit result: Firewall ban

#DateStart (+0)Duration (s)HTTP ReqRPSUnique IPUnique bansBan rate
P110/8/20238:26:5351588,0141711,87991748.80%
P210/8/202314:42:2692586,9919411100.00%
P310/9/202310:16:30299364,2411,2181,6321,44588.54%
P410/9/202322:34:091541,198,7527,76411100.00%
P510/10/202313:11:02739230,6433122,0021,72185.96%
P610/10/202317:06:396682,869,1764,29470853275.14%
P710/12/202320:27:52272711,5112,6131,50686757.57%
P810/12/202320:57:58248738,3802,9771,14293882.14%
P910/13/20230:32:16181458,3542,53382874690.10%
P1010/13/20239:25:37177291,2911,64875971093.54%
P1110/21/202316:31:55117269,0272,3052,2281,34760.46%

Chart C: Deflect / Banjax ban log visualization of attack #P6

Attack #P2 and #P4 was perpetrated by a single IP. Both targeted the HTTP port 80 and did not adhere to the 301 redirection to HTTPS. Excessive 301 requests were only subject to bans after October 14.

Attack #P6 was primarily executed by a single IP, which likewise did not adhere to the 301 redirects issued by Deflect.

Attacks #P7, #P8, #P9, and #P10 exhibited similarities in their approach; all employed a uniformly distributed user-agent string, implying that identical user-agent strings were observed across various IPs.

ATTACK CORRELATION

We observed significant overlaps in attack IPs across various DDoS attacks on palestinechronicle.com and btselem.org websites, suggesting coordinated attempts by the perpetrators. Here are the findings:

  1. Attack #P9 and #P10 shared approximately 50 common attack IPs.
  2. Attack #P7 and #P8 had about 30 identical attack IPs.
  3. Notably, attack #P7, #P8, #P9, and #P10 seems to originate from the same attacking source, evidenced by a strong overlap of source IPs.
  4. Attack #P3 and #P6 had six IPs in common. While attack #P1 and #P5 also shared six identical IPs. The recurrence of shared IPs in separate attacks suggests a possible, albeit weak, connection of a common attack source or affiliated entities.
  5. Attack #B4, #B5 and #B6 had 32 shared attack IPs, hinting that they might be from the same attacking source.
  6. There were also IPs that attacked both sites:
    1. IPs 186.121.235.66, 187.141.184.235, 201.91.82.155, and 36.91.45.11 targeted both #B3 and #P6.
    2. IPs 186.121.235.66, 187.141.184.235, 201.91.82.155, 36.91.45.11, 123.126.158.50, 223.112.53.2, 5.95.66.74, 79.107.146.14, and 190.90.8.74 attacked both #B3 and #P3.
  7. Of the 13 IPs that targeted attack #B1, three also attacked atack #P6 and six targeted #P3.

TOP ATTACKING IPs

This is a list of IP with excessive request logged on Deflect, associated with individual indecent (See # for matching attack ID).

#IPASRequests Count
B1198.50.121.146iWeb Technologies Inc.3,936,297
B1202.134.19.50CMC Telecom Infrastructure Company3,077,579
B1209.126.124.140HEG US Inc.2,908,415
P6104.199.133.2Google LLC2,802,394
B1185.191.236.162Rack Sphere Hosting S.A.2,751,354
B1200.30.138.54MILLICOM CABLE EL SALVADOR S.A. DE C.V.2,502,015
B1103.74.121.88The Corporation for Financing & Promoting Technology2,480,702
P491.227.40.198Data Invest sp. z o.o. S.K.A1,198,752
B1113.125.82.11Cloud Computing Corporation848,330
B137.211.21.205Ooredoo Q.S.C.831,118
B1173.212.197.82Contabo GmbH662,370
B1212.92.204.54A1 Hrvatska d.o.o.589,828
B1193.41.88.58Kyiv National Taras Shevchenko University542,676
B1109.70.189.70JSC Elektrosvyaz497,125
B1186.121.235.66AXS Bolivia S. A.417,661
B193.180.220.67Intertelecom Ltd417,072
B1177.126.129.43Net Aki Internet Ltda399,074
B246.210.30.130Cellcom Fixed Line Communication L.P.291,192
P2223.233.84.97Bharti Airtel Ltd., Telemedia Services86,991
P723.247.35.2Global Frag Networks28,408
P9209.17.114.78Network Solutions, LLC25,476
P10209.17.114.78Network Solutions, LLC12,392

CONCLUSION

From October 7th to 22nd, 2023, both Israeli and Palestinian websites were subjected to coordinated and severe cyber-attacks, intended to overwhelm and take down these websites. These kinds of attacks, known as Distributed Denial of Service (DDoS) attacks, function like a traffic jam clogging up a highway, preventing regular users from accessing the website.

  1. Scale of Attacks: The Israeli human rights website faced attacks resulting in 54 million web requests, while the Palestinian news website experienced 7 million web requests. Think of these as millions of unwanted phone calls jamming up a hotline.
  2. Tactics and Techniques: The attackers adapted and used varied methods to bypass Deflect defences. Some tried to vary the attack requests in minute ways to fool manual rule-sets. Others used a more straightforward approach of sending a massive number of requests rapidly. In some instances, attackers tried to disguise their harmful requests by making them look like regular user visits.
  3. Shared Attack Patterns: We noticed that many of the attacks on both websites seemed to come from the same sources or groups. This is like recognizing the same group of troublemakers causing disruptions in multiple places. Specifically, the methods and even some of the internet addresses (IPs) used in the attacks were common across the two websites.
  4. Efficiency of Defenses: Our protective measures, think of them as security guards or filters, worked well in most cases. They were able to identify and block these harmful requests, preventing significant disruptions. However, attackers are persistent, and they keep trying various methods to bypass our defenses.

Over the recent period, our protective system, Deflect, has stood as a robust guardian for websites under its watch. Using sophisticated techniques, which include the power of machine learning, it has adeptly differentiated between regular and malicious traffic. This not only ensured that these cyber attackers were effectively thwarted but also maintained the uninterrupted service of the websites in question. It’s a testament to Deflect’s capability to handle intricate and aggressive cyber-attacks, safeguarding the essence and uninterrupted function of online platforms, and thereby supporting the freedom of expression online.

  1. Home
  2. >
  3. DDoS
Categories
DDoS Deflect News from Deflect Labs Uncategorized

Updates from Deflect – 3 – 2022

This was a busy month for Deflect’s mitigation tooling, with Banjax blocking almost 12 million malicious requests launched by 108,294 different bots. Due to the war in Ukraine, many people turned to Deflect protected Ukrainian media sites for information. Throughout the month Deflect served 1,128,751,920 requests (almost double than the previous month) of which 283,570,50 came from Ukraine – around 20% of our global traffic. 1,277,053 Ukrainians read Deflect protected websites – also a testament to the stability of the Internet there.

Ukrainian readership in March, by city

The biggest attack recorded this month was against informator.ua – a pan-Ukrainian news website with a focus on the Donbas region.

On the 31st of March, between 07:45-8:50 GMT+0 about 1,300 unique IPs were blocked by Deflect as they attacked informator.ua with GET /ru?8943563843054274 and POST /ru?829986440416200 requests, utilizing cache-busting techniques. These bots were from Brazil, USA, Indonesia, India, Bangladesh and many other countries, almost 1,000 of them seems to be infected MikroTik routers. Several hundred were compromised webservers and SOCKS proxies. There was a partial downtime for this website for about an hour as Deflect was not able to mitigate this attack fast enough to be sure no malicious requests are hitting the origin. The Baskerville system did not react as expected (this has been fixed). We enabled Challenger for this domain to be sure we can mitigate future attack without any issues for the origin. Our log aggregation and analysis system was affected by the overall amount of requests and was out of sync for a short period of time.

Over 300,000 requests per minute were generated by the attackers. As you can see – a significant amount of bots originated from the United States. This is another important reminder for patching your computer systems and other Internet connected devices. Otherwise it could be your system attacking Ukrainian websites too!
Top banned unique IPs by vendor

    912 MikroTikRouter
    232 Unknown
     51 UbuntuServer
     44 Torrouter
     33 DebianServer
     16 WindowsServer
      6 WindowsSystem
      6 RedHatServer
      4 CentOSLinuxServer

Top banned unique IPs by service

    875 MikroTik
    232
     49 Ubuntu-ssh
     44 TorExitRouterHTTPheader
     33 Debiansshheader
     13 MikroTikSNMPinfo
     10 MikroTikFTPserver
      8 MikroTikPPTPserver
      7 WindowsRDPServer
      7 MSIISheader
      6 WindowsNetBIOS
      6 RedHatDNSheader
      5 MikrotikRouterOSconfigurationpage
      4 ApacheCentOS
      2 WindowswithMSHTTPAPIWebServer

by client_url:

199940     /ru
102142     /ru/category/biznes/login
37312      /ru/ukraino-rossiyskie-peregovory-v-stambule-itogi
3          /ru/post-prev/45573
  1. Home
  2. >
  3. DDoS
Categories
Blog DDoS Deflect Uncategorized

Deflect – a year in summary

Once again, the Deflect network grew in size and audience in 2021. Apart from the continuously stellar work of our clients, what stood out the most for the Deflect team tasked with network monitoring and adversary mitigation – was the increasing sophistication and ‘precision’ of Baskerville, outperforming human rule sets written for the Banjax mitigation toolkit request rate-limiting library. Yes, the machine is outperforming humans on Deflect. We won’t get into the philosophical nature of this reality, rather share some statistics and interesting attacks we witnessed this year, with you.

Year in stats

Legitimate no. of requests served10,152,911,060
Legitimate no. of unique readers (IP)77,011,728
Total requests banned – Banjax3,326,915
Total requests challenged by Baskerville2,606,927
% of Deflect clients also using eQpress hosting34 %
Total amount of complete Deflect outage0
Lowest up-time for any Deflect client99.8%
% of clients increase year-on-year21.62%
Largest botnet, by number of bots 19,333
Number of significant DDoS events103

Deflected attacks

What an attack looks like

On November 04, 2021 – a DDoS attack on a Vietnamese media (also hosted on EQPress) began around 16:50 UTC. Between 2,000-2,500 unique IPs where blocked, as originating from United States, Canada, Germany, France and other countries. These bots issued about 825 thousand GET / and GET https://website.com// requests during this attack. Most IPs involved were detected as proxies, and many of them revealed an IPs in X-Forwarded-For header. The underlying WordPress instance received up to 5,000 requests per second, forcing the EQPress server to send up to 30 megabits per second of html responses. Thanks to the FasctCGI cache and overall configuration hardening, the hosting network cluster had enough resources to serve requests until all bots were blocked without any significant issues for the website itself or its neighbors.

Baskerville detected this attack and issued challenge commands to 2,200+ IPs.

Baskerville’s traffic light classification system

This attack targeted an independent investigative journalism website from the Philippines. The attack began on November 15th and continued throughout the next two weeks. Large portions of attack traffic were not accessible to Deflect, targeting the hosting data center with L3/L4 floods.

Almost 4,000 unique IP addresses issued more than 70 millions “GET /” and “GET /?&148294400498e131004165713TT117859756720Q106417752262N” requests against the website, using `cache busting` techniques with random query_string parameters. Attackers also reverted to using forged User-Agents in request strings. Obviously this attack was adapted against Deflect caching defenses. Many of the participating IPa were proxies possible revealing the original sender with X-Forwarded-For header.

Unfortunately, this attack was not fully mitigated in a quick way and caused several hours of downtime for real users. After manually enabling Deflect’s advanced protection mechanisms and adjusting the origin’s configuration, the website became stable again.

A Zambian democratic watchdog organization was attacked twice between August 08-09 and 11-12. It seems that when the attackers came back a second time round, they hadn’t learned their lessons and tried a similar technique and an almost identical botnet.

Servers from different countries (mostly Unites States, Germany, Russia, France) were sending more than 16 millions of GET / and /s=87675957 requests (with random numbers to bypass caching) during the first round of attacks. During the following incident over 137 million malicious requests were recorded and blocked.

Most of these IPs are known as compromised servers that could be used as proxies and MikroTik routers. 383 unique User-Agent headers were used, all of them were Google Chrome variations. We can also see about 400 TOR exit nodes which were used for this attack.

Millions of hits per minute

The first attack was not completely mitigated due to its profile and some traffic was able to hit the origin server, resulting in several hours of partial downtime for real visitors during different phases of this attack. The second attack was completely mitigated as we had already updated our mitigation profiles.

  1. Home
  2. >
  3. DDoS
Categories
Blog DDoS Deflect

Go Banjax-Go!

The Deflect service is built around defense-in-depth principles to keep your website online, no matter the traffic coming in. Our network edges are located with multiple providers in data centers around the globe. Every edge on the Deflect network caches static webpage resources and can reply very quickly to a multitude of simultaneous requests. As traffic arrives at the edge, two separate modules are always on the lookout for malicious bots and attacks. One of these is Baskerville – powered by machine lead anomaly predictions. We have a dedicated page explaining how that works. The other is Banjax – a curated list of regex patterns with associated rate limits. This allows us, for example, to instantly block IPs sending requests with user agents from a list of vulnerability scanners. Or we can block IPs that request an expensive /search/ endpoint too often, or send an unreasonable amount of POST or GET requests to the network. It’s simple but very efficient.

Banjax was originally coded in C++ and created as an Apache Traffic Server (ATS) plugin. These initial choices have made it difficult for third parties (who were not running ATS) to adopt. In refactoring Banjax we decided to use Go – a more modern language that still provided all the necessary functionality and made it easier to maintain the library in the long term. So now, we are please to present Banjax-Go built to for the 2020s and working happily in concert with Baskerville and Deflect caching or as a standalone module in your nginx setup.

So the list of decisions Banjax can make are: Allow, Block, or Challenge. The decision lists are populated from the config file (useful for allowlisting or blocklisting known good or bad IPs), from the results of the regex rate limit rules (so breaking a rule can result in a Block or a Challenge, or even an Allow), and from messages received on a Kafka topic (this is how Baskerville talks to banjax-next).

In addition to blocking requests (at the HTTP level) or blocking IPs (at the iptables/ netfilter level), Banjax also supports sending a “challenge” HTML page which contains either a basic password challenge (useful as an extra line of defense in front of admin sections) or a proof-of-work challenge (useful for blocking bots that cannot execute JavaScript, while allowing web browsers through).

An intitial concern with moving away from C++ was performance – during an attack, Banjax often has to processes thousands of requests per second, on every edge. We ran a set of synthetic tests to see how Banjax-Go performed. We used a series of worst-case scenarios, coming from our past experiences on Deflect. Our goal was to process 1,000 unique IPs per second, on an average virtual machine (a Digital Ocean droplet).

We first tested iptables directly to see how quickly it can process direct requests – deleting 2000 rules – without any other system interfering. We ended up with the following results:

Next, we tested how quickly Banjax-Go is able to process different types of common requests (again, under worst-case scenario conditions):

  • Every request generates a challenge: 800 req/sec
  • Every request passes through to the origin without any caching: 1200 req/sec
  • Every request passes through and is served a cached version of a web page: 2800 req/sec

At the same time we decided to evolve our caching mechanism from using Apache Traffic Server to Nginx. These and many other modules will make up our release of Deflect-Core – a project deliverable that we hope to present by the end of spring. For now our efforts concentrate on the mitigation toolkit banjax-next.

  1. Home
  2. >
  3. DDoS
Categories
Blog DDoS Deflect

Everything you always wanted to know about protecting your website with Deflect* (*But were afraid to ask)

Whether you are the owner of an independent media site telling the stories no one else will, a non-profit or community organization informing its members of available resources and events, or a company of any size, ensuring that your website stays protected and online is of the utmost importance. 

Understanding the difference between indirect and direct vulnerability

Many indirect cybersecurity attacks – malware, phishing, trojans, data breaches, and ransomware – can be prevented by raising awareness in an organization and cultivating best practices around clicking on suspect links or downloading files from unreliable sources. 

However, your website can also be subjected to direct DDoS attacks. This is why its security should be managed by a dedicated technical support team that you trust, one that matches your values of transparency, privacy, and social responsibility. 

What is a DDoS attack?

Unlike attacks that rely on individuals clicking on suspicious links in their email or downloading files from untrusted sites, DDoS attacks are direct assaults on the IT infrastructure of an organization.

A DDoS (distributed denial of service) attack is like the early pandemic grocery store rush of customers piling up and blocking the door in a mad rush for that last roll of toilet paper. Except, when all that traffic hits your site, they are not customers – they are bots. And their main purpose is to overwhelm your site and knock it off the web. Without protection, your site can be incapacitated by an attack and shut down completely. 

My site won’t get attacked because we’re too small

A DDoS attack can happen to anyone, no matter the size of your site or the number of visitors. In fact, many small sites, especially independent media and grassroots organizations, are particularly vulnerable to attacks because their voices often oppose a powerful government, a military, or a popular consensus. In some cases, these sites are targeted by hate groups, as was the case when we protected Black Lives Matter from attacks which occurred over 100 times a day on their site for seven months in 2016.

One good question to ask: would anyone like to silence your voice? If the answer is yes, you likely already know the importance of DDoS protection. We provide the same level of protection for non-profits and independent media sites as we do for commercial clients. Learn more about our free protection for eligible groups here

There aren’t many DDoS attacks, so I’m not likely to need protection

According to a recent white paper released by CISCO, DDoS attacks have been getting larger and more frequent each year. In 2018, there were 7.9 million DDoS attacks, and by 2023, they estimate the number will double to 15.4 million.

Aren’t bigger names better when it comes to DDoS protection?

No. Deflect has the capacity to handle protection of any site, and our experience mitigating attacks on some of the most vulnerable sites in countries all around the world has made us experts in the field.

According to Ali Reza, “IPOS directly benefited from Deflect’s expertise and professionalism when our main website was subject to an unprecedented attack. At the time the services of similar companies including CloudFlare and Google PageSpeed failed to protect IPOS’ election tracking poll against a major DDOS attack during the 2013 presidential elections in Iran. However, Deflect were able to quickly set up a CDN front and accept traffic from IPOS’ main domain and fight back against the attack.”

My industry won’t be attacked. It’s banks and governments that are most often subjected to DDoS attacks

While banks and governments have indeed been subjected to DDoS attacks, no industry is DDoS-proof. According to a 2019 global DDos Threat Landscape report by Imperva, attacks have occurred in most markets, including adult entertainment, gaming, news, society, lifestyle, retail, travel, and gambling. If your site is not in those markets, it does not mean you are safe from a DDoS attack. 

Motivations for DDoS attacks

As the same report points out, the motivations for DDoS attacks are many, and may include: 

Business competition – a competitor might hire a botnet to bring down your site. 

Extortion – ecommerce sites are particularly dependent on the uptime of their sites for generating revenue. This makes them particularly susceptible to extortion for the promise not to attack their site.

Hacktivism – political, media, or corporate websites can be targeted by hacktivists to protest against their actions.

Vandalism – disgruntled users or random offenders often attack gaming services or other high profile clients.

To this list, we would add:

Censorship – these attacks could be committed by individuals, governments, or militaries against groups for their social, environmental, human rights, or political movements with the goal of silencing their voices. As you can imagine, outside of North America, some of the most consistent attacks against the most vulnerable peoples and groups, like our client ARNO, in Myanmar, are of this type.

Transparent, Trusted, Ethical Protection

But I’m already protected by one of the more popular guys for “free.” 

Large providers often claim to offer DDoS protection for “free.” To provide that service, however, many enter into agreements with venture capitalists, and the trade-off for their “free” protection is the privacy of your data, which can be shared or sold. 

Before the Cambridge Analytica scandal, many of us would mindlessly scroll down and agree to all terms and conditions, but for independent media, nonprofit and community organizations, and companies, data should always be kept safe and private. When choosing who will protect you from DDoS attacks, read policies carefully to find out if you’re giving up anything for “free” protection. Our protection for non-profits, NGO’s, and independent media really is free.

Deflect Pricing

At Deflect, we have always provided our services for free to eligible non-profits and independent media groups, without compromising your data privacy. Our principles, privacy policy, and conditions are transparent. For commercial sites, our pricing is transparent. Unlike most of our competitors, we charge for the number of unique monthly IPs to your site, not for multiple visits from one IP, or traffic from attacks. 

There are other limits to the “free” protection provided by some of our competitors. On more than one occasion, clients who were protected by our competitors have come to us after being attacked and told they either needed to upgrade to a premium service or leave, just at the moment when they were most vulnerable. 

We at Deflect consider ourselves to be the #1 ethical cybersecurity protection company in the world. We have over 10 years experience protecting the most vulnerable and most attacked non-profit and independent media voices across the world in over 80 countries. 

In addition to our commitment to transparent policies and privacy, we have a clear no-hate, no-incitation-of-violence policy. For us, this is a no-brainer. If your site breaks this policy, you will be asked to leave. 

We are socially responsible. For every paying commercial client we protect, we are able to extend the same protections for free to important groups that otherwise could not afford protection, or may get kicked off the “free” protection of our competitors because the work they do makes them more vulnerable to attacks. 

If you have more questions, or you’d like more information about Deflect’s non-profit, business, or partner programs, you can reach out to us by sending us a message here or by reach out to terry@deflect.network for non-profit questions, and garfield@deflect.network for business and partner programs. 

  1. Home
  2. >
  3. DDoS
Categories
Advocacy Blog DDoS Technology

Tor and DDoS attacks: myths and reality

– Non-DDOS attacks
Any vector that does not require a large flood of traffic could be
effectively routed through Tor.  This covers most attacks with the
notable exception of DDoS. If I were trying to properly hack, not
just DoS, Deflect – I would use Tor.

– C&C functions
Probably we wouldn’t observe this, but it goes without saying the Tor
can be used for communicating with botnet C&C.  That’s what I would
do.

– Monitoring of site availability and other DDoS-related functions
Before and during an attack, it must be of interest to monitor the
availability of the target site.  Tor could be useful here… maybe I
would use it to monitor the attacked site.

– Regular browsing by Tor users
In any attempt to monitor Tor traffic to alert us to an immanent
attack, care must be taken to filter out normal traffic as much as
possible.  ML or significance algorithms would probably do the job
best.  Sniffles may also provide inside: an increase in Tor traffic
to ports other than 80/443 might be adequate without further
computation.

– Actions for more detailed research:
o Install license for Elastic Graph which arrived today to facilitate
significance analysis o Get sniffles online and see what we can see
(following a subsequent, ie monitored, attack) o Apply significance
or other analysis to Tor traffic patters in and out of proximity to
an attack

CASE STUDY: BLM

The first image shows banjax bans for blacklivesmatter.com on top, and
torified traffic to blacklivesmatter.com on the bottom, both over the
last 8 weeks.

Again, the second image shows banjax bans for blacklivesmatter.com on
top, and torified traffic to blacklivesmatter.com on the bottom, but
zoomed in to a period approximately 1 week before and 1 week after the
large spike in bans.

Some observations:
– There appears to be a sharp uptick in Tor traffic adjacent to the
incident
– Torified traffic continues long after the attack appears to have
ended: perhaps we are looking at a coincidence, or perhaps another
attack is being planned/prepared, or perhaps both.
– The number of banned IPs is two orders of magnitude larger than the
number of torified hits (note that unique IPs is a more or less useless
metric for torified traffic, and also that the total hits *from* the
banned IPs above will be quite a bit larger than the number of banned
IPs)
– The number of banned IPs is, in fact, far larger than the number of
Tor exit nodes.

Caveats:
– BLM have not been with us long
– Only one site is analysed here, superficially
– We are looking at traffic to ports 80/443 which is not filtered by
our providers

QUICK CASE 2: www.btselem.org

A small uptick in bans corresponding with a large spike in torified
traffic.  Then a large uptick in bans, with no corresponding increase
in torified traffic.  The attack appears to either subside or be
successfully blocked, then another smaller attack occurs – this time
with a simultaneous uptick in torified traffic.  Hard to draw
conclusions, but not inconsistent with the theory that a correlation
may exist.  A clear lesson from this is example, IF a correlation is
proven or assumed, is that time between a spike in torified probing and
an actual DDoS will vary.

USELESS AGGREGATE GRAPHS:

Without separating by HTTP Host (or anything else), the data is mushed
into useless noise.

  1. Home
  2. >
  3. DDoS
Categories
DDoS Deflect News from Deflect Labs

Introducing Baskerville (waf!)

The more outré and grotesque an incident is the more carefully it deserves to be examined.
― Arthur Conan Doyle, The Hound of the Baskervilles

Chapter 1 – Baskerville

Baskerville is a machine operating on the Deflect network that protects  sites from hounding, malicious bots. It’s also an open source project that, in time, will be able to reduce bad behaviour on your networks too. Baskerville responds to web traffic, analyzing requests in real-time, and challenging those acting suspiciously. A few months ago, Baskerville passed an important milestone – making its own decisions on traffic deemed anomalous. The quality of these decisions (recall) is high and Baskerville has already successfully mitigated many sophisticated real-life attacks.

We’ve trained Baskerville to recognize what legitimate traffic on our network looks like, and how to distinguish it from malicious requests attempting to disrupt our clients’ websites. Baskerville has turned out to be very handy for mitigating DDoS attacks, and for correctly classifying other types of malicious behaviour.

Baskerville is an important contribution to the world of online security – where solid web defences are usually the domain of proprietary software companies or complicated manual rule-sets. The ever-changing nature and patterns of attacks makes their mitigation a continuous process of adaptation. This is why we’ve trained a machine how to recognize and respond to anomalous traffic. Our plans for Baskerville’s future will enable plug-and-play installation in most web environments and privacy-respecting exchange of threat intelligence data between your server and the Baskerville clearinghouse.

Chapter 2 – Background 

Web attacks are a threat to democratic voices on the Internet. Botnets deploy an arsenal of methods, including brute force password login, vulnerability scanning, and DDoS attacks, to overwhelm a platform’s hosting resources and defences, or to wreak financial damage on the website’s owners. Attacks become a form of punishment, intimidation, and most importantly, censorship, whether through direct denial of access to an Internet resource or by instilling fear among the publishers. Much of the development to-date in anomaly detection and mitigation of malicious network traffic has been closed source and proprietary. These silo-ed approaches are limiting when dealing with constantly changing variables. They are also quite expensive to set-up, with a company’s costs often offset by the sale or trade of threat intelligence gathered on the client’s network, something Deflect does not do or encourage.

Since 2010, the Deflect project has protected hundreds of civil society and independent media websites from web attacks, processing over a billion monthly website requests from humans and bots. We are now bringing internally developed mitigation tooling to a wider audience, improving network defences for freedom of expression and association on the internet.

Baskerville was developed over three years by eQualitie’s dedicated team of machine learning experts. Several challenges or ambitions were presented to the team. To make this an effective solution to the ever-growing need for humans to perform constant network monitoring, and the never-ending need to create rules to ban newly discovered malicious network behaviour, Baskerville had to:

  • Be fast enough to make it count
  • Be able to adapt to changing traffic patterns
  • Provide actionable intelligence (a prediction and a score for every IP)
  • Provide reliable predictions (probation period & feedback)

Baskerville works by analyzing HTTP traffic bound for your website, monitoring the proportion of legitimate vs anomalous traffic. On the Deflect network, it will trigger a Turing challenge to an IP address behaving suspiciously, thereafter confirming whether a real person or a bot is sending us requests.

Chapter 3 –  Baskerville Learns

To detect new evolving threats, Baskerville uses the unsupervised anomaly detection algorithm Isolation Forest. The majority of anomaly detection algorithms construct a profile of normal instances, then classify instances that do not conform to the normal profile as anomalies. The main problem with this approach is that the model is optimized to detect normal instances, but not optimized to detect anomalies causing either too many false alarms or too few anomalies. In contrast, Isolation Forest explicitly isolates anomalies rather than profiling normal instances. This method is based on a simple assumption: ‘Anomalies are few, and they are different’. In addition, the Isolation Forest algorithm does not require a training set to contain normal instances only. Moreover, the algorithm performs even better if the training set contains some anomalies, or attack incidents in our case. This enables us to re-train the model regularly on all the recent traffic without any labeling procedure in order to adapt to the changing patterns.

Labelling

Despite the fact that we don’t need labels to train a model, we still need a labelled dataset of historical attacks for parameter tuning. Traditionally, labelling is a challenging procedure since it requires a lot of manual work. Every new attack must be reported and investigated, and every IP should be labelled either malicious or benign.

Our production environment reports several incidents a week, so we designed an automated procedure of labelling using a machine model trained on the same features we use for the Isolation Forest anomaly detection model.

We reasoned that if an attack incident has a clearly visible traffic spike, we can assume that the vast majority of the IPs during this period are malicious, and we can train a classifier like Random Forest particularly for this incident. The only user input would be the precise time period for that incident and for the time period for ordinal traffic for that host. Such a classifier would not be perfect, but it would be good enough to be able to separate some regular IPs from the majority of malicious IPs during the time of the incident. In addition, we assume that attacker IPs most likely are not active immediately before the attack, and we do not label an IP as malicious if it was seen in the regular traffic period.

This labelling procedure is not perfect, but it allows us to label new incidents with very little time or human interaction.

An example of the labelling procedure output

Performance Metrics

We use the Precision-Recall AUC metric for model performance evaluation. The main reason for using the Precision-Recall metric is that it is more sensitive to the improvements for the positive class than the ROC (receiver operating characteristic) curve. We are less concerned about the false positive rate since, in the event that we falsely predict that an IP is doing something malicious, we won’t ban it, but only notify the rule-based attack mitigation system to challenge that specific IP. The IP will only be banned if the challenge fails.

The performance of two different models on two different attacks

Categorical Features

After two months of validating our approach in the production environment, we started to realize that the model was not sophisticated enough to distinguish anomalies specific only to particular clients.

The main reason for this is that the originally published Isolation Forest algorithm supports only numerical features, and could not work with so-called categorical string values, such as hostname. First, we decided to train a separate model per target host and create an assembly of models for the final prediction. This approach over complicated the whole process and did not scale well. Additionally, we had to take care of adjusting the weights in the model assembly. In fact, we jeopardized the original idea of knowledge sharing by having a single model for all the clients. Then we tried to use the classical way of dealing with this problem: one-hot encoding. However, the deployed solution did not work well since the model became too overfit to the new hostname feature, and the performance decreased.

In the next iteration, we found another way of encoding categorical features  based on a peer-review paper recently published in 2018. The main idea was not to use one-hot encoding, but rather to modify the tree-building algorithm itself. We could not find the implementation of the idea, and had to modify the source code of IForest library in Scala. We introduced a new string feature ‘hostname,’ and this time the model showed notable performance improvement in production. Moreover, our final implementation was generic and allowed us to experiment with other categorical features like country, user agent, operating system, etc.

Stratified Sampling

Baskerville uses a single machine learning model trained on the data received from hundreds of clients.This allows us to share the knowledge and benefit from a model trained on a global dataset of recorded incidents. However, when we first deployed Baskerville, we realized that the model is biased towards high traffic clients.

We had to find a balance in the amount of data we feed to the training pipeline from each client. On the one hand, we wanted to equalize the number of records from each client, but on the other hand, high traffic clients provided much more valuable incident information. We decided to use stratified sampling of training datasets with a single parameter: the maximum number of samples per host.

Storage

Baskerville uses Postgres to store the processed results. The request-sets  table holds the results of the real-time weblogs pre-processed by our analytics engine which has an estimated input of ~30GB per week. So, within a year, we’d have a ~1.5 TB table. Even though this is within Postgres limits, running queries on this would not be very efficient. That’s where the data partitioning feature of Postgres came in. We used that feature to split the request sets table into smaller tables, each holding one week’s data. . This allowed for better data management and faster query execution.

However, even with the use of data partitioning, we needed to be able to scale the database out. Since we already had the Timescale extension for the Prometheus database, we decided to use it for  Baskerville too. We followed Timescale’s tutorial for data migration in the same database, which means we created a temp table, moved the data from each and every partition into the temp table, ran the command to create a hypertable on the temp table, deleted the initial request sets table and its partitions, and, finally, renamed the temp table as ‘request sets.’ The process was not very straightforward, unfortunately, and we did run into some problems. But in the end, we were able to scale the database, and we are currently operating using Timescale in production.

We also explored other options, like TileDb, Apache Hive, and Apache HBase, but for the time being, Timescale is enough for our needs. We will surely revisit this in the future, though.

Architecture

The initial design of Baskerville was created with the assumption that Baskerville will be running under Deflect as an analytics engine, to aid the already in place rule-based attack detection and mitigation mechanism. However, the needs changed as it became necessary to open up Baskerville’s prediction to other users and make our insights available to them.

In order to allow other users to take advantage of our model, we had to redesign the pipelines to be more modular. We also needed to take into account the kind of data to be exchanged, more specifically, we wanted to avoid any exchange that would involve sensitive data, like IPs for example. The idea was that the preprocessing would happen on the client’s end, and only the resulting  feature vectors  would be sent, via Kafka, to the Prediction centre. The Prediction centre continuously listens for incoming feature vectors, and once a request arrives, it uses the pre-trained model to predict and send the results back to the user. This whole process happens without the exchange of any kind of sensitive information, as only the feature vectors go back and forth.

On the client side, we had to implement a caching mechanism with TTL, so that the request sets wait for their matching predictions. If the prediction center takes more than 10 minutes, the request sets expire. 10 minutes, of course, is not an acceptable amount of time, just a safeguard so that we do not keep request sets forever which can result in OOM. The ttl is configurable. We used Redis for this mechanism, as it has the ttl feature embedded, and there is a spark-redis connector we could easily use, but we’re still tuning the performance and thinking about alternatives. We also needed a separate spark application to handle the prediction to request set matching once the response from the Prediction center is received.. This application listens to the client specific Kafka topic, and once a prediction arrives, it looks into redis, fetches the matched request set, and saves everything into the database.

To sum up, in the new architecture, the preprocessing happens on the client’s side, the feature vectors are sent via Kafka to the Prediction centre (no sensitive data exchange), a prediction and a score for each request set is sent as a reply to each feature vector (via Kafka), and on the client side, another Spark job is waiting to consume the prediction message, match it with the respective request set, and save it to the database.

Read more about the project and download the source to try for yourself. Contact us for more information or to get help setting up Baskerville in your web environment.

  1. Home
  2. >
  3. DDoS
Categories
Advocacy DDoS Press Release

Deflect website security services for free in response to COVID-19

In response and solidarity with numerous efforts that have sprung up to help with communications, coordination and outreach during the COVID-19 epidemic, eQualitie is offering Deflect website security and content delivery services for free until the end of 2020 to organizations and individuals working to help others during this difficult time. This includes:

  • Availability: as demand for your content grows, our worldwide infrastructure will ensure that your website remains accessible and fast
  • Security: protecting your website from malicious bots and hackers
  • Hosting: for existing or new WordPress sites
  • Aanalytics: view real-time statistics in the Deflect dashboard

Deflect is always offered free of charge to not-for-profit entities that meet our eligibility requirements. This offer extends our free services to any business or individual that is responding to societal needs during the pandemic, including media organizations, government, online retail and hospitality services, etc. We will review all applications to make sure they align with Deflect’s Terms of Use.

It takes 15 minutes to set up and we’ll have you protected on the same day. Our support team can help you in English, French, Chinese, Spanish and Russian. If you have any questions please contact us.

  1. Home
  2. >
  3. DDoS
Categories
Advocacy DDoS Deflect Deflect Labs News from Deflect Labs Threat Intel

Deflect Labs Report #6: Phishing and Web Attacks Targeting Uzbek Human Right Activists and Independent Media

Key Findings

  • We’ve discovered infrastructure used to launch and coordinate attacks targeting independent media and human rights activists from Uzbekistan
  • The campaign has been active since early 2016, using web and phishing attacks to suppress and exploit their targets
  • We have no evidence of who is behind this campaign but the target list points to a new threat actor targeting Uzbek activists and media

Introduction

The Deflect project was created to protect civil society websites from web attacks, following the publication of “Distributed Denial of Service Attacks Against Independent Media and Human Rights Sites report by the Berkman Center for Internet & Society. During that time we’ve investigated many DDoS attacks leading to the publication of several reports.

The attacks leading to the publication of this report quickly stood out from the daily onslaught of malicious traffic on Deflect, at first because they were using professional vulnerability scanning tools like Acunetix. The moment we discovered that the origin server of these scans was also hosting fake gmail domains, it became evident that something bigger was going on here. In this report, we describe all the pieces put together about this campaign, with the hope to contribute to public knowledge about the methods and impact of such attacks against civil society.

Image

Context : Human Rights and Surveillance in Uzbekistan

Emblem of Uzbekistan (wikipedia)

Uzbekistan is defined by many human-rights organizations as an authoritarian state, that has known strong repression of civil society. Since the collapse of the Soviet Union, two presidents have presided over a system that institutionalized  torture and repressed freedom of expression, as documented over the years by Human Rights Watch, Amnesty International and Front Line Defenders, among many others. Repression extended to media and human rights activists in particular, many of whom had to leave the country and continue their work in diaspora.

Uzbekistan was one of the first to establish a pervasive Internet censorship infrastructure, blocking access to media and human rights websites. Hacking Team servers in Uzbekistan were identified as early as 2014 by the Citizen Lab. It was later confirmed that Uzbek National Security Service (SNB) were among the customers of Hacking Team solutions from leaked Hacking Team emails. A Privacy International report from 2015 describes the installation in Uzbekistan of several monitoring centers with mass surveillance capabilities provided by the Israeli branch of the US-based company Verint Systems and by the Israel-based company NICE Systems. A 2007 Amnesty International report entitled ‘We will find you anywhere’ gives more context on the utilisation of these capabilities, describing digital surveillance and targeted attacks against Uzbek journalists and human-right activists. Among other cases, it describes the unfortunate events behind the closure of uznews.net – an independent media website established by Galima Bukharbaeva in 2005 following the Andijan massacre. In 2014, she discovered that her email account had been hacked and information about the organization, including names and personal details journalists in Uzbekistan was published online. Galima is now the editor of Centre1, a Deflect client and one of the targets of this investigation.

A New Phishing and Web Attack Campaign

On the 16th of November 2018, we identified a large attack against several websites protected by Deflect. This attack used several professional security audit tools like NetSparker and WPScan to scan the websites eltuz.com and centre1.com.


Peak of traffic during the attack (16th of November 2018)

This attack was coming from the IP address 51.15.94.245 (AS12876 – Online AS but an IP range dedicated to Scaleway servers). By looking at older traffic from this same IP address, we found several cases of attacks on other Deflect protected websites, but we also found domains mimicking google and gmail domains hosted on this IP address, like auth.login.google.email-service[.]host or auth.login.googlemail.com.mail-auth[.]top. We looked into passive DNS databases (using the PassiveTotal Community Edition and other tools like RobTex) and crossed that information with attacks seen on Deflect protected websites with logging enabled. We uncovered a large campaign combining web and phishing attacks against media and activists. We found the first evidence of activity from this group in February 2016, and the first evidence of attacks in December 2017.

The list of Deflect protected websites chosen by this campaign, may give some context to the motivation behind them. Four websites were targeted:

  • Fergana News is a leading independent Russian & Uzbek language news website covering Central Asian countries
  • Eltuz is an independent Uzbek online media
  • Centre1 is an independent media organization covering news in Central Asia
  • Palestine Chronicle is a non-profit organization working on human-rights issues in Palestine

Three of these targets are prominent media focusing on Uzbekistan. We have been in contact with their editors and several other Uzbek activists to see if they had received phishing emails as part of this campaign. Some of them were able to confirm receiving such messages and forwarded them to us. Reaching out further afield we were able to get confirmations of phishing attacks from other prominent Uzbek activists who were not linked websites protected by Deflect.

Palestine Chronicle seems to be an outlier in this group of media websites focusing on Uzbekistan. We don’t have a clear hypothesis about why this website was targeted.

A year of web attacks against civil society

Through passive DNS, we identified three IPs used by the attackers in this operation :

  • 46.45.137.74 was used in 2016 and 2017 (timeline is not clear, Istanbul DC, AS197328)
  • 139.60.163.29 was used between October 2017 and August 2018 (HostKey, AS395839)
  • 51.15.94.245 was used between September 2018 and February 2019 (Scaleway, AS12876)

We have identified 15 attacks from the IPs 139.60.163.29 and 51.15.94.245 since December 2017 on Deflect protected websites:

DateIPTargetTools used
2017/12/17139.60.163.29eltuz.comWPScan
2018/04/12139.60.163.29eltuz.comAcunetix
2018/09/1551.15.94.245www.palestinechronicle.com eltuz.com www.fergana.info and uzbek.fergananews.comAcunetix and WebCruiser
2018/09/1651.15.94.245www.fergana.infoAcunetix
2018/09/1751.15.94.245www.fergana.infoAcunetix
2018/09/1851.15.94.245www.fergana.infoNetSparker and Acunetix
2018/09/1951.15.94.245eltuz.comNetSparker
2018/09/2051.15.94.245www.fergana.infoAcunetix
2018/09/2151.15.94.245www.fergana.infoAcunetix
2018/10/0851.15.94.245eltuz.com, www.fergananews.com and news.fergananews.comUnknown
2018/11/1651.15.94.245eltuz.com, centre1.com and en.eltuz.comNetSparker and WPScan
2019/01/1851.15.94.245eltuz.comWPScan
2019/01/1951.15.94.245fergana.info www.fergana.info and fergana.agencyUnknown
2019/01/3051.15.94.245eltuz.com and en.eltuz.comUnknown
2019/02/0551.15.94.245fergana.infoAcunetix

Besides classic open-source tools like WPScan, these attacks show the utilization of a wide range of commercial security audit tools, like NetSparker or Acunetix. Acunetix offers a trial version that may have been used here, NetSparker does not, showing that the operators may have a consistent budget (standard offer is $4995 / year, a cracked version may have been used).

It is also surprising to see so many different tools coming from a single server, as many of them require a Graphical User Interface. When we scanned the IP 51.15.94.245, we discovered that it hosted a Squid proxy on port 3128, we think that this proxy was used to relay traffic from the origin operator computer.

Extract of nmap scan of 51.15.94.245 in December 2018 :

3128/tcp  open     http-proxy Squid http proxy 3.5.23
|_http-server-header: squid/3.5.23
|_http-title: ERROR: The requested URL could not be retrieved

A large phishing campaign

After discovering a long list of domains made to resemble popular email providers, we suspected that the operators were also involved in a phishing campaign. We contacted owners of targeted websites, along with several Uzbek human right activists and gathered 14 different phishing emails targeting two activists between March 2018 and February 2019 :

DateSenderSubjectLink
12th of March 2018g.corp.sender[@]gmail.comУ Вас 2 недоставленное сообщение (You have 2 undelivered message)http://mail.gmal.con.my-id[.]top/
13th of June 2018service.deamon2018[@]gmail.comПрекращение предоставления доступа к сервису (Termination of access to the service)http://e.mail.gmall.con.my-id[.]top/
18th of June 2018id.warning.users[@]gmail.comВаш новый адрес в Gmail: alexis.usa@gmail.com (Your new email address in Gmail: alexis.usa@gmail.com)http://e.mail.users.emall.com[.]my-id.top/
10th of July 2018id.warning.daemons[@]gmail.comПрекращение предоставления доступа к сервису (Termination of access to the service)hxxp://gmallls.con-537d7.my-id[.]top/
10th of July 2018id.warning.daemons[@]gmail.comПрекращение предоставления доступа к сервису (Termination of access to the service)http://gmallls.con-4f137.my-id[.]top/
18th of July 2018service.deamon2018[@]gmail.com[Ticket#2011031810000512] – 3 undelivered messageshttp://login-auth-goglemail-com-7c94e3a1597325b849e26a0b45f0f068.my-id[.]top/
2nd of August 2018id.warning.daemon.service[@]gmail.com[Important Reminder] Review your data retention settingsNone
16th of October 2018lolapup.75[@]gmail.comЭкс-хоким Ташкента (Ex-hokim of Tashkent)http://office-online-sessions-3959c138e8b8078e683849795e156f98.email-service[.]host/
23rd of October 2018noreply.user.info.id[@]gmail.comВаш аккаунт будет заблокировано (Your account will be blocked.)http://gmail-accounts-cb66d53c8c9c1b7c622d915322804cdf.email-service[.]host/
25th of October 2018warning.service.suspended[@]gmail.comВаш аккаунт будет заблокировано. (Your account will be blocked.)http://gmail-accounts-bb6f2dfcec87551e99f9cf331c990617.email-service[.]host/
18th of February 2019service.users.blocked[@]gmail.comВажное оповещение системы безопасности (Important Security Alert)http://id-accounts-blocked-ac5a75e4c0a77cc16fe90cddc01c2499.myconnection[.]website/
18th of February 2019mail.suspend.service[@]gmail.comОповещения системы безопасности (Security Alerts)http://id-accounts-blocked-326e88561ded6371be008af61bf9594d.myconnection[.]website/
21st of February 2019service.users.blocked[@]gmail.comВаш аккаунт будет заблокирован. (Your account will be blocked.)http://id-accounts-blocked-ffb67f7dd7427b9e4fc4e5571247e812.myconnection[.]website/
22nd of February 2019service.users.blocked[@]gmail.comПрекращение предоставления доступа к сервису (Termination of access to the service)http://id-accounts-blocked-c23102b28e1ae0f24c9614024628e650.myconnection[.]website/

Almost all these emails were mimicking Gmail alerts to entice the user to click on the link. For instance this email received on the 23rd of October 2018 pretends that the account will be closed soon, using images of the text hosted on imgur to bypass Gmail detection :

The only exception was an email received on the 16th of October 2018 pretending to give confidential information on the former Hokim (governor) of Tashkent :

Emails were using simple tricks to bypass detection, at times drw.sh url shortener (this tool belongs to a Russian security company Doctor Web) or by using open re-directions offered in several Google tools.

Every email we have seen used a different sub-domain, including emails from the same Gmail account and with the same subject line. For instance, two different emails entitled “Прекращение предоставления доступа к сервису” and sent from the same address used hxxp://gmallls.con-537d7.my-id[.]top/ and http://gmallls.con-4f137.my-id[.]top/ as phishing domains. We think that the operators used a different sub-domain for every email sent in order to bypass Gmail list of known malicious domains. This would explain the large number of sub-domains identified through passive DNS. We have identified 74 sub-domains for 26 second-level domains used in this campaign (see the appendix below for  full list of discovered domains).

We think that the phishing page stayed online only for a short time after having sent the email in order to avoid detection. We got access to the phishing page of a few emails. We could confirm that the phishing toolkit checked if the password is correct or not (against the actual gmail account) and suspect that they implemented 2 Factor authentication for text messages and 2FA applications, but could not confirm this.

Timeline for the campaign

We found the first evidence of activity in this operation with the registration of domain auth-login[.]com on the 21st of February 2016. Because we discovered the campaign recently, we have little information on attacks during  2016 and 2017, but the domain registration date shows some activity in July and December 2016, and then again in August and October 2017. It is very likely that the campaign started in 2016 and continued in 2017 without any public reporting about it.

Here is a first timeline we obtained based on domain registration dates and dates of web attacks and phishing emails :

To confirm that this group had some activity during  2016 and 2017, we gathered encryption (TLS) certificates for these domains and sub-domains from the crt.sh Certificate Transparency Database. We identified 230 certificates generated for these domains, most of them created by Cloudfare. Here is a new timeline integrating the creation of TLS certificates :

We see here many certificates created since December 2016 and continuing over 2017, which shows that this group had some activity during that time. The large number of certificates over 2017 and 2018 comes from campaign operators using Cloudflare for several domains. Cloudflare creates several short-lived certificates at the same time when protecting a website.

It is also interesting to note that the campaign started in February 2016, with some activity in the summer of 2016, which happens to when the former Uzbek president Islam Karimov died, news first reported by Fergana News, one of the targets of this attack campaign.

Infrastructure Analysis

We identified domains and subdomains of this campaign through analysis of passive DNS information, using mostly the Community access of PassiveTotal. Many domains in 2016/2017 reused the same registrant email address, b.adan1@walla.co.il, which helped us identify other domains related to this campaign :

Based on this list, we identified subdomains and IP addresses associated with them, and discovered three IP addresses used in the operation. We used Shodan historical data and dates of passive DNS data to estimate the timeline of the utilisation of the different servers :

  • 46.45.137.74 was used in 2016 and 2017
  • 139.60.163.29 was used between October 2017 and August 2018
  • 51.15.94.245 was used between September and February 2019

We have identified 74 sub-domains for 26 second-level domains used in this campaign (see the appendix for a full list of IOCs). Most of these domains are mimicking Gmail, but there are also domains mimicking Yandex (auth.yandex.ru.my-id[.]top), mail.ru (mail.ru.my-id[.]top) qip.ru (account.qip.ru.mail-help-support[.]info), yahoo (auth.yahoo.com.mail-help-support[.]info), Live (login.live.com.mail-help-support[.]info) or rambler.ru (mail.rambler.ru.mail-help-support[.]info). Most of these domains are sub-domains of a few generic second-level domains (like auth-mail.com), but there are a few specific second-level domains that are interesting :

  • bit-ly[.]host mimicking bit.ly
  • m-youtube[.]top and m-youtube[.]org for Youtube
  • ecoit[.]email which could mimick https://www.ecoi.net
  • pochta[.]top likely mimick https://www.pochta.ru/, the Russian Post website
  • We have not found any information on vzlom[.]top and fixerman[.]top. Vzlom means “break into” in Russian, so it could have hosted or mimicked a security website

A weird Cyber-criminality Nexus

It is quite unusual to see connections between targeted attacks and cyber-criminal enterprises, however during this investigation we encountered two such links.

The first one is with the domain msoffice365[.]win which was registered by b.adan1@walla.co.il (as well as many other domains from this campaign) on the 7th of December 2016. This domain was identified as a C2 server for a cryptocurrency theft tool called Quant, as described in this Forcepoint report released in December 2017. Virus Total confirms that this domain hosted several samples of this malware in November 2017 (it was registered for a year). We have not seen any malicious activity from this domain related to our campaign, but as explained earlier, we have marginal access to the group’s activity in 2017.

The second link we have found is between the domain auth-login[.]com and the groups behind the Bedep trojan and the Angler exploit kit. auth-login[.]com was linked to this operation through the subdomain login.yandex.ru.auth-login[.]com that fit the pattern of long subdomains mimicking Yandex from this campaign and it was hosted on the same IP address 46.45.137.74 in March and April 2016 according to RiskIQ. This domain was registered in February 2016 by yingw90@yahoo.com (David Bowers from Grovetown, GA in the US according to whois information). This email address was also used to register hundreds of domains used in a Bedep campaign as described by Talos in February 2016 (and confirmed by several other reports). Angler exploit kit is one of the most notorious exploit kit, that was commonly used by cyber-criminals between 2013 and 2016. Bedep is a generic backdoor that was identified in 2015, and used almost exclusively with the Angler exploit kit. It should be noted that Trustwave documented the utilization of Bedep in 2015 to increase the number of views of pro-Russian propaganda videos.

Even if we have not seen any utilisation of these two domains in this campaign, these two links seem too strong to be considered cirmcumstantial. These links could show a collaboration between cyber-criminal groups and state-sponsored groups or services. It is interesting to remember the potential involvement of Russian hacking groups in attacks on Uznews.net editor in 2014, as described by Amnesty international.

Taking Down Servers is Hard

When the attack was discovered, we decided to investigate without sending any abuse requests, until a clearer picture of the campaign emerged. In January, we decided that we had enough knowledge of the campaign and started to send abuse requests – for fake Gmail addresses to Google and for the URL shorteners to Doctor Web. We did not receive any answer but noticed that the Doctor Web URLs were taken down a few days after.

Regarding the Scaleway server, we entered into an unexpected loop with their abuse process.  Scaleway operates by sending the abuse request directly to the customer and then asks them for confirmation that the issue has been resolved. This process works fine in the case of a compromised server, but does not work when the server was rented intentionally for malicious activities. We did not want to send an abuse request because it would have involved giving away information to the operators. We contacted Scaleway directly and it took some time to find the right person on the security team. They acknowledged the difficulty of having an efficient Abuse Process, and after we sent them an anonymized version of this report along with proof that phishing websites were hosted on the server, they took down the server around the 25th of January 2019.

Being an infrastructure provider, we understand the difficulty of dealing with abuse requests. For a lot of hosting providers, the number of requests is what makes a case urgent or not. We encourage hosting providers to better engage with organisations working to protect Civil Society and establish trust relationships that help quickly mitigate the effects of malicious campaigns.

Conclusion

In this report, we have documented a prolonged phishing and web attack campaign focusing on media covering Uzbekistan and Uzbek human right activists. It shows that once again, digital attacks are a threat for human-right activists and independent media. There are several threat actors known to use both phishing and web attacks combined (like the Vietnam-related group Ocean Lotus), but this campaign shows a dual strategy targeting civil society websites and their editors at the same time.

We have no evidence of government involvement in this operation, but these attacks are clearly targeted on prominent voices of Uzbek civil society. They also share strong similarities with the hack of Uznews.net in 2014, where the editor’s mailbox was compromised through a phishing email that appeared as a notice from Google warning her that the account had been involved in distributing illegal pornography.

Over the past 10 years, several organisations like the Citizen Lab or Amnesty International have dedicated lots of time and effort to document digital surveillance and targeted attacks against Civil Society. We hope that this report will contribute to these efforts, and show that today, more than ever, we need to continue supporting civil society against digital surveillance and intrusion.

Counter-Measures Against such Attacks

If you think you are targeted by similar campaigns, here is a list of recommendations to protect yourself.

Against phishing attacks, it is important to learn to recognize classic phishing emails. We give some examples in this report, but you can read other similar reports by the Citizen Lab. You can also read this nice explanation by NetAlert and practice with this Google Jigsaw quizz. The second important point is to make sure that you have configured 2-Factor Authentication on your email and social media accounts. Two-Factor Authentication means using a second way to authenticate when you log-in besides your password. Common second factors include text messages, temporary password apps or hardware tokens. We recommend using either temporary password apps (like Google AuthenticatorFreeOTP) or Hardware Keys (like YubiKeys). Hardware keys are known to be more secure and strongly recommended if you are an at-risk activist or journalist.

Against web attacks, if you are using a CMS like WordPress or Drupal, it is very important to update both the CMS and its plugins very regularly, and avoid using un-maintained plugins (it is very common to have websites compromised because of outdated plugins). Civil society websites are welcome to apply to Deflect for free website protection.

Appendix

Acknowledgement

We would like to thank Front Line Defenders and Scaleway for their help. We would also like to thank ipinfo.io and RiskIQ for their tools that helped us in the investigation.

Indicators of Compromise

Top level domains :

email-service.host
email-session.host
support-email.site
support-email.host
email-support.host
myconnection.website
ecoit.email
my-cabinet.com
my-id.top
msoffice365-online.org
secretonline.top
m-youtube.top
auth-mail.com
mail-help-support.info
mail-support.info
auth-mail.me
auth-login.com
email-x.com
auth-mail.ru
mail-auth.top
msoffice365.win
bit-ly.host
m-youtube.org
vzlom.top
pochta.top
fixerman.top

You can find a full list of indicators on github : https://github.com/equalitie/deflect_labs_6_indicators

  1. Home
  2. >
  3. DDoS
Categories
DDoS Deflect Deflect Labs

Deflect Labs Report #5 – Baskerville

Using Machine Learning to Identify Cyber Attacks

The Deflect platform is a free website security service defending civil society and human rights groups from digital attack. Currently, malicious traffic is identified on the Deflect network by Banjax, a system that uses handwritten rules to flag IPs that are behaving like attacking bots, so that they can be challenged or banned. While Banjax is successful at identifying the most common bruteforce cyber attacks, the approach of using a static set of rules to protect against the constantly evolving tools available to attackers is fundamentally limited. Over the past year, the Deflect Labs team has been working to develop a machine learning module to automatically identify malicious traffic on the Deflect platform, so that our mitigation efforts can keep pace with the methods of attack as these grow in complexity and sophistication.

In this report, we look at the performance of the Deflect Labs’ new anomaly detection tool, Baskerville, in identifying a selection of the attacks seen on the Deflect platform during the last year. Baskerville is designed to consume incoming batches of web logs (either live from a Kafka stream, or from Elasticsearch storage), group them into request sets by host website and IP, extract the browsing features of each request set, and make a prediction about whether the behaviour is normal or not. At its core, Baskerville currently uses the Scikit-Learn implementation of the Isolation Forest anomaly detection algorithm to conduct this classification, though the engine is agnostic to the choice of algorithm and any trained Scikit-Learn classifier can be used in its place. This model is trained on normal web traffic data from the Deflect platform, and evaluated using a suite of offline tools incorporated in the Baskerville module. Baskerville has been designed in such a way that once the performance of the model is sufficiently strong, it can be used for real-time attack alerting and mitigation on the Deflect platform.

To showcase the current capabilities of the Baskerville module, we have replayed the attacks covered in the 2018 Deflect Labs report: Attacks Against Vietnamese Civil Society, passing the web logs from these incidents through the processing and prediction engine. This report was chosen for replay because of the variety of attacks seen across its constituent incidents. There were eight attacks in total considered in this report, detailed in the table below.

Date Start (approx.) Stop (approx.) Target
2018/04/17 08:00 10:00 viettan.org
2018/04/17 08:00 10:00 baotiengdan.com
2018/05/04 00:00 23:59 viettan.org
2018/05/09 10:00 12:30 viettan.org
2018/05/09 08:00 12:00 baotiengdan.com
2018/06/07 01:00 05:00 baotiengdan.com
2018/06/13 03:00 08:00 baotiengdan.com
2018/06/15 13:00 23:30

baotiengdan.com

Table 1: Attack time periods covered in this report. The time period of each attack was determined by referencing the number of Deflect and Banjax logs recorded for each site, relative to the normal traffic volume.

How does it work?

Given one request from one IP, not much can be said about whether or not that user is acting suspiciously, and thus how likely it is that they are a malicious bot, as opposed to a genuine user. If we instead group together all the requests to a website made by one IP over time, we can begin to build up a more complete picture of the user’s browsing behaviour. We can then train an anomaly detection algorithm to identify any IPs that are behaving outside the scope of normal traffic.

The boxplots below illustrate how the behaviour during the Vietnamese attack time periods differs from that seen during an average fortnight of requests to the same sites. To describe the browsing behaviour, 17 features (detailed in the Baskerville documentation) have been extracted based on the request sets (note that the feature values are scaled relative to average distributions, and do not have a physical interpretation). In particular, it can be seen that these attack time periods stand out by having far fewer unique paths requested (unique_path_to_request_ratio), a shorter average path depth (path_depth_average), a smaller variance in the depth of paths requested (path_depth_variance), and a lower payload size (payload_size_log_average). By the ‘path depth’, we mean the number of slashes in the requested URL (so ‘website.com’ has a path depth of zero, and ‘website.com/page1/page2’ has a path depth of two), and by ‘payload size’ we mean the size of the request response in bytes.

Figure 1: The distributions of the 17 scaled feature values during attack time periods (red) and non-attack time periods (blue). It can be seen that the feature distributions are notably different during the attack and non-attack periods.

The separation between the attack and non-attack request sets can be nicely visualised by projecting along the feature dimensions identified above. In the three-dimensional space defined by the average path depth, the average log of the payload size, and the unique path to request ratio, the request sets identified as malicious by Banjax (red) are clearly separated from those not identified as malicious (blue).

Figure 2: The distribution of request sets along three of the 17 feature dimensions for IPs identified as malicious (red) or benign (blue) by the existing banning module, Banjax. The features shown are the average path depth, the average log of the request payload size, and the ratio of unique paths to total requests, during each request set. The separation between the malicious (red) and benign (blue) IPs is evident along these dimensions.

Training a Model

A machine learning classifier enables us to more precisely define the differences between normal and abnormal behaviour, and predict the probability that a new request set comes from a genuine user. For this report, we chose to train an Isolation Forest; an algorithm that performs well on novelty detection problems, and scales for large datasets.

As an anomaly detection algorithm, the Isolation Forest took as training data all the traffic to the Vietnamese websites over a normal two-week period. To evaluate its performance, we created a testing dataset by partitioning out a selection of this data (assumed to represent benign traffic), and combining this with the set of all requests coming from IPs flagged by the Deflect platform’s current banning tool, Banjax (assumed to represent malicious traffic). There are a number of tunable parameters in the Isolation Forest algorithm, such as the number of trees in the forest, and the assumed contamination with anomalies of the training data. Using the testing data, we performed a gridsearch over these parameters to optimize the model’s accuracy.

Replaying the Attacks

The model chosen for use in this report has a precision of 0.90, a recall of 0.86, and a resultant f1 score of 0.88, when evaluated on the testing dataset formulated from the Vietnamese website traffic, described above. If we take the Banjax bans as absolute truth (which is almost certainly not the case), this means that 90% of the IPs predicted as anomalous by Baskerville were also flagged by Banjax as malicious, and that 88% of all the IPs flagged by Banjax as malicious were also identified as anomalous by Baskerville, across the attacks considered in the Vietnamese report. This is demonstrated visually in the graph below, which shows the overlap between the Banjax flag and the Baskerville prediction (-1 indicates malicious, and +1 indicates benign). It can be seen that Baskerville identifies almost all of the IPs picked up by Banjax, and additionally flags a fraction of the IPs not banned by Banjax.

Figure 3: The overlap between the Banjax results (x-axis) and the Baskerville prediction results (colouring). Where the Banjax flag is -1 and the prediction colour is red, both Banjax and Baskerville agree that the request set is malicious. Where the Banjax flag is +1 and the prediction colour is blue, both modules agree that the request set is benign. The small slice of blue where the Banjax flag is -1, and the larger red slice where the Banjax flag is +1, indicate request sets about which the modules do not agree.

The performance of the model can be broken down across the different attack time periods. The grouped bar chart below compares the number of Banjax bans (red) to the number of Baskerville anomalies (green). In general, Baskerville identifies a much greater number of request sets as being malicious than Banjax does, with the exception of the 17th April attack, for which Banjax picked up slightly more IPs than Baskerville. The difference between the two mitigation systems is particularly pronounced on the 13th and 15th June attacks, for which Banjax scarcely identified any malicious IPs at all, but Baskerville identified a high proportion of malicious IPs.

Figure 4: The verdicts of Banjax (left columns) and Baskerville (right columns) across the 6 attack periods. The red/green components show the number of request sets that Banjax/Baskerville labelled as malicious, while the blue/purple components show the number that they labelled as benign. The fact that the green bars are almost everywhere higher than the red bars indicates that Baskerville picks up more traffic as malicious than does Banjax.

This analysis highlights the issue of model validation. It can be seen that Baskerville is picking up more request sets as being malicious than Banjax, but does this indicate that Baskerville is too sensitive to anomalous behaviour, or that Baskerville is outperforming Banjax? In order to say for sure, and properly evaluate Baskerville’s performance, a large testing set of labelled data is needed.

If we look at the mean feature values across the different attacks, it can be seen that the 13th and 15th June attacks (the red and blue dots, respectively, in the figure below) stand out from the normal traffic in that they have a much lower than normal average path depth (path_depth_average), and a much higher than normal 400-code response rate (response4xx_to_request_ratio), which may have contributed to Baskerville identifying a large proportion of their constituent request sets as malicious. Since a low average path depth (e.g. lots of requests made to ‘/’) and a high 400 response code rate (e.g. lots of requests to non-existent pages) are indicative of an IP behaving maliciously, this may suggest that Baskerville’s predictions were valid in these cases. But more labelled data is required for us to be certain about this evaluation.

Figure 5: Breakdown of the mean feature values during the two attack periods (red, blue) for which Baskerville identified a high proportion of malicious IPs, but Banjax did not. These are compared to the mean feature values during a normal two-week period (green).

Putting Baskerville into Action

Replaying the Vietnamese attacks demonstrates that it is possible for the Baskerville engine to identify cyber attacks on the Deflect platform in real time. While Banjax mitigates attacks using a set of static human-written rules describing what abnormal traffic looks like, by comprehensively describing how normal traffic behaves, the Baskerville classifier is able to identify new types of malicious behaviour that have never been seen before.

Although the performance of the Isolation Forest in identifying the Vietnamese attacks is promising, we would require a higher level of accuracy before the Baskerville engine is used to automatically ban IPs from accessing Deflect websites. The model’s accuracy can be improved by increasing the amount of data it is trained on, and by performing additional feature engineering and parameter tuning. However, to accurately assess its skill, we require a large set of labelled testing data, more complete than what is offered by Banjax logs. To this end, we propose to first deploy Baskerville in a developmental stage, during which IPs that are suspected to be malicious will be served a Captcha challenge rather than being absolutely banned. The results of these challenges can be added to the corpus of labelled data, providing feedback on Baskerville’s performance.

In addition to the applications of Baskerville for attack mitigation on the Deflect platform, by grouping incoming logs by host and IP into request sets, and extracting features from these request sets, we have created a new way to visualise and analyse attacks after they occur. We can compare attacks not just by the IPs involved, but also by the type of behaviour displayed. This opens up new possibilities for connecting disparate attacks, and investigating the agents behind them.

Where Next?

The proposed future of Deflect monitoring is the Deflect Labs Information Sharing and Analysis Centre (DL-ISAC). The underlying idea behind this project, summarised in the schematic below, is to split the Baskerville engine into separate User Module and Clearinghouse components (dealing with log processing and model development, respectively), to enable a complete separation of personal data from the centralised modelling. Users would process their own web logs locally, and send off feature vectors (devoid of IP and host site details) to receive a prediction. This allows threat-sharing without compromising personally identifiable information (PII). In addition, this separation would enable the adoption of the DL-ISAC by a much broader range of clients than the Deflect-hosted websites currently being served. Increasing the user base of this software will also increase the amount of browsing data we are able to collect, and thus the strength of the models we are able to train.

Baskerville is an open-source project, with its first release scheduled next quarter. We hope this will represent the first step towards enabling a new era of crowd-sourced threat information sharing and mitigation, empowering internet users to keep their content online in an increasingly hostile web environment.

Figure 6: A schematic of the proposed structure of the DL-ISAC. The infrastructure is split into a log-processing user endpoint, and a central clearinghouse for prediction, analysis, and model development.

A Final Word: Bias in AI

In all applications of machine learning and AI, it is important to consider sources of algorithmic bias, and how marginalised users could be unintentionally discriminated against by the system. In the context of web traffic, we must take into account variations in browsing behaviour across different subgroups of valid, non-bot internet users, and ensure that Baskerville does not penalise underrepresented populations. For instance, checks should be put in place to prevent disadvantaged users with slower internet connections from being banned because their request behaviour differs from those users that benefit from high-speed internet. The Deflect Labs team is committed to prioritising these considerations in the future development of the DL-ISAC.