Web Scraping Fraud is Real, With Dire Consequences

The online landscape is shifting dramatically, with scraping fraud ascending to the top of the list of bot-inflicted damage on legitimate businesses. Yet many companies remain confused about the practice of web scraping. Is it legitimate? Is it illegal? One thing is certain: it is becoming a big problem for many businesses as more fraudsters realize the return on investment from launching automated web scraping attacks, especially given the abundance of inexpensive tools that make these malicious efforts easier and cheaper to execute.

Scraped data is being taken and used for competitive or other motives without the scraped company benefitting in any way. In fact, the impact on companies targeted by scraping can be devastating to revenue and business operations.

Companies big and small are struggling to understand the extent to which their business is being affected. What’s clear is that scraping fraud hurts revenue, impacts company valuation, and skews important analytics for many companies.

Business leaders with significant online-driven revenue—financial services, retail, travel and hospitality, gaming, real estate, and other industries—need to make sure they understand why and how their business is likely being attacked by automated scrapers and what they can do to protect their data and their bottom line. The following questions and answers can help start this discovery.

Question #1: Who is scraping web content and why?

The short answer is everyone. Chances are good that for many online businesses, more than half of their website traffic is non-human, with content scraping and price scraping bots making up a significant portion of that traffic. Content scraping software (do-it-yourself) and services (software-as-a-service or data-as-a-service) are big business today. Most vendors in this space offer what’s known as “web data extraction.” These companies extract web data for everything from clothing to electronics, job postings to hotel room availability, cars to real estate listings, and many other types of content.

The data, images, prices, and other information on websites are considered fair game. What do companies use this scraped data for? The list of potential uses is lengthy, but some of the most malicious intents include stock price manipulation, price undercutting, SEO manipulation, data theft, and brand damage.

Question #2: How are they scraping my data?

Automated bots “scrape,” or extract, data from websites using tools that are readily available on the internet at very low prices. Scraping targets can include full pages, prices, web data, contact information, and other types of data.
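To make the extraction step concrete, here is a minimal sketch in Python of how a bot might pull prices out of a product page once it has downloaded the HTML. The page markup, the `price` class name, and the `PriceExtractor` and `extract_prices` names are all assumptions made for illustration; real scraping tools are considerably more elaborate.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collects the text of every element whose class list includes 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        # Any closing tag ends the current price element in this simple sketch.
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


# Hypothetical product listing; a real bot would fetch this over HTTP.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Running shoes</span> <span class="price">$89.99</span></li>
  <li><span class="name">Rain jacket</span> <span class="price">$129.00</span></li>
</ul>
"""


def extract_prices(html):
    parser = PriceExtractor()
    parser.feed(html)
    return parser.prices
```

Calling `extract_prices(SAMPLE_PAGE)` returns the two price strings from the sample listing. Run in a loop across thousands of product pages, a few dozen lines like these are enough to copy an entire catalog, which is why the barrier to entry is so low.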

Bots can fill shopping carts to discover total pricing, including shipping and discounts. They then abandon those carts, leaving merchandise tied up in purchases that will never be completed, which leads to poor inventory control.

Question #3: Isn’t content scraping illegal?

Most companies are not comfortable with their data being extracted and monetized by third parties, nor do they want to pay the additional computing costs incurred by the constant scraping activity of automated bots on their websites. However, content scraping remains a legal gray area.

  • Three billion automated bot attacks happened in the last six months of 2018, with 189 million of them originating from mobile devices. (1)
  • If your competitors scrape and re-publish your content, they can potentially rank higher on search engine results, reducing your volume of web traffic.
  • 207 million Venmo financial transactions have been scraped. (2)

Question #4: Why should we worry about scraping bots on our site?

The bottom line is, web scraping can hurt your business. If competitors are using data to gain a competitive advantage, your business is losing customers and revenue. When scraper bots fill shopping carts and then abandon them, it ties up inventory and prevents legitimate customers from purchasing your products.

In a different scenario, if competitors scrape and then publish your content on their website, they can potentially rank higher on search engine results, reducing the volume of search-generated traffic and limiting the ability to convert those shoppers into customers. Lower ad revenue is another major financial loss caused by scrapers stealing data.

Even companies that claim to obey a website’s terms of service, relevant copyright laws, and otherwise act in an ethical way often have a negative impact on the websites they target. They can create excessive load on the websites, slowing down response times and negatively affecting the experience of legitimate customers.

Question #5: How can we prevent scrapers on our site?

While updating terms and conditions to prohibit web scraping is a good start, it won’t stop most automated bots from stealing data. And unless businesses are willing to dedicate an army of people to constantly monitor and defend their websites, their security teams need an advanced bot mitigation solution that is easy to use and automates as much of the effort as possible, reducing the burden on staff and lowering the cost of ownership.

How Kasada Can Help

Kasada is a digital traffic integrity solution that protects your company against the damaging, often underestimated effects of malicious automation. Unlike alternative solutions that provide incomplete, easy-to-detect, and inefficient bot mitigation tools (which are not only costly to deploy and maintain but also add friction and latency to the user experience), Kasada:

  1. Makes bots, not humans, do the work, by cleverly deterring synthetic traffic with a cryptographic challenge that makes it arduous and expensive for bots to continue their attacks, while remaining imperceptible to (and requiring no action from) end users.
  2. Is extremely efficient, can be implemented within minutes, and demonstrates clear ROI across multiple departments.
  3. Is highly effective, delivering the best detection and lowest false positive rates on the market today.
  4. Operates as a managed service, offering embedded, immersive 24/7 customer support via an “always on” chat channel, putting no extra maintenance burden on your internal team.
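
The idea behind a cryptographic challenge can be illustrated with a generic hashcash-style proof of work. The sketch below is not Kasada's implementation, which this article does not describe; the function names, the SHA-256 hash, and the difficulty setting are all assumptions chosen for illustration. The server issues a random challenge, the client must search for a counter whose hash has a required number of leading zero bits, and the server checks the answer with a single hash.

```python
import hashlib
import secrets


def issue_challenge():
    """Server side: generate a fresh random challenge string."""
    return secrets.token_hex(16)


def leading_zero_bits(digest):
    """Count the zero bits at the start of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge, counter, difficulty_bits):
    """Server side: a single hash is enough to check a submitted answer."""
    digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge, difficulty_bits):
    """Client side: brute-force search, about 2**difficulty_bits hashes on average."""
    counter = 0
    while not verify(challenge, counter, difficulty_bits):
        counter += 1
    return counter
```

The asymmetry is the point of the design: with `difficulty_bits` set to 12, for example, `solve` needs roughly four thousand hashes on average while `verify` needs one. A single visitor pays that cost once and never notices it, but a bot issuing millions of requests pays it on every one.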

What Makes Kasada Different?

Kasada has been leading the fight with novel approaches and cloud-based technology to detect and mitigate the maelstrom of malicious traffic that other security platforms can’t:

  • Stops attacks from the first page load request
  • Offers time-to-value within 30 minutes
  • Challenges the economics of automated attacks, making the cost of the attack higher than the value of the target
  • Protects against login fraud (credential stuffing, account takeover, fake users) and web scraping fraud (data theft, illegitimate or competitive data scraping)
  • Continuously defends against malicious automation with methods that effectively exhaust bot operators’ CPU resources

Would you like to learn how Kasada can help your business defeat automated attacks like web scraping? Request a demo today.

(1) Source: ThreatMetrix Cybercrime Report 2H 2018

(2) “Millions of Venmo Transactions Scraped in Warning Over Privacy Settings,” Zack Whittaker, TechCrunch, June 16, 2019
