Developer(s) | Zyte (formerly Scrapinghub) |
---|---|
Initial release | 26 June 2008 |
Stable release | 2.11.2 [1] / 14 May 2024 |
Repository | |
Written in | Python |
Operating system | Windows, macOS, Linux |
Type | Web crawler |
License | BSD License |
Website | scrapy |
Scrapy (/ˈskreɪpaɪ/ [2] SKRAY-peye) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data through APIs or as a general-purpose web crawler. [3] It is currently maintained by Zyte (formerly Scrapinghub), a web-scraping development and services company.
Scrapy's project architecture is built around "spiders", self-contained crawlers that are each given a set of instructions. Following the spirit of other don't-repeat-yourself frameworks, such as Django, [4] Scrapy makes it easier to build and scale large crawling projects by allowing developers to reuse their code.
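The spider idea can be illustrated with a small standard-library sketch. This is not Scrapy's own API: the `Spider` base class, the `crawl` method, the `FAKE_WEB` dictionary, and the `example.test` URLs are all hypothetical names chosen for this example, and pages are served from memory rather than fetched over the network. The sketch only aims to show how a shared crawl loop lets each spider consist of little more than its starting URLs and its parsing instructions.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page title, outgoing links).
FAKE_WEB = {
    "http://example.test/": ("Home", ["http://example.test/a", "http://example.test/b"]),
    "http://example.test/a": ("Page A", ["http://example.test/b"]),
    "http://example.test/b": ("Page B", []),
}


class Spider:
    """Reusable crawl loop; subclasses supply only the instructions."""

    start_urls = []

    def parse(self, url, page):
        """Yield dicts (extracted items) or strings (URLs to follow)."""
        raise NotImplementedError

    def crawl(self, fetch):
        queue = deque(self.start_urls)
        seen = set(self.start_urls)
        items = []
        while queue:
            url = queue.popleft()
            for result in self.parse(url, fetch(url)):
                if isinstance(result, dict):
                    items.append(result)   # extracted data
                elif result not in seen:
                    seen.add(result)       # new link: schedule it once
                    queue.append(result)
        return items


class TitleSpider(Spider):
    """Collects the title of every reachable page."""

    start_urls = ["http://example.test/"]

    def parse(self, url, page):
        title, links = page
        yield {"url": url, "title": title}
        yield from links


class LinkCountSpider(Spider):
    """Reuses the same crawl loop with different instructions."""

    start_urls = ["http://example.test/"]

    def parse(self, url, page):
        _title, links = page
        yield {"url": url, "links": len(links)}
        yield from links


titles = TitleSpider().crawl(FAKE_WEB.__getitem__)
link_counts = LinkCountSpider().crawl(FAKE_WEB.__getitem__)
print([item["title"] for item in titles])
print([item["links"] for item in link_counts])
```

Both spiders share the queueing and de-duplication logic in the base class and differ only in their `parse` method, which is the kind of code reuse the framework's design is meant to encourage.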
Well-known companies and products using Scrapy include Lyst, [5] [6] Parse.ly, [7] Sayone Technologies, [8] Sciences Po Medialab, [9] and Data.gov.uk’s World Government Data site. [10]
Scrapy originated at the London-based web-aggregation and e-commerce company Mydeco, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release was in August 2008 under the BSD license, and the milestone 1.0 release followed in June 2015. [11] In 2011, Zyte (formerly Scrapinghub) became the new official maintainer. [12] [13]