Overview
Why does skrape{it} provide its own http client implementations?
Skrape{it} offers a unified, intuitive, DSL-controlled way to make parsing websites as comfortable as possible.
An HTTP request is as easy as the example below. Just call the skrape
function wherever you want in your code. It requires you to pass a fetcher and makes further request options available in the closure.
skrape(HttpFetcher) { // <-- pass any Fetcher, e.g. HttpFetcher, BrowserFetcher, ...
    // ... request options go here; the most basic one is url
    url = "https://docs.skrape.it"
    expect {}
    extract {}
}
The Different Fetchers
Skrape{it} provides different types of fetchers (i.e. HTTP clients) that can be passed to its DSL. All of them execute HTTP requests, but each handles a different use case.
You want to scrape a simple HTML page, as fast as possible, with JavaScript deactivated? Use the HttpFetcher.
You want to scrape a complex website, maybe a SPA written with frameworks like React.js, Angular, or Vue.js, or one that relies heavily on JavaScript? Use the BrowserFetcher.
You want to scrape multiple HTML pages in parallel from inside a coroutine? Use the AsyncFetcher.
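The fetcher-parameterized call shown above follows a common Kotlin pattern: an entry function takes a fetcher as a strategy object plus a lambda with receiver that collects the request options. As a minimal, self-contained sketch of that pattern (all names here are illustrative stand-ins, not skrape{it}'s actual implementation):

```kotlin
// Toy model of a fetcher-parameterized DSL. FakeHttpFetcher,
// FakeBrowserFetcher, RequestBuilder, and scrape() are hypothetical
// names for illustration only -- they are not skrape{it}'s real API.
interface Fetcher {
    fun fetch(url: String): String
}

// Stand-ins for HttpFetcher / BrowserFetcher: each returns a canned "page".
object FakeHttpFetcher : Fetcher {
    override fun fetch(url: String) = "<html>static content of $url</html>"
}

object FakeBrowserFetcher : Fetcher {
    override fun fetch(url: String) = "<html>js-rendered content of $url</html>"
}

// Request options are collected on a receiver object,
// just like `url = ...` inside the skrape { } block.
class RequestBuilder {
    var url: String = ""
}

// The entry point takes the fetcher as a strategy and the options as
// a lambda with receiver, then executes the request with that fetcher.
fun scrape(fetcher: Fetcher, init: RequestBuilder.() -> Unit): String {
    val request = RequestBuilder().apply(init)
    return fetcher.fetch(request.url)
}

fun main() {
    val page = scrape(FakeHttpFetcher) {
        url = "https://docs.skrape.it"
    }
    println(page)
}
```

Swapping FakeHttpFetcher for FakeBrowserFetcher changes how the page is fetched without touching the request block, which is the same design idea behind passing HttpFetcher, BrowserFetcher, or AsyncFetcher to skrape.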