Welcome to ClientVPS Mirrors

Extracting text from a web page

Extracting text from a web page

Using JavaScript

One way to extract text from a page is to tell the browser to run JavaScript code that does it.

Synchronous version

library(chromote)
b <- ChromoteSession$new()

b$go_to("https://www.whatismybrowser.com/")

# Run JavaScript to extract text from the page
x <- b$Runtime$evaluate('document.querySelector(".corset .string-major a").innerText')
x$result$value
#> [1] "Chrome 75 on macOS (Mojave)"

Asynchronous version

library(chromote)
b <- ChromoteSession$new()

p <- b$Page$loadEventFired(wait_ = FALSE)
b$go_to("https://www.whatismybrowser.com/", wait_ = FALSE)
p$then(function(value) {
  b$Runtime$evaluate(
    'document.querySelector(".corset .string-major a").innerText'
  )
})$
then(function(value) {
  print(value$result$value)
})

Using Chrome DevTools Protocol commands

Another way is to use CDP commands to extract content from the DOM. This does not require executing JavaScript in the browser’s context, but it is also not as flexible as JavaScript.

Synchronous version

library(chromote)
b <- ChromoteSession$new()

b$go_to("https://www.whatismybrowser.com/")
x <- b$DOM$getDocument()
x <- b$DOM$querySelector(x$root$nodeId, ".corset .string-major a")
b$DOM$getOuterHTML(x$nodeId)
#> $outerHTML
#> [1] "<a href=\"/detect/what-version-of-chrome-do-i-have\">Chrome 75 on macOS (Mojave)</a>"

Asynchronous version

library(chromote)
b <- ChromoteSession$new()

b$go_to("https://www.whatismybrowser.com/", wait_ = FALSE)$
then(function(value) {
  b$DOM$getDocument()
})$
then(function(value) {
  b$DOM$querySelector(value$root$nodeId, ".corset .string-major a")
})$
then(function(value) {
  b$DOM$getOuterHTML(value$nodeId)
})$
then(function(value) {
  print(value)
})

Need a high-speed mirror for your open-source project?
Contact our mirror admin team at info@clientvps.com.

This archive is provided as a free public service to the community.
Proudly supported by infrastructure from VPSPulse , RxServers , BuyNumber , UnitVPS , OffshoreName and secure payment technology by ArionPay.