Use WebExecute to get the rendered text content of a node and its descendants.

Using JavaScript Directly...

Begin the session

Use StartWebSession to begin the session:

  • If no browser is specified, StartWebSession uses Google Chrome by default.
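A minimal sketch of this step. The variable name session is an assumption used for illustration, and naming a browser explicitly is optional.

```wolfram
(* Start a web session with the default browser (Google Chrome). *)
session = StartWebSession[]

(* Alternatively, name a supported browser explicitly, for example Firefox. *)
firefoxSession = StartWebSession["Firefox"]
```

Either call returns a WebSessionObject that the later WebExecute steps act on.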

Extract text

Open the page you would like to get text from:

Use the "JavascriptExecute" command to run JavaScript directly and return the value of an element's innerText property:
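The two steps above might look like the following sketch. The URL is a placeholder, and reading innerText from document.body is an assumption about which node you want; any other node reference would work the same way.

```wolfram
(* Assumption: start a fresh session here; reuse an existing one if you have it. *)
session = StartWebSession[];

(* Open the page to read. The URL is a placeholder. *)
WebExecute[session, "OpenPage" -> "https://www.example.com"];

(* Run JavaScript in the page and return the rendered text of the body node. *)
text = WebExecute[session, "JavascriptExecute" -> "return document.body.innerText;"]
```

The result is a single string containing the text as the browser renders it, without markup.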

Use Select to remove digit characters and non-English words:
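One possible way to do the filtering, shown on a made-up sample string so that it runs without a browser. Splitting with TextWords and testing with DictionaryWordQ are assumptions; the step above only specifies Select.

```wolfram
(* Sample text standing in for the string extracted from the page. *)
text = "In 2024 the quick brown fox jumps over 3 lazy dogs zzxqv";

(* Split the string into individual words. *)
words = TextWords[text];

(* Keep only entries recognized as dictionary words; digits and
   non-dictionary tokens are dropped. *)
englishWords = Select[words, DictionaryWordQ]
```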

Analyze the text

Use ToLowerCase so that differently capitalized forms of a word are counted together, and DeleteStopwords to remove prepositions and similar common words from the analysis:
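A small sketch of this normalization on a made-up word list. The variable names are illustrative only.

```wolfram
(* Sample list standing in for the filtered words from the page. *)
englishWords = {"The", "page", "shows", "the", "Text", "of", "a", "page"};

(* Lowercase first so that "The" and "the" are treated as the same word,
   then remove stopwords. *)
cleanWords = DeleteStopwords[ToLowerCase[englishWords]]
```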

Use WordCloud to create a word cloud of frequently used nontrivial words on the webpage:
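For illustration, a word cloud built from a made-up list of cleaned words. Words that occur more often are drawn larger.

```wolfram
(* Sample list standing in for the cleaned words from the page. *)
cleanWords = {"page", "text", "page", "browser", "session", "text", "page"};

(* Draw the cloud; size reflects how often each word occurs in the list. *)
WordCloud[cleanWords]
```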

Use StringRiffle to join the words into a single string, separated by spaces:
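A short sketch of the joining step on a made-up word list. The separator is passed explicitly here, although a single space is also what StringRiffle uses by default for a flat list.

```wolfram
(* Sample list standing in for the cleaned words from the page. *)
cleanWords = {"page", "text", "page", "browser", "session", "text", "page"};

(* Join the words with single spaces. *)
joined = StringRiffle[cleanWords, " "]
(* "page text page browser session text page" *)
```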

Use WordCounts to count the number of times each word appears in the string, and keep the five most frequently used words:
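One way to sketch the counting step on a made-up string. Using TakeLargest to pick the top five is an assumption; the step above names only WordCounts.

```wolfram
(* Sample string standing in for the joined words from the page. *)
joined = "page text page browser session text page element node script";

(* Association of each distinct word to its number of occurrences. *)
counts = WordCounts[joined];

(* Keep the five entries with the largest counts, or fewer if the
   text has fewer than five distinct words. *)
topFive = TakeLargest[counts, UpTo[5]]
```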

Use BarChart to visualize the frequency of words:
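A minimal sketch of the chart, using a made-up association of word counts. Labeling the bars with ChartLabels is an optional addition.

```wolfram
(* Sample counts standing in for the top five words from the page. *)
topFive = <|"page" -> 3, "text" -> 2, "browser" -> 1, "session" -> 1, "element" -> 1|>;

(* One bar per word, labeled with the word itself. *)
BarChart[Values[topFive], ChartLabels -> Keys[topFive]]
```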

Close the session

Use DeleteObject to terminate the web session process:
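A sketch of the cleanup step. The session is started here only so that the example stands alone; in practice you would pass the session you have been working with.

```wolfram
(* Assumption: a session started only for this example. *)
session = StartWebSession[];

(* Shut down the browser process behind the session. *)
DeleteObject[session]
```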

Using WebExecute Commands Related to Elements of Webpages...

Begin the session

Use StartWebSession to begin the session:

  • If no browser is specified, StartWebSession uses Google Chrome by default.

Extract text

Open the page you would like to get text from:

Use the "LocateElements" command to find the element whose id attribute is "content":

  • An id value is meant to be unique within a page, so the search should return a single WebElementObject.

Use the "ElementText" command to get the text from that element:
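The element-based steps above might be sketched as follows. The URL is a placeholder for a page assumed to contain an element with the id "content"; substitute a page and id of your own.

```wolfram
(* Assumption: start a fresh session here; reuse an existing one if you have it. *)
session = StartWebSession[];

(* Open the page to read. The URL is a placeholder. *)
WebExecute[session, "OpenPage" -> "https://en.wikipedia.org/wiki/Web_scraping"];

(* Find elements by id; the result is a list of WebElementObject expressions. *)
elements = WebExecute[session, "LocateElements" -> "Id" -> "content"];

(* Get the rendered text of the first (and normally only) match. *)
text = WebExecute[session, "ElementText" -> First[elements]]
```

If the page has no element with that id, the list is empty and First fails, so it is worth checking the result of "LocateElements" before extracting text.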

Use Select to remove digit characters and non-English words:

Analyze the text

Use ToLowerCase so that differently capitalized forms of a word are counted together, and DeleteStopwords to remove prepositions and similar common words from the analysis:

Use WordCloud to create a word cloud of frequently used nontrivial words on the webpage:

Use StringRiffle to join the words into a single string, separated by spaces:

Use WordCounts to count the number of times each word appears in the string, and keep the five most frequently used words:

Use BarChart to visualize the frequency of words:

Close the session

Use DeleteObject to terminate the web session process: