How to use JavaScript to get all HTML elements that contain text while excluding specified elements and their descendants

In short: I want to extract all the text from an HTML document while excluding specific elements, such as 'pre' or 'script' tags, together with their children.

I came across information suggesting that querySelectorAll is not very efficient, and TreeWalker is considered the most efficient method. Is this true?

The issue with my code is that it excludes specific elements but still retrieves their children.


I wrote a JavaScript function that collects every text node in the HTML.

Some elements, such as "pre" or a "div" with a particular class, should be excluded from the results.

I can filter out those elements themselves, but the text inside their child elements is still returned.

How can I address this challenge?

This page gave me some insight: "getElementsByTagName() equivalent for textNodes"

document.createTreeWalker's documentation can be found at:
https://developer.mozilla.org/en-US/docs/Web/API/Document/createTreeWalker#parameters


<!DOCTYPE html>
<html>
<head>
<script>
function nativeTreeWalker() {
    var walker = document.createTreeWalker(
        document.body, 
        NodeFilter.SHOW_TEXT,
        {acceptNode: function(node) {

          // ===========================
          // Filtering of specific elements
          // Yet unable to filter child elements????
          if (['STYLE', 'SCRIPT', 'PRE'].includes(node.parentElement?.nodeName)) {
            return NodeFilter.FILTER_REJECT;
          }
          // ===========================

          // Filtering empty elements
          if (! /^\s*$/.test(node.data) ) {
            return NodeFilter.FILTER_ACCEPT;
          }
        }
        },
        true  // Skip child elements, protect integrity
    );

    var node;
    var textNodes = [];
    while(node = walker.nextNode()){
        textNodes.push(node.nodeValue);
    }
    return textNodes
}

window.onload = function(){
  console.log(nativeTreeWalker())
}
</script>
</head>
<body>
get the text
<p> </p>
<div>This is text, get</div>
<p>This is text, get too</p>

<pre>
  This is code,Don't get
  <p>this is code too, don't get</p>
</pre>

<div class="this_is_code">
  This is className is code, Don't get
  <span>this is code too, don't get</span>
</div>
</body></html>

The expected outcome of the above code should be:

0: "\nget the text\n"
1: "This is text, get"
2: "This is text, get too"
length: 3

But the actual output is:

0: "\nget the text\n"
1: "This is text, get"
2: "This is text, get too"
3: "this is code too, don't get"
4: "\n This is className is code, Don't get\n "
5: "this is code too, don't get"
length: 6


Answer №1

Your expectations may need some adjustment given the criteria you described. For instance, the top-level text node containing "Don't get code:" (in the markup below) is a valid node under those criteria, so it will be returned.

To achieve the desired outcome, you can use the TreeWalker API. The key is to check whether the text node has any ancestor, not just a direct parent, that matches one of the excluded selectors. Calling closest() on the text node's parent element does exactly that, because it tests the element itself and then each of its ancestors:


<!doctype html>
<html>
<head>
<script type="module">
function filterTextNode (textNode) {
  if (!textNode.textContent?.trim()) return NodeFilter.FILTER_REJECT;
  const ancestor = textNode.parentElement?.closest('pre,script,style,.this_is_code');
  if (ancestor) return NodeFilter.FILTER_REJECT;
  return NodeFilter.FILTER_ACCEPT;
}

function getFilteredTexts (textNodeFilterFn) {
  const walker = document.createTreeWalker(
    document.body,
    NodeFilter.SHOW_TEXT,
    {acceptNode: textNodeFilterFn},
  );
  const results = [];
  let node = walker.nextNode();
  while (node) {
    results.push(node.textContent);
    node = walker.nextNode();
  }
  return results;
}

function main () {
  const texts = getFilteredTexts(filterTextNode);
  console.log(texts);
}

main();
</script>
</head>
<body>
  <p> </p>
  
  get text:
  <div>This is text, get</div>
  <p>This is text, get too</p>
  
  Don't get code:
  <pre>
    This is code,Don't get
    <p>this is code too, don't get</p>
  </pre>
  
  <div class="this_is_code">
    This is className is code, Don't get
    <span>this is code too, don't get</span>
  </div>
</body>
</html>
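
As an alternative sketch (not part of the answer's code above): the walker in the question only shows text nodes, so returning FILTER_REJECT from its filter discards just that one text node and nothing else. If the walker also shows elements, returning FILTER_REJECT for an excluded element prunes its whole subtree, so no ancestor lookup is needed for each text node. The names pruneFilter, getTextsPruned, EXCLUDED_TAGS and EXCLUDED_CLASS are hypothetical, chosen for illustration. The numeric constants mirror the NodeFilter and Node values from the DOM standard and are spelled out only so the filter can be tried outside a browser.

```javascript
// Values defined by the DOM standard, written out so the filter
// function does not depend on browser globals.
const FILTER_ACCEPT = 1;  // NodeFilter.FILTER_ACCEPT
const FILTER_REJECT = 2;  // NodeFilter.FILTER_REJECT
const FILTER_SKIP = 3;    // NodeFilter.FILTER_SKIP
const SHOW_ELEMENT = 0x1; // NodeFilter.SHOW_ELEMENT
const SHOW_TEXT = 0x4;    // NodeFilter.SHOW_TEXT
const ELEMENT_NODE = 1;   // Node.ELEMENT_NODE

// Assumed exclusion rules, taken from the markup in the question.
const EXCLUDED_TAGS = ['PRE', 'SCRIPT', 'STYLE'];
const EXCLUDED_CLASS = 'this_is_code';

function pruneFilter(node) {
  if (node.nodeType === ELEMENT_NODE) {
    const excluded =
      EXCLUDED_TAGS.includes(node.nodeName) ||
      node.classList.contains(EXCLUDED_CLASS);
    // REJECT on an element drops the element and all its descendants.
    // SKIP leaves the element out of the results but still visits its children.
    return excluded ? FILTER_REJECT : FILTER_SKIP;
  }
  // Only text nodes reach this point: keep those with non-whitespace content.
  return node.nodeValue.trim() ? FILTER_ACCEPT : FILTER_REJECT;
}

function getTextsPruned(root) {
  // Browser only: requires a global `document`.
  const walker = document.createTreeWalker(
    root,
    SHOW_ELEMENT | SHOW_TEXT,
    { acceptNode: pruneFilter },
  );
  const results = [];
  for (let node = walker.nextNode(); node; node = walker.nextNode()) {
    results.push(node.nodeValue);
  }
  return results;
}
```

In a browser this could be called as getTextsPruned(document.body). The trade-off is that the filter now runs for every element as well as every text node, so whether it beats the closest() approach depends on the document. Separately, the fourth argument passed to createTreeWalker in the question (true) is an obsolete parameter that current browsers ignore; it does not skip child elements.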
